I definitely appreciate that the authors are trying to explore provocative ideas - specifically, it seems like they want to claim that hierarchical structure isn't required for language use (defined as production, comprehension, and acquisition). However, from what I can tell, the evidence they present is more about how they can lessen the amount of hierarchical structure required for any given aspect of language use, rather than eliminate it altogether (e.g., section 3b: "...evidence for the primacy of sequential processing...", not "sequential processing is the only type of processing going on"; section 4c: "...while a syntactic structure is only assigned at a later stage...", not "while a syntactic structure is never assigned"; section 5a: "...reanalyses that deemphasize hierarchical structure...", not "reanalyses that eliminate hierarchical structure"). This then seems to play into the ongoing debate about exactly what kind of structure is required for language (for example, generativist representations vs. constructionist representations). And this debate isn't particularly new, as far as I'm aware.
The basic issue that kept occurring to me as I read this was that hierarchy, in its most basic conception, is the idea that you have units that are made out of other units (which can be made out of other units, etc.). Constructing these hierarchical units (or constituents, if you prefer) is one idea of how you derive meaning from a sequence of word-units, for example. As far as I can tell, the authors wouldn't argue against this view of hierarchical structure. (And if they did, it's unclear to me what alternative they would propose to replace it.)
Also, the authors don't appear to be unhappy with the idea that hierarchy is part of the complete knowledge representation that's built for language (and so would therefore be the target of acquisition). Their claim seems to be more about how we don't need to use all that hierarchical knowledge all the time when we're producing or comprehending language. (They try to claim this for acquisition as well, but that seems more tenuous if we think the point of acquisition is to acquire the target knowledge representation.) If we focus just on production and comprehension, I think they still need to be more explicit about how to get from a sequence of word units to the complex knowledge representation an entire sentence corresponds to (they say something like this in section 5c: "...if subjects are motivated to read for comprehension, if sentence meaning depends on the precise (hierarchical) sentence structure..."). They sketch an idea about this with the parallel streams in Figure 1, but I don't think this really takes care of the underlying problem of constructing compositional (and non-compositional) meaning (more on this below).
Some more targeted thoughts:
The issue of translating between linear pieces and the entire meaning of a sentence appears right at the beginning, with example 2 in particular. While it's true that the pieces can be chunked this way, if you don't have some kind of additional relationship between "sentences" and "can be analysed" (for example, IP if we think of these pieces as NP and VP), how do you know how to put them together to get the larger meaning corresponding to "sentences can be analysed"? And if you do have that relationship somewhere, isn't that equivalent to having hierarchical structure, since these two pieces are subsumed under a larger unit (called IP above)?
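To make that last question concrete, here's a minimal sketch of what I have in mind. It is my own illustration, not anything from the article: the names `Unit`, `depth`, and `words` are hypothetical, and the NP/VP/IP labels are just the ones I used above. The point is only that once you record a relationship joining the two chunks, you've built a unit made out of other units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A labeled unit whose parts are words (str) or other Units.

    Hypothetical illustration only; not a representation proposed in the article.
    """
    label: str
    parts: tuple


def depth(node) -> int:
    """Levels of unit-inside-unit nesting; a bare word has depth 0."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(part) for part in node.parts)


def words(node) -> list:
    """Read the words back off in their original left-to-right order."""
    if isinstance(node, str):
        return [node]
    collected = []
    for part in node.parts:
        collected.extend(words(part))
    return collected


# The two sequential pieces, side by side with nothing relating them.
np_chunk = Unit("NP", ("sentences",))
vp_chunk = Unit("VP", ("can", "be", "analysed"))
flat_chunks = [np_chunk, vp_chunk]

# Recording *some* relationship between the two pieces (labeled IP here,
# as an assumption) is already a unit whose parts are themselves units.
combined = Unit("IP", (np_chunk, vp_chunk))

print(max(depth(chunk) for chunk in flat_chunks))  # 1
print(depth(combined))                             # 2
print(words(combined))
```

The linear order of the words is the same either way; the only thing the combined version adds is the extra level of nesting, which is exactly the part-whole relationship I'm calling hierarchy.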
In section 2, they mention the idea that "the mechanisms employed for language learning and use are likely to be fundamentally sequential in nature, rather than hierarchical". I have no problem with talking about the mechanism this way - in fact, that makes perfect sense (incremental processing, etc.). But isn't the mechanism distinct from the knowledge representations being manipulated? And aren't those representations the part whose structure people generally argue about?
In section 3c, where they talk about some of the models that can learn different aspects of syntax by just using sequential information, do they believe that the target knowledge for these structures doesn't involve hierarchy at all? If they believe the target knowledge does in fact involve hierarchy, then this falls back onto the mechanism vs. knowledge distinction I mentioned above. If they instead think there's no hierarchy even in the target knowledge, then I think they run into the basic problem of how you map words to sentential meaning without hierarchy (or dependency relations, etc.). I think they're aiming towards the former idea where hierarchy is present in the target knowledge (section 4b on combining constructions: "...seems intuitive to regard a combination of constructions as a part-whole relation...").
This then brings me to the parallel sequential streams idea presented in Figure 1. The fact that the pieces combine into a whole seems to be exactly what hierarchy accomplishes (part-whole relationships, etc.). Beyond this, it seems like one thing to slot pieces together in a parallel stream, and another to create a mental model from this (i.e., get the interpretation of the whole meaning once the pieces are composed together in particular ways).
Also, here's an excellent blog post by someone very knowledgeable who read this article last year and had some very specific (and similar) things to say about it:
Norbert Hornstein's Faculty of Language: Three psychologists walk into a bar...