There are many things that made me happy about this manuscript as a modeler, not the least of which is the callout to modelers about what ought to be included in their models of language acquisition (hurrah for experimentally-motivated guidance!). For example, there's good reason to believe that a "noise parameter" that simply distorts the input in some way can be replaced by a more targeted perceptual intake noise parameter that distorts the input in particular ways. Also, I love how explicit O&L are about the observed vs. latent variables in their view of the acquisition process -- it makes me want to draw plate diagrams. And of course, I'm a huge fan of the distinction between input and intake.
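To make the contrast between a generic noise parameter and a targeted intake noise parameter concrete, here is a minimal sketch. Everything in it is my own illustrative assumption rather than anything O&L propose: the toy symbol inventory, the `CONFUSABLE` table of which segments get misperceived as which, and the function names are all hypothetical.

```python
import random

# Hypothetical toy inventory and confusion sets (assumptions for illustration only).
ALPHABET = ["a", "b", "i", "o", "p", "s", "z"]
CONFUSABLE = {"p": ["b"], "b": ["p"], "s": ["z"], "z": ["s"]}

def uniform_noise(utterance, rate, rng):
    """Generic noise: any segment may be replaced by any symbol in the inventory."""
    return [rng.choice(ALPHABET) if rng.random() < rate else seg
            for seg in utterance]

def intake_noise(utterance, rate, rng):
    """Targeted noise: only confusable segments are distorted,
    and only into one of their listed confusable partners."""
    out = []
    for seg in utterance:
        if seg in CONFUSABLE and rng.random() < rate:
            out.append(rng.choice(CONFUSABLE[seg]))
        else:
            out.append(seg)
    return out

rng = random.Random(0)
print(uniform_noise(list("pasi"), 0.5, rng))  # distortions can land anywhere
print(intake_noise(list("pasi"), 1.0, rng))   # ['b', 'a', 'z', 'i']
```

The point of the second function is that the distortion is predictable: vowels are never touched in this toy setup, and each confusable segment can only turn into its listed partner. A model's noise term could then be constrained by experimental findings about what children actually misperceive.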
Another thing that struck me was the effect incrementality could have. For example, it could cause prioritization of working memory constraints over repair costs, especially when repair is costly, because the data's coming at you now and you have to do something about it. This is discussed in light of the parser and syntax, but I'm wondering how it translates to other types of linguistic knowledge (and perhaps more basic things like word segmentation, lexical acquisition, and grammatical categorization). If this is about working memory constraints, we might expect it to apply whenever the child's "processor" (however that's instantiated for each of these tasks) gets overloaded. So, at the beginning of word segmentation, it's all about making your first guess and sticking to it (perhaps leading to snowball effects of mis-segmentation, as you use your mis-segmentations to segment other words). But maybe later, when you have more of a handle on word segmentation, it's easier to revise bad guesses (which is one way to recover from mis-segmentations, aside from continuing experience).
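Here is a rough sketch of the "first guess and stick to it" regime and the snowballing it can produce. This is a toy of my own, not a model from the manuscript: the greedy longest-match strategy, the function names `segment` and `learn`, and the example utterances are all assumptions, and revision is deliberately left out.

```python
def segment(utterance, lexicon):
    """Greedy left-to-right segmentation: take the longest known word at each
    position; any unmatched stretch between known words becomes a new word."""
    words, i, unknown_start = [], 0, None
    while i < len(utterance):
        match = max((w for w in lexicon if utterance.startswith(w, i)),
                    key=len, default=None)
        if match is None:
            if unknown_start is None:
                unknown_start = i  # start of a stretch with no known word
            i += 1
        else:
            if unknown_start is not None:
                words.append(utterance[unknown_start:i])
                unknown_start = None
            words.append(match)
            i += len(match)
    if unknown_start is not None:
        words.append(utterance[unknown_start:])
    return words

def learn(utterances):
    """Incremental learner with no revision: every guess is committed to the lexicon."""
    lexicon, history = set(), []
    for utterance in utterances:
        words = segment(utterance, lexicon)
        lexicon.update(words)
        history.append(words)
    return lexicon, history

lexicon, history = learn(["thedog", "thedogran", "ran"])
print(history)  # [['thedog'], ['thedog', 'ran'], ['ran']]
```

The early mis-segmentation "thedog" is never reconsidered, and it is then used to carve up the next utterance. A revision-friendly learner would need some extra step that can go back and split "thedog", and the question above is when that step becomes cheap enough to use.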
This relates to the cost of revision in areas besides syntax. In some sense, you might expect that cost is very much tied to how hard it is to construct the representation in the first place. For syntax (and the related sentential semantics), that can continue to be hard for a really long time, because these structures are so complex. And as you get better at it, it gets faster, so revision gets less costly. But looking at word segmentation, is constructing the "representation" ever that hard? (I'm trying to think what the "representation" would be, other than the identification of the lexical item, which seems pretty basic assuming you've abstracted to the phonemic level.) If not, then maybe word segmentation revision might be less costly, and so the swing from being revision-averse to revision-friendly might happen sooner for this task than in other tasks.
Some more targeted thoughts:
(i) One thing about the lovely schematic in Figure 1: I can definitely get behind the perceptual intake feeding the language acquisition device (LAD) and (eventually) feeding the action encoding, but I'm wondering why it's squished together with "linguistic representations". I would have imagined that perceptual intake directly feeds the LAD, and the LAD feeds the linguistic representation (which then feeds the action encoding). Is the idea that there's a transparent mapping between perceptual intake and linguistic representations, so separating them is unnecessary? And if so, where's the place for acquisitional intake (talked about in footnote 1 on p.7), which seems like it might come between perceptual intake and LAD?
(ii) I found it a bit funny that footnote 2 refers to the learning problem as "inference-under-uncertainty" rather than the more familiar "poverty of the stimulus" (PoS). Maybe PoS has too many other associations with it, and O&L just wanted to sidestep any misconceptions arising from the term? (In which case, probably a shrewd move.)
(iii) In trying to understand the relationship between vocabulary size and knowledge of pronoun interpretation (principle C), O&L note that children who had faster lexical access were not faster at computing principle C, so it's not simply that children who could access meaning faster were then able to do the overall computation faster. This means that the hypothesis that "more vocabulary" equals "better at dealing with word meaning", which equals "better at doing computations that require word meaning as input" can't be what's going on. So do we have any idea what the link between vocabulary size and principle C computation actually is? Is vocabulary size the result of some kind of knowledge or ability that would happen after initial lexical access, and so would be useful for computing principle C too? One thought that occurred to me was that someone who's good at extracting sentential level meaning (i.e., because their computations over words happen faster) might find it easier to learn new words in the first place. This then could lead to a larger vocabulary size. So, this underlying ability to compute meaning over utterances (including using principle C) could cause a larger vocabulary, rather than knowing lots of words causing faster computation.
(iv) I totally love the U-shaped development of filler-gap knowledge in the Gagliardi et al. (submitted) study. It's nice to see an example of this qualitative behavior in a realm besides morphology. The explanation seems similar, too -- a more sophisticated view of the input causes errors, which take some time to recover from. But the initial simplified view leads to surface behavior that seems right, even if the underlying representation isn't at that point. Hence, U-shaped performance curve. Same for the White et al. 2011 study -- U-shaped learning in syntactic bootstrapping for the win.
(v) I really liked the note on p.45 in the conclusion about how the input vs. intake distinction could really matter for L2 acquisition. It's nice to see some explicit ideas about what the skew is and why it might occur. (Basically, this feels like a more explicit form of the "less is more" hypothesis, where adult processing is distorting the input in predictable ways.)