Some more targeted thoughts:
- In the introductory bit on p.3, I was surprised at the continuing use of "strong language specific learning biases" (in contrast to domain-general learning biases). Maybe this is because that's the kind of language-specific biases nativists often claim, but to me, any innate domain-specific bias would be part of Universal Grammar (UG), whether it's strong or not.
- (p.4) I thought the separation between the learnability of a particular grammar formalism and the learnability of a class of natural languages was very nice. It does seem like sometimes the motivation for UG learning biases comes from assuming a particular representation that's being learned, rather than accounting for empirical coverage of the data.
- (p.5) It seemed odd to me to say that it's not necessary to incorporate a characterization of the hypothesis space into the learner, but rather that the "design of the learner" will limit the hypothesis space. Is the difference that the hypothesis space is explicitly built in in the first case while in the second case, it's implicit (via some other constraints on how the learner learns)?
- (describing Gold's results and the PAC framework) I thought the step-through of the various learnability results was remarkably clear. I've seen a number of different attempts to do just the basic Gold one about identifiability in the limit from positive evidence, and this is definitely one of the best. In particular, C&L really take care to point out the limitations of each of the results (as they apply to acquisition) simply and concisely.
- (p.11) Mostly, I just found myself saying "Yes!" enthusiastically at the end of section 3, where C&L talk about how learnability connects to acquisition.
- (p.13) I also appreciated the explicit connection being made to current probabilistic techniques, such as MDL and Bayesian inference.
- (p.20) When talking about efficient learning, C&L say "...the core issues in learning concern efficient inference from probabilistic data and assumptions". It's the "assumptions" part that I think the focus of the debate is on - assumptions about the data, assumptions about what's generating the data, something else? What kind of assumptions are these and how/why does the learner make them?
- (p.22) I admit great curiosity in the Distributional Lattice Grammars, since they apparently have empirical coverage, a syntax-semantics interface, and good learnability properties. This really underscores how the representation we think children are trying to learn will determine what they need to learn it. Maybe this is something to read about in more detail in the fall...
- Section 7 (starting on p.25): After all the emphasis on targeting learnability research to be more informative to acquisition, I was a bit surprised to see the concluding discussion about machine learning (ML) methods (especially the supervised ML methods). While it's true that unsupervised learning is much closer to the acquisition problem, it seems like this loses the point about tractable cognition (i.e., use strategies humans could use).