Computational Models of Language (at UC Irvine): May 2011

Monday, May 23, 2011

Thanks and see you at the end of the summer!

Thanks to everyone who was able to join us for our discussion of Clark & Lappin's (2011) article - as usual, we had quite the rousing debate about various points! This concludes the reading group activities for the spring quarter. We'll be picking up again at the end of the summer, around late August. Have a good break!

Friday, May 20, 2011

Thoughts on Clark & Lappin (2011)

I was really pleased with the overall approach of this paper, particularly how it discussed integrating a probabilistic component into learnability theory, the emphasis on the importance of tractable cognition, and the point that it's important to identify efficient algorithms for acquisition even if you already know the hypothesis space, such as in the principles & parameters framework (that last bit is particularly near and dear to my academic heart). It really seems like C&L take a solid psychological perspective, even though they're often dealing with idealized scenarios. To me, this seems to echo the original intuitions of generative grammarians - something like "we realize things are more complicated, but we can get a long way by making sensible simplifications".

Some more targeted thoughts:

In the introductory bit on p.3, I was surprised at the continuing use of "strong language specific learning biases" (in contrast to domain-general learning biases). Maybe this is because strong biases are the kind of language-specific biases nativists often claim, but to me, any innate domain-specific bias would be part of Universal Grammar (UG), whether it's strong or not.
(p.4) I thought the separation between the learnability of a particular grammar formalism and the learnability of a class of natural languages was very nice. It does seem like sometimes the motivation for UG learning biases comes from assuming a particular representation that's being learned, rather than accounting for empirical coverage of the data.
(p.5) It seemed odd to me to say that it's not necessary to incorporate a characterization of the hypothesis space into the learner, but rather that the "design of the learner" will limit the hypothesis space. Is the difference that the hypothesis space is explicitly built in in the first case while in the second case, it's implicit (via some other constraints on how the learner learns)?
(describing Gold's results and the PAC framework) I thought the step-through of the various learnability results was remarkably clear. I've seen a number of different attempts to do just the basic Gold one about identifiability in the limit from positive evidence, and this is definitely one of the best. In particular, C&L really take care to point out the limitations of each of the results (as they apply to acquisition) simply and concisely.
(p.11) Mostly, I just found myself saying "Yes!" enthusiastically at the end of section 3, where C&L talk about how learnability connects to acquisition.
(p.13) I also appreciated the explicit connection being made to current probabilistic techniques, such as MDL and Bayesian inference.
(p.20) When talking about efficient learning, C&L say "...the core issues in learning concern efficient inference from probabilistic data and assumptions". I think the "assumptions" part is where the focus of the debate lies - are these assumptions about the data, assumptions about what's generating the data, or something else? What kind of assumptions are these and how/why does the learner make them?
(p.22) I admit great curiosity about the Distributional Lattice Grammars, since they apparently have empirical coverage, a syntax-semantics interface, and good learnability properties. This really underscores how the representation we think children are trying to learn will determine what they need in order to learn it. Maybe this is something to read about in more detail in the fall...
Section 7 (starting on p.25): After all the emphasis on targeting learnability research to be more informative to acquisition, I was a bit surprised to see the concluding discussion about machine learning (ML) methods (especially the supervised ML methods). While it's true that unsupervised learning is much closer to the acquisition problem, it seems like this loses the point about tractable cognition (i.e., use strategies humans could use).
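
For anyone who wants something concrete to hang the Gold discussion on, here is a minimal sketch of identification in the limit from positive evidence, using learning by enumeration. It is my own toy illustration, not anything from C&L: the three finite languages, the hypothesis names, and the function enumeration_learner are all made up for the example. The enumeration lists smaller languages first, and the learner always guesses the first hypothesis consistent with everything it has seen so far.

```python
def enumeration_learner(hypotheses, text):
    """Yield a guess after each datum: the first hypothesis in the
    enumeration whose language contains all data seen so far."""
    seen = set()
    for datum in text:
        seen.add(datum)
        # None means no hypothesis in the (hypothetical) class fits the data.
        guess = next((name for name, lang in hypotheses if seen <= lang), None)
        yield guess


# Hypothetical toy class of finite languages, ordered from smallest to largest.
hypotheses = [
    ("L1", {"a"}),
    ("L2", {"a", "ab"}),
    ("L3", {"a", "ab", "abb"}),
]

# A positive-only presentation (a "text") for L3: every string of L3 appears.
text = ["a", "ab", "a", "abb", "ab", "abb"]

guesses = list(enumeration_learner(hypotheses, text))
print(guesses)
```

On this text the guesses change a finite number of times and then settle on L3, which is what identification in the limit asks for. The sketch only covers the easy case of a small finite class. Gold's negative result concerns classes that contain all the finite languages plus at least one infinite language, where no learner of this kind can succeed on positive evidence alone.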

Monday, May 9, 2011

Next time: Clark & Lappin (2011)

Thanks to everyone who was able to join us for our spirited discussion of Heinz's (2010) computational phonology papers! I think we really brought up some excellent points and made some interesting connections between computational phonology and cognition.

Next time on May 23rd, we'll be looking into another study on formal mathematical representations of grammar inference, and how they connect to human language learning:

Clark & Lappin (2011)

See you then!

Friday, May 6, 2011

Thoughts on Heinz (2010a + b)

Again, I was surprised by how fast I read through these two papers - Heinz definitely knows how to explain abstract concepts in very comprehensible ways. One thing I did notice about these papers was that there were parts that whirled by almost a little too quickly, so that instead of giving background for someone not already in the know about computational phonology, they felt more like a brief literature review for someone already familiar with the relevant concepts. Still, I liked that Heinz was pretty up front about the goal of computational phonology - identifying the shape of the human phonological system (its universal properties, etc.). This definitely feels like a cognitive science-oriented approach, even if the specifics sometimes seem a little disconnected from what we might normally think of as the psychology of language.

Some more specific thoughts:

The discussion of finding a theory with the right balance between restrictiveness and expressiveness reminded me very much of Bayesian inference (find the hypothesis that has the best balance between simplicity and fit).
My inner theoretical computer science geek was pretty happy about the discussion of problems and algorithms and tractability, and the like. When discussing determinism, though, I do think there's some wiggle room with respect to non-deterministic processes (i.e., those that guess when unsure). A number of acquisition models incorporate some aspect of probabilistically-informed guessing, with reasonable success.
I thought the outline of phonological problems in particular (on p.9 of the first paper) neatly described a number of different interesting questions. I think the recognition problem is something like what psycholinguists would call parsing, while the phonotactic learning problem is what psycholinguists would generally call acquisition.
I believe Heinz mentions that transducers aren't necessarily the mental representation of grammars, but a lot of the conclusions he mentions seem predicated on that being true in order for the conclusions to have psychological relevance. That is, if the mental representations of grammar aren't something like the transducers discussed here, how informative is it to know that a surface form can be computed in so many steps, etc.? Or maybe there's still a way to translate that kind of conclusion, even if transducers aren't similar to the grammar representation?
The fact that two grammar formalisms (SPE and 2LP) are functionally the same is an interesting conclusion. What should we then use to choose between them, besides personal preference? Ease of acquisition maybe?
I really liked the discussion distinguishing simulations from demonstrations. I think that pretty much all of my recent models seem to fall more under the demonstration category.
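
To make the restrictiveness vs. expressiveness comparison with Bayesian inference a bit more concrete, here is a small sketch of my own (not something from Heinz's papers). It assumes each hypothesis is just a finite set of allowed forms, that the learner's data are sampled uniformly from the true set, and that both hypotheses start with equal prior probability. The names posterior, "restrictive", and "expressive", along with the toy forms, are invented for illustration.

```python
from fractions import Fraction


def posterior(hypotheses, priors, data):
    """Posterior over hypotheses, where each hypothesis is a finite set of
    forms and each datum has likelihood 1/|h| if it is in h, else 0."""
    scores = {}
    for name, forms in hypotheses.items():
        if all(d in forms for d in data):
            likelihood = Fraction(1, len(forms)) ** len(data)
        else:
            likelihood = Fraction(0)  # hypothesis cannot generate the data
        scores[name] = priors[name] * likelihood
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no hypothesis is consistent with the data")
    return {name: score / total for name, score in scores.items()}


# Hypothetical toy hypotheses: the expressive one allows everything the
# restrictive one does, plus more.
hypotheses = {
    "restrictive": {"pa", "ta", "ka"},
    "expressive": {"pa", "ta", "ka", "ap", "at", "ak"},
}
priors = {"restrictive": Fraction(1, 2), "expressive": Fraction(1, 2)}

print(posterior(hypotheses, priors, ["pa", "ta"]))
print(posterior(hypotheses, priors, ["pa", "ap"]))
```

With data that both hypotheses can generate, the restrictive hypothesis comes out ahead because it spreads its probability over fewer forms. As soon as a form shows up that only the expressive hypothesis allows, the restrictive one is ruled out. That is the balance between simplicity and fit I had in mind, under these very idealized assumptions.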
