Wednesday, April 29, 2015

Some thoughts on Heinz 2015 book chapters, parts 6-9

Continuing on from last time, where we read up through the discussion about constraints on strings, Heinz’s 2015 book chapter now gets into the constraints on maps between the underlying form and the observable form of a phonological string. As before, I found the more leisurely walk-through of the different ideas (complete with illustrative figures) quite accessible. The only gap in that respect for me as a non-phonologist was what an opaque map is, since Heinz mentions that opaque maps raise potential issues for the computational approach here. A quick googling pulled up some examples (e.g., the classic Canadian English writer/rider contrast, where flapping voices the /t/ and so obscures the voiceless context that conditioned vowel raising), but a brief concrete example in the chapter itself would have been helpful.

On a more contentful note, I found the compare-and-contrast with the Optimality Theory approach quite interesting. We have this great setup where some logically possible maps are derivationally simple (e.g., “Sour Grapes”), and yet we find these maps unattested. Optimality Theory has to add machinery to take care of this, while the computational ontology Heinz presents neatly separates them out. Boom. Simple.
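
To make that concrete for myself, here’s a toy rendering of the Sour Grapes pattern as I understand it (my own illustration in Python, not Heinz’s formalization): harmony spreads rightward from a trigger to the end of the word, but only if no blocker follows anywhere downstream; if one does, nothing spreads at all. Writing it out makes the weirdness obvious: to decide what to do with the very next vowel, you have to scan arbitrarily far ahead in the input, which (as I understand it) is exactly the kind of unbounded dependence the restrictive computational classes rule out.

```python
# Toy "Sour Grapes" harmony over abstract symbols (my own illustration,
# not Heinz's formalization):
#   T = trigger, B = blocker, o = plain vowel, + = harmonized vowel.
# The trigger spreads harmony rightward to the end of the word, but ONLY
# if no blocker follows it; if a blocker appears anywhere downstream,
# nothing spreads at all.

def sour_grapes(word):
    if 'T' in word:
        after_trigger = word[word.index('T') + 1:]
        spread = 'B' not in after_trigger   # unbounded lookahead needed here
    else:
        spread = False
    out = []
    seen_trigger = False
    for seg in word:
        if seg == 'T':
            seen_trigger = True
            out.append(seg)
        elif seg == 'o' and seen_trigger and spread:
            out.append('+')                 # harmonized vowel
        else:
            out.append(seg)
    return ''.join(out)

# sour_grapes("oToo")  -> "oT++"   (no blocker: spread all the way)
# sour_grapes("oTooB") -> "oTooB"  (blocker downstream: no spreading at all)
```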

So then this leads me (as an acquisition person) to wondering what we can do with this learning-wise. Let’s say the set of phonological maps that occur in human language is captured by a certain type of relationship (input strictly local [ISL]); there are some exceptions currently, but let’s say those get sorted out. Then, we also have some computational learnability results about how to learn these types of maps in the limit. Can I, as an acquisition modeler, then do something with those algorithms? Or do I need to develop other algorithms based on them that do the same thing, only within plausible time limits?
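
For my own benefit, here’s roughly what I take an ISL map to look like in code (a sketch based on my reading, not a formal definition from the chapter): the output at each point is determined by a bounded window of the input string, nothing more. The toy process below is word-final obstruent devoicing with a window of width 2. I’m cheating slightly by using one symbol of lookahead to keep the code short; the formal definition tracks a bounded suffix of the input read so far, but the key property (only a bounded piece of the input ever matters) is the same.

```python
# A toy input strictly local (ISL) style map, as I understand the idea:
# each output segment depends only on a bounded window of the INPUT
# (here, the current segment plus one segment of lookahead), never on
# unbounded context. Example process: word-final obstruent devoicing,
# e.g. underlying /bad/ -> surface [bat].

DEVOICE = {'b': 'p', 'd': 't', 'g': 'k'}

def final_devoicing(underlying, boundary='#'):
    padded = underlying + boundary            # mark the word edge in the input
    surface = []
    for i, seg in enumerate(underlying):
        window = padded[i:i + 2]              # bounded input window of width 2
        if window[1] == boundary and seg in DEVOICE:
            surface.append(DEVOICE[seg])      # devoice only at the word edge
        else:
            surface.append(seg)
    return ''.join(surface)

# final_devoicing('bad')  -> 'bat'
# final_devoicing('bado') -> 'bado'  (the /d/ is not word-final, so unchanged)
```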

And let’s make this even more concrete, actually: suppose there is a set of maps capturing English phonology that we think children learn by a certain age. Suppose that we do the kind of analysis Heinz suggests and discover all these maps are ISL. What kind of learning algorithms should I model to see if children could learn the right maps from English child-directed data? Are the existing learnability algorithms the ones? Or do I need to adapt them somehow? Or is it more that they serve to show learning is possible, but they may bear no resemblance to the algorithms kids would actually have to use? Given Heinz’s comment at the end of part 5 about the link between algorithm and representation, I feel like the existing algorithms should be related to the ones kids approximate, if that kind of link is there.
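
Just to make the modeling question concrete, here’s the kind of baseline I have in mind (a hypothetical sketch of my own, not one of the published ISL learning algorithms): collect bounded input windows from paired underlying/surface forms and memorize which output segment goes with each window. The real learnability results are far more careful than this, but something in this spirit is what I’d want to try running over child-directed data.

```python
# A hypothetical baseline learner for ISL-style maps (my sketch, not the
# published ISL learning algorithm): from paired underlying/surface forms,
# memorize what output segment goes with each bounded input window.
# To keep alignment trivial, assume substitution-only maps (equal lengths).

def learn_windows(pairs, k=2, boundary='#'):
    rules = {}
    for underlying, surface in pairs:
        padded = underlying + boundary
        for i, out_seg in enumerate(surface):
            window = padded[i:i + k]          # bounded window of the input
            rules[window] = out_seg           # last observation wins
    return rules

def apply_windows(rules, underlying, k=2, boundary='#'):
    padded = underlying + boundary
    return ''.join(rules.get(padded[i:i + k], padded[i])   # default: faithful
                   for i in range(len(underlying)))

# Training on final devoicing data and testing on an unseen form:
pairs = [('bad', 'bat'), ('bado', 'bado'), ('gag', 'gak')]
rules = learn_windows(pairs)
# apply_windows(rules, 'dod') -> 'dot'  (generalizes: /d/ devoices at the edge)
```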

A few other thoughts: 

(1) Heinz points out the interesting dichotomy between tone maps and segment maps, where the tone maps allow more complex relationships. He mentions that this has been used to argue for modularity (where tones are in one module and segments are in the other, presumably), and that could very well be. What it also shows is that there isn’t just one complexity restriction across the board: a tighter one holds for segment maps and a looser one for tone maps. Why? Two thoughts: (a) Maybe the looser restriction is the general abstract one, and something special happens for segments that further restricts it. This fits the modularity explanation above. But (b) maybe it’s just chance that we haven’t found segment maps that violate the tighter restriction. If so, we wouldn’t need the modularity explanation, since the difference between segment maps and tone maps would just be, in effect, a sampling error (more samples, if we had them, would show segment maps that don’t obey the extra restriction). Caveat: I’m not sure how plausible this second idea is, given how many segment maps we have access to.

(2) I’m still not sure how much faith I have in the artificial language learning experiments that are meant to show that humans can’t learn certain types of generalizations/rules/mappings. I definitely believe that the subjects struggled to learn certain ones in the experiment while finding others easy to learn. But how much of that is effectively a transfer effect from the native language, the way L1 transfer shows up in second language learning? That is, the easy-to-learn ones are the ones in your native language, so (abstractly) you already have a bunch of experience with those and no experience with the other hard-to-learn kind. To be fair, I’m not sure how you could factor out that transfer effect: no matter what you do with adults (or even kids), if it’s a language thing, they’ve already had exposure from their native language.


(3) Something for NLP applications (maybe): Section 6.4 says, “The simplest maps are Markovian on the input or the output (ISL, LOSL, and ROSL), and very many phonological transformations belong to these classes.” This makes me think that the simpler representations NLP systems tend to use for speech recognition and production (e.g., various forms of Hidden Markov Models, I think) may not be so far off from the truth, if this approach is correct. (See the toy sketch below.)
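
As a reminder to myself of what “Markovian” buys you, here’s a minimal Viterbi decoder for a toy hidden Markov model (all states, observations, and probabilities are made up for illustration; real acoustic models are far more elaborate): the best analysis at the current position depends only on the previous state and the current observation, which has the same bounded-context flavor as the ISL/OSL classes.

```python
# A minimal Viterbi decoder for a toy hidden Markov model, just to remind
# myself what the Markov assumption looks like in code. All the numbers
# and state/observation names here are invented for illustration.

states = ['C', 'V']                      # toy "consonant"/"vowel" states
start  = {'C': 0.6, 'V': 0.4}
trans  = {'C': {'C': 0.3, 'V': 0.7},     # P(next state | current state)
          'V': {'C': 0.6, 'V': 0.4}}
emit   = {'C': {'b': 0.5, 'a': 0.1, 't': 0.4},   # P(observation | state)
          'V': {'b': 0.1, 'a': 0.8, 't': 0.1}}

def viterbi(obs):
    # best[s] = (probability, path) of the best state sequence ending in s
    best = {s: (start[s] * emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_best = {}
        for s in states:
            # Markov step: only the previous state matters, not the whole history.
            prob, path = max(
                ((best[prev][0] * trans[prev][s] * emit[s][o],
                  best[prev][1] + [s]) for prev in states),
                key=lambda t: t[0])
            new_best[s] = (prob, path)
        best = new_best
    return max(best.values(), key=lambda t: t[0])

# viterbi(['b', 'a', 't']) -> (0.04032, ['C', 'V', 'C'])
```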
