Computational Models of Language (at UC Irvine): Some thoughts on Sonderegger & Niyogi (2010)

I think this paper is a really nice example of how to use real data for language change modeling, and why you would want to. I like this methodology in particular, where properties of the individual learner are explored and measured by their effects on the population dynamics. Interestingly, I think this is different from some of the other work I'm familiar with relating language acquisition and language change, since I'm not sure it restricts the learning period to the period of language acquisition, per se. In particular, the knowledge being modeled - stress patterns of lexical items, possibly based on influence from the rest of the lexicon - is something that seems like it can change after native language acquisition is over. That is, the learners here don't have to be children (which is something that Pearl & Weinberg (2007) assumed for the knowledge they looked at, and something that work by Lightfoot (1999, 2010) generally assumes). Based on some of the learning assumptions involved in this paper (e.g., probability matching when given noisy input, using the lexicon to determine the most likely stress pattern), I would say that the modeled learners probably aren't children. And that's totally fine.

The only caveat is that learning then has a little less explanatory power for the observed changes, simply because other factors may be involved (language contact, synchronic change within the adults of a population, etc.), and these other factors aren't modeled here. So, when you get the population reproducing the observed behaviors, it's true that this learning behavior on its own could be the explanatory story - but it's also possible that a different learning behavior coupled with these other factors might be the true explanatory story. I think this is inherently a problem in explanatory models of language change, though - what you provide is an existence proof of a particular theory of how change happens. So then it's up to people who don't like your particular theory to provide an alternative. ;)

More targeted thoughts:

- I was definitely intrigued by the constrained variation observed in the stress patterns of English nouns and verbs together. Ross' generalization seems to describe it well enough (primary stress for nouns is further to the left than primary stress for verbs), but that doesn't explain where this preference comes from - it certainly seems quite arbitrary. Presumably, it could be an accident of history that a bunch of the "original" nouns happened to have that pattern while the verbs didn't, and that got passed along through the generations of speakers. The authors mention something later on about how nouns appear in trochaic-biasing contexts, while verbs appear in iambic-biasing contexts (based on work by Kelly and colleagues). This again seems like the result of some process, rather than the cause of it. Maybe it has something to do with the order of verbs and their arguments? I could imagine that there's some kind of preference for binary feet where stress occurs every other syllable, and then the stress context for nouns vs. verbs comes from that (somehow)...

- The authors mention that falling frequency (rather than low frequency) seems to be the trigger for change to {1,2}. This means that something could be highly frequent, but because its frequency drops somewhat (maybe drops rapidly?), change is triggered. That seems odd to me. Instead, it seems more likely that both falling frequency and low frequency are caused by the same underlying something, and that's the something that triggers change. (Caveat: I haven't read the work the authors mentioned, so maybe it's laid out more clearly there.) However, they do restate the claim at the end of this paper, in relation to the last model they look at.

- The last model the authors explore (coupling by priors + mistransmission) is the one that does best at matching the desired behaviors, such as changing to {1,2} more often. I interpreted this model as something like the following: If enough examples are heard, the mistransmission bias encourages mis-hearing in the right direction, given the priors that come from the lexicon on overall stress patterns. However, the mistransmission also means that it goes towards that {1,2} pattern more slowly, so only higher frequencies can make it happen the way we want it to (and this is how it differs from the fourth model that just has coupling by priors).
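To make the intuition in that last bullet a bit more concrete, here is a toy sketch in Python. To be clear, this is not the authors' model: the function names, the parameters `a` and `b` (mis-hearing rates in each direction), and the Beta-style prior with `prior_mean` and `prior_strength` are all my own assumptions, picked only to illustrate probability matching under mistransmission and how a lexicon-based prior matters less as more examples are heard.

```python
def next_generation(alpha, a, b):
    """Expected rate of the innovative form after one generation.

    alpha: probability that the current generation produces the innovative form
    a: (assumed) probability a conservative token is misheard as innovative
    b: (assumed) probability an innovative token is misheard as conservative
    """
    # Proportion of tokens the learner perceives as innovative.
    heard_innovative = alpha * (1 - b) + (1 - alpha) * a
    # A probability-matching learner reproduces the proportion it heard.
    return heard_innovative


def simulate(alpha0, a, b, generations):
    """Iterate the expected update and return the whole trajectory."""
    trajectory = [alpha0]
    for _ in range(generations):
        trajectory.append(next_generation(trajectory[-1], a, b))
    return trajectory


def estimate_with_prior(k, n, prior_mean, prior_strength):
    """Learner's estimate after hearing k innovative tokens out of n.

    The prior stands in for "what the rest of the lexicon does" and is
    treated as prior_strength pseudo-observations with mean prior_mean
    (a Beta-prior posterior mean; an assumption of this sketch).
    """
    return (k + prior_mean * prior_strength) / (n + prior_strength)


if __name__ == "__main__":
    # With asymmetric mis-hearing, the population drifts toward a / (a + b).
    print(simulate(0.0, a=0.1, b=0.05, generations=200)[-1])
    # Same heard proportion (80%), but the prior pulls harder when n is small.
    print(estimate_with_prior(8, 10, prior_mean=0.2, prior_strength=10))
    print(estimate_with_prior(80, 100, prior_mean=0.2, prior_strength=10))
```

In this toy version, the more examples a learner hears, the more the (mis)heard data outweighs the lexicon-based prior, which is roughly how I read the role of frequency in the last model.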

~~~
References
~~~

Lightfoot, D. (1999). The development of language: Acquisition, change, and evolution. Oxford, England: Blackwell.

Lightfoot, D. (2010). Language acquisition and language change. Wiley Interdisciplinary Reviews: Cognitive Science, 1, 677-684. doi: 10.1002/wcs.39.

Pearl, L. & Weinberg, A. (2007). Input filtering in syntactic acquisition: Answers from language change modeling. Language Learning and Development, 3(1), 43-72.

Monday, May 28, 2012