Friday, September 30, 2011

Some thoughts on Dunbar et al. (2010)

This is probably one of the more linguistically technical articles we've read in the group to date, but I think that even if the linguistic details aren't fully accessible to someone without a linguistic background, there's still a very good, basic point made about the simplicity of abstract structures, given principles of Bayesian reasoning. On the one hand, this might seem surprising, since adding another layer of representation looks de facto more complex; on the other hand, there's something clearly simpler about having three basic units of representation instead of six (for instance).

Some more targeted thoughts:

p.7: The particular example they discuss involving phonemes (specifically, three with derivational rules vs. six with no need for derivational rules) - this reminds me of Perfors et al. (2010), where they were looking at recursion in language, also from a Bayesian perspective. In that case, the decision was between a non-recursive grammar, a partially-recursive grammar, and a fully recursive grammar. The outcome turned out to be that for different structures (subject embedding vs. object embedding), different grammars fit the data best, with one of the winners being the partially-recursive grammar. In essence, this is a "direct store + some computation" approach. For the phoneme example in Dunbar et al., it seems like the choices are between "direct store of six" vs. "store three + some computation", and the "some computation" option ends up being the best. (Related note on p.30: I agree that it would be nice to have formal theoretical debates take place at this level when discussing learnability, rather than relying on intuitions of whether computation or direct storage is more complex/costly.)
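To make the "direct store of six" vs. "store three + some computation" contrast a bit more concrete, here's a toy sketch of how a Bayesian comparison could trade off inventory size against a derivational rule. To be clear, this is not Dunbar et al.'s actual model: the token counts, the one-bit-per-stored-unit prior, and the assumption that the rule derives the extra three segments deterministically from context are all invented purely for illustration.

```python
from math import log, lgamma

def log_dirichlet_multinomial(counts, alpha=1.0):
    # Marginal log-likelihood of a particular sequence of category draws
    # under a symmetric Dirichlet(alpha) prior on the category probabilities
    # (probabilities integrated out; the multinomial coefficient is omitted
    # since both hypotheses score the same sequence of tokens).
    n = sum(counts)
    k = len(counts)
    return (lgamma(k * alpha) - lgamma(k * alpha + n)
            + sum(lgamma(c + alpha) - lgamma(alpha) for c in counts))

# Hypothetical token counts for six surface segments: three "basic" ones
# and three contextual variants of them (all numbers invented).
surface_counts = [40, 35, 30, 12, 10, 8]

# H1: store all six segments, no derivational rule.
# Toy prior: one "bit" of description length per stored unit.
log_prior_h1 = -6 * log(2)
log_lik_h1 = log_dirichlet_multinomial(surface_counts)

# H2: store three segments plus one rule that derives each variant
# deterministically from its source segment in the relevant context,
# so the model only has to predict which underlying segment occurred.
underlying_counts = [40 + 12, 35 + 10, 30 + 8]
log_prior_h2 = -(3 + 1) * log(2)   # three segments + one rule
log_lik_h2 = log_dirichlet_multinomial(underlying_counts)

log_odds = (log_prior_h2 + log_lik_h2) - (log_prior_h1 + log_lik_h1)
print(f"log posterior odds, 'three + rule' over 'six stored': {log_odds:.2f}")
```

The real models in the paper are much richer than this, but the basic shape of the argument is the same: the extra rule costs a little prior probability, and pays for itself by making the derived segments predictable rather than separately stored.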

p.9: Just a quick note about their justification for looking for a theoretically optimal solution (using the ideal learner paradigm, essentially) - I do agree that this has a place in acquisition studies. Basically, if you formulate a problem (and an accompanying hypothesis space) and then find that the problem is unsolvable by an ideal learner, that's a clue that something is not right - maybe it's the hypothesis space, maybe it's a missing learning bias on how to use the data, etc.

p.14: Another main message of the authors: "Probability theory...is simply a way...of formalizing reasoning under uncertainty." I get the impression that this is meant to persuade readers who aren't normally very fond of probability.

1 comment:

  1. Page 15 included the most succinct differentiation between the frequentist and Bayesian approaches that I've read. I feel that the authors might have been able to present more compelling results by using a less idealized model. In particular, their concluding remark that, "under a more realistic modeling scheme, we might have obtained different results, of course," left me scratching my head a bit. The notion that "model selection is almost inevitably strongly dependent on the parameterization ... and how the underlying space is parameterized" is simply a reflection of an oversimplification of the problem. Finally, I would be really interested to read more about Bayesian model selection frameworks that attempt to match developmental data from child learners.
