Friday, May 2, 2014

Some thoughts on Orita et al. 2013

There are several aspects of this paper that I really enjoyed. First, I definitely appreciate the clean and clear description of the circularity in this learning task, where you can learn about the syntax if you know the referents…and you can learn about the referents if you know the syntax (chicken and egg, check). 

I also love how hard the authors strive to ground their computational model in empirical data. Now granted, the human simulation paradigm may have its own issues (more on this below), but it’s a great way to try to get at least some approximation of the contextual knowledge children might have access to. 

I also really liked the demonstration of the utility of discourse/non-linguistic context information vs. strong syntactic prior knowledge, and how having the super-strong syntax knowledge isn’t enough. I think this is a really important point: It’s all well and good to posit detailed, innate, linguistic knowledge as a necessary component for solving an acquisition problem, but it’s important to make sure that this component actually does solve the learning problem (and to be aware of what else it might need in order to do so). This paper provides an excellent demonstration of why we need to check this…because in this case, that super-strong syntactic knowledge didn’t actually work on its own. (Side note: The authors are very aware that their model still relies on some less-strong syntactic knowledge, like the relevance of syntactic locality and c-command, but the super-strong syntactic knowledge was on top of that less-strong knowledge.)

More specific thoughts:

(1) The human simulation paradigm (HSP): 
In some sense, this task strikes me as similar to ideal learner computational models — we want to see what information is useful in the available input. For the HSP, we do this by seeing what a learner with adult-level cognitive resources can extract. For ideal learners, we do this by seeing what inferences a learner with unlimited computational resources can make, based on the information available. 

On the other hand, there’s definitely a sense in which the HSP is not really an ideal learner parallel. First, adult-level processing resources are not the same as unlimited processing resources (they’re just better than child-level processing resources). Second, the issue with adults is that they have a bunch of knowledge to build on about how to extract information from both linguistic and non-linguistic context…and that puts constraints on how they process the available information that children might not have. In effect, the adults may have biases that cause them to perceive the information differently, and this may actually be sub-optimal when compared to children (we don’t really know for sure…but it’s definitely different from how children perceive it).

Something specific to this particular HSP task is that the stated goal is to “determine whether conversational context provides sufficient information for adults” to guess the intended referent. But where does the knowledge about how to use the conversational context to interpret the blanked-out NP (as either reflexive, non-reflexive, or lexical) come from? Presumably from adults’ prior experience with how these NPs are typically used. This isn’t something we think children would have access to, though, right? So this is a very specific case of that second issue above, where it’s not clear that the information adults extract is a fair representation of the information children extract, due to prior knowledge that adults have about the language.

Now to be fair, the authors are very aware of this (they have a nice discussion about it in the Experiment 1 discussion section), so again, this is about trying to get some kind of empirical estimate to base their computational model’s priors on. And maybe in the future we can come up with a better way to get this information. For example, it occurs to me that the non-linguistic context (i.e., environment, visual scene info) might be usable. If the caretaker has just bumped her knee, saying “Oops, I hurt myself” is more likely than “Oops, I hurt you”. It may be that the conversational context approximated this to some extent for adults, but I wonder if this kind of thing could be extracted from the video samples we have on CHILDES. What you’d want is a variant of the HSP where you show the video clip with the NP beeped out, so the non-linguistic context is available, along with the discourse information in the preceding and subsequent utterances.
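
To make the knee example a bit more concrete, here is a toy sketch of how scene information could act as a prior over who the blanked-out NP refers to. This is only an illustration of Bayes' rule, not the model from the paper: the function name, the two candidate referents, and every probability below are made up for the example.

```python
# Toy illustration only: all names and numbers here are hypothetical,
# not taken from Orita et al. (2013).

def posterior_over_referents(context_prior, likelihood):
    """Bayes' rule: combine a context-based prior over referents with the
    likelihood of the observed utterance frame under each referent."""
    unnormalized = {r: context_prior[r] * likelihood[r] for r in context_prior}
    total = sum(unnormalized.values())
    return {r: p / total for r, p in unnormalized.items()}

# Scene: the caretaker has just bumped her knee and says "Oops, I hurt ___".
# Assumed prior from the visual scene about who got hurt.
scene_prior = {"caretaker": 0.8, "child": 0.2}
# Assumed prior when no scene information is available.
flat_prior = {"caretaker": 0.5, "child": 0.5}
# Assume the frame "I hurt ___" is equally compatible with both referents.
frame_likelihood = {"caretaker": 0.5, "child": 0.5}

# With the speaker "I" as subject, each referent maps onto a different NP form.
np_form = {"caretaker": "myself", "child": "you"}

with_scene = posterior_over_referents(scene_prior, frame_likelihood)
without_scene = posterior_over_referents(flat_prior, frame_likelihood)

for label, post in [("with scene", with_scene), ("without scene", without_scene)]:
    best = max(post, key=post.get)
    print(f"{label}: P(caretaker) = {post['caretaker']:.2f}, best guess = {np_form[best]!r}")
```

Since the frame is assumed to be uninformative here, all of the preference for “myself” comes from the scene prior, which is exactly the kind of information a video-based HSP variant would be trying to estimate.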

(2) Figure 2: Though I’m fairly familiar with Bayesian models by now, I admit that I loved having text next to each level reminding me what each variable corresponded to. Yay, authors.

(3) General discussion point at the end about unambiguous data: This is a really excellent point, since we don’t like to have to rely on the presence of unambiguous data too much in real life (because typically when we go look for it in realistic input, it’s only very rarely there). Something I’d be interested in is how often unambiguous data for this pronoun categorization issue does actually occur. If it’s never (or almost never, relatively speaking), then this becomes a very nice selling point for this learning model.
