Tuesday, October 4, 2022

Some thoughts on Cao et al. 2022

I really like seeing modeling work like this, where a more complex, ideal computation (here, expected information gain, or EIG) can be well-approximated by simpler, more heuristic computations (here, surprisal and KL divergence) when it comes to capturing developmental behavior. Of course, this paper presents a first-pass evaluation on adult behavior, but as the authors note, future work can extend that evaluation to infant looking behavior. I definitely would like to see how well this approach works for infant data, since I'd be surprised if there weren't some immaturity (e.g., resource constraints or other biases) at work in the computation itself for infants, compared with adult decision-making. And then the interesting question is how to capture that immaturity: for instance, do the approximations of the computation fit even better than the idealized EIG computation? Would even simpler heuristics that approximate EIG less well, but are backward-looking rather than forward-looking, do better still?
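To make the backward-looking vs. forward-looking contrast concrete, here's a minimal sketch using a toy Beta-Bernoulli learner (my own example, not the authors' implementation; the function names are made up). Surprisal and KL are computed from the sample just seen, while EIG averages the KL of each possible next sample under the current predictive distribution:

```python
import numpy as np
from scipy.special import betaln, digamma

def kl_beta(a1, b1, a2, b2):
    """KL( Beta(a1,b1) || Beta(a2,b2) ), in nats."""
    return (betaln(a2, b2) - betaln(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 + b2 - a1 - b1) * digamma(a1 + b1))

def surprisal(x, a, b):
    """Backward-looking: -log predictive probability of the outcome x just seen."""
    p_heads = a / (a + b)
    return -np.log(p_heads if x == 1 else 1 - p_heads)

def kl_update(x, a, b):
    """Backward-looking: KL from the posterior after seeing x to the prior."""
    return kl_beta(a + x, b + (1 - x), a, b)

def eig(a, b):
    """Forward-looking: predictive-weighted KL over possible next outcomes."""
    p_heads = a / (a + b)
    return p_heads * kl_update(1, a, b) + (1 - p_heads) * kl_update(0, a, b)

# Watch all three signals as evidence accumulates for a mostly-heads coin.
a, b = 1.0, 1.0  # uniform prior
for x in [1, 1, 0, 1, 1, 1]:
    print(f"x={x}  surprisal={surprisal(x, a, b):.3f}  "
          f"KL={kl_update(x, a, b):.3f}  EIG={eig(a, b):.3f}")
    a, b = a + x, b + (1 - x)  # conjugate Bayesian update
```

In this toy setting, all three signals tend to shrink together as the belief sharpens, which gives some intuition for why the cheaper backward-looking signals can stand in for EIG.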


Other specific thoughts:


(1) Noisy perception: It’s really nice to see this worked into a developmental model, since imperfect representations of stimuli seem like a plausible situation, especially for infants. That is, the “perceptual intake” into the learning system depends on immature knowledge and abilities, and is therefore different from the input signal that’s out there in the world. (To be fair, the perceptual intake for adults is also different from the input signal out there in the world, even though adults don’t have immature knowledge and abilities. So children basically have to learn to be adult-like in how they “skew” the input signal.)
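As a toy illustration of the input vs. intake distinction (my framing, not the paper's; the noise values are invented), perceptual intake can be treated as a noisy sample of the external stimulus, with noise that shrinks as perception matures:

```python
import numpy as np

rng = np.random.default_rng(0)

def perceptual_intake(stimulus, noise_sd):
    """One noisy sample of the stimulus; intake differs from input whenever noise_sd > 0."""
    return stimulus + rng.normal(0.0, noise_sd, size=np.shape(stimulus))

stimulus = np.array([1.0, 0.0, 1.0])  # the input signal "out there in the world"
infant_intake = perceptual_intake(stimulus, noise_sd=0.5)  # immature: high noise
adult_intake = perceptual_intake(stimulus, noise_sd=0.1)   # mature: low noise
print(infant_intake, adult_intake)
```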


(2) The RANCH model involves accumulating noisy samples and choosing what to do at each moment. To me, this sounds like the drift-diffusion model of decision-making from mathematical psych. I wonder whether RANCH is effectively an implementation of that model (and if not, how the two differ)?
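For reference, here's a minimal sketch of the classic drift-diffusion process (the standard textbook formulation, not RANCH; the parameter values are arbitrary): evidence accumulates noisily over time until it crosses one of two decision boundaries.

```python
import numpy as np

def drift_diffusion(drift=0.1, noise_sd=1.0, boundary=2.0, dt=0.01, max_t=10.0, rng=None):
    """Return (choice, decision_time): +1/-1 for whichever boundary is hit first."""
    rng = rng or np.random.default_rng()
    x, t = 0.0, 0.0
    while t < max_t:
        # each step adds signal (drift) plus noise, scaled for the step size dt
        x += drift * dt + noise_sd * np.sqrt(dt) * rng.normal()
        t += dt
        if abs(x) >= boundary:
            return (1 if x > 0 else -1), t
    return None, max_t  # no decision within the time limit

print([drift_diffusion(rng=np.random.default_rng(i)) for i in range(5)])
```

The structural resemblance is the moment-by-moment accumulation of noisy evidence; whether RANCH's moment-by-moment choice (keep sampling vs. stop) maps onto boundary-crossing like this is exactly the question above.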


(3) What the learner needs to know: A key idea here is that the learner’s motivation to sample the input at all comes from knowing that perception is noisy. To me, that’s pretty reasonable knowledge to build into a modeled child. It reminds me of Perkins et al. (2022), where the modeled learner knows misperception occurs and so has to learn to filter out erroneous data. Importantly, the learner there doesn’t have to know the specifics beyond that.
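Here's a heavily simplified sketch in the spirit of that filtering idea (my construction, not Perkins et al.'s actual model; all names and numbers are invented). The learner knows errors happen but not how often, and jointly infers the error rates and each verb's transitivity from corrupted direct-object frame counts via EM:

```python
import numpy as np
from scipy.stats import binom

def learn_to_filter(do_counts, totals, n_iters=100):
    """EM over a two-class binomial mixture: returns (eps, delta, P(transitive) per verb)."""
    eps, delta, pi = 0.1, 0.1, 0.5  # initial guesses: error rates, P(verb is transitive)
    for _ in range(n_iters):
        # E-step: posterior probability that each verb is transitive
        like_t = binom.pmf(do_counts, totals, 1 - delta) * pi
        like_i = binom.pmf(do_counts, totals, eps) * (1 - pi)
        w = like_t / (like_t + like_i)
        # M-step: re-estimate error rates and the mixing weight
        pi = w.mean()
        delta = 1 - np.sum(w * do_counts) / np.sum(w * totals)
        eps = np.sum((1 - w) * do_counts) / np.sum((1 - w) * totals)
    return eps, delta, w

# Simulate 20 verbs (about half transitive), 50 tokens each: transitive verbs
# lose their object 15% of the time, intransitives spuriously gain one 5%.
rng = np.random.default_rng(1)
transitive = rng.random(20) < 0.5
totals = np.full(20, 50)
do_counts = rng.binomial(totals, np.where(transitive, 1 - 0.15, 0.05))
eps, delta, w = learn_to_filter(do_counts, totals)
print(f"eps ~ {eps:.2f}, delta ~ {delta:.2f}")  # recovers roughly .05 and .15
```

The point of the sketch is just that knowing "errors exist," plus a bit of distributional structure, can be enough to learn which data to filter, without knowing the error rates in advance.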


Perkins, L., Feldman, N. H., & Lidz, J. (2022). The Power of Ignoring: Filtering Input for Argument Structure Acquisition. Cognitive Science, 46(1), e13080.
