This work strikes me as a nice demonstration of the Rational Speech Act (RSA) framework, extended to allow multiple dimensions of communicative goals (in this case, true state of the world vs. affective content vs. both). Beyond the formalization of these components in the RSA model, the key seems to be that the listener must know that both communicative goals are possible. This got me thinking about how to apply the RSA model to child language processing. For example, would a model whose sole communicative goal is the true state of the world match children’s interpretations better at a certain point in development? It seems possible. We could then track development by when the second communicative goal starts to be taken into consideration (i.e., when the RSA+affect model fits the behavioral data better than the basic RSA model), and potentially by how much weight it’s given a priori.
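To make the comparison idea concrete, here is a minimal sketch of how one might score two candidate models against response counts from one age group. Everything in it is an assumption for illustration: the response categories, the predicted probabilities, and the counts are invented placeholders, not data or predictions from the paper.

```python
import math

def log_likelihood(counts, predicted):
    """Multinomial log-likelihood (up to an additive constant) of the
    observed response counts under a model's predicted probabilities."""
    return sum(n * math.log(predicted[k]) for k, n in counts.items() if n > 0)

# Placeholder numbers, invented purely for illustration; not real data.
child_counts = {"literal": 14, "hyperbolic": 6}

# Hypothetical predictions from the two candidate models.
basic_rsa_pred = {"literal": 0.8, "hyperbolic": 0.2}
affect_rsa_pred = {"literal": 0.4, "hyperbolic": 0.6}

ll_basic = log_likelihood(child_counts, basic_rsa_pred)
ll_affect = log_likelihood(child_counts, affect_rsa_pred)

# The model with the higher log-likelihood fits this group better.
better = "basic RSA" if ll_basic > ll_affect else "RSA + affect"
print(better, round(ll_basic, 2), round(ll_affect, 2))
```

Repeating this per age group would show where the better-fitting model switches. If the two models have different numbers of free parameters (e.g., a fitted prior weight on the affect goal), a penalized criterion would presumably be needed rather than raw log-likelihood.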
A related thought occurred to me as I was reading the implementation details of the RSA model. The basic framework is that you have a listener, and the listener assumes the speaker generated the utterance by keeping in mind how a literal listener would interpret it. This clearly involves some pretty sophisticated theory of mind (ToM). So, similar to the above, could we track children’s development by how well this model fits their behavior vs. a model where the listener assumes a speaker who deviates from this in some way (e.g., a speaker who has the same knowledge as the listener, rather than a speaker who thinks about how a naive literal listener will interpret the utterance)? To be honest, I really don’t know how to cash this out exactly, but the intuition feels right to me. Kids may have various kinds of ToM abilities early, but the ToM required in this model seems pretty sophisticated. So maybe kids have a limited ToM to begin with, and that plays out differently from how the model is currently set up. Then we could compare the ToM-limited model vs. the model given here against children’s behavior, and see which fits better at each stage of development.
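For reference, the nested reasoning described above can be sketched as follows. This is a minimal sketch of the basic RSA recursion only (literal listener, speaker, pragmatic listener), not the paper's full model with affect and goals. The toy states, utterances, truth conditions, uniform prior, and the rationality parameter ALPHA are all assumptions chosen for illustration.

```python
# Hypothetical toy domain (not the paper's price domain).
STATES = ["some_not_all", "all"]
UTTERANCES = ["some", "all"]

# Literal truth conditions: "some" is true in both states, "all" only in "all".
MEANING = {
    ("some", "some_not_all"): 1.0, ("some", "all"): 1.0,
    ("all", "some_not_all"): 0.0, ("all", "all"): 1.0,
}
PRIOR = {"some_not_all": 0.5, "all": 0.5}  # assumed uniform state prior
ALPHA = 1.0                                # assumed speaker rationality

def normalize(scores):
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}

def literal_listener(u):
    """L0(s | u): prior restricted to states where u is literally true."""
    return normalize({s: MEANING[(u, s)] * PRIOR[s] for s in STATES})

def speaker(s):
    """S1(u | s): prefers utterances that lead L0 to the intended state."""
    return normalize({u: literal_listener(u)[s] ** ALPHA for u in UTTERANCES})

def pragmatic_listener(u):
    """L1(s | u): inverts the speaker model via Bayes' rule."""
    return normalize({s: speaker(s)[u] * PRIOR[s] for s in STATES})

print(pragmatic_listener("some"))
```

With these toy settings, the pragmatic listener hearing "some" puts probability 0.75 on "some_not_all", whereas the literal listener stays at 0.5. Any ToM-limited variant would presumably change what `pragmatic_listener` assumes about `speaker`; how exactly to do that is the open question.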
Some additional comments:
(1) Looking at Figure 2B, it seems like humans (far right panel) still show a bit more of the literal interpretation bias (a larger spike at exactly 1000 for “costs $1,000”) and a bit more of the imprecise goal bias (a larger spike at 1001) than the full model does (next panel to the left). I wonder if this separates out by individuals. I could imagine some people being more literal than others (maybe due to natural variation, or because of an Asperger Syndrome type condition).
(2) Related to the above, the imprecise goal seems to be another communicative dimension, but it’s not talked about that way. Instead, we have “truth” vs. “affect”, and then imprecise goal gets folded into affect. I wonder why — perhaps because “imprecise goal” is a way to signal “this is not the truth”? If so, that would require fairly sophisticated communicative knowledge. On the other hand, Kao et al. (2014) treat it as completely separate in the Materials and Methods section — precision of goal (precise vs. imprecise) is fully crossed with communicative goal (truth vs. affect vs. both). So, it does start to feel like an additional communicative dimension.
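To see what "fully crossed" amounts to, here is a minimal sketch that enumerates the goal space and applies a goal as a projection of the speaker's meaning. The function name `project`, the tuple representation, and the rounding base of 10 for the imprecise goal are my own assumptions for illustration, not details confirmed by the paper as described here.

```python
from itertools import product

PRECISION = ["precise", "imprecise"]
GOAL = ["truth", "affect", "both"]

# Fully crossed design: 2 precision levels x 3 communicative goals = 6 goals.
GOAL_SPACE = list(product(PRECISION, GOAL))

def project(price, affect, precision, goal, base=10):
    """Return the part of (price, affect) the speaker cares about.

    Under an imprecise goal the price is rounded to a multiple of `base`
    (the base is an assumed value). Note that Python's round() sends exact
    halves to the nearest even integer.
    """
    if precision == "imprecise":
        price = base * round(price / base)
    if goal == "truth":
        return (price,)
    if goal == "affect":
        return (affect,)
    return (price, affect)

for precision, goal in GOAL_SPACE:
    print(precision, goal, project(1001, 1, precision, goal))
```

Written this way, precision is an argument on equal footing with the truth/affect goal, which is what makes it feel like its own communicative dimension.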