Wednesday, October 27, 2021

Some thoughts on Tal et al. 2021

This seemed to me like a straightforward application of a measure of redundancy (at whatever level of representation you like) to quantify redundancy in child-directed speech over developmental time. As T&al2021 note, the idea of repetition and redundancy in child-directed speech isn’t new, but this way of measuring it is, and the results certainly accord with current wisdom that (i) repetition in speech is helpful for young children, and (ii) repetition decreases as children get older (and the speech directed at them gets more adult-like). The contributions therefore also seem pretty straightforward: a new, more holistic measure of repetition/redundancy at the lexical level, and the finding that multi-word utterances seem to be what gets repeated less as children get older.


Some other thoughts:

(1) Corpus analysis: For the Providence corpus, with such large samples, I wonder why T&al2021 chose to make only two age bins (12-24 months, and 24-36 months). It seems like there would be enough data there to go finer-grained (like maybe every two months: 12-14, 14-16, etc), and especially zoom in on the gaps in the NewmanRatner corpus between 12 and 24 months.


(2) I had some confusion over the discussion of the NewmanRatner results, specifically the entropy decrease T&al2021 found with the shuffled word order of Study 2. Their explanation for the entropy decrease, I think, was that lexical diversity didn’t increase in this sample as children got older. But I didn’t quite follow why this explains an entropy decrease. More specifically, if lexical diversity stays the same, then shuffling the word order keeps the same frequencies of individual words over time, so there should be no change in entropy at the lexical level. At the same time, shuffling destroys the multi-word sequences, which should increase entropy. How does no change + an entropy increase lead to an overall entropy decrease?


Relatedly, T&al2021 say about Study 2 that “the opposite tendencies of lexical- and multi-word repetitiveness in this corpus seem to cancel each other out at 11 months”. This relates to my confusion above. Basically, we have constant lexical diversity, so there’s no change to entropy over time coming from the lexical level. Decreasing multi-word repetition leads to higher entropy over time. So what are the opposite tendencies here? It seems like there’s only one tendency (increasing entropy from the loss of multi-word repetitions).
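To make my intuition here concrete, a toy sketch (my own construction, not T&al2021’s actual measure): shuffling word order leaves lexical (unigram) entropy untouched, because the word counts are identical, while it should raise multi-word (bigram) entropy whenever the original speech had repeated multi-word chunks.

```python
import random
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (bits) of the empirical distribution over items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

def bigrams(tokens):
    return list(zip(tokens, tokens[1:]))

# A toy "caregiver" sample with heavy multi-word repetition.
corpus = ["look", "at", "the", "doggy"] * 10

orig_uni = entropy(corpus)            # unigram (lexical-level) entropy
orig_bi = entropy(bigrams(corpus))    # bigram (multi-word) entropy

random.seed(0)
shuffled_bi = []
for _ in range(500):
    sample = corpus[:]
    random.shuffle(sample)
    # Unigram entropy is invariant under shuffling (same word counts).
    assert abs(entropy(sample) - orig_uni) < 1e-9
    shuffled_bi.append(entropy(bigrams(sample)))

mean_shuffled_bi = sum(shuffled_bi) / len(shuffled_bi)
# Bigram entropy rises once the repeated multi-word chunks are destroyed.
print(orig_bi, mean_shuffled_bi)
```

So on this toy picture, lexical diversity staying flat contributes no entropy change, and losing multi-word repetition only pushes entropy up, which is exactly why I don’t see where a decrease comes from.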


Thursday, October 14, 2021

Some thoughts on Harmon et al. 2021

I think it’s a testament to the model description that the simulations seemed almost unnecessary to me -- they turned out exactly as I expected, given what the model is described as trying to do, based on the frequency of novel types. I also really love seeing modeling work of this kind used to investigate developmental language disorders -- I feel like there’s just not as much of this kind of work out there, and the atypical development community really benefits from it. That said, I do think the paper suffers a bit from length limitations. I definitely had points of confusion about what was going on conceptually (more on this below).


(1) Production probability: The inference problem is described as trying to identify the “production probability”, but it took me a while to figure out what this might be referring to. For instance, does “production probability” refer to the probability that this item will take some kind of morphology (i.e., be “productive”) vs. not in a given moment? If an item has a production probability of, say, .5, does that mean the item is actually “fully” productive, but that productivity is only accessed 50% of the time (so the 50% we see in the output would be a deployment issue)? Or does it mean that only 50% of the inflections that should be used with that item are actually used (e.g., -ed but not -ing)? (That seems more like a representation issue.) Or does “production probability” mean something else?


I guess here, if H&al2021 are focusing on just one morpheme, it would be the deployment option, since that morpheme is either used or not. Later on, H&al2021 talk about this probability as “the probability for the inflection”, which does make me think it’s how often one inflection applies, which also aligns with the deployment option. Even later, when talking about the Pitman-Yor process, it seems like H&al2021 are talking about the probability assigned to the fragment that incorporates the inflection directly. So, this corresponds to how often that fragment gets deployed, I think.
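For reference on the deployment reading, here’s a generic Pitman-Yor predictive-probability sketch (this is just the standard PY math, not H&al2021’s actual fragment grammar, and the fragment names are made up), showing how a fragment’s probability of being deployed would fall out of its count:

```python
def pitman_yor_predictive(counts, d=0.5, theta=1.0):
    """Predictive probabilities under a Pitman-Yor process with discount d
    and concentration theta. counts maps each seen fragment to its count.
    Returns (probability of reusing each seen fragment, probability of
    generating a brand-new fragment)."""
    n = sum(counts.values())  # total fragment uses so far
    k = len(counts)           # number of distinct fragments so far
    seen = {frag: (c - d) / (n + theta) for frag, c in counts.items()}
    new = (theta + d * k) / (n + theta)
    return seen, new

# Hypothetical counts: a productive STEM+ed fragment vs. a stored whole-word chunk.
seen, p_new = pitman_yor_predictive({"STEM+ed": 5, "hugged": 3})
print(seen, p_new)  # the probabilities sum to 1
```

On this reading, “the probability for the inflection” would just be the predictive probability of the fragment that carries the inflection, i.e., how often that fragment gets deployed.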


(2) Competition: H&al2021 start a train of thought with “if competition is too difficult to resolve on the fly”. I don’t think I understand what “competition” means in this case. That is, what does it mean not to resolve the competition? I thought what was going on was that if the production probability is too low, the competition is lost (resolved) in favor of the non-inflected form. But this description makes it sound like the competition is a separate process (maybe among all the possible inflected forms?), and if that “doesn’t resolve”, then the inflected form loses to another option (which would be a kind of compensation).


(3) In the description of the Procedural Deficit Hypothesis, DLD kids are said to “produce an unproductive rule”: I don’t think I follow what this means -- is it that these kids produce a form that should be unproductive, like “thank” for think-past tense? This doesn’t seem to align with “memorization using the declarative memory system”, unless these kids are hearing “thank” as think-past tense in their input (which seems unlikely). Maybe this was a typo for “produce an uninflected form”?


(4) The proposed account of H&al2021 is that children are trying to access appropriate semantics, and not just the appropriate form (i.e., they prioritize meaning); so, this is why bare forms win out.  This makes intuitive sense to me from a bottleneck standpoint. If you want to get your message across, you prioritize content over form. This is what little typically-developing kids do, too, during telegraphic speech.


(5) Potentially related work on productivity: I’m honestly surprised there’s no mention of Yang’s work on productivity here -- he has a whole book of work on it (Yang 2016), and his approach focuses on specifying how many types are necessary for a rule to be productive, which seems relevant here.

 

Yang, C. (2016). The price of linguistic productivity: How children learn to break the rules of language. MIT Press.
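For concreteness, the core of Yang’s proposal (my summary of Yang 2016) is the Tolerance Principle: a rule applying to N item types stays productive only if the number of exceptions is at most N/ln N. A quick sketch:

```python
from math import log

def tolerance_threshold(n_types):
    """Yang's (2016) Tolerance Principle threshold: a rule over N item
    types tolerates at most N / ln N exceptions and remains productive."""
    return n_types / log(n_types)

def is_productive(n_types, n_exceptions):
    """True if the exception count is within the tolerated threshold."""
    return n_exceptions <= tolerance_threshold(n_types)

print(tolerance_threshold(120))   # a 120-type rule tolerates ~25 exceptions
print(is_productive(120, 20))     # within threshold: productive
print(is_productive(120, 40))     # over threshold: unproductive
```

Note the threshold is about type counts, not token frequencies, which seems directly relevant to H&al2021’s point that novel types drive the productivity inference.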


(6) During inference, the modeled learner is given parsed input and has to infer fragments: So the assumption is that the DLD child perceived the form and the inflection correctly in the input, but the issue is retrieving that form and inflection during production. I guess this is because DLD kids comprehend morphology just fine, but struggle with production?


(7) Results: “the results of t tests showed that in all models, the probability of producing wug was higher than wugged...due to the high frequency of the base form”: Was this true even for the TD (typically developing child) model? If so, isn’t that the opposite of what we want to see, because TD children pass the wug test?


Also, were these the only two alternatives available, or were other inflectional options on the table too? 


Also, is it that the modeled child just picked the one with the highest probability? 


Are the only options available the chunked inflections (including the null of the bare form), or are fragments that just have STEM + INFLECTION (without specifying the inflection) also possible? If so, how can we tell that option from the STEM + null of the bare form in practice? Both would result in the bare form, I would think.


(8) In the discussion, processing difficulties are said to skew the intake toward fewer novel types, which are crucial for inferring productivity. So this means kids don’t infer a high enough probability for the productive fragment, as it were; I guess this doesn’t affect their comprehension, because they can still use the less efficient fragments to parse the input (though maybe not parse it as fast). So maybe this is a more specific hypothesis about the “processing difficulties” that cause them not to parse novel types in the input that well?


(9) Discussion, “past tense rule in the DLD models was not entirely unproductive”: Is this because the fragment probability wasn’t 0? Or, how low does it have to be to be considered unproductive? This brings me back to Yang’s work, where there’s a specific threshold: below it, the rule is unproductive. And that threshold can actually be pretty high when the number of types is small (even above 50% of the types).


(10) Discussion, the qualitative pattern match with TD kids is higher than with DLD kids: I get that qualitative pattern matching is important and useful when talking about child behavior, but 90-95% production vs. 30-60% production looks pretty different in Figure 3. I guess Figure 3 is in log space, and who knows what other linking components are involved. But still, I feel like it would have been rhetorically more effective to talk about higher vs. lower usage than to give the actual percentages here.


(11) Discussion, “possible that experience with fewer verb types in the past tense, especially with higher frequency, biases children with DLD to store a large number of inflected verbs as a single unit (stem plus inflection) compared to TD children, further undermining productivity”: This description makes it sound like storing STEM + inflection directly isn’t productive. But I thought that was the productive fragment we wanted. Or was this meant as a particular stem + inflection, like hug + ed?