Tuesday, February 19, 2019

Some thoughts on Tessler & Franke 2018

This is a great example of theoretically motivated computational modeling coupled with behavioral experiments, here in the realm of negated antonyms (e.g., "not unhappy"). My main qualm is with the paper length: there’s a lot of interesting stuff going on, and we just don’t get the space to see it fully discussed (more specifics on this below). This of course isn’t the authors’ fault; it just highlights the difficulty of explaining work like this in the space you normally get for conference proceedings.

Specific comments:
(1) The case study here with negated antonyms (which involve double negations like “not unhappy”) seems very relevant for sentiment analysis, where we still struggle to deal precisely with negated expressions. So, more generally, this is a particular case where I can see the NLP community paying closer attention and taking inspiration from cognitive work. For example, based on the results here for single utterances ("unhappy" = "not happy"), the antonym dictionary approach to negation (where "not happy" = "unhappy" or "sad") may not be a bad move in non-contrastive utterances.
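
To make the antonym dictionary idea concrete, here is a minimal sketch. The lexicon `ANTONYMS` and the function `rewrite_negation` are made up for illustration; they are not from the paper or from any real sentiment analysis toolkit.

```python
# Hypothetical toy lexicon, purely illustrative (not a real sentiment resource).
ANTONYMS = {
    "happy": "unhappy",
    "unhappy": "happy",
    "tall": "short",
    "short": "tall",
}


def rewrite_negation(tokens):
    """Replace each 'not X' pair with X's antonym when X is in the toy lexicon."""
    out = []
    i = 0
    while i < len(tokens):
        has_next = i + 1 < len(tokens)
        if tokens[i] == "not" and has_next and tokens[i + 1] in ANTONYMS:
            # Collapse the two-token negation into a single antonym token.
            out.append(ANTONYMS[tokens[i + 1]])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


print(rewrite_negation(["she", "is", "not", "happy"]))
print(rewrite_negation(["she", "is", "not", "unhappy"]))
```

Note that this rule also rewrites "not unhappy" to "happy", which is exactly the collapse that seems risky once utterances are contrasted with each other.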

(2) I love the clear-cut hypothesis space, and the building blocks of contrary (tall vs. short) vs. contradictory (even vs. odd) adjectives. My own sense is that my prior experience consists mostly of contrary adjectives, but I wonder if that’s true. (Helloooo, corpus analysis. Also, what do we know about children’s development of these types of fine semantic distinctions?)

(3) I wish there had been a bit more space to explain why we see the modeling results we do. For the full uncertain negation, we get some mileage from a single utterance because it’s unnecessarily costly to say “not unhappy” unless it had a different meaning from "happy", which makes sense. When there are multiple utterances, we see a complete separation of all four options because...there are four different individuals who presumably have different states (or else why use different expressions)?

For the more restricted hypothesis of bona fide contraries that connects morphological negation explicitly to an opposite valence, we see separation for both single and multiple utterances, but much more so for the multiple utterances. This is definitely a case of a more restricted hypothesis yielding stronger generalizations from ambiguous data, but I don’t quite see how we’re getting it. Certainly, “not unhappy” is more costly to produce than “happy”, so we get separation between those two terms, just as with the full uncertain negation hypothesis. But why, in the single utterance case, do we also get separation between “unhappy” and “not happy”?

For the most restricted hypothesis of logical negation, I get why we never get any separation: by definition, “unhappy” = “not happy” = not(happy), and so “not unhappy” = not(not(happy)) = “happy”.
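
The collapse under logical negation can be checked mechanically. This is only a sketch under my own assumptions: "happy" is treated as a threshold predicate over a degree between 0 and 1, and the threshold `theta` and the helper `neg` are illustrative names, not anything taken from the paper's model.

```python
def happy(degree, theta=0.5):
    """Assumed threshold semantics: 'happy' holds when degree exceeds theta."""
    return degree > theta


def neg(pred):
    """Logical negation: flip the truth value of a predicate."""
    return lambda degree: not pred(degree)


# Under the logical negation hypothesis, both kinds of negation are plain NOT.
unhappy = neg(happy)        # morphological negation
not_happy = neg(happy)      # syntactic negation
not_unhappy = neg(unhappy)  # double negation

degrees = [i / 10 for i in range(11)]

# "unhappy" and "not happy" agree at every degree...
print(all(unhappy(d) == not_happy(d) for d in degrees))
# ...and "not unhappy" agrees with "happy" at every degree.
print(all(not_unhappy(d) == happy(d) for d in degrees))
```

With identical truth conditions, the only thing left to tell the expressions apart is production cost, and on this hypothesis that isn't enough to pull their meanings apart.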
