Computational Models of Language (at UC Irvine): Johnson & Goldwater (2009): Some comments

Tuesday, August 31, 2010

Johnson & Goldwater (2009): Some comments

I admit I found this paper a bit tougher going than the previous one, most likely because of how much information the authors had to fit into a limited space. Anyway, once I wrapped my head around what it meant for something to be "adapted", things started to make more sense.

Some more targeted thoughts:

(1) Given our discussion last time about the syllable as a likely basic unit of representation (given neurological evidence), we had talked about implementing learning models that take the syllable as the basic unit. How similar is what Johnson & Goldwater have done here with their collocation-syllable adaptor grammar to this idea? Clearly, the syllable is one unit of representation that matters in this model, but they also go below the syllable level to include properties of syllables that correspond (roughly) to phonotactic constraints on syllable-hood. Does this mean a learner would have to be able to analyze individual phonemes in order to use this model? If so, what happens if we get rid of any representation below the syllable-level? Is there any place for phonotactic constraints then?
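To make concrete what "going below the syllable level" asks of the learner, here is a toy sketch (mine, not J&G's grammar) that splits a syllable into onset, nucleus, and coda. The vowel inventory and the `split_syllable` helper are assumptions for illustration only. The point is just that the function needs to look at individual phonemes, which a learner with purely atomic syllables could not do.

```python
# Toy phoneme inventory: an assumption for illustration, not from the paper.
VOWELS = {"a", "e", "i", "o", "u"}

def split_syllable(phonemes):
    """Split a syllable (a list of phoneme symbols) into (onset, nucleus, coda)."""
    # The onset is everything before the first vowel.
    first_v = next((i for i, p in enumerate(phonemes) if p in VOWELS), None)
    if first_v is None:
        raise ValueError("syllable has no vowel")
    # The nucleus is the run of vowels starting at the first vowel.
    last_v = first_v
    while last_v + 1 < len(phonemes) and phonemes[last_v + 1] in VOWELS:
        last_v += 1
    # The coda is whatever follows the nucleus.
    return (phonemes[:first_v],
            phonemes[first_v:last_v + 1],
            phonemes[last_v + 1:])

onset, nucleus, coda = split_syllable(["s", "t", "o", "p"])
print(onset, nucleus, coda)
```

If syllables were unanalyzable units, a generalization like "st is a fine onset" could only be stored syllable by syllable, which seems to be where phonotactic constraints would get lost.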

(2) I'd like to look more closely at Table 1 to try to understand what benefits the learner. Because there are so many conditions, it's a bit hard to pick apart the impact of any one condition. For example, J&G argue that table label resampling improves performance for the models with rich hierarchical structure (like the collocation-syllable model), and point to Figure 1 to show this. But looking at the 3rd and 4th entries from the bottom of Table 1, it seems like performance worsens with table label resampling.

(3) The idea of maximum marginal decoding is interesting to me, because it reminds me of the difference between "weakly equivalent" grammars and "strongly equivalent" grammars. Weakly equivalent = output is the same, even if internal structure isn't; strongly equivalent = output is the same and internal structure is the same. It seems here that aggregating "weakly equivalent" word segmentations (that is, counting sampled analyses together when they yield the same segmentation, whatever their internal structure) leads to better performance.
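Here is a minimal sketch of that aggregation idea as I understand it, not J&G's actual implementation. I'm assuming each sample for an utterance is a (segmentation, parse) pair, where the parse is just some label standing in for the internal structure. The function names and the toy samples are made up for illustration.

```python
from collections import Counter

def max_marginal_segmentation(samples):
    """Pick the segmentation that occurs most often, ignoring internal structure.

    samples: list of (segmentation, parse) pairs for ONE utterance, where
    segmentation is a tuple of words and parse is any hashable label.
    """
    counts = Counter(seg for seg, _parse in samples)
    best_seg, _count = counts.most_common(1)[0]
    return best_seg

def most_frequent_full_analysis(samples):
    """Contrast case: pick the single most frequent (segmentation, parse) pair."""
    counts = Counter(samples)
    (best_seg, _parse), _count = counts.most_common(1)[0]
    return best_seg

# Hypothetical samples: three different parses agree on "the dog",
# while one parse for "thedog" happens to be sampled twice.
samples = [
    (("the", "dog"), "parse-A"),
    (("the", "dog"), "parse-B"),
    (("the", "dog"), "parse-C"),
    (("thedog",), "parse-D"),
    (("thedog",), "parse-D"),
]

print(max_marginal_segmentation(samples))    # ('the', 'dog')
print(most_frequent_full_analysis(samples))  # ('thedog',)
```

In this toy case the "weakly equivalent" analyses pool their counts (3 vs. 2), so the aggregated choice differs from the single most frequent full analysis.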
