I really liked how this paper tackled a really big problem head on. It's inclusion in subsequent works speaks strongly for the interest in this kind of research. I would really like to see more language papers set a high bar like this and establish a framework for achieving it.
My largest concern about this paper is the fact that the authors seemed to feel that human-guided learning can overcome some of the deficits in the model framework. The large drop off in precision (from 90% to 57%) is not surprising as methods such as the Coupled SEAL and Coupled Morphological Classifier are not robust in the face of locally optimal solutions; it is inevitable that as more and more data is added, the fitness will decline, because the models are already anchored to their fit of previous data. Errors will beget errors, and human intervention will only limit this inherent multiplication.
These errors are further compounded by the fact that the framework does not take into account the degree of independence between its various models. Using group and individual model thresholds for decision making is a decent heuristic, but it is unworkable as an architecture because guaranteeing each model's independence is a hard constraint on the number and types of models that can be used. I believe the framework would be better served by combining the underlying information in a proper, hierarchical framework. By including more models that can inform each other, perhaps the necessity of human-supervised learning can be kept to a minimum.