Hy-NLI: a Hybrid system for state-of-the-art Natural Language Inference
2021, Kalouli, Aikaterini-Lida
A main characteristic of human language and understanding is our ability to reason about things, i.e., to infer conclusions from given facts. Within the fields of Natural Language Processing and Natural Language Understanding, the task of inferring such conclusions has come to be known as Natural Language Inference (NLI) and is currently a popular area of research. NLI is most often formulated as the task of determining whether a sentence entails (i.e., implies), contradicts (i.e., implies the opposite of), or is neutral (i.e., does not have any relation) with respect to another sentence (MacCartney and Manning, 2007). Although such a task sounds trivial for humans, it is less so for machines: the processes behind human inference require more than understanding linguistic input; they presuppose our understanding of the world and everyday life and require the complex combination and interaction of this information.
In this thesis, I implement a hybrid NLI system, Hy-NLI, which is able to determine the inference relation between a pair of sentences. Hy-NLI consists of a symbolic and a deep-learning component, combining the best of both worlds: it exploits the strengths that each approach exhibits and mitigates their weaknesses. The implemented system relies on the finding that each of the two very different approaches is particularly suitable for a specific kind of phenomenon. Deep-learning methods are good at dealing with graded and more fluid aspects of meaning, while symbolic approaches can efficiently deal with contextual phenomena of natural language, e.g., modals, negations, and implicatives. Hy-NLI learns to distinguish between these cases and employs, for each of them, the component that is known to work best. Thus, this thesis contributes to the state of the art in NLI. It also contributes to the general debate on whether symbolic or deep-learning approaches are most efficient by showing that systems can benefit from both of them in different ways. Hence, the thesis at hand motivates research that does not choose one of the two, but marries them into a successful combination.
To reach the ultimate goal of closing the gap between these two approaches, this thesis makes four major contributions. First, it sheds light on the available NLI datasets, their issues and the insights they can offer us about the NLI task in general. Specifically, I investigate one of the well-known mainstream NLI datasets, SICK (Marelli et al., 2014b), and observe how certain corpus construction practices have influenced the quality of the data itself and of its annotations. I also show that the quality of annotations is affected not only by the corpus construction process but also by inherent human disagreements and fine-grained nuances of human inference. The issues found in the datasets are addressed in a variety of ways, from manually correcting subsets of the corpus to performing experiments that identify and quantify these aspects of the NLI task. The second major contribution of the thesis at hand is the development of the Graphical Knowledge Representation (GKR; Kalouli and Crouch, 2018), a semantic representation suitable for semantic tasks such as NLI, and the implementation of an efficient GKR parser. The representation stands out from other similar representations for its separation of the sentence information into different layers/graphs. In particular, there is a strict separation between the conceptual, predicate-argument structure and the contextual, Boolean structure. This modularity and projection architecture gives rise to a concept-based, intensional Description Logic (Baader et al., 2003) semantics. The efficiency and suitability of GKR for NLI are revealed through the implementation of the symbolic inference engine GKR4NLI, a further major goal of this thesis. GKR4NLI is developed as an inference mechanism relying on Natural Logic (Van Benthem, 1986; Sánchez-Valencia, 1991) and on the semantics imposed by GKR.
Its performance on different datasets confirms previous findings that symbolic engines are good at dealing with semantically complex phenomena, but struggle with more robust aspects of meaning. Thus, these results motivate the need for a hybrid system, where each aspect of meaning is treated by the most suitable component. This need is addressed with the implementation of Hy-NLI, the final major goal of this thesis. The hybrid system uses GKR4NLI as its symbolic component and the state-of-the-art language representation model BERT (Devlin et al., 2019) as its deep-learning component. Their successful combination leads Hy-NLI to outperform other state-of-the-art methods across datasets of different nature and complexity. With such performance across the board, Hy-NLI confirms the need for hybrid systems and paves the way for more work in this research direction.