LMFingerprints : Visual Explanations of Language Model Embedding Spaces through Layerwise Contextualization Scores
2022-07-29, Sevastjanova, Rita, Kalouli, Aikaterini-Lida, Schätzle, Christin, Schäfer, Hanna, El-Assady, Mennatallah
Language models, such as BERT, construct multiple, contextualized embeddings for each word occurrence in a corpus. Understanding how the contextualization propagates through the model's layers is crucial for deciding which layers to use for a specific analysis task. Currently, most embedding spaces are explained by probing classifiers; however, some findings remain inconclusive. In this paper, we present LMFingerprints, a novel scoring-based technique for the explanation of contextualized word embeddings. We introduce two categories of scoring functions, which measure (1) the degree of contextualization, i.e., the layerwise changes in the embedding vectors, and (2) the type of contextualization, i.e., the captured context information. We integrate these scores into an interactive explanation workspace. By combining visual and verbal elements, we provide an overview of contextualization in six popular transformer-based language models. We evaluate hypotheses from the domain of computational linguistics, and our results not only confirm findings from related work but also reveal new aspects about the information captured in the embedding spaces. For instance, we show that while numbers are poorly contextualized, stopwords have an unexpected high contextualization in the models' upper layers, where their neighborhoods shift from similar functionality tokens to tokens that contribute to the meaning of the surrounding sentences.
Explaining Contextualization in Language Models using Visual Analytics
2021, Sevastjanova, Rita, Kalouli, Aikaterini-Lida, Schätzle, Christin, Schäfer, Hanna, El-Assady, Mennatallah
Despite the success of contextualized language models on various NLP tasks, it is still unclear what these models really learn. In this paper, we contribute to the current efforts of explaining such models by exploring the continuum between function and content words with respect to contextualization in BERT, based on linguistically-informed insights. In particular, we utilize scoring and visual analytics techniques: we use an existing similarity-based score to measure contextualization and integrate it into a novel visual analytics technique, presenting the model’s layers simultaneously and highlighting intra-layer properties and inter-layer differences. We show that contextualization is neither driven by polysemy nor by pure context variation. We also provide insights on why BERT fails to model words in the middle of the functionality continuum.
Mixed-Initiative Active Learning for Generating Linguistic Insights in Question Classification
2018, Sevastjanova, Rita, El-Assady, Mennatallah, Hautli-Janisz, Annette, Kalouli, Aikaterini-Lida, Kehlbeck, Rebecca, Deussen, Oliver, Keim, Daniel A., Butt, Miriam
We propose a mixed-initiative active learning system to tackle the challenge of building descriptive models for under-studied linguistic phenomena. Our particular use case is the linguistic analysis of question types, in particular in understanding what characterizes information-seeking vs. non-information-seeking questions (i.e., whether the speaker wants to elicit an answer from the hearer or not) and how automated methods can assist with the linguistic analysis. Our approach is motivated by the need for an effective and efficient human-in-the-loop process in natural language processing that relies on example-based learning and provides immediate feedback to the user. In addition to the concrete implementation of a question classification system, we describe general paradigms of explainable mixed-initiative learning, allowing for the user to access the patterns identified automatically by the system, rather than being confronted by a machine learning black box. Our user study demonstrates the capability of our system in providing deep linguistic insight into this particular analysis problem. The results of our evaluation are competitive with the current state-of-the-art.
Negation, Coordination, and Quantifiers in Contextualized Language Models
2022, Kalouli, Aikaterini-Lida, Sevastjanova, Rita, Schätzle, Christin, Romero, Maribel
With the success of contextualized language models, much research explores what these models really learn and in which cases they still fail. Most of this work focuses on specific NLP tasks and on the learning outcome. Little research has attempted to decouple the models' weaknesses from specific tasks and focus on the embeddings per se and their mode of learning. In this paper, we take up this research opportunity: based on theoretical linguistic insights, we explore whether the semantic constraints of function words are learned and how the surrounding context impacts their embeddings. We create suitable datasets, provide new insights into the inner workings of LMs vis-a-vis function words and implement an assisting visual web interface for qualitative analysis.
XplaiNLI : Explainable Natural Language Inference through Visual Analytics
2020, Kalouli, Aikaterini-Lida, Sevastjanova, Rita, de Paiva, Valeria, Crouch, Richard, El-Assady, Mennatallah
Advances in Natural Language Inference (NLI) have helped us understand what state-of-the-art models really learn and what their generalization power is. Recent research has revealed some heuristics and biases of these models. However, to date, there is no systematic effort to capitalize on those insights through a system that uses these to explain the NLI decisions. To this end, we propose XplaiNLI, an eXplainable, interactive, visualization interface that computes NLI with different methods and provides explanations for the decisions made by the different approaches.
Is that really a question? : Going beyond factoid questions in NLP
2021, Kalouli, Aikaterini-Lida, Kehlbeck, Rebecca, Sevastjanova, Rita, Deussen, Oliver, Keim, Daniel A., Butt, Miriam
Research in NLP has mainly focused on factoid questions, with the goal of finding quick and reliable ways of matching a query to an answer. However, human discourse involves more than that: it contains non-canonical questions deployed to achieve specific communicative goals. In this paper, we investigate this under-studied aspect of NLP by introducing a targeted task, creating an appropriate corpus for the task and providing baseline models of diverse nature. With this, we are also able to generate useful insights on the task and open the way for future research in this direction.
ParHistVis : Visualization of Parallel Multilingual Historical Data
2019, Kalouli, Aikaterini-Lida, Kehlbeck, Rebecca, Sevastjanova, Rita, Kaiser, Katharina, Kaiser, Georg A., Butt, Miriam
The study of language change through parallel corpora can be advantageous for the analysis of complex interactions between time, text domain and language. Often, those advantages cannot be fully exploited due to the sparse but high-dimensional nature of such historical data. To tackle this challenge, we introduce ParHistVis: a novel, free, easy-to-use, interactive visualization tool for parallel, multilingual, diachronic and synchronic linguistic data. We illustrate the suitability of the components of the tool based on a use case of word order change in Romance wh-interrogatives.