The Role of Interactive Visualization in Fostering Trust in AI
2021, Beauxis-Aussalet, Emma, Behrisch, Michael, Borgo, Rita, Chau, Duen Horng, Collins, Christopher, El-Assady, Mennatallah, Keim, Daniel A., Oelke, Daniela, Schreck, Tobias, Strobelt, Hendrik
The increasing use of artificial intelligence (AI) technologies across application domains has prompted our society to pay closer attention to AI's trustworthiness, fairness, interpretability, and accountability. In order to foster trust in AI, it is important to consider the potential of interactive visualization, and how such visualizations help build trust in AI systems. This manifesto discusses the relevance of interactive visualizations and makes the following four claims: i) trust is not a technical problem, ii) trust is dynamic, iii) visualization cannot address all aspects of trust, and iv) visualization is crucial for human agency in AI.
SOMFlow : Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance
2018-01, Sacha, Dominik, Kraus, Matthias, Bernard, Jürgen, Behrisch, Michael, Schreck, Tobias, Asano, Yuki, Keim, Daniel A.
Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date; however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and reflect on previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing the analyst to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.
Magnostics : Image-Based Search of Interesting Matrix Views for Guided Network Exploration
2017-01, Behrisch, Michael, Bach, Benjamin, Blumenschein, Michael, Delz, Michael, von Rüden, Laura, Fekete, Jean-Daniel, Schreck, Tobias
In this work we address the problem of retrieving potentially interesting matrix views to support the exploration of networks. We introduce Matrix Diagnostics (or Magnostics), following in spirit related approaches for rating and ranking other visualization techniques, such as Scagnostics for scatter plots. Our approach ranks matrix views according to the appearance of specific visual patterns, such as blocks and lines, indicating the existence of topological motifs in the data, such as clusters, bi-graphs, or central nodes. Magnostics can be used to analyze, query, or search for visually similar matrices in large collections, or to assess the quality of matrix reordering algorithms. While many feature descriptors for image analysis exist, there is no evidence of how they perform for detecting patterns in matrices. In order to make an informed choice of feature descriptors for matrix diagnostics, we evaluate 30 feature descriptors (27 existing ones and three new descriptors that we designed specifically for Magnostics) with respect to four criteria: pattern response, pattern variability, pattern sensibility, and pattern discrimination. We conclude with an informed set of six descriptors as most appropriate for Magnostics and demonstrate their application in two scenarios: exploring a large collection of matrices and analyzing temporal networks.
Matrix Reordering Methods for Table and Network Visualization
2016, Behrisch, Michael, Bach, Benjamin, Henry Riche, Nathalie, Schreck, Tobias, Fekete, Jean-Daniel
This survey provides a description of algorithms to reorder visual matrices of tabular data and adjacency matrices of networks. The goal of this survey is to provide a comprehensive list of reordering algorithms published in different fields such as statistics, bioinformatics, or graph theory. While several of these algorithms are described in publications and others are available in software libraries and programs, there is little awareness of what is done across all fields. Our survey aims at describing these reordering algorithms in a unified manner to enable a wide audience to understand their differences and subtleties. We organize this corpus in a consistent manner, independently of the application or research field. We also provide practical guidance on how to select appropriate algorithms depending on the structure and size of the matrix to reorder, and point to implementations when available.
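As a rough illustration of what reordering an adjacency matrix means, the sketch below applies one permutation to both rows and columns and measures the resulting bandwidth. The ordering heuristic is a simplified breadth-first, Cuthill-McKee-style pass chosen only for brevity; it is not claimed to be one of the survey's recommended algorithms, and the names `reorder`, `bandwidth`, and `bfs_order` are hypothetical.

```python
from collections import deque


def reorder(matrix, order):
    """Apply the same permutation to the rows and columns of a square matrix."""
    return [[matrix[i][j] for j in order] for i in order]


def bandwidth(matrix):
    """Largest distance |i - j| of any non-zero cell from the diagonal."""
    return max(
        (abs(i - j) for i, row in enumerate(matrix) for j, v in enumerate(row) if v),
        default=0,
    )


def bfs_order(matrix):
    """Simplified Cuthill-McKee-style ordering (an illustrative assumption):
    breadth-first search started from low-degree vertices, visiting
    neighbours in order of increasing degree."""
    n = len(matrix)
    degree = [sum(1 for v in row if v) for row in matrix]
    visited = [False] * n
    order = []
    for start in sorted(range(n), key=lambda v: degree[v]):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            neighbours = [v for v in range(n) if matrix[u][v] and not visited[v]]
            for v in sorted(neighbours, key=lambda v: degree[v]):
                visited[v] = True
                queue.append(v)
    return order


# Toy data (made up for this sketch): a path graph whose vertex labels are scrambled.
edges = [(0, 3), (3, 1), (1, 5), (5, 2), (2, 4)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

order = bfs_order(adj)
reordered = reorder(adj, order)
print(bandwidth(adj), bandwidth(reordered))
```

In this toy case the non-zero cells move next to the diagonal after reordering, which is the kind of visual structure a reordered matrix view is meant to expose.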
FDive : Learning Relevance Models Using Pattern-based Similarity Measures
2019, Dennig, Frederik L., Polk, Tom, Lin, Zudi, Schreck, Tobias, Pfister, Hanspeter, Behrisch, Michael
The detection of interesting patterns in large high-dimensional datasets is difficult because of their dimensionality and pattern complexity. Therefore, analysts require automated support for the extraction of relevant patterns. In this paper, we present FDive, a visual active learning system that helps to create visually explorable relevance models, assisted by learning a pattern-based similarity. We use a small set of user-provided labels to rank similarity measures, consisting of feature descriptor and distance function combinations, by their ability to distinguish relevant from irrelevant data. Based on the best-ranked similarity measure, the system calculates an interactive Self-Organizing Map-based relevance model, which classifies data according to the cluster affiliation. It also automatically prompts further relevance feedback to improve its accuracy. Uncertain areas, especially near the decision boundaries, are highlighted and can be refined by the user. We evaluate our approach by comparison to state-of-the-art feature selection techniques and demonstrate the usefulness of our approach by a case study classifying electron microscopy images of brain cells. The results show that FDive enhances both the quality and understanding of relevance models and can thus lead to new insights for brain research.
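The idea of ranking similarity measures by how well they separate labeled relevant from irrelevant items can be sketched as follows. The scoring rule used here (mean between-class distance divided by mean within-class distance) is an assumption made for illustration and is not taken from FDive; the function names, descriptors, and data are likewise hypothetical.

```python
from itertools import combinations


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def separation_score(items, labels, descriptor, distance):
    """Mean between-class distance divided by mean within-class distance.
    Assumes at least one same-label pair and one mixed-label pair exist."""
    feats = [descriptor(item) for item in items]
    within, between = [], []
    for i, j in combinations(range(len(items)), 2):
        d = distance(feats[i], feats[j])
        (within if labels[i] == labels[j] else between).append(d)
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return mean_between / mean_within if mean_within else float("inf")


def rank_measures(items, labels, measures):
    """Return measure names sorted from best to worst separation."""
    scored = [
        (separation_score(items, labels, desc, dist), name)
        for name, (desc, dist) in measures.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy data: relevance depends on the first coordinate only.
items = [(0.0, 5.0), (1.0, 0.0), (10.0, 5.0), (11.0, 0.0)]
labels = [True, True, False, False]  # user-provided relevance labels

# Each similarity measure pairs a feature descriptor with a distance function.
measures = {
    "x-only/euclidean": (lambda p: (p[0],), euclidean),
    "y-only/manhattan": (lambda p: (p[1],), manhattan),
}

ranking = rank_measures(items, labels, measures)
print(ranking)
```

The measure built on the first coordinate ranks higher because it keeps same-label items close and different-label items far apart, which is the property the ranking is meant to reward.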
Quality Metrics for Information Visualization
2018, Behrisch, Michael, Blumenschein, Michael, Kim, Naam Wook, El-Assady, Mennatallah, Fuchs, Johannes, Seebacher, Daniel, Diehl, Alexandra, Brandes, Ulrik, Schreck, Tobias, Weiskopf, Daniel, Keim, Daniel A.
The visualization community has developed to date many intuitions and understandings of how to judge the quality of views in visualizing data. The computation of a visualization’s quality and usefulness ranges from measuring clutter and overlap, up to the existence and perception of specific (visual) patterns. This survey attempts to report, categorize and unify the diverse understandings and aims to establish a common vocabulary that will enable a wide audience to understand their differences and subtleties. For this purpose, we present a commonly applicable quality metric formalization that should detail and relate all constituting parts of a quality metric. We organize our corpus of reviewed research papers along the data types established in the information visualization community: multi- and high-dimensional, relational, sequential, geospatial and text data. For each data type, we select the visualization subdomains in which quality metrics are an active research field and report their findings, reason on the underlying concepts, describe goals and outline the constraints and requirements. One central goal of this survey is to provide guidance on future research opportunities for the field and outline how different visualization communities could benefit from each other by applying or transferring knowledge to their respective subdomain. Additionally, we aim to motivate the visualization community to compare computed measures to the perception of humans.
Guiding the exploration of scatter plot data using motif-based interest measures
2016, Shao, Lin, Schleicher, Timo, Behrisch, Michael, Schreck, Tobias, Sipiran, Ivan, Keim, Daniel A.
Finding interesting patterns in large scatter plot spaces is a challenging problem and becomes even more difficult with an increasing number of dimensions. Previous approaches for exploring large scatter plot spaces, such as the well-known Scagnostics approach, mainly focus on ranking scatter plots based on their global properties. However, local patterns often contribute significantly to the interestingness of a scatter plot. We propose a novel approach for the automatic determination of interesting views in scatter plot spaces based on the analysis of local scatter plot segments. Specifically, we automatically classify similar local scatter plot segments, which we call scatter plot motifs. Inspired by the well-known tf×idf approach from information retrieval, we compute local and global quality measures based on frequency properties of the local motifs. We show how we can use these to filter, rank and compare scatter plots and their incorporated motifs. We demonstrate the usefulness of our approach with synthetic and real-world data sets and showcase our data exploration tools that visualize the distribution of local scatter plot motifs in relation to a large overall scatter plot space.
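To make the tf×idf analogy concrete, the sketch below treats each scatter plot as a "document" and each motif label as a "term". It uses the textbook tf×idf weighting, which is an assumption and not necessarily the exact measure defined in the paper; the function names and motif labels are hypothetical.

```python
import math
from collections import Counter


def motif_tf_idf(plots):
    """plots maps a plot name to the list of motif labels found in its local segments.
    Returns, per plot, a dict of motif -> tf * idf weight."""
    n_plots = len(plots)
    # Number of plots in which each motif occurs at least once.
    plot_freq = Counter()
    for motifs in plots.values():
        plot_freq.update(set(motifs))
    weights = {}
    for name, motifs in plots.items():
        counts = Counter(motifs)
        total = len(motifs)
        weights[name] = {
            motif: (count / total) * math.log(n_plots / plot_freq[motif])
            for motif, count in counts.items()
        }
    return weights


def rank_plots(weights):
    """Order plots by the sum of their motif weights, highest first."""
    return sorted(weights, key=lambda name: sum(weights[name].values()), reverse=True)


# Toy data: "blob" occurs in every plot, so it carries no weight.
plots = {
    "A": ["line", "line", "blob"],
    "B": ["blob", "blob"],
    "C": ["blob", "ring"],
}

weights = motif_tf_idf(plots)
ranking = rank_plots(weights)
print(ranking)
```

Motifs that appear in every plot receive zero weight, so plots dominated by rare motifs rise to the top of the ranking.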
Urban Mobility Analysis With Mobile Network Data : A Visual Analytics Approach
2018-05, Senaratne, Hansi, Mueller, Manuel, Behrisch, Michael, Lalanne, Felipe, Bustos-Jimenez, Javier, Schneidewind, Jörn, Keim, Daniel A., Schreck, Tobias
Urban planning and intelligent transportation management are facing key challenges in today's ever more urbanized world. Providing the right tools to city planners is crucial to cope with these challenges. Data collected from citizens' mobile communication can be used as the foundation for such tools. These kinds of data can facilitate various analysis tasks, such as the extraction of human movement patterns or determining the urban dynamics of a city. City planners can closely monitor such patterns and, based on them, make strategic decisions to improve a city's infrastructure. In this paper, we introduce a novel visual analytics approach for pattern exploration and search in Global System for Mobile Communications (GSM) mobile networks. We define geospatial and matrix representations of data, which can be interactively navigated. The approach integrates data visualization with suitable data analysis algorithms, allowing analysts to spatially and temporally compare mobile usage and to identify regularities as well as anomalies in daily mobility patterns across regions and user groups. As an extension to our visual analytics approach, we further introduce space-time prisms with uncertain markers to visually analyze the uncertainty of urban mobility patterns.
Pattern Trails : Visual Analysis of Pattern Transitions in Subspaces
2017, Jäckle, Dominik, Blumenschein, Michael, Behrisch, Michael, Keim, Daniel A., Schreck, Tobias
Subspace analysis methods have gained interest for identifying patterns in subspaces of high-dimensional data. Existing techniques allow users to visualize and compare patterns in subspaces. However, many subspace analysis methods produce an abundance of patterns, which are often redundant and difficult to relate. Creating effective layouts for comparison of subspace patterns remains challenging. We introduce Pattern Trails, a novel approach for visually ordering and comparing subspace patterns. Central to our approach is the notion of pattern transitions as an interpretable structure imposed to order and compare patterns between subspaces. The basic idea is to visualize projections of subspaces side-by-side, and indicate changes between adjacent patterns in the subspaces by a linked representation, hence introducing pattern transitions. Our contributions comprise a systematization for how pairs of subspace patterns can be compared, and how changes can be interpreted in terms of pattern transitions. We also contribute a technique for visual subspace analysis based on a data-driven similarity measure between subspace representations. This measure is useful to order the patterns, and interactively group subspaces to reduce redundancy. We demonstrate the usefulness of our approach by application to several use cases, indicating that data can be meaningfully ordered and interpreted in terms of pattern transitions.
Visual Quality Assessment of Subspace Clusterings
2016, Blumenschein, Michael, Färber, Ines, Behrisch, Michael, Tatu, Andrada, Schreck, Tobias, Keim, Daniel A., Seidl, Thomas
The quality assessment of results of clustering algorithms is challenging as different cluster methodologies lead to different cluster characteristics and topologies. A further complication is that in high-dimensional data, subspace clustering adds to the complexity by detecting clusters in multiple different lower-dimensional projections. The quality assessment for (subspace) clustering is especially difficult if no benchmark data is available to compare the clustering results. In this research paper, we present SubEval, a novel subspace evaluation framework, which provides visual support for comparing quality criteria of subspace clusterings. We identify important aspects for evaluation of subspace clustering results and show how our system helps to derive quality assessments. SubEval allows assessing subspace cluster quality at three different granularity levels: (1) a global overview of the similarity of clusters and the estimated redundancy in cluster members and subspace dimensions; (2) a view of a selection of multiple clusters, which supports in-depth analysis of object distributions and potential cluster overlap; and (3) a detailed analysis of the characteristics of individual clusters, which helps to understand the (non-)validity of a cluster. We demonstrate the usefulness of SubEval in two case studies focusing on the targeted algorithm and domain scientists and show how the generated insights lead to a justified selection of an appropriate clustering algorithm and an improved parameter setting. Likewise, SubEval can be used for the understanding and improvement of newly developed subspace clustering algorithms. SubEval is part of SubVA, a novel open-source web-based framework for the visual analysis of different subspace analysis techniques.