The Role of Interactive Visualization in Fostering Trust in AI
2021, Beauxis-Aussalet, Emma, Behrisch, Michael, Borgo, Rita, Chau, Duen Horng, Collins, Christopher, El-Assady, Mennatallah, Keim, Daniel A., Oelke, Daniela, Schreck, Tobias, Strobelt, Hendrik
The increasing use of artificial intelligence (AI) technologies across application domains has prompted our society to pay closer attention to AI's trustworthiness, fairness, interpretability, and accountability. In order to foster trust in AI, it is important to consider the potential of interactive visualization, and how such visualizations help build trust in AI systems. This manifesto discusses the relevance of interactive visualizations and makes the following four claims: i) trust is not a technical problem, ii) trust is dynamic, iii) visualization cannot address all aspects of trust, and iv) visualization is crucial for human agency in AI.
Quality Metrics for Information Visualization
2018, Behrisch, Michael, Blumenschein, Michael, Kim, Nam Wook, El-Assady, Mennatallah, Fuchs, Johannes, Seebacher, Daniel, Diehl, Alexandra, Brandes, Ulrik, Schreck, Tobias, Weiskopf, Daniel, Keim, Daniel A.
The visualization community has, to date, developed many intuitions and understandings of how to judge the quality of views in visualizing data. The computation of a visualization's quality and usefulness ranges from measuring clutter and overlap up to the existence and perception of specific (visual) patterns. This survey attempts to report, categorize, and unify these diverse understandings and aims to establish a common vocabulary that will enable a wide audience to understand their differences and subtleties. For this purpose, we present a commonly applicable quality metric formalization that should detail and relate all constituting parts of a quality metric. We organize our corpus of reviewed research papers along the data types established in the information visualization community: multi- and high-dimensional, relational, sequential, geospatial, and text data. For each data type, we select the visualization subdomains in which quality metrics are an active research field and report their findings, reason on the underlying concepts, describe goals, and outline the constraints and requirements. One central goal of this survey is to provide guidance on future research opportunities for the field and outline how different visualization communities could benefit from each other by applying or transferring knowledge to their respective subdomain. Additionally, we aim to motivate the visualization community to compare computed measures to the perception of humans.
Guiding the exploration of scatter plot data using motif-based interest measures
2016, Shao, Lin, Schleicher, Timo, Behrisch, Michael, Schreck, Tobias, Sipiran, Ivan, Keim, Daniel A.
Finding interesting patterns in large scatter plot spaces is a challenging problem and becomes even more difficult with an increasing number of dimensions. Previous approaches for exploring large scatter plot spaces, such as the well-known Scagnostics approach, mainly focus on ranking scatter plots based on their global properties. However, local patterns often contribute significantly to the interestingness of a scatter plot. We propose a novel approach for the automatic determination of interesting views in scatter plot spaces based on the analysis of local scatter plot segments. Specifically, we automatically classify similar local scatter plot segments, which we call scatter plot motifs. Inspired by the well-known tf×idf approach from information retrieval, we compute local and global quality measures based on frequency properties of the local motifs. We show how we can use these to filter, rank, and compare scatter plots and their incorporated motifs. We demonstrate the usefulness of our approach with synthetic and real-world data sets and showcase our data exploration tools that visualize the distribution of local scatter plot motifs in relation to a large overall scatter plot space.
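The tf×idf analogy in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each scatter plot has already been reduced to a list of motif labels (one per local segment), treats plots as "documents" and motifs as "terms", and uses the textbook tf×idf weighting. The function names `motif_tfidf` and `rank_plots` and the motif labels are hypothetical.

```python
import math
from collections import Counter


def motif_tfidf(plots):
    """Score motifs per plot with textbook tf x idf.

    plots: dict mapping a plot id to a list of motif labels, one label per
    local segment. How segments are extracted and classified is assumed
    to have happened elsewhere.
    """
    n_plots = len(plots)
    # Document frequency: number of plots in which a motif occurs at least once.
    df = Counter()
    for motifs in plots.values():
        df.update(set(motifs))

    scores = {}
    for plot_id, motifs in plots.items():
        counts = Counter(motifs)
        total = len(motifs)
        # tf = relative frequency inside the plot, idf = log(N / df).
        scores[plot_id] = {
            motif: (count / total) * math.log(n_plots / df[motif])
            for motif, count in counts.items()
        }
    return scores


def rank_plots(scores):
    """Order plot ids by the sum of their motif scores, highest first."""
    return sorted(scores, key=lambda p: sum(scores[p].values()), reverse=True)


if __name__ == "__main__":
    example = {
        "a": ["line", "line", "blob"],
        "b": ["line", "line", "line"],
        "c": ["line", "ring", "ring"],
    }
    print(rank_plots(motif_tfidf(example)))
```

In this toy input the "line" motif occurs in every plot, so its idf is zero and it contributes nothing; plots are ranked by how strongly they feature motifs that are rare across the collection.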
Visual Analysis of Sets of Heterogeneous Matrices Using Projection-Based Distance Functions and Semantic Zoom
2014, Behrisch, Michael, Davey, James, Fischer, Fabian, Thonnard, Olivier, Schreck, Tobias, Keim, Daniel A., Kohlhammer, Jörn
Matrix visualization is an established technique in the analysis of relational data. It is applicable to large, dense networks, where node-link representations may not be effective. Recently, domains have emerged in which the comparative analysis of sets of matrices of potentially varying size is relevant. For example, to monitor computer network traffic a dynamic set of hosts and their peer-to-peer connections on different ports must be analysed. A matrix visualization focused on the display of one matrix at a time cannot cope with this task.
We address the research problem of the visual analysis of sets of matrices. We present a technique for comparing matrices of potentially varying size. Our approach considers the rows and/or columns of a matrix as the basic elements of the analysis. We project these vectors for pairs of matrices into a low-dimensional space which is used as the reference to compare matrices and identify relationships among them. Bipartite graph matching is applied on the projected elements to compute a measure of distance. A key advantage of this measure is that it can be interpreted and manipulated as a visual distance function, and serves as a comprehensible basis for ranking, clustering and comparison in sets of matrices. We present an interactive system in which users may explore the matrix distances and understand potential differences in a set of matrices. A flexible semantic zoom mechanism enables users to navigate through sets of matrices and identify patterns at different levels of detail. We demonstrate the effectiveness of our approach through a case study and provide a technical evaluation to illustrate its strengths.
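The idea of projecting rows and matching them can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: it assumes the two matrices may differ in row count but share the same number of columns, uses PCA (via SVD) as the projection, and takes the mean cost of a minimum-cost assignment as the distance. The function name `matrix_distance` is hypothetical; `scipy.optimize.linear_sum_assignment` is the SciPy routine for the assignment problem and accepts rectangular cost matrices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def matrix_distance(a, b, n_components=2):
    """Distance between two matrices based on matched, projected rows.

    Assumption: a and b have the same number of columns; row counts may differ.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Project the rows of both matrices into one shared low-dimensional space
    # (PCA via SVD on the stacked, mean-centered rows).
    stacked = np.vstack([a, b])
    centered = stacked - stacked.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T
    pa, pb = projected[: len(a)], projected[len(a):]

    # Pairwise Euclidean distances between projected rows of a and b.
    cost = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=2)

    # Minimum-cost bipartite matching; with unequal row counts the surplus
    # rows of the larger matrix stay unmatched and are ignored here.
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].mean())


if __name__ == "__main__":
    identity = np.eye(3)
    permuted = identity[[2, 0, 1]]
    print(matrix_distance(identity, permuted))      # close to zero
    print(matrix_distance(identity, identity + 5))  # clearly larger
```

Because rows are matched rather than compared position by position, a row permutation of the same matrix yields a distance near zero. Ignoring unmatched rows is a simplification of this sketch; a penalty term for them would be one possible extension.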
Urban Mobility Analysis With Mobile Network Data : A Visual Analytics Approach
2018-05, Senaratne, Hansi, Mueller, Manuel, Behrisch, Michael, Lalanne, Felipe, Bustos-Jimenez, Javier, Schneidewind, Jörn, Keim, Daniel A., Schreck, Tobias
Urban planning and intelligent transportation management are facing key challenges in today's ever more urbanized world. Providing the right tools to city planners is crucial to cope with these challenges. Data collected from citizens' mobile communication can be used as the foundation for such tools. These kinds of data can facilitate various analysis tasks, such as the extraction of human movement patterns or determining the urban dynamics of a city. City planners can closely monitor such patterns and use them as a basis for strategic decisions to improve a city's infrastructure. In this paper, we introduce a novel visual analytics approach for pattern exploration and search in Global System for Mobile Communications (GSM) mobile networks. We define geospatial and matrix representations of data, which can be interactively navigated. The approach integrates data visualization with suitable data analysis algorithms, making it possible to compare mobile usage spatially and temporally and to identify regularities as well as anomalies in daily mobility patterns across regions and user groups. As an extension to our visual analytics approach, we further introduce space-time prisms with uncertain markers to visually analyze the uncertainty of urban mobility patterns.
Pattern Trails : Visual Analysis of Pattern Transitions in Subspaces
2017, Jäckle, Dominik, Blumenschein, Michael, Behrisch, Michael, Keim, Daniel A., Schreck, Tobias
Subspace analysis methods have gained interest for identifying patterns in subspaces of high-dimensional data. Existing techniques make it possible to visualize and compare patterns in subspaces. However, many subspace analysis methods produce an abundance of patterns, which are often redundant and difficult to relate. Creating effective layouts for the comparison of subspace patterns remains challenging. We introduce Pattern Trails, a novel approach for visually ordering and comparing subspace patterns. Central to our approach is the notion of pattern transitions as an interpretable structure imposed to order and compare patterns between subspaces. The basic idea is to visualize projections of subspaces side by side and indicate changes between adjacent patterns in the subspaces by a linked representation, hence introducing pattern transitions. Our contributions comprise a systematization of how pairs of subspace patterns can be compared and how changes can be interpreted in terms of pattern transitions. We also contribute a technique for visual subspace analysis based on a data-driven similarity measure between subspace representations. This measure is useful to order the patterns and to interactively group subspaces to reduce redundancy. We demonstrate the usefulness of our approach by application to several use cases, indicating that data can be meaningfully ordered and interpreted in terms of pattern transitions.
Subspace Nearest Neighbor Search : Problem Statement, Approaches, and Discussion
2015, Blumenschein, Michael, Behrisch, Michael, Färber, Ines, Sedlmair, Michael, Schreck, Tobias, Seidl, Thomas, Keim, Daniel A.
Computing the similarity between objects is a central task for many applications in the field of information retrieval and data mining. For finding k-nearest neighbors, typically a ranking is computed based on a predetermined set of data dimensions and a distance function, constant over all possible queries. However, many high-dimensional feature spaces contain a large number of dimensions, many of which may contain noise or irrelevant, redundant, or contradicting information. More specifically, the relevance of dimensions may depend on the query object itself, and in general, different dimension sets (subspaces) may be appropriate for a query. Approaches for feature selection or weighting typically provide a global subspace selection, which may not be suitable for all possible queries. In this position paper, we frame a new research problem, called subspace nearest neighbor search, aiming at multiple query-dependent subspaces for nearest neighbor search. We describe relevant problem characteristics, relate to existing approaches, and outline potential research directions.
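The motivation stated in this abstract, that the neighbor ranking depends on which dimensions are considered, can be shown with a small sketch. It only restricts a plain Euclidean k-nearest-neighbor ranking to a given dimension set; how a suitable query-dependent subspace would be chosen is the open problem the paper frames and is not addressed here. The function name `knn_in_subspace` and the toy data are hypothetical.

```python
import math


def knn_in_subspace(data, query, dims, k):
    """Return the indices of the k rows of data closest to query.

    Distance is Euclidean, computed only over the dimensions in dims.
    """
    dims = tuple(dims)  # allow any iterable, including generators

    def distance(row):
        return math.sqrt(sum((row[d] - query[d]) ** 2 for d in dims))

    # sorted() is stable, so ties keep their original order.
    order = sorted(range(len(data)), key=lambda i: distance(data[i]))
    return order[:k]


if __name__ == "__main__":
    data = [
        [0.0, 0.0, 9.0],
        [1.0, 1.0, 0.0],
        [5.0, 5.0, 0.1],
    ]
    query = [0.0, 0.0, 0.0]
    print(knn_in_subspace(data, query, dims=[0, 1], k=2))
    print(knn_in_subspace(data, query, dims=[2], k=2))
```

With the same query, the two dimension sets return different neighbors, which is the effect a single global subspace selection cannot account for.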
SOMFlow : Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance
2018-01, Sacha, Dominik, Kraus, Matthias, Bernard, Jürgen, Behrisch, Michael, Schreck, Tobias, Asano, Yuki, Keim, Daniel A.
Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date; however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and reflect on previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, making it possible to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.
Visual Quality Assessment of Subspace Clusterings
2016, Blumenschein, Michael, Färber, Ines, Behrisch, Michael, Tatu, Andrada, Schreck, Tobias, Keim, Daniel A., Seidl, Thomas
The quality assessment of results of clustering algorithms is challenging as different cluster methodologies lead to different cluster characteristics and topologies. A further complication is that in high-dimensional data, subspace clustering adds to the complexity by detecting clusters in multiple different lower-dimensional projections. The quality assessment for (subspace) clustering is especially difficult if no benchmark data is available against which to compare the clustering results. In this research paper, we present SubEval, a novel subspace evaluation framework, which provides visual support for comparing quality criteria of subspace clusterings. We identify important aspects for the evaluation of subspace clustering results and show how our system helps to derive quality assessments. SubEval allows assessing subspace cluster quality at three different granularity levels: (1) A global overview of the similarity of clusters and estimated redundancy in cluster members and subspace dimensions. (2) A view of a selection of multiple clusters supports in-depth analysis of object distributions and potential cluster overlap. (3) The detailed analysis of characteristics of individual clusters helps to understand the (non-)validity of a cluster. We demonstrate the usefulness of SubEval in two case studies focusing on the targeted algorithm and domain scientists and show how the generated insights lead to a justified selection of an appropriate clustering algorithm and an improved parameter setting. Likewise, SubEval can be used for the understanding and improvement of newly developed subspace clustering algorithms. SubEval is part of SubVA, a novel open-source web-based framework for the visual analysis of different subspace analysis techniques.
Guiding the Exploration of Scatter Plot Data Using Motif-Based Interest Measures
2015, Shao, Lin, Schleicher, Timo, Behrisch, Michael, Schreck, Tobias, Sipiran, Ivan, Keim, Daniel A.
Finding interesting patterns in large scatter plot spaces is a challenging problem and becomes even more difficult with an increasing number of dimensions. Previous approaches for exploring large scatter plot spaces, such as the well-known Scagnostics approach, mainly focus on ranking scatter plots based on their global properties. However, local patterns often contribute significantly to the interestingness of a scatter plot. We propose a novel approach for the automatic determination of interesting views in scatter plot spaces based on the analysis of local scatter plot segments. Specifically, we automatically classify similar local scatter plot segments, which we call scatter plot motifs. Inspired by the well-known tf-idf approach from information retrieval, we compute local and global quality measures based on certain frequency properties of the local motifs. We show how we can use these to filter, rank, and compare scatter plots and their incorporated motifs. We demonstrate the usefulness of our approach with synthetic and real-world data sets and showcase our corresponding data exploration tool that visualizes the distribution of local scatter plot motifs in relation to a large overall scatter plot space.