FDive : Learning Relevance Models Using Pattern-based Similarity Measures
2019, Dennig, Frederik L., Polk, Tom, Lin, Zudi, Schreck, Tobias, Pfister, Hanspeter, Behrisch, Michael
The detection of interesting patterns in large high-dimensional datasets is difficult because of their dimensionality and pattern complexity. Therefore, analysts require automated support for the extraction of relevant patterns. In this paper, we present FDive, a visual active learning system that helps to create visually explorable relevance models, assisted by learning a pattern-based similarity. We use a small set of user-provided labels to rank similarity measures, consisting of feature descriptor and distance function combinations, by their ability to distinguish relevant from irrelevant data. Based on the best-ranked similarity measure, the system calculates an interactive Self-Organizing Map-based relevance model, which classifies data according to the cluster affiliation. It also automatically prompts further relevance feedback to improve its accuracy. Uncertain areas, especially near the decision boundaries, are highlighted and can be refined by the user. We evaluate our approach by comparison to state-of-the-art feature selection techniques and demonstrate the usefulness of our approach by a case study classifying electron microscopy images of brain cells. The results show that FDive enhances both the quality and understanding of relevance models and can thus lead to new insights for brain research.
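The ranking step described in this abstract can be illustrated with a minimal sketch. It is not FDive's implementation: the two descriptors, the two distance functions, and the separation score (mean between-class distance divided by mean within-class distance) are all assumptions chosen for illustration, as are the function names.

```python
from itertools import product
from statistics import mean


# Hypothetical feature descriptors: each maps a raw item (list of numbers) to a vector.
def descriptor_mean_range(item):
    return [mean(item), max(item) - min(item)]


def descriptor_first_last(item):
    return [item[0], item[-1]]


# Two common distance functions.
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def separation_score(items, labels, descriptor, distance):
    """Assumed score: mean between-class distance over mean within-class distance."""
    feats = [descriptor(item) for item in items]
    within, between = [], []
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            d = distance(feats[i], feats[j])
            (within if labels[i] == labels[j] else between).append(d)
    if not within or not between:
        raise ValueError("need at least one within-class and one between-class pair")
    return mean(between) / (mean(within) + 1e-12)


def rank_similarity_measures(items, labels, descriptors, distances):
    """Score every descriptor/distance combination and sort best first."""
    scored = [
        (separation_score(items, labels, desc, dist), desc_name, dist_name)
        for (desc_name, desc), (dist_name, dist) in product(
            descriptors.items(), distances.items()
        )
    ]
    return sorted(scored, key=lambda entry: entry[0], reverse=True)


# A small set of user-provided labels: True = relevant, False = irrelevant.
items = [[1, 2, 3, 10], [2, 1, 2, 11], [1, 5, 5, 1], [2, 6, 4, 2]]
labels = [True, True, False, False]
descriptors = {"mean_range": descriptor_mean_range, "first_last": descriptor_first_last}
distances = {"euclidean": euclidean, "manhattan": manhattan}

ranking = rank_similarity_measures(items, labels, descriptors, distances)
for score, desc_name, dist_name in ranking:
    print(f"{desc_name} + {dist_name}: {score:.2f}")
```

A higher score means that labeled relevant and irrelevant items lie farther apart, relative to the spread within each group, under that similarity measure.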
Magnostics : Image-Based Search of Interesting Matrix Views for Guided Network Exploration
2017-01, Behrisch, Michael, Bach, Benjamin, Blumenschein, Michael, Delz, Michael, von Rüden, Laura, Fekete, Jean-Daniel, Schreck, Tobias
In this work we address the problem of retrieving potentially interesting matrix views to support the exploration of networks. We introduce Matrix Diagnostics (or Magnostics), following in spirit related approaches for rating and ranking other visualization techniques, such as Scagnostics for scatter plots. Our approach ranks matrix views according to the appearance of specific visual patterns, such as blocks and lines, indicating the existence of topological motifs in the data, such as clusters, bi-graphs, or central nodes. Magnostics can be used to analyze, query, or search for visually similar matrices in large collections, or to assess the quality of matrix reordering algorithms. While many feature descriptors for image analysis exist, there is no evidence of how they perform for detecting patterns in matrices. In order to make an informed choice of feature descriptors for matrix diagnostics, we evaluate 30 feature descriptors (27 existing ones and three new descriptors that we designed specifically for Magnostics) with respect to four criteria: pattern response, pattern variability, pattern sensibility, and pattern discrimination. We conclude with an informed set of six descriptors as most appropriate for Magnostics and demonstrate their application in two scenarios: exploring a large collection of matrices and analyzing temporal networks.
Matrix Reordering Methods for Table and Network Visualization
2016, Behrisch, Michael, Bach, Benjamin, Henry Riche, Nathalie, Schreck, Tobias, Fekete, Jean-Daniel
This survey provides a description of algorithms to reorder visual matrices of tabular data and adjacency matrices of networks. The goal of this survey is to provide a comprehensive list of reordering algorithms published in different fields such as statistics, bioinformatics, or graph theory. While several of these algorithms are described in publications and others are available in software libraries and programs, there is little awareness of what is done across all fields. Our survey aims at describing these reordering algorithms in a unified manner to enable a wide audience to understand their differences and subtleties. We organize this corpus in a consistent manner, independently of the application or research field. We also provide practical guidance on how to select appropriate algorithms depending on the structure and size of the matrix to reorder, and point to implementations when available.
Guidelines for Effective Usage of Text Highlighting Techniques
2016, Strobelt, Hendrik, Oelke, Daniela, Kwon, Bum Chul, Schreck, Tobias, Pfister, Hanspeter
Semi-automatic text analysis involves manual inspection of text. Often, different text annotations (like part-of-speech or named entities) are indicated by using distinctive text highlighting techniques. In typesetting there exist well-known formatting conventions, such as bold typeface, italics, or background coloring, that are useful for highlighting certain parts of a given text. Also, many advanced techniques for visualization and highlighting of text exist; yet, standard typesetting is common, and the effects of standard typesetting on the perception of text are not fully understood. As such, we surveyed and tested the effectiveness of common text highlighting techniques, both individually and in combination, to discover how to maximize pop-out effects while minimizing visual interference between techniques. To validate our findings, we conducted a series of crowdsourced experiments to determine: i) a ranking of nine commonly-used text highlighting techniques; ii) the degree of visual interference between pairs of text highlighting techniques; iii) the effectiveness of techniques for visual conjunctive search. Our results show that increasing font size works best as a single highlighting technique, and that there are significant visual interferences between some pairs of highlighting techniques. We discuss the pros and cons of different combinations as a design guideline to choose text highlighting techniques for text viewers.
SocialOcean : Visual Analysis and Characterization of Social Media Bubbles
2018, Diehl, Alexandra, Hundt, Michael, Häußler, Johannes, Seebacher, Daniel, Chen, Siming, Cilasun, Nida, Keim, Daniel A., Schreck, Tobias
Social media allows citizens, corporations, and authorities to create, post, and exchange information. The study of its dynamics will enable analysts to understand user activities and social group characteristics such as connectedness, geospatial distribution, and temporal behavior. In this context, social media bubbles can be defined as social groups that exhibit certain biases in social media. These biases strongly depend on the dimensions selected in the analysis, for example, topic affinity, credibility, sentiment, and geographic distribution. In this paper, we present SocialOcean, a visual analytics system that allows for the investigation of social media bubbles. There exists a large body of research in social sciences which identifies important dimensions of social media bubbles (SMBs). While such dimensions have been studied separately, and also some of them in combination, it is still an open question which dimensions play the most important role in defining SMBs. Since the concept of SMBs is fairly recent, there are many unknowns regarding their characterization. We investigate the thematic and spatiotemporal characteristics of SMBs and present a visual analytics system to address questions such as: What are the most important dimensions that characterize SMBs? How do SMBs manifest in the presence of specific events that resonate with them? We illustrate our approach using three different real scenarios related to the single event of the Boston Marathon bombing and to political news about global warming. We perform an expert evaluation, analyze the experts' feedback, and present the lessons learned.
Visual Analytics and Similarity Search : Concepts and Challenges for Effective Retrieval Considering Users, Tasks, and Data
2017, Seebacher, Daniel, Häußler, Johannes, Stein, Manuel, Janetzko, Halldor, Schreck, Tobias
A major challenge of the contemporary information age is the overwhelming and increasing amount of data, especially when looking for specific information. Searching for relevant information is no longer possible manually, but has to rely on automatic methods, specifically, similarity search. From a formal perspective, similarity search can be seen as the problem of finding entities that are considered to be similar to a query with respect to certain describing features. The question of which features, or which weighted combination of features, to use for a given query creates a need for semi-automatic methods to address the needs of diverse users. Furthermore, the quality of the results of a similarity search is more than effectiveness, measured by precision and recall. The user ideally needs to trust the results and understand how they were computed. We propose to apply Visual Analytics methodologies, for synergistic cooperation of user and algorithms, to integrate three key dimensions of similarity search: users, tasks, and data for effective search. However, there exists a gap in knowledge about how users, tasks, and the available data influence each other and the similarity search. In this concept paper, we envision how Visual Analytics can be used to tackle current challenges of similarity search.
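The formal view of similarity search in this abstract can be sketched as a ranking under a weighted feature distance, with effectiveness measured by precision and recall. The sketch below is not taken from the paper: the toy database, the feature weights, and all function names are illustrative assumptions.

```python
def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; a weight of 0 ignores that feature."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)) ** 0.5


def search(query, database, weights, k):
    """Return the keys of the k entries closest to the query."""
    ranked = sorted(
        database, key=lambda key: weighted_distance(query, database[key], weights)
    )
    return ranked[:k]


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against a set of relevant keys."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical database of entities described by two features.
database = {"a": [1.0, 0.0], "b": [1.1, 5.0], "c": [4.0, 0.1], "d": [0.8, 9.0]}
query = [1.0, 0.0]
relevant = {"a", "b"}  # what this particular user considers relevant

# The same query under two different feature weightings.
by_first = search(query, database, weights=(1.0, 0.0), k=2)
by_second = search(query, database, weights=(0.0, 1.0), k=2)

print(by_first, precision_recall(by_first, relevant))
print(by_second, precision_recall(by_second, relevant))
```

The two weightings return different result sets for the same query, which is the reason the choice of features has to take the user and the task into account.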
Visual-Interactive Search for Soccer Trajectories to Identify Interesting Game Situations
2016, Shao, Lin, Sacha, Dominik, Neldner, Benjamin, Stein, Manuel, Schreck, Tobias
Recently, sports analytics has turned into an important research area of visual analytics and may provide interesting findings, such as the best player of the season, for various kinds of sports. Soccer is a very popular and tactical game, which has also attracted great attention in the last few years. However, the search for complex game movements is a very crucial and challenging task. We present a system for searching trajectory data in soccer matches by means of an interactive search interface that enables the user to sketch a situation of interest. Furthermore, we apply a domain-specific prefiltering process to extract a set of local movement segments, which are similar to a given sketch. Our approach comprises single-trajectory, multi-trajectory, and event-specific search functions based on two different similarity measures. To demonstrate the usefulness of our approach, we define a domain-specific task analysis and conduct a case study together with a domain expert from FC Bayern München by investigating a real-world soccer match. Finally, we show that multi-trajectory search in combination with event-specific filtering is needed to describe and retrieve complex moves in soccer matches.
Analysis and Comparison of Feature-Based Patterns in Urban Street Networks
2017-08-09, Shao, Lin, Mittelstädt, Sebastian, Goldblatt, Ran, Omer, Itzhak, Bak, Peter, Schreck, Tobias
Analysis of street networks is a challenging task, needed in urban planning applications such as urban design or transportation network analysis. Typically, different network features of interest are used for within-network and between-network comparisons of street networks. We introduce StreetExplorer, a visual-interactive system for analysis and comparison of global and local patterns in urban street networks. The system uses appropriate similarity functions to search for patterns, taking into account topological and geometric features of a street network. We enhance the visual comparison of street network patterns by a suitable color-mapping and boosting scheme to visualize the similarity between street network portions and the distribution of network features. Together with experts from the urban morphology domain, we apply our approach to analyze and compare two urban street networks, identifying patterns of historic development and modern planning approaches, demonstrating the usefulness of StreetExplorer.
Leaf Glyphs : Story Telling and Data Analysis Using Environmental Data Glyph Metaphors
2016-02-12, Fuchs, Johannes, Jäckle, Dominik, Weiler, Niklas, Schreck, Tobias
In exploratory data analysis, important analysis tasks include the assessment of similarity of data points, labeling of outliers, identifying and relating groups in data, and more generally, the detection of patterns. Specifically, for large data sets, such tasks may be effectively addressed by glyph-based visualizations. Appropriately defined glyph designs and layouts may represent collections of data to address these aforementioned tasks. Important problems in glyph visualization include the design of compact glyph representations, and a similarity- or structure-preserving 2D layout. Projection-based techniques are commonly used to generate layouts, but often suffer from over-plotting in 2D display space, which may hinder comparing and relating tasks. Inspired by contour and venation shapes of natural leaves, and their aggregation by stems, we introduce a novel glyph design for visualizing multi-dimensional data. Motivated by the human ability to visually discriminate natural shapes like trees in a forest, single flowers in a flower-bed, or leaves on shrubs, we design a flexible leaf-shaped data glyph, where data controls main leaf properties including leaf morphology, leaf venation, and leaf boundary shape. Our basic leaf glyph can map to more than a dozen numeric and categorical variables. We also define custom visual aggregation schemes to scale the glyph for large numbers of data records, including prototype-based, set-based, and hierarchic aggregation. We show by example that our design is effectively interpretable to solve multivariate data analysis tasks, and provides effective data mapping. The design provides an aesthetically pleasing appearance, and lends itself easily to storytelling in environmental data analysis problems, among others. The glyph and its aggregation schemes are proposed as a scalable multivariate data visualization design, with applications in data visualization for mass media and data journalism, among others.
Enhancing Parallel Coordinates : Statistical Visualizations for Analyzing Soccer Data
2016, Janetzko, Halldor, Stein, Manuel, Sacha, Dominik, Schreck, Tobias
Visualizing multi-dimensional data in an easy and interpretable way is one of the key features of Parallel Coordinate Plots. However, limitations such as overplotting or missing density information have resulted in many enhancements proposed for Parallel Coordinates. In this paper, we will include density information along each axis for clustered data. The main idea is to visually represent the density distribution of each cluster along the axes. We will show the applicability of our method by analyzing the activity phases of professional soccer players. A final discussion and conclusion will complement this paper.
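The idea of a per-cluster density distribution along each axis can be approximated with a histogram per cluster and axis. The sketch below is an illustration under assumptions, not the paper's method: it uses fixed-width bins over each axis range, and the function name, bin count, and sample records are hypothetical.

```python
def axis_densities(records, cluster_ids, bins=5):
    """Return {cluster: [histogram per axis]} with relative frequencies per bin."""
    n_axes = len(records[0])
    # Bin edges are shared by all clusters so densities are comparable per axis.
    lows = [min(rec[a] for rec in records) for a in range(n_axes)]
    highs = [max(rec[a] for rec in records) for a in range(n_axes)]

    counts = {}
    for rec, cid in zip(records, cluster_ids):
        hists = counts.setdefault(cid, [[0] * bins for _ in range(n_axes)])
        for a, value in enumerate(rec):
            span = highs[a] - lows[a]
            # The maximum value falls into the last bin; constant axes use bin 0.
            idx = 0 if span == 0 else min(int((value - lows[a]) / span * bins), bins - 1)
            hists[a][idx] += 1

    # Normalize by cluster size so each histogram sums to 1.
    densities = {}
    for cid, hists in counts.items():
        size = cluster_ids.count(cid)
        densities[cid] = [[c / size for c in hist] for hist in hists]
    return densities


# Hypothetical two-axis records assigned to two clusters.
records = [(0.0, 10.0), (1.0, 10.0), (9.0, 20.0), (10.0, 20.0)]
cluster_ids = [0, 0, 1, 1]

densities = axis_densities(records, cluster_ids, bins=2)
for cid, hists in densities.items():
    print(cid, hists)
```

Each histogram could then be drawn along its axis, for example as a small bar or area chart per cluster, in place of individual overplotted polylines.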