FDive : Learning Relevance Models Using Pattern-based Similarity Measures
2019, Dennig, Frederik L., Polk, Tom, Lin, Zudi, Schreck, Tobias, Pfister, Hanspeter, Behrisch, Michael
The detection of interesting patterns in large high-dimensional datasets is difficult because of their dimensionality and pattern complexity. Therefore, analysts require automated support for the extraction of relevant patterns. In this paper, we present FDive, a visual active learning system that helps to create visually explorable relevance models, assisted by learning a pattern-based similarity. We use a small set of user-provided labels to rank similarity measures, consisting of feature descriptor and distance function combinations, by their ability to distinguish relevant from irrelevant data. Based on the best-ranked similarity measure, the system calculates an interactive Self-Organizing Map-based relevance model, which classifies data according to the cluster affiliation. It also automatically prompts further relevance feedback to improve its accuracy. Uncertain areas, especially near the decision boundaries, are highlighted and can be refined by the user. We evaluate our approach by comparison to state-of-the-art feature selection techniques and demonstrate the usefulness of our approach by a case study classifying electron microscopy images of brain cells. The results show that FDive enhances both the quality and understanding of relevance models and can thus lead to new insights for brain research.
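As an illustration of the ranking step described in this abstract, the following is a minimal sketch, not FDive's actual implementation. It assumes each similarity measure is a pair of a feature descriptor and a distance function, and it scores each pair by the ratio of mean between-class to mean within-class distance on the labeled items. The scoring criterion, the two toy descriptors, and all function names are assumptions made for this example.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical feature descriptors, used only for illustration.
def identity(item):
    return list(item)


def mean_and_range(item):
    return [sum(item) / len(item), max(item) - min(item)]


def separation_score(items, labels, descriptor, distance):
    """Ratio of mean between-class to mean within-class distance (assumed criterion)."""
    feats = [descriptor(item) for item in items]
    within, between = [], []
    for i, j in combinations(range(len(items)), 2):
        d = distance(feats[i], feats[j])
        (within if labels[i] == labels[j] else between).append(d)
    if not within or not between:
        return 0.0
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return mean_between / (mean_within + 1e-12)  # small constant avoids division by zero


def rank_similarity_measures(items, labels, descriptors, distances):
    """Return ((descriptor name, distance name), score) pairs, best first."""
    scored = []
    for desc_name, desc in descriptors.items():
        for dist_name, dist in distances.items():
            score = separation_score(items, labels, desc, dist)
            scored.append(((desc_name, dist_name), score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: relevant items oscillate strongly, irrelevant ones are nearly flat.
items = [[0, 10, 0, 10], [10, 0, 10, 0], [5, 5, 5, 5], [4, 6, 4, 6]]
labels = [True, True, False, False]
ranking = rank_similarity_measures(
    items,
    labels,
    {"identity": identity, "mean_and_range": mean_and_range},
    {"euclidean": euclidean, "manhattan": manhattan},
)
```

On this toy data the range-based descriptor separates the two classes better than the raw values, so it is ranked first. A real system would use richer descriptors and a more robust criterion.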
Visual Analytics and Similarity Search : Concepts and Challenges for Effective Retrieval Considering Users, Tasks, and Data
2017, Seebacher, Daniel, Häußler, Johannes, Stein, Manuel, Janetzko, Halldor, Schreck, Tobias
A major challenge of the contemporary information age is the overwhelming and growing amount of data, especially when looking for specific information. Searching for relevant information manually is no longer possible; it has to rely on automatic methods, specifically similarity search. From a formal perspective, similarity search can be seen as the problem of finding entities that are considered similar to a query with respect to certain describing features. The question of which features, or which weighted combination of features, to use for a given query creates a need for semi-automatic methods that address the needs of diverse users. Furthermore, the quality of the results of a similarity search involves more than effectiveness as measured by precision and recall. The user ideally needs to trust the results and understand how they were computed. We propose to apply Visual Analytics methodologies, which enable synergistic cooperation of user and algorithms, to integrate three key dimensions of similarity search: users, tasks, and data for effective search. However, there is a gap in knowledge about how the user, the task, and the available data influence each other and the similarity search. In this concept paper, we envision how Visual Analytics can be used to tackle current challenges of similarity search.
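To make the formal view in this abstract concrete, here is a minimal sketch of similarity search over feature vectors with a weighted combination of features, together with precision and recall. The weighted Euclidean distance, the toy database, and all names are assumptions for illustration and are not taken from the paper.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; each weight scales one feature's influence."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def similarity_search(query, database, weights, k):
    """Return the keys of the k entries closest to the query."""
    ranked = sorted(
        database,
        key=lambda key: weighted_distance(query, database[key], weights),
    )
    return ranked[:k]


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against a set of relevant keys."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical database of two-feature entities.
database = {
    "a": [1.0, 0.0],
    "b": [0.9, 5.0],
    "c": [5.0, 0.1],
    "d": [1.2, 9.0],
}
query = [1.0, 0.0]

by_first_feature = similarity_search(query, database, weights=(1.0, 0.0), k=3)
by_second_feature = similarity_search(query, database, weights=(0.0, 1.0), k=2)
precision, recall = precision_recall(by_first_feature, relevant={"a", "b"})
```

The same query returns different neighbours depending on the feature weights, which is why the choice of weights depends on the user and the task.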
Enhancing Parallel Coordinates : Statistical Visualizations for Analyzing Soccer Data
2016, Janetzko, Halldor, Stein, Manuel, Sacha, Dominik, Schreck, Tobias
Visualizing multi-dimensional data in an easy and interpretable way is one of the key features of Parallel Coordinate Plots. However, limitations such as overplotting or missing density information have resulted in many proposed enhancements for Parallel Coordinates. In this paper, we include density information along each axis for clustered data. The main idea is to visually represent the density distribution of each cluster along the axes. We show the applicability of our method by analyzing the activity phases of professional soccer players. A final discussion and conclusion complement this paper.
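The per-cluster density along each axis can be approximated with simple histograms. The sketch below is one possible reading of the idea, not the paper's method: it assumes equal-width bins over each axis's global value range and normalizes counts per cluster. The function name and the binning scheme are assumptions.

```python
from collections import defaultdict


def axis_densities(rows, cluster_ids, bins=10):
    """Normalized histogram of every cluster along every axis.

    Returns {cluster_id: [histogram_for_axis_0, histogram_for_axis_1, ...]},
    where each histogram is a list of `bins` fractions summing to 1.
    """
    n_dims = len(rows[0])
    lows = [min(row[d] for row in rows) for d in range(n_dims)]
    highs = [max(row[d] for row in rows) for d in range(n_dims)]

    counts = defaultdict(lambda: [[0] * bins for _ in range(n_dims)])
    for row, cid in zip(rows, cluster_ids):
        for d, value in enumerate(row):
            span = highs[d] - lows[d]
            if span == 0:
                idx = 0  # constant axis: everything falls into the first bin
            else:
                # The axis maximum would map to index `bins`, so clamp it.
                idx = min(int((value - lows[d]) / span * bins), bins - 1)
            counts[cid][d][idx] += 1

    return {
        cid: [[c / sum(hist) for c in hist] for hist in per_axis]
        for cid, per_axis in counts.items()
    }


# Two toy clusters in two dimensions.
rows = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.2]]
cluster_ids = ["x", "x", "y", "y"]
densities = axis_densities(rows, cluster_ids, bins=2)
```

Each resulting histogram could then be drawn beside its axis, for example as a small bar or area chart coloured by cluster.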
Scalability of Non-Rigid 3D Shape Retrieval
2015, Sipiran, Ivan, Bustos, Benjamin, Schreck, Tobias, Bronstein, Alex, Castellani, Umberto, Choi, Sungbin, Lai, Long, Li, Haisheng, Litman, Roee, Sun, Li
Due to recent advances in 3D acquisition and modeling, increasingly large amounts of 3D shape data are becoming available in many application domains. This raises the need not only for effective 3D shape retrieval methods, but also for efficient retrieval and robust implementations. Previous 3D retrieval challenges have mainly considered data sets in the range of a few thousand queries. In the 2015 SHREC track on Scalability of 3D Shape Retrieval, we provide a benchmark with more than 96 thousand shapes. The data set is based on a non-rigid retrieval benchmark enhanced by other existing shape benchmarks. From the baseline models, a large set of partial objects was automatically created by simulating a range-image acquisition process. Four teams have participated in the track, with most methods providing very good to near-perfect retrieval results, and one less complex baseline method providing fair performance. Timing results indicate that three of the methods, including the latter baseline one, provide near-interactive query execution times. Generally, the cost of data pre-processing varies depending on the method.
SocialOcean : Visual Analysis and Characterization of Social Media Bubbles
2018, Diehl, Alexandra, Hundt, Michael, Häußler, Johannes, Seebacher, Daniel, Chen, Siming, Cilasun, Nida, Keim, Daniel A., Schreck, Tobias
Social media allows citizens, corporations, and authorities to create, post, and exchange information. The study of its dynamics will enable analysts to understand user activities and social group characteristics such as connectedness, geospatial distribution, and temporal behavior. In this context, social media bubbles can be defined as social groups that exhibit certain biases in social media. These biases strongly depend on the dimensions selected in the analysis, for example, topic affinity, credibility, sentiment, and geographic distribution. In this paper, we present SocialOcean, a visual analytics system that allows for the investigation of social media bubbles. There exists a large body of research in the social sciences that identifies important dimensions of social media bubbles (SMBs). While such dimensions have been studied separately, and some of them also in combination, it is still an open question which dimensions play the most important role in defining SMBs. Since the concept of SMBs is fairly recent, there are many unknowns regarding their characterization. We investigate the thematic and spatiotemporal characteristics of SMBs and present a visual analytics system to address questions such as: What are the most important dimensions that characterize SMBs? How do SMBs manifest in the presence of specific events that resonate with them? We illustrate our approach using three different real scenarios related to the single event of the Boston Marathon bombing and to political news about global warming. We perform an expert evaluation, analyze the experts' feedback, and present the lessons learned.
Leaf Glyphs : Story Telling and Data Analysis Using Environmental Data Glyph Metaphors
2016-02-12, Fuchs, Johannes, Jäckle, Dominik, Weiler, Niklas, Schreck, Tobias
In exploratory data analysis, important analysis tasks include the assessment of similarity of data points, labeling of outliers, identifying and relating groups in data, and more generally, the detection of patterns. Specifically, for large data sets, such tasks may be effectively addressed by glyph-based visualizations. Appropriately defined glyph designs and layouts may represent collections of data to address these tasks. Important problems in glyph visualization include the design of compact glyph representations, and a similarity- or structure-preserving 2D layout. Projection-based techniques are commonly used to generate layouts, but often suffer from over-plotting in 2D display space, which may hinder comparing and relating tasks. Inspired by contour and venation shapes of natural leaves, and their aggregation by stems, we introduce a novel glyph design for visualizing multi-dimensional data. Motivated by the human ability to visually discriminate natural shapes like trees in a forest, single flowers in a flower-bed, or leaves on shrubs, we design a flexible leaf-shaped data glyph, where data controls main leaf properties including leaf morphology, leaf venation, and leaf boundary shape. Our basic leaf glyph can map to more than a dozen numeric and categorical variables. We also define custom visual aggregation schemes to scale the glyph for large numbers of data records, including prototype-based, set-based, and hierarchic aggregation. We show by example that our design is effectively interpretable to solve multivariate data analysis tasks, and provides effective data mapping. The design provides an aesthetically pleasing appearance, and lends itself easily to storytelling in environmental data analysis problems, among others. The glyph and its aggregation schemes are proposed as a scalable multivariate data visualization design, with applications in data visualization for mass media and data journalism, among others.
Analysis and Comparison of Feature-Based Patterns in Urban Street Networks
2017-08-09, Shao, Lin, Mittelstädt, Sebastian, Goldblatt, Ran, Omer, Itzhak, Bak, Peter, Schreck, Tobias
Analysis of street networks is a challenging task, needed in urban planning applications such as urban design or transportation network analysis. Typically, different network features of interest are used for within-network and between-network comparisons. We introduce StreetExplorer, a visual-interactive system for analysis and comparison of global and local patterns in urban street networks. The system uses appropriate similarity functions to search for patterns, taking into account topological and geometric features of a street network. We enhance the visual comparison of street network patterns by a suitable color-mapping and boosting scheme to visualize the similarity between street network portions and the distribution of network features. Together with experts from the urban morphology domain, we apply our approach to analyze and compare two urban street networks, identifying patterns of historic development and modern planning approaches, demonstrating the usefulness of StreetExplorer.
Visual-Interactive Search for Soccer Trajectories to Identify Interesting Game Situations
2016, Shao, Lin, Sacha, Dominik, Neldner, Benjamin, Stein, Manuel, Schreck, Tobias
Recently, sports analytics has turned into an important research area of visual analytics and may provide interesting findings, such as the best player of the season, for various kinds of sports. Soccer is a very popular and tactical game, which has also attracted great attention in the last few years. However, the search for complex game movements is a crucial and challenging task. We present a system for searching trajectory data in soccer matches by means of an interactive search interface that enables the user to sketch a situation of interest. Furthermore, we apply a domain-specific prefiltering process to extract a set of local movement segments that are similar to a given sketch. Our approach comprises single-trajectory, multi-trajectory, and event-specific search functions based on two different similarity measures. To demonstrate the usefulness of our approach, we define a domain-specific task analysis and conduct a case study together with a domain expert from FC Bayern München by investigating a real-world soccer match. Finally, we show that multi-trajectory search in combination with event-specific filtering is needed to describe and retrieve complex moves in soccer matches.
Range Scans based 3D Shape Retrieval
2015, Godil, Afzal, Dutagaci, Helin, Bustos, Benjamin, Choi, Sunghyun, Dong, Shuilong, Furuya, Takahiko, Li, Haisheng, Link, Norman, Moriyama, A., Meruane, Rafael, Ohbuchi, Ryutarou, Paulus, Dietrich, Schreck, Tobias, Seib, Viktor, Sipiran, Ivan, Yin, Huanpu, Zhang, Chaoli
The objective of the SHREC'15 Range Scans based 3D Shape Retrieval track is to evaluate algorithms that match range scans of real objects to complete 3D mesh models in a target dataset. The task is to retrieve a ranked list of complete 3D models that are of the same category, given the range scan of a query object. This capability is essential to many computer vision systems that involve recognition and classification of objects in the environment based on depth information. In this track, the target dataset consists of 1200 3D mesh models and the query set has 180 range scans of 60 physical objects. Six research groups participated in the contest with a total of 16 different runs. This paper presents the track datasets, the participants' methods, and the results of the contest.