History oblivious route recovery on road networks
2022, Chondrogiannis, Theodoros, Bornholdt, Johann, Bouros, Panagiotis, Grossniklaus, Michael
The availability of GPS sensors in vehicles has enabled the collection of trajectory data that can be utilized to improve the quality of location-based services. However, mostly due to privacy concerns, many data sets are published without containing entire trajectories but only the source location, the target location and the duration of recorded trips. In this paper, we study the problem of route recovery from trip data. In contrast to recent works that assume the availability of entire trajectories for past trips, we investigate methods for route recovery in the absence of such historical data, and we present methods for recovering the single most likely route that a vehicle has travelled. Furthermore, we introduce the region recovery problem that aims at determining a small region that is very likely to contain the traveled route. We also introduce region recovery methods for both single trips and trip groups. In a comprehensive experimental evaluation, we study the efficacy of our solutions for both the route and the region recovery problem. For the region recovery problem in particular, we demonstrate the pros and cons of each method along with the trade-off they offer between the size of the recovered region and the likelihood that the region contains the actual route.
Online Landmark-Based Batch Processing of Shortest Path Queries
2021, Hotz, Manuel, Chondrogiannis, Theodoros, Wörteler, Leonard, Grossniklaus, Michael
Processing shortest path queries is a basic operation in many graph problems. Both preprocessing-based and batch processing techniques have been proposed to speed up the computation of a single shortest path by amortizing its costs. However, both of these approaches suffer from limitations. The former techniques are prohibitively expensive in situations where the precomputed information needs to be updated frequently due to changes in the graph, while the latter require coordinates and cannot be used on non-spatial graphs. In this paper, we address both limitations and propose novel techniques for batch processing shortest paths queries using landmarks. We show how preprocessing can be avoided entirely by integrating the computation of landmark distances into query processing. Our experimental results demonstrate that our techniques outperform the state of the art on both spatial and non-spatial graphs with a maximum speedup of 3.61 × in online scenarios.
Experiences with Implementing Landmark Embedding in Neo4j
2019, Hotz, Manuel, Chondrogiannis, Theodoros, Wörteler, Leonard, Grossniklaus, Michael
Reachability, distance, and shortest path queries are fundamental operations in the field of graph data management with various applications in research and industry. However, while various preprocessing-based methods have been proposed to optimize the computation of such queries, the integration of existing methods into graph database management systems and processing frameworks has been limited. In this paper, we present an implementation of a static graph index that employs landmark embedding for Neo4j, to enable the index-based computation of reachability, distance, and shortest path queries on the database. We explore different strategies for selecting landmarks and different schemes for storing the precomputed landmark distances. To evaluate the efficiency of each landmark selection strategy and each storage scheme, we conduct an experimental evaluation using four real-world network datasets. We measure the preprocessing cost, the query processing time, and the accuracy of the distance estimation of different configurations of our index structure.
Robustness Metrics for Relational Query Execution Plans
2018, Wolf, Florian, Brendle, Michael, May, Norman, Willems, Paul R., Sattler, Kai-Uwe, Grossniklaus, Michael
The quality of query execution plans in database systems determines how fast a query can be executed. It has been shown that conventional query optimization still selects suboptimal or even bad execution plans, due to errors in the cardinality estimation. Although cardinality estimation errors are an evident problem, they are in general not considered in the selection of query execution plans. In this paper, we present three novel metrics for the robustness of relational query execution plans w.r.t. cardinality estimation errors. We also present a novel plan selection strategy that takes both, estimated cost and estimated robustness into account, when choosing a plan for execution. Finally, we share the results of our experimental comparison between robust and conventional plan selection on real world and synthetic benchmarks, showing a speedup of at most factor 3:49.
Cardinality Estimation using Label Probability Propagation for Subgraph Matching in Property Graph Databases
2022, Wörteler, Leonard, Renftle, Moritz, Chondrogiannis, Theodoros, Grossniklaus, Michael
Estimating query result cardinality is a central task of cost-based database query optimizers, enabling them to identify and avoid excessively large intermediate results. While cardinality estimation has been studied extensively in relational databases, research in the setting of graph databases has been more limited. In this paper, we address the problem of cardinality estimation for subgraph matching on property graph databases. Our novel cardinality estimation technique starts from a small amount of statistical information about node labels and relationship types, which is propagated along the graph query pattern in terms of label probabilities. Additionally, estimation quality can be improved by providing information about labels or properties to our technique, if available. In our experimental evaluation, we compare our approach to state-of-the-art cardinality estimation techniques for subgraph matching for property graph, RDF, and relational databases, and we demonstrate that our technique offers the best trade-off between accuracy and efficiency.
Where to go : Computational and visual what-if analyses in soccer
2019-12-17, Stein, Manuel, Seebacher, Daniel, Marcelino, Rui, Schreck, Tobias, Grossniklaus, Michael, Keim, Daniel A., Janetzko, Halldor
To prepare their teams for upcoming matches, analysts in professional soccer watch and manually annotate up to three matches a day. When annotating matches, domain experts try to identify and improve suboptimal movements based on intuition and professional experience. The high amount of matches needing to be analysed manually result in a tedious and time-consuming process, and results may be subjective. We propose an automatic approach for the realisation of effective region-based what-if analyses in soccer. Our system covers the automatic detection of region-based faulty movement behaviour, as well as the automatic suggestion of possible improved alternative movements. As we show, our approach effectively supports analysts and coaches investigating matches by speeding up previously time-consuming work. We enable domain experts to include their domain knowledge in the analysis process by allowing to interactively adjust suggested improved movement, as well as its implications on region control. We demonstrate the usefulness of our proposed approach via an expert study with three invited domain experts, one being head coach from the first Austrian soccer league. As our results show that experts most often agree with the suggested player movement (83%), our proposed approach enhances the analytical capabilities in soccer and supports a more efficient analysis.
Revealing the Invisible : Visual Analytics and Explanatory Storytelling for Advanced Team Sport Analysis
2018-10, Stein, Manuel, Breitkreutz, Thorsten, Häußler, Johannes, Seebacher, Daniel, Niederberger, Christoph, Schreck, Tobias, Grossniklaus, Michael, Keim, Daniel A., Janetzko, Halldor
The analysis of invasive team sports often concentrates on cooperative and competitive aspects of collective movement behavior. A main goal is the identification and explanation of strategies, and eventually the development of new strategies. In visual sports analytics, a range of different visual-interactive analysis techniques have been proposed, e.g., based on visualization using for example trajectories, graphs, heatmaps, and animations. Identifying suitable visualizations for a specific situation is key to a successful analysis. Existing systems enable the interactive selection of different visualization facets to support the analysis process. However, an interactive selection of appropriate visualizations is a difficult, complex, and time-consuming task. In this paper, we propose a four-step analytics conceptual workflow for an automatic selection of appropriate views for key situations in soccer games. Our concept covers classification, specification, explanation, and alteration of match situations, effectively enabling the analysts to focus on important game situations and the determination of alternative moves. Combining abstract visualizations with real world video recordings by Immersive Visual Analytics and descriptive storylines, we support domain experts in understanding key situations. We demonstrate the usefulness of our proposed conceptual workflow via two proofs of concept and evaluate our system by comparing our results to manual video annotations by domain experts. Initial expert feedback shows that our proposed concept improves the understanding of competitive sports and leads to a more efficient data analysis.
SAHARA : Memory Footprint Reduction of Cloud Databases with Automated Table Partitioning
2022, Brendle, Michael, Weber, Nick, Valiyev, Mahammad, May, Norman, Schulze, Robert, Böhm, Alexander, Moerkotte, Guido, Grossniklaus, Michael
Enterprises increasingly move their databases into the cloud. As a result, database-as-a-service providers are challenged to meet the performance guarantees assured in service-level agreements (SLAs) while keeping hardware costs as low as possible. Being cost-effective is particularly crucial for cloud databases where the provisioned amount of DRAM dominates the hardware costs. A way to decrease the memory footprint is to leverage access skew in the workload by moving rarely accessed cold data to cheaper storage layers and retaining only frequently accessed hot data in main memory. In this paper, we present SAHARA, an advisor that proposes a table partitioning for column stores with minimal memory footprint while still adhering to all performance SLAs. SAHARA collects lightweight workload statistics, classifies data as hot and cold, and calculates optimal or near-optimal range partitioning layouts with low optimization time using a novel cost model. We integrated SAHARA into a commercial cloud database and show in our experiments for real-world and synthetic benchmarks a memory footprint reduction of 2.5× while still fulfilling all performance SLAs provided by the customer or advertised by the DBaaS provider.
From Movement to Events : Improving Soccer Match Annotations
2019, Stein, Manuel, Seebacher, Daniel, Karge, Tassilo, Polk, Tom, Grossniklaus, Michael, Keim, Daniel A.
Match analysis has become an important task in everyday work at professional soccer clubs in order to improve team performance. Video analysts regularly spend up to several days analyzing and summarizing matches based on tracked and annotated match data. Although there already exists extensive capabilities to track the movement of players and the ball from multimedia data sources such as video recordings, there is no capability to sufficiently detect dynamic and complex events within these data. As a consequence, analysts have to rely on manually created annotations, which are very time-consuming and expensive to create. We propose a novel method for the semi-automatic definition and detection of events based entirely on movement data of players and the ball. Incorporating Allen’s interval algebra into a visual analytics system, we enable analysts to visually define as well as search for complex, hierarchical events. We demonstrate the usefulness of our approach by quantitatively comparing our automatically detected events with manually annotated events from a professional data provider as well as several expert interviews. The results of our evaluation show that the required annotation time for complete matches by using our system can be reduced to a few seconds while achieving a similar level of performance.
Leveraging annotation-based modeling with JUMP
2018-02, Bergmayr, Alexander, Grossniklaus, Michael, Wimmer, Manuel, Kappel, Gerti
The capability of UML profiles to serve as annotation mechanism has been recognized in both research and industry. Today’s modeling tools offer profiles specific to platforms, such as Java, as they facilitate model-based engineering approaches. However, considering the large number of possible annotations in Java, manually developing the corresponding profiles would only be achievable by huge development and maintenance efforts. Thus, leveraging annotation-based modeling requires an automated approach capable of generating platform-specific profiles from Java libraries. To address this challenge, we present the fully automated transformation chain realized by Jump, thereby continuing existing mapping efforts between Java and UML by emphasizing on annotations and profiles. The evaluation of Jump shows that it scales for large Java libraries and generates profiles of equal or even improved quality compared to profiles currently used in practice. Furthermore, we demonstrate the practical value of Jump by contributing profiles that facilitate reverse engineering and forward engineering processes for the Java platform by applying it to a modernization scenario.