An Image-Based Approach to Visual Feature Space Analysis
2008, Schreck, Tobias, Schneidewind, Jörn, Keim, Daniel A.
Methods for the management and analysis of non-standard data often rely on the so-called feature vector approach. The technique describes complex data instances by vectors of characteristic numeric values, which make it possible to index the data and to calculate similarity scores between data elements. Feature vectors are therefore often a key ingredient of intelligent data analysis algorithms, including clustering, classification, and similarity search algorithms. However, identifying appropriate feature vectors for a given database of a given data type is a challenging task. Determining good feature vector extractors usually involves benchmarks relying on supervised information, which makes it an expensive and data-dependent process. In this paper, we address the feature selection problem with a novel approach based on the analysis of certain feature space images. We develop two image-based analysis techniques for automatically analyzing the discrimination power of feature spaces. We evaluate the techniques on a comprehensive feature selection benchmark, demonstrating the effectiveness of our analysis and its potential for automatically addressing the feature selection problem.
Semiautomatic benchmarking of feature vectors for multimedia retrieval
2007, Schreck, Tobias, Schneidewind, Jörn, Keim, Daniel A., Ward, Matthew O., Tatu, Andrada
Modern Digital Library applications store and process massive amounts of information. Usually, this data is not limited to raw textual or numeric data; typical applications also deal with multimedia data such as images, audio, video, or 3D geometric models. To provide effective retrieval functionality, appropriate metadata descriptors that allow the calculation of similarity scores between data instances are required. Feature vectors are a generic way of describing multimedia data by vectors formed from numerically captured object features. They are used in similarity search, but can also be used for clustering and wider multimedia analysis applications. Extracting effective feature vectors for a given data type is a challenging task. Determining good feature vector extractors usually involves experimentation and the application of supervised information. However, such experimentation is usually expensive, and supervised information is often data dependent. We address the feature selection problem with a novel approach based on the analysis of certain feature space images. We develop two image-based analysis techniques for automatically analyzing the discrimination power of feature spaces. We evaluate the techniques on a comprehensive feature selection benchmark, demonstrating the effectiveness of our analysis and its potential for automatically addressing the feature selection problem.
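As a rough illustration of the feature vector approach described in the abstract above, the following minimal sketch ranks database objects by their distance to a query vector. It is not taken from the paper: the Euclidean distance, the function names, and the three-dimensional example vectors are all assumptions chosen for illustration, and real extractors would produce far higher-dimensional vectors.

```python
import math


def euclidean_distance(a, b):
    """Distance between two feature vectors of equal length (smaller = more similar)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimensionality")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query, database):
    """Return object names sorted from most to least similar to the query vector."""
    return sorted(database, key=lambda name: euclidean_distance(query, database[name]))


# Hypothetical feature vectors, e.g. coarse color histograms of three images.
database = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.1, 0.8, 0.1],
    "img_c": [0.7, 0.2, 0.1],
}
query = [0.85, 0.15, 0.0]

ranking = rank_by_similarity(query, database)
print(ranking)
```

How well such a ranking matches human notions of similarity depends on the chosen feature extractor, which is the selection problem the abstract addresses.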
Monitoring Network Traffic with Radial Traffic Analyzer
2006-12, Keim, Daniel A., Mansmann, Florian, Schneidewind, Jörn, Schreck, Tobias
The extensive spread of malicious code on the Internet and within intranets has raised users' concern about what kind of data is transferred between their computers and other hosts on the network. Visual analysis of this kind of information is a challenging task, due to the complexity and volume of the data type considered, and requires specially designed visualization techniques. In this paper, we present a scalable visualization toolkit for analyzing the network activity of computer hosts on a network. The visualization combines network packet volume and type distribution information with geographic information, enabling the analyst to use geographic distortion techniques such as the HistoMap technique to become aware of the traffic components in the course of the analysis. The presented analysis tool is especially useful for comparing important network load characteristics in a geographically aware display, for relating communication partners, and for identifying the type of network traffic occurring. The results of the analysis are helpful in understanding typical network communication activities and in anticipating potential performance bottlenecks or problems. The tool is suited both for off-line analysis of historical data and, via animation, for on-line monitoring of packet-based network traffic in real time.