Publikation: A Cost Model for Nearest Neighbor Search in High-Dimensional Data Space
Dateien
Datum
Autor:innen
Herausgeber:innen
ISSN der Zeitschrift
Electronic ISSN
ISBN
Bibliografische Daten
Verlag
Schriftenreihe
Auflagebezeichnung
URI (zitierfähiger Link)
DOI (zitierfähiger Link)
Internationale Patentnummer
Link zur Lizenz
Angaben zur Forschungsförderung
Projekt
Open Access-Veröffentlichung
Core Facility der Universität Konstanz
Titel in einer weiteren Sprache
Publikationstyp
Publikationsstatus
Erschienen in
Zusammenfassung
In this paper, we present a new cost model for nearest neighbor search in high-dimensional data space. We first analyze different nearest neighbor algorithms, present a generalization of an algorithm which has been originally proposed for Quadtrees [13], and show that this algorithm is optimal. Then, we develop a cost model which - in contrast to previous models - takes boundary effects into account and therefore also works in high dimensions. The advantages of our model are in particular: Our model works for data sets with an arbitrary number of dimensions and an arbitrary number of data points, is applicable to different data distributions and index structures, and provides accurate estimates of the expected query execution time. To show the practical relevance and accuracy of our model, we perform a detailed analysis using synthetic and real data. The results of applying our model to Hilbert and X-tree indices show that it provides a good estimation of the query performance, which is considerably better than the estimates by previous models especially for highdimensional data.
Zusammenfassung in einer weiteren Sprache
Fachgebiet (DDC)
Schlagwörter
Konferenz
Rezension
Zitieren
ISO 690
BERCHTOLD, Stefan, Christian BÖHM, Daniel A. KEIM, Hans-Peter KRIEGEL, 1997. A Cost Model for Nearest Neighbor Search in High-Dimensional Data Space. the sixteenth ACM SIGACT-SIGMOD-SIGART symposium. Tucson, Arizona, United States, 11. Mai 1997 - 15. Mai 1997. In: Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems - PODS '97. New York, New York, USA: ACM Press, 1997, pp. 78-86. ISBN 0-89791-910-6. Available under: doi: 10.1145/263661.263671BibTex
@inproceedings{Berchtold1997Model-5738, year={1997}, doi={10.1145/263661.263671}, title={A Cost Model for Nearest Neighbor Search in High-Dimensional Data Space}, isbn={0-89791-910-6}, publisher={ACM Press}, address={New York, New York, USA}, booktitle={Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems - PODS '97}, pages={78--86}, author={Berchtold, Stefan and Böhm, Christian and Keim, Daniel A. and Kriegel, Hans-Peter} }
RDF
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/5738"> <dcterms:title>A Cost Model for Nearest Neighbor Search in High-Dimensional Data Space</dcterms:title> <dc:contributor>Keim, Daniel A.</dc:contributor> <dcterms:issued>1997</dcterms:issued> <dc:creator>Berchtold, Stefan</dc:creator> <dc:creator>Keim, Daniel A.</dc:creator> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T15:59:42Z</dc:date> <dc:rights>Attribution-NonCommercial-NoDerivs 2.0 Generic</dc:rights> <dc:contributor>Kriegel, Hans-Peter</dc:contributor> <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5738/1/PODS97.pdf"/> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T15:59:42Z</dcterms:available> <dc:contributor>Böhm, Christian</dc:contributor> <dc:creator>Böhm, Christian</dc:creator> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/5738"/> <dc:format>application/pdf</dc:format> <dcterms:abstract xml:lang="eng">In this paper, we present a new cost model for nearest neighbor search in high-dimensional data space. We first analyze different nearest neighbor algorithms, present a generalization of an algorithm which has been originally proposed for Quadtrees [13], and show that this algorithm is optimal. Then, we develop a cost model which - in contrast to previous models - takes boundary effects into account and therefore also works in high dimensions. The advantages of our model are in particular: Our model works for data sets with an arbitrary number of dimensions and an arbitrary number of data points, is applicable to different data distributions and index structures, and provides accurate estimates of the expected query execution time. To show the practical relevance and accuracy of our model, we perform a detailed analysis using synthetic and real data. The results of applying our model to Hilbert and X-tree indices show that it provides a good estimation of the query performance, which is considerably better than the estimates by previous models especially for highdimensional data.</dcterms:abstract> <foaf:homepage rdf:resource="http://localhost:8080/"/> <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5738/1/PODS97.pdf"/> <dc:creator>Kriegel, Hans-Peter</dc:creator> <dc:language>eng</dc:language> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-nc-nd/2.0/"/> <dcterms:bibliographicCitation>First publ. in: Proceedings / ACM PODS Symposium on Principles of Database Systems, Tucson, AZ, 1997, pp. 78-86</dcterms:bibliographicCitation> <dc:contributor>Berchtold, Stefan</dc:contributor> </rdf:Description> </rdf:RDF>