Improving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operations

dc.contributor.authorBerchtold, Stefandeu
dc.contributor.authorBöhm, Christiandeu
dc.contributor.authorKriegel, Hans-Peterdeu
dc.date.accessioned2011-03-24T16:00:04Zdeu
dc.date.available2011-03-24T16:00:04Zdeu
dc.date.issued2006-11-22
dc.description.abstractIn this paper, we propose a new bulk-loading technique for high-dimensional indexes which represent an important component of multimedia database systems. Since it is very inefficient to construct an index for a large amount of data by dynamic insertion of single objects, there is an increasing interest in bulk-loading techniques. In contrast to previous approaches, our technique exploits a priori knowledge of the complete data set to improve both construction time and query performance. Our algorithm operates in a mannar similar to the Quicksort algorithm and has an average runtime complexity of O(n log n). We additionally improve the query performance by optimizing the shape of the bounding boxes, by completely avoiding overlap, and by clustering the pages on disk. As we analytically show, the split strategy typically used in dynamic index structures, splitting the data space at the 50%-quantile, results in a bad query performance in high-dimensional spaces. Therefore, we use a sophisticated unbalanced split strategy, which leads to a much better space partitioning. An exhaustive experimental evaluation shows that our technique clearly outperforms both classic index construction and competitive bulk loading techniques. In comparison with dynamic index construction we achieve a speed-up factor of up to 588 for the construction time. The constructed index causes up to 16.88 times fewer page accesses and is up to 198 times faster (real time) in query processing.eng
dc.description.versionpublished
dc.format.mimetypeapplication/pdfdeu
dc.identifier.citationFirst publ. in: Advances in Database Technology EDBT'98. Berlin: Springer, 1998, pp. 216-230deu
dc.identifier.doi10.1007/BFb0100987
dc.identifier.ppn288446100deu
dc.identifier.urihttp://kops.uni-konstanz.de/handle/123456789/5780
dc.language.isoengdeu
dc.legacy.dateIssued2008deu
dc.rightsAttribution-NonCommercial-NoDerivs 2.0 Generic
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/2.0/
dc.subject.ddc004deu
dc.titleImproving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operationseng
dc.typeINPROCEEDINGSdeu
dspace.entity.typePublication
kops.citation.bibtex
@inproceedings{Berchtold2006-11-22Impro-5780,
  year={2006},
  doi={10.1007/BFb0100987},
  title={Improving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operations},
  number={1377},
  isbn={978-3-540-64264-0},
  publisher={Springer Berlin Heidelberg},
  address={Berlin, Heidelberg},
  series={Lecture Notes in Computer Science},
  booktitle={Advances in Database Technology — EDBT'98},
  pages={216--230},
  editor={Schek, Hans-Jörg and Alonso, Gustavo and Saltor, Felix and Ramos, Isidro},
  author={Berchtold, Stefan and Böhm, Christian and Kriegel, Hans-Peter}
}
kops.citation.iso690BERCHTOLD, Stefan, Christian BÖHM, Hans-Peter KRIEGEL, 2006. Improving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operations. In: SCHEK, Hans-Jörg, ed., Gustavo ALONSO, ed., Felix SALTOR, ed., Isidro RAMOS, ed.. Advances in Database Technology — EDBT'98. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 216-230. Lecture Notes in Computer Science. 1377. ISBN 978-3-540-64264-0. Available under: doi: 10.1007/BFb0100987deu
kops.citation.iso690BERCHTOLD, Stefan, Christian BÖHM, Hans-Peter KRIEGEL, 2006. Improving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operations. In: SCHEK, Hans-Jörg, ed., Gustavo ALONSO, ed., Felix SALTOR, ed., Isidro RAMOS, ed.. Advances in Database Technology — EDBT'98. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 216-230. Lecture Notes in Computer Science. 1377. ISBN 978-3-540-64264-0. Available under: doi: 10.1007/BFb0100987eng
kops.citation.rdf
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/5780">
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dcterms:bibliographicCitation>First publ. in: Advances in Database Technology   EDBT'98. Berlin: Springer, 1998, pp. 216-230</dcterms:bibliographicCitation>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T16:00:04Z</dc:date>
    <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/5780"/>
    <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by-nc-nd/2.0/"/>
    <dc:creator>Böhm, Christian</dc:creator>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:abstract xml:lang="eng">In this paper, we propose a new bulk-loading technique for high-dimensional indexes which represent an important component of multimedia database systems. Since it is very inefficient to construct an index for a large amount of data by dynamic insertion of single objects, there is an increasing interest in bulk-loading techniques. In contrast to previous approaches, our technique exploits a priori knowledge of the complete data set to improve both construction time and query performance. Our algorithm operates in a mannar similar to the Quicksort algorithm and has an average runtime complexity of O(n log n). We additionally improve the query performance by optimizing the shape of the bounding boxes, by completely avoiding overlap, and by clustering the pages on disk. As we analytically show, the split strategy typically used in dynamic index structures, splitting the data space at the 50%-quantile, results in a bad query performance in high-dimensional spaces. Therefore, we use a sophisticated unbalanced split strategy, which leads to a much better space partitioning. An exhaustive experimental evaluation shows that our technique clearly outperforms both classic index construction and competitive bulk loading techniques. In comparison with dynamic index construction we achieve a speed-up factor of up to 588 for the construction time. The constructed index causes up to 16.88 times fewer page accesses and is up to 198 times faster (real time) in query processing.</dcterms:abstract>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dc:contributor>Böhm, Christian</dc:contributor>
    <dc:contributor>Berchtold, Stefan</dc:contributor>
    <dcterms:issued>2006-11-22</dcterms:issued>
    <dcterms:title>Improving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operations</dcterms:title>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:creator>Berchtold, Stefan</dc:creator>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5780/1/edt98.pdf"/>
    <dc:contributor>Kriegel, Hans-Peter</dc:contributor>
    <dc:language>eng</dc:language>
    <dc:creator>Kriegel, Hans-Peter</dc:creator>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T16:00:04Z</dcterms:available>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5780/1/edt98.pdf"/>
    <dc:format>application/pdf</dc:format>
    <dc:rights>Attribution-NonCommercial-NoDerivs 2.0 Generic</dc:rights>
  </rdf:Description>
</rdf:RDF>
kops.description.openAccessopenaccessgreen
kops.identifier.nbnurn:nbn:de:bsz:352-opus-70426deu
kops.opus.id7042deu
kops.sourcefieldSCHEK, Hans-Jörg, ed., Gustavo ALONSO, ed., Felix SALTOR, ed., Isidro RAMOS, ed.. <i>Advances in Database Technology — EDBT'98</i>. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 216-230. Lecture Notes in Computer Science. 1377. ISBN 978-3-540-64264-0. Available under: doi: 10.1007/BFb0100987deu
kops.sourcefield.plainSCHEK, Hans-Jörg, ed., Gustavo ALONSO, ed., Felix SALTOR, ed., Isidro RAMOS, ed.. Advances in Database Technology — EDBT'98. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 216-230. Lecture Notes in Computer Science. 1377. ISBN 978-3-540-64264-0. Available under: doi: 10.1007/BFb0100987deu
kops.sourcefield.plainSCHEK, Hans-Jörg, ed., Gustavo ALONSO, ed., Felix SALTOR, ed., Isidro RAMOS, ed.. Advances in Database Technology — EDBT'98. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 216-230. Lecture Notes in Computer Science. 1377. ISBN 978-3-540-64264-0. Available under: doi: 10.1007/BFb0100987eng
source.bibliographicInfo.fromPage216
source.bibliographicInfo.seriesNumber1377
source.bibliographicInfo.toPage230
source.contributor.editorSchek, Hans-Jörg
source.contributor.editorAlonso, Gustavo
source.contributor.editorSaltor, Felix
source.contributor.editorRamos, Isidro
source.identifier.isbn978-3-540-64264-0
source.publisherSpringer Berlin Heidelberg
source.publisher.locationBerlin, Heidelberg
source.relation.ispartofseriesLecture Notes in Computer Science
source.titleAdvances in Database Technology — EDBT'98

Dateien

Originalbündel

Gerade angezeigt 1 - 1 von 1
Vorschaubild nicht verfügbar
Name:
edt98.pdf
Größe:
183.69 KB
Format:
Adobe Portable Document Format
edt98.pdf
edt98.pdfGröße: 183.69 KBDownloads: 491