KOPS - Das Institutionelle Repositorium der Universität Konstanz

Optimal Grid-Clustering : Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering

Optimal Grid-Clustering : Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering

Zitieren

Dateien zu dieser Ressource

Prüfsumme: MD5:066167fae66929d2f25ca0c4f5e7ba51

HINNEBURG, Alexander, Daniel A. KEIM, 1999. Optimal Grid-Clustering : Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering. In: Proceedings of the 25 th International Conference on Very Large Databases, 1999, pp. 506-517

@inproceedings{Hinneburg1999Optim-5790, title={Optimal Grid-Clustering : Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering}, year={1999}, booktitle={Proceedings of the 25 th International Conference on Very Large Databases, 1999}, pages={506--517}, author={Hinneburg, Alexander and Keim, Daniel A.} }

<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/rdf/resource/123456789/5790"> <foaf:homepage rdf:resource="http://localhost:8080/jspui"/> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/rdf/resource/123456789/36"/> <dc:contributor>Keim, Daniel A.</dc:contributor> <dc:format>application/pdf</dc:format> <dcterms:bibliographicCitation>First publ. in: Proceedings of the 25th International Conference on Very Large Databases, 1999, pp. 506-517</dcterms:bibliographicCitation> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/rdf/resource/123456789/36"/> <dcterms:title>Optimal Grid-Clustering : Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering</dcterms:title> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T16:00:07Z</dc:date> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dcterms:rights rdf:resource="https://creativecommons.org/licenses/by-nc-nd/2.0/legalcode"/> <dc:contributor>Hinneburg, Alexander</dc:contributor> <bibo:uri rdf:resource="http://kops.uni-konstanz.de/handle/123456789/5790"/> <dcterms:abstract xml:lang="eng">Many applications require the clustering of large amounts of high-dimensional data. Most clustering algorithms, however, do not work effectively and efficiently in high-dimensional space, which is due to the so-called "curse of dimensionality". In addition, the high-dimensional data often contains a significant amount of noise which causes additional effectiveness problems. In this paper, we review and compare the existing algorithms for clustering high-dimensional data and show the impact of the curse of dimensionality on their effectiveness and efficiency. The comparison reveals that condensation-based approaches (such as BIRCH or STING) are the most promising candidates for achieving the necessary efficiency, but it also shows that basically all condensation-based approaches have severe weaknesses with respect to their effectiveness in high-dimensional space. To overcome these problems, we develop a new clustering technique called OptiGrid which is based on constructing an optimal grid-partitioning of the data. The optimal grid-partitioning is determined by calculating the best partitioning hyperplanes for each dimension (if such a partitioning exists) using certain projections of the data. The advantages of our new approach are (1) it has a firm mathematical basis (2) it is by far more effective than existing clustering algorithms for high-dimensional data (3) it is very efficient even for large data sets of high dimensionality. To demonstrate the effectiveness and efficiency of our new approach, we perform a series of experiments on a number of different data sets including real data sets from CAD and molecular biology. A comparison with one of the best known algorithms (BIRCH) shows the superiority of our new approach.</dcterms:abstract> <dc:language>eng</dc:language> <dc:creator>Hinneburg, Alexander</dc:creator> <dc:rights>deposit-license</dc:rights> <dc:creator>Keim, Daniel A.</dc:creator> <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5790/1/vldb99.pdf"/> <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/5790/1/vldb99.pdf"/> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-03-24T16:00:07Z</dcterms:available> <dcterms:issued>1999</dcterms:issued> </rdf:Description> </rdf:RDF>

Dateiabrufe seit 01.10.2014 (Informationen über die Zugriffsstatistik)

vldb99.pdf 1075

Das Dokument erscheint in:

deposit-license Solange nicht anders angezeigt, wird die Lizenz wie folgt beschrieben: deposit-license

KOPS Suche


Stöbern

Mein Benutzerkonto