Even Faster Exact k-Means Clustering
Abstract
A naïve implementation of k-means clustering requires computing, for each of the n data points, the distance to each of the k cluster centers, which can result in fairly slow execution. However, by storing distance information obtained in earlier computations, as well as information about the distances between cluster centers, the triangle inequality can be exploited in different ways to reduce the number of distance computations needed, e.g. [3, 4, 5, 7, 11]. In this paper I present an improvement of the Exponion method [11] that generally accelerates the computations. Furthermore, by evaluating several methods on a fairly wide range of artificial data sets, I derive a kind of map that shows, for given data set parameters, which method (often) yields the lowest execution times.
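The role of the triangle inequality can be illustrated with a minimal sketch of one assignment step. This is a generic center-to-center pruning rule, not the Exponion method or the improvement presented in the paper; the function name `assign_with_pruning` and the counter `computed` are hypothetical and chosen only for illustration. The sketch relies on the fact that if d(c_best, c_j) >= 2 * d(x, c_best), then d(x, c_j) >= d(x, c_best), so the distance from x to c_j need not be computed.

```python
import math


def assign_with_pruning(points, centers):
    """One k-means assignment step with triangle-inequality pruning.

    Returns (labels, computed), where `computed` counts point-to-center
    distance computations (illustrative names, not from the paper).
    """
    k = len(centers)
    # Center-to-center distances, computed once per assignment step.
    # These k*k distances are not counted in `computed`.
    cc = [[math.dist(a, b) for b in centers] for a in centers]
    labels, computed = [], 0
    for x in points:
        best, best_d = 0, math.dist(x, centers[0])
        computed += 1
        for j in range(1, k):
            # Triangle inequality: d(x, c_j) >= d(c_best, c_j) - d(x, c_best).
            # If d(c_best, c_j) >= 2 * d(x, c_best), c_j cannot be closer.
            if cc[best][j] >= 2.0 * best_d:
                continue
            d = math.dist(x, centers[j])
            computed += 1
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, computed


points = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.2, 9.9)]
centers = [(0.0, 0.0), (10.0, 10.0)]
labels, computed = assign_with_pruning(points, centers)
```

In this toy example the assignment matches what a naïve search over all centers would return, while fewer than n * k point-to-center distances are evaluated. How much is saved in practice depends on the data and on how well separated the centers are.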
Cite
ISO 690
BORGELT, Christian, 2020. Even Faster Exact k-Means Clustering. IDA 2020: Advances in Intelligent Data Analysis XVIII. Konstanz, 27 Apr. 2020 - 29 Apr. 2020. In: BERTHOLD, Michael R., ed., Ad FEELDERS, ed., Georg KREMPL, ed. Advances in Intelligent Data Analysis XVIII : 18th International Symposium on Intelligent Data Analysis, Proceedings. Cham: Springer, 2020, pp. 93-105. ISSN 0302-9743. eISSN 1611-3349. ISBN 978-3-030-44583-6. Available under: doi: 10.1007/978-3-030-44584-3_8
BibTeX
@inproceedings{Borgelt2020-04-22Faste-55969,
  year={2020},
  doi={10.1007/978-3-030-44584-3_8},
  title={Even Faster Exact k-Means Clustering},
  number={12080},
  isbn={978-3-030-44583-6},
  issn={0302-9743},
  publisher={Springer},
  address={Cham},
  series={Lecture Notes in Computer Science},
  booktitle={Advances in Intelligent Data Analysis XVIII : 18th International Symposium on Intelligent Data Analysis, Proceedings},
  pages={93--105},
  editor={Berthold, Michael R. and Feelders, Ad and Krempl, Georg},
  author={Borgelt, Christian}
}
RDF
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/55969"> <dcterms:title>Even Faster Exact k-Means Clustering</dcterms:title> <foaf:homepage rdf:resource="http://localhost:8080/"/> <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/55969"/> <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/55969/1/Borgelt_2-1kp54adjk555h8.pdf"/> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-21T15:00:13Z</dcterms:available> <dcterms:rights rdf:resource="http://creativecommons.org/licenses/by/4.0/"/> <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/55969/1/Borgelt_2-1kp54adjk555h8.pdf"/> <dcterms:issued>2020-04-22</dcterms:issued> <dc:rights>Attribution 4.0 International</dc:rights> <dc:creator>Borgelt, Christian</dc:creator> <dc:language>eng</dc:language> <dc:contributor>Borgelt, Christian</dc:contributor> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dcterms:abstract xml:lang="eng">A naïve implementation of k-means clustering requires computing for each of the n data points the distance to each of the k cluster centers, which can result in fairly slow execution. However, by storing distance information obtained by earlier computations as well as information about distances between cluster centers, the triangle inequality can be exploited in different ways to reduce the number of needed distance computations, e.g. [3, 4, 5, 7, 11]. 
In this paper I present an improvement of the Exponion method [11] that generally accelerates the computations. Furthermore, by evaluating several methods on a fairly wide range of artificial data sets, I derive a kind of map, for which data set parameters which method (often) yields the lowest execution times.</dcterms:abstract> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2021-12-21T15:00:13Z</dc:date> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> </rdf:Description> </rdf:RDF>