Publication: Distance phenomena in high-dimensional chemical descriptor spaces : Consequences for similarity-based approaches
Abstract
Measuring the (dis)similarity of molecules is important for many cheminformatics applications like compound ranking, clustering, and property prediction. In this work, we focus on real-valued vector representations of molecules (as opposed to the binary spaces of fingerprints). We demonstrate the influence which the choice of (dis)similarity measure can have on results, and provide recommendations for such choices. We review the mathematical concepts used to measure (dis)similarity in vector spaces, namely norms, metrics, inner products, and similarity coefficients, as well as the relationships between them, employing (dis)similarity measures commonly used in cheminformatics as examples. We present several phenomena (empty space phenomenon, sphere volume related phenomena, distance concentration) in high-dimensional descriptor spaces which are not encountered in two and three dimensions. These phenomena are theoretically characterized and illustrated on both artificial and real (bioactivity) data.
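The distance concentration mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function name `relative_contrast`, the uniform sampling in the unit hypercube, the Euclidean distance, and the sample size are all assumptions chosen for illustration. The sketch measures how far apart the nearest and farthest neighbours of a random query point are, relative to the nearest distance, as the dimension grows.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for distances from a random query point.

    Points are drawn uniformly from the unit hypercube [0, 1]^dim. This data
    model is an assumption made for illustration, not the paper's setup.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        # Euclidean distance between the query and the sampled point.
        distances.append(math.dist(query, point))
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the relative contrast for a few dimensions.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions, the relative contrast should shrink as the dimension increases, which means nearest and farthest neighbours become harder to tell apart by distance alone.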
Cite
ISO 690
RUPP, Matthias, Petra SCHNEIDER, Gisbert SCHNEIDER, 2009. Distance phenomena in high-dimensional chemical descriptor spaces : Consequences for similarity-based approaches. In: Journal of Computational Chemistry. Wiley-Blackwell. 2009, 30(14), pp. 2285-2296. ISSN 0192-8651. eISSN 1096-987X. Available under: doi: 10.1002/jcc.21218
BibTeX
@article{Rupp2009-11-15Dista-52121, year={2009}, doi={10.1002/jcc.21218}, title={Distance phenomena in high-dimensional chemical descriptor spaces : Consequences for similarity-based approaches}, number={14}, volume={30}, issn={0192-8651}, journal={Journal of Computational Chemistry}, pages={2285--2296}, author={Rupp, Matthias and Schneider, Petra and Schneider, Gisbert} }
RDF
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/52121"> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-12-15T10:07:54Z</dcterms:available> <dc:contributor>Schneider, Gisbert</dc:contributor> <dc:creator>Schneider, Petra</dc:creator> <dcterms:title>Distance phenomena in high-dimensional chemical descriptor spaces : Consequences for similarity-based approaches</dcterms:title> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dc:creator>Rupp, Matthias</dc:creator> <dc:language>eng</dc:language> <dc:contributor>Schneider, Petra</dc:contributor> <foaf:homepage rdf:resource="http://localhost:8080/"/> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dcterms:abstract xml:lang="eng">Measuring the (dis)similarity of molecules is important for many cheminformatics applications like compound ranking, clustering, and property prediction. In this work, we focus on real-valued vector representations of molecules (as opposed to the binary spaces of fingerprints). We demonstrate the influence which the choice of (dis)similarity measure can have on results, and provide recommendations for such choices. We review the mathematical concepts used to measure (dis)similarity in vector spaces, namely norms, metrics, inner products, and, similarity coefficients, as well as the relationships between them, employing (dis)similarity measures commonly used in cheminformatics as examples. 
We present several phenomena (empty space phenomenon, sphere volume related phenomena, distance concentration) in high-dimensional descriptor spaces which are not encountered in two and three dimensions. These phenomena are theoretically characterized and illustrated on both artificial and real (bioactivity) data.</dcterms:abstract> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-12-15T10:07:54Z</dc:date> <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/52121"/> <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/> <dc:creator>Schneider, Gisbert</dc:creator> <dc:contributor>Rupp, Matthias</dc:contributor> <dcterms:issued>2009-11-15</dcterms:issued> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/> <dc:rights>terms-of-use</dc:rights> </rdf:Description> </rdf:RDF>