Distance phenomena in high-dimensional chemical descriptor spaces : Consequences for similarity-based approaches

Lade...
Vorschaubild
Dateien
Zu diesem Dokument gibt es keine Dateien.
Datum
2009
Autor:innen
Schneider, Petra
Schneider, Gisbert
Herausgeber:innen
Kontakt
ISSN der Zeitschrift
Electronic ISSN
ISBN
Bibliografische Daten
Verlag
Schriftenreihe
Auflagebezeichnung
URI (zitierfähiger Link)
DOI (zitierfähiger Link)
ArXiv-ID
Internationale Patentnummer
EU-Projektnummer
DFG-Projektnummer
Projekt
Open Access-Veröffentlichung
Gesperrt bis
Titel in einer weiteren Sprache
Forschungsvorhaben
Organisationseinheiten
Zeitschriftenheft
Publikationstyp
Zeitschriftenartikel
Publikationsstatus
Published
Erschienen in
Journal of Computational Chemistry. Wiley-Blackwell. 2009, 30(14), pp. 2285-2296. ISSN 0192-8651. eISSN 1096-987X. Available under: doi: 10.1002/jcc.21218
Zusammenfassung

Measuring the (dis)similarity of molecules is important for many cheminformatics applications like compound ranking, clustering, and property prediction. In this work, we focus on real-valued vector representations of molecules (as opposed to the binary spaces of fingerprints). We demonstrate the influence which the choice of (dis)similarity measure can have on results, and provide recommendations for such choices. We review the mathematical concepts used to measure (dis)similarity in vector spaces, namely norms, metrics, inner products, and, similarity coefficients, as well as the relationships between them, employing (dis)similarity measures commonly used in cheminformatics as examples. We present several phenomena (empty space phenomenon, sphere volume related phenomena, distance concentration) in high-dimensional descriptor spaces which are not encountered in two and three dimensions. These phenomena are theoretically characterized and illustrated on both artificial and real (bioactivity) data.

Zusammenfassung in einer weiteren Sprache
Fachgebiet (DDC)
004 Informatik
Schlagwörter
distances, high‐dimensional data, chemical descriptors, distance concentration
Konferenz
Rezension
undefined / . - undefined, undefined
Zitieren
ISO 690RUPP, Matthias, Petra SCHNEIDER, Gisbert SCHNEIDER, 2009. Distance phenomena in high-dimensional chemical descriptor spaces : Consequences for similarity-based approaches. In: Journal of Computational Chemistry. Wiley-Blackwell. 2009, 30(14), pp. 2285-2296. ISSN 0192-8651. eISSN 1096-987X. Available under: doi: 10.1002/jcc.21218
BibTex
@article{Rupp2009-11-15Dista-52121,
  year={2009},
  doi={10.1002/jcc.21218},
  title={Distance phenomena in high-dimensional chemical descriptor spaces : Consequences for similarity-based approaches},
  number={14},
  volume={30},
  issn={0192-8651},
  journal={Journal of Computational Chemistry},
  pages={2285--2296},
  author={Rupp, Matthias and Schneider, Petra and Schneider, Gisbert}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/52121">
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-12-15T10:07:54Z</dcterms:available>
    <dc:contributor>Schneider, Gisbert</dc:contributor>
    <dc:creator>Schneider, Petra</dc:creator>
    <dcterms:title>Distance phenomena in high-dimensional chemical descriptor spaces : Consequences for similarity-based approaches</dcterms:title>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:creator>Rupp, Matthias</dc:creator>
    <dc:language>eng</dc:language>
    <dc:contributor>Schneider, Petra</dc:contributor>
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dcterms:abstract xml:lang="eng">Measuring the (dis)similarity of molecules is important for many cheminformatics applications like compound ranking, clustering, and property prediction. In this work, we focus on real-valued vector representations of molecules (as opposed to the binary spaces of fingerprints). We demonstrate the influence which the choice of (dis)similarity measure can have on results, and provide recommendations for such choices. We review the mathematical concepts used to measure (dis)similarity in vector spaces, namely norms, metrics, inner products, and, similarity coefficients, as well as the relationships between them, employing (dis)similarity measures commonly used in cheminformatics as examples. We present several phenomena (empty space phenomenon, sphere volume related phenomena, distance concentration) in high-dimensional descriptor spaces which are not encountered in two and three dimensions. These phenomena are theoretically characterized and illustrated on both artificial and real (bioactivity) data.</dcterms:abstract>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2020-12-15T10:07:54Z</dc:date>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/52121"/>
    <dcterms:rights rdf:resource="https://rightsstatements.org/page/InC/1.0/"/>
    <dc:creator>Schneider, Gisbert</dc:creator>
    <dc:contributor>Rupp, Matthias</dc:contributor>
    <dcterms:issued>2009-11-15</dcterms:issued>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/36"/>
    <dc:rights>terms-of-use</dc:rights>
  </rdf:Description>
</rdf:RDF>
Interner Vermerk
xmlui.Submission.submit.DescribeStep.inputForms.label.kops_note_fromSubmitter
Kontakt
URL der Originalveröffentl.
Prüfdatum der URL
Prüfungsdatum der Dissertation
Finanzierungsart
Kommentar zur Publikation
Allianzlizenz
Corresponding Authors der Uni Konstanz vorhanden
Internationale Co-Autor:innen
Universitätsbibliographie
Nein
Begutachtet
Ja