Publication: Fast conformational clustering of extensive molecular dynamics simulation data
Abstract
We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach, we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (hierarchical density-based spatial clustering of applications with noise). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here, the cc_analysis algorithm is applied for the first time to molecular simulation data. The encodermap algorithm complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice, we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This allows us to find clusters of different densities and different degrees of structural identity. With the help of four protein systems, we illustrate the capability and performance of this clustering workflow: wild-type and thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these test systems poses its own challenges to the scheme, which together give a good overview of the advantages and potential difficulties that can arise when using the proposed method.
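The abstract mentions a tunable RMSD-based criterion in the final cluster assignment but does not spell out how it works. The following is a minimal sketch of one plausible reading, not the authors' implementation: each frame is assigned to the closest cluster reference structure only if its RMSD to that reference is within a cutoff, and is otherwise left unassigned (label -1). The function names `rmsd` and `assign_frames`, the use of one reference structure per cluster, the absence of structural superposition, and the toy coordinates are all assumptions made for illustration.

```python
import math


def rmsd(a, b):
    """RMSD between two conformations given as equal-length lists of (x, y, z).

    Assumption: the conformations are already aligned; no superposition is done.
    """
    if len(a) != len(b):
        raise ValueError("conformations must have the same number of atoms")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def assign_frames(frames, references, cutoff):
    """Label each frame with the index of its closest reference structure.

    A frame whose RMSD to the closest reference exceeds `cutoff` gets -1,
    i.e. it stays unassigned. `cutoff` is the tunable parameter.
    """
    labels = []
    for frame in frames:
        distances = [rmsd(frame, ref) for ref in references]
        best = min(range(len(distances)), key=distances.__getitem__)
        labels.append(best if distances[best] <= cutoff else -1)
    return labels


# Hypothetical toy data: two-atom "conformations" in arbitrary units.
ref_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref_b = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0)],  # close to ref_a
    [(0.0, 0.0, 0.0), (0.1, 1.0, 0.0)],  # close to ref_b
    [(0.0, 0.0, 0.0), (0.0, 0.0, 3.0)],  # far from both references
]

labels = assign_frames(frames, [ref_a, ref_b], cutoff=0.2)
print(labels)
```

In this sketch, a larger cutoff assigns more frames at the cost of looser structural identity within each cluster, while a smaller cutoff does the opposite; this is the trade-off the abstract describes between maximizing assigned frames and keeping clusters conformationally well defined.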
Cite
ISO 690
HUNKLER, Simon, Kay DIEDERICHS, Oleksandra KUKHARENKO, Christine PETER, 2023. Fast conformational clustering of extensive molecular dynamics simulation data. In: The Journal of Chemical Physics. AIP Publishing. 2023, 158(14), 144109. ISSN 0021-9606. eISSN 1089-7690. Available under: doi: 10.1063/5.0142797
BibTeX
@article{Hunkler2023-04-14confo-67466,
  year={2023},
  doi={10.1063/5.0142797},
  title={Fast conformational clustering of extensive molecular dynamics simulation data},
  number={14},
  volume={158},
  issn={0021-9606},
  journal={The Journal of Chemical Physics},
  author={Hunkler, Simon and Diederichs, Kay and Kukharenko, Oleksandra and Peter, Christine},
  note={This work was supported by the DFG through CRC 969 Article Number: 144109}
}
RDF
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/67466"> <foaf:homepage rdf:resource="http://localhost:8080/"/> <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/> <dc:creator>Hunkler, Simon</dc:creator> <dcterms:issued>2023-04-14</dcterms:issued> <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/67466/1/Hunkler_2-ekygjmxreoqq9.pdf"/> <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/67466"/> <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/29"/> <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-08-01T12:56:17Z</dcterms:available> <dc:contributor>Diederichs, Kay</dc:contributor> <dcterms:abstract>We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach, we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (hierarchical density-based spatial clustering of applications with noise). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here, the cc_analysis algorithm is applied for the first time to molecular simulation data. The encodermap algorithm complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. 
The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice, we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This allows us to find clusters of different densities and different degrees of structural identity. With the help of four protein systems, we illustrate the capability and performance of this clustering workflow: wild-type and thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these test systems poses their individual challenges to the scheme, which, in total, give a nice overview of the advantages and potential difficulties that can arise when using the proposed method.</dcterms:abstract> <dc:language>eng</dc:language> <dc:creator>Diederichs, Kay</dc:creator> <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/29"/> <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/67466/1/Hunkler_2-ekygjmxreoqq9.pdf"/> <dc:contributor>Kukharenko, Oleksandra</dc:contributor> <dc:creator>Kukharenko, Oleksandra</dc:creator> <dcterms:title>Fast conformational clustering of extensive molecular dynamics simulation data</dcterms:title> <dc:contributor>Hunkler, Simon</dc:contributor> <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-08-01T12:56:17Z</dc:date> <dc:creator>Peter, Christine</dc:creator> <dc:contributor>Peter, Christine</dc:contributor> </rdf:Description> </rdf:RDF>