Publikation:

Fast conformational clustering of extensive molecular dynamics simulation data

Lade...
Vorschaubild

Dateien

Hunkler_2-ekygjmxreoqq9.pdf
Hunkler_2-ekygjmxreoqq9.pdfGröße: 11.06 MBDownloads: 35

Datum

2023

Herausgeber:innen

Kontakt

ISSN der Zeitschrift

Electronic ISSN

ISBN

Bibliografische Daten

Verlag

Schriftenreihe

Auflagebezeichnung

DOI (zitierfähiger Link)
ArXiv-ID

Internationale Patentnummer

Link zur Lizenz
oops

Angaben zur Forschungsförderung

Projekt

Open Access-Veröffentlichung
Open Access Hybrid
Core Facility der Universität Konstanz

Gesperrt bis

Titel in einer weiteren Sprache

Publikationstyp
Zeitschriftenartikel
Publikationsstatus
Published

Erschienen in

The Journal of Chemical Physics. AIP Publishing. 2023, 158(14), 144109. ISSN 0021-9606. eISSN 1089-7690. Available under: doi: 10.1063/5.0142797

Zusammenfassung

We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach, we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (hierarchical density-based spatial clustering of applications with noise). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here, the cc_analysis algorithm is applied for the first time to molecular simulation data. The encodermap algorithm complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice, we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This allows us to find clusters of different densities and different degrees of structural identity. With the help of four protein systems, we illustrate the capability and performance of this clustering workflow: wild-type and thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these test systems poses their individual challenges to the scheme, which, in total, give a nice overview of the advantages and potential difficulties that can arise when using the proposed method.

Zusammenfassung in einer weiteren Sprache

Fachgebiet (DDC)
530 Physik

Schlagwörter

Konferenz

Rezension
undefined / . - undefined, undefined

Forschungsvorhaben

Organisationseinheiten

Zeitschriftenheft

Zugehörige Datensätze in KOPS

Zitieren

ISO 690HUNKLER, Simon, Kay DIEDERICHS, Oleksandra KUKHARENKO, Christine PETER, 2023. Fast conformational clustering of extensive molecular dynamics simulation data. In: The Journal of Chemical Physics. AIP Publishing. 2023, 158(14), 144109. ISSN 0021-9606. eISSN 1089-7690. Available under: doi: 10.1063/5.0142797
BibTex
@article{Hunkler2023-04-14confo-67466,
  year={2023},
  doi={10.1063/5.0142797},
  title={Fast conformational clustering of extensive molecular dynamics simulation data},
  number={14},
  volume={158},
  issn={0021-9606},
  journal={The Journal of Chemical Physics},
  author={Hunkler, Simon and Diederichs, Kay and Kukharenko, Oleksandra and Peter, Christine},
  note={This work was supported by the DFG through CRC 969 Article Number: 144109}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/67466">
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:creator>Hunkler, Simon</dc:creator>
    <dcterms:issued>2023-04-14</dcterms:issued>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/67466/1/Hunkler_2-ekygjmxreoqq9.pdf"/>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/67466"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/29"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-08-01T12:56:17Z</dcterms:available>
    <dc:contributor>Diederichs, Kay</dc:contributor>
    <dcterms:abstract>We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach, we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (hierarchical density-based spatial clustering of applications with noise). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here, the cc_analysis algorithm is applied for the first time to molecular simulation data. The encodermap algorithm complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice, we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This allows us to find clusters of different densities and different degrees of structural identity. With the help of four protein systems, we illustrate the capability and performance of this clustering workflow: wild-type and thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these test systems poses their individual challenges to the scheme, which, in total, give a nice overview of the advantages and potential difficulties that can arise when using the proposed method.</dcterms:abstract>
    <dc:language>eng</dc:language>
    <dc:creator>Diederichs, Kay</dc:creator>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/29"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/67466/1/Hunkler_2-ekygjmxreoqq9.pdf"/>
    <dc:contributor>Kukharenko, Oleksandra</dc:contributor>
    <dc:creator>Kukharenko, Oleksandra</dc:creator>
    <dcterms:title>Fast conformational clustering of extensive molecular dynamics simulation data</dcterms:title>
    <dc:contributor>Hunkler, Simon</dc:contributor>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-08-01T12:56:17Z</dc:date>
    <dc:creator>Peter, Christine</dc:creator>
    <dc:contributor>Peter, Christine</dc:contributor>
  </rdf:Description>
</rdf:RDF>

Interner Vermerk

xmlui.Submission.submit.DescribeStep.inputForms.label.kops_note_fromSubmitter

Kontakt
URL der Originalveröffentl.

Prüfdatum der URL

Prüfungsdatum der Dissertation

Finanzierungsart

Kommentar zur Publikation

This work was supported by the DFG through CRC 969
Allianzlizenz
Corresponding Authors der Uni Konstanz vorhanden
Internationale Co-Autor:innen
Universitätsbibliographie
Ja
Begutachtet
Ja
Diese Publikation teilen