Fast conformational clustering of extensive molecular dynamics simulation data

Files
Hunkler_2-ekygjmxreoqq9.pdf (Size: 11.06 MB, Downloads: 3)
Date
2023
Publication type
Journal article
Publication status
Published
Published in
The Journal of Chemical Physics. AIP Publishing. 2023, 158(14), 144109. ISSN 0021-9606. eISSN 1089-7690. Available under: doi: 10.1063/5.0142797
Abstract

We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach, we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (hierarchical density-based spatial clustering of applications with noise). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here, the cc_analysis algorithm is applied for the first time to molecular simulation data. The encodermap algorithm complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice, we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This allows us to find clusters of different densities and different degrees of structural identity. With the help of four protein systems, we illustrate the capability and performance of this clustering workflow: wild-type and thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these test systems poses their individual challenges to the scheme, which, in total, give a nice overview of the advantages and potential difficulties that can arise when using the proposed method.
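
The abstract mentions a tunable root-mean-square-deviation-based criterion in the final cluster assignment. The following is a minimal Python sketch of how such a cutoff could work in principle. It is not the authors' implementation and does not use cc_analysis, encodermap, or HDBSCAN. The function names (`rmsd`, `assign_frames`, `assigned_fraction`), the use of one reference structure per cluster, the label `-1` for unassigned frames, and the assumption that all structures are already superimposed are illustrative choices that do not come from the record.

```python
import math


def rmsd(a, b):
    """RMSD between two conformations given as lists of (x, y, z) tuples.

    Assumption: both structures are already superimposed; no alignment
    is performed here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("conformations must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(a, b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(a))


def assign_frames(frames, references, cutoff):
    """Assign each frame to its closest reference structure.

    A frame is only assigned if its RMSD to the closest reference is at
    most `cutoff`; otherwise it is labeled -1 (unassigned). `references`
    holds one hypothetical representative structure per cluster.
    """
    if not references:
        raise ValueError("at least one reference structure is required")
    labels = []
    for frame in frames:
        distances = [rmsd(frame, ref) for ref in references]
        best = min(range(len(references)), key=distances.__getitem__)
        labels.append(best if distances[best] <= cutoff else -1)
    return labels


def assigned_fraction(labels):
    """Fraction of frames that received a cluster label."""
    if not labels:
        return 0.0
    return sum(label != -1 for label in labels) / len(labels)
```

In this sketch, a larger `cutoff` assigns more frames at the cost of looser structural identity within each cluster, while a smaller `cutoff` does the opposite. This is the trade-off the abstract describes between maximizing the number of assigned frames and keeping a clear conformational identity.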

Subject (DDC)
530 Physics
Cite
ISO 690
HUNKLER, Simon, Kay DIEDERICHS, Oleksandra KUKHARENKO, Christine PETER, 2023. Fast conformational clustering of extensive molecular dynamics simulation data. In: The Journal of Chemical Physics. AIP Publishing. 2023, 158(14), 144109. ISSN 0021-9606. eISSN 1089-7690. Available under: doi: 10.1063/5.0142797
BibTeX
@article{Hunkler2023-04-14confo-67466,
  year={2023},
  doi={10.1063/5.0142797},
  title={Fast conformational clustering of extensive molecular dynamics simulation data},
  number={14},
  volume={158},
  issn={0021-9606},
  journal={The Journal of Chemical Physics},
  author={Hunkler, Simon and Diederichs, Kay and Kukharenko, Oleksandra and Peter, Christine},
  note={This work was supported by the DFG through CRC 969. Article Number: 144109}
}
RDF
<rdf:RDF
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bibo="http://purl.org/ontology/bibo/"
    xmlns:dspace="http://digital-repositories.org/ontologies/dspace/0.1.0#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:void="http://rdfs.org/ns/void#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#" > 
  <rdf:Description rdf:about="https://kops.uni-konstanz.de/server/rdf/resource/123456789/67466">
    <foaf:homepage rdf:resource="http://localhost:8080/"/>
    <void:sparqlEndpoint rdf:resource="http://localhost/fuseki/dspace/sparql"/>
    <dc:creator>Hunkler, Simon</dc:creator>
    <dcterms:issued>2023-04-14</dcterms:issued>
    <dcterms:hasPart rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/67466/1/Hunkler_2-ekygjmxreoqq9.pdf"/>
    <bibo:uri rdf:resource="https://kops.uni-konstanz.de/handle/123456789/67466"/>
    <dspace:isPartOfCollection rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/29"/>
    <dcterms:available rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-08-01T12:56:17Z</dcterms:available>
    <dc:contributor>Diederichs, Kay</dc:contributor>
    <dcterms:abstract>We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach, we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (hierarchical density-based spatial clustering of applications with noise). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here, the cc_analysis algorithm is applied for the first time to molecular simulation data. The encodermap algorithm complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice, we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This allows us to find clusters of different densities and different degrees of structural identity. With the help of four protein systems, we illustrate the capability and performance of this clustering workflow: wild-type and thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these test systems poses their individual challenges to the scheme, which, in total, give a nice overview of the advantages and potential difficulties that can arise when using the proposed method.</dcterms:abstract>
    <dc:language>eng</dc:language>
    <dc:creator>Diederichs, Kay</dc:creator>
    <dcterms:isPartOf rdf:resource="https://kops.uni-konstanz.de/server/rdf/resource/123456789/29"/>
    <dspace:hasBitstream rdf:resource="https://kops.uni-konstanz.de/bitstream/123456789/67466/1/Hunkler_2-ekygjmxreoqq9.pdf"/>
    <dc:contributor>Kukharenko, Oleksandra</dc:contributor>
    <dc:creator>Kukharenko, Oleksandra</dc:creator>
    <dcterms:title>Fast conformational clustering of extensive molecular dynamics simulation data</dcterms:title>
    <dc:contributor>Hunkler, Simon</dc:contributor>
    <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2023-08-01T12:56:17Z</dc:date>
    <dc:creator>Peter, Christine</dc:creator>
    <dc:contributor>Peter, Christine</dc:contributor>
  </rdf:Description>
</rdf:RDF>
Comment on the publication
This work was supported by the DFG through CRC 969
Peer-reviewed
Yes