Fast conformational clustering of extensive molecular dynamics simulation data
2023-04-14, Hunkler, Simon, Diederichs, Kay, Kukharenko, Oleksandra, Peter, Christine
We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long molecular dynamics simulation trajectories. In this approach, we combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (hierarchical density-based spatial clustering of applications with noise). The proposed scheme benefits from the strengths of the three algorithms while avoiding most of the drawbacks of the individual methods. Here, the cc_analysis algorithm is applied for the first time to molecular simulation data. The encodermap algorithm complements cc_analysis by providing an efficient way to process and assign large amounts of data to clusters. The main goal of the procedure is to maximize the number of assigned frames of a given trajectory while keeping a clear conformational identity of the clusters that are found. In practice, we achieve this by using an iterative clustering approach and a tunable root-mean-square-deviation-based criterion in the final cluster assignment. This allows us to find clusters of different densities and different degrees of structural identity. Using four protein systems, we illustrate the capability and performance of this clustering workflow: the wild type and a thermostable mutant of the Trp-cage protein (TC5b and TC10b), NTL9, and Protein B. Each of these test systems poses its own challenges to the scheme; taken together, they give an overview of the advantages and potential difficulties that can arise when using the proposed method.
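The RMSD-based criterion in the final cluster assignment can be illustrated with a minimal sketch. It assumes each cluster is represented by a single reference structure and that a frame is assigned to the nearest reference only if its RMSD after optimal superposition is below a tunable cutoff. The function names `kabsch_rmsd` and `assign_frames`, the cutoff value, and the synthetic coordinates are hypothetical and not taken from the paper; the dimensionality reduction and HDBSCAN steps are not shown.

```python
import numpy as np

def kabsch_rmsd(a, b):
    """RMSD between two (n_atoms, 3) coordinate sets after optimal superposition."""
    a = a - a.mean(axis=0)  # remove translation
    b = b - b.mean(axis=0)
    # The singular values of the covariance matrix give the best achievable
    # overlap under rotation (Kabsch algorithm).
    u, s, vt = np.linalg.svd(a.T @ b)
    # If the best fit would be a reflection, flip the smallest singular value
    # so that only proper rotations are considered.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    msd = (np.sum(a * a) + np.sum(b * b) - 2.0 * s.sum()) / len(a)
    return float(np.sqrt(max(msd, 0.0)))  # guard against tiny negative round-off

def assign_frames(frames, references, cutoff):
    """Assign each frame to the nearest reference, or -1 if none is within cutoff."""
    labels = np.full(len(frames), -1, dtype=int)
    for i, frame in enumerate(frames):
        rmsds = [kabsch_rmsd(frame, ref) for ref in references]
        best = int(np.argmin(rmsds))
        if rmsds[best] <= cutoff:
            labels[i] = best
    return labels

# Synthetic demo data (assumption: 10 "atoms", arbitrary length units).
rng = np.random.default_rng(0)
references = [rng.normal(size=(10, 3)), rng.normal(size=(10, 3))]
theta = np.deg2rad(40.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
frames = [
    references[0] @ rot_z.T + 5.0,                      # rotated, shifted copy of reference 0
    references[1] + 0.01 * rng.normal(size=(10, 3)),    # slightly perturbed copy of reference 1
    10.0 * rng.normal(size=(10, 3)),                    # unrelated structure, left unassigned
]
labels = assign_frames(frames, references, cutoff=0.1)
```

Raising the cutoff in such a scheme assigns more frames at the price of less structurally homogeneous clusters, which is the trade-off the abstract describes.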
Back-mapping based sampling: Coarse grained free energy landscapes as a guideline for atomistic exploration
2019-10-21, Hunkler, Simon, Lemke, Tobias, Peter, Christine, Kukharenko, Oleksandra
One ongoing topic of research in MD simulations is how to extend sampling to chemically and biologically relevant time scales. We address this question by introducing a back-mapping based sampling (BMBS) approach that combines multiple aspects of different sampling techniques. BMBS uses coarse grained (CG) free energy surfaces (FESs) and dimensionality reduction to initiate new atomistic simulations. These new simulations are started from atomistic conformations that were back-mapped from CG points all over the FES in order to sample the entire accessible phase space as fast as possible. In the context of BMBS, we address relevant back-mapping related questions such as where to start the back-mapping and how to judge the atomistic ensemble that results from the BMBS. The latter is done with the use of the earth mover’s distance, which allows us to quantitatively compare distributions of CG and atomistic ensembles. By using this metric, we can also show that the BMBS is able to correct inaccuracies of the CG model. In this paper, BMBS is applied to a recently introduced neural network (NN) based approach for a radical coarse graining to predict free energy surfaces for oligopeptides. The BMBS scheme back-maps these FESs to the atomistic scale, justifying and complementing the proposed NN based CG approach. The efficiency benefit of the algorithm scales with the length of the oligomer. Even for the heptamers, the algorithm samples about one order of magnitude faster than a standard MD simulation.
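To make the role of the earth mover's distance concrete, the following is a minimal one-dimensional sketch: for two histograms on the same equally spaced bins, the distance equals the bin width times the summed absolute difference of the two cumulative distributions. This is a simplification chosen for illustration; comparing two-dimensional projections generally requires solving a transport problem. The function name `emd_1d` and the example histograms are hypothetical.

```python
from itertools import accumulate

def emd_1d(p, q, bin_width=1.0):
    """Earth mover's distance between two histograms on identical, equally spaced bins."""
    if len(p) != len(q):
        raise ValueError("histograms must share the same bins")
    total_p, total_q = sum(p), sum(q)
    # Normalize so that both histograms carry the same total mass.
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # In 1D the optimal transport cost is the area between the two CDFs.
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return bin_width * sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))

# Hypothetical populations of a CG and an atomistic ensemble along one coordinate.
cg_hist = [0.5, 0.5, 0.0, 0.0]
aa_hist = [0.0, 0.5, 0.5, 0.0]
shift = emd_1d(cg_hist, aa_hist)   # all mass moved by one bin
same = emd_1d(cg_hist, cg_hist)    # identical distributions
```

Unlike a bin-by-bin overlap measure, this distance grows with how far population has to be moved, so a small displacement of a basin is penalized less than population appearing in a distant region of the landscape.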
Generating a conformational landscape of ubiquitin chains at atomistic resolution by back-mapping based sampling
2023-01-10, Hunkler, Simon, Buhl, Teresa, Kukharenko, Oleksandra, Peter, Christine
Ubiquitin chains are flexible multidomain proteins that have important biological functions in cellular signalling. Computational studies with all-atom molecular dynamics simulations of the conformational spaces of polyubiquitins can be challenging due to the system size and a multitude of long-lived meta-stable states. Coarse graining is an efficient approach to overcome this problem, at the cost of losing high-resolution details. Recently, we proposed the back-mapping based sampling (BMBS) approach, which reintroduces atomistic information into a given coarse grained (CG) sampling based on a two-dimensional (2D) projection of the conformational landscape, produces an atomistic ensemble, and allows a systematic comparison of the ensembles at the two levels of resolution. Here, we apply BMBS to K48-linked tri-ubiquitin, showing its applicability to larger systems than those for which it was originally introduced and demonstrating that the algorithm scales very well with system size. In an extension of the original BMBS, we test three different seeding strategies, i.e., different choices of where in the CG landscape atomistic trajectories are initiated. Furthermore, we apply a recently introduced conformational clustering algorithm to the back-mapped atomistic ensemble. Thus, we obtain insight into the structural composition of the 2D landscape and illustrate that the dimensionality reduction algorithm separates different conformational characteristics very well into different regions of the map. This cluster analysis allows us to show how atomistic trajectories sample conformational states, move through the projection space, and collectively converge to an atomistic conformational landscape that slightly differs from the original CG map, indicating a correction of flaws in the CG template.
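As an illustration of what a seeding strategy on a 2D projection might look like, the sketch below picks one CG frame from every occupied cell of a regular grid, so that starting points are spread over the whole sampled map. This is a hypothetical example and is not claimed to be one of the three strategies tested in the paper; the function name `grid_seeds`, the grid resolution, and the sample points are assumptions.

```python
import numpy as np

def grid_seeds(points, n_bins=10):
    """Return the index of the first frame found in each occupied 2D grid cell."""
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for a degenerate axis
    # Integer cell coordinates; clip so the maximum lands in the last bin.
    cells = np.minimum((n_bins * (points - lo) / span).astype(int), n_bins - 1)
    cell_ids = cells[:, 0] * n_bins + cells[:, 1]
    # np.unique with return_index gives the first occurrence of each cell id.
    _, first = np.unique(cell_ids, return_index=True)
    return np.sort(first)

# Hypothetical 2D projection of five CG frames.
projection = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 0.8], [0.9, 0.1]]
seeds = grid_seeds(projection, n_bins=2)
```

The selected indices would then identify the CG frames to back-map and use as starting structures for atomistic runs. Weighting cells by population or by estimated free energy would be alternative choices.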
Towards a molecular basis of ubiquitin signaling: a dual-scale simulation study of ubiquitin dimers
2018-11, Berg, Andrej, Kukharenko, Oleksandra, Scheffner, Martin, Peter, Christine
Covalent modification of proteins by ubiquitin or ubiquitin chains is one of the most prevalent post-translational modifications in eukaryotes. Different types of ubiquitin chains are assumed to selectively target the correspondingly modified proteins for different fates. In support of this hypothesis, structural studies have shown that the eight possible ubiquitin dimers adopt different conformations. However, at least in some cases, these structures cannot sufficiently explain the molecular basis of the selective signaling mechanisms. This indicates that the available structures represent only a few distinct conformations within the entire conformational space adopted by a ubiquitin dimer. Here, molecular simulations on different levels of resolution can complement the structural information. We have combined exhaustive coarse grained and atomistic simulations of all eight possible ubiquitin dimers with a suitable dimensionality reduction technique and a new method to characterize protein-protein interfaces and the conformational landscape of protein conjugates. We found that ubiquitin dimers exhibit characteristic linkage type-dependent properties in solution, such as interface stability and the character of contacts between the subunits, which can be directly correlated with experimentally observed linkage-specific properties.
Post-translational modification of proteins by covalent attachment of ubiquitin is a key cellular process, regulating, for example, the fate and recycling of proteins. We present a new method to combine multiscale simulation with advanced analysis methods to characterize the states of ubiquitin-ubiquitin conjugates. We found that the linkage position affects the conformational space of ubiquitin dimers, determining the number and stability of relevant states, the character of subunit contacts, and the nature of the surface exposed to possible binding partners.
Guanidine-II aptamer conformations and ligand binding modes through the lens of molecular simulation
2021-07-07, Steuer, Jakob, Kukharenko, Oleksandra, Riedmiller, Kai, Hartig, Jörg S., Peter, Christine
Regulation of gene expression via riboswitches is a widespread mechanism in bacteria. Here, we investigate ligand binding of a member of the guanidine sensing riboswitch family, the guanidine-II riboswitch (Gd-II). It consists of two stem-loops forming a dimer upon ligand binding. Using extensive molecular dynamics simulations, we have identified conformational states corresponding to ligand-bound and unbound states in a monomeric stem-loop of Gd-II and studied the selectivity of this binding. To characterize these states and ligand-dependent conformational changes, we applied a combination of dimensionality reduction, clustering, and feature selection methods. In the absence of a ligand, the shape of the binding pocket alternates between the conformation observed in the presence of guanidinium and a collapsed conformation, which is associated with a deformation of the dimerization interface. Furthermore, the structural features responsible for the ability to discriminate against closely related analogs of guanidine are resolved. Based on these insights, we propose a mechanism that couples ligand binding to aptamer dimerization in the Gd-II system, demonstrating the value of computational methods in the field of nucleic acids research.