Markov state models from hierarchical density-based assignment

Markov state models (MSMs) have become one of the preferred methods for the analysis and interpretation of molecular dynamics (MD) simulations of conformational transitions in biopolymers. While there is great variation in terms of implementation, a well-defined workflow involving multiple steps is often adopted. Typically, molecular coordinates are first subjected to dimensionality reduction and then clustered into small “microstates”, which are subsequently lumped into “macrostates” using the information from the slowest eigenmodes. However, the microstate dynamics is often non-Markovian and long lag times are required to converge the MSM. Here we propose a variation on this typical workflow, taking advantage of hierarchical density-based clustering. When applied to simulation data, this type of clustering separates high population regions of conformational space from others that are rarely visited. In this way, density-based clustering naturally implements assignment of the data based on transitions between metastable states. As a result, the state definition becomes more consistent with the assumption of Markovianity and the timescales of the slow dynamics of the system are recovered more effectively. We present results of this simplified workflow for a model potential and MD simulations of the alanine dipeptide and the FiP35 WW domain.
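
The role of the lag time in the workflow above can be illustrated with a minimal sketch of the estimation step that follows state assignment. This is not the hierarchical density-based method of the abstract: it assumes the trajectory has already been assigned to discrete states, and the function names `estimate_msm` and `implied_timescales` are hypothetical. The count matrix is symmetrized as a naive way to obtain real eigenvalues, which is an assumption of this sketch rather than a proper reversible estimator.

```python
import numpy as np

def estimate_msm(dtraj, n_states, lag):
    """Row-normalized transition matrix from sliding-window counts at a given lag."""
    dtraj = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    # Count every observed pair (state at t, state at t + lag).
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    # Naive symmetrization (assumption of this sketch, not a maximum-likelihood estimator).
    counts = 0.5 * (counts + counts.T)
    return counts / counts.sum(axis=1, keepdims=True)

def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln|lambda_i| for the non-stationary eigenvalues."""
    eigvals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(eigvals[1:])

# Toy two-state chain with a known transition matrix (illustrative data only).
rng = np.random.default_rng(0)
P = np.array([[0.95, 0.05],
              [0.05, 0.95]])
n_steps = 20000
u = rng.random(n_steps)
dtraj = np.empty(n_steps, dtype=int)
state = 0
for t in range(n_steps):
    dtraj[t] = state
    # Move to (or stay in) state 1 with probability P[state, 1].
    state = int(u[t] < P[state, 1])

T_est = estimate_msm(dtraj, n_states=2, lag=1)
its = implied_timescales(T_est, lag=1)
# The exact chain has second eigenvalue 0.9, i.e. a timescale of about 9.5 steps.
```

In practice the implied timescales are computed for a range of lag times, and a plateau is taken as a sign that the state definition is consistent with Markovian dynamics at that lag.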

Thermodynamics and kinetics of the amyloid-β peptide revealed by Markov state models based on MD data in agreement with experiment

Chemical Science ◽

10.1039/d0sc04657d ◽

2021 ◽

Author(s):

Arghadwip Paul ◽

Suman Samantray ◽

Marco Anteghini ◽

Mohammed Khaled ◽

Birgit Strodel

Keyword(s):

Amyloid β ◽

MD Simulations ◽

Amyloid β Peptide ◽

Intrinsically Disordered ◽

Thermodynamics And Kinetics ◽

Markov State ◽

Markov State Models ◽

State Models ◽

Kinetics Of

The convergence of MD simulations is tested using varying measures for the intrinsically disordered amyloid-β peptide (Aβ). Markov state models show that 20–30 μs of MD is needed to reliably reproduce the thermodynamics and kinetics of Aβ.

Markov models for the elucidation of allosteric regulation

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2017.0178 ◽

2018 ◽

Vol 373 (1749) ◽

pp. 20170178 ◽

Cited By ~ 11

Author(s):

Ushnish Sengupta ◽

Birgit Strodel

Keyword(s):

Markov Models ◽

Allosteric Regulation ◽

MD Simulations ◽

Markov Modelling ◽

Conformational Ensembles ◽

Markov State ◽

Protein Allostery ◽

Markov State Models ◽

State Models ◽

Allosteric Mechanisms

Allosteric regulation refers to the process where the effect of binding of a ligand at one site of a protein is transmitted to another, often distant, functional site. In recent years, it has been demonstrated that allosteric mechanisms can be understood by the conformational ensembles of a protein. Molecular dynamics (MD) simulations are often used for the study of protein allostery as they provide an atomistic view of the dynamics of a protein. However, given the wealth of detailed information hidden in MD data, one has to apply a method that allows extraction of the conformational ensembles underlying allosteric regulation from these data. Markov state models are one of the most promising methods for this purpose. We provide a short introduction to the theory of Markov state models and review their application to various examples of protein allostery studied by MD simulations. We also include a discussion of studies where Markov modelling has been employed to analyse experimental data on allosteric regulation. We conclude our review by advertising the wider application of Markov state models to elucidate allosteric mechanisms, especially since in recent years it has become straightforward to construct such models thanks to software programs like PyEMMA and MSMBuilder. This article is part of a discussion meeting issue ‘Allostery and molecular machines’.

Molecular dynamics simulations of protein aggregation: protocols for simulation setup and analysis with Markov state models and transition networks

10.1101/2020.04.25.060269 ◽

2020 ◽

Cited By ~ 3

Author(s):

Suman Samantray ◽

Wibke Schumann ◽

Alexander-Maurice Illig ◽

Martin Carballo-Pacheco ◽

Arghadwip Paul ◽

...

Keyword(s):

Molecular Dynamics ◽

Protein Aggregation ◽

Amyloid Fibrils ◽

MD Simulations ◽

Aggregation Process ◽

Markov State ◽

Markov State Models ◽

State Models ◽

Dynamics Simulations ◽

Transition Networks

Protein disorder and aggregation play significant roles in the pathogenesis of numerous neurodegenerative diseases, such as Alzheimer’s and Parkinson’s disease. The end products of the aggregation process in these diseases are β-sheet rich amyloid fibrils. In most cases, however, the small, soluble oligomers formed during amyloid aggregation are the toxic species. A full understanding of the physicochemical forces behind the protein aggregation process is required if one aims to reveal the molecular basis of the various amyloid diseases. Among the multitude of biophysical and biochemical techniques employed for studying protein aggregation, molecular dynamics (MD) simulations at the atomic level provide the highest temporal and spatial resolution of this process, capturing key steps during the formation of amyloid oligomers. Here we provide a step-by-step guide for setting up, running, and analyzing MD simulations of aggregating peptides using GROMACS. For the analysis we provide the scripts that were developed in our lab, which allow one to determine the oligomer size and the inter-peptide contacts that drive the aggregation process. Moreover, we explain and provide the tools to derive Markov state models and transition networks from MD data of peptide aggregation.
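
One way to picture the oligomer-size analysis mentioned above is as a connected-components search over a peptide contact graph. The sketch below is an illustration only and is not the authors' scripts: it assumes a boolean peptide-by-peptide contact matrix has already been computed for a single frame (for example with a distance cutoff, which is not specified here), and the function name `oligomer_sizes` is hypothetical.

```python
import numpy as np

def oligomer_sizes(contacts):
    """Sizes of connected components in a symmetric boolean contact matrix."""
    contacts = np.asarray(contacts, dtype=bool)
    n = contacts.shape[0]
    seen = np.zeros(n, dtype=bool)
    sizes = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search over peptides in contact with the current oligomer.
        stack, size = [start], 0
        seen[start] = True
        while stack:
            i = stack.pop()
            size += 1
            for j in np.flatnonzero(contacts[i] & ~seen):
                seen[j] = True
                stack.append(j)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Illustrative frame: peptides 0-1 and 1-2 are in contact, peptide 3 is free.
frame_contacts = np.zeros((4, 4), dtype=bool)
for a, b in [(0, 1), (1, 2)]:
    frame_contacts[a, b] = frame_contacts[b, a] = True

sizes = oligomer_sizes(frame_contacts)  # one trimer and one monomer
```

Applied frame by frame, such sizes could serve as one possible state descriptor for a transition network, although the descriptors used in the protocol itself are not given in the abstract.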

Thermodynamics and kinetics of the amyloid-β peptide revealed by Markov state models based on MD data in agreement with experiment

10.1101/2020.07.27.223487 ◽

2020 ◽

Cited By ~ 3

Author(s):

Arghadwip Paul ◽

Suman Samantray ◽

Marco Anteghini ◽

Birgit Strodel

Keyword(s):

Alzheimer’s Disease ◽

Force Fields ◽

Amyloid β ◽

MD Simulations ◽

Amyloid β Peptide ◽

Thermodynamics And Kinetics ◽

Markov State ◽

Markov State Models ◽

State Models ◽

Kinetics Of

The amyloid-β peptide (Aβ) is closely linked to the development of Alzheimer’s disease. Molecular dynamics (MD) simulations have become an indispensable tool for studying the behavior of this peptide at the (sub)molecular level, thereby providing insight into the molecular basis of Alzheimer’s disease. General key aspects of MD simulations are the force field used for modeling the peptide or protein and its environment, which is important for accurate modeling of the system of interest, and the length of the simulations, which determines whether or not equilibrium is reached. In this study we address these points by analyzing 30-µs MD simulations acquired for Aβ40 using seven different force fields. We assess the convergence of these simulations based on the convergence of various structural properties and of NMR and fluorescence spectroscopic observables. Moreover, we calculate Markov state models for each of the seven MD simulations, which provide an unprecedented view of the thermodynamics and kinetics of the amyloid-β peptide. This further allows us to provide answers to pertinent questions such as: Which force fields are suitable for modeling Aβ? (a99SB-UCB and a99SB-ILDN/TIP4P-D); What does the Aβ peptide really look like? (mostly extended and disordered); and How long does it take MD simulations of Aβ to attain equilibrium? (20–30 µs). We believe the analyses presented in this study will provide a useful reference guide for important questions relating to the structure and dynamics of Aβ in particular, and by extension other similar disordered peptides.

Laplacian score and genetic algorithm based automatic feature selection for Markov State Models in adaptive sampling based molecular dynamics

PeerJ Physical Chemistry ◽

10.7717/peerj-pchem.9 ◽

2020 ◽

Vol 2 ◽

pp. e9

Author(s):

Anu George ◽

Madhura Purnaprajna ◽

Prashanth Athri

Keyword(s):

Molecular Dynamics ◽

Adaptive Sampling ◽

Simulated Data ◽

MD Simulations ◽

Considerable Effect ◽

Feature Subset ◽

Markov State ◽

Markov State Models ◽

State Models ◽

Laplacian Score

Adaptive sampling molecular dynamics based on Markov state models (MSMs) uses short parallel MD simulations to accelerate sampling and has been shown to identify hidden conformers. The accuracy of its predictions depends on the features extracted from the simulated data that are used to construct the MSM. The identification of the most important features in the trajectories of the simulated system has a considerable effect on the results. Methods: In this study, we use a combination of Laplacian scoring and genetic algorithms to obtain an optimized feature subset for the construction of the MSM. The approach is validated on simulations of three protein folding complexes and two protein-ligand binding complexes. Results: Our experiments show that this approach produces better results when the number of samples is significantly smaller than the number of features extracted. We also observed that this method mitigates the overfitting that occurs due to the high dimensionality of large biosystems with shorter simulation times.
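
The Laplacian scoring step can be sketched as follows, assuming the commonly used formulation in which features that vary smoothly over a nearest-neighbor graph of the samples receive low scores. The graph construction (symmetrized k-nearest neighbors with heat-kernel weights), the parameters `n_neighbors` and `t`, and the function name `laplacian_scores` are assumptions of this sketch; the abstract does not specify these choices, and the genetic algorithm stage is not shown.

```python
import numpy as np

def laplacian_scores(X, n_neighbors=5, t=1.0):
    """Laplacian score per feature (lower = better preserves local structure)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # k nearest neighbors of each sample, skipping the sample itself.
    nn = np.argsort(sq, axis=1)[:, 1:n_neighbors + 1]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), n_neighbors), nn.ravel()] = True
    A = A | A.T  # symmetrize the neighbor graph
    S = np.where(A, np.exp(-sq / t), 0.0)  # heat-kernel edge weights
    d = S.sum(axis=1)                      # weighted degrees
    L = np.diag(d) - S                     # graph Laplacian
    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r]
        f = f - (f @ d) / d.sum()          # remove the degree-weighted mean
        scores[r] = (f @ L @ f) / (f @ (d * f))
    return scores

# Illustrative data: feature 0 separates two clusters, feature 1 is pure noise.
rng = np.random.default_rng(1)
labels = np.repeat([0.0, 5.0], 50)
X_demo = np.column_stack([
    labels + 0.1 * rng.standard_normal(100),
    0.1 * rng.standard_normal(100),
])
demo_scores = laplacian_scores(X_demo)
```

In this toy case the cluster-separating feature receives the lower score, which is the property a feature-selection stage would rank on.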

What Markov state models can and cannot do: Correlation versus path-based observables in protein folding models

10.1101/2020.11.09.374496 ◽

2020 ◽

Author(s):

Ernesto Suárez ◽

Rafal P. Wiewiora ◽

Chris Wehmeyer ◽

Frank Noé ◽

John D. Chodera ◽

...

Keyword(s):

Protein Folding ◽

Correlation Functions ◽

Conformational Dynamics ◽

Coarse Graining ◽

MD Simulations ◽

Lag Time ◽

Time Correlation ◽

Markov State ◽

Markov State Models ◽

State Models

Markov state models (MSMs) have been widely applied to study the kinetics and pathways of protein conformational dynamics based on statistical analysis of molecular dynamics (MD) simulations. These MSMs coarse-grain both configuration space and time in ways that limit what kinds of observables they can reproduce with high fidelity over different spatial and temporal resolutions. Despite their popularity, there is still limited understanding of which biophysical observables can be computed from these MSMs in a robust and unbiased manner, and which suffer from the space-time coarse-graining intrinsic in the MSM model. Most theoretical arguments and practical validity tests for MSMs rely on long-time equilibrium kinetics, such as the slowest relaxation timescales and experimentally observable time-correlation functions. Here, we perform an extensive assessment of the ability of well-validated protein folding MSMs to accurately reproduce path-based observables such as mean first-passage times (MFPTs) and transition path mechanisms compared to a direct trajectory analysis. We also assess a recently proposed class of history-augmented MSMs (haMSMs) that exploit additional information not accounted for in standard MSMs. We conclude with some practical guidance on the use of MSMs to study various problems in conformational dynamics of biomolecules. In brief, MSMs can accurately reproduce correlation functions slower than the lag time, but path-based observables can only be reliably reproduced if the lifetimes of states exceed the lag time, which is a much stricter requirement. Even in the presence of short-lived states, we find that haMSMs reproduce path-based observables more reliably.
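
For reference, the MFPT of a standard MSM can be obtained from its transition matrix by solving a small linear system: the MFPT is zero on the target states and satisfies m_i = τ + Σ_j T_ij m_j elsewhere, where τ is the lag time. The sketch below illustrates that textbook relation only; the function name `mfpt_to_target` and the toy matrix are hypothetical, and this is exactly the kind of MSM-derived quantity that the abstract cautions may be unreliable when state lifetimes are shorter than the lag time.

```python
import numpy as np

def mfpt_to_target(T, target, lag=1.0):
    """Mean first-passage times from every state to a set of target states."""
    T = np.asarray(T, dtype=float)
    n = T.shape[0]
    is_target = np.zeros(n, dtype=bool)
    is_target[list(target)] = True
    rest = np.flatnonzero(~is_target)
    # Solve (I - T_rest) m_rest = lag * 1 on the non-target states.
    A = np.eye(rest.size) - T[np.ix_(rest, rest)]
    m = np.zeros(n)
    m[rest] = np.linalg.solve(A, np.full(rest.size, lag))
    return m

# Toy three-state chain (illustrative only): 0 <-> 1 <-> 2, target is state 2.
T_toy = np.array([[0.9, 0.1, 0.0],
                  [0.1, 0.8, 0.1],
                  [0.0, 0.1, 0.9]])
mfpt = mfpt_to_target(T_toy, target=[2], lag=1.0)
# Solving by hand gives 30 lag steps from state 0 and 20 from state 1.
```

Multiplying the result by the physical lag time converts it to simulation time units.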
