Machine Learning Assisted Free Energy Simulation of Solution–Phase and Enzyme Reactions

Author(s):  
Xiaoliang Pan ◽  
Junjie Yang ◽  
Richard Van ◽  
Evgeny Epifanovsky ◽  
Junming Ho ◽  
...  

Despite recent advances in the development of machine learning potentials (MLPs) for biomolecular simulations, there has been limited effort in developing stable and accurate MLPs for enzymatic reactions. Here, we report a protocol for performing machine learning assisted free energy simulation of solution-phase and enzyme reactions at an ab initio quantum mechanical and molecular mechanical (ai-QM/MM) level of accuracy. Within our protocol, the MLP is built to reproduce the ai-QM/MM energy as well as forces on both QM (reactive) and MM (solvent/enzyme) atoms. As an alternative strategy, a delta machine learning potential (DMLP) is trained to reproduce the differences between ai-QM/MM and semiempirical (se) QM/MM energy and forces. To account for the effect of the condensed–phase environment in both MLP and DMLP, the DeePMD representation of a molecular system is extended to incorporate external electrostatic potential and field on each QM atom. Using the Menshutkin and chorismate mutase reactions as examples, we show that the developed MLP and DMLP reproduce the ai-QM/MM energy and forces with average errors below 1.0 kcal/mol and 1.0 kcal/mol/Å, respectively, for representative configurations along the reaction pathway. For both reactions, MLP/DMLP-based simulations yielded free energy profiles that differed by less than 1.0 kcal/mol from the reference ai-QM/MM results, at only a fraction of the computational cost.
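
The delta-learning strategy in this abstract (train a model on the difference between the expensive and the cheap level of theory, then add the learned correction back to the cheap result) can be illustrated with a toy sketch. This is not the authors' DeePMD-based implementation: the one-dimensional descriptor, the linear "model", the function names, and all numbers below are hypothetical stand-ins chosen only to show the bookkeeping. Forces would be corrected in the same way.

```python
# Toy delta-learning sketch: learn delta = E_high - E_low, predict E_low + delta.
# All names and data are hypothetical; a real DMLP uses a neural network
# and many-dimensional descriptors instead of a straight-line fit.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical reaction coordinate values and energies (arbitrary units).
xs = [i / 5 - 1.0 for i in range(11)]
e_low = [x ** 2 for x in xs]                    # cheap level (stand-in for se-QM/MM)
e_high = [x ** 2 + 0.5 * x + 2.0 for x in xs]   # reference level (stand-in for ai-QM/MM)

# Train on the difference only.
deltas = [hi - lo for hi, lo in zip(e_high, e_low)]
intercept, slope = fit_line(xs, deltas)

# Prediction = cheap energy + learned correction.
e_pred = [lo + intercept + slope * x for lo, x in zip(e_low, xs)]
max_error = max(abs(p - hi) for p, hi in zip(e_pred, e_high))
```

Because the difference between the two levels is usually smoother than either energy surface, the correction is typically easier to learn than the full energy; in this toy case it is exactly linear, so the fit recovers it to rounding error.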


2021 ◽  
Vol 17 (9) ◽  
pp. 5745-5758 ◽  
Author(s):  
Xiaoliang Pan ◽  
Junjie Yang ◽  
Richard Van ◽  
Evgeny Epifanovsky ◽  
Junming Ho ◽  
...  

2009 ◽  
Vol 87 (10) ◽  
pp. 1322-1337 ◽  
Author(s):  
Hans Martin Senn ◽  
Johannes Kästner ◽  
Jürgen Breidung ◽  
Walter Thiel

We report potential-energy and free-energy data for three enzymatic reactions: carbon–halogen bond formation in fluorinase, hydrogen abstraction from camphor in cytochrome P450cam, and chorismate-to-prephenate Claisen rearrangement in chorismate mutase. The results were obtained by combined quantum mechanics/molecular mechanics (QM/MM) optimizations and two types of QM/MM free-energy simulations (free-energy perturbation and umbrella sampling) using semi-empirical or density-functional QM methods. Based on these results and our previously published free-energy data on electrophilic substitution in para-hydroxybenzoate hydroxylase, we discuss the importance of finite-temperature effects in the chemical step of enzyme reactions. We find that the entropic contribution to the activation barrier is generally rather small, usually of the order of 5 kJ mol⁻¹ or less, consistent with the notion that enzymes bind and pre-organize the reactants in the active site. A somewhat larger entropic contribution is encountered in the case of chorismate mutase where the pericyclic transition state is intrinsically more rigid than the chorismate reactant (also in the enzyme). The present results suggest that barriers from QM/MM geometry optimization may often be close to free-energy barriers for the chemical step in enzymatic reactions.


2021 ◽  
Author(s):  
Zhaoxi Sun ◽  
Qiaole He

The combination of free energy simulations in the alchemical and configurational spaces provides a feasible route to access the thermodynamic profiles under a computationally demanding target Hamiltonian. Because ab initio quantum mechanics (QM) calculations cost far more than semi-empirical quantum mechanics (SQM) and molecular mechanics (MM) calculations, this indirect method is often applied to obtain the QM thermodynamics by combining the SQM or MM thermodynamics with the SQM-to-QM or MM-to-QM corrections. In our previous works, a multi-dimensional nonequilibrium pulling framework for Hamiltonian variations was introduced based on bidirectional pulling and bidirectional reweighting. The method performs nonequilibrium free energy simulations in the configurational space to obtain the thermodynamic profile along the conformational change pathway under a selected computationally efficient Hamiltonian, and uses the nonequilibrium alchemical method to correct or perturb the thermodynamic profile to that under the target Hamiltonian. The BAR-based method is designed to achieve the best generality and transferability and thus leads to a modest (~20-fold) speedup. In this work, we explore the possibility of further accelerating the nonequilibrium free energy simulation by employing unidirectional pulling and using the selection criterion to obtain the initial configurations used to initiate nonequilibrium trajectories, following the idea of adaptive steered molecular dynamics (ASMD). A single initial condition is used to seed the whole multi-dimensional nonequilibrium free energy simulation and the sampling is performed fully in the nonequilibrium ensemble. The ASMD scheme estimates the free energy difference with the unidirectional exponential average (EXP), but it does not exactly satisfy the requirements of the EXP estimator. Another consequence of the seeding simulation is the inherently sequential pulling caused by the inter-segment dependency, which limits the parallelizability of the simulation. Tests are therefore required to gain insights and guidelines for using this selection-criterion-based ASMD scheme. The ASMD method is tested thoroughly on a dihedral flipping model system and encouraging numerical results are obtained. The selection-criterion-based multi-dimensional ASMD framework follows the same perturbation framework as the BAR-based method, and thus could be used in various Hamiltonian-variation cases.
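
For reference, the unidirectional exponential average (EXP) mentioned in this abstract is the Jarzynski-type estimator ΔF = −kT ln⟨exp(−W/kT)⟩ over nonequilibrium work values W. The sketch below shows the plain estimator only, not the selection-criterion-based ASMD scheme of the paper; the work values and the function name are hypothetical, and kT is taken as roughly 0.596 kcal/mol (about 300 K).

```python
import math

def exp_free_energy(works, kT):
    """Unidirectional exponential average: -kT * ln(mean(exp(-W / kT))).

    A log-sum-exp shift is used so that large |W / kT| does not overflow.
    """
    scaled = [-w / kT for w in works]
    shift = max(scaled)
    log_mean = shift + math.log(sum(math.exp(s - shift) for s in scaled) / len(scaled))
    return -kT * log_mean

kT = 0.596                          # kcal/mol, approximately 300 K
works = [2.1, 1.8, 2.6, 1.5, 3.0]   # hypothetical pulling work values, kcal/mol

delta_f = exp_free_energy(works, kT)
mean_work = sum(works) / len(works)
```

The estimate always lies between the smallest work value and the mean work, and it is dominated by the rare low-work trajectories, which is why the way initial configurations are selected matters for this estimator.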


2021 ◽  
Author(s):  
Zhaoxi Sun ◽  
Qiaole He

The combination of free energy simulations in the alchemical and configurational spaces provides a feasible route to access the thermodynamic profiles under a computationally demanding target Hamiltonian. Because ab initio quantum mechanics (QM) calculations cost far more than semi-empirical quantum mechanics (SQM) and molecular mechanics (MM) calculations, this indirect method could be used to obtain the QM thermodynamics by combining the SQM or MM results with the SQM-to-QM or MM-to-QM corrections. In our previous works, a multi-dimensional nonequilibrium pulling framework for Hamiltonian variations was introduced based on bidirectional pulling and bidirectional reweighting. The method performs nonequilibrium free energy simulations in the configurational space to obtain the thermodynamic profile along the conformational change pathway under a selected computationally efficient Hamiltonian, and uses the nonequilibrium alchemical method to correct or perturb the thermodynamic profile to that under the target Hamiltonian. The BAR-based method is designed to achieve the best generality and transferability and thus leads to a modest (~20-fold) speedup. In this work, we explore the possibility of further accelerating the nonequilibrium free energy simulation by employing unidirectional pulling and using the selection criterion to obtain the initial configurations used to initiate nonequilibrium trajectories, following the idea of adaptive steered molecular dynamics (ASMD). A single initial condition is used to seed the whole multi-dimensional nonequilibrium free energy simulation and the sampling is performed fully in the nonequilibrium ensemble. Introducing very short ps-length equilibrium sampling to obtain more initial seeds could also be helpful. The ASMD scheme estimates the free energy difference with the unidirectional exponential average (EXP), but it does not exactly satisfy the requirements of the EXP estimator. Another deficiency of the seeding simulation is the inherently sequential pulling caused by the inter-segment dependency, which limits the parallelizability of the simulation. Numerical tests are performed to gain insights and guidelines for using this selection-criterion-based ASMD scheme. The presented selection-criterion-based multi-dimensional ASMD scheme follows the same perturbation network as the BAR-based method, and thus could be used in various Hamiltonian-variation cases.
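
The indirect route described here rests on a thermodynamic cycle: the profile under the expensive Hamiltonian equals the profile under the cheap one plus the cheap-to-expensive correction at each state, measured relative to a reference state. A minimal sketch of that bookkeeping follows; the function name and all numbers are hypothetical, and in practice each correction would itself come from an alchemical free energy calculation.

```python
def correct_profile(low_level_profile, corrections):
    """Combine a low-level free energy profile with per-state corrections.

    low_level_profile[i]: free energy of state i under the cheap Hamiltonian.
    corrections[i]: free energy of switching state i from the cheap to the
                    target Hamiltonian.
    The result is re-zeroed at the first state, so only differences matter.
    """
    if len(low_level_profile) != len(corrections):
        raise ValueError("profile and corrections must have the same length")
    corrected = [f + c for f, c in zip(low_level_profile, corrections)]
    reference = corrected[0]
    return [f - reference for f in corrected]

# Hypothetical values in kcal/mol for three states along a pathway.
sqm_profile = [0.0, 3.2, 1.1]   # profile under the cheap (e.g. SQM) Hamiltonian
sqm_to_qm = [0.4, 1.5, 0.9]     # per-state SQM-to-QM corrections

qm_profile = correct_profile(sqm_profile, sqm_to_qm)
```

For the second state this gives 3.2 + (1.5 − 0.4) = 4.3 kcal/mol: the low-level barrier plus the difference between the corrections at the two end points of that leg of the cycle.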


2020 ◽  
Author(s):  
Jingbai Li ◽  
Patrick Reiser ◽  
André Eberhard ◽  
Pascal Friederich ◽  
Steven Lopez

Photochemical reactions are increasingly used to construct complex molecular architectures under mild and straightforward reaction conditions. Computational techniques are increasingly important for understanding the reactivities and chemoselectivities of photochemical isomerization reactions because they offer molecular bonding information along the excited state(s) of the photodynamics. These photodynamics simulations are resource-intensive and are typically limited to 1–10 picoseconds and 1,000 trajectories due to high computational cost. Most organic photochemical reactions have excited-state lifetimes exceeding 1 picosecond, which places them outside possible computational studies. Westermayr et al. demonstrated that a machine learning approach could significantly lengthen photodynamics simulation times for a model system, methylenimmonium cation (CH₂NH₂⁺).

We have developed a Python-based code, Python Rapid Artificial Intelligence Ab Initio Molecular Dynamics (PyRAI²MD), to accomplish the unprecedented 10 ns cis-trans photodynamics of trans-hexafluoro-2-butene (CF₃–CH=CH–CF₃) in 3.5 days. The same simulation would take approximately 58 years with ground-truth multiconfigurational dynamics. We proposed an innovative scheme combining Wigner sampling, geometrical interpolations, and short-time quantum chemical trajectories to effectively sample the initial data, facilitating the adaptive sampling to generate an informative and data-efficient training set with 6,232 data points. Our neural networks achieved chemical accuracy (mean absolute error of 0.032 eV). Our 4,814 trajectories reproduced the S₁ half-life (60.5 fs) and the photochemical product ratio (trans:cis = 2.3:1), and autonomously discovered a pathway towards a carbene. The neural networks have also shown the capability of generalizing the full potential energy surface with chemically incomplete data (trans → cis but not cis → trans pathways), which may enable future automated photochemical reaction discoveries.
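
The two wall-clock figures quoted in this abstract (about 58 years for the reference dynamics versus 3.5 days with the neural networks) imply a speedup that the abstract does not state explicitly. The arithmetic below is only a back-of-the-envelope check, assuming 365.25 days per year; the variable names are illustrative.

```python
DAYS_PER_YEAR = 365.25          # assumption used for the conversion

reference_days = 58 * DAYS_PER_YEAR   # quoted estimate for multiconfigurational dynamics
ml_days = 3.5                         # quoted time for the ML-driven simulation

speedup = reference_days / ml_days
```

Under this assumption the ratio is roughly 6,000-fold, that is, between three and four orders of magnitude.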


2020 ◽  
Author(s):  
E. Prabhu Raman ◽  
Thomas J. Paul ◽  
Ryan L. Hayes ◽  
Charles L. Brooks III

Accurate predictions of changes to protein-ligand binding affinity in response to chemical modifications are of utility in small molecule lead optimization. Relative free energy perturbation (FEP) approaches are among the most widely used for this purpose, but involve significant computational cost, which limits their application to small sets of compounds. Lambda dynamics, also rigorously based on the principles of statistical mechanics, provides a more efficient alternative. In this paper, we describe the development of a workflow to set up, execute, and analyze Multi-Site Lambda Dynamics (MSLD) calculations run on GPUs with CHARMm implemented in BIOVIA Discovery Studio and Pipeline Pilot. The workflow establishes a framework for setting up simulation systems for exploratory screening of modifications to a lead compound, enabling the calculation of relative binding affinities of combinatorial libraries. To validate the workflow, a diverse dataset of congeneric ligands for seven proteins with experimental binding affinity data is examined. A protocol is developed that automatically and iteratively tailors biasing potentials to flatten the free energy landscape of any MSLD system, which enhances sampling and allows for efficient estimation of free energy differences. The protocol is first validated on a large number of ligand subsets that model diverse substituents, where it shows accurate and reliable performance. The scalability of the workflow is also tested by screening more than a hundred ligands modeled in a single system, which also resulted in accurate predictions. With a cumulative sampling time of 150 ns or less, the method results in average unsigned errors of under 1 kcal/mol in most cases for both small and large combinatorial libraries. For the multi-site systems examined, the method is estimated to be more than an order of magnitude more efficient than contemporary FEP applications. The results thus demonstrate the utility of the presented MSLD workflow for efficiently screening combinatorial libraries and exploring chemical space around a lead compound during lead optimization.
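
The connection between flattening biases, sampling, and free energy differences can be made concrete with a textbook relation. If a bias b is added to each end state, the biased populations satisfy p ∝ exp(−(F + b)/kT), so F_j − F_i = −kT ln(N_j/N_i) − (b_j − b_i), where N are the sampled counts. The sketch below applies this relation only; it is not the estimator used in the described workflow, and the function name, counts, and bias values are hypothetical.

```python
import math

def relative_free_energy(count_i, count_j, bias_i, bias_j, kT):
    """F_j - F_i from biased end-state counts.

    Assumes biased populations proportional to exp(-(F + bias) / kT),
    so the bias difference must be removed from the apparent result.
    """
    if count_i <= 0 or count_j <= 0:
        raise ValueError("both end states must be sampled at least once")
    return -kT * math.log(count_j / count_i) - (bias_j - bias_i)

kT = 0.596  # kcal/mol, approximately 300 K

# Hypothetical counts and biases (kcal/mol) for two substituents.
dg_protein = relative_free_energy(4000, 6000, 0.0, 1.2, kT)  # ligand in the protein
dg_water = relative_free_energy(5000, 5000, 0.0, 0.5, kT)    # ligand in solvent

# Relative binding free energy from the two legs of the cycle.
ddg_bind = dg_protein - dg_water
```

When the biases flatten the landscape well, the counts are nearly equal and the free energy difference is carried almost entirely by the bias difference, which is why well-tuned biases support both good sampling and a low-variance estimate.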


2018 ◽  
Author(s):  
Roman Zubatyuk ◽  
Justin S. Smith ◽  
Jerzy Leszczynski ◽  
Olexandr Isayev

Atomic and molecular properties can be evaluated from the fundamental Schrödinger equation and therefore represent different modalities of the same quantum phenomena. Here we present AIMNet, a modular and chemically inspired deep neural network potential. We used AIMNet with multitarget training to learn multiple modalities of the state of the atom in a molecular system. The resulting model shows state-of-the-art accuracy on several benchmark datasets, comparable to the results of DFT methods that are orders of magnitude more expensive. It can simultaneously predict several atomic and molecular properties without an increase in computational cost. With AIMNet we show a new dimension of transferability: the ability to learn new targets utilizing multimodal information from previous training. The model can learn implicit solvation energy (like SMD) utilizing only a fraction of the original training data, and achieves a MAD error of 1.1 kcal/mol compared to experimental solvation free energies in the MNSol database.


2019 ◽  
Author(s):  
Xiaohui Wang ◽  
Zhaoxi Sun

Correct calculation of the variation of free energy upon base flipping is crucial in understanding the dynamics of DNA systems. The free energy landscape along the flipping pathway gives the thermodynamic stability and the flexibility of base-paired states. Although numerous free energy simulations have been performed for base flipping, no theoretically rigorous nonequilibrium techniques have been devised and employed to investigate the thermodynamics of base flipping. In the current work, we report a general nonequilibrium stratification scheme for efficient calculation of the free energy landscape of base flipping in a DNA duplex. We carefully monitor the convergence behavior of the equilibrium-sampling-based free energy simulation and the nonequilibrium stratification, and determine the empirical length of time blocks required for converged sampling. The performances of equilibrium umbrella sampling and nonequilibrium stratification are compared. The results show that nonequilibrium free energy simulation gives accuracy and efficiency similar to those of the equilibrium enhanced sampling technique in the base flipping cases. We further test a convergence criterion we previously proposed and find that the convergence behavior determined by this criterion agrees with that given by the time-invariant behavior of the PMF and the nonlinear dependence of the standard deviation on the sample size.

