Rapid Enthalpy Prediction of Transition States Using Molecular Graph Convolutional Network

AIChE Journal ◽  
2021 ◽  
Author(s):  
Siyuan Gong ◽  
Yutong Wang ◽  
Yajie Tian ◽  
Li Wang ◽  
Guozhu Liu
2019 ◽  
Vol 20 (14) ◽  
pp. 3389 ◽  
Author(s):  
Ke Liu ◽  
Xiangyan Sun ◽  
Lei Jia ◽  
Jun Ma ◽  
Haoming Xing ◽  
...  

Absorption, distribution, metabolism, and excretion (ADME) studies are critical for drug discovery. Conventionally, these tasks, together with other chemical property predictions, rely on domain-specific feature descriptors, or fingerprints. Following the recent success of neural networks, we developed Chemi-Net, a completely data-driven, domain knowledge-free, deep learning method for ADME property prediction. To compare the relative performance of Chemi-Net with Cubist, one of the popular machine learning programs used by Amgen, a large-scale ADME property prediction study was performed on-site at Amgen. For all 13 data sets, Chemi-Net achieved higher R² values than the Cubist benchmark. The median R² increase rate over Cubist was 26.7%. We expect that the significantly increased accuracy of ADME prediction seen with Chemi-Net over Cubist will greatly accelerate drug discovery.
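As a rough illustration of the comparison metric, the sketch below computes a coefficient of determination and a median relative increase across data sets. It assumes that "increase rate" means the relative change (model minus baseline, divided by baseline) and that R² is defined as 1 - SS_res/SS_tot; the abstract states neither, and all function names are hypothetical.

```python
from statistics import median


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_increase_pct(r2_baseline, r2_model):
    """Relative R^2 gain of the model over the baseline, in percent (assumed definition)."""
    return 100.0 * (r2_model - r2_baseline) / r2_baseline


def median_relative_increase(r2_baselines, r2_models):
    """Median of the per-data-set relative gains."""
    return median(
        relative_increase_pct(b, m) for b, m in zip(r2_baselines, r2_models)
    )
```

Reporting the median rather than the mean keeps a single data set with a very low baseline R² from dominating the summary.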


2021 ◽  
Author(s):  
Weihe Zhong ◽  
Lu Zhao ◽  
Calvin Yu-Chian Chen

The LIM kinase 1 (Limk1) is considered a therapeutic target, and selective inhibitors of Limk1 rather than Rho-associated kinase 2 (ROCK2) are of interest for the treatment of several indications such as Alzheimer’s disease (AD), Parkinson’s disease (PD) and cancer migration/invasion. Here, we utilized molecular docking to screen potential Limk1 compounds from a Traditional Chinese Medicine (TCM) database. Meanwhile, we applied a three-dimensional graph convolutional network (3DGCN), based on the 3D molecular graph, to predict inhibitory activity against Limk1 and ROCK2. Compared with the baseline models (RF, GCN and Weave), the 3DGCN achieved higher accuracy, and the averaged RMSE values on the test sets for Limk1 and ROCK2 were 0.721 and 0.852, respectively. With the 3DGCN, more than 80% of the test-set molecules from both datasets were predicted within an absolute error of 1.0, and feature visualization suggested that the network could automatically learn task-relevant structural features, including 3D molecular information. Furthermore, molecular dynamics (MD) simulations within 100 ns were employed to verify the stability of the ligand-protein complexes and reveal the binding modes of the potential selective lead compounds of Limk1. Finally, integrating the docking results, the values predicted by the 3DGCN and the MD analysis, we found that 7549, 2007_15649 and 3519 might be potential inhibitors that are selective for the Limk1 receptor rather than ROCK2.
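The two test-set figures quoted above (RMSE, and the share of molecules predicted within an absolute error of 1.0) can be computed as in the following minimal sketch. The function names are hypothetical and the code is not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted activities."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def fraction_within(y_true, y_pred, tol=1.0):
    """Share of predictions whose absolute error is at most `tol`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)
```

The two numbers complement each other: RMSE is sensitive to a few large misses, while the within-tolerance fraction shows how many molecules are predicted acceptably.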


2020 ◽  
Author(s):  
Artur Schweidtmann ◽  
Jan Rittig ◽  
Andrea König ◽  
Martin Grohe ◽  
Alexander Mitsos ◽  
...  

Prediction of combustion-related properties of (oxygenated) hydrocarbons is an important and challenging task for which quantitative structure-property relationship (QSPR) models are frequently employed. Recently, a machine learning method, graph neural networks (GNNs), has shown promising results for the prediction of structure-property relationships. GNNs utilize a graph representation of molecules, where atoms correspond to nodes and bonds to edges containing information about the molecular structure. More specifically, GNNs learn physico-chemical properties as a function of the molecular graph in a supervised learning setup using a backpropagation algorithm. This end-to-end learning approach eliminates the need for selection of molecular descriptors or structural groups, as it learns optimal fingerprints through graph convolutions and maps the fingerprints to the physico-chemical properties by deep learning. We develop GNN models for predicting three fuel ignition quality indicators, i.e., the derived cetane number (DCN), the research octane number (RON), and the motor octane number (MON), of oxygenated and non-oxygenated hydrocarbons. In light of limited experimental data on the order of hundreds, we propose a combination of multi-task learning, transfer learning, and ensemble learning. The results show competitive performance of the proposed GNN approach compared to state-of-the-art QSPR models, making it a promising direction for future research. The prediction tool is available via a web front-end at www.avt.rwth-aachen.de/gnn.
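To make the "atoms as nodes, bonds as edges" idea concrete, here is a minimal sketch of one graph-convolution step followed by a sum readout into a molecular fingerprint. It is not the authors' architecture: the one-hot atom features, the sum aggregation with a self-loop, and the untrained identity weight matrix are all illustrative assumptions, and bond (edge) features are omitted.

```python
def relu(vector):
    return [max(0.0, v) for v in vector]


def mat_vec(weights, vector):
    """Multiply a weight matrix (list of rows) by a feature vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def graph_conv(features, neighbors, weights):
    """One step: each atom sums its own and its neighbors' features,
    then applies a shared linear map and a ReLU."""
    updated = []
    for atom, own in enumerate(features):
        total = list(own)
        for nb in neighbors[atom]:
            total = [a + b for a, b in zip(total, features[nb])]
        updated.append(relu(mat_vec(weights, total)))
    return updated


def readout(features):
    """Sum atom vectors into one fixed-length molecular fingerprint."""
    return [sum(column) for column in zip(*features)]


# Ethanol heavy atoms C-C-O with hypothetical one-hot features [is_C, is_O].
ETHANOL_FEATURES = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
ETHANOL_NEIGHBORS = {0: [1], 1: [0, 2], 2: [1]}
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned weights

fingerprint = readout(graph_conv(ETHANOL_FEATURES, ETHANOL_NEIGHBORS, IDENTITY))
```

In a trained model the weight matrix would be fitted by backpropagation and several such steps would be stacked, so that each atom's vector reflects a progressively larger neighborhood before the fingerprint is passed to a regression head.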


2018 ◽  
Author(s):  
Caitlin C. Bannan ◽  
David Mobley ◽  
A. Geoff Skillman

A variety of fields would benefit from accurate pKa predictions, especially drug design, due to the effect a change in ionization state can have on a molecule's physicochemical properties. Participants in the recent SAMPL6 blind challenge were asked to submit predictions for microscopic and macroscopic pKas of 24 drug-like small molecules. We recently built a general model for predicting pKas using a Gaussian process regression trained using physical and chemical features of each ionizable group. Our pipeline takes a molecular graph and uses the OpenEye Toolkits to calculate features describing the removal of a proton. These features are fed into a Scikit-learn Gaussian process to predict microscopic pKas, which are then used to analytically determine macroscopic pKas. Our Gaussian process is trained on a set of 2,700 macroscopic pKas from monoprotic and select diprotic molecules. Here, we share our results for microscopic and macroscopic predictions in the SAMPL6 challenge. Overall, we ranked in the middle of the pack compared to other participants, but our fairly good agreement with experiment is still promising considering the challenge molecules are chemically diverse and often polyprotic while our training set is predominantly monoprotic. Of particular importance to us when building this model was to include an uncertainty estimate based on the chemistry of the molecule that would reflect the likely accuracy of our prediction. Our model reports large uncertainties for the molecules that appear to have chemistry outside our domain of applicability, along with good agreement in quantile-quantile plots, indicating it can predict its own accuracy. The challenge highlighted a variety of means to improve our model, including adding more polyprotic molecules to our training set and more carefully considering what functional groups we do or do not identify as ionizable.
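The link between a Gaussian process and a chemistry-aware uncertainty estimate can be sketched as follows. This is a from-scratch illustration, not the authors' Scikit-learn pipeline: it assumes a single hypothetical scalar feature per ionizable group, an RBF kernel with fixed (unfitted) length scale and noise, and made-up numbers.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel on a scalar feature."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-2, length=1.0):
    """Posterior mean and standard deviation of the latent function at x_new."""
    n = len(x_train)
    offset = sum(y_train) / n  # center targets so the prior mean is sensible
    centered = [y - offset for y in y_train]
    gram = [
        [rbf(x_train[i], x_train[j], length) + (noise if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    k_star = [rbf(x_new, xi, length) for xi in x_train]
    alpha = solve(gram, centered)
    weights = solve(gram, k_star)
    mean = offset + sum(k * a for k, a in zip(k_star, alpha))
    variance = 1.0 - sum(k * w for k, w in zip(k_star, weights))
    return mean, math.sqrt(max(variance, 0.0))
```

For example, with hypothetical training features `[0.0, 1.0, 2.0]` and pKa values `[4.0, 4.5, 5.2]`, a query at `1.0` returns a small standard deviation, while a query at `10.0`, far from any training point, falls back to the training mean with a standard deviation close to the prior value of 1. That growth in uncertainty away from the training data is the behavior the abstract describes for molecules outside the domain of applicability.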


2019 ◽  
Author(s):  
Wengong Jin ◽  
Regina Barzilay ◽  
Tommi S Jaakkola

Accelerating drug discovery relies heavily on automatic tools that optimize precursor molecules to give them better biochemical properties. Our work in this paper substantially extends the prior state of the art in graph-to-graph translation methods for molecular optimization. In particular, we realize coherent multi-resolution representations by interweaving trees over substructures with the atom-level encoding of the original molecular graph. Moreover, our graph decoder is fully autoregressive, and interleaves each step of adding a new substructure with the process of resolving its connectivity to the emerging molecule. We evaluate our model on multiple molecular optimization tasks and show that it outperforms previous state-of-the-art baselines by a large margin.


2020 ◽  
Vol 20 (14) ◽  
pp. 1389-1402 ◽  
Author(s):  
Maja Zivkovic ◽  
Marko Zlatanovic ◽  
Nevena Zlatanovic ◽  
Mladjan Golubović ◽  
Aleksandar M. Veselinović

In recent years, Monte Carlo optimization has emerged as one of the promising approaches in QSAR modeling, serving as a conformation-independent method. Monte Carlo optimization has proven to be a valuable tool in chemoinformatics, and this review presents its application in drug discovery and design. The basic principles and important features of these methods are discussed, as well as the advantages of conformation-independent optimal descriptors, developed from the molecular graph and the Simplified Molecular Input Line Entry System (SMILES) notation, compared to commonly used descriptors in QSAR modeling. This review summarizes results obtained from Monte Carlo optimization-based QSAR modeling, with the further addition of molecular docking studies, applied to various pharmacologically important endpoints. It presents SMILES notation-based optimal descriptors, defined as molecular fragments and identified as the main contributors to the increase or decrease of biological activity, which are then used to design compounds with targeted activity by computer calculation. Research papers in which molecular docking was applied as an additional method to design molecules and further validate their activity are also summarized. These papers report a very good correlation between the results obtained from Monte Carlo optimization modeling and those from molecular docking studies.


2020 ◽  
Vol 17 (3) ◽  
pp. 224-233
Author(s):  
Xun Zhu ◽  
Chen Jian ◽  
Xiuqin Zhou ◽  
Abdullah M. Asiri ◽  
Khalid A. Alamry ◽  
...  

The pyrolysis of methyl alkyl esters I to III and dithioesters IV to VI was studied theoretically. All possible pyrolysis paths were considered. Both esters and dithioesters presented three potential paths via six-, four- and five-membered ring transition states, respectively. The calculations were performed at the MP2/6-31G(d) level. In-depth theoretical analyses, including NBO-related analyses, synchronicities, and charge distributions, were also presented to reveal the detailed pyrolysis process.

