Fast and effective protein model refinement using deep graph neural networks

Fast and effective protein model refinement by deep graph neural networks

DOI: 10.1101/2020.12.10.419994 (2020)

Author(s): Xiaoyang Jing, Jinbo Xu

Keyword(s): Neural Networks, Structure Prediction, Initial Model, Model Quality, Model Refinement, Protein Model, Improve Model, Graph Neural Networks, Improved Model, Better Than

Abstract: Protein structure prediction has improved greatly, but a good portion of predicted models still fall short of high quality. Protein model refinement is one way to improve model quality further, yet refining a protein model towards better quality remains very challenging. Currently, the most successful refinement methods rely on extensive conformation sampling and thus take hours or days to refine even a single protein model. Here we propose a fast and effective method for protein model refinement with very limited conformation sampling. Our method applies graph neural networks (GNNs) to predict a refined inter-atom distance probability distribution from an initial model and then rebuilds the model using the predicted distances as restraints. On the CASP13 refinement targets, our method can refine models to a quality comparable to that of the two leading human groups (Feig and Baker) and greatly outperforms the others. On the CASP14 refinement targets, our method is second only to Feig’s method, comparable to Baker’s method, and much better than the others (which worsened rather than improved model quality). Our method achieves this result by generating only 5 refined models for an initial model, which can be done in ~15 minutes. Our study also shows that GNNs perform much better than convolutional residual neural networks for protein model refinement when conformation sampling is limited.

Availability: The code will be released once the manuscript is published and available at http://[email protected]
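
The abstract describes predicting a distance probability distribution and then using it as restraints. A minimal sketch of one way such a distribution could be turned into a restraint score is shown below. It assumes a binned distribution and a negative-log-probability energy; the bin edges, probabilities, and function names are hypothetical and are not taken from the paper.

```python
import math

def restraint_energy(distance, bin_edges, probs, floor=1e-6):
    """Negative log-probability of a distance under a binned distribution.

    Assumption: distances outside the covered range are clamped to the
    first or last bin; `floor` avoids log(0) for empty bins.
    """
    for i in range(len(probs)):
        if distance < bin_edges[i + 1]:
            return -math.log(max(probs[i], floor))
    return -math.log(max(probs[-1], floor))

def expected_distance(bin_edges, probs):
    """Mean distance, using the centre of each bin as its representative value."""
    centers = [(bin_edges[i] + bin_edges[i + 1]) / 2 for i in range(len(probs))]
    return sum(c * p for c, p in zip(centers, probs))

# Hypothetical prediction for one atom pair: four 2 Å bins spanning 4-12 Å.
bin_edges = [4.0, 6.0, 8.0, 10.0, 12.0]
probs = [0.1, 0.6, 0.2, 0.1]

mean_d = expected_distance(bin_edges, probs)
e_likely = restraint_energy(7.0, bin_edges, probs)     # falls in the most probable bin
e_unlikely = restraint_energy(11.0, bin_edges, probs)  # falls in a low-probability bin

print(f"expected distance: {mean_d:.2f} Å")
print(f"energy at 7 Å: {e_likely:.3f}, energy at 11 Å: {e_unlikely:.3f}")
```

Under these assumptions, a rebuilding step would favor conformations whose pairwise distances sit in high-probability bins, since those receive the lowest energy.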


Graph Neural Networks for Prediction of Fuel Ignition Quality

DOI: 10.26434/chemrxiv.12280325.v1 (2020)

Author(s): Artur Schweidtmann, Jan Rittig, Andrea König, Martin Grohe, Alexander Mitsos, ...

Keyword(s): Neural Networks, Octane Number, Molecular Graph, Chemical Properties, Graph Representation, Structure Property, Oxygenated Hydrocarbons, Physico Chemical, Ignition Quality, Graph Neural Networks

Prediction of combustion-related properties of (oxygenated) hydrocarbons is an important and challenging task for which quantitative structure-property relationship (QSPR) models are frequently employed. Recently, graph neural networks (GNNs), a machine learning method, have shown promising results for the prediction of structure-property relationships. GNNs utilize a graph representation of molecules in which atoms correspond to nodes and bonds to edges, so that the graph carries information about the molecular structure. More specifically, GNNs learn physico-chemical properties as a function of the molecular graph in a supervised learning setup using a backpropagation algorithm. This end-to-end learning approach eliminates the need to select molecular descriptors or structural groups, as it learns optimal fingerprints through graph convolutions and maps the fingerprints to the physico-chemical properties by deep learning. We develop GNN models for predicting three fuel ignition quality indicators, i.e., the derived cetane number (DCN), the research octane number (RON), and the motor octane number (MON), of oxygenated and non-oxygenated hydrocarbons. Because experimental data are limited (on the order of hundreds), we propose a combination of multi-task learning, transfer learning, and ensemble learning. The results show competitive performance of the proposed GNN approach compared to state-of-the-art QSPR models, making it a promising field for future research. The prediction tool is available via a web front-end at www.avt.rwth-aachen.de/gnn.
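
The graph representation and fingerprint idea described above can be illustrated with a small sketch. It is a simplification under stated assumptions: a single, untrained message-passing round with sum aggregation and sum pooling, hypothetical one-hot atom features, and no learned weights. A real GNN would apply trainable weights and nonlinearities at each step and pass the fingerprint to a regression network.

```python
def message_passing_round(features, edges):
    """One untrained round: each atom adds its bonded neighbours' features to its own."""
    updated = [list(f) for f in features]
    for a, b in edges:
        for k in range(len(features[a])):
            # Bonds are undirected, so messages flow in both directions.
            updated[a][k] += features[b][k]
            updated[b][k] += features[a][k]
    return updated

def readout(features):
    """Sum-pool atom features into one molecule-level vector (the 'fingerprint')."""
    return [sum(column) for column in zip(*features)]

# Ethanol heavy atoms C-C-O as a graph: atoms are nodes, bonds are edges.
# Hypothetical node features: one-hot [is_carbon, is_oxygen].
atoms = [[1, 0], [1, 0], [0, 1]]
bonds = [(0, 1), (1, 2)]

hidden = message_passing_round(atoms, bonds)
fingerprint = readout(hidden)

print(hidden)       # per-atom features after one round
print(fingerprint)  # molecule-level vector
```

After one round, each atom's vector reflects its immediate bonding environment, which is why the two carbons end up with different features even though they started identical.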


Improved Sampling Strategies for Protein Model Refinement based on Molecular Dynamics Simulation

DOI: 10.26434/chemrxiv.13299197.v1 (2020)

Author(s): Lim Heo, Collin Arbour, Michael Feig

Keyword(s): Molecular Dynamics, Molecular Dynamics Simulation, Structure Prediction, Protein Structures, Conformational Space, Dynamics Simulation, Model Refinement, Protein Model, Lower Accuracy, Simulation Based

Protein structures provide valuable information for understanding biological processes. Protein structures can be determined by experimental methods such as X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, or cryogenic electron microscopy. As an alternative, in silico methods can be used to predict protein structures. Those methods utilize protein structure databases for structure prediction via template-based modeling or for training machine-learning models to generate predictions. Structure prediction for proteins distant from proteins with known structures often results in lower accuracy with respect to the true physiological structures. Physics-based protein model refinement methods can be applied to improve the accuracy of the predicted models. Refinement methods rely on conformational sampling around the predicted structures, and if structures closer to the native states are sampled, improvements in model quality become possible. Molecular dynamics simulations have been especially successful at improving model quality, but although consistent refinement can be achieved, the improvements are still moderate. To extend the refinement performance of a simulation-based protocol, we explored new schemes that focus on an optimized use of biasing functions and the application of increased simulation temperatures. In addition, we tested the use of alternative initial models so that the simulations can explore conformational space more broadly. Based on the insights from this analysis, we propose a new refinement protocol that significantly outperformed previous state-of-the-art molecular dynamics simulation-based protocols in the benchmark tests described here.
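
The abstract does not specify the form of its biasing functions. As an illustration only, the sketch below shows one widely used kind of positional bias, a flat-bottom harmonic restraint, which lets an atom move freely within a tolerance of its reference position and penalizes larger excursions. The function name, force constant `k`, and tolerance `width` are hypothetical values chosen for the example, and the energy convention (no factor of 1/2) is an assumption.

```python
import math

def flat_bottom_restraint(pos, ref, k=1.0, width=2.0):
    """Zero within `width` Å of the reference position, harmonic beyond it.

    Assumed convention: E = k * excess**2, with distances in Å.
    """
    displacement = math.dist(pos, ref)
    excess = max(0.0, displacement - width)
    return k * excess ** 2

ref = (0.0, 0.0, 0.0)
e_inside = flat_bottom_restraint((1.0, 0.0, 0.0), ref)   # 1 Å away: inside the flat region
e_outside = flat_bottom_restraint((5.0, 0.0, 0.0), ref)  # 5 Å away: 3 Å beyond the flat region

print(e_inside, e_outside)
```

A restraint of this shape keeps sampling near the initial model while still allowing local rearrangement, which is the trade-off that tuning a biasing function is meant to balance.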
