Revisiting the “satisfaction of spatial restraints” approach of MODELLER for protein homology modeling

Abstract

The most frequently used approach for protein structure prediction is currently homology modeling. The 3D model building phase of this methodology is critical for obtaining an accurate and biologically useful prediction. The most widely employed tool to perform this task is MODELLER. This program implements the “modeling by satisfaction of spatial restraints” strategy and its core algorithm has not been altered significantly since the early 1990s. In this work, we have explored the idea of modifying MODELLER with two effective, yet computationally light strategies to improve its 3D modeling performance. Firstly, we have investigated how the level of accuracy in the estimation of structural variability between a target protein and its templates, in the form of σ values, profoundly influences 3D modeling. We show that the σ values produced by MODELLER are on average weakly correlated to the true level of structural divergence between target-template pairs and that increasing this correlation greatly improves the program’s predictions, especially in multiple-template modeling. Secondly, we have investigated how the incorporation of statistical potential terms (such as the DOPE potential) in MODELLER’s objective function positively impacts 3D modeling quality by providing a small but consistent improvement in metrics such as GDT-HA and lDDT and a large increase in stereochemical quality. Python modules to harness this second strategy are freely available at https://github.com/pymodproject/altmod. In summary, we show that there is ample room for improving MODELLER in terms of 3D modeling quality and we propose strategies that could be pursued in order to further increase its performance.

Author summary

Proteins are fundamental biological molecules that carry out countless activities in living beings. Since the function of proteins is dictated by their three-dimensional atomic structures, acquiring structural details of proteins provides deep insights into their function. Currently, the most successful computational approach for protein structure prediction is template-based modeling. In this approach, a target protein is modeled using the experimentally derived structural information of a template protein assumed to have a similar structure to the target. MODELLER is the most frequently used program for template-based 3D model building. Despite its success, its predictions are not always accurate enough to be useful in biomedical research. Here, we show that it is possible to greatly increase the performance of MODELLER by modifying two aspects of its algorithm. First, we demonstrate that providing the program with accurate estimations of local target-template structural divergence greatly increases the quality of its predictions. Additionally, we show that modifying MODELLER’s scoring function with statistical potential energetic terms also helps to improve modeling quality. This work will be useful in future research, since it reports practical strategies to improve the performance of this core tool in structural bioinformatics.
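To illustrate why the σ values matter, the sketch below models a homology-derived distance restraint as the negative logarithm of a Gaussian centred on the distance observed in a template, and a multiple-template restraint as the negative logarithm of an equally weighted mixture of such Gaussians. This is a simplified illustration and not MODELLER's actual code or API: the function names, the equal template weights, and the grid search are all assumptions made for the example.

```python
import math


def gaussian_restraint(d, d_template, sigma):
    """Negative log of a Gaussian pdf centred on the template distance.

    A small sigma penalises deviations from the template strongly;
    a large sigma lets the model drift away from the template cheaply.
    """
    return 0.5 * ((d - d_template) / sigma) ** 2 + math.log(
        sigma * math.sqrt(2.0 * math.pi)
    )


def multi_template_restraint(d, template_distances, sigmas):
    """Negative log of an equally weighted mixture of Gaussians.

    Equal weights are an assumption of this sketch; one Gaussian is
    contributed by each template.
    """
    weight = 1.0 / len(template_distances)
    density = sum(
        weight * math.exp(-0.5 * ((d - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for mu, s in zip(template_distances, sigmas)
    )
    return -math.log(density)


def preferred_distance(template_distances, sigmas, lo=3.0, hi=11.0, step=0.01):
    """Distance on a regular grid that minimises the mixture restraint."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    return min(grid, key=lambda d: multi_template_restraint(d, template_distances, sigmas))
```

For example, with hypothetical template distances of 5 Å and 9 Å, `preferred_distance([5.0, 9.0], [0.3, 3.0])` lands close to 5 Å, because the template given a small σ dominates the mixture. If both templates receive the same σ, the restraint has two equally deep minima and cannot tell the reliable template from the divergent one, which is the situation the abstract argues accurate σ estimates help to avoid.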

An improved genetic algorithm for statistical potential function design and protein structure prediction

International Journal of Data Mining and Bioinformatics ◽ 10.1504/ijdmb.2012.048174 ◽ 2012 ◽ Vol 6 (2) ◽ pp. 162

Author(s): Xin Geng ◽ Jihong Guan ◽ Qiwen Dong ◽ Shuigeng Zhou

Keyword(s): Genetic Algorithm ◽ Protein Structure ◽ Potential Function ◽ Protein Structure Prediction ◽ Structure Prediction ◽ Improved Genetic Algorithm ◽ Statistical Potential ◽ Function Design

Protein structure prediction based on statistical potential

Biophysical Journal ◽ 10.1016/s0006-3495(92)81793-9 ◽ 1992 ◽ Vol 62 (1) ◽ pp. 104-106 ◽ Cited By ~ 10

Author(s): S. Sun ◽ N. Luo ◽ R.L. Ornstein ◽ R. Rein

Keyword(s): Protein Structure ◽ Protein Structure Prediction ◽ Structure Prediction ◽ Statistical Potential

GOAP: A Generalized Orientation-Dependent, All-Atom Statistical Potential for Protein Structure Prediction

Biophysical Journal ◽ 10.1016/j.bpj.2011.09.012 ◽ 2011 ◽ Vol 101 (8) ◽ pp. 2043-2052 ◽ Cited By ~ 170

Author(s): Hongyi Zhou ◽ Jeffrey Skolnick

Keyword(s): Protein Structure ◽ Protein Structure Prediction ◽ Structure Prediction ◽ Statistical Potential

Development of a Grid-based Statistical Potential for Protein Structure Prediction

2005 IEEE Engineering in Medicine and Biology 27th Annual Conference ◽ 10.1109/iembs.2005.1615875 ◽ 2005 ◽ Cited By ~ 1

Author(s): Guijun Zhao ◽ Hui Lu

Keyword(s): Protein Structure ◽ Protein Structure Prediction ◽ Structure Prediction ◽ Statistical Potential ◽ Grid Based

Protein Structure Prediction Using an Augmented Homology Modeling Method: Key Importance of Iterative-Procedures for Obtaining Consistent Quality Models

Current Proteomics ◽ 10.2174/157016405774641156 ◽ 2005 ◽ Vol 2 (3) ◽ pp. 233-258

Author(s): S. McDonald ◽ S. Mylvaganam ◽ M. Shenderovich ◽ V. Tseitin ◽ C. Fisher ◽ ...

Keyword(s): Protein Structure ◽ Homology Modeling ◽ Protein Structure Prediction ◽ Structure Prediction ◽ Modeling Method ◽ Quality Models ◽ Iterative Procedures

3P003 Application of a knowledge-based potential for a rotamer library to 3D model verification in protein structure prediction

Seibutsu Butsuri ◽ 10.2142/biophys.45.s204_3 ◽ 2005 ◽ Vol 45 (supplement) ◽ pp. S204

Author(s): K. Tomii ◽ T. Hirokawa ◽ M. Ota

Keyword(s): Protein Structure ◽ Protein Structure Prediction ◽ Structure Prediction ◽ 3D Model ◽ Model Verification ◽ Rotamer Library ◽ Knowledge Based

Improved protein structure prediction by deep learning irrespective of co-evolution information

10.1101/2020.10.12.336859 ◽ 2020 ◽ Cited By ~ 1

Author(s): Jinbo Xu ◽ Matthew Mcpartlon ◽ Jin Li

Keyword(s): Deep Learning ◽ Protein Structure ◽ Protein Structure Prediction ◽ Protein Design ◽ Structure Prediction ◽ Model Building ◽ Evolutionary Information ◽ Designed Proteins ◽ Structure Relationship ◽ Over The Top

We describe our latest study of deep convolutional residual neural networks (ResNet) for protein structure prediction, including deeper and wider ResNets, the efficacy of different input features, and improved 3D model building methods. Our ResNet can predict correct folds (TM-score > 0.5) for 26 out of 32 CASP13 FM (template-free modeling) targets and L/5 long-range contacts for these targets with precision over 80%, a significant improvement over the CASP13 results. Although co-evolution analysis plays an important role in the most successful structure prediction methods, we show that when co-evolution is not used, our ResNet can still predict correct folds for 18 of the 32 CASP13 FM targets, including several large ones. This marks a significant improvement over the top co-evolution-based, non-deep-learning methods at CASP13, and over other non-coevolution-based deep learning models, such as the popular recurrent geometric network (RGN). With only the primary sequence, our ResNet can also predict correct folds for all 21 human-designed proteins we tested. In contrast, RGN predicts correct folds for only 3 human-designed proteins and no CASP13 FM targets. In addition, we find that ResNet may fare better on the human-designed proteins when trained without co-evolution information than with it. These results suggest that ResNet does not simply denoise co-evolution signals, but instead is able to learn important sequence-structure relationships from experimental structures. This has important implications for protein design and engineering, especially when evolutionary information is not available.
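The "L/5 long-range contact precision" quoted above can be made concrete with a short sketch. The abstract does not define the metric, so the code assumes the conventions commonly used in CASP-style evaluations: a pair is long-range when the residues are at least 24 positions apart in sequence, the L/5 highest-scoring predicted pairs are kept (L being the sequence length), and precision is the fraction of those pairs that are contacts in the native structure. The function name and the input layout (square nested lists) are hypothetical.

```python
def top_l5_long_range_precision(pred_prob, true_contact, min_sep=24):
    """Precision of the top L/5 predicted long-range contacts.

    pred_prob:    L x L nested list of predicted contact probabilities.
    true_contact: L x L nested list of booleans from the native structure.
    min_sep:      minimum sequence separation for a long-range pair
                  (24 is an assumed, commonly used threshold).
    """
    length = len(pred_prob)
    # Collect every long-range residue pair (i < j) with its predicted score.
    pairs = [
        (pred_prob[i][j], i, j)
        for i in range(length)
        for j in range(i + min_sep, length)
    ]
    # Rank by predicted probability, highest first, and keep the top L/5.
    pairs.sort(key=lambda item: item[0], reverse=True)
    top = pairs[: max(1, length // 5)]
    if not top:
        return 0.0
    hits = sum(1 for _, i, j in top if true_contact[i][j])
    return hits / len(top)
```

A precision above 0.8 from this function would correspond to the "over 80%" figure reported in the abstract, provided the evaluation used the same separation threshold and contact definition, which is an assumption here.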
