Exact methods for lattice protein models

2014 ◽  
Vol 10 (4) ◽  
Author(s):  
Martin Mann ◽  
Rolf Backofen

Abstract: Lattice protein models are well-studied abstractions of globular proteins. By discretizing the structure space and simplifying the energy model over regular proteins, they enable detailed studies of protein structure formation and evolution. However, even in the simplest lattice protein models, the prediction of optimal structures is computationally difficult. Heuristic approaches are therefore often applied to find such conformations, but they commonly find only locally optimal solutions. Nevertheless, there exist methods that guarantee to predict globally optimal structures. Currently, only one such exact approach is publicly available, namely the constraint-based protein structure prediction method and its variants. Here, we review exact approaches and derived methods. We discuss fundamental concepts like hydrophobic core construction and their use in optimal structure prediction, as well as possible applications like combinations of different energy models.
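To make the idea of a discretized structure space and a simplified energy model concrete, the following is a minimal sketch. It assumes the hydrophobic-polar (HP) model on a 3D cubic lattice, which the abstract does not name explicitly; the function name `hp_energy` and the coordinate encoding are hypothetical choices for illustration. The energy is -1 for every pair of hydrophobic (H) monomers that are lattice neighbours but not consecutive in the chain.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def manhattan(a: Coord, b: Coord) -> int:
    """Manhattan distance; equals 1 exactly for cubic-lattice neighbours."""
    return sum(abs(p - q) for p, q in zip(a, b))


def hp_energy(sequence: str, coords: List[Coord]) -> int:
    """HP-model energy of a structure: -1 per non-consecutive H-H contact.

    Assumes a 3D cubic lattice and a sequence over the alphabet {H, P}.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have equal length")
    if len(set(coords)) != len(coords):
        raise ValueError("structure is not self-avoiding")
    for a, b in zip(coords, coords[1:]):
        if manhattan(a, b) != 1:
            raise ValueError("consecutive monomers must be lattice neighbours")

    energy = 0
    for i in range(len(coords)):
        # Start at i + 2 so that chain-consecutive monomers are never counted.
        for j in range(i + 2, len(coords)):
            if sequence[i] == "H" and sequence[j] == "H":
                if manhattan(coords[i], coords[j]) == 1:
                    energy -= 1
    return energy


# A four-monomer chain folded into a square: the two ends touch.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(hp_energy("HPPH", square))
```

An exact method must certify that no self-avoiding walk has lower energy than the one it returns, whereas a heuristic only reports the best structure it happened to visit.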

Author(s):  
Raghunath Satpathy

Proteins play a vital molecular role in all living organisms. Determining protein structure experimentally is difficult, so theoretical prediction methods offer a useful alternative. Predicting the 3D structure of proteins is very important in biology because it leads to the discovery of useful drugs and enzymes, and it is currently considered an important research domain. Protein structure prediction here means identifying the tertiary structure. From a computational point of view, different models (protein representations) have been developed, along with efficient optimization methods, to predict protein structure. Bio-inspired computation is mostly used for the optimization step in solving protein structure. These algorithms have received great interest and attention in the literature in recent years. This chapter discusses the key features of five recently developed types of bio-inspired computational algorithms applied to protein structure prediction problems.


2020 ◽  
Author(s):  
Jin Li ◽  
Jinbo Xu

Abstract: Inter-residue distance prediction by deep ResNet (convolutional residual neural network) has greatly advanced protein structure prediction. Currently the most successful structure prediction methods predict distance by discretizing it into dozens of bins. Here we study how well real-valued distance can be predicted and how useful it is for 3D structure modeling by comparing it with discrete-valued prediction based upon the same deep ResNet. Different from the recent methods that predict only a single real value for the distance of an atom pair, we predict both the mean and standard deviation of a distance and then employ a novel method to fold a protein by the predicted mean and deviation. Our findings include: 1) tested on the CASP13 FM (free-modeling) targets, our real-valued distance prediction obtains 81% precision on top L/5 long-range contact prediction, much better than the best CASP13 results (70%); 2) our real-valued prediction can predict correct folds for the same number of CASP13 FM targets as the best CASP13 group, despite generating only 20 decoys for each target; 3) our method greatly outperforms a very new real-valued prediction method DeepDist in both contact prediction and 3D structure modeling; and 4) when the same deep ResNet is used, our real-valued distance prediction has 1-6% higher contact and distance accuracy than our own discrete-valued prediction, but less accurate 3D structure models.
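One way a predicted mean and standard deviation can be turned into a contact prediction is sketched below. It treats the predicted distance as normally distributed, which is an assumption made here for illustration and is not stated in the abstract. The 8 Angstrom contact cutoff is the conventional threshold, and the function name `contact_probability` is hypothetical.

```python
import math


def contact_probability(mean: float, std: float, cutoff: float = 8.0) -> float:
    """P(distance < cutoff) under an assumed Gaussian N(mean, std^2).

    This is the normal cumulative distribution function evaluated at the
    cutoff, written with math.erf from the standard library.
    """
    if std <= 0:
        raise ValueError("std must be positive")
    z = (cutoff - mean) / (std * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# A pair predicted at 6 +/- 1 Angstrom is far more likely to be in contact
# than a pair predicted at 10 +/- 1 Angstrom.
print(contact_probability(6.0, 1.0), contact_probability(10.0, 1.0))
```

Sorting residue pairs by this probability gives a ranked contact list that can be scored with the same top L/5 precision used for discrete-valued predictors.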


2020 ◽  
Author(s):  
Yu-Hao Xia ◽  
Chun-Xiang Peng ◽  
Xiao-Gen Zhou ◽  
Gui-Jun Zhang

Abstract
Motivation: The many local minima on the protein energy surface often cause traditional conformation sampling algorithms to become trapped in local basins, because they have difficulty crossing high-energy barriers. In addition, the lowest-energy conformation may not correspond to the native structure because energy models are inaccurate. This study investigates whether these two problems can be alleviated by a sequential niche technique without loss of accuracy.
Results: A sequential niche multimodal conformation sampling algorithm for protein structure prediction (SNfold) is proposed in this study. In SNfold, a derating function is designed based on the knowledge learned from previous sampling and is used to construct a series of sampling-guided energy functions. These functions help the sampling algorithm cross high-energy barriers and avoid re-sampling regions that have already been explored. With inaccurate protein energy models, a high-energy conformation that may correspond to the native structure can still be sampled using the successively updated sampling-guided energy functions. SNfold is tested on 300 benchmark proteins and 24 CASP13 FM targets. Results show that SNfold is comparable with Rosetta restrained by distance (Rosetta-dist) and C-QUARK. SNfold correctly folds (TM-score ≥ 0.5) 231 out of 300 proteins. In particular, compared with the Rosetta-dist protocol, SNfold achieves a higher average TM-score and improves the sampling efficiency by more than 100 times. On the 24 CASP13 FM targets, SNfold is also comparable with four state-of-the-art methods in the CASP13 server group. As a plugin conformation sampling algorithm, SNfold can be extended to other protein structure prediction methods.
Availability: The source code and executable versions are freely available at https://github.com/iobio-zjut/[email protected]
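The general sequential niche idea can be illustrated on a one-dimensional toy energy. The sketch below is not the SNfold derating function, whose form the abstract does not give. It assumes a simple additive, linearly decaying penalty around each minimum already found, and every name and parameter (`make_guided_energy`, `radius`, `height`) is hypothetical.

```python
from typing import Callable, List


def make_guided_energy(
    energy: Callable[[float], float],
    niches: List[float],
    radius: float,
    height: float,
) -> Callable[[float], float]:
    """Return the energy plus a penalty near every previously found minimum."""

    def guided(x: float) -> float:
        penalty = 0.0
        for centre in niches:
            distance = abs(x - centre)
            if distance < radius:
                # The penalty is largest at the niche centre and fades to
                # zero at the niche radius.
                penalty += height * (1.0 - distance / radius)
        return energy(x) + penalty

    return guided


def sequential_niche_search(
    energy: Callable[[float], float],
    grid: List[float],
    rounds: int,
    radius: float = 0.5,
    height: float = 2.0,
) -> List[float]:
    """Run a grid search repeatedly, penalizing each minimum once it is found."""
    found: List[float] = []
    for _ in range(rounds):
        guided = make_guided_energy(energy, found, radius, height)
        found.append(min(grid, key=guided))
    return found


def double_well(x: float) -> float:
    """Toy energy with two equally deep minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


grid = [i / 100 for i in range(-200, 201)]
print(sequential_niche_search(double_well, grid, rounds=2))
```

In the second round the penalized region around the first minimum is no longer attractive, so the search settles in the other basin instead of re-sampling the explored one.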


2014 ◽  
Vol 2014 ◽  
pp. 1-17
Author(s):  
Mahmood A. Rashid ◽  
Swakkhar Shatabda ◽  
M. A. Hakim Newton ◽  
Md Tamjidul Hoque ◽  
Abdul Sattar

Protein structure prediction is computationally a very challenging problem. A large number of existing search algorithms attempt to solve the problem by exploring possible structures and finding the one with the minimum free energy. However, these algorithms perform poorly on large proteins because the search space is astronomically large. In this paper, we present a multipoint spiral search framework that uses parallel processing techniques to expedite exploration by starting from different points. In our approach, a set of random initial solutions is generated and distributed to different threads. We allow each thread to run for a predefined period of time. The improved solutions are stored threadwise. When the threads finish, the solutions are merged together and the duplicates are removed. A selected distinct set of solutions is then split across different threads again. In our ab initio protein structure prediction method, we use the three-dimensional face-centred-cubic lattice for structure-backbone mapping. We use both the low-resolution hydrophobic-polar energy model and the high-resolution 20×20 energy model to guide the search. The experimental results show that our new parallel framework significantly improves on the results obtained by the state-of-the-art single-point search approaches for both energy models on the three-dimensional face-centred-cubic lattice. We also experimentally show the effectiveness of mixing energy models within parallel threads.
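The merge, deduplicate and redistribute step described above can be sketched as follows. The abstract does not say how the distinct set is selected, so this sketch assumes the lowest-energy solutions are kept and dealt out round-robin. The solution encoding (an energy paired with a conformation string) and the name `merge_and_split` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: (energy, conformation written as a move string).
Solution = Tuple[float, str]


def merge_and_split(
    per_thread: List[List[Solution]], n_threads: int, keep: int
) -> List[List[Solution]]:
    """Merge thread results, drop duplicates, and redistribute the best ones."""
    # Merge: keep one entry per distinct conformation (its lowest energy).
    best: Dict[str, float] = {}
    for solutions in per_thread:
        for energy, conformation in solutions:
            if conformation not in best or energy < best[conformation]:
                best[conformation] = energy

    # Select: assumed criterion, the `keep` lowest-energy distinct solutions.
    ranked = sorted((energy, conf) for conf, energy in best.items())
    selected = ranked[:keep]

    # Split: deal the selected solutions to the threads round-robin.
    buckets: List[List[Solution]] = [[] for _ in range(n_threads)]
    for index, solution in enumerate(selected):
        buckets[index % n_threads].append(solution)
    return buckets


results = [
    [(-3.0, "FLU"), (-1.0, "FFL")],  # solutions stored by thread 0
    [(-3.0, "FLU"), (-2.0, "LLU")],  # solutions stored by thread 1
]
print(merge_and_split(results, n_threads=2, keep=3))
```

Round-robin dealing spreads the best solutions across threads, so the next round does not concentrate all promising starting points in one worker.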


2010 ◽  
Vol 18 (2) ◽  
pp. 255-275 ◽  
Author(s):  
Milan Mijajlovic ◽  
Mark J. Biggs ◽  
Dusan P. Djurdjevic

Ab initio protein structure prediction involves determination of the three-dimensional (3D) conformation of proteins on the basis of their amino acid sequence, a potential energy (PE) model that captures the physics of the interatomic interactions, and a method to search for and identify the global minimum in the PE (or free energy) surface such as an evolutionary algorithm (EA). Many PE models have been proposed over the past three decades and more. There is currently no understanding of how the behavior of an EA is affected by the PE model used. The study reported here shows that the EA behavior can be profoundly affected: the EA performance obtained when using the ECEPP PE model is significantly worse than that obtained when using the Amber, OPLS, and CVFF PE models, and the optimal EA control parameter values for the ECEPP model also differ significantly from those associated with the other models.


2022 ◽  
Author(s):  
Agata Paulina Perlinska ◽  
Wanda Helena Niemyska ◽  
Bartosz Ambrozy Gren ◽  
Pawel Rubach ◽  
Joanna Ida Sulkowska

AlphaFold is a new, highly accurate machine learning protein structure prediction method that outperforms other methods. Recently this method was used to predict the structure of 98.5% of human proteins. We analyze here the structure of these AlphaFold-predicted human proteins for the presence of knots. We found that the human proteome contains 65 robustly knotted proteins, including the most complex type of knot yet reported in proteins. That knot type, denoted 6_3 in mathematical notation, would necessitate a more complex folding path than any knotted protein characterized to date. In some cases AlphaFold structure predictions are not highly accurate, which either makes their topology hard to verify or results in topological artifacts. We deposited the other structures that we found, whether knotted, potentially knotted, or containing artifacts (knots), in a database available at: https://knotprot.cent.uw.edu.pl/alphafold.


2021 ◽  
Author(s):  
Zhiye Guo ◽  
Tianqi Wu ◽  
Jian Liu ◽  
Jie Hou ◽  
Jianlin Cheng

Abstract: Accurate prediction of residue-residue distances is important for protein structure prediction. We developed several protein distance predictors based on a deep learning distance prediction method and blindly tested them in the 14th Critical Assessment of Protein Structure Prediction (CASP14). The prediction method uses deep residual neural networks with the channel-wise attention mechanism to classify the distance between every two residues into multiple distance intervals. The input features for the deep learning method include co-evolutionary features as well as other sequence-based features derived from multiple sequence alignments (MSAs). Three alignment methods are used with multiple protein sequence/profile databases to generate MSAs for input feature generation. Based on different configurations and training strategies of the deep learning method, five MULTICOM distance predictors were created to participate in the CASP14 experiment. Benchmarked on 37 hard CASP14 domains, the best performing MULTICOM predictor is ranked 5th out of 30 automated CASP14 distance prediction servers in terms of precision of top L/5 long-range contact predictions (i.e. classifying distances between two residues into two categories: in contact (< 8 Angstrom) and not in contact otherwise) and performs better than the best CASP13 distance prediction method. The best performing MULTICOM predictor is also ranked 6th among automated server predictors in classifying inter-residue distances into 10 distance intervals defined by CASP14 according to the F1 measure. The results show that the quality and depth of MSAs depend on alignment methods and sequence databases and have a significant impact on the accuracy of distance prediction. Using larger training datasets and multiple complementary features improves prediction accuracy. However, the number of effective sequences in MSAs is only a weak indicator of the quality of MSAs and the accuracy of predicted distance maps. In contrast, there is a strong correlation between the accuracy of contact/distance predictions and the average probability of the predicted contacts, which can therefore be more effectively used to estimate the confidence of distance predictions and select predicted distance maps.
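The top L/5 long-range contact precision mentioned above can be computed as in the following minimal sketch. The 8 Angstrom contact cutoff comes from the abstract. The sequence-separation threshold of 24 residues for "long-range" is the conventional CASP definition and is an assumption here, since the abstract does not state it. The function name and the nested-list matrix layout are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def top_l5_long_range_precision(
    prob: Matrix, true_dist: Matrix, min_sep: int = 24, cutoff: float = 8.0
) -> float:
    """Fraction of the top L/5 long-range predicted contacts that are true.

    prob[i][j] is the predicted contact probability and true_dist[i][j] the
    native distance in Angstrom; only the upper triangle (i < j) is read.
    """
    length = len(prob)
    # Collect every residue pair at least min_sep apart in the sequence.
    candidates = [
        (prob[i][j], i, j)
        for i in range(length)
        for j in range(i + min_sep, length)
    ]
    if not candidates:
        raise ValueError("no residue pairs satisfy the separation threshold")

    # Rank by predicted probability and keep the L/5 most confident pairs.
    candidates.sort(key=lambda item: item[0], reverse=True)
    top = candidates[: max(1, length // 5)]
    hits = sum(1 for _, i, j in top if true_dist[i][j] < cutoff)
    return hits / len(top)


# Toy example with L = 10 and a reduced separation threshold of 3.
n = 10
prob = [[0.0] * n for _ in range(n)]
dist = [[20.0] * n for _ in range(n)]
prob[0][9], dist[0][9] = 0.9, 5.0   # confident and truly in contact
prob[1][8], dist[1][8] = 0.8, 12.0  # confident but not in contact
prob[0][1], dist[0][1] = 0.99, 3.8  # short-range pair, ignored by the metric
print(top_l5_long_range_precision(prob, dist, min_sep=3))
```

The average predicted probability over the same top-ranked pairs is the kind of confidence estimate the abstract reports as correlating strongly with accuracy.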

