Faculty Opinions recommendation of Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era.

Abstract. Motivation: Residue contact prediction was revolutionized recently by the introduction of direct coupling analysis (DCA). Further improvements, in particular for small families, have been obtained by the combination of DCA and deep learning methods. However, existing deep learning contact prediction methods often rely on a number of external programs and are therefore computationally expensive. Results: Here, we introduce a novel contact predictor, PconsC4, which performs on par with state-of-the-art methods. PconsC4 is heavily optimized, does not use any external programs and therefore is significantly faster and easier to use than other methods. Availability: PconsC4 is freely available under the GPL license from https://github.com/ElofssonLab/PconsC4. Installation is easy using the pip command and works on any system with Python 3.5 or later and a modern GCC compiler.

Combination of deep neural network with attention mechanism enhances the explainability of protein contact prediction

10.1101/2020.09.04.283937 ◽

2020 ◽

Author(s):

Chen Chen ◽

Tianqi Wu ◽

Zhiye Guo ◽

Jianlin Cheng

Keyword(s):

Neural Network ◽

Deep Learning ◽

Attention Mechanism ◽

Contact Prediction ◽

Residue Contact ◽

Complementary Effect ◽

Essential Components ◽

Internal Mechanism ◽

Contact Predictions ◽

The Relationship

Abstract: Deep learning has emerged as a revolutionary technology for protein residue-residue contact prediction since the 2012 CASP10 competition. Considerable advancements in the predictive power of the deep learning-based contact predictions have been achieved since then. However, little effort has been put into interpreting the black-box deep learning methods. Algorithms that can interpret the relationship between predicted contact maps and the internal mechanism of the deep learning architectures are needed to explore the essential components of contact inference and improve their explainability. In this study, we present an attention-based convolutional neural network for protein contact prediction, which consists of two attention mechanism-based modules: sequence attention and regional attention. Our benchmark results on the CASP13 free-modeling (FM) targets demonstrate that the two attention modules added on top of existing typical deep learning models exhibit a complementary effect that contributes to predictive improvements. More importantly, the inclusion of the attention mechanism provides interpretable patterns that contain useful insights into the key fold-determining residues in proteins. We expect the attention-based model can provide a reliable and practically interpretable technique that helps break the current bottlenecks in explaining deep neural networks for contact prediction.

Residue contacts predicted by evolutionary covariance extend the application of ab initio molecular replacement to larger and more challenging protein folds

IUCrJ ◽

10.1107/s2052252516008113 ◽

2016 ◽

Vol 3 (4) ◽

pp. 259-270 ◽

Cited By ~ 11

Author(s):

Felix Simkovic ◽

Jens M. H. Thomas ◽

Ronan M. Keegan ◽

Martyn D. Winn ◽

Olga Mayans ◽

...

Keyword(s):

Ab Initio ◽

Structure Prediction ◽

Sequence Information ◽

Protein Targets ◽

Residue Contact ◽

Residue Contacts ◽

Structure Solution ◽

Improved Performance ◽

Model Ensembles ◽

Contact Predictions

For many protein families, the deluge of new sequence information together with new statistical protocols now allow the accurate prediction of contacting residues from sequence information alone. This offers the possibility of more accurate ab initio (non-homology-based) structure prediction. Such models can be used in structure solution by molecular replacement (MR) where the target fold is novel or is only distantly related to known structures. Here, AMPLE, an MR pipeline that assembles search-model ensembles from ab initio structure predictions ('decoys'), is employed to assess the value of contact-assisted ab initio models to the crystallographer. It is demonstrated that evolutionary covariance-derived residue–residue contact predictions improve the quality of ab initio models and, consequently, the success rate of MR using search models derived from them. For targets containing β-structure, decoy quality and MR performance were further improved by the use of a β-strand contact-filtering protocol. Such contact-guided decoys achieved 14 structure solutions from 21 attempted protein targets, compared with nine for simple Rosetta decoys. Previously encountered limitations were superseded in two key respects. Firstly, much larger targets of up to 221 residues in length were solved, which is far larger than the previously benchmarked threshold of 120 residues. Secondly, contact-guided decoys significantly improved success with β-sheet-rich proteins. Overall, the improved performance of contact-guided decoys suggests that MR is now applicable to a significantly wider range of protein targets than were previously tractable, and points to a direct benefit to structural biology from the recent remarkable advances in sequencing.

COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization

Membranes ◽

10.3390/membranes11070503 ◽

2021 ◽

Vol 11 (7) ◽

pp. 503

Author(s):

Md. Selim Reza ◽

Huiling Zhang ◽

Md. Tofazzal Hossain ◽

Langxi Jin ◽

Shengzhong Feng ◽

...

Keyword(s):

Prediction Accuracy ◽

Linear Optimization ◽

Mixed Integer ◽

Consensus Method ◽

Contact Prediction ◽

Residue Contact ◽

Mixed Integer Linear Optimization ◽

Test Sets ◽

Contact Predictions ◽

Integer Linear Optimization

Protein contact prediction helps reconstruct the tertiary structure that greatly determines a protein’s function; therefore, contact prediction from the sequence is an important problem. Recently there has been exciting progress on this problem, but many of the existing methods still have low prediction accuracy. In this paper, we present a new mixed integer linear programming (MILP)-based consensus method: a Consensus scheme based On a Mixed integer linear opTimization method for prOtein contact Prediction (COMTOP). The MILP-based consensus method combines the strengths of seven selected protein contact prediction methods, namely CCMpred, EVfold, DeepCov, NNcon, PconsC4, plmDCA, and PSICOV, by optimizing the number of correctly predicted contacts and achieving a better prediction accuracy. The proposed hybrid protein residue–residue contact prediction scheme was tested on four independent test sets. For 239 highly non-redundant proteins, the method showed a prediction accuracy of 59.68%, 70.79%, 78.86%, 89.04%, 94.51%, and 97.35% for top-5L, top-3L, top-2L, top-L, top-L/2, and top-L/5 contacts, respectively. When tested on the CASP13 and CASP14 test sets, the proposed method obtained accuracies of 75.91% and 77.49% for top-L/5 predictions, respectively. COMTOP was further tested on 57 non-redundant α-helical transmembrane proteins and achieved prediction accuracies of 64.34% and 73.91% for top-L/2 and top-L/5 predictions, respectively. For all test datasets, the improvement of COMTOP in accuracy over the seven individual methods increased with the number of predicted contacts. For example, COMTOP performed much better for large numbers of contact predictions (such as top-5L and top-3L) than for small numbers of contact predictions (such as top-L/2 and top-L/5). The results and analysis demonstrate that COMTOP can significantly improve the performance of the individual methods; therefore, COMTOP is more robust against different types of test sets. COMTOP also showed better or comparable predictions when compared with the state-of-the-art predictors.
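
The top-L/k accuracies quoted in this abstract can be illustrated with a minimal sketch, assuming the usual convention that the highest-scoring residue pairs are kept (L is the sequence length) and the fraction of them that are true contacts is reported. The function name `top_precision`, the `min_separation` filter, and the toy data are assumptions made for illustration; none of them is taken from the COMTOP paper.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # residue indices (i, j) with i < j


def top_precision(
    scores: Dict[Pair, float],
    true_contacts: Set[Pair],
    seq_len: int,
    numerator: int = 1,
    denominator: int = 1,
    min_separation: int = 6,  # assumed filter, not stated in the abstract
) -> float:
    """Fraction of the top (numerator * L / denominator) pairs that are true contacts."""
    # Number of predictions to keep, e.g. top-L/5 -> numerator=1, denominator=5.
    n_top = max(1, (seq_len * numerator) // denominator)
    # Drop pairs that are too close along the chain.
    eligible = [
        (pair, score)
        for pair, score in scores.items()
        if pair[1] - pair[0] >= min_separation
    ]
    # Rank by predicted score, highest first, and keep the top n_top.
    ranked = sorted(eligible, key=lambda item: item[1], reverse=True)[:n_top]
    if not ranked:
        return 0.0
    hits = sum(1 for pair, _ in ranked if pair in true_contacts)
    return hits / len(ranked)


# Toy example for a hypothetical 10-residue protein.
toy_scores = {
    (0, 8): 0.9, (1, 9): 0.8, (2, 9): 0.7,
    (0, 7): 0.6, (3, 9): 0.5, (0, 1): 0.99,  # (0, 1) is filtered out
}
toy_true = {(0, 8), (2, 9), (3, 9)}

print(top_precision(toy_scores, toy_true, 10, 1, 5))  # top-L/5 -> 0.5
print(top_precision(toy_scores, toy_true, 10, 1, 2))  # top-L/2 -> 0.6
```

Passing the list length as an integer ratio avoids floating-point rounding when computing how many pairs to keep; top-5L would be `numerator=5, denominator=1`.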

Protein model accuracy estimation empowered by deep learning and inter-residue distance prediction in CASP14

10.1101/2021.01.31.428975 ◽

2021 ◽

Author(s):

Xiao Chen ◽

Jian Liu ◽

Zhiye Guo ◽

Tianqi Wu ◽

Jie Hou ◽

...

Keyword(s):

Deep Learning ◽

Structure Prediction ◽

Structural Models ◽

Single Model ◽

Model Accuracy ◽

Model Quality ◽

Residue Contact ◽

Contact Distance ◽

Protein Model ◽

Contact Predictions

Abstract: Inter-residue contact prediction and deep learning showed promise for improving the estimation of protein model accuracy (EMA) in the 13th Critical Assessment of Protein Structure Prediction (CASP13). During the 2020 CASP14 experiment, we developed and tested several EMA predictors that used deep learning with new features based on inter-residue distance/contact predictions as well as existing model quality features. The average global distance test (GDT-TS) score loss of ranking CASP14 structural models by three multi-model MULTICOM EMA predictors (MULTICOM-CONSTRUCT, MULTICOM-AI, and MULTICOM-CLUSTER) is 0.073, 0.079, and 0.081, respectively, placing them first, second, and third among 68 CASP14 EMA predictors. The single-model EMA predictor (MULTICOM-DEEP) ranked 10th among all the single-model EMA methods in terms of GDT-TS score loss. The results show that deep learning and contact/distance predictions are useful in ranking and selecting protein structural models.
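
The GDT-TS score loss reported in this abstract can be sketched as follows, assuming the commonly used definition: for each target, the loss is the true GDT-TS of the best model in the pool minus the true GDT-TS of the model the predictor ranks first, averaged over targets. The abstract itself does not spell out the formula, so this definition, the function names, and the toy scores below are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # model name -> score on a 0-1 scale


def ranking_loss(predicted_quality: Scores, true_gdt_ts: Scores) -> float:
    """Loss for one target: best true GDT-TS minus true GDT-TS of the top-ranked model."""
    # Model the EMA predictor would select (highest predicted quality).
    top_ranked = max(predicted_quality, key=predicted_quality.get)
    # Best model actually available in the pool.
    best_possible = max(true_gdt_ts.values())
    return best_possible - true_gdt_ts[top_ranked]


def mean_ranking_loss(targets: List[Tuple[Scores, Scores]]) -> float:
    """Average per-target loss over (predicted, true) score pairs."""
    losses = [ranking_loss(pred, true) for pred, true in targets]
    return sum(losses) / len(losses)


# Toy data for two hypothetical targets.
target_a = (
    {"m1": 0.80, "m2": 0.60, "m3": 0.70},  # predicted quality
    {"m1": 0.50, "m2": 0.65, "m3": 0.40},  # true GDT-TS
)
target_b = (
    {"m1": 0.30, "m2": 0.90},
    {"m1": 0.20, "m2": 0.70},
)

print(round(ranking_loss(*target_a), 3))                   # 0.15: m1 picked, m2 was best
print(round(ranking_loss(*target_b), 3))                   # 0.0: the best model was picked
print(round(mean_ranking_loss([target_a, target_b]), 3))   # 0.075
```

A loss of zero means the predictor always selected the best available model, so lower values indicate better ranking.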

Correction for Kamisetty et al., Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1319550110 ◽

2013 ◽

Vol 110 (46) ◽

pp. 18734-18734 ◽

Cited By ~ 4

Keyword(s):

Residue Contact ◽

Contact Predictions

PconsC4: fast, accurate and hassle-free contact predictions

Bioinformatics ◽

10.1093/bioinformatics/bty1036 ◽

2018 ◽

Vol 35 (15) ◽

pp. 2677-2679 ◽

Cited By ~ 15

Author(s):

Mirco Michel ◽

David Menéndez Hurtado ◽

Arne Elofsson

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Supplementary Information ◽

Prediction Methods ◽

Coupling Analysis ◽

Contact Prediction ◽

Residue Contact ◽

Direct Coupling Analysis ◽

Computationally Expensive ◽

Contact Predictions

Abstract. Motivation: Residue contact prediction was revolutionized recently by the introduction of direct coupling analysis (DCA). Further improvements, in particular for small families, have been obtained by the combination of DCA and deep learning methods. However, existing deep learning contact prediction methods often rely on a number of external programs and are therefore computationally expensive. Results: Here, we introduce a novel contact predictor, PconsC4, which performs on par with state-of-the-art methods. PconsC4 is heavily optimized, does not use any external programs and therefore is significantly faster and easier to use than other methods. Availability and implementation: PconsC4 is freely available under the GPL license from https://github.com/ElofssonLab/PconsC4. Installation is easy using the pip command and works on any system with Python 3.5 or later and a GCC compiler. It does not require a GPU or special hardware. Supplementary information: Supplementary data are available at Bioinformatics online.

Evaluation of residue-residue contact predictions in CASP9

Proteins Structure Function and Bioinformatics ◽

10.1002/prot.23160 ◽

2011 ◽

Vol 79 (S10) ◽

pp. 119-125 ◽

Cited By ~ 40

Author(s):

Bohdan Monastyrskyy ◽

Krzysztof Fidelis ◽

Anna Tramontano ◽

Andriy Kryshtafovych

Keyword(s):

Residue Contact ◽

Contact Predictions

Molecular replacement using structure predictions from databases

Acta Crystallographica Section D Structural Biology ◽

10.1107/s2059798319013962 ◽

2019 ◽

Vol 75 (12) ◽

pp. 1051-1062 ◽

Cited By ~ 4

Author(s):

Adam J. Simpkin ◽

Jens M. H. Thomas ◽

Felix Simkovic ◽

Ronan M. Keegan ◽

Daniel J. Rigden

Keyword(s):

Ab Initio ◽

Large Scale ◽

De Novo ◽

Molecular Replacement ◽

Macromolecular Crystallography ◽

Phase Problem ◽

Search Models ◽

Residue Contact ◽

Single Structure ◽

Contact Predictions

Molecular replacement (MR) is the predominant route to solution of the phase problem in macromolecular crystallography. Where the lack of a suitable homologue precludes conventional MR, one option is to predict the target structure using bioinformatics. Such modelling, in the absence of homologous templates, is called ab initio or de novo modelling. Recently, the accuracy of such models has improved significantly as a result of the availability, in many cases, of residue-contact predictions derived from evolutionary covariance analysis. Covariance-assisted ab initio models representing structurally uncharacterized Pfam families are now available on a large scale in databases, potentially representing a valuable and easily accessible supplement to the PDB as a source of search models. Here, the unconventional MR pipeline AMPLE is employed to explore the value of structure predictions in the GREMLIN and PconsFam databases. It was tested whether these deposited predictions, processed in various ways, could solve the structures of PDB entries that were subsequently deposited. The results were encouraging: nine of 27 GREMLIN cases were solved, covering target lengths of 109–355 residues and a resolution range of 1.4–2.9 Å, and with target–model shared sequence identity as low as 20%. The cluster-and-truncate approach in AMPLE proved to be essential for most successes. For the overall lower quality structure predictions in the PconsFam database, remodelling with Rosetta within the AMPLE pipeline proved to be the best approach, generating ensemble search models from single-structure deposits. Finally, it is shown that the AMPLE-obtained search models deriving from GREMLIN deposits are of sufficiently high quality to be selected by the sequence-independent MR pipeline SIMBAD. Overall, the results help to point the way towards the optimal use of the expanding databases of ab initio structure predictions.

Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1314045110 ◽

2013 ◽

Vol 110 (39) ◽

pp. 15674-15679 ◽

Cited By ~ 370

Author(s):

H. Kamisetty ◽

S. Ovchinnikov ◽

D. Baker

Keyword(s):

Residue Contact ◽

Contact Predictions
