QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks

Md Hossain Shuvo; Sutanu Bhattacharya; Debswapna Bhattacharya

doi:10.1093/bioinformatics/btaa455

QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks

Bioinformatics ◽

10.1093/bioinformatics/btaa455 ◽

2020 ◽

Vol 36 (Supplement_1) ◽

pp. i285-i291 ◽

Cited By ~ 1

Author(s):

Md Hossain Shuvo ◽

Sutanu Bhattacharya ◽

Debswapna Bhattacharya

Keyword(s):

Neural Networks ◽

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Residue Level ◽

Supplementary Information ◽

Model Quality ◽

Quality Estimation ◽

Distance Information ◽

Protein Model

Abstract Motivation Protein model quality estimation, in many ways, informs protein structure prediction. Despite their tight coupling, existing model quality estimation methods do not leverage inter-residue distance information or the latest technological breakthrough in deep learning that has recently revolutionized protein structure prediction. Results We present a new distance-based single-model quality estimation method called QDeep by harnessing the power of stacked deep residual neural networks (ResNets). Our method first employs stacked deep ResNets to perform residue-level ensemble error classifications at multiple predefined error thresholds, and then combines the predictions from the individual error classifiers to estimate the quality of a protein structural model. Experimental results show that our method consistently outperforms existing state-of-the-art methods including ProQ2, ProQ3, ProQ3D, ProQ4, 3DCNN, MESHI, and VoroMQA in multiple independent test datasets across a wide range of accuracy measures, and that predicted distance information significantly contributes to the improved performance of QDeep. Availability and implementation https://github.com/Bhattacharya-Lab/QDeep. Supplementary information Supplementary data are available at Bioinformatics online.
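
The combination step described in this abstract (several per-threshold error classifiers whose outputs are merged into one quality estimate) can be illustrated with a minimal sketch. The abstract does not state the threshold values or the combination rule, so the thresholds in `THRESHOLDS`, the simple averaging, and the function names `residue_scores` and `global_score` are all assumptions made for illustration, not the QDeep implementation.

```python
from statistics import mean

# Hypothetical error thresholds in Angstroms. The abstract only says
# "multiple predefined error thresholds"; these values are an assumption.
THRESHOLDS = (1.0, 2.0, 4.0, 8.0)


def residue_scores(probs_by_threshold):
    """Average, per residue, the classifier probabilities across thresholds.

    probs_by_threshold maps each threshold to a list with one probability
    per residue: the estimated likelihood that the residue's error is below
    that threshold. Averaging is an assumed combination rule.
    """
    columns = list(probs_by_threshold.values())
    if not columns:
        raise ValueError("at least one classifier output is required")
    length = len(columns[0])
    if any(len(col) != length for col in columns):
        raise ValueError("all classifiers must score the same residues")
    return [mean(col[i] for col in columns) for i in range(length)]


def global_score(probs_by_threshold):
    """Summarize a whole model as the mean of its per-residue scores."""
    return mean(residue_scores(probs_by_threshold))


if __name__ == "__main__":
    # Made-up probabilities for a two-residue model, one list per threshold.
    example = {
        1.0: [0.2, 0.4],
        2.0: [0.4, 0.6],
        4.0: [0.8, 0.8],
        8.0: [1.0, 1.0],
    }
    print(residue_scores(example))
    print(global_score(example))
```

Averaging over thresholds loosely mirrors how multi-cutoff accuracy measures are built, but a learned or weighted combination would fit the same interface.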

QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks

10.1101/2020.01.31.928622 ◽

2020 ◽

Author(s):

Md Hossain Shuvo ◽

Sutanu Bhattacharya ◽

Debswapna Bhattacharya

Keyword(s):

Neural Networks ◽

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Residue Level ◽

Estimation Methods ◽

Model Quality ◽

Quality Estimation ◽

Distance Information ◽

Protein Model

Abstract Motivation Protein model quality estimation, in many ways, informs protein structure prediction. Despite their tight coupling, existing model quality estimation methods do not leverage inter-residue distance information or the latest technological breakthrough in deep learning that has recently revolutionized protein structure prediction. Results We present a new distance-based single-model quality estimation method called QDeep by harnessing the power of stacked deep residual neural networks (ResNets). Our method first employs stacked deep ResNets to perform residue-level ensemble error classifications at multiple predefined error thresholds, and then combines the predictions from the individual error classifiers for estimating the quality of a protein structural model. Experimental results show that our method consistently outperforms existing state-of-the-art methods including ProQ2, ProQ3, ProQ3D, ProQ4, 3DCNN, MESHI, and VoroMQA in multiple independent test datasets across a wide range of accuracy measures; and that predicted distance information significantly contributes to the improved performance of QDeep. Availability https://github.com/Bhattacharya-Lab/[email protected]

Deep Template-based Protein Structure Prediction

10.1101/2020.12.26.424433 ◽

2020 ◽

Author(s):

Fandi Wu ◽

Jinbo Xu

Keyword(s):

Protein Structure ◽

Protein Structure Prediction ◽

Random Fields ◽

Structure Prediction ◽

Conditional Random Fields ◽

3D Models ◽

Query Protein ◽

Supplementary Information ◽

Distance Information ◽

Alternating Direction

Abstract Motivation TBM (template-based modeling) is a popular method for protein structure prediction. When very good templates are not available, it is challenging to identify the best templates, build accurate sequence-template alignments and construct 3D models from alignments. Results This paper presents a new method, NDThreader (New Deep-learning Threader), to address the challenges of TBM. NDThreader first employs DRNF (deep convolutional residual neural fields), which is an integration of deep ResNet (convolutional residual neural networks) and CRF (conditional random fields), to align a query protein to templates without using any distance information. Then NDThreader uses ADMM (alternating direction method of multipliers) and DRNF to further improve sequence-template alignments by making use of predicted distance potential. Finally, NDThreader builds 3D models from a sequence-template alignment by feeding it and sequence co-evolution information into a deep ResNet to predict inter-atom distance distribution, which is then fed into PyRosetta for 3D model construction. Our experimental results on the CASP13 and CAMEO data show that our methods outperform existing ones such as CNFpred, HHpred, DeepThreader and CEthreader. NDThreader was blindly tested in CASP14 as a part of the RaptorX server, which obtained the best GDT score among all CASP14 servers on the 58 TBM targets. Availability and Implementation Available as a part of the web server at http://[email protected] Supplementary Information Supplementary data are available online.

Improved estimation of model quality using predicted inter-residue distance

Bioinformatics ◽

10.1093/bioinformatics/btab632 ◽

2021 ◽

Author(s):

Lisha Ye ◽

Peikun Wu ◽

Zhenling Peng ◽

Jianzhao Gao ◽

Jian Liu ◽

...

Keyword(s):

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Superior Performance ◽

Supplementary Information ◽

Prediction Algorithm ◽

Structure Model ◽

Single Model ◽

Model Quality ◽

Reference Models

Abstract Motivation Protein model quality assessment (QA) is an essential component of protein structure prediction. It aims to estimate the quality of a structure model and/or select the most accurate model from a pool of structure models, without knowing the native structure. QA remains a challenging task in protein structure prediction. Results Based on the inter-residue distance predicted by the recent deep learning-based structure prediction algorithm trRosetta, we developed QDistance, a new approach to the estimation of both global and local qualities. QDistance works for both single-model and multi-model inputs. We designed several distance-based features to assess the agreement between the predicted and model-derived inter-residue distances. Together with a few widely used features, they are fed into a simple yet powerful linear regression model to infer the global QA scores. The local QA scores for each structure model are predicted based on a comparative analysis with a set of selected reference models. For multi-model input, the reference models are selected from the input based on the predicted global QA scores. For single-model input, the reference models are predicted by trRosetta. With the informative distance-based features, QDistance can predict the global quality with satisfactory accuracy. Benchmark tests on the CASP13 and the CAMEO structure models suggested that QDistance was competitive with other methods. Blind tests in the CASP14 experiments showed that QDistance was robust and ranked among the top predictors. In particular, QDistance was among the top three local QA methods and made the most accurate local QA predictions for unreliable local regions. Analysis showed that this superior performance can be attributed to the inclusion of the predicted inter-residue distance. Availability and Implementation http://yanglab.nankai.edu.cn/QDistance Supplementary information Supplementary data are available at Bioinformatics online.
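
As a rough illustration of what a distance-agreement feature could look like, the sketch below compares distances measured in a structure model with a predicted distance matrix. The abstract does not define the QDistance features, so the two features here (mean absolute difference and fraction of pairs within a tolerance), the sequence-separation cutoff `min_sep`, the `tolerance` value, and the function names are hypothetical. In the described pipeline, features of this kind would then be passed to a linear regression model, which is not shown.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (x, y, z)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def agreement_features(model_coords, predicted, min_sep=3, tolerance=1.0):
    """Compare model-derived distances with predicted distances.

    model_coords: one (x, y, z) tuple per residue, e.g. C-alpha or C-beta.
    predicted:    square matrix of predicted distances in Angstroms.
    min_sep and tolerance are illustrative defaults, not published values.
    Returns (mean absolute difference, fraction of pairs within tolerance).
    """
    observed = distance_matrix(model_coords)
    n = len(model_coords)
    diffs = [
        abs(observed[i][j] - predicted[i][j])
        for i in range(n)
        for j in range(i + min_sep, n)
    ]
    if not diffs:
        raise ValueError("model is too short for the chosen min_sep")
    mean_abs_diff = sum(diffs) / len(diffs)
    frac_within = sum(d <= tolerance for d in diffs) / len(diffs)
    return mean_abs_diff, frac_within


if __name__ == "__main__":
    # Toy five-residue model laid out on a straight line, 3.8 Angstroms apart.
    coords = [(3.8 * k, 0.0, 0.0) for k in range(5)]
    perfect_prediction = distance_matrix(coords)
    print(agreement_features(coords, perfect_prediction))
```

A low mean difference and a high fraction within tolerance both indicate that the model agrees with the predicted distances.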

Protein structure prediction using multiple deep neural networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13)

Proteins Structure Function and Bioinformatics ◽

10.1002/prot.25834 ◽

2019 ◽

Vol 87 (12) ◽

pp. 1141-1148 ◽

Cited By ~ 53

Author(s):

Andrew W. Senior ◽

Richard Evans ◽

John Jumper ◽

James Kirkpatrick ◽

Laurent Sifre ◽

...

Keyword(s):

Neural Networks ◽

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Deep Neural Networks ◽

Critical Assessment

Optimal neural networks for protein-structure prediction

Physical Review E ◽

10.1103/physreve.48.1502 ◽

1993 ◽

Vol 48 (2) ◽

pp. 1502-1515 ◽

Cited By ~ 13

Author(s):

Teresa Head-Gordon ◽

Frank H. Stillinger

Keyword(s):

Neural Networks ◽

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction

AngularQA: Protein Model Quality Assessment with LSTM Networks

10.1101/560995 ◽

2019 ◽

Cited By ~ 1

Author(s):

Matthew Conover ◽

Max Staples ◽

Dong Si ◽

Miao Sun ◽

Renzhi Cao

Keyword(s):

Protein Structure ◽

Quality Assessment ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Model Quality ◽

Time Step ◽

Testing Dataset ◽

Protein Model Quality Assessment ◽

Lstm Network ◽

Protein Structure Prediction Problem

Abstract Quality Assessment (QA) plays an important role in protein structure prediction. Traditional protein QA methods rely on searching databases or comparing with other models to make predictions, which usually fails. We propose a novel protein single-model QA method built on a new representation that converts raw atom information into a series of carbon-alpha (Cα) atoms with side-chain information, defined by their dihedral angles and bond lengths to the prior residue. An LSTM network is used to predict the quality by treating each amino acid as a time-step and considering the final value returned by the LSTM cells. To the best of our knowledge, this is the first time anyone has attempted to use an LSTM model on the QA problem; furthermore, we use a new representation which has not been studied for QA. In addition to angles, we make use of sequence properties like secondary structure at each time-step, without using any database. Our model achieves an overall correlation of 0.651 on the CASP12 testing dataset. Our experiment points to new directions for the QA problem, and our method could be widely used for the protein structure prediction problem. The software is freely available at GitHub: https://github.com/caorenzhi/AngularQA
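
The angle-and-length representation mentioned in this abstract can be sketched in simplified form. The version below uses only Cα coordinates and produces, for each residue, the distance to the previous Cα and the pseudo-dihedral over the last four Cα atoms. This is an assumption-laden simplification: the side-chain information and sequence properties described in the abstract are omitted, the zero padding for the first residues is an arbitrary choice, and the names `dihedral` and `encode_trace` are hypothetical rather than taken from the AngularQA code.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in radians defined by four points (atan2 formulation)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.atan2(y, x)


def encode_trace(ca_coords):
    """Turn a C-alpha trace into one (length, dihedral) pair per residue.

    Each pair is one "time step" for a sequence model. Entries that cannot
    be computed (no prior residue, fewer than four atoms so far) are padded
    with 0.0, which is an illustrative choice.
    """
    features = []
    for i, point in enumerate(ca_coords):
        length = math.dist(ca_coords[i - 1], point) if i >= 1 else 0.0
        torsion = dihedral(*ca_coords[i - 3:i + 1]) if i >= 3 else 0.0
        features.append((length, torsion))
    return features


if __name__ == "__main__":
    # Toy planar trace with unit spacing; real C-alpha spacing is about 3.8 A.
    trace = [(0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)]
    for step in encode_trace(trace):
        print(step)
```

Each tuple would correspond to one LSTM input step; a fuller encoding would append per-residue properties such as secondary structure to the same tuple.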

Protein structure prediction using sparse NOE and RDC restraints with Rosetta in CASP13

10.1101/597724 ◽

2019 ◽

Author(s):

Georg Kuenze ◽

Jens Meiler

Keyword(s):

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

De Novo ◽

Dipolar Coupling ◽

Nuclear Overhauser Effect ◽

Residual Dipolar Coupling ◽

Comparative Modeling ◽

Model Quality ◽

Blind Test

Abstract Computational methods that produce accurate protein structure models from limited experimental data, e.g. from nuclear magnetic resonance (NMR) spectroscopy, hold great potential for biomedical research. The NMR-assisted modeling challenge in CASP13 provided a blind test to explore the capabilities and limitations of current modeling techniques in leveraging NMR data which had high sparsity, ambiguity and error rate for protein structure prediction. We describe our approach to predict the structure of these proteins leveraging the Rosetta software suite. Protein structure models were predicted de novo using a two-stage protocol. First, low-resolution models were generated with the Rosetta de novo method guided by non-ambiguous nuclear Overhauser effect (NOE) contacts and residual dipolar coupling (RDC) restraints. Second, iterative model hybridization and fragment insertion with the Rosetta comparative modeling method was used to refine and regularize models guided by all ambiguous and non-ambiguous NOE contacts and RDCs. Nine out of 16 of the Rosetta de novo models had the correct fold (GDT-TS score >45) and in three cases high-resolution models were achieved (RMSD <3.5 Å). We also show that a meta-approach applying iterative Rosetta+NMR refinement on server-predicted models which employed non-NMR-contacts and structural templates leads to substantial improvement in model quality. Integrating these data-assisted refinement strategies with innovative non-data-assisted approaches which became possible in CASP13 such as high precision contact prediction will in the near future enable structure determination for large proteins that are outside of the realm of conventional NMR.

TopModel: Template-Based Protein Structure Prediction at Low Sequence Identity Using Top-Down Consensus and Deep Neural Networks

Journal of Chemical Theory and Computation ◽

10.1021/acs.jctc.9b00825 ◽

2020 ◽

Vol 16 (3) ◽

pp. 1953-1967 ◽

Cited By ~ 6

Author(s):

Daniel Mulnaes ◽

Nicola Porta ◽

Rebecca Clemens ◽

Irina Apanasenko ◽

Jens Reiners ◽

...

Keyword(s):

Neural Networks ◽

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Deep Neural Networks ◽

Top Down ◽

Sequence Identity

PconsD: ultra rapid, accurate model quality assessment for protein structure prediction

Bioinformatics ◽

10.1093/bioinformatics/btt272 ◽

2013 ◽

Vol 29 (14) ◽

pp. 1817-1818 ◽

Cited By ~ 26

Author(s):

Marcin J. Skwark ◽

Arne Elofsson

Keyword(s):

Protein Structure ◽

Quality Assessment ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Model Quality ◽

Accurate Model ◽

Model Quality Assessment

State-of-the-art web services for de novo protein structure prediction

Briefings in Bioinformatics ◽

10.1093/bib/bbaa139 ◽

2020 ◽

Cited By ~ 1

Author(s):

Luciano A Abriata ◽

Matteo Dal Peraro

Keyword(s):

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Tertiary Structure ◽

De Novo ◽

State Of The Art ◽

Data Bank ◽

End Users ◽

Model Quality ◽

Uncharacterized Protein

Abstract Residue coevolution estimations coupled to machine learning methods are revolutionizing the ability of protein structure prediction approaches to model proteins that lack clear homologous templates in the Protein Data Bank (PDB). This was evident in the last round of the Critical Assessment of Structure Prediction (CASP), which presented several very good models for the hardest targets. Unfortunately, literature reporting on these advances often lacks digests tailored to lay end users; moreover, some of the top-ranking predictors do not provide webservers that can be used by nonexperts. How, then, can end users benefit from these advances and correctly interpret the predicted models? Here we review the web resources that biologists can use today to take advantage of these state-of-the-art methods in their research, including not only the best de novo modeling servers but also datasets of models precomputed by experts for structurally uncharacterized protein families. We highlight their features, advantages and pitfalls for predicting structures of proteins without clear templates. We present a broad range of applications that span from driving forward biochemical investigations that lack experimental structures to actually assisting experimental structure determination in X-ray diffraction, cryo-EM and other forms of integrative modeling. We also discuss issues that must be considered by users yet still require further developments, such as global and residue-wise model quality estimates and sources of residue coevolution other than monomeric tertiary structure.
