Improved computational methods of protein sequence alignment, model selection and tertiary structure prediction

Mapping Intimacies ◽

10.32469/10355/46126 ◽

2013 ◽

Author(s):

Xin Deng

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Model Selection ◽

Sequence Alignment ◽

Protein Sequence ◽

Structure Prediction ◽

Tertiary Structure ◽

Solvent Accessibility ◽

Relative Solvent Accessibility ◽

Tertiary Structure Prediction

Protein sequence and profile alignment is essential to most bioinformatics tasks, such as protein structure modeling, function prediction, and phylogenetic analysis. We designed a new algorithm, MSACompro, to incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into multiple protein sequence alignment. Our experiments showed that it improved multiple sequence alignment accuracy over most existing methods that do not use structural information, and that it performed comparably to the method using structural features and additional homologous sequences, with only slightly lower scores. We also developed HHpacom, a new profile-profile pairwise alignment method that integrates secondary structure, solvent accessibility, torsion angle, and inferred residue pair coupling information. The evaluation showed that the secondary structure, relative solvent accessibility, and torsion angle information significantly improved alignment accuracy in comparison with the state-of-the-art methods HHsearch and HHsuite. The evolutionary constraint information did help in some cases, especially in alignments of short proteins, typically 100 to 500 residues long. Protein model selection is also a key step in protein tertiary structure prediction. We developed two SVM model quality assessment methods that take the query-template alignment as input. The assessment results illustrated that this could help improve model selection, protein structure prediction, and many other bioinformatics problems. Moreover, we developed a protein tertiary structure prediction pipeline, many components of which were built in our group's MULTICOM system. MULTICOM performed well in the CASP10 (Critical Assessment of Techniques for Protein Structure Prediction) competition.
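
As a rough illustration of how predicted structural features can enter an alignment score, the Python sketch below adds a secondary structure bonus and a solvent accessibility bonus to a plain substitution score. It is not the MSACompro or HHpacom scoring function, which the abstract does not specify; the function name `pair_score`, the weights `w_ss` and `w_rsa`, and the toy substitution table are all assumptions made for this example.

```python
def pair_score(a, b, ss_a, ss_b, rsa_a, rsa_b, subst, w_ss=0.5, w_rsa=0.5):
    """Score aligning residue a with residue b (hypothetical scoring scheme).

    ss_a, ss_b   -- predicted secondary structure states, e.g. "H", "E", "C"
    rsa_a, rsa_b -- predicted relative solvent accessibility in [0, 1]
    subst        -- substitution scores keyed by residue pair
    """
    # Sequence term: look the pair up in either order, default to 0.
    base = subst.get((a, b), subst.get((b, a), 0.0))
    # Structure term: reward identical predicted secondary structure states.
    ss_bonus = w_ss if ss_a == ss_b else 0.0
    # Accessibility term: reward similar predicted burial.
    rsa_bonus = w_rsa * (1.0 - abs(rsa_a - rsa_b))
    return base + ss_bonus + rsa_bonus


# Toy substitution table (illustrative values, not a real matrix).
toy_subst = {("A", "A"): 4.0, ("A", "G"): 0.0, ("G", "G"): 6.0}

matched = pair_score("A", "A", "H", "H", 0.2, 0.2, toy_subst)
mismatched = pair_score("G", "A", "H", "E", 0.0, 1.0, toy_subst)
```

With such a score, two positions that agree in predicted structure are favored over positions that agree only in sequence, which is the general idea behind adding structural features to an aligner.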


ROPIUS0: A deep learning-based protocol for protein structure prediction and model selection and its performance in CASP14

10.1101/2021.06.22.449457 ◽

2021 ◽

Author(s):

Mindaugas Margelevicius

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Model Selection ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Tertiary Structure ◽

Homologous Proteins ◽

Tertiary Structure Prediction ◽

Direct Use ◽

Selection Of

A protocol, ROPIUS0, for protein structure prediction and model selection is presented. At the core of the ROPIUS0 protocol is a deep learning module developed for the selection of protein structural models. It is shown that the direct use of predicted inter-residue distances may be sufficient to discriminate between correct and incorrect protein folds, even when only a small fraction of the predicted distances is considered. In the recently completed CASP14 prediction season, a ROPIUS0 variant based on model selection ranked 13th in the category of tertiary structure prediction. Its performance is on par with top-performing automated prediction servers when tested on the CASP13 dataset. The results suggest ways to improve searching for structurally similar and homologous proteins without a considerable loss of speed.
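
The idea of ranking candidate models by how well they reproduce a small set of predicted inter-residue distances can be sketched in Python as follows. This is only an illustration under simple assumptions, not the ROPIUS0 deep learning module: the function names, the 2.0 Å tolerance, and the toy coordinates are hypothetical.

```python
import math


def distance_agreement(coords, predicted, tolerance=2.0):
    """Fraction of predicted residue-pair distances that a model reproduces.

    coords    -- list of (x, y, z) tuples, one per residue
    predicted -- dict mapping residue index pairs (i, j) to a distance
    tolerance -- allowed absolute deviation (assumed value)
    """
    if not predicted:
        return 0.0
    hits = 0
    for (i, j), d_pred in predicted.items():
        d_model = math.dist(coords[i], coords[j])
        if abs(d_model - d_pred) <= tolerance:
            hits += 1
    return hits / len(predicted)


def select_model(models, predicted):
    """Return the name of the model with the highest agreement."""
    return max(models, key=lambda name: distance_agreement(models[name], predicted))


# Two toy four-residue models and two predicted distances.
models = {
    "extended": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)],
    "compact": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)],
}
predicted = {(0, 2): 7.6, (1, 3): 7.6}
best = select_model(models, predicted)
```

Only the residue pairs listed in `predicted` are checked, which mirrors the point that a small fraction of the predicted distances may already separate plausible folds from implausible ones.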


Template-based prediction of protein structure with deep learning

BMC Genomics ◽

10.1186/s12864-020-07249-8 ◽

2020 ◽

Vol 21 (S11) ◽

Author(s):

Haicang Zhang ◽

Yufeng Shen

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Structure Prediction ◽

Tertiary Structure ◽

Query Sequence ◽

Dynamic Programming Algorithm ◽

Tertiary Structure Prediction ◽

Protein Tertiary Structure ◽

Protein Threading ◽

Protein Tertiary Structure Prediction

Background: Accurate prediction of protein structure is fundamentally important to understand biological function of proteins. Template-based modeling, including protein threading and homology modeling, is a popular method for protein tertiary structure prediction. However, accurate template-query alignment and template selection are still very challenging, especially for the proteins with only distant homologs available.
Results: We propose a new template-based modelling method called ThreaderAI to improve protein tertiary structure prediction. ThreaderAI formulates the task of aligning query sequence with template as the classical pixel classification problem in computer vision and naturally applies deep residual neural network in prediction. ThreaderAI first employs deep learning to predict residue-residue aligning probability matrix by integrating sequence profile, predicted sequential structural features, and predicted residue-residue contacts, and then builds template-query alignment by applying a dynamic programming algorithm on the probability matrix. We evaluated our methods both in generating accurate template-query alignment and protein threading. Experimental results show that ThreaderAI outperforms currently popular template-based modelling methods HHpred, CNFpred, and the latest contact-assisted method CEthreader, especially on the proteins that do not have close homologs with known structures. In particular, in terms of alignment accuracy measured with TM-score, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 56, 13, and 11%, respectively, on template-query pairs at the similarity of fold level from SCOPe data. And on CASP13’s TBM-hard data, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 16, 9 and 8% in terms of TM-score, respectively.
Conclusions: These results demonstrate that with the help of deep learning, ThreaderAI can significantly improve the accuracy of template-based structure prediction, especially for distant-homology proteins.
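
The second stage described above, building an alignment from a residue-residue aligning probability matrix by dynamic programming, can be sketched as a Needleman-Wunsch-style recursion. This is a minimal Python sketch, not ThreaderAI's implementation: the abstract does not describe its gap handling, so the linear `gap` penalty and the function name are assumptions.

```python
def align_from_probabilities(prob, gap=0.0):
    """Global alignment maximizing the summed aligning probabilities.

    prob -- prob[i][j] is the predicted probability that query residue i
            aligns with template residue j
    gap  -- penalty per unaligned residue (assumed linear gap model)
    Returns a list of aligned (query_index, template_index) pairs.
    """
    n, m = len(prob), len(prob[0])
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] - gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + prob[i - 1][j - 1],  # align i with j
                score[i - 1][j] - gap,                     # skip query residue
                score[i][j - 1] - gap,                     # skip template residue
            )
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + prob[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


diagonal = align_from_probabilities(
    [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.7]]
)
gapped = align_from_probabilities([[0.9, 0.0, 0.0], [0.0, 0.0, 0.9]])
```

In the second call the middle template residue has no probable partner in the query, so the traceback skips it and pairs the remaining residues.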


In Silico Study of Secondary Structure of Hemoglobin Protein

Research Journal of Pharmacy and Technology ◽

10.52711/0974-360x.2021.01080 ◽

2021 ◽

pp. 6245-6249

Author(s):

Roma Chandra

Keyword(s):

Secondary Structure ◽

Protein Sequence ◽

Structure Prediction ◽

Tertiary Structure ◽

Secondary Structure Prediction ◽

Three Dimensional ◽

Protein Secondary Structure ◽

Alpha Helix ◽

Prediction Methods ◽

Protein Secondary Structures

Protein structure prediction is one of the important goals in bioinformatics and biotechnology. Prediction methods cover both the secondary and tertiary structures of proteins. Protein secondary structure prediction infers the presence of helices, sheets, and coils in a polypeptide chain, whereas protein tertiary structure prediction infers the three-dimensional structure of a protein. Protein secondary structures are the recurring motifs, predicted as patterns from the primary protein sequence, that take the form of alpha helices, beta strands, and coils. Secondary structure prediction is useful because it provides information about the structure and function of an unknown protein sequence. Various secondary structure prediction methods are used to predict helices, sheets, and coils, and various prediction tools based on these methods are under study. This study predicts the secondary structure of hemoglobin using several tools. The results give the percentage of amino acids that form helices, sheets, and coils. Of all the tools used, PHD and DSC produced the best results.
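
The percentages mentioned above can be computed directly from a per-residue prediction. The Python sketch below assumes the prediction is given as a three-state string using the letters H (helix), E (strand), and C (coil); the output formats of PHD, DSC, and the other tools may differ, and the example string is made up rather than a hemoglobin prediction.

```python
from collections import Counter


def ss_composition(ss):
    """Percentage of residues predicted in each of the states H, E, and C."""
    if not ss:
        raise ValueError("empty secondary structure string")
    counts = Counter(ss)
    return {state: 100.0 * counts[state] / len(ss) for state in "HEC"}


# Made-up ten-residue prediction, for illustration only.
composition = ss_composition("HHHHEECCCC")
```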


TMBpro: secondary structure, β-contact and tertiary structure prediction of transmembrane β-barrel proteins

Bioinformatics ◽

10.1093/bioinformatics/btm548 ◽

2007 ◽

Vol 24 (4) ◽

pp. 513-520 ◽

Cited By ~ 59

Author(s):

Arlo Randall ◽

Jianlin Cheng ◽

Michael Sweredoski ◽

Pierre Baldi

Keyword(s):

Secondary Structure ◽

Structure Prediction ◽

Tertiary Structure ◽

Tertiary Structure Prediction


Prospects for Tertiary Structure Prediction of RNA Based on Secondary Structure Information

Journal of Chemical Information and Modeling ◽

10.1021/ci2003413 ◽

2012 ◽

Vol 52 (2) ◽

pp. 557-567 ◽

Cited By ~ 5

Author(s):

Satoshi Yamasaki ◽

Shugo Nakamura ◽

Kazuhiko Fukui

Keyword(s):

Secondary Structure ◽

Structure Prediction ◽

Tertiary Structure ◽

Secondary Structure Information ◽

Tertiary Structure Prediction ◽

Structure Information


CopulaNet: Learning residue co-evolution directly from multiple sequence alignment for protein structure prediction

10.1101/2020.10.06.327585 ◽

2020 ◽

Author(s):

Fusong Ju ◽

Jianwei Zhu ◽

Bin Shao ◽

Lupeng Kong ◽

Tie-Yan Liu ◽

...

Keyword(s):

Protein Structure ◽

Protein Structure Prediction ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structure Prediction ◽

Tertiary Structure ◽

Query Protein ◽

Spatial Proximity ◽

Multiple Sequence ◽

Variance Matrix

Protein functions are largely determined by the final details of their tertiary structures, and the structures could be accurately reconstructed based on inter-residue distances. Residue co-evolution has become the primary principle for estimating inter-residue distances, since residues in close spatial proximity tend to co-evolve. The widely used approaches infer residue co-evolution using an indirect strategy, i.e., they first extract some handcrafted features, say, a co-variance matrix, from the multiple sequence alignment (MSA) of the query protein, and then infer residue co-evolution using these features rather than the raw information carried by the MSA. This indirect strategy always leads to considerable information loss and inaccurate estimation of inter-residue distances. Here, we report a deep neural network framework (called CopulaNet) to learn residue co-evolution directly from the MSA without any handcrafted features. CopulaNet consists of two key elements: i) an encoder to model context-specific mutation for each residue, and ii) an aggregator to model correlations among residues and thereafter infer residue co-evolution. Using the CASP13 (the 13th Critical Assessment of Protein Structure Prediction) target proteins as representatives, we demonstrated the successful application of CopulaNet for estimating inter-residue distances and further predicting protein tertiary structure with improved accuracy and efficiency. Head-to-head comparison suggested that for 24 out of the 31 free modeling CASP13 domains, ProFOLD outperformed AlphaFold, one of the state-of-the-art prediction approaches.
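
To make the "handcrafted feature" concrete, the Python sketch below computes a co-variance term for one pair of MSA columns, namely the joint frequency of a residue pair minus the product of the two single-column frequencies. This illustrates the indirect strategy that the abstract contrasts with CopulaNet, not CopulaNet itself; the function name and the toy alignments are assumptions, and real pipelines typically add sequence weighting and pseudocounts, which are omitted here.

```python
from collections import Counter
from itertools import product


def column_covariance(msa, i, j):
    """Co-variance between MSA columns i and j for every observed residue pair.

    msa -- list of equal-length aligned sequences
    Returns {(a, b): f_ij(a, b) - f_i(a) * f_j(b)}.
    """
    n = len(msa)
    fi = Counter(seq[i] for seq in msa)               # single-column counts
    fj = Counter(seq[j] for seq in msa)
    fij = Counter((seq[i], seq[j]) for seq in msa)    # joint counts
    return {
        (a, b): fij[(a, b)] / n - (fi[a] / n) * (fj[b] / n)
        for a, b in product(fi, fj)
    }


# Columns that always change together versus columns that vary independently.
coupled = column_covariance(["AC", "AC", "GT", "GT"], 0, 1)
independent = column_covariance(["AC", "AT", "GC", "GT"], 0, 1)
```

Perfectly coupled columns give large positive and negative entries, while independent columns give zeros, which is the signal that co-evolution methods use as evidence of spatial proximity.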


THE SECONDARY STRUCTURE AND TERTIARY STRUCTURE PREDICTION OF UROKINASE FRAGMENTS AND THEIR STRUCTURE-ACTIVITY RELATIONSHIP RESEARCH

Acta Physico-Chimica Sinica ◽

10.3866/pku.whxb19870602 ◽

1987 ◽

Vol 3 (06) ◽

pp. 565-569 ◽

Cited By ~ 1

Author(s):

Xu Xiaojie ◽

Guan Yue ◽

Chen Zhongguo ◽

Li Genpei ◽

...

Keyword(s):

Secondary Structure ◽

Structure Prediction ◽

Tertiary Structure ◽

Structure Activity Relationship ◽

Activity Relationship ◽

Tertiary Structure Prediction ◽

Structure Activity


2P266 Tertiary structure prediction of RNA-RNA complex structures using secondary structure information (22A. Bioinformatics: Structural genomics, Poster)

Seibutsu Butsuri ◽

10.2142/biophys.53.s203_1 ◽

2013 ◽

Vol 53 (supplement1-2) ◽

pp. S203

Author(s):

Satoshi Yamasaki ◽

Kazuhiko Fukui

Keyword(s):

Secondary Structure ◽

Structural Genomics ◽

Structure Prediction ◽

Tertiary Structure ◽

Complex Structures ◽

Secondary Structure Information ◽

Tertiary Structure Prediction ◽

Structure Information ◽

Rna Complex


Template-based prediction of protein structure with deep learning

10.1101/2020.06.02.129270 ◽

2020 ◽

Author(s):

Haicang Zhang ◽

Yufeng Shen

Keyword(s):

Deep Learning ◽

Protein Structure ◽

Structure Prediction ◽

Tertiary Structure ◽

Query Sequence ◽

Dynamic Programming Algorithm ◽

Tertiary Structure Prediction ◽

Protein Tertiary Structure ◽

Protein Threading ◽

Protein Tertiary Structure Prediction

Abstract: Accurate prediction of protein structure is fundamentally important to understand biological function of proteins. Template-based modeling, including protein threading and homology modeling, is a popular method for protein tertiary structure prediction. However, accurate template-query alignment and template selection are still very challenging, especially for the proteins with only distant homologs available. We propose a new template-based modelling method called ThreaderAI to improve protein tertiary structure prediction. ThreaderAI formulates the task of aligning query sequence with template as the classical pixel classification problem in computer vision and naturally applies deep residual neural network in prediction. ThreaderAI first employs deep learning to predict residue-residue aligning probability matrix by integrating sequence profile, predicted sequential structural features, and predicted residue-residue contacts, and then builds template-query alignment by applying a dynamic programming algorithm on the probability matrix. We evaluated our methods both in generating accurate template-query alignment and protein threading. Experimental results show that ThreaderAI outperforms currently popular template-based modelling methods HHpred, CNFpred, and the latest contact-assisted method CEthreader, especially on the proteins that do not have close homologs with known structures. In particular, in terms of alignment accuracy measured with TM-score, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 56%, 13%, and 11%, respectively, on template-query pairs at the similarity of fold level from SCOPe data. And on CASP13’s TBM-hard data, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 16%, 9% and 8% in terms of TM-score, respectively.
These results demonstrate that with the help of deep learning, ThreaderAI can significantly improve the accuracy of template-based structure prediction, especially for distant-homology proteins.
Availability: https://github.com/ShenLab/ThreaderAI
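
The TM-score used as the accuracy measure above is commonly defined as the sum of 1 / (1 + (d_i / d_0)^2) over aligned residue pairs, divided by the target length L, with d_0 = 1.24 (L - 15)^(1/3) - 1.8. The Python sketch below evaluates that formula for one fixed superposition. The full TM-score takes the maximum over superpositions, which is not done here, and the 0.5 Å floor on d_0 for short chains is an assumption of this sketch.

```python
def tm_score(distances, target_length):
    """TM-score of one fixed superposition (no search over superpositions).

    distances     -- distance in angstroms between each aligned residue pair
    target_length -- number of residues in the target (normalizing length)
    """
    # Length-dependent scale; floored so that short chains stay well defined.
    d0 = 1.24 * max(target_length - 15, 0) ** (1.0 / 3.0) - 1.8
    d0 = max(d0, 0.5)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


perfect = tm_score([0.0] * 100, 100)   # every aligned pair coincides
partial = tm_score([5.0] * 100, 100)   # every aligned pair is 5 angstroms apart
```

Because the sum is divided by the target length rather than by the number of aligned pairs, leaving residues unaligned lowers the score.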
