Weave amino acid sequences for protein secondary structure prediction

Author(s): Xiaochun Yang ◽ Bin Wang

PLoS ONE ◽ 2021 ◽ Vol 16 (7) ◽ pp. e0255076
Author(s): Teng-Ruei Chen ◽ Sheng-Hung Juan ◽ Yu-Wei Huang ◽ Yen-Cheng Lin ◽ Wei-Cheng Lo

Protein secondary structure prediction (SSP) has a variety of applications; however, its accuracy has improved relatively little for years. With a view to advancing all related fields, we aimed to make a fundamental advance in SSP. Many admirable efforts have been made to improve the machine learning algorithms for SSP. This work instead took a step back and manipulated the input features. A secondary structure element-based position-specific scoring matrix (SSE-PSSM) is proposed, based on which a new set of machine learning features can be established. The feasibility of this new PSSM was evaluated by rigorous independent tests with training and testing datasets sharing <25% sequence identity. In all experiments, the proposed PSSM outperformed the traditional amino acid PSSM. This new PSSM can be easily combined with the amino acid PSSM, and the improvement in accuracy was remarkable. Preliminary tests combining the SSE-PSSM with well-known SSP methods showed 2.0% and 5.2% average improvements in three- and eight-state SSP accuracies, respectively. If this PSSM can be integrated into state-of-the-art SSP methods, the overall accuracy of SSP may break through its current limit and eventually benefit all research and applications in which secondary structure prediction plays a vital role. To facilitate the application and integration of the SSE-PSSM with modern SSP methods, we have established a web server and standalone programs for generating SSE-PSSMs, available at http://10.life.nctu.edu.tw/SSE-PSSM.
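The abstract does not say how the SSE-PSSM and the amino acid PSSM are combined. One simple possibility, shown here purely as an assumption, is to concatenate the two score rows for each residue so that a downstream predictor sees both feature sets. The function name `combine_pssms` and the list-of-rows layout are hypothetical and are not taken from the paper or its software.

```python
def combine_pssms(aa_pssm, sse_pssm):
    """Concatenate two per-residue score matrices row by row.

    Assumption: both matrices have one row per residue of the same
    protein. The paper's actual combination scheme may differ.
    """
    if len(aa_pssm) != len(sse_pssm):
        raise ValueError("both PSSMs must cover the same number of residues")
    # Each output row holds the amino acid scores followed by the SSE scores.
    return [list(aa_row) + list(sse_row)
            for aa_row, sse_row in zip(aa_pssm, sse_pssm)]


# Toy matrices with made-up numbers, for two residues only.
toy_aa = [[1, 2], [3, 4]]
toy_sse = [[9], [8]]
combined = combine_pssms(toy_aa, toy_sse)
```

With this layout the feature width per residue is simply the sum of the two matrix widths, so an existing predictor would only need a wider input layer.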


2003 ◽ Vol 07 (03) ◽ pp. 122-128
Author(s): Jagath C. Rajapakse ◽ Minh N. Nguyen

Bioinformatics techniques for protein secondary structure prediction, such as Support Vector Machine (SVM) and GOR approaches, are mostly single-stage approaches; they predict the secondary structures of a protein by taking into account only the information available in amino acid sequences. In contrast, the PHD (Profile network from HeiDelberg) method is a two-stage technique in which two Multi-Layer Perceptrons (MLPs) are cascaded; the second neural network receives the output of the first and captures any contextual relationships among the secondary structure elements that the first network predicts. In this paper, we argue that it is feasible to extend the current single-stage approaches by adding a second-stage prediction scheme that captures the contextual information among secondary structural elements and thereby improves their accuracy. We demonstrate that two-stage SVMs perform better than present techniques for protein secondary structure prediction.
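The cascade described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `second_stage_inputs` and `two_stage_predict`, the zero padding at the sequence ends, and the window half-width are all assumptions, and `stage1` and `stage2` stand in for any trained per-residue classifiers (SVMs in the paper).

```python
def second_stage_inputs(stage1_scores, half_width=3):
    """Build one flat feature vector per residue from a window of
    first-stage outputs, zero-padded at both ends of the sequence."""
    if not stage1_scores:
        return []
    n_states = len(stage1_scores[0])
    pad = [[0.0] * n_states for _ in range(half_width)]
    padded = pad + [list(row) for row in stage1_scores] + pad
    size = 2 * half_width + 1
    features = []
    for i in range(len(stage1_scores)):
        flat = []
        for row in padded[i:i + size]:
            flat.extend(row)
        features.append(flat)
    return features


def two_stage_predict(residue_features, stage1, stage2, half_width=3):
    """Run stage1 on every residue, then stage2 on windows of stage1 output."""
    stage1_scores = [stage1(x) for x in residue_features]
    return [stage2(f) for f in second_stage_inputs(stage1_scores, half_width)]


# Toy first-stage scores for three residues over three states (H, E, C).
toy_scores = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
toy_features = second_stage_inputs(toy_scores, half_width=1)
```

The point of the second stage is that each residue's final label can depend on the predicted states of its neighbours, which a single-stage classifier working only on sequence features never sees.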


2021
Author(s): Shutong Yang ◽ Yuhong Wang ◽ Kennie Cruz-Gutierrez ◽ Fangling Wu ◽ Chuan-Fan Ding

Abstract

Background: Protein secondary structure prediction (PSSP) is important for protein structure modeling and design. Over the past few years, deep learning models have shown promising results for PSSP. However, the current best performers for PSSP often require evolutionary information such as multiple sequence alignments and even real protein structures (templates), entire protein sequences, and amino acid property profiles.

Results: In this study, we used a fixed-size window of adjacent residues and only amino acid sequences, without any evolutionary information, as inputs, and developed a very simple yet accurate RNN model: LocalNet. The accuracy for three states of secondary structure is as high as 85.15%, indicating that the local amino acid sequence itself contains enough information for PSSP, a well-known classical view. In comparison with other predictors, we also achieve state-of-the-art accuracy on the CASP11, CASP12 and CASP13 datasets.

Conclusion: The well-trained models are expected to have good applications in protein structure modeling and protein design. This model can be downloaded from https://github.com/lake-chao/protein-secondary-structure-prediction.
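As an illustration of what a fixed-size window over raw amino acid sequences can look like as model input, the sketch below one-hot encodes each residue together with its neighbours. It is not taken from the LocalNet repository: the function names, the half-width of 7, the `-` padding symbol and the extra padding/unknown slot are all assumptions made for the example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(residue):
    """Return a 21-slot vector: 20 residue slots plus one slot shared by
    padding and any non-standard symbol."""
    vec = [0] * (len(AMINO_ACIDS) + 1)
    vec[AA_INDEX.get(residue, len(AMINO_ACIDS))] = 1
    return vec


def window_features(sequence, half_width=7):
    """Return one (2 * half_width + 1) x 21 matrix per residue, centred
    on that residue and padded with '-' beyond the sequence ends."""
    pad = "-" * half_width
    padded = pad + sequence + pad
    size = 2 * half_width + 1
    return [[one_hot(r) for r in padded[i:i + size]]
            for i in range(len(sequence))]


# Toy sequence of four residues with a window of five positions.
toy_windows = window_features("ACDE", half_width=2)
```

Each window could then be fed to a recurrent model as a short sequence of 21-dimensional vectors, with the secondary structure state of the central residue as the target.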

