An improved Hybrid Neuro Fuzzy Genetic System (I-HNFGS) for protein secondary structure prediction from amino acid sequence

Author(s): Andey Krishnaji, Allam Appa Rao

PLoS ONE, 2021, Vol 16 (7), pp. e0255076
Author(s): Teng-Ruei Chen, Sheng-Hung Juan, Yu-Wei Huang, Yen-Cheng Lin, Wei-Cheng Lo

Protein secondary structure prediction (SSP) has a variety of applications; however, improvement in accuracy has been relatively limited for years. With the aim of advancing all related fields, we sought to make a fundamental advance in SSP. Many admirable efforts have been made to improve the machine learning algorithms for SSP; this work instead took a step back and manipulated the input features. A secondary structure element-based position-specific scoring matrix (SSE-PSSM) is proposed, from which a new set of machine learning features can be established. The feasibility of this new PSSM was evaluated by rigorous independent tests with training and testing datasets sharing <25% sequence identity. In all experiments, the proposed PSSM outperformed the traditional amino acid PSSM. This new PSSM can be easily combined with the amino acid PSSM, and the improvement in accuracy was remarkable. Preliminary tests combining the SSE-PSSM with well-known SSP methods showed 2.0% and 5.2% average improvements in three- and eight-state SSP accuracies, respectively. If this PSSM can be integrated into state-of-the-art SSP methods, the overall accuracy of SSP may break through its current limit and eventually benefit all research and applications in which secondary structure prediction plays a vital role. To facilitate the application and integration of the SSE-PSSM with modern SSP methods, we have established a web server and standalone programs for generating SSE-PSSMs, available at http://10.life.nctu.edu.tw/SSE-PSSM.
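The abstract does not describe how the SSE-PSSM is computed, so the following is only a minimal sketch of the general idea of a position-specific scoring matrix: per-column log-odds scores with additive pseudocounts. The function name `build_pssm`, the uniform background, the pseudocount value, and the toy alignment of three-state strings (H = helix, E = strand, C = coil) are all assumptions made for illustration, not the authors' procedure.

```python
import math
from collections import Counter


def build_pssm(aligned_seqs, alphabet, background=None, pseudocount=1.0):
    """Return one dict of log2-odds scores per alignment column.

    Generic PSSM sketch (assumed scheme, not the SSE-PSSM algorithm):
    score = log2(smoothed column frequency / background frequency).
    """
    if background is None:
        # Assumption: uniform background over the alphabet.
        background = {s: 1.0 / len(alphabet) for s in alphabet}
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    pssm = []
    for col in range(length):
        # Count symbols in this column, skipping anything outside the alphabet.
        counts = Counter(seq[col] for seq in aligned_seqs if seq[col] in background)
        total = sum(counts.values()) + pseudocount * len(alphabet)
        row = {}
        for symbol in alphabet:
            freq = (counts[symbol] + pseudocount) / total
            row[symbol] = math.log2(freq / background[symbol])
        pssm.append(row)
    return pssm


# Hypothetical toy alignment over a three-state secondary structure alphabet.
toy_alignment = ["HHEC", "HHEC", "HEEC"]
sse_pssm = build_pssm(toy_alignment, alphabet="HEC")

for position, row in enumerate(sse_pssm):
    print(position, {s: round(v, 2) for s, v in row.items()})
```

The same function works unchanged on an amino acid alignment by passing the 20-letter alphabet, which is one simple way two such matrices could be concatenated column by column into a combined feature set.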


In bioinformatics, predicting the secondary structure of a protein from its primary amino acid sequence is a very difficult problem with a large impact on science and medicine. The hardest part is learning the most effective and accurate protein features to improve prediction. Here, we apply a deep learning model to enhance structure prediction. The core contribution of this paper is a group of recurrent neural networks (RNNs) that can handle high-level relational features from a pair of input and target protein sequences. The paper compares different types of recurrent units within RNNs, with emphasis on the more advanced units that incorporate a gating mechanism: the long short-term memory (LSTM) unit and the more recently introduced gated recurrent unit (GRU). These recurrent units were evaluated on the task of predicting protein secondary structure from an amino acid sequence. The dataset was taken from a publicly available database server (RCSB), and the study shows that the LSTM unit performs better than the GRU on long protein sequences.
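To make the difference between the two gated units concrete, here is a minimal sketch of a single LSTM step and a single GRU step run over a one-hot encoded amino acid sequence. The abstract gives no architecture details, so everything here is an assumption for illustration: the hidden size, the random untrained weights, the example sequence, and the helper names. The GRU follows one common formulation in which the reset gate is applied to the hidden state before the candidate is computed.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    """Encode one residue as a 20-dimensional indicator vector."""
    return [1.0 if aa == residue else 0.0 for aa in AMINO_ACIDS]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_block(n_in, n_hid, rng):
    """Random (W, U, b) for one gate: W acts on the input, U on the hidden state."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hid)]
    U = [[rng.uniform(-0.1, 0.1) for _ in range(n_hid)] for _ in range(n_hid)]
    return W, U, [0.0] * n_hid


def affine(block, x, h):
    """Compute W x + U h + b using plain Python lists."""
    W, U, b = block
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]


def lstm_step(p, x, h, c):
    """One LSTM step: three gates plus a separate cell state c."""
    i = [sigmoid(v) for v in affine(p["input"], x, h)]
    f = [sigmoid(v) for v in affine(p["forget"], x, h)]
    o = [sigmoid(v) for v in affine(p["output"], x, h)]
    g = [math.tanh(v) for v in affine(p["cell"], x, h)]
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def gru_step(p, x, h):
    """One GRU step: two gates, no separate cell state."""
    z = [sigmoid(v) for v in affine(p["update"], x, h)]
    r = [sigmoid(v) for v in affine(p["reset"], x, h)]
    gated_h = [rk * hk for rk, hk in zip(r, h)]
    n = [math.tanh(v) for v in affine(p["candidate"], x, gated_h)]
    return [zk * hk + (1.0 - zk) * nk for zk, hk, nk in zip(z, h, n)]


rng = random.Random(0)
n_hid = 4  # hypothetical hidden size
lstm_params = {k: init_block(20, n_hid, rng) for k in ("input", "forget", "output", "cell")}
gru_params = {k: init_block(20, n_hid, rng) for k in ("update", "reset", "candidate")}

sequence = "MKTAYIAKQR"  # hypothetical sequence fragment
h_lstm, c_lstm = [0.0] * n_hid, [0.0] * n_hid
h_gru = [0.0] * n_hid
for residue in sequence:
    x = one_hot(residue)
    h_lstm, c_lstm = lstm_step(lstm_params, x, h_lstm, c_lstm)
    h_gru = gru_step(gru_params, x, h_gru)

print("LSTM hidden state:", [round(v, 4) for v in h_lstm])
print("GRU hidden state: ", [round(v, 4) for v in h_gru])
```

The LSTM carries four weight blocks and an extra cell state, while the GRU carries three blocks and a single state. In a real predictor the per-residue hidden states would feed a trained output layer over the secondary structure classes; that part is omitted here.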

