protein secondary structure prediction
Recently Published Documents


Total documents: 400 (five years: 15)
H-index: 43 (five years: 0)



2021 ◽  
Vol 35 (5) ◽  
pp. 403-408
Author(s):  
Subhendu Bhusan Rout ◽  
Sasmita Mishra ◽  
Susanta Kumar Sahoo

Protein secondary structure prediction (PSP) is an important task in bioinformatics, and over the last decades many machine learning and soft computing methodologies have played vital roles in achieving satisfactory results. Determining the protein structural class is an important topic in protein science, because knowledge of the structural class helps in understanding changes and reactions in a living body and therefore in designing new drugs and medicines. Although several hard computing techniques may be helpful in these areas, the steady growth and large volume of protein sequences entering databanks make experiments with hard computing techniques challenging. Soft computing techniques such as Artificial Neural Networks, Fuzzy Logic, and Genetic Algorithms play a vital role in this type of genomic research. To address these challenges, this article presents a novel method for predicting protein structure using a Genetic Algorithm. The Q3 accuracy and SOV measures, with SOVH, SOVE, and SOVC values for the α-helix (H), β-sheet (E), and coil/loop (C) structures respectively, are also discussed. The proposed Genetic Algorithm technique, GApred, provides better results than SPIDER2, JPred4, FSVM, and SSpro5 on all three datasets in the experiment. The method is helpful for protein secondary structure prediction, and a significant success rate was observed, which indicates that it can be used as a powerful tool in drug design and medicine research.
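The Q3 accuracy mentioned in this abstract is the fraction of residues whose predicted three-state label (H, E, or C) matches the observed label. A minimal sketch of that metric follows, together with a per-state breakdown. The function names and the example strings are illustrative assumptions, not taken from the GApred paper, and the segment-based SOV measure is not implemented here.

```python
from collections import Counter


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def per_state_accuracy(predicted: str, observed: str, states: str = "HEC") -> dict:
    """Per-state recall: correctly predicted residues of a state / observed residues of that state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    totals = Counter(observed)
    correct = Counter(o for p, o in zip(predicted, observed) if p == o)
    # States absent from the observed sequence are skipped to avoid division by zero.
    return {s: correct[s] / totals[s] for s in states if totals[s] > 0}


# Hypothetical 15-residue example (not real data).
observed = "CCHHHHHCCEEEECC"
predicted = "CCHHHHCCCEEECCC"

print(q3_accuracy(predicted, observed))         # 13 of 15 residues match
print(per_state_accuracy(predicted, observed))
```

Q3 treats every residue independently, which is why segment-aware measures such as SOV are usually reported alongside it.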



Author(s):  
Qin Wang ◽  
Jun Wei ◽  
Boyuan Wang ◽  
Zhen Li ◽  
Sheng Wang ◽  
...  

Protein secondary structure prediction (PSSP) is essential for protein function analysis. However, for low-homology proteins, PSSP suffers from insufficient input features. In this paper, we explicitly import external self-supervised knowledge for low-homology PSSP under the guidance of residue-wise (amino-acid-wise) profile fusion. In practice, we first demonstrate the superiority of the profile over the Position-Specific Scoring Matrix (PSSM) for low-homology PSSP. Based on this observation, we introduce novel self-supervised BERT features as a pseudo profile, which implicitly captures the residue distribution in all native discovered sequences as complementary features. Furthermore, a novel residue-wise attention is designed to adaptively fuse the different features (i.e., the original low-quality profile and the BERT-based pseudo profile), which not only takes full advantage of each feature but also avoids noise disturbance. In addition, a feature consistency loss is proposed to accelerate model learning from multiple semantic levels. Extensive experiments confirm that our method outperforms the state of the art (i.e., by 4.7% for extremely low-homology cases on the BC40 dataset).
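The idea of residue-wise attention fusion can be illustrated with a small sketch: for each residue, two feature vectors (for example, a profile row and a pseudo-profile row) are scored, the scores are normalized with a softmax, and the vectors are combined with the resulting weights. This is only an assumption-laden illustration of the general technique, not the authors' architecture; the scoring vectors `w_profile` and `w_pseudo` are hypothetical stand-ins for parameters a real model would learn.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fuse_residue_wise(profile, pseudo_profile, w_profile, w_pseudo):
    """Fuse two per-residue feature matrices with per-residue attention weights.

    profile, pseudo_profile: lists of L feature vectors of equal dimension.
    w_profile, w_pseudo: hypothetical scoring vectors (learned in a real model).
    Returns the fused features and the (profile, pseudo) weight pair per residue.
    """
    if len(profile) != len(pseudo_profile):
        raise ValueError("both feature matrices must cover the same residues")
    fused, weights = [], []
    for x, y in zip(profile, pseudo_profile):
        a_x, a_y = softmax([dot(w_profile, x), dot(w_pseudo, y)])
        fused.append([a_x * xi + a_y * yi for xi, yi in zip(x, y)])
        weights.append((a_x, a_y))
    return fused, weights


# Hypothetical two-residue, two-feature example.
profile = [[1.0, 3.0], [0.2, 0.1]]
pseudo = [[3.0, 5.0], [0.9, 0.8]]
fused, weights = fuse_residue_wise(profile, pseudo, [0.5, 0.5], [0.5, 0.5])
print(weights)
```

Because the weights are computed separately for every residue, a position with a noisy profile row can lean on the pseudo profile while a well-covered position does the opposite.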



PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0255076
Author(s):  
Teng-Ruei Chen ◽  
Sheng-Hung Juan ◽  
Yu-Wei Huang ◽  
Yen-Cheng Lin ◽  
Wei-Cheng Lo

Protein secondary structure prediction (SSP) has a variety of applications; however, there has been relatively limited improvement in accuracy for years. With a vision of advancing all related fields, we aimed to make a fundamental advance in SSP. Many admirable efforts have been made to improve the machine learning algorithms for SSP. This work instead took a step back and manipulated the input features. A secondary structure element-based position-specific scoring matrix (SSE-PSSM) is proposed, based on which a new set of machine learning features can be established. The feasibility of this new PSSM was evaluated by rigorous independent tests with training and testing datasets sharing <25% sequence identity. In all experiments, the proposed PSSM outperformed the traditional amino acid PSSM. This new PSSM can be easily combined with the amino acid PSSM, and the improvement in accuracy was remarkable. Preliminary tests combining the SSE-PSSM with well-known SSP methods showed 2.0% and 5.2% average improvements in three- and eight-state SSP accuracies, respectively. If this PSSM can be integrated into state-of-the-art SSP methods, the overall accuracy of SSP may break through the current limit and eventually benefit all research and applications in which secondary structure prediction plays a vital role. To facilitate the application and integration of the SSE-PSSM with modern SSP methods, we have established a web server and standalone programs for generating SSE-PSSM, available at http://10.life.nctu.edu.tw/SSE-PSSM.
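For readers unfamiliar with position-specific scoring matrices, the sketch below builds a generic log-odds PSSM from a set of aligned, equal-length strings over an arbitrary alphabet. The abstract does not describe how the SSE-PSSM itself is constructed, so this is not the authors' procedure; running it over a three-letter H/E/C alphabet is only an assumed illustration of scoring secondary structure states per position. The function name, the pseudocount, and the uniform background are all assumptions.

```python
import math


def build_pssm(aligned_seqs, alphabet, pseudocount=1.0, background=None):
    """Return a list of {symbol: log2-odds score} dicts, one per alignment column.

    aligned_seqs: equal-length strings; symbols outside `alphabet` (e.g. gaps) are ignored.
    background: optional {symbol: probability}; defaults to a uniform distribution.
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all sequences must have the same length")
    if background is None:
        background = {sym: 1.0 / len(alphabet) for sym in alphabet}

    pssm = []
    for pos in range(length):
        column = [s[pos] for s in aligned_seqs if s[pos] in alphabet]
        total = len(column) + pseudocount * len(alphabet)
        row = {}
        for sym in alphabet:
            # Pseudocounts keep unseen symbols from producing log(0).
            freq = (column.count(sym) + pseudocount) / total
            row[sym] = math.log2(freq / background[sym])
        pssm.append(row)
    return pssm


# Hypothetical aligned secondary-structure strings (not real data).
pssm = build_pssm(["HHEC", "HHEC", "HCEC"], alphabet="HEC")
for row in pssm:
    print({sym: round(score, 2) for sym, score in row.items()})
```

Positive scores mark symbols that are enriched at a position relative to the background, and negative scores mark depleted ones; each row can be fed to a downstream predictor as a feature vector.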



2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Heba M. Afify ◽  
Mohamed B. Abdelhalim ◽  
Mai S. Mabrouk ◽  
Ahmed Y. Sayed

Background: Computational biology approaches to protein secondary structure prediction (PSSP), which is vital for the pharmaceutical industry, have advanced rapidly. Protein structures determined in the laboratory provide insufficient information for the PSSP used in bioinformatics studies. In this paper, a support vector machine (SVM) model and a decision tree are applied to the RS126 dataset to address the problem of PSSP. The decision tree is applied to the SVM output to extract relevant rules for PSSP. Furthermore, the number of rules produced was fairly small, and they show a greater degree of comprehensibility compared with other rules. Several of the proposed rules have a compelling and relevant biological interpretation.
Results: The results confirmed that the presence of a particular amino acid in a protein sequence increases the stability of the secondary structure prediction. The suggested algorithm achieved 85% accuracy for the E|~E classifier.
Conclusions: The proposed rules can be very important in guiding wet laboratory experiments aimed at determining protein secondary structure. Future work will focus mainly on large protein datasets while avoiding overfitting, and on expanding the number of extracted rules for PSSP.
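The "E|~E classifier" in this abstract refers to a binary, one-vs-rest task: each residue is labeled as β-strand (E) or not (~E). The sketch below shows how three-state labels reduce to that binary task and how its accuracy can be computed. It does not reproduce the SVM or decision tree from the paper; the function names and the example strings are hypothetical.

```python
def to_one_vs_rest(states: str, positive: str = "E") -> list:
    """Map each three-state label to 1 (the positive state) or 0 (any other state)."""
    return [1 if s == positive else 0 for s in states]


def binary_accuracy(predicted: list, observed: list) -> float:
    """Fraction of positions where the binary prediction equals the binary label."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("inputs must be non-empty")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical 10-residue example (not real data).
observed = "CCEEEEHHHC"
predicted = "CEEEECHHHC"

obs_bin = to_one_vs_rest(observed)    # E versus everything else
pred_bin = to_one_vs_rest(predicted)
print(binary_accuracy(pred_bin, obs_bin))  # 8 of 10 positions agree
```

Analogous H|~H and C|~C tasks follow by changing the `positive` argument.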


