Efficient Dynamic Analysis of Low-similarity Proteins for Structural Class Prediction

Protein Structural Class Prediction Based on Distance-related Statistical Features from Graphical Representation of Predicted Secondary Structure

Letters in Organic Chemistry ◽

10.2174/1570178615666180914110451 ◽

2019 ◽

Vol 16 (4) ◽

pp. 317-324

Author(s):

Liang Kong ◽

Lichao Zhang ◽

Xiaodong Han ◽

Jinfeng Lv

Keyword(s):

Feature Extraction ◽

Secondary Structure ◽

Protein Sequence ◽

Function Analysis ◽

Superior Performance ◽

Support Vector ◽

Chaos Game Representation ◽

Class Prediction ◽

Structural Class ◽

Protein Structural Class

Protein structural class prediction is beneficial to protein structure and function analysis, and finding a good feature representation is a key step in this task. Prior work has demonstrated the effectiveness of secondary-structure-based feature extraction methods, especially for low-similarity protein sequences. However, prediction accuracy remains limited. To explore the potential of secondary structure information, a novel feature extraction method based on a generalized chaos game representation of predicted secondary structure is proposed. Each protein sequence is converted into a 20-dimensional distance-related statistical feature vector that characterizes the distribution of secondary structure elements and segments. The feature vectors are then fed into a support vector machine classifier to predict the protein structural class. Experiments on three widely used low-similarity benchmark datasets (25PDB, 1189 and 640) show that the proposed method outperforms state-of-the-art methods. It is anticipated that the method could be extended to other graphical representations of protein sequences and be helpful in future protein research.
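The idea of a chaos game representation over predicted secondary structure can be illustrated with a minimal sketch. It is not the authors' 20-dimensional feature set: the triangle vertex assignment, the contraction ratio of 0.5, and the helper names `cgr_points` and `distance_features` are all assumptions made for illustration, and only a mean and a variance of distances are computed per state.

```python
import math

# Assumed vertex assignment for the three secondary-structure states
# (H = helix, E = strand, C = coil); the paper's own layout is not given here.
VERTICES = {
    "H": (0.0, 0.0),
    "E": (1.0, 0.0),
    "C": (0.5, math.sqrt(3) / 2),
}


def cgr_points(ss, ratio=0.5):
    """Map a secondary-structure string to chaos game points.

    Starts at the triangle centroid and, for each state, moves a fraction
    `ratio` of the way toward that state's vertex.
    """
    x = sum(v[0] for v in VERTICES.values()) / 3
    y = sum(v[1] for v in VERTICES.values()) / 3
    points = []
    for state in ss:
        vx, vy = VERTICES[state]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points


def distance_features(ss):
    """Return [mean, variance] of point-to-vertex distances for H, E, C.

    Illustrative statistics only; states absent from `ss` contribute zeros.
    """
    points = cgr_points(ss)
    feats = []
    for state in "HEC":
        vx, vy = VERTICES[state]
        dists = [
            math.hypot(px - vx, py - vy)
            for (px, py), s in zip(points, ss)
            if s == state
        ]
        if dists:
            mean = sum(dists) / len(dists)
            var = sum((d - mean) ** 2 for d in dists) / len(dists)
        else:
            mean = var = 0.0
        feats += [mean, var]
    return feats
```

A vector such as `distance_features("CCHHHHHCCEEEEC")` could then be passed, together with vectors for other proteins, to any support vector machine implementation; the classifier step is omitted here.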


Amino Acid Principal Component Analysis (AAPCA) and its Applications in Protein Structural Class Prediction

Journal of Biomolecular Structure and Dynamics ◽

10.1080/07391102.2006.10507088 ◽

2006 ◽

Vol 23 (6) ◽

pp. 635-640 ◽

Cited By ~ 70

Author(s):

Qi-Shi Du ◽

Zhi-Qin Jiang ◽

Wen-Zhang He ◽

Da-Peng Li ◽

Kuo-Chen Chou

Keyword(s):

Principal Component Analysis ◽

Amino Acid ◽

Principal Component ◽

Component Analysis ◽

Class Prediction ◽

Structural Class ◽

Protein Structural Class


Evaluating Long-Term Relationship of Protein Sequence by Use of D-Interval Conditional Probability and Its Impact on Protein Structural Class Prediction

Protein and Peptide Letters ◽

10.2174/092986609789071225 ◽

2009 ◽

Vol 16 (10) ◽

pp. 1267-1276 ◽

Cited By ~ 4

Author(s):

Fei Gu ◽

Hang Chen

Keyword(s):

Conditional Probability ◽

Protein Sequence ◽

Class Prediction ◽

Structural Class ◽

Protein Structural Class ◽

Relationship Of


Protein structural class prediction using predicted secondary structure and hydropathy profile

10.32920/ryerson.14657172 ◽

2021 ◽

Author(s):

Syeda Nadia Firdaus

Keyword(s):

Secondary Structure ◽

Classification Problem ◽

Support Vector ◽

Prediction Problem ◽

Class Prediction ◽

Structural Class ◽

Protein Structural Class ◽

Vector Machines ◽

Structural Classes ◽

New Strategies

This thesis explores machine learning models based on various feature sets to solve the protein structural class prediction problem, a significant classification problem in bioinformatics. Knowledge of protein structural classes contributes to an understanding of protein folding patterns, which has made structural class prediction a major research topic. In this thesis, features are extracted from the predicted secondary structure and the hydropathy sequence using new strategies to classify proteins into one of the four major structural classes: all-α, all-β, α/β, and α+β. The prediction accuracy obtained with these features compares favourably with some existing successful methods. Support Vector Machines (SVM) are used because this learning method is known to be effective for this classification problem. On a standard dataset (25PDB), the proposed system has an overall accuracy of 89% with as few as 22 features, whereas the previous best-performing method had an accuracy of 88% using 2510 features.
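As a rough illustration of what features drawn from predicted secondary structure and hydropathy might look like, the sketch below computes per-state content and longest-segment fractions plus a mean hydropathy value. These are not the thesis's 22 features: the choice of the Kyte-Doolittle scale and the helper names `ss_features` and `mean_hydropathy` are assumptions.

```python
from itertools import groupby

# Kyte-Doolittle hydropathy scale (assumed here; the thesis may use another).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def ss_features(ss):
    """For each state in H, E, C return its content fraction and the
    length of its longest segment, both normalized by sequence length."""
    n = len(ss)
    feats = []
    for state in "HEC":
        runs = [len(list(group)) for s, group in groupby(ss) if s == state]
        feats.append(sum(runs) / n)             # fraction of residues in state
        feats.append(max(runs, default=0) / n)  # longest segment, normalized
    return feats


def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy over an amino acid sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
```

For a protein with sequence `seq` and predicted secondary structure `ss`, the list `ss_features(ss) + [mean_hydropathy(seq)]` gives a seven-value vector of the kind an SVM could be trained on.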


Prediction of Protein Structural Classes: Features Extraction to Classification Algorithm

Current Proteomics ◽

10.2174/1570164618666210218141148 ◽

2021 ◽

Vol 18 ◽

Author(s):

Xiaoqing Liu ◽

Zhenyu Yang ◽

Yaoxin Wang ◽

Qi Dai

Keyword(s):

Tertiary Structure ◽

Protein Sequencing ◽

Conformational Space ◽

Folding Rate ◽

Class Prediction ◽

Structural Class ◽

Protein Structural Class ◽

DNA Binding Sites ◽

Structural Classes ◽

Protein Structure Data

The rapid growth of protein sequencing and protein structure data has promoted the development of protein structural class prediction. Several prediction methods have been proposed to study protein folding rates and DNA binding sites, to reduce the conformational search space, and to support tertiary structure prediction. This paper introduces current approaches to protein structural class prediction and emphasizes their steps, from information extraction to classification algorithms.
