NONSYMMETRIC TWO-BODY SCORE FUNCTION FOR PROTEIN FOLD RECOGNITION: NEXT NEAREST NEIGHBOR-ADJACENCY OF TWO AMINO ACIDS

The two-body score (energy) function most commonly used to recognize native protein folds is the Miyazawa–Jernigan (MJ) pairwise-contact function. The pairwise-contact parameters between two amino acids in the MJ function are symmetric, in the sense that the directional order of the amino acid sequence along the protein backbone is ignored when the score parameters are constructed. Here we report the construction of a nonsymmetric two-body score function that captures the directional order of the amino acid sequence, obtained by perceptron learning and protein threading. We considered pairs of next-nearest-neighbor amino acids, that is, pairs separated by two consecutive peptide bonds, taken in the backbone direction from the N-terminus to the C-terminus of a protein. We also considered the local environment of each amino acid in the protein structure, such as its secondary structure and hydrophobicity (solvation). The score is the propensity for a directional alignment of these two amino acids within their local environments. The resulting score function simultaneously recognized the native folds of 1006 proteins, covering all representative proteins with less than 30% homology among them. The score function was validated by a threading test on 382 new, distinct proteins with less than 90% homology among them, in which it recognized the native folds of 364 proteins (95.3%). This result shows the feasibility of designing score functions for protein fold recognition by perceptron learning and protein threading.
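The idea of learning a direction-aware pair score with a perceptron can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sequence, the one-letter environment codes (`H`, `E`), the function names, and the convention that a higher score means "more native-like" are all assumptions made for illustration. Each feature is an ordered tuple for residues `i` and `i + 2`, so the pair (A, C) read from the N- to the C-terminus is distinct from (C, A).

```python
from collections import Counter

def features(seq, env):
    """Count ordered (res_i, env_i, res_{i+2}, env_{i+2}) tuples, N- to C-terminus."""
    return Counter(
        (seq[i], env[i], seq[i + 2], env[i + 2]) for i in range(len(seq) - 2)
    )

def score(weights, feats):
    """Linear score: sum of learned weights over the observed ordered pairs."""
    return sum(weights.get(key, 0.0) * n for key, n in feats.items())

def train_perceptron(seq, native_env, decoy_envs, epochs=50, lr=1.0):
    """Adjust weights until the native threading outscores every decoy threading."""
    weights = {}
    native = features(seq, native_env)
    decoys = [features(seq, d) for d in decoy_envs]
    for _ in range(epochs):
        mistakes = 0
        for decoy in decoys:
            if score(weights, native) <= score(weights, decoy):
                mistakes += 1
                # Perceptron update: reward native pairs, penalize decoy pairs.
                for key, n in native.items():
                    weights[key] = weights.get(key, 0.0) + lr * n
                for key, n in decoy.items():
                    weights[key] = weights.get(key, 0.0) - lr * n
        if mistakes == 0:
            break
    return weights

# Hypothetical toy data: one sequence threaded onto its native environment
# string and onto two decoy environment strings.
seq = "ACDEFG"
native_env = "HHHEEE"
decoy_envs = ["EEEHHH", "HEHEHE"]
weights = train_perceptron(seq, native_env, decoy_envs)
```

Because the feature key keeps the order of the two residues, the learned table is nonsymmetric by construction; a symmetric MJ-style table would instead merge the keys for (A, C) and (C, A).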

K-local hyperplane distance nearest-neighbor algorithm and protein fold recognition

Pattern Recognition and Image Analysis ◽

10.1134/s1054661806010068 ◽

2006 ◽

Vol 16 (1) ◽

pp. 19-22 ◽

Cited By ~ 8

Author(s):

O. G. Okun

Keyword(s):

Nearest Neighbor ◽

Fold Recognition ◽

Protein Fold ◽

Nearest Neighbor Algorithm ◽

Protein Fold Recognition ◽

Local Hyperplane

A strategy to select suitable physicochemical attributes of amino acids for protein fold recognition

BMC Bioinformatics ◽

10.1186/1471-2105-14-233 ◽

2013 ◽

Vol 14 (1) ◽

Cited By ~ 31

Author(s):

Alok Sharma ◽

Kuldip K Paliwal ◽

Abdollah Dehzangi ◽

James Lyons ◽

Seiya Imoto ◽

...

Keyword(s):

Amino Acids ◽

Fold Recognition ◽

Protein Fold ◽

Protein Fold Recognition

Protein fold recognition score functions: Unusual construction strategies

Proteins Structure Function and Bioinformatics ◽

10.1002/(sici)1097-0134(19990901)36:4<454::aid-prot9>3.0.co;2-b ◽

1999 ◽

Vol 36 (4) ◽

pp. 454-461 ◽

Cited By ~ 1

Author(s):

Daniel J. Ayers ◽

Thomas Huber ◽

Andrew E. Torda

Keyword(s):

Fold Recognition ◽

Recognition Score ◽

Protein Fold ◽

Score Functions ◽

Protein Fold Recognition ◽

Construction Strategies

ENSEMBLE VOTING SYSTEM FOR MULTICLASS PROTEIN FOLD RECOGNITION

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001408006454 ◽

2008 ◽

Vol 22 (04) ◽

pp. 747-763 ◽

Cited By ~ 13

Author(s):

YUEHUI CHEN ◽

FENG CHEN ◽

JACK Y. YANG ◽

MARY QU YANG

Keyword(s):

Nearest Neighbor ◽

Fold Recognition ◽

Protein Fold ◽

Biological Databases ◽

K Nearest Neighbors ◽

Protein Fold Recognition ◽

Voting System ◽

Rapid Pace ◽

Protein Structure Classification ◽

High Throughput Experiments

Protein structure classification is important for understanding the associations between sequence and structure, as well as possible functional and evolutionary relationships. Recently, structural genomics initiatives and other high-throughput experiments have populated biological databases at a rapid pace. In this paper, three types of classifiers (k nearest neighbors, class center and nearest neighbor, and probabilistic neural networks) and their homogeneous ensembles are first evaluated on the multiclass protein fold recognition problem, and a heterogeneous ensemble Voting System is then designed for the same problem. Different features extracted from the protein fold dataset, and/or combinations of them, are used in these classification models. The outputs of the heterogeneous classifiers are then fed into a voting system to obtain the final result. Experimental results show that the proposed method improves prediction accuracy by 4%–10% on a benchmark dataset containing 27 SCOP folds.
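The voting step can be sketched as follows. The abstract does not state the voting rule, so this example assumes a simple plurality vote with ties resolved in favor of the earliest-listed classifier; the function name, the classifier outputs, and the fold labels are hypothetical.

```python
from collections import Counter

def plurality_vote(predictions):
    """Combine per-classifier label lists (all the same length) by plurality vote.

    Ties go to the label proposed by the earliest classifier in the list.
    """
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # First label (in classifier order) that reaches the top count.
        voted.append(next(label for label in labels if counts[label] == top))
    return voted

# Hypothetical fold predictions for three proteins from three classifiers.
knn_pred = ["a", "b", "c"]
ccnn_pred = ["a", "c", "c"]
pnn_pred = ["b", "b", "d"]
final = plurality_vote([knn_pred, ccnn_pred, pnn_pred])
```

With an odd number of classifiers and many fold classes, three-way ties are still possible, which is why an explicit tie-breaking rule is needed; a weighted vote would be a natural alternative.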

FoldRec-C2C: protein fold recognition by combining cluster-to-cluster model and protein similarity network

Briefings in Bioinformatics ◽

10.1093/bib/bbaa144 ◽

2020 ◽

Author(s):

Jiangyi Shao ◽

Ke Yan ◽

Bin Liu

Keyword(s):

Language Processing ◽

Cluster Model ◽

Learning To Rank ◽

Protein Structures ◽

Fold Recognition ◽

Protein Fold ◽

Retrieval Task ◽

Similarity Network ◽

Protein Fold Recognition ◽

Homology Relationship

As a key step in studying protein structures, protein fold recognition plays an important role in predicting the protein structures associated with COVID-19 and other important structures. However, existing computational predictors focus only on pairwise protein similarity or on the similarity between two groups of proteins from two folds, whereas the homology relationships among proteins form a hierarchical structure, and a global protein similarity network can improve performance. In this study, we propose a predictor called FoldRec-C2C that globally incorporates the interactions among proteins into the prediction. FoldRec-C2C treats protein fold recognition as an information retrieval task in natural language processing. The initial ranking results are generated by the supervised ranking algorithm Learning to Rank, and three re-ranking algorithms are then applied to the ranking lists to adjust the results globally based on the protein similarity network: a seq-to-seq model, a seq-to-cluster model and a cluster-to-cluster model (C2C). When tested on the widely used and rigorous LINDAHL benchmark dataset, FoldRec-C2C outperforms 34 other state-of-the-art methods in this field. The source code and data of FoldRec-C2C can be downloaded from http://bliulab.net/FoldRec-C2C/download.

From local structure to a global framework: recognition of protein folds

Journal of The Royal Society Interface ◽

10.1098/rsif.2013.1147 ◽

2014 ◽

Vol 11 (95) ◽

pp. 20131147 ◽

Cited By ~ 6

Author(s):

Agnel Praveen Joseph ◽

Alexandre G. de Brevern

Keyword(s):

Local Structure ◽

Structure Prediction ◽

Structural Information ◽

Protein Structures ◽

Fold Recognition ◽

Protein Fold ◽

Major Area ◽

Huge Amount ◽

Protein Fold Recognition ◽

Structural Environment

Protein folding has been a major area of research for many years. Nonetheless, the mechanisms leading to the formation of an active biological fold are still not fully understood. The huge amount of available sequence and structural information provides hints for identifying the putative fold of a given sequence. Indeed, protein structures prefer a limited number of local backbone conformations, some of which are characterized by preferences for certain amino acids. These preferences depend largely on the local structural environment. The prediction of local backbone conformations has become an important factor in correctly identifying the global protein fold. Here, we review developments in the field of local structure prediction, and especially their implications for protein fold recognition.
