NONSYMMETRIC TWO-BODY SCORE FUNCTION FOR PROTEIN FOLD RECOGNITION: NEXT NEAREST NEIGHBOR-ADJACENCY OF TWO AMINO ACIDS

2004 ◽  
Vol 15 (08) ◽  
pp. 1087-1094
Author(s):  
MUYOUNG HEO ◽  
MOOKYUNG CHEON ◽  
IKSOO CHANG

The usual two-body score (energy) function for recognizing native folds of proteins is the Miyazawa–Jernigan (MJ) pairwise-contact function. The pairwise-contact parameters between two amino acids in the MJ function are symmetric in the sense that the directional order of the amino acid sequence along the protein backbone is ignored when the score parameters are constructed. Here we report the construction of a nonsymmetric two-body score function that captures the directional order of the amino acid sequence, obtained by perceptron learning and protein threading. We considered pairs of adjacent amino acids separated by two consecutive peptide bonds, with the backbone directionality running from the N-terminus to the C-terminus of the protein. We also considered the local environmental character of the amino acids in protein structures, such as secondary structure and hydrophobicity (solvation). The score is the corresponding propensity for a directional alignment of these two adjacent amino acids with their local environments. The resulting score function simultaneously recognized the native folds of 1006 proteins, covering all representative proteins with less than 30% homology among them. The quality of this score function was validated by a threading test on 382 new, distinct proteins with less than 90% homology among them, and it achieved a high success ratio, recognizing the native folds of 364 (95.3%) proteins. This demonstrates the feasibility of designing protein score functions for fold recognition by perceptron learning and protein threading.
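
The idea of a direction-aware score trained by a perceptron against threading decoys can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the one-letter environment codes, the single-protein training loop, and the learning rate are all assumptions made for illustration. Each feature is an ordered pair of residues at positions i and i+2 (two peptide bonds apart, read from N- to C-terminus) together with their local environment labels, and the perceptron adjusts the weights until the native structure has a lower energy than every decoy.

```python
from collections import Counter


def directional_features(seq, env):
    """Count ordered (i, i+2) residue pairs, N- to C-terminus, with environments.

    `env` is a hypothetical per-residue environment string (e.g. 'H'/'E'/'C').
    """
    if len(seq) != len(env):
        raise ValueError("sequence and environment strings must have equal length")
    feats = Counter()
    for i in range(len(seq) - 2):
        # (a, ..., b, ...) and (b, ..., a, ...) are distinct keys,
        # so the resulting score is nonsymmetric by construction.
        feats[(seq[i], env[i], seq[i + 2], env[i + 2])] += 1
    return feats


def energy(weights, feats):
    """Linear score: sum of weight * count over all observed features."""
    return sum(weights.get(key, 0.0) * count for key, count in feats.items())


def train_perceptron(seq, native_env, decoy_envs, lr=1.0, max_epochs=100):
    """Learn weights so the native environment scores below every decoy.

    Threading is mimicked by mounting the same sequence on different
    environment strings (an assumption of this sketch).
    """
    weights = {}
    native = directional_features(seq, native_env)
    decoys = [directional_features(seq, e) for e in decoy_envs]
    for _ in range(max_epochs):
        violated = False
        for decoy in decoys:
            if energy(weights, native) >= energy(weights, decoy):
                violated = True
                # Perceptron update: raise the decoy energy, lower the native energy.
                for key, count in decoy.items():
                    weights[key] = weights.get(key, 0.0) + lr * count
                for key, count in native.items():
                    weights[key] = weights.get(key, 0.0) - lr * count
        if not violated:
            break
    return weights


if __name__ == "__main__":
    w = train_perceptron("ACDEFG", "HHHHEE", ["EEEEHH", "HHEEHH"])
    print(energy(w, directional_features("ACDEFG", "HHHHEE")))
```

The study described above trains on many proteins at once; extending this sketch to that case would mean looping over a list of (sequence, native, decoys) triples inside each epoch while sharing one weight dictionary.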

2013 ◽  
Vol 14 (1) ◽  
Author(s):  
Alok Sharma ◽  
Kuldip K Paliwal ◽  
Abdollah Dehzangi ◽  
James Lyons ◽  
Seiya Imoto ◽  
...  

Author(s):  
YUEHUI CHEN ◽  
FENG CHEN ◽  
JACK Y. YANG ◽  
MARY QU YANG

Protein structure classification is an important issue in understanding the associations between sequence and structure as well as possible functional and evolutionary relationships. Recently, structural genomics initiatives and other high-throughput experiments have populated the biological databases at a rapid pace. In this paper, three types of classifiers (k-nearest neighbors, class center and nearest neighbor, and probabilistic neural networks) and their homogeneous ensembles are first evaluated on the multiclass protein fold recognition problem, and then a heterogeneous ensemble voting system is designed for the same problem. Different features, and combinations of features, extracted from the protein fold dataset are used in these classification models. The outputs of the heterogeneous classifiers are then passed to a voting system to obtain the final result. The experimental results show that the proposed method can improve prediction accuracy by 4%–10% on a benchmark dataset containing 27 SCOP folds.
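
The final voting step can be sketched in a few lines of Python. This is a plain majority vote, which is an assumption: the abstract does not state the voting rule, any weighting, or how ties are handled. The function name and the tie-breaking choice (the earliest classifier in the list wins) are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    `predictions` is a list with one entry per classifier; each entry is a
    list of predicted fold labels, one per sample, all of equal length.
    Ties go to the label proposed by the earliest classifier in the list.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same number of samples")

    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        # max() returns the first maximal item, so iterating over `votes`
        # in classifier order gives a deterministic tie-break.
        combined.append(max(votes, key=counts.__getitem__))
    return combined


if __name__ == "__main__":
    knn = ["fold_1", "fold_7", "fold_3"]
    ccnn = ["fold_1", "fold_2", "fold_3"]
    pnn = ["fold_4", "fold_2", "fold_9"]
    print(majority_vote([knn, ccnn, pnn]))
```

Putting the classifier expected to be most reliable first makes the tie-break favor it, which is one simple way to encode a preference without introducing explicit weights.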


Author(s):  
Jiangyi Shao ◽  
Ke Yan ◽  
Bin Liu

As a key to studying protein structures, protein fold recognition plays an important role in predicting the protein structures associated with COVID-19 and other important structures. However, existing computational predictors focus only on pairwise protein similarity or on the similarity between two groups of proteins from two folds, whereas the homology relationships among proteins are hierarchical. A global protein similarity network can therefore contribute to improved performance. In this study, we propose a predictor called FoldRec-C2C that globally incorporates the interactions among proteins into the prediction. In FoldRec-C2C, the protein fold recognition problem is treated as an information retrieval task in natural language processing. The initial ranking results are generated by the supervised ranking algorithm Learning to Rank, and three re-ranking algorithms are then applied to the ranking lists to adjust the results globally based on the protein similarity network: a seq-to-seq model, a seq-to-cluster model, and a cluster-to-cluster (C2C) model. When tested on the widely used and rigorous LINDAHL benchmark dataset, FoldRec-C2C outperforms 34 other state-of-the-art methods in this field. The source code and data of FoldRec-C2C can be downloaded from http://bliulab.net/FoldRec-C2C/download.


2014 ◽  
Vol 11 (95) ◽  
pp. 20131147 ◽  
Author(s):  
Agnel Praveen Joseph ◽  
Alexandre G. de Brevern

Protein folding has been a major area of research for many years. Nonetheless, the mechanisms leading to the formation of an active biological fold are still not fully understood. The huge amount of available sequence and structural information provides hints for identifying the putative fold of a given sequence. Indeed, protein structures prefer a limited number of local backbone conformations, some of which are characterized by preferences for certain amino acids. These preferences largely depend on the local structural environment. The prediction of local backbone conformations has become an important factor in correctly identifying the global protein fold. Here, we review the developments in the field of local structure prediction and especially their implications for protein fold recognition.


2009 ◽  
Vol 1 (2) ◽  
pp. 1-6
Author(s):  
Muyoung Heo ◽  
Mookyung Cheon ◽  
Suhkmann Kim ◽  
Kwanghoon Chung ◽  
Iksoo Chang
