Protein functional annotation of simultaneously improved stability, accuracy and false discovery rate achieved by a sequence-based deep learning

2019 ◽  
Vol 21 (4) ◽  
pp. 1437-1447 ◽  
Author(s):  
Jiajun Hong ◽  
Yongchao Luo ◽  
Yang Zhang ◽  
Junbiao Ying ◽  
Weiwei Xue ◽  
...  

Abstract: Functional annotation of protein sequences with high accuracy has become one of the most important issues in modern biomedical studies, and computational approaches that significantly accelerate the analysis and enhance its accuracy are greatly desired. Although a variety of methods have been developed to improve protein annotation accuracy, their ability to control false annotation rates remains either limited or not systematically evaluated. In this study, a protein encoding strategy, together with a deep learning algorithm, was proposed to control the false discovery rate in protein function annotation, and its performance was systematically compared with that of traditional similarity-based and de novo approaches. Based on a comprehensive assessment from multiple perspectives, the proposed strategy and algorithm were found to perform better in both prediction stability and annotation accuracy than other de novo methods. Moreover, an in-depth assessment revealed an improved capacity to control the false discovery rate compared with traditional methods. Overall, this study provides both a comprehensive analysis of the performance of the newly proposed strategy and a tool for researchers in the field of protein function annotation.
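The false discovery rate mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definition, FP / (FP + TP), applied to a set of predicted family members. The abstract does not state the exact evaluation protocol, so the function name and the protein identifiers below are hypothetical.

```python
def false_discovery_rate(predicted, known_members):
    """Fraction of predicted family members that are not known members.

    Assumes the standard definition FDR = FP / (FP + TP); the paper's
    exact protocol is not given in the abstract.
    """
    predicted = set(predicted)
    known = set(known_members)
    if not predicted:
        # No predictions means no discoveries, so report 0.0 by convention.
        return 0.0
    false_positives = predicted - known
    return len(false_positives) / len(predicted)


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    predicted = ["P1", "P2", "P3", "P4"]
    known = {"P1", "P2", "P3"}
    print(false_discovery_rate(predicted, known))
```

A lower value means fewer of the proteins flagged by a method are spurious, which is the sense in which the abstract speaks of controlling the false discovery rate.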


2013 ◽  
Vol 11 (Suppl 1) ◽  
pp. S1 ◽  
Author(s):  
Alfredo Benso ◽  
Stefano Di Carlo ◽  
Hafeez ur Rehman ◽  
Gianfranco Politano ◽  
Alessandro Savino ◽  
...  




BMC Genomics ◽  
2008 ◽  
Vol 9 (Suppl 2) ◽  
pp. S2 ◽  
Author(s):  
Inbal Halperin ◽  
Dariya S Glazer ◽  
Shirley Wu ◽  
Russ B Altman


2008 ◽  
Vol 9 (1) ◽  
pp. 52 ◽  
Author(s):  
Chenggang Yu ◽  
Nela Zavaljevski ◽  
Valmik Desai ◽  
Seth Johnson ◽  
Fred J Stevens ◽  
...  




2015 ◽  
Vol 112 (44) ◽  
pp. 13567-13572 ◽  
Author(s):  
Ludovico Sutto ◽  
Simone Marsili ◽  
Alfonso Valencia ◽  
Francesco Luigi Gervasio

The analysis of evolutionary amino acid correlations has recently attracted a surge of renewed interest, also due to their successful use in de novo protein native structure prediction. However, many aspects of protein function, such as substrate binding and product release in enzymatic activity, can be fully understood only in terms of an equilibrium ensemble of alternative structures, rather than a single static structure. In this paper we combine coevolutionary data and molecular dynamics simulations to study protein conformational heterogeneity. To that end, we adapt the Boltzmann-learning algorithm to the analysis of homologous protein sequences and develop a coarse-grained protein model specifically tailored to convert the resulting contact predictions to a protein structural ensemble. By means of exhaustive sampling simulations, we analyze the set of conformations that are consistent with the observed residue correlations for a set of representative protein domains, showing that (i) the most representative structure is consistent with the experimental fold and (ii) the various regions of the sequence display different stability, related to multiple biologically relevant conformations and to the cooperativity of the coevolving pairs. Moreover, we show that the proposed protocol is able to reproduce the essential features of a protein folding mechanism as well as to account for regions involved in conformational transitions through the correct sampling of the involved conformers.



2015 ◽  
Vol 31 (21) ◽  
pp. 3460-3467 ◽  
Author(s):  
Sayoni Das ◽  
David Lee ◽  
Ian Sillitoe ◽  
Natalie L. Dawson ◽  
Jonathan G. Lees ◽  
...  




Author(s):  
Chunyan Yu ◽  
Xiaoxu Li ◽  
Hong Yang ◽  
Yinghong Li ◽  
Weiwei Xue ◽  
...  

Knowledge of protein function is essential for the study of biological processes, the understanding of disease mechanisms and the exploration of novel therapeutic targets. Apart from experimental methods, a number of in silico approaches have been developed and extensively used for protein function prediction. Among these approaches, BLAST predicts function based on protein sequence similarity, whereas machine learning predicts functional families from protein sequences irrespective of their similarity, which complements BLAST and other methods in predicting diverse classes of proteins, including distantly related proteins and homologous proteins of different functions. However, the identification accuracies and false discovery rates of these approaches have not yet been assessed, which greatly limits their use. Herein, a comprehensive comparison of the performance of four popular functional prediction algorithms (BLAST, SVM, PNN and KNN) was conducted. In particular, the performance of these algorithms was systematically assessed by four metrics (sensitivity, specificity, accuracy and Matthews correlation coefficient) based on independent test datasets generated from 93 protein families defined by UniProtKB Keywords. Moreover, the false discovery rates of these algorithms were evaluated by scanning the genomes of four representative model species (Homo sapiens, Arabidopsis thaliana, Saccharomyces cerevisiae and Mycobacterium tuberculosis). As a result, substantially higher sensitivity and stability were observed for BLAST and SVM than for PNN and KNN. However, the machine learning algorithms (PNN, KNN and SVM) were found capable of significantly reducing the false discovery rate (SVM < PNN ≈ KNN). In summary, this study comprehensively assessed the performance of four popular algorithms applied to protein function prediction, which could facilitate the selection of the most appropriate method in related biomedical research.
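The four metrics named in the abstract can be computed from binary confusion-matrix counts. The sketch below uses their textbook definitions. The function name and the example counts are assumptions for illustration and are not taken from the study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and MCC from confusion counts.

    Textbook definitions; any ratio with a zero denominator is
    reported as 0.0.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient: balanced even for skewed classes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for one protein family's independent test set.
    print(classification_metrics(tp=40, fp=10, tn=45, fn=5))
```

Reporting the Matthews correlation coefficient alongside accuracy is useful when family members are far rarer than non-members, because accuracy alone can look high for a classifier that rarely predicts the positive class.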


