TANA: efficient approach for predicting protein functions by transferring annotation via alignment networks

2020 ◽  
Author(s):  
Warith Eddine DJEDDI ◽  
Sadok BEN YAHIA ◽  
Engelbert MEPHU NGUIFO

Abstract Background: One of the challenges of the post-genomic era is to provide accurate function annotations for orphan and unannotated protein sequences. With the recent availability of large protein-protein interaction networks for many model species, there is a strong need for computational methods that elucidate protein function using a variety of strategies. In this respect, most computational approaches integrate diverse kinds of functional interactions to unveil protein functions by transferring annotations across species, relying on sequence similarity, 2D/3D structure, amino acid motifs, or phylogenetic profiles. Results: In this work, we introduce a new approach, called TANA, for inferring protein functions. The main originality of the approach lies in predicting the functions of an unannotated protein by transferring annotations via a network alignment as well as from the direct interaction neighborhood within its PPI network. In doing so, we are able to discover the functions of proteins that could not be easily described by sequence homology. We assess the performance of our method using the standard metrics established by CAFA and highlight a sharp and significant improvement over other competitive methods, in particular for predicting molecular functions. Conclusions: This research is one of the first attempts to combine sequence-based and multiple-network-alignment-based function prediction approaches. We have been able to assess the accuracy of the prediction using pairwise and multiple alignment of the PPI networks for the compared species. Therefore, we recommend using different strategies (i.e., pairwise, multiple, with/without neighborhood networks), especially in situations where the functions of the protein are not known in advance.
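
The idea of transferring annotations both through a network alignment and from direct interaction partners can be illustrated with a minimal sketch. This is not the TANA implementation: the function name `transfer_annotations`, the dictionary-based inputs, and the unweighted vote counting are all assumptions made for illustration only.

```python
from collections import Counter


def transfer_annotations(protein, alignment, annotations, neighbors, top_k=3):
    """Rank candidate function terms for an unannotated protein.

    alignment:   dict, protein -> list of proteins aligned to it in other species
    annotations: dict, protein -> set of function terms (e.g. GO identifiers)
    neighbors:   dict, protein -> set of direct interaction partners
    All three structures are hypothetical inputs chosen for this sketch.
    """
    votes = Counter()

    # One vote per term carried by a protein aligned to the query.
    for aligned in alignment.get(protein, []):
        for term in annotations.get(aligned, ()):
            votes[term] += 1

    # One vote per term carried by a direct interaction partner.
    for partner in neighbors.get(protein, ()):
        for term in annotations.get(partner, ()):
            votes[term] += 1

    # Keep the top_k terms with the most votes.
    return [term for term, _ in votes.most_common(top_k)]
```

A real method would likely weight the two evidence sources differently and account for alignment quality; the equal, unweighted votes here are only the simplest possible choice.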

2020 ◽  
Author(s):  
Warith Eddine DJEDDI ◽  
Sadok BEN YAHIA ◽  
Engelbert MEPHU NGUIFO

Abstract Background: One of the challenges of the post-genomic era is to provide accurate function annotations for orphan and unannotated protein sequences. With the recent availability of large PPI networks for many model species, there is a strong need for computational methods that elucidate protein function using a variety of strategies. In this respect, most computational approaches integrate diverse kinds of functional interactions to unveil protein functions by transferring annotations across species, relying on sequence similarity, 2D/3D structure, amino acid patterns, or phylogenetic profiles. Results: In this work, we introduce a new approach, called TANA, for inferring protein functions. The main originality of the approach lies in predicting the functions of an unannotated protein by transferring annotations via a network alignment as well as from the direct interaction neighborhood within its PPI network. In doing so, we are able to discover the functions of proteins that could not be easily described by sequence homology. We assess the performance of our approach using the standard metrics established by the CAFA challenge and highlight a sharp and significant improvement over other competitive methods, in particular for predicting molecular functions. Conclusions: This research is one of the first attempts to combine sequence-based and multiple-network-alignment-based function prediction approaches. We have been able to assess the accuracy of the prediction using pairwise and multiple alignment of the PPI networks for the compared species. Therefore, we recommend using different strategies (i.e., pairwise, multiple, with/without neighborhood networks), especially in situations where the functions of the protein are not known beforehand.


2020 ◽  
Author(s):  
Warith Eddine DJEDDI ◽  
Sadok BEN YAHIA ◽  
Engelbert MEPHU NGUIFO

Abstract Background: One of the challenges of the post-genomic era is to provide accurate function annotations for orphan and unannotated protein sequences. With the recent availability of large PPI networks for many model species, there is a strong need for computational methods that elucidate protein function using a variety of strategies. In this respect, most computational approaches integrate diverse kinds of functional interactions to unveil protein functions by transferring annotations across species, relying on sequence similarity, 2D/3D structure, amino acid patterns, or phylogenetic profiles. Results: In this work, we introduce a new approach, called TANA, for inferring protein functions. The main originality of the approach lies in predicting the functions of an unannotated protein by transferring annotations via a network alignment as well as from the direct interaction neighborhood within its PPI network. In doing so, we are able to discover the functions of proteins that could not be easily described by sequence homology. We assess the performance of our approach using the standard metrics established by the CAFA challenge and highlight a sharp and significant improvement over other competitive methods, in particular for predicting molecular functions and cellular components. Conclusions: This research is one of the first attempts to combine sequence-based and multiple-network-alignment-based function prediction approaches. We have been able to assess the accuracy of the prediction using pairwise and multiple alignment of the PPI networks for the compared species. Therefore, we recommend using different strategies (i.e., pairwise, multiple, with/without neighborhood networks), especially in situations where the functions of the protein are not known beforehand.
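
The abstract does not say which CAFA metrics were used. One metric commonly associated with CAFA is the protein-centric F-max: the best harmonic mean of precision and recall over all score thresholds. The sketch below is one plain reading of that definition. The function name `f_max`, the input layout, and the threshold grid are assumptions, not the evaluation code of the paper.

```python
def f_max(predictions, truth, thresholds=None):
    """Protein-centric F-max over a grid of score thresholds.

    predictions: dict, protein -> dict of term -> confidence score in [0, 1]
    truth:       dict, protein -> non-empty set of true terms
    Both layouts are hypothetical and chosen for this sketch.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 100)]

    best = 0.0
    for t in thresholds:
        precisions = []   # one entry per protein with at least one prediction at t
        recall_sum = 0.0  # recall is averaged over all benchmark proteins

        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)

        if not precisions:
            continue  # nothing was predicted at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Precision is averaged only over proteins that receive at least one prediction at a given threshold, whereas recall is averaged over every protein in the benchmark, so a method that abstains on many proteins is penalized through recall.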


2020 ◽  
Author(s):  
A. Khanteymoori ◽  
M. B. Ghajehlo ◽  
S. Behrouzinia ◽  
M. H. Olyaee

Abstract Protein function prediction based on protein-protein interactions (PPI) is one of the most important challenges of the post-genomic era. Because determining protein function by experimental techniques can be costly, function prediction has become an important challenge for computational biology and bioinformatics. Some researchers apply graph-based (or network-based) methods to PPI networks to annotate unannotated proteins. The aim of this study is to increase the accuracy of protein function prediction using two proposed methods. To predict protein functions, we propose Protein Function Prediction based on Clique Analysis (ProCbA) and Protein Function Prediction on Neighborhood Counting using functional aggregation (ProNC-FA). Both ProCbA and ProNC-FA can predict the functions of unknown proteins. In addition, with ProNC-FA, which does not introduce a new algorithm, we try to address the incomplete and noisy nature of PPI data in order to obtain a network with complete functional aggregation. The experimental results on MIPS data and on the 17 different datasets described in this study validate the encouraging performance and the strength of both ProCbA and ProNC-FA for function prediction. The analysis of the experimental results in Section IV shows that both ProCbA and ProNC-FA generally outperform all the other methods.


2019 ◽  
Vol 3 (4) ◽  
pp. 357-369
Author(s):  
J. Harry Caufield ◽  
Peipei Ping

Abstract Protein–protein interactions, or PPIs, constitute a basic unit of our understanding of protein function. Though substantial effort has been made to organize PPI knowledge into structured databases, maintenance of these resources requires careful manual curation. Even then, many PPIs remain uncurated within unstructured text data. Extracting PPIs from experimental research supports assembly of PPI networks and highlights relationships crucial to elucidating protein functions. Isolating specific protein–protein relationships from numerous documents is technically demanding by both manual and automated means. Recent advances in the design of these methods have leveraged emerging computational developments and have demonstrated impressive results on test datasets. In this review, we discuss recent developments in PPI extraction from unstructured biomedical text. We explore the historical context of these developments, recent strategies for integrating and comparing PPI data, and their application to advancing the understanding of protein function. Finally, we describe the challenges facing the application of PPI mining to the text concerning protein families, using the multifunctional 14-3-3 protein family as an example.


2018 ◽  
Author(s):  
Cen Wan ◽  
Domenico Cozzetto ◽  
Rui Fa ◽  
David T. Jones

Protein-protein interaction network data provides valuable information for inferring direct links between genes and their biological roles. This information supports a fundamental hypothesis for protein function prediction: interacting proteins tend to have similar functions. With the help of recently developed network embedding feature generation methods and deep maxout neural networks, it is possible to extract functional representations that encode direct links between protein-protein interaction information and protein function. Our novel method, STRING2GO, successfully adopts deep maxout neural networks to learn functional representations that simultaneously encode both protein-protein interactions and functional predictive information. The experimental results show that STRING2GO outperforms other network embedding-based prediction methods and one benchmark method adopted in a recent large-scale protein function prediction competition.
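
For readers unfamiliar with maxout networks, the sketch below shows the forward pass of a single maxout layer: each output unit takes the maximum over several affine "pieces" of the input. It is a generic, dependency-free illustration and not the STRING2GO architecture. The function name `maxout_layer` and the nested-list parameter layout are assumptions.

```python
def maxout_layer(x, weights, biases):
    """Forward pass of one maxout layer in plain Python.

    x:       list of input features
    weights: weights[j][k] is the weight vector of piece k of output unit j
    biases:  biases[j][k] is the bias of piece k of output unit j
    Returns one activation per output unit: max over pieces of (w . x + b).
    """
    outputs = []
    for unit_weights, unit_biases in zip(weights, biases):
        pieces = [
            sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
            for w, b in zip(unit_weights, unit_biases)
        ]
        outputs.append(max(pieces))
    return outputs
```

Because the activation is the maximum of learned linear functions, each unit learns its own piecewise-linear, convex activation shape instead of using a fixed nonlinearity.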


2017 ◽  
Author(s):  
Pin-San Xu ◽  
Jun Luo ◽  
Tong-Yi Dou

Most biological processes within a cell are carried out by protein-protein interaction (PPI) networks, the subject of so-called interactomics. Therefore, identification of PPIs is crucial to elucidating protein functions and to further understanding various cellular biological processes. A series of high-throughput experimental technologies for detecting PPIs has been presented. However, the time-consuming and labor-intensive nature of these methods has led researchers to turn to computational methods for PPI prediction. Herein, we developed a new predictor that uses a stacking algorithm with information extracted by wavelet transform. When applied to the Saccharomyces cerevisiae PPI dataset, the proposed method achieved a prediction accuracy of 83.35%, with a sensitivity of 92.95% at a specificity of 65.41%. An independent dataset of 2726 Helicobacter pylori PPIs was also used to evaluate this prediction model, and the prediction accuracy was 80.39%, which is better than that of most existing methods.
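
The accuracy, sensitivity, and specificity figures quoted above follow the standard definitions for binary classification. The sketch below computes them from confusion-matrix counts. The function name `binary_metrics` is an assumption, and the counts in the usage lines are invented for illustration, not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: interacting pairs predicted correctly / missed
    tn/fp: non-interacting pairs predicted correctly / wrongly flagged
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Illustrative counts only (not from the paper).
metrics = binary_metrics(tp=90, fp=40, tn=60, fn=10)
print(metrics)
```

Accuracy is the average of sensitivity and specificity weighted by the number of positive and negative examples, so it always lies between the two.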

