A thorough analysis of the contribution of experimental, derived and sequence-based predicted protein-protein interactions for functional annotation of proteins

2019 ◽  
Author(s):  
Stavros Makrodimitris ◽  
Marcel Reinders ◽  
Roeland van Ham

Physical interaction between two proteins is strong evidence that the proteins are involved in the same biological process, making Protein-Protein Interaction (PPI) networks a valuable data resource for predicting the cellular functions of proteins. However, PPI networks are largely incomplete for non-model species. Here, we tested to what extent these incomplete networks are still useful for genome-wide function prediction. We used two network-based classifiers to predict Biological Process Gene Ontology terms from protein interaction data in four species: Saccharomyces cerevisiae, Escherichia coli, Arabidopsis thaliana and Solanum lycopersicum (tomato). The classifiers had reasonable performance in the well-studied yeast, but performed poorly in the other species. We showed that this poor performance can be considerably improved by adding edges predicted from various data sources, such as text mining, and that associations from the STRING database are more useful than interactions predicted by a neural network from sequence-based features.
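
To illustrate why a sparse network limits function prediction and why added edges can help, here is a minimal guilt-by-association sketch. The abstract does not name its two classifiers, so this neighbour-voting scorer is only an assumed stand-in for one common kind of network-based classifier; the protein names and the term label are hypothetical.

```python
def neighbor_vote(adjacency, annotations, protein, term):
    """Fraction of a protein's network neighbours annotated with the term."""
    neighbours = adjacency.get(protein, set())
    if not neighbours:
        return 0.0  # no edges: the network offers no evidence for this protein
    votes = sum(1 for n in neighbours if term in annotations.get(n, set()))
    return votes / len(neighbours)

def add_edges(adjacency, edges):
    """Return a copy of the network with extra (for example predicted) edges."""
    merged = {p: set(ns) for p, ns in adjacency.items()}
    for a, b in edges:
        merged.setdefault(a, set()).add(b)
        merged.setdefault(b, set()).add(a)
    return merged

# Hypothetical toy data: protein "q" has no experimental interactions.
experimental = {"q": set(), "x": {"y"}, "y": {"x"}}
annotations = {"x": {"term_A"}, "y": {"term_A"}}

before = neighbor_vote(experimental, annotations, "q", "term_A")
augmented = add_edges(experimental, [("q", "x"), ("q", "y")])
after = neighbor_vote(augmented, annotations, "q", "term_A")
```

In this toy case the isolated protein receives no score from the experimental network alone, and only becomes predictable once the extra edges are merged in.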


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0242723
Author(s):  
Stavros Makrodimitris ◽  
Marcel Reinders ◽  
Roeland van Ham

Physical interaction between two proteins is strong evidence that the proteins are involved in the same biological process, making Protein-Protein Interaction (PPI) networks a valuable data resource for predicting the cellular functions of proteins. However, PPI networks are largely incomplete for non-model species. Here, we tested to what extent these incomplete networks are still useful for genome-wide function prediction. We used two network-based classifiers to predict Biological Process Gene Ontology terms from protein interaction data in four species: Saccharomyces cerevisiae, Escherichia coli, Arabidopsis thaliana and Solanum lycopersicum (tomato). The classifiers had reasonable performance in the well-studied yeast, but performed poorly in the other species. We showed that this poor performance can be considerably improved by adding edges predicted from various data sources, such as text mining, and that associations from the STRING database are more useful than interactions predicted by a neural network from sequence-based features.



2015 ◽  
Vol 4 (4) ◽  
pp. 35-51 ◽  
Author(s):  
Bandana Barman ◽  
Anirban Mukhopadhyay

Identifying protein interaction networks is important for finding the cell signaling pathways involved in a particular disease. The authors identified differentially expressed genes between two HIV-1 sample groups: wild-type HIV-1 Vpr and mutant HIV-1 Vpr (R80A). They applied a statistical t-test and estimated the false discovery rate (FDR) to identify genes with increased expression (up-regulated) or decreased expression (down-regulated), computing q-values to find the minimum FDR at which each test can be called significant. This yielded 172 differentially expressed genes, of which 68 were up-regulated and 104 down-regulated. From these 172 genes, the authors built a protein-protein interaction network with string-db and clustered it into subnetworks with Cytoscape 3.0. Finally, they assessed the significance of the subnetworks through Gene Ontology analysis and examined the KEGG pathways of those subnetworks.
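
The multiple-testing step (per-gene p-values converted into FDR-adjusted values) can be sketched as follows. This minimal example applies the Benjamini-Hochberg adjustment, which is an assumption: the abstract does not state which q-value procedure the authors used. The p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from a two-sample t-test.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
qvals = benjamini_hochberg(pvals)

# Genes called differentially expressed at an FDR threshold of 0.05.
significant = [i for i, q in enumerate(qvals) if q <= 0.05]
```

Note that the third and fourth genes have raw p-values below 0.05 but are not called significant after adjustment, which is the purpose of controlling the FDR across many genes.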



Author(s):  
Hugo Willy

Recent breakthroughs in high-throughput experiments for determining protein-protein interactions have generated a vast amount of protein interaction data. However, most of these experiments can only answer whether two proteins interact, not the mechanisms by which they interact. Understanding those mechanisms is crucial for understanding the protein interactions of an organism as a whole (the interactome) and even for predicting novel protein interactions. Protein interactions usually occur at specific sites on the proteins and, given their importance, these sites are usually well conserved throughout the evolution of proteins of the same family. Based on this observation, a number of works on finding protein patterns/motifs conserved in interacting proteins have emerged in the last few years. Such motifs are collectively termed interaction motifs. This chapter reviews the different approaches to finding interaction motifs and discusses their implications, potential and possible areas of improvement.



Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Gregorio Alanis-Lobato ◽  
Jannik S Möllmann ◽  
Martin H Schaefer ◽  
Miguel A Andrade-Navarro

Cells operate and react to environmental signals thanks to a complex network of protein–protein interactions (PPIs), the malfunction of which can severely disrupt cellular homeostasis. As a result, mapping and analyzing protein networks are key to advancing our understanding of biological processes and diseases. An invaluable part of these endeavors has been the house mouse (Mus musculus), the mammalian model organism par excellence, which has provided insights into human biology and disorders. The importance of investigating PPI networks in the context of mouse prompted us to develop the Mouse Integrated Protein–Protein Interaction rEference (MIPPIE). MIPPIE inherits a robust infrastructure from HIPPIE, its sister database of human PPIs, allowing for the assembly of reliable networks supported by different evidence sources and high-quality experimental techniques. MIPPIE networks can be further refined with tissue, directionality and effect information through a user-friendly web interface. Moreover, all MIPPIE data and meta-data can be accessed via a REST web service or downloaded as text files, thus facilitating the integration of mouse PPIs into follow-up bioinformatics pipelines.



Author(s):  
Guofeng Lv ◽  
Zhiqiang Hu ◽  
Yanguang Bi ◽  
Shaoting Zhang

The study of multi-type Protein-Protein Interaction (PPI) is fundamental for understanding biological processes from a systematic perspective and revealing disease mechanisms. Existing methods suffer from significant performance degradation when tested on unseen datasets. In this paper, we investigate the problem and find that it is mainly attributable to poor performance on inter-novel-protein interaction prediction. However, current evaluations overlook inter-novel-protein interactions and thus fail to give an instructive assessment. We therefore propose to address the problem from both the evaluation and the methodology sides. Firstly, we design a new evaluation framework that fully respects inter-novel-protein interactions and gives consistent assessment across datasets. Secondly, we argue that correlations between proteins must provide useful information for the analysis of novel proteins and, based on this, we propose a graph neural network based method (GNN-PPI) for better inter-novel-protein interaction prediction. Experimental results on real-world datasets of different scales demonstrate that GNN-PPI significantly outperforms state-of-the-art PPI prediction methods, especially for inter-novel-protein interaction prediction.
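
One way to make the evaluation concern concrete is to label each test interaction by how many of its two proteins already occur in the training set, so that performance can be reported separately for interactions involving unseen proteins. The sketch below only illustrates that bookkeeping and is not the authors' code; the category names and protein identifiers are assumptions.

```python
def categorize_test_edges(train_edges, test_edges):
    """Group test interactions by how many endpoints are absent from training."""
    seen = {protein for edge in train_edges for protein in edge}
    groups = {"both_seen": [], "one_novel": [], "both_novel": []}
    for a, b in test_edges:
        # Booleans add as integers: 0, 1 or 2 novel endpoints.
        n_novel = (a not in seen) + (b not in seen)
        key = ("both_seen", "one_novel", "both_novel")[n_novel]
        groups[key].append((a, b))
    return groups

# Hypothetical toy split.
train = [("P1", "P2"), ("P2", "P3")]
test = [("P1", "P3"), ("P3", "P4"), ("P4", "P5")]
groups = categorize_test_edges(train, test)
```

A random edge split tends to put most test interactions in the first group, which can hide weak performance on the other two.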



Author(s):  
Tatsuya Akutsu ◽  
Morihiro Hayashida

Many methods have been proposed for inference of protein-protein interactions from protein sequence data. This chapter focuses on methods based on domain-domain interactions, where a domain is defined as a region within a protein that either performs a specific function or constitutes a stable structural unit. In these methods, the probabilities of domain-domain interactions are inferred from known protein-protein interaction data and protein domain data, and interactions are then predicted from these probabilities and the domain content of the given proteins. This chapter gives an overview of several fundamental methods, including the association method, the expectation maximization-based method, the support vector machine-based method, and the linear programming-based method. This chapter also reviews a simple evolutionary model of protein domains, which yields a scale-free distribution of protein domains. By combining this model with a domain-based protein interaction model, a scale-free distribution of protein-protein interaction networks is also derived.
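
As a rough illustration of the association idea, a domain pair can be scored by the fraction of protein pairs containing it that are known to interact, and a protein pair can then be scored under the assumption that its domain pairs interact independently. The sketch below follows that common formulation; the exact estimators covered in the chapter may differ, and all protein and domain names are invented.

```python
from itertools import product

def domain_pairs(p, q, domains):
    """Unordered domain pairs formed between proteins p and q."""
    return {frozenset((d, e)) for d, e in product(domains[p], domains[q])}

def association_scores(domains, protein_pairs, interacting):
    """Score each domain pair as (interacting protein pairs) / (all protein pairs) containing it."""
    total, hits = {}, {}
    for p, q in protein_pairs:
        for dp in domain_pairs(p, q, domains):
            total[dp] = total.get(dp, 0) + 1
            if frozenset((p, q)) in interacting:
                hits[dp] = hits.get(dp, 0) + 1
    return {dp: hits.get(dp, 0) / n for dp, n in total.items()}

def interaction_probability(p, q, domains, scores):
    """1 - prod(1 - score) over domain pairs, assuming independent domain interactions."""
    prob_none = 1.0
    for dp in domain_pairs(p, q, domains):
        prob_none *= 1.0 - scores.get(dp, 0.0)
    return 1.0 - prob_none

# Hypothetical toy data.
domains = {"A": {"d1"}, "B": {"d2"}, "C": {"d1"}, "D": {"d2", "d3"}}
observed_pairs = [("A", "B"), ("C", "D")]
interacting = {frozenset(("A", "B"))}

scores = association_scores(domains, observed_pairs, interacting)
p_ad = interaction_probability("A", "D", domains, scores)
```

Here the pair (d1, d2) appears in two observed protein pairs, one of which interacts, so the unobserved protein pair A and D inherits a probability of 0.5 from it.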



2020 ◽  
Author(s):  
Brennan Klein ◽  
Ludvig Holmér ◽  
Keith M. Smith ◽  
Mackenzie M. Johnson ◽  
Anshuman Swain ◽  
...  

Protein-protein interaction (PPI) networks represent complex intra-cellular protein interactions, and the presence or absence of such interactions can lead to biological changes in an organism. Recent network-based approaches have shown that the resilience of a phenotype’s PPI network to environmental perturbations is related to its placement in the tree of life, though we still do not know how or why certain intra-cellular factors can bring about this resilience. One such factor is gene expression, which controls the simultaneous presence of proteins for allowed extant interactions and the possibility of novel associations. Here, we explore the influence of gene expression and network properties on a PPI network’s resilience, focusing especially on ribosomal proteins, vital molecular complexes involved in protein synthesis, which have been extensively and reliably mapped in many species. Using publicly available data on ribosomal PPIs for E. coli, S. cerevisiae, and H. sapiens, we compute changes in network resilience as new nodes (proteins) are added to the networks under three node addition mechanisms: random, degree-based, and gene-expression-based attachment. By calculating the resilience of the resulting networks, we estimate the effectiveness of these node addition mechanisms. We demonstrate that adding nodes with gene-expression-based preferential attachment (as opposed to random or degree-based) preserves and can increase the original resilience of the PPI network. This holds in all three species regardless of their distributions of gene expression or their network community structure. These findings introduce a general notion of prospective resilience, which highlights the key role of network structures in understanding the evolvability of phenotypic traits.
Author Summary: Proteins in organismal cells are present at different levels of concentration and interact with other proteins to provide specific functional roles. When all of these interactions are compiled, complex networks of protein interactions become apparent. This allows us to begin asking whether there are network-level mechanisms at play guiding the evolution of biological systems. Here, using this network perspective, we address two important themes in evolutionary biology: (i) How are biological systems able to successfully incorporate novelty? (ii) What is the evolutionary role of biological noise in evolutionary novelty? We consider novelty to be the introduction of a new protein, represented as a new “node”, into a network. We simulate the incorporation of novel proteins into Protein-Protein Interaction (PPI) networks in different ways and analyse how the resilience of the PPI network changes. We find that novel interactions guided by gene expression (indicative of concentration levels of proteins) create a more resilient network than either uniformly random interactions or interactions guided solely by the network structure (preferential attachment). Moreover, simulated biological noise in the gene expression increases network resilience. We suggest that biological noise induces novel structure in the PPI network, which has the effect of making it more resilient.
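
The three node addition mechanisms can be pictured as different weightings for choosing which existing proteins a new node attaches to. The code below is a simplified illustration using only the standard library. How the authors weight expression values, how many edges each new node receives, and how resilience is then computed are not given in the abstract, so the weighting functions, parameter names and toy values are all assumptions.

```python
import random

def attachment_weights(adjacency, expression, mechanism):
    """Return one sampling weight per existing node for the chosen mechanism."""
    nodes = sorted(adjacency)
    if mechanism == "random":
        return {n: 1.0 for n in nodes}
    if mechanism == "degree":
        return {n: float(len(adjacency[n])) for n in nodes}
    if mechanism == "expression":
        return {n: float(expression[n]) for n in nodes}
    raise ValueError(f"unknown mechanism: {mechanism}")

def add_node(adjacency, new_node, n_edges, weights, rng):
    """Attach new_node to n_edges distinct existing nodes, sampled by weight.

    Assumes all weights are positive; random.choices rejects an all-zero total.
    """
    candidates = dict(weights)
    adjacency[new_node] = set()
    for _ in range(min(n_edges, len(candidates))):
        names = list(candidates)
        target = rng.choices(names, weights=[candidates[n] for n in names])[0]
        adjacency[new_node].add(target)
        adjacency[target].add(new_node)
        del candidates[target]  # sample without replacement
    return adjacency

# Hypothetical toy network and expression levels.
network = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
expression = {"a": 5.0, "b": 1.0, "c": 0.5}

degree_weights = attachment_weights(network, expression, "degree")
expr_weights = attachment_weights(network, expression, "expression")
add_node(network, "new", 2, expr_weights, random.Random(0))
```

A resilience measure would then be evaluated on the network before and after each addition in order to compare the mechanisms.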



2017 ◽  
Vol 4 (1) ◽  
pp. 100056
Author(s):  
Gregorio Alanis-Lobato ◽  
Spyros Petrakis

Cellular functions are managed by a complex network of protein interactions, the malfunction of which may result in disease phenotypes. In spite of the incompleteness and noise present in our current protein interaction maps, computational biologists are making strenuous efforts to extract knowledge from these intricate networks and, through their integration with other types of biological data, expedite the development of novel and more effective treatments against human disorders. The 3rd Challenges in Computational Biology meeting revolved around the subject of Protein Interaction Networks and Disease, bringing expert network biologists to the city of Mainz, Germany to debate the current status and limitations of protein interaction data and computational resources. This editorial outlines the meeting's background and programme, putting special emphasis on the extended abstracts of contributed talks collected in the present issue of Genomics and Computational Biology.



2019 ◽  
Author(s):  
David Armanious ◽  
Jessica Schuster ◽  
George F. Tollefson ◽  
Anthony Agudelo ◽  
Andrew T. DeWan ◽  
...  

Background: Data analysis has become crucial in the post-genomic era, where the accumulation of genomic information is mounting exponentially. Analyzing protein-protein interactions in the context of the interactome is a powerful approach to understanding disease phenotypes.
Results: We describe Proteinarium, a multi-sample protein-protein interaction network analysis and visualization tool. Proteinarium can be used to analyze data for samples with dichotomous phenotypes, multiple samples from a single phenotype or a single sample. Then, by similarity clustering, the network-based relations of samples are identified and clusters of related samples are presented as a dendrogram. Each branch of the dendrogram is built based on network similarities of the samples. The protein-protein interaction networks can be analyzed and visualized on any branch of the dendrogram. Proteinarium’s input can be derived from transcriptome analysis, whole exome sequencing data or any high-throughput screening approach. Its strength lies in the use of a gene list for each sample as a distinct input, which is further analyzed through protein interaction analyses. Proteinarium output includes the gene lists of visualized networks and PPI files that users can analyze on other platforms such as Cytoscape. In addition, since the dendrogram is written in Newick tree format, users can visualize it in other software platforms such as Dendroscope or iTOL.
Conclusions: Proteinarium, through the analysis and visualization of PPI networks, allows researchers to make important observations on high-throughput data for a variety of research questions. Proteinarium identifies significant clusters of patients based on their shared network similarity for the disease of interest and the associated genes. Proteinarium is a command-line tool written in Java with no external dependencies and it is freely available at https://github.com/Armanious/Proteinarium.



2019 ◽  
Author(s):  
JE Tomkins ◽  
R Ferrari ◽  
N Vavouraki ◽  
J Hardy ◽  
RC Lovering ◽  
...  

The past decade has seen the rise of omics data for the understanding of biological systems in health and disease. This wealth of data includes protein-protein interaction (PPI) data derived from both low- and high-throughput assays, which are curated into multiple databases that capture the extent of available information from the peer-reviewed literature. Although these curation efforts are extremely useful, reliably downloading and integrating PPI data from the variety of available repositories is challenging and time consuming.
We here present a novel user-friendly web resource called PINOT (Protein Interaction Network Online Tool; available at http://www.reading.ac.uk/bioinf/PINOT/PINOT_form.html) to optimise the collection and processing of PPI data from the IMEx consortium associated repositories (members and observers) and from WormBase for constructing, respectively, human and C. elegans PPI networks.
Users submit a query containing a list of proteins of interest for which PINOT will mine PPIs. PPI data are downloaded, merged, quality checked, and confidence scored based on the number of distinct methods and publications in which each interaction has been reported. Examples of PINOT applications are provided to highlight the performance, the ease of use and the potential applications of this tool.
PINOT is a tool that allows users to survey the literature, extracting PPI data for a list of proteins of interest. The comparison with analogous tools showed that PINOT was able to extract similar numbers of PPIs while incorporating a set of innovative features. PINOT processes both small and large queries, downloads PPIs live through PSICQUIC and applies quality control filters to the downloaded PPI annotations, removing the need for manual inspection by the user. PINOT provides the user with information on detection methods and publication history for each downloaded interaction entry and provides results in a table format that can be easily customised further and/or directly uploaded into network visualization software.
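
The confidence scoring described above can be illustrated by counting, for each interaction, the distinct detection methods and distinct publications that report it. In this sketch the final score is simply the sum of the two counts. That combination rule, the record layout and the identifiers are assumptions for illustration and should not be read as PINOT's actual implementation.

```python
from collections import defaultdict

def score_interactions(records):
    """Score interactions from (protein_a, protein_b, method, publication) records."""
    methods = defaultdict(set)
    publications = defaultdict(set)
    for a, b, method, publication in records:
        pair = tuple(sorted((a, b)))  # treat A-B and B-A as the same interaction
        methods[pair].add(method)
        publications[pair].add(publication)
    return {
        pair: {
            "methods": len(methods[pair]),
            "publications": len(publications[pair]),
            # Assumed combination rule: sum of the two distinct counts.
            "score": len(methods[pair]) + len(publications[pair]),
        }
        for pair in methods
    }

# Hypothetical merged records, including a reversed and a repeated entry.
records = [
    ("P1", "P2", "two hybrid", "pub1"),
    ("P2", "P1", "pull down", "pub2"),
    ("P1", "P2", "pull down", "pub2"),
    ("P1", "P3", "two hybrid", "pub3"),
]
scored = score_interactions(records)
```

Counting distinct values rather than raw records keeps a single study that is curated into several repositories from inflating the score.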


