Increasing the precision of orthology-based complex prediction through network alignment

2014 ◽  
Author(s):  
Roland Pache ◽  
Patrick Aloy

Macromolecular assemblies play an important role in almost all cellular processes. However, despite several large-scale studies, our current knowledge about protein complexes is still quite limited, thus advocating the use of in silico predictions to gather information on complex composition in model organisms. Since protein-protein interactions present certain constraints on the functional divergence of macromolecular assemblies during evolution, it is possible to predict complexes based on orthology data. Here, we show that incorporating interaction information through network alignment significantly increases the precision of orthology-based complex prediction. Moreover, we performed a large-scale in silico screen for protein complexes in human, yeast and fly, through the alignment of hundreds of known complexes to whole-organism interactomes. Systematic comparison of the resulting network alignments to all complexes currently known in those species revealed many conserved complexes, as well as several novel complex components. In addition to validating our predictions using orthogonal data, we were able to assign specific functional roles to the predicted complexes. In several cases, the incorporation of interaction data through network alignment allowed us to distinguish real complex components from other orthologous proteins. Our analyses indicate that current knowledge of yeast protein complexes exceeds that in other organisms and that predicting complexes in fly based on human and yeast data is complementary rather than redundant. Lastly, assessing the conservation of protein complexes of the human pathogen Mycoplasma pneumoniae, we discovered that its complex repertoire is different from that of eukaryotes, suggesting new points of therapeutic intervention, whereas targeting the pathogen’s Restriction enzyme complex might lead to adverse effects due to its similarity to ATP-dependent metalloproteases in the human host.
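
The idea of filtering orthology-based predictions with interaction data can be illustrated with a minimal Python sketch. It is not the authors' network alignment procedure: the function name `predict_complex`, the `min_partners` parameter, and the toy protein identifiers are all hypothetical, and the filter used here (keep a candidate only if it interacts with at least one other candidate in the target interactome) is a deliberately simple stand-in for a real alignment score.

```python
def predict_complex(known_complex, orthologs, target_interactions, min_partners=1):
    """Map a known complex to a target species and keep interaction-supported candidates.

    known_complex: subunit identifiers in the source species.
    orthologs: dict mapping a source subunit to its target-species orthologs.
    target_interactions: iterable of (protein, protein) pairs in the target interactome.
    Returns (orthology_only, interaction_supported) as two sets.
    """
    # Step 1: orthology-only prediction, i.e. every ortholog of every subunit.
    candidates = set()
    for subunit in known_complex:
        candidates.update(orthologs.get(subunit, ()))

    # Store interactions as unordered pairs so lookups ignore direction.
    edges = {frozenset(pair) for pair in target_interactions}

    # Step 2 (assumed filter): keep candidates that interact with at least
    # `min_partners` other candidates in the target interactome.
    supported = set()
    for protein in candidates:
        partners = sum(
            1
            for other in candidates
            if other != protein and frozenset((protein, other)) in edges
        )
        if partners >= min_partners:
            supported.add(protein)
    return candidates, supported


# Toy data: subunit "A" has two orthologs, but only "a1" interacts with the rest.
known = ["A", "B", "C"]
ortholog_map = {"A": ["a1", "a2"], "B": ["b1"], "C": ["c1"]}
interactome = [("a1", "b1"), ("b1", "c1")]

orthology_only, supported = predict_complex(known, ortholog_map, interactome)
print(sorted(orthology_only))
print(sorted(supported))
```

In this toy case the orthology-only set contains both `a1` and `a2`, while the interaction-supported set drops `a2`, mirroring how interaction data can separate likely complex components from other orthologous proteins.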


2021 ◽  
pp. 074873042110146
Author(s):  
Alexander E. Mosier ◽  
Jennifer M. Hurley

The circadian clock is the broadly conserved, protein-based, timekeeping mechanism that synchronizes biology to the Earth’s 24-h light-dark cycle. Studies of the mechanisms of circadian timekeeping have placed great focus on the role that individual protein-protein interactions play in the creation of the timekeeping loop. However, research has shown that clock proteins most commonly act as part of large macromolecular protein complexes to facilitate circadian control over physiology. The formation of these complexes has led to the large-scale study of the proteins that make up these complexes, termed here “circadian interactomics.” Circadian interactomic studies of the macromolecular protein complexes that make up the circadian clock have uncovered many basic principles of circadian timekeeping as well as mechanisms of circadian control over cellular physiology. In this review, we examine the wealth of knowledge accumulated using circadian interactomics approaches to investigate the macromolecular complexes of the core circadian clock, including insights into the core mechanisms that impart circadian timing and the clock’s regulation of many physiological processes. We examine data acquired from the investigation of the macromolecular complexes centered on both the activating and repressing arms of the circadian clock and from many circadian model organisms.


2021 ◽  
Vol 12 ◽  
Author(s):  
Krishna B. S. Swamy ◽  
Scott C. Schuyler ◽  
Jun-Yi Leu

Proteins are the workhorses of the cell and execute many of their functions by interacting with other proteins to form protein complexes. Multi-protein complexes are an admixture of subunits, change their interaction partners, and modulate their functions and cellular physiology in response to environmental changes. When two species mate, the hybrid offspring are usually inviable or sterile because large-scale differences in the genetic makeup of the two parents cause incompatible genetic interactions. Such reciprocal-sign epistasis between inter-specific alleles is not limited to incompatible interactions between just one gene pair and usually involves multiple genes. Many of these multi-locus incompatibilities show visible defects only in the presence of all the interactions, making them hard to characterize. Understanding the dynamics of protein-protein interactions (PPIs) leading to multi-protein complexes is better suited to characterizing multi-locus incompatibilities than the traditional approaches of genetics and molecular biology. Advances in omics technologies, which include genomics, transcriptomics, and proteomics, can help achieve this end. This is especially relevant when studying non-model organisms. Here, we discuss the recent progress in the understanding of hybrid genetic incompatibility, omics technologies, and how together they have helped in characterizing protein complexes and, in turn, multi-locus incompatibilities. We also review advances in bioinformatic techniques suitable for this purpose and propose directions for leveraging the knowledge gained from model organisms to identify genetic incompatibilities in non-model organisms.


microLife ◽  
2021 ◽  
Author(s):  
Vanessa Lamm-Schmidt ◽  
Manuela Fuchs ◽  
Johannes Sulzer ◽  
Milan Gerovac ◽  
Jens Hör ◽  
...  

Much of our current knowledge about cellular RNA-protein complexes in bacteria is derived from analyses in gram-negative model organisms, with the discovery of RNA-binding proteins (RBPs) generally lagging behind in gram-positive species. Here, we have applied Grad-seq analysis of native RNA-protein complexes to a major gram-positive human pathogen, Clostridioides difficile, whose RNA biology remains largely unexplored. Our analysis resolves in-gradient distributions for ∼88% of all annotated transcripts and ∼50% of all proteins, thereby providing a comprehensive resource for the discovery of RNA-protein and protein-protein complexes in C. difficile and related microbes. The sedimentation profiles together with pulldown approaches identify KhpB, previously identified in Streptococcus pneumoniae, as an uncharacterized, pervasive RBP in C. difficile. Global RIP-seq analysis establishes a large suite of mRNA and small RNA targets of KhpB, similar to the scope of the Hfq targetome in C. difficile. The KhpB-bound transcripts include several functionally related mRNAs encoding virulence-associated metabolic pathways and toxin A, whose transcript levels are observed to be increased in a khpB deletion strain. Moreover, the production of toxin protein is also increased upon khpB deletion. In summary, this study expands our knowledge of cellular RNA-protein interactions in C. difficile and supports the emerging view that KhpB homologues constitute a new class of globally acting RBPs in gram-positive bacteria.


2019 ◽  
Author(s):  
Wojciech Michalak ◽  
Vasileios Tsiamis ◽  
Veit Schwämmle ◽  
Adelina Rogowska-Wrzesińska

We have developed ComplexBrowser, an open source, online platform for supervised analysis of quantitative proteomics data that focuses on protein complexes. The software uses information from the CORUM and Complex Portal databases to identify protein complex components. Based on the expression changes of individual complex subunits across the proteomics experiment, it calculates a Complex Fold Change (CFC) factor that characterises the overall protein complex expression trend and the level of subunit co-regulation. Thus up- and down-regulated complexes can be identified. It provides interactive visualisation of protein complex composition and expression for exploratory analysis. It also incorporates a quality control step that includes normalisation and statistical analysis based on the Limma test. ComplexBrowser performance was tested on two previously published proteomics studies identifying changes in protein expression in human adenocarcinoma tissue and during activation of mouse T-cells. The analysis revealed 1519 and 332 protein complexes, of which 233 and 41 were found co-ordinately regulated in the respective studies. The adopted approach provided evidence for a shift to glucose-based metabolism and high proliferation in adenocarcinoma tissues and identification of chromatin remodelling complexes involved in mouse T-cell activation. The results correlate with the original interpretation of the experiments and also provide novel biological details about the protein complexes affected.
ComplexBrowser is, to our knowledge, the first tool to automate quantitative protein complex analysis for high-throughput studies, providing insights into protein complex regulation within minutes of analysis.
A fully functional demo version of ComplexBrowser v1.0 is available online via http://computproteomics.bmb.sdu.dk/Apps/ComplexBrowser/
The source code can be downloaded from: https://bitbucket.org/michalakw/complexbrowser
Highlights: Automated analysis of protein complexes in proteomics experiments. Quantitative measure of the coordinated changes in protein complex components. Interactive visualisations for exploratory analysis of proteomics results.
In brief: ComplexBrowser is capable of identifying protein complexes in datasets obtained from large-scale quantitative proteomics experiments. It provides, in the form of the CFC factor, a quantitative measure of the coordinated changes in complex components. This facilitates assessing the overall trends in the processes governed by the identified protein complexes, providing a new and complementary way of interpreting proteomics experiments.
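
The abstract does not give the formula for the CFC factor, so the following Python sketch only illustrates the general idea of summarising subunit-level changes at the complex level. It is an assumption-laden stand-in, not ComplexBrowser's calculation: the function `summarise_complex`, the use of the median log2 fold change as the trend, and the "fraction of subunits moving in the trend's direction" as a co-regulation measure are all hypothetical choices.

```python
from statistics import median


def summarise_complex(log2_fold_changes):
    """Summarise subunit log2 fold changes for one complex.

    Returns (trend, coherence):
      trend     -- median log2 fold change across quantified subunits
      coherence -- fraction of subunits whose change has the same sign as the trend
    Both measures are illustrative assumptions, not the published CFC definition.
    """
    values = list(log2_fold_changes)
    if not values:
        raise ValueError("complex has no quantified subunits")
    trend = median(values)
    # A subunit counts as co-regulated when its change shares the trend's sign.
    same_direction = sum(1 for v in values if v * trend > 0)
    return trend, same_direction / len(values)


# Toy complex with four quantified subunits; three go up, one goes down.
trend, coherence = summarise_complex([1.0, 1.5, 0.5, -0.5])
print(trend, coherence)
```

A complex with a clearly positive trend and high coherence would be a candidate for co-ordinated up-regulation under this simplified scheme; a real analysis would also need the normalisation and statistical testing that the abstract mentions.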


2018 ◽  
Author(s):  
Yanhui Hu ◽  
Richelle Sopko ◽  
Verena Chung ◽  
Romain A. Studer ◽  
Sean D. Landry ◽  
...  

Post-translational modification (PTM) serves as a regulatory mechanism for protein function, influencing stability, protein interactions, activity and localization, and is critical in many signaling pathways. The best characterized PTM is phosphorylation, whereby a phosphate is added to an acceptor residue, commonly serine, threonine or tyrosine. As proteins are often phosphorylated at multiple sites, identifying those sites that are important for function is a challenging problem. Considering that many phosphorylation sites may be non-functional, prioritizing evolutionarily conserved phosphosites provides a general strategy to identify the putative functional sites with regard to regulation and function. To facilitate the identification of conserved phosphosites, we generated a large-scale phosphoproteomics dataset from Drosophila embryos collected from six closely related species. We built iProteinDB (https://www.flyrnai.org/tools/iproteindb/), a resource integrating these data with other high-throughput PTM datasets, including vertebrates, and manually curated information for Drosophila. At iProteinDB, scientists can view the PTM landscape for any Drosophila protein and identify predicted functional phosphosites based on a comparative analysis of data from closely related Drosophila species. Further, iProteinDB enables comparison of PTM data from Drosophila to that of orthologous proteins from other model organisms, including human, mouse, rat, Xenopus laevis, Danio rerio, and Caenorhabditis elegans.


2014 ◽  
Vol 25 (20) ◽  
pp. 3178-3194 ◽  
Author(s):  
Georg H. H. Borner ◽  
Marco Y. Hein ◽  
Jennifer Hirst ◽  
James R. Edgar ◽  
Matthias Mann ◽  
...  

We developed “fractionation profiling,” a method for rapid proteomic analysis of membrane vesicles and protein particles. The approach combines quantitative proteomics with subcellular fractionation to generate signature protein abundance distribution profiles. Functionally associated groups of proteins are revealed through cluster analysis. To validate the method, we first profiled >3500 proteins from HeLa cells and identified known clathrin-coated vesicle proteins with >90% accuracy. We then profiled >2400 proteins from Drosophila S2 cells, and we report the first comprehensive insect clathrin-coated vesicle proteome. Of importance, the cluster analysis extends to all profiled proteins and thus identifies a diverse range of known and novel cytosolic and membrane-associated protein complexes. We show that it also allows the detailed compositional characterization of complexes, including the delineation of subcomplexes and subunit stoichiometry. Our predictions are presented in an interactive database. Fractionation profiling is a universal method for defining the clathrin-coated vesicle proteome and may be adapted for the analysis of other types of vesicles and particles. In addition, it provides a versatile tool for the rapid generation of large-scale protein interaction maps.
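
The core idea of comparing abundance distribution profiles across fractions can be sketched in a few lines of Python. This is a simplified illustration, not the published workflow: instead of a full cluster analysis it ranks proteins by Pearson correlation with a single marker profile, and the function names, the 0.9 threshold, and the toy protein names and abundances are all hypothetical.

```python
from math import sqrt


def normalise(profile):
    """Scale a profile so its values sum to 1 (relative distribution across fractions)."""
    total = sum(profile)
    return [x / total for x in profile]


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes neither is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def co_fractionating(profiles, marker, threshold=0.9):
    """Return proteins whose normalised profile correlates with the marker's profile."""
    reference = normalise(profiles[marker])
    return sorted(
        name
        for name, profile in profiles.items()
        if name != marker and pearson(reference, normalise(profile)) >= threshold
    )


# Toy abundances across four fractions (invented numbers).
profiles = {
    "marker_protein": [1, 4, 10, 3],
    "candidate_1": [2, 8, 20, 6],   # same shape as the marker, twice the abundance
    "candidate_2": [10, 4, 1, 1],   # peaks in a different fraction
}
print(co_fractionating(profiles, "marker_protein"))
```

Normalising each profile first means proteins are grouped by where they distribute across fractions rather than by how abundant they are, which is why `candidate_1` matches the marker despite its higher absolute values.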


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 782 ◽  
Author(s):  
Virja Mehta ◽  
Laura Trinkle-Mulcahy

Protein-protein interactions (PPIs) underlie most, if not all, cellular functions. The comprehensive mapping of these complex networks of stable and transient associations thus remains a key goal, both for systems biology-based initiatives (where it can be combined with other ‘omics’ data to gain a better understanding of functional pathways and networks) and for focused biological studies. Despite the significant challenges of such an undertaking, major strides have been made over the past few years. They include improvements in the computational prediction of PPIs and the literature curation of low-throughput studies of specific protein complexes, but also an increase in the deposition of high-quality data from non-biased high-throughput experimental PPI mapping strategies into publicly available databases.


2021 ◽  
Author(s):  
Jimin Pei ◽  
Jing Zhang ◽  
Qian Cong

Recent development of deep-learning methods has led to a breakthrough in the prediction accuracy of 3-dimensional protein structures. Extending these methods to protein pairs is expected to allow large-scale detection of protein-protein interactions and modeling of protein complexes at the proteome level. We applied RoseTTAFold and AlphaFold2, two of the latest deep-learning methods for structure prediction, to analyze coevolution of human proteins residing in mitochondria, an organelle of vital importance in many cellular processes including energy production, metabolism, cell death, and antiviral response. Variations in mitochondrial proteins have been linked to a plethora of human diseases and genetic conditions. RoseTTAFold, with high computational speed, was used to predict the coevolution of about 95% of mitochondrial protein pairs. Top-ranked pairs were further subjected to the modeling of the complex structures by AlphaFold2, which also produced contact probability with high precision and in many cases consistent with RoseTTAFold. Most of the top-ranked pairs with high contact probability were supported by known protein-protein interactions and/or similarities to experimental structural complexes. For high-scoring pairs without experimental complex structures, our coevolution analyses and structural models shed light on the details of their interfaces, including CHCHD4-AIFM1, MTERF3-TRUB2, FMC1-ATPAF2, ECSIT-NDUFAF1 and COQ7-COQ9, among others. We also identified novel PPIs (PYURF-NDUFAF5, LYRM1-MTRF1L and COA8-COX10) for several proteins without experimentally characterized interaction partners, leading to predictions of their molecular functions and the biological processes they are involved in.

