Correlations from structure and phylogeny combine constructively in the inference of protein partners from sequences

2021 ◽  
Author(s):  
Andonis Gerardos ◽  
Nicola Dietler ◽  
Anne-Florence Bitbol

Inferring protein-protein interactions from sequences is an important task in computational biology. Recent methods based on Direct Coupling Analysis (DCA) or Mutual Information (MI) make it possible to find interaction partners among paralogs of two protein families. Does successful inference mainly rely on correlations from structural contacts, from phylogeny, or from both? Do these two types of signal combine constructively or hinder each other? To address these questions, we generate and analyze synthetic data produced using a minimal model that allows us to control the amounts of structural constraints and phylogeny. We show that correlations from these two sources combine constructively to increase the performance of partner inference by DCA or MI. Furthermore, signal from phylogeny can rescue partner inference when signal from contacts becomes less informative, including in the realistic case where inter-protein contacts are restricted to a small subset of sites. We also demonstrate that DCA-inferred couplings between non-contact pairs of sites improve partner inference in the presence of strong phylogeny, while deteriorating it otherwise. Moreover, restricting to non-contact pairs of sites preserves inference performance in the presence of strong phylogeny. In a natural dataset, as well as in realistic synthetic data based on it, we find that non-contact pairs of sites contribute positively to partner inference performance, and that restricting to them preserves performance, evidencing an important role of phylogeny.
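To make the idea of MI-based partner inference concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes sequences are equal-length strings over a small alphabet, that a training set of known partner pairs is available, and that a candidate pair is scored by summing pointwise mutual information (PMI) over all inter-protein site pairs. The function names `train_pmi` and `pair_score` and the pseudocount value are hypothetical choices for illustration.

```python
import math
from itertools import product


def train_pmi(partner_pairs, alphabet="01", pseudocount=0.5):
    """Estimate PMI for every inter-protein site pair (i, j) and state pair (a, b).

    partner_pairs: list of (seq_a, seq_b) tuples of known partners.
    The pseudocount (an assumption of this sketch) avoids log(0).
    """
    len_a = len(partner_pairs[0][0])
    len_b = len(partner_pairs[0][1])
    q = len(alphabet)
    total = len(partner_pairs) + pseudocount
    pmi = {}
    for i, j in product(range(len_a), range(len_b)):
        # Joint counts, with the pseudocount spread uniformly over state pairs.
        joint = {(a, b): pseudocount / (q * q) for a in alphabet for b in alphabet}
        for seq_a, seq_b in partner_pairs:
            joint[(seq_a[i], seq_b[j])] += 1.0
        # Marginals are derived from the regularized joint so they stay consistent.
        marg_a = {a: sum(joint[(a, b)] for b in alphabet) / total for a in alphabet}
        marg_b = {b: sum(joint[(a, b)] for a in alphabet) / total for b in alphabet}
        for (a, b), count in joint.items():
            pmi[(i, j, a, b)] = math.log((count / total) / (marg_a[a] * marg_b[b]))
    return pmi


def pair_score(seq_a, seq_b, pmi):
    """Sum PMI over all inter-protein site pairs; higher means more partner-like."""
    return sum(
        pmi[(i, j, a, b)]
        for i, a in enumerate(seq_a)
        for j, b in enumerate(seq_b)
    )


# Toy training set in which each site of A always matches the same site of B.
training = [("00", "00"), ("11", "11"), ("01", "01"), ("10", "10")]
pmi = train_pmi(training)
matched = pair_score("01", "01", pmi)
mismatched = pair_score("01", "10", pmi)
```

In this toy example the matched candidate scores higher than the mismatched one. Note that the score does not distinguish between correlations arising from contacts and those arising from shared phylogeny, which is the question the abstract addresses.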

2018 ◽  
Vol 25 (1) ◽  
pp. 5-21 ◽  
Author(s):  
Ylenia Cau ◽  
Daniela Valensin ◽  
Mattia Mori ◽  
Sara Draghi ◽  
Maurizio Botta

14-3-3 is a class of proteins able to interact with a multitude of targets by establishing protein-protein interactions (PPIs). They are found in all eukaryotes, with a conserved secondary structure and high sequence homology among species. 14-3-3 proteins are involved in many physiological and pathological cellular processes, either by triggering or by interfering with the activity of specific protein partners. In recent years, the scientific community has gathered substantial evidence on the role played by the seven human 14-3-3 isoforms in cancer and neurodegenerative diseases. Indeed, these proteins regulate the molecular mechanisms associated with these diseases by interacting with (i) oncogenic proteins, (ii) pro-apoptotic proteins and (iii) proteins involved in Parkinson's and Alzheimer's diseases. The discovery of small molecule modulators of 14-3-3 PPIs could facilitate a complete understanding of the physiological role of these proteins, and might offer valuable therapeutic approaches for these critical pathological states.


2019 ◽  
Author(s):  
Guillaume Marmier ◽  
Martin Weigt ◽  
Anne-Florence Bitbol

Abstract: Determining which proteins interact together is crucial to a systems-level understanding of the cell. Recently, algorithms based on Direct Coupling Analysis (DCA) pairwise maximum-entropy models have made it possible to identify interaction partners among the paralogs of ubiquitous prokaryotic protein families, starting from sequence data alone. Since DCA allows one to infer the three-dimensional structure of protein complexes, its success in predicting protein-protein interactions could be mainly based on contacting residues coevolving to remain physicochemically complementary. However, interacting proteins often possess similar evolutionary histories, which also gives rise to correlations among their sequences. What is the role of purely phylogenetic correlations in the performance of DCA-based methods to infer interaction partners? To address this question, we employ controlled synthetic data that only involves phylogeny and no interactions or contacts. We find that DCA accurately identifies the pairs of synthetic sequences that only share evolutionary history. It performs as well as methods explicitly based on sequence similarity, and even slightly better with large and accurate training sets. We further demonstrate the ability of these various methods to correctly predict pairings among actual paralogous proteins with genome proximity but no known direct physical interaction, which illustrates the importance of phylogenetic correlations in real data. However, for actually interacting and strongly coevolving proteins, DCA and mutual information outperform sequence similarity.

Author summary: Many biologically important protein-protein interactions are conserved over evolutionary time scales. This leads to two different signals that can be used to computationally predict interactions between protein families and to identify specific interaction partners. First, the shared evolutionary history leads to highly similar phylogenetic relationships between interacting proteins of the two families. Second, the need to keep the interaction surfaces of partner proteins biophysically compatible causes a correlated amino-acid usage of interface residues. Employing simulated data, we show that the shared history alone can be used to detect partner proteins. Similar accuracies are achieved by algorithms comparing phylogenetic relationships and by coevolutionary methods based on Direct Coupling Analysis, which are a priori designed to detect the second type of signal. Using real sequence data, we show that in cases with shared evolutionary history but without known physical interactions, both methods work with similar accuracy, while for physically interacting systems, methods based on correlated amino-acid usage outperform purely phylogenetic ones.
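As an illustration of what "synthetic data that only involves phylogeny" could look like, the following Python sketch evolves a concatenated chain of binary (±1) sites along a binary branching tree and then splits each leaf into two halves that play the role of partner proteins. The abstract does not specify the generative model, so the branching scheme, the fixed number of site flips per branch, and the names `evolve_on_binary_tree` and `split_partners` are assumptions made here for illustration only.

```python
import random


def evolve_on_binary_tree(length, generations, mutations_per_branch, seed=0):
    """Return the leaves of a binary tree of sequences with random site flips.

    No couplings or contacts are involved: all correlations between sites
    come from shared ancestry alone.
    """
    rng = random.Random(seed)
    ancestor = [rng.choice((-1, 1)) for _ in range(length)]
    population = [ancestor]
    for _ in range(generations):
        next_population = []
        for parent in population:
            for _ in range(2):  # each sequence gives rise to two daughters
                child = parent[:]
                # Flip a fixed number of distinct, randomly chosen sites.
                for site in rng.sample(range(length), mutations_per_branch):
                    child[site] = -child[site]
                next_population.append(child)
        population = next_population
    return population


def split_partners(chains, length_a):
    """Cut each evolved chain into a (protein A, protein B) partner pair."""
    return [(chain[:length_a], chain[length_a:]) for chain in chains]


leaves = evolve_on_binary_tree(length=20, generations=3, mutations_per_branch=2, seed=1)
pairs = split_partners(leaves, length_a=10)
```

Because the two halves of each chain travel down the same tree, true partners share their entire evolutionary history, which is the only signal available to a partner-inference method on such data.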


2019 ◽  
Author(s):  
Carlos A. Gandarilla-Pérez ◽  
Pierre Mergny ◽  
Martin Weigt ◽  
Anne-Florence Bitbol

Identifying protein-protein interactions is crucial for a systems-level understanding of the cell. Recently, algorithms based on inverse statistical physics, e.g. Direct Coupling Analysis (DCA), have made it possible to use evolutionarily related sequences to address two conceptually related inference tasks: finding pairs of interacting proteins, and identifying pairs of residues which form contacts between interacting proteins. Here we address two underlying questions: How are the performances of both inference tasks related? How does performance depend on dataset size and quality? To this end, we formalize both tasks using Ising models defined over stochastic block models, with individual blocks representing single proteins, and inter-block couplings representing protein-protein interactions; controlled synthetic sequence data are generated by Monte-Carlo simulations. We show that DCA is able to address both inference tasks accurately when sufficiently large training sets of known interaction partners are available, and that an iterative pairing algorithm (IPA) makes predictions possible even without a training set. Noise in the training data deteriorates performance. In both tasks we find a quadratic scaling relating dataset quality and size that is consistent with noise adding in square-root fashion and signal adding linearly when increasing the dataset. This implies that it is generally good to incorporate more data even if its quality is imperfect, thereby shedding light on the empirically observed performance of DCA applied to natural protein sequences.
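The scaling statement above can be unpacked with a short back-of-the-envelope argument. This is a sketch under assumptions not spelled out in the abstract: $M$ denotes the number of training pairs, $\rho$ the fraction of them that are correct partners (a stand-in for dataset quality), and constants of proportionality are ignored.

```latex
\begin{align}
  \text{signal} &\propto \rho M, &
  \text{noise} &\propto \sqrt{M}, \\
  \frac{\text{signal}}{\text{noise}} &\propto \rho \sqrt{M}
  \quad\Longrightarrow\quad
  M \propto \frac{1}{\rho^{2}} \quad \text{at fixed performance.}
\end{align}
```

Under these assumptions, halving the quality requires roughly four times as much data to hold performance fixed, which is consistent with the abstract's conclusion that adding imperfect data is generally beneficial.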


2021 ◽  
Vol 43 (2) ◽  
pp. 767-781
Author(s):  
Vanessa Pinatto Gaspar ◽  
Anelise Cardoso Ramos ◽  
Philippe Cloutier ◽  
José Renato Pattaro Junior ◽  
Francisco Ferreira Duarte Junior ◽  
...  

KIN (Kin17) protein is overexpressed in a number of cancerous cell lines, and is therefore considered a possible cancer biomarker. It is a well-conserved protein across eukaryotes and is ubiquitously expressed in all cell types studied, suggesting an important role in the maintenance of basic cellular function which is yet to be well determined. Early studies on KIN suggested that this nuclear protein plays a role in cellular mechanisms such as DNA replication and/or repair; however, its association with chromatin depends on its methylation state. In order to provide a better understanding of the cellular role of this protein, we investigated its interactome by proximity-dependent biotin identification coupled to mass spectrometry (BioID-MS), used for identification of protein–protein interactions. Our analyses detected interaction with a novel set of proteins and reinforced previous observations linking KIN to factors involved in RNA processing, notably pre-mRNA splicing and ribosome biogenesis. However, little evidence supports that this protein is directly coupled to DNA replication and/or repair processes, as previously suggested. Furthermore, a novel interaction was observed with PRMT7 (protein arginine methyltransferase 7) and we demonstrated that KIN is modified by this enzyme. This interactome analysis indicates that KIN is associated with several cell metabolism functions, and shows for the first time an association with ribosome biogenesis, suggesting that KIN is likely a moonlight protein.


Author(s):  
Elise Delaforge ◽  
Sigrid Milles ◽  
Jie-rong Huang ◽  
Denis Bouvier ◽  
Malene Ringkjøbing Jensen ◽  
...  

2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Miaomiao Bai ◽  
Dongdong Ti ◽  
Qian Mei ◽  
Jiejie Liu ◽  
Xin Yan ◽  
...  

The human body is a complex structure of cells, which are exposed to many types of stress. Cells must utilize various mechanisms to protect their DNA from damage caused by metabolic and external sources to maintain genomic integrity and homeostasis and to prevent the development of cancer. DNA damage inevitably occurs regardless of physiological or abnormal conditions. In response to DNA damage, signaling pathways are activated to repair the damaged DNA or to induce cell apoptosis. During the process, posttranslational modifications (PTMs) can be used to modulate enzymatic activities and regulate protein stability, protein localization, and protein-protein interactions. Thus, PTMs in DNA repair should be studied. In this review, we focus on the current understanding of six typical PTMs (phosphorylation, poly(ADP-ribosyl)ation, ubiquitination, SUMOylation, acetylation, and methylation) and summarize the PTMs of key proteins in DNA repair, providing important insight into the role of PTMs in the maintenance of genome stability and helping to reveal new and selective therapeutic approaches to target cancers.


2021 ◽  
Author(s):  
Nikolaj Riis Christensen ◽  
Christian Parsbæk Pedersen ◽  
Vita Sereikaite ◽  
Jannik Nedergaard Pedersen ◽  
Maria Vistrup-Parry ◽  
...  

Summary: The organization of the postsynaptic density (PSD), a protein-dense semi-membraneless organelle, is mediated by numerous specific protein-protein interactions (PPIs) which constitute a functional post-synapse. Postsynaptic density protein 95 (PSD-95) interacts with a manifold of proteins, including the C-terminus of transmembrane AMPA receptor (AMPAR) regulatory proteins (TARPs). Here, we uncover the minimal essential peptide responsible for the stargazin (TARP-γ2) mediated liquid-liquid phase separation (LLPS) formation of PSD-95 and other key protein constituents of the PSD. Furthermore, we find that pharmacological inhibitors of PSD-95 can facilitate formation of LLPS. We find that in some cases LLPS formation is dependent on multivalent interactions, while in other cases short peptides carrying a high charge are sufficient to promote LLPS in complex systems. This study offers a new perspective on PSD-95 interactions and their role in LLPS formation, while also considering the role of affinity over multivalency in LLPS systems.


2006 ◽  
Vol 398 (1) ◽  
pp. 63-71 ◽  
Author(s):  
Prim de Bie ◽  
Bart van de Sluis ◽  
Ezra Burstein ◽  
Karen J. Duran ◽  
Ruud Berger ◽  
...  

COMMD [copper metabolism gene MURR1 (mouse U2af1-rs1 region 1) domain] proteins constitute a recently identified family of NF-κB (nuclear factor κB)-inhibiting proteins, characterized by the presence of the COMM domain. In the present paper, we report detailed investigation of the role of this protein family, and specifically the role of the COMM domain, in NF-κB signalling through characterization of protein–protein interactions involving COMMD proteins. The small ubiquitously expressed COMMD6 consists primarily of the COMM domain. Therefore COMMD1 and COMMD6 were analysed further as prototype members of the COMMD protein family. Using specific antisera, interaction between endogenous COMMD1 and COMMD6 is described. This interaction was verified by independent techniques, appeared to be direct and could be detected throughout the whole cell, including the nucleus. Both proteins inhibit TNF (tumour necrosis factor)-induced NF-κB activation in a non-synergistic manner. Mutation of the amino acid residues Trp24 and Pro41 in the COMM domain of COMMD6 completely abolished the inhibitory effect of COMMD6 on TNF-induced NF-κB activation, but this was not accompanied by loss of interaction with COMMD1, COMMD6 or the NF-κB subunit RelA. In contrast with COMMD1, COMMD6 does not bind to IκBα (inhibitory κBα), indicating that both proteins inhibit NF-κB in an overlapping, but not completely similar, manner. Taken together, these data support the significance of COMMD protein–protein interactions and provide new mechanistic insight into the function of this protein family in NF-κB signalling.

