Sequence co-evolution gives 3D contacts and structures of protein complexes

DOI: 10.1101/004762 | 2014 | Cited By ~ 4

Author(s): Thomas A. Hopf, Charlotta P.I. Schärfe, João P.G.L.M. Rodrigues, Anna G. Green, Chris Sander, ...

Keyword(s): Protein Interactions, Protein Complexes, 3D Structure, Three Dimensional, Dimensional Structure, Evolutionary Sequence, Protein Protein Interactions, Interacting Protein, Protein Protein Interaction, Unknown Structure

Protein-protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein-protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequence databases, we expect that the method can be generalized to genome-wide elucidation of protein-protein interaction networks and used for interaction predictions at residue resolution.

Protein Interaction Z Score Assessment (PIZSA): an empirical scoring scheme for evaluation of protein–protein interactions

Nucleic Acids Research | DOI: 10.1093/nar/gkz368 | 2019 | Vol 47 (W1) | pp. W331-W337 | Cited By ~ 3

Author(s): Ankit A Roy, Abhilesh S Dhawanjewar, Parichit Sharma, Gulzar Singh, M S Madhusudhan

Keyword(s): Protein Interactions, Protein Complexes, Three Dimensional, Optimal Number, Dimensional Structure, Side Chain, Interface Residue, Z Score, Protein Protein Interactions, Residue Contacts

Our web server, PIZSA (http://cospi.iiserpune.ac.in/pizsa), assesses the likelihood of protein–protein interactions by assigning a Z Score computed from interface residue contacts. Our score takes into account the optimal number of atoms that mediate the interaction between pairs of residues and whether these contacts emanate from the main chain or side chain. We tested the score on 174 native interactions for which 100 decoys each were constructed using ZDOCK. The native structure scored better than any of the decoys in 146 cases and was able to rank within the 95th percentile in 162 cases. This easily outperforms a competing method, CIPS. We also benchmarked our scoring scheme on 15 targets from the CAPRI dataset and found that our method had results comparable to those of CIPS. Further, our method is able to analyse higher order protein complexes without the need to explicitly identify chains as receptors or ligands. The PIZSA server is easy to use and could be used to score any input three-dimensional structure and provide a residue pair-wise breakdown of the results. Attractively, our server offers a platform for users to upload their own potentials and could serve as an ideal testing ground for this class of scoring schemes.
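
The decoy comparison described in this abstract can be illustrated with a generic standardization step. The sketch below is not PIZSA's scoring potential; it only shows how a raw score might be turned into a Z score against a background distribution and how a native structure could be ranked among its decoys. The function names and the convention that a higher score is better are assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_score(score, background):
    """Standardize `score` against a background distribution of scores."""
    mu = mean(background)
    sigma = pstdev(background)  # population standard deviation
    if sigma == 0:
        raise ValueError("background scores have zero variance")
    return (score - mu) / sigma


def percentile_rank(score, decoys):
    """Percentage of decoy scores that are strictly lower than `score`.

    Assumes (hypothetically) that a higher score means a more
    native-like interface.
    """
    if not decoys:
        raise ValueError("need at least one decoy score")
    return 100.0 * sum(d < score for d in decoys) / len(decoys)


# Hypothetical raw scores for one native complex and four decoys.
native = 5.0
decoys = [1.0, 2.0, 3.0, 4.0]
print(z_score(native, decoys), percentile_rank(native, decoys))
```

Under these assumptions, a native structure that outscores every decoy has a percentile rank of 100, which corresponds to the "scored better than any of the decoys" case reported above.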

Protein-protein docking using learned three-dimensional representations

DOI: 10.1101/738690 | 2019 | Cited By ~ 1

Author(s): Georgy Derevyanko, Guillaume Lamoureux

Keyword(s): Protein Interactions, Network Architecture, Protein Complexes, Three Dimensional, Spatial Arrangement, Protein Docking, Protein Protein Interactions, Translational Invariance, Shape Complementarity, Spatial Features

Protein-protein interactions are determined by a number of hard-to-capture features related to shape complementarity, electrostatics, and hydrophobicity. These features may be intrinsic to the protein or induced by the presence of a partner. A conventional approach to protein-protein docking consists in engineering a small number of spatial features for each protein, and in minimizing the sum of their correlations with respect to the spatial arrangement of the two proteins. To generalize this approach, we introduce a deep neural network architecture that transforms the raw atomic densities of each protein into complex three-dimensional representations. Each point in the volume containing the protein is described by 48 learned features, which are correlated and combined with the features of a second protein to produce a score dependent on the relative position and orientation of the two proteins. The architecture is based on multiple layers of SE(3)-equivariant convolutional neural networks, which provide built-in rotational and translational invariance of the score with respect to the structure of the complex. The model is trained end-to-end on a set of decoy conformations generated from 851 nonredundant protein-protein complexes and is tested on data from the Protein-Protein Docking Benchmark Version 4.0.

Furan warheads for covalent trapping of weak protein-protein interactions: cross-linking of thymosin β4 to actin

Chemical Communications | DOI: 10.1039/d1cc01731d | 2021

Author(s): Laia Miret Casals, Willem Vannecke, Kurt Hoogewijs, Gianluca Arauz, Marina Gay, ...

Keyword(s): Protein Interaction, Protein Interactions, 3D Structure, Cross Linking, Thymosin β4, Protein Protein Interactions, Site Specific, Protein Protein Interaction, Covalent Trapping

We describe furan as a triggerable ‘warhead’ for site-specific cross-linking, using the actin and thymosin β4 (Tβ4) complex as a model of a weak and dynamic protein-protein interaction with known 3D structure...

Mass spectrometry-based methods for analysing the mitochondrial interactome in mammalian cells

The Journal of Biochemistry | DOI: 10.1093/jb/mvz090 | 2019 | Vol 167 (3) | pp. 225-231 | Cited By ~ 2

Author(s): Takumi Koshiba, Hidetaka Kosako

Keyword(s): Mass Spectrometry, Protein Interactions, Mammalian Cells, Protein Complexes, Living Cells, Protein Protein Interactions, Cellular Functions, Protein Protein Interaction, Membrane Protein Complexes, Protein Protein Interaction Networks

Protein–protein interactions are essential biologic processes that occur at inter- and intracellular levels. To gain insight into the various complex cellular functions of these interactions, it is necessary to assess them under physiologic conditions. Recent advances in various proteomic technologies allow the investigation of protein–protein interaction networks in living cells. The combination of proximity-dependent labelling and chemical cross-linking will greatly enhance our understanding of multi-protein complexes that are difficult to prepare, such as organelle-bound membrane proteins. In this review, we describe our current understanding of mass spectrometry-based proteomics mapping methods for elucidating organelle-bound membrane protein complexes in living cells, with a focus on protein–protein interactions in mitochondrial subcellular compartments.

Evolutionary diversification of protein–protein interactions by interface add-ons

Proceedings of the National Academy of Sciences | DOI: 10.1073/pnas.1707335114 | 2017 | Vol 114 (40) | pp. E8333-E8342 | Cited By ~ 13

Author(s): Maximilian G. Plach, Florian Semmelmann, Florian Busch, Markus Busch, Leonhard Heizinger, ...

Keyword(s): Protein Interaction, Protein Interactions, Protein Complexes, Evolutionary Strategy, Structural Level, Protein Protein Interactions, Protein Protein Interaction, Evolutionary Diversification

Cells contain a multitude of protein complexes whose subunits interact with high specificity. However, the number of different protein folds and interface geometries found in nature is limited. This raises the question of how protein–protein interaction specificity is achieved on the structural level and how the formation of nonphysiological complexes is avoided. Here, we describe structural elements called interface add-ons that fulfill this function and elucidate their role for the diversification of protein–protein interactions during evolution. We identified interface add-ons in 10% of a representative set of bacterial, heteromeric protein complexes. The importance of interface add-ons for protein–protein interaction specificity is demonstrated by an exemplary experimental characterization of over 30 cognate and hybrid glutamine amidotransferase complexes in combination with comprehensive genetic profiling and protein design. Moreover, growth experiments showed that the lack of interface add-ons can lead to physiologically harmful cross-talk between essential biosynthetic pathways. In sum, our complementary in silico, in vitro, and in vivo analysis argues that interface add-ons are a practical and widespread evolutionary strategy to prevent the formation of nonphysiological complexes by specializing protein–protein interactions.

Inferring interaction partners from protein sequences

DOI: 10.1101/050732 | 2016 | Cited By ~ 2

Author(s): Anne-Florence Bitbol, Robert S. Dwyer, Lucy J. Colwell, Ned S. Wingreen

Keyword(s): Protein Interactions, Sequence Data, Protein Complexes, A Priori, Specific Interaction, Three Dimensional, Specific Protein, Protein Protein Interactions, Entropy Model, Interaction Partners

Specific protein-protein interactions are crucial in the cell, both to ensure the formation and stability of multi-protein complexes, and to enable signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interaction partners. Hence, the sequences of interacting partners are correlated. Here we exploit these correlations to accurately identify which proteins are specific interaction partners from sequence data alone. Our general approach, which employs a pairwise maximum entropy model to infer direct couplings between residues, has been successfully used to predict the three-dimensional structures of proteins from sequences. Building on this approach, we introduce an iterative algorithm to predict specific interaction partners from among the members of two protein families. We assess the algorithm's performance on histidine kinases and response regulators from bacterial two-component signaling systems. The algorithm proves successful without any a priori knowledge of interaction partners, yielding a striking 0.93 true positive fraction on our complete dataset, and we uncover the origin of this surprising success. Finally, we discuss how our method could be used to predict novel protein-protein interactions.

Inferring interaction partners from protein sequences using mutual information

DOI: 10.1101/378042 | 2018

Author(s): Anne-Florence Bitbol

Keyword(s): Mutual Information, Protein Interactions, Sequence Data, Protein Complexes, Three Dimensional, Specific Protein, Amino Acid Residues, Protein Protein Interactions, Cellular Processes, Interaction Partners

Specific protein-protein interactions are crucial in most cellular processes. They enable multiprotein complexes to assemble and to remain stable, and they allow signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interacting partners, and thus in correlations between their sequences. Pairwise maximum-entropy based models have enabled successful inference of pairs of amino-acid residues that are in contact in the three-dimensional structure of multi-protein complexes, starting from the correlations in the sequence data of known interaction partners. Recently, algorithms inspired by these methods have been developed to identify which proteins are specific interaction partners among the paralogous proteins of two families, starting from sequence data alone. Here, we demonstrate that a slightly higher performance for partner identification can be reached by an approximate maximization of the mutual information between the sequence alignments of the two protein families. This stands in contrast with structure prediction of proteins and of multiprotein complexes from sequence data, where pairwise maximum-entropy based global statistical models substantially improve performance compared to mutual information. Our findings entail that the statistical dependences allowing interaction partner prediction from sequence data are not restricted to the residue pairs that are in direct contact at the interface between the partner proteins.

Author summary

Specific protein-protein interactions are at the heart of most intra-cellular processes. Mapping these interactions is thus crucial to a systems-level understanding of cells, and has broad applications to areas such as drug targeting. Systematic experimental identification of protein interaction partners is still challenging. However, a large and rapidly growing amount of sequence data is now available. Recently, algorithms have been proposed to identify which proteins interact from their sequences alone, thanks to the co-variation of the sequences of interacting proteins. These algorithms build upon inference methods that have been used with success to predict the three-dimensional structures of proteins and multi-protein complexes, and their focus is on the amino-acid residues that are in direct contact. Here, we propose a simpler method to identify which proteins interact among the paralogous proteins of two families, starting from their sequences alone. Our method relies on an approximate maximization of mutual information between the sequences of the two families, without specifically emphasizing the contacting residue pairs. We demonstrate that this method slightly outperforms the earlier one. This result highlights that partner prediction does not only rely on the identities and interactions of directly contacting amino acids.
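
The quantity at the centre of this abstract, mutual information between aligned sequences, can be made concrete with a small example. The sketch below computes the mutual information between one alignment column from each of two families, given a fixed pairing of sequences. It is not the paper's partner-matching algorithm, which approximately maximizes mutual information over candidate pairings; the function name and the toy columns are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two paired alignment columns.

    col_a[i] and col_b[i] are the residues observed at one position of
    family A and one position of family B in the i-th paired sequence.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy columns: the first pair co-varies perfectly, the second is independent.
print(mutual_information("AAKK", "DDEE"))
print(mutual_information("AAKK", "DEDE"))
```

With a correct pairing, co-varying positions carry high mutual information, while shuffling the pairing tends to lower it. That is the intuition behind searching over pairings, although the search itself is not shown here.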

Three-dimensional structure of a regular bacterial surface layer: The HPI-layer of Deinococcus radiodurans

Proceedings, annual meeting, Electron Microscopy Society of America | DOI: 10.1017/s0424820100075920 | 1983 | Vol 41 | pp. 438-439

Author(s): W. Baumeister, M. Hahn, W.O. Saxton

Keyword(s): Outer Membrane, Protein Interactions, Deinococcus Radiodurans, Bacterial Species, Hexagonal Lattice, Three Dimensional, Dimensional Structure, Protein Protein Interactions, Bacterial Surface, Freeze Dried

Regularly organized surface (RS) layers are a feature common to many bacterial species; they are clearly more abundant than was anticipated even a few years ago. The RS-layers are believed to fulfil a variety of functions in the interaction between the cell and its environment (see e.g. [1]). The so-called HPI-layer of the radiotolerant bacterium Deinococcus radiodurans is a typical example of such a layer: it is composed of a single polypeptide species (Mr 105 kDa) arranged on a hexagonal lattice to form a network that covers the entire surface of the bacterium; it is associated with the outer membrane via hydrophobic protein-protein interactions. Isolated HPI-layer sheets, released from the outer membrane by detergent treatment, have been studied in the electron microscope making extensive use of the present arsenal of preparation techniques: negative staining, (aurothio)glucose embedding, freeze-dried/unstained, freeze-dried/metal shadowed etc. Because of the notorious problem of lattice imperfections, image processing usually followed the strategy of correlation averaging as outlined in some detail elsewhere.

PPI_SVM: Prediction of protein-protein interactions using machine learning, domain-domain affinities and frequency tables

Cellular & Molecular Biology Letters | DOI: 10.2478/s11658-011-0008-x | 2011 | Vol 16 (2) | Cited By ~ 41

Author(s): Piyali Chatterjee, Subhadip Basu, Mahantapas Kundu, Mita Nasipuri, Dariusz Plewczynski

Keyword(s): Machine Learning, Protein Interactions, Three Dimensional, Prediction Method, Protein Sequences, Dimensional Structure, Support Vector, Interacting Proteins, Protein Protein Interactions, Protein Functions

Protein-protein interactions (PPI) control most of the biological processes in a living cell. In order to fully understand protein functions, a knowledge of protein-protein interactions is necessary. Prediction of PPI is challenging, especially when the three-dimensional structure of interacting partners is not known. Recently, a novel prediction method was proposed by exploiting physical interactions of constituent domains. We propose here a novel knowledge-based prediction method, namely PPI_SVM, which predicts interactions between two protein sequences by exploiting their domain information. We trained a two-class support vector machine on the benchmarking set of pairs of interacting proteins extracted from the Database of Interacting Proteins (DIP). The method considers all possible combinations of constituent domains between two protein sequences, unlike most of the existing approaches. Moreover, it deals with both single-domain proteins and multi-domain proteins; therefore it can be applied to the whole proteome in high-throughput studies. Our machine learning classifier, following a brainstorming approach, achieves an accuracy of 86%, with a specificity of 95% and a sensitivity of 75%, which are better results than most previous methods that sacrifice recall values in order to boost the overall precision. Our method has on average better sensitivity combined with good selectivity on the benchmarking dataset. The PPI_SVM source code, train/test datasets and supplementary files are available freely in the public domain at: http://code.google.com/p/cmater-bioinfo/.
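
The accuracy, specificity and sensitivity figures quoted in this abstract are standard confusion-matrix quantities. The sketch below shows how they are defined for a two-class predictor. The counts in the usage example are hypothetical, chosen only so that sensitivity and specificity land on 75% and 95%; they are not taken from the paper, and they give an accuracy of 85% rather than the reported 86%.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: interacting pairs predicted as interacting / non-interacting.
    tn/fp: non-interacting pairs predicted as non-interacting / interacting.
    """
    total = tp + fp + tn + fn
    if total == 0 or tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("counts leave at least one metric undefined")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts for 100 interacting and 100 non-interacting pairs.
metrics = classification_metrics(tp=75, fp=5, tn=95, fn=25)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On a balanced test set, accuracy is the mean of sensitivity and specificity, so the gap between the two reflects how the classifier trades missed interactions against false alarms.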
