Improved prediction of protein-protein interactions using AlphaFold2

Author(s):  
Patrick Bryant ◽  
Gabriele Pozzati ◽  
Arne Elofsson

Predicting the structure of interacting protein chains is fundamental for understanding the function of proteins. Here, we examine the use of AlphaFold2 (AF2) for predicting the structure of heterodimeric protein complexes. We find that using the default AF2 protocol, 44% of the models in a test set can be predicted accurately. However, by optimising the multiple sequence alignment, we can increase the accuracy to 59%. In comparison, the alternative fold-and-dock method RoseTTAFold is only successful in 10% of the cases on this set, template-based docking in 35% and traditional docking methods in 22%. We can distinguish acceptable (DockQ>0.23) from incorrect models with an AUC of 0.85 on the test set by analysing the predicted interfaces. The success is higher for bacterial protein pairs, pairs with large interaction areas consisting of helices or sheets, and pairs with many homologous sequences. Further, we test the possibility of distinguishing interacting from non-interacting proteins and find that by analysing the predicted interfaces, we can separate truly interacting from non-interacting proteins with an AUC of 0.82 in the ROC curve, compared to 0.76 with a recently published method. In addition, when using a more realistic negative set, including mammalian proteins, the identification rate remains similar (AUC=0.83), so that 27% of interactions can be identified at a 1% FPR. All scripts and tools to run our protocol are freely available at: https://gitlab.com/ElofssonLab/FoldDock.
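The evaluation quantities quoted in this abstract (an AUC for separating acceptable from incorrect models, and the fraction identified at a fixed FPR) can be illustrated with a minimal Python sketch. This is not the authors' FoldDock code: the function names, the interface scores and the DockQ values below are hypothetical, and only the DockQ>0.23 cutoff is taken from the text.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points obtained by lowering the score threshold step by step."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def tpr_at_fpr(points: Sequence[Tuple[float, float]], max_fpr: float) -> float:
    """Highest TPR reachable without exceeding the given FPR."""
    return max(tpr for fpr, tpr in points if fpr <= max_fpr)


# Hypothetical (interface score, DockQ) pairs; a model counts as acceptable
# when DockQ > 0.23, the cutoff quoted in the abstract.
models = [(0.9, 0.80), (0.8, 0.45), (0.7, 0.10), (0.6, 0.30), (0.5, 0.05), (0.4, 0.20)]
scores = [score for score, _ in models]
labels = [1 if dockq > 0.23 else 0 for _, dockq in models]

points = roc_points(scores, labels)
print("AUC:", auc(points))
print("TPR at 1% FPR:", tpr_at_fpr(points, 0.01))
```

With only six toy models the numbers carry no meaning; the sketch is intended only to show how an AUC and an identification rate at a fixed FPR are read off the same ranked list.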

2021 ◽  
Author(s):  
Patrick Bryant ◽  
Gabriele Pozzati ◽  
Arne Elofsson

Predicting the structure of single-chain proteins is now close to being a solved problem due to the recent achievement of AlphaFold2 (AF2). However, predicting the structure of interacting protein chains is still a challenge. Here, we utilise AF2 to optimise a protocol for predicting the structure of heterodimeric protein complexes using only sequence information. We find that using the default AF2 protocol, 32% of the models in the Dockground test set can be modelled accurately. By tuning the input alignment and identifying the best model, we increased the performance to 43%. Our protocol uses MSAs generated by AF2 and MSAs paired on the organism level generated with HHblits. In a more extensive, more realistic, independent test set, the accuracy is 59%. In comparison, the alternative fold-and-dock method RoseTTAFold is only successful in 10% of the cases on this set and traditional docking methods in 22%. However, for the traditional method, the performance would be lower if the bound form of both monomers were not known. The success is higher for bacterial protein pairs, pairs with large interaction areas consisting of helices or sheets, and pairs with many homologous sequences. We can distinguish acceptable (DockQ>0.23) from incorrect models with an AUC of 0.84 on the test set by analysing the predicted interfaces. At an error rate of 1%, 13% are acceptable (at a 10% error rate, 40% of the models are acceptable). All scripts and tools to run our protocol are freely available at: https://gitlab.com/ElofssonLab/FoldDock.
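As an illustration of what pairing MSAs "on the organism level" can look like, here is a minimal Python sketch. The abstract does not spell out the pairing rule, so the choices below are assumptions: each MSA is a list of (organism identifier, aligned sequence) tuples with the query first and hits ordered best first, only the first hit per organism is kept, and rows are concatenated for organisms present in both alignments. The function name and the toy sequences are hypothetical.

```python
from typing import Dict, List, Tuple

Msa = List[Tuple[str, str]]  # (organism identifier, aligned sequence)


def pair_by_organism(msa_a: Msa, msa_b: Msa) -> Msa:
    """Concatenate rows of two MSAs that come from the same organism."""

    def first_hit_per_organism(msa: Msa) -> Dict[str, str]:
        best: Dict[str, str] = {}
        for organism, sequence in msa:
            # Assumption: hits are ordered best first, so keep the first one seen.
            best.setdefault(organism, sequence)
        return best

    best_a = first_hit_per_organism(msa_a)
    best_b = first_hit_per_organism(msa_b)
    # Dicts preserve insertion order, so the query row stays on top.
    return [(organism, best_a[organism] + best_b[organism])
            for organism in best_a if organism in best_b]


# Toy alignments for two chains; "query" marks the target sequences.
chain_a = [("query", "MKV"), ("ecoli", "MRV"), ("ecoli", "M-V"), ("bsub", "MKI")]
chain_b = [("query", "GHL"), ("bsub", "GHI"), ("ecoli", "GYL"), ("hsap", "G-L")]

for organism, row in pair_by_organism(chain_a, chain_b):
    print(organism, row)
```

Organisms found in only one alignment (here "hsap") are dropped from the paired alignment, which is one reason the number of paired rows can be much smaller than either single-chain MSA.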


2016 ◽  
Vol 113 (52) ◽  
pp. 15018-15023 ◽  
Author(s):  
Juan Rodriguez-Rivas ◽  
Simone Marsili ◽  
David Juan ◽  
Alfonso Valencia

Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.


2010 ◽  
Vol 38 (4) ◽  
pp. 940-946 ◽  
Author(s):  
Parvez I. Haris

For most biophysical techniques, characterization of protein–protein interactions is challenging; this is especially true with methods that rely on a physical phenomenon that is common to both of the interacting proteins. Thus, for example, in IR spectroscopy, the carbonyl vibration (1600–1700 cm−1) associated with the amide bonds from both of the interacting proteins will overlap extensively, making the interpretation of spectral changes very complicated. Isotope-edited infrared spectroscopy, where one of the interacting proteins is uniformly labelled with 13C or 13C,15N has been introduced as a solution to this problem, enabling the study of protein–protein interactions using IR spectroscopy. The large shift of the amide I band (approx. 45 cm−1 towards lower frequency) upon 13C labelling of one of the proteins reveals the amide I band of the unlabelled protein, enabling it to be used as a probe for monitoring conformational changes. With site-specific isotopic labelling, structural resolution at the level of individual amino acid residues can be achieved. Furthermore, the ability to record IR spectra of proteins in diverse environments means that isotope-edited IR spectroscopy can be used to structurally characterize difficult systems such as protein–protein complexes bound to membranes or large insoluble peptide/protein aggregates. In the present article, examples of application of isotope-edited IR spectroscopy for studying protein–protein interactions are provided.


2016 ◽  
Author(s):  
Juan Rodriguez-Rivas ◽  
Simone Marsili ◽  
David Juan ◽  
Alfonso Valencia

Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue co-evolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that co-evolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a novel domain-centred protocol to study the interplay between residue co-evolution and structural conservation of protein-protein interfaces. We show that sequence-based co-evolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence, where standard homology modelling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic co-evolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this novel approach.

Significance statement: Interacting proteins tend to co-evolve through interdependent changes at the interaction interface. This phenomenon leads to patterns of coordinated mutations that can be exploited to systematically predict contacts between interacting proteins in prokaryotes. We explore the hypothesis that co-evolving contacts at protein interfaces are preferentially conserved through long evolutionary periods. We demonstrate that co-evolving residues in prokaryotes identify inter-protein contacts that are particularly well conserved in the corresponding structure of their eukaryotic homologues. Therefore, these contacts have likely been important to maintain protein-protein interactions during evolution. We show that this property can be used to reliably predict interacting residues between eukaryotic proteins with homologues in prokaryotes even if they are very distantly related in sequence.


2014 ◽  
Vol 2014 ◽  
pp. 1-6 ◽  
Author(s):  
Lei Yang ◽  
Xianglong Tang

Cliques (maximal complete subnets) in protein-protein interaction (PPI) networks are an important resource used to analyze protein complexes and functional modules. Clique-based methods of predicting PPIs compensate for the data deficiencies of biological experiments. However, clique-based prediction methods depend only on the topology of the network. The false-positive and false-negative interactions in a network usually interfere with prediction. Therefore, we propose a method that combines clique-based prediction with gene ontology (GO) annotations to overcome this shortcoming and improve the accuracy of predictions. According to different GO correcting rules, we generate two predicted interaction sets which guarantee the quality and quantity of predicted protein interactions. The proposed method is applied to the PPI network from the Database of Interacting Proteins (DIP), and most of the predicted interactions are verified by another biological database, BioGRID. The predicted protein interactions are appended to the original protein network, which leads to clique extension and shows the significance of biological meaning.
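The general idea of combining a clique-based prediction with a GO filter can be sketched in Python as follows. This is a simplified stand-in and not the method of the paper: the rule used here (predict a missing edge when two unconnected proteins share a set of mutually interacting neighbours, then keep it only if the two proteins share at least one GO term) and all names, proteins and annotation labels are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from a list of interactions."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def predict_missing_edges(adj: Dict[str, Set[str]],
                          go_terms: Dict[str, Set[str]],
                          min_shared: int = 2) -> List[Tuple[str, str]]:
    """Predict edges that would complete a clique, filtered by shared GO terms."""
    predictions = []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already a known interaction
        common = sorted(adj[u] & adj[v])
        # Topology rule: some min_shared common neighbours all interact with
        # each other, so adding u-v would create a clique of size min_shared + 2.
        supported = any(
            all(b in adj[a] for a, b in combinations(group, 2))
            for group in combinations(common, min_shared)
        )
        # Annotation rule (assumed): require at least one shared GO term.
        if supported and go_terms.get(u, set()) & go_terms.get(v, set()):
            predictions.append((u, v))
    return predictions


# Toy network: A and D both bind the interacting pair B-C but not each other.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("A", "E")]
go_terms = {"A": {"GO:x"}, "D": {"GO:x", "GO:y"}, "E": {"GO:y"}}

print(predict_missing_edges(build_adjacency(edges), go_terms))
```

The annotation check is what removes predictions that are supported by topology alone, which is the shortcoming of purely clique-based prediction named in the abstract.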


2014 ◽  
Author(s):  
Thomas A. Hopf ◽  
Charlotta P.I. Schärfe ◽  
João P.G.L.M. Rodrigues ◽  
Anna G. Green ◽  
Chris Sander ◽  
...  

Protein-protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein-protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequence databases, we expect that the method can be generalized to genome-wide elucidation of protein-protein interaction networks and used for interaction predictions at residue resolution.


2021 ◽  
Author(s):  
Mu Gao ◽  
Davi Nakajima An ◽  
Jerry M Parks ◽  
Jeffrey Skolnick

Accurate descriptions of protein-protein interactions are essential for understanding biological systems. Very recently, AlphaFold2 has been shown to be remarkably accurate for predicting the atomic structures of individual proteins. Here, we demonstrate that the same neural network models developed for AlphaFold2 can be adapted to predict the structures of multimeric protein complexes without retraining. In contrast to common approaches that require paired multiple sequence alignments, our method, AF2Complex, works without using such paired alignments. It achieves higher accuracy than complex strategies that combine AlphaFold2 and protein-protein docking. New metrics are then introduced for predicting direct protein-protein interactions between arbitrary protein pairs. The approach is successfully validated on some challenging CASP14 multimeric targets, a small but appropriate benchmark set, and the E. coli proteome. Lastly, using the cytochrome c biogenesis system as an example, we present high-confidence models of three sought-after assemblies formed by eight members of this system.


2018 ◽  
Author(s):  
Daniela Boassa ◽  
Sakina P. Lemieux ◽  
Varda Lev-Ram ◽  
Junru Hu ◽  
Qing Xiong ◽  
...  

A protein complementation assay (PCA) for detecting and localizing intracellular protein-protein interactions (PPIs) was built by bisection of miniSOG, a fluorescent flavoprotein derived from the light, oxygen, voltage (LOV)-2 domain of Arabidopsis phototropin. When brought together by interacting proteins, the fragments reconstitute a functional reporter that permits tagged protein complexes to be visualized by fluorescence light microscopy (LM), and then by standard as well as “multicolor” electron microscopy (EM) imaging methods via the photooxidation of 3,3′-diaminobenzidine (DAB) and its lanthanide-conjugated derivatives.


eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Thomas A Hopf ◽  
Charlotta P I Schärfe ◽  
João P G L M Rodrigues ◽  
Anna G Green ◽  
Oliver Kohlbacher ◽  
...  

Protein–protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein–protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein–protein interaction networks and used for interaction predictions at residue resolution.


2019 ◽  
Vol 26 (21) ◽  
pp. 3890-3910 ◽  
Author(s):  
Branislava Gemovic ◽  
Neven Sumonja ◽  
Radoslav Davidovic ◽  
Vladimir Perovic ◽  
Nevena Veljkovic

Background: The significant number of protein-protein interactions (PPIs) discovered by harnessing concomitant advances in the fields of sequencing, crystallography, spectrometry and two-hybrid screening suggests astonishing prospects for remodelling drug discovery. The PPI space, which includes up to 650 000 entities, is a remarkable reservoir of potential therapeutic targets for every human disease. In order to allow modern drug discovery programs to leverage this, we should be able to discern complete PPI maps associated with a specific disorder and corresponding normal physiology. Objective: Here, we will review community-available computational programs for predicting PPIs and web-based resources for storing experimentally annotated interactions. Methods: We compared the capacities of the prediction tools iLoops, Struct2Net, HOMCOS, COTH, PrePPI, InterPreTS and PRISM to predict recently discovered protein interactions. Results: We described sequence-based and structure-based PPI prediction tools and addressed their peculiarities. Additionally, since the usefulness of prediction algorithms critically depends on the quality and quantity of the experimental data they are built on, we extensively discussed community resources for protein interactions. We focused on the active and recently updated primary and secondary PPI databases, repositories specialized to the subject or species, as well as databases that include both experimental and predicted PPIs. Conclusion: PPI complexes are the basis of important physiological processes and, therefore, possible targets for cell-penetrating ligands. Reliable computational PPI predictions can speed up new target discoveries through prioritization of therapeutically relevant protein–protein complexes for experimental studies.

