Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants

2021
Author(s):
Rui Yin
Brandon Y Feng
Amitabh Varshney
Brian G Pierce

High resolution experimental structural determination of protein-protein interactions has led to valuable mechanistic insights, yet due to the massive number of interactions and experimental limitations there is a need for computational methods that can accurately model their structures. Here we explore the use of the recently developed deep learning method, AlphaFold, to predict structures of protein complexes from sequence. With a benchmark of 152 diverse heterodimeric protein complexes, multiple implementations and parameters of AlphaFold were tested for accuracy. Remarkably, many cases had highly accurate models generated as top-ranked predictions, greatly surpassing the performance of unbound protein-protein docking, whereas antibody-antigen docking was largely unsuccessful. While AlphaFold-generated accuracy predictions were able to discriminate near-native models, previously developed scoring protocols improved performance. Our study demonstrates that end-to-end deep learning can accurately model transient protein complexes, and identifies areas for improvement to guide future developments to reliably model any protein-protein interaction of interest.

2020
Vol 36 (8)
pp. 2458-2465
Author(s):
Isak Johansson-Åkhe
Claudio Mirabello
Björn Wallner

Motivation: Interactions between proteins and peptides or peptide-like intrinsically disordered regions are involved in many important biological processes, such as gene expression and cell life-cycle regulation. Experimentally determining the structure of such interactions is time-consuming and difficult because of the inherent flexibility of the peptide ligand. Although several prediction methods exist, most are limited in performance or availability. Results: InterPep2 is a freely available method for predicting the structure of peptide–protein interactions. Improved performance is obtained by using templates from both peptide–protein and regular protein–protein interactions, and by a random forest trained to predict the DockQ score for a given template using sequence and structural features. When tested on 252 bound peptide–protein complexes from structures deposited after the complexes used in the construction of the training and template sets of InterPep2, InterPep2-Refined correctly positioned 67 peptides within 4.0 Å LRMSD among the top 10, similar to another state-of-the-art template-based method, which positioned 54 peptides correctly. However, InterPep2 displays a superior ability to evaluate the quality of its own predictions. On a previously established set of 27 non-redundant unbound-to-bound peptide–protein complexes, InterPep2 performs on par with leading methods. The extended InterPep2-Refined protocol correctly modeled 15 of these complexes within 4.0 Å LRMSD among the top 10, without using templates from homologs. In addition, combining the template-based predictions from InterPep2 with ab initio predictions from PIPER-FlexPepDock resulted in 22% more near-native predictions compared to the best single method (22 versus 18). Availability and implementation: The program is available from http://wallnerlab.org/InterPep2. Supplementary information: Supplementary data are available at Bioinformatics online.
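The success criterion quoted in this abstract (a peptide positioned within 4.0 Å LRMSD among the top 10 ranked models) and the reported 22% gain can be illustrated with a minimal Python sketch. This is not the InterPep2 evaluation code: the function names and the example LRMSD values are hypothetical, and treating "within 4.0 Å" as "less than or equal to 4.0 Å" is an assumption.

```python
from typing import Dict, List

LRMSD_CUTOFF = 4.0  # Å, threshold quoted in the abstract (<= is an assumption)
TOP_N = 10          # number of top-ranked models considered


def is_success(ranked_lrmsd: List[float], top_n: int = TOP_N,
               cutoff: float = LRMSD_CUTOFF) -> bool:
    """True if any of the first top_n ranked models is within the cutoff."""
    return any(value <= cutoff for value in ranked_lrmsd[:top_n])


def count_successes(targets: Dict[str, List[float]]) -> int:
    """Count targets with at least one acceptable model in the top N."""
    return sum(is_success(models) for models in targets.values())


def relative_gain(combined: int, best_single: int) -> float:
    """Percentage increase of the combined count over the best single method."""
    return 100.0 * (combined - best_single) / best_single


# Hypothetical LRMSD values (Å), listed in ranked order per target.
example = {
    "target_a": [6.1, 3.2, 9.8],   # second-ranked model is within 4.0 Å
    "target_b": [12.4, 8.0, 5.5],  # no model within 4.0 Å
    "target_c": [2.1],             # top-ranked model is within 4.0 Å
}

print(count_successes(example))       # 2
print(round(relative_gain(22, 18)))   # 22, matching "22 versus 18"
```

The last line reproduces the arithmetic behind the abstract's figure: (22 − 18) / 18 ≈ 22%.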


2020
Vol 20 (10)
pp. 855-882
Author(s):
Olivia Slater
Bethany Miller
Maria Kontoyianni

Drug discovery has focused on the paradigm “one drug, one target” for a long time. However, small molecules can act at multiple macromolecular targets, which serves as the basis for drug repurposing. In an effort to expand the target space, and given advances in X-ray crystallography, protein-protein interactions have become an emerging focus area of drug discovery enterprises. Proteins interact with other biomolecules and it is this intricate network of interactions that determines the behavior of the system and its biological processes. In this review, we briefly discuss networks in disease, followed by computational methods for protein-protein complex prediction. Computational methodologies and techniques employed towards objectives such as protein-protein docking, protein-protein interactions, and interface predictions are described extensively. Docking aims at producing a complex between proteins, while interface predictions identify a subset of residues on one protein that could interact with a partner, and protein-protein interaction sites address whether two proteins interact. In addition, approaches to predict hot spots and binding sites are presented along with a representative example of our internal project on the chemokine CXC receptor 3 B-isoform and predictive modeling with IP10 and PF4.


2017
Vol 114 (9)
pp. 2224-2229
Author(s):
Daniel A. Weisz
Haijun Liu
Hao Zhang
Sundarapandian Thangapandian
Emad Tajkhorshid
...  

Photosystem II (PSII), a large pigment protein complex, undergoes rapid turnover under natural conditions. During assembly of PSII, oxidative damage to vulnerable assembly intermediate complexes must be prevented. Psb28, the only cytoplasmic extrinsic protein in PSII, protects the RC47 assembly intermediate of PSII and assists its efficient conversion into functional PSII. Its role is particularly important under stress conditions when PSII damage occurs frequently. Psb28 is not found, however, in any PSII crystal structure, and its structural location has remained unknown. In this study, we used chemical cross-linking combined with mass spectrometry to capture the transient interaction of Psb28 with PSII. We detected three cross-links between Psb28 and the α- and β-subunits of cytochrome b559, an essential component of the PSII reaction-center complex. These distance restraints enable us to position Psb28 on the cytosolic surface of PSII directly above cytochrome b559, in close proximity to the QB site. Protein–protein docking results also support Psb28 binding in this region. Determination of the Psb28 binding site and other biochemical evidence allow us to propose a mechanism by which Psb28 exerts its protective effect on the RC47 intermediate. This study also shows that isotope-encoded cross-linking with the “mass tags” selection criteria allows confident identification of more cross-linked peptides in PSII than has been previously reported. This approach thus holds promise to identify other transient protein–protein interactions in membrane protein complexes.


2019
Author(s):
Georgy Derevyanko
Guillaume Lamoureux

Protein-protein interactions are determined by a number of hard-to-capture features related to shape complementarity, electrostatics, and hydrophobicity. These features may be intrinsic to the protein or induced by the presence of a partner. A conventional approach to protein-protein docking consists in engineering a small number of spatial features for each protein, and in minimizing the sum of their correlations with respect to the spatial arrangement of the two proteins. To generalize this approach, we introduce a deep neural network architecture that transforms the raw atomic densities of each protein into complex three-dimensional representations. Each point in the volume containing the protein is described by 48 learned features, which are correlated and combined with the features of a second protein to produce a score dependent on the relative position and orientation of the two proteins. The architecture is based on multiple layers of SE(3)-equivariant convolutional neural networks, which provide built-in rotational and translational invariance of the score with respect to the structure of the complex. The model is trained end-to-end on a set of decoy conformations generated from 851 nonredundant protein-protein complexes and is tested on data from the Protein-Protein Docking Benchmark Version 4.0.


2019
Vol 167 (3)
pp. 225-231
Author(s):
Takumi Koshiba
Hidetaka Kosako

Protein–protein interactions are essential biologic processes that occur at inter- and intracellular levels. To gain insight into the various complex cellular functions of these interactions, it is necessary to assess them under physiologic conditions. Recent advances in various proteomic technologies allow the investigation of protein–protein interaction networks in living cells. The combination of proximity-dependent labelling and chemical cross-linking will greatly enhance our understanding of multi-protein complexes that are difficult to prepare, such as organelle-bound membrane proteins. In this review, we describe our current understanding of mass spectrometry-based proteomics mapping methods for elucidating organelle-bound membrane protein complexes in living cells, with a focus on protein–protein interactions in mitochondrial subcellular compartments.


2017
Vol 114 (40)
pp. E8333-E8342
Author(s):
Maximilian G. Plach
Florian Semmelmann
Florian Busch
Markus Busch
Leonhard Heizinger
...  

Cells contain a multitude of protein complexes whose subunits interact with high specificity. However, the number of different protein folds and interface geometries found in nature is limited. This raises the question of how protein–protein interaction specificity is achieved on the structural level and how the formation of nonphysiological complexes is avoided. Here, we describe structural elements called interface add-ons that fulfill this function and elucidate their role for the diversification of protein–protein interactions during evolution. We identified interface add-ons in 10% of a representative set of bacterial, heteromeric protein complexes. The importance of interface add-ons for protein–protein interaction specificity is demonstrated by an exemplary experimental characterization of over 30 cognate and hybrid glutamine amidotransferase complexes in combination with comprehensive genetic profiling and protein design. Moreover, growth experiments showed that the lack of interface add-ons can lead to physiologically harmful cross-talk between essential biosynthetic pathways. In sum, our complementary in silico, in vitro, and in vivo analysis argues that interface add-ons are a practical and widespread evolutionary strategy to prevent the formation of nonphysiological complexes by specializing protein–protein interactions.


2021
Author(s):
Jimin Pei
Jing Zhang
Qian Cong

Recent development of deep-learning methods has led to a breakthrough in the prediction accuracy of 3-dimensional protein structures. Extending these methods to protein pairs is expected to allow large-scale detection of protein-protein interactions and modeling protein complexes at the proteome level. We applied RoseTTAFold and AlphaFold2, two of the latest deep-learning methods for structure predictions, to analyze coevolution of human proteins residing in mitochondria, an organelle of vital importance in many cellular processes including energy production, metabolism, cell death, and antiviral response. Variations in mitochondrial proteins have been linked to a plethora of human diseases and genetic conditions. RoseTTAFold, with high computational speed, was used to predict the coevolution of about 95% of mitochondrial protein pairs. Top-ranked pairs were further subjected to the modeling of the complex structures by AlphaFold2, which also produced contact probability with high precision and in many cases consistent with RoseTTAFold. Most of the top-ranked pairs with high contact probability were supported by known protein-protein interactions and/or similarities to experimental structural complexes. For high-scoring pairs without experimental complex structures, our coevolution analyses and structural models shed light on the details of their interfaces, including CHCHD4-AIFM1, MTERF3-TRUB2, FMC1-ATPAF2, ECSIT-NDUFAF1 and COQ7-COQ9, among others. We also identified novel PPIs (PYURF-NDUFAF5, LYRM1-MTRF1L and COA8-COX10) for several proteins without experimentally characterized interaction partners, leading to predictions of their molecular functions and the biological processes they are involved in.


2019
Author(s):
Franziska Seeger
Anna Little
Yang Chen
Tina Woolf
Haiyan Cheng
...  

Protein-protein interactions regulate many essential biological processes and play an important role in health and disease. The process of experimentally characterizing protein residues that contribute the most to protein-protein interaction affinity and specificity is laborious. Thus, developing models that accurately characterize hotspots at protein-protein interfaces provides important information about how to inhibit therapeutically relevant protein-protein interactions. During the course of the ICERM WiSDM workshop 2017, we combined the KFC2a protein-protein interaction hotspot prediction features with Rosetta scoring function terms and interface filter metrics. A 2-way and 3-way forward selection strategy was employed to train support vector machine classifiers, as was a reverse feature elimination strategy. From these results, we identified subsets of KFC2a and Rosetta combined features that show improved performance over KFC2a features alone.
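The general idea of forward feature selection mentioned in this abstract can be sketched in plain Python. The abstract does not define its "2-way" and "3-way" variants, so this sketch shows only the common greedy form that adds one feature per round, which is an assumption and not the authors' protocol. The function name, the feature names, and the toy scoring function are hypothetical; in practice the score would typically be a cross-validated classifier metric.

```python
from typing import Callable, FrozenSet, List, Sequence


def forward_select(features: Sequence[str],
                   score: Callable[[FrozenSet[str]], float],
                   max_features: int) -> List[str]:
    """Greedily add the feature that most improves the score.

    Stops when max_features is reached or no candidate improves the score.
    """
    selected: List[str] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        # Score every one-feature extension of the current subset.
        scored = [(score(frozenset(selected + [f])), f) for f in candidates]
        top_score, top_feature = max(scored)
        if top_score <= best_score:
            break  # no remaining feature helps
        selected.append(top_feature)
        best_score = top_score
    return selected


# Hypothetical per-feature contributions; a real score would come from
# evaluating a trained classifier on held-out data.
toy_gain = {"feature_a": 0.30, "feature_b": 0.20, "feature_c": -0.05}


def toy_score(subset: FrozenSet[str]) -> float:
    return sum(toy_gain[name] for name in subset)


print(forward_select(list(toy_gain), toy_score, max_features=3))
# ['feature_a', 'feature_b']
```

Reverse feature elimination, also mentioned in the abstract, runs the same loop in the opposite direction: it starts from the full set and drops the feature whose removal hurts the score least.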


Author(s):  
Min Zeng
Fuhao Zhang
Fang-Xiang Wu
Yaohang Li
Jianxin Wang
...  

Motivation: Protein–protein interactions (PPIs) play important roles in many biological processes. Conventional biological experiments for identifying PPI sites are costly and time-consuming. Thus, many computational approaches have been proposed to predict PPI sites. Existing computational methods usually use local contextual features to predict PPI sites. However, global features of protein sequences are also critical for PPI site prediction. Results: A new end-to-end deep learning framework, named DeepPPISP, is proposed for PPI site prediction by combining local contextual and global sequence features. For local contextual features, we use a sliding window to capture features of neighbors of a target amino acid, as in previous studies. For global sequence features, a text convolutional neural network is applied to extract features from the whole protein sequence. The local contextual and global sequence features are then combined to predict PPI sites. By integrating local contextual and global sequence features, DeepPPISP achieves state-of-the-art performance, outperforming the other competing methods. To investigate whether global sequence features are helpful in our deep learning model, we remove or change some components in DeepPPISP. Detailed analyses show that global sequence features play important roles in DeepPPISP. Availability and implementation: The DeepPPISP web server is available at http://bioinformatics.csu.edu.cn/PPISP/. The source code can be obtained from https://github.com/CSUBioGroup/DeepPPISP. Supplementary information: Supplementary data are available at Bioinformatics online.
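The sliding-window step described in this abstract can be illustrated with a minimal Python sketch that collects the neighbors of each target residue. This is not the DeepPPISP code: the function name, the window half-width, and the use of "X" as a padding symbol at the sequence ends are all assumptions, since the abstract does not state them. The global-feature branch (the text convolutional network) is not shown.

```python
from typing import List

PAD = "X"  # hypothetical padding symbol for positions beyond the sequence ends


def sliding_windows(sequence: str, half_width: int) -> List[str]:
    """Return one window per residue, centered on that residue.

    Each window holds the target residue plus half_width neighbors on each
    side; the sequence ends are padded so every window has the same length.
    """
    padded = PAD * half_width + sequence + PAD * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]


# Short made-up sequence with one neighbor on each side.
print(sliding_windows("MKTAY", half_width=1))
# ['XMK', 'MKT', 'KTA', 'TAY', 'AYX']
```

In a full model, each window would be converted to numeric per-residue features before being combined with features computed from the whole sequence.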


2003
Vol 31 (5)
pp. 985-989
Author(s):
W.I. Burkitt
P.J. Derrick
D. Lafitte
I. Bronstein

Electrospray ionization has made possible the transference of non-covalently bound complexes from solution phase to high vacuum. In the process, a complex acquires a net charge and becomes amenable to measurement by MS. FTICR (Fourier-transform ion cyclotron resonance) MS allows these ions to be measured with sufficiently high resolution for the isotopomers of complexes of small proteins to be resolved from each other (true for complexes up to about 100 kDa for the most powerful FTICR instruments), which is of crucial significance in the interpretation of spectra. Results are presented for members of the S100 family of proteins, demonstrating how non-covalently bound complexes can be distinguished unambiguously from covalently bound species. Considerations relevant both to the determination of binding constants in solution from the gas-phase results and to the elucidation of protein folding and unfolding in solution are discussed. The caveats inherent to the basic approach of using electrospray and MS to characterize protein complexes are weighed and evaluated.

