Protein interaction interface region prediction by geometric deep learning

Author(s):  
Bowen Dai ◽  
Chris Bailey-Kellogg

Abstract. Motivation: Protein–protein interactions drive wide-ranging molecular processes, and characterizing at the atomic level how proteins interact (beyond just the fact that they interact) can provide key insights into understanding and controlling this machinery. Unfortunately, experimental determination of three-dimensional protein complex structures remains difficult and does not scale to the increasingly large sets of proteins whose interactions are of interest. Computational methods are thus required to meet the demands of large-scale, high-throughput prediction of how proteins interact, but both physical modeling and machine learning methods suffer from poor precision and/or recall. Results: In order to improve performance in predicting protein interaction interfaces, we leverage the best properties of both data- and physics-driven methods to develop a unified Geometric Deep Neural Network, ‘PInet’ (Protein Interface Network). PInet consumes pairs of point clouds encoding the structures of two partner proteins, in order to predict their structural regions mediating interaction. To make such predictions, PInet learns and utilizes models capturing both geometrical and physicochemical molecular surface complementarity. In application to a set of benchmarks, PInet simultaneously predicts the interface regions on both interacting proteins, achieving performance equivalent to or even much better than the state-of-the-art predictor for each dataset. Furthermore, since PInet is based on joint segmentation of a representation of protein surfaces, its predictions are meaningful in terms of the underlying physical complementarity driving molecular recognition. Availability and implementation: PInet scripts and models are available at https://github.com/FTD007/PInet. Supplementary information: Supplementary data are available at Bioinformatics online.

2019 ◽  
Vol 35 (22) ◽  
pp. 4794-4796 ◽  
Author(s):  
Qingzhen Hou ◽  
Paul F G De Geest ◽  
Christian J Griffioen ◽  
Sanne Abeln ◽  
Jaap Heringa ◽  
...  

Abstract. Motivation: Interpretation of ubiquitous protein sequence data has become a bottleneck in biomolecular research, due to a lack of structural and other experimental annotation data for these proteins. Prediction of protein interaction sites from sequence may be a viable substitute. We therefore recently developed a sequence-based random forest method for protein–protein interface prediction, which yielded significantly higher performance than other methods on both homomeric and heteromeric protein–protein interactions. Here, we present a webserver that implements this method efficiently. Results: With the aim of accelerating our previous approach, we obtained sequence conservation profiles by re-mastering the alignment of homologous sequences found by PSI-BLAST. This yielded a more than 10-fold speedup with at least the same accuracy as reported previously for our method; these results allowed us to offer the method as a webserver. The webserver interface is targeted to the non-expert user. The input is simply a sequence of the protein of interest, and the output is a table with scores indicating the likelihood of having an interaction interface at a certain position. As the method is sequence-based and not sensitive to the type of protein interaction, we expect this webserver to be of interest to many biological researchers in academia and in industry. Availability and implementation: Webserver, source code and datasets are available at www.ibi.vu.nl/programs/serendipwww/. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Author(s):  
Joseph Szymborski ◽  
Amin Emad

Motivation: Computational methods for the prediction of protein-protein interactions, while important tools for researchers, are plagued by challenges in generalising to unseen proteins. Datasets used for modelling protein-protein predictions are particularly predisposed to information leakage and sampling biases. Results: In this study, we introduce RAPPPID, a method for the Regularised Automatic Prediction of Protein-Protein Interactions using Deep Learning. RAPPPID is a twin AWD-LSTM network which employs multiple regularisation methods during training time to learn generalised weights. Testing on stringent interaction datasets composed of proteins not seen during training, RAPPPID outperforms state-of-the-art methods. Further experiments show that RAPPPID's performance holds regardless of the particular proteins in the testing set and its performance is higher for biologically supported edges. This study serves to demonstrate that appropriate regularisation is an important component of overcoming the challenges of creating models for protein-protein interaction prediction that generalise to unseen proteins. Availability and Implementation: Code and datasets are freely available at https://github.com/jszym/rapppid. Supplementary Information: Online-only supplementary data is available at the journal's website.
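The idea of testing only on proteins not seen during training can be illustrated with a minimal sketch. This is not the RAPPPID authors' procedure; it is one simple way to build a protein-disjoint split, and the function name `protein_disjoint_split`, its parameters, and the toy protein labels are all hypothetical.

```python
import random


def protein_disjoint_split(pairs, test_fraction=0.2, seed=0):
    """Split interaction pairs so that no test protein appears in training.

    Hypothetical helper for illustration only: proteins (not pairs) are
    assigned to the test side, and pairs mixing both sides are dropped.
    """
    proteins = sorted({p for pair in pairs for p in pair})
    rng = random.Random(seed)
    rng.shuffle(proteins)
    n_test = max(1, int(len(proteins) * test_fraction))
    test_proteins = set(proteins[:n_test])

    train, test, dropped = [], [], []
    for a, b in pairs:
        in_test = (a in test_proteins, b in test_proteins)
        if all(in_test):
            test.append((a, b))
        elif not any(in_test):
            train.append((a, b))
        else:
            # A mixed pair would expose a test protein during training.
            dropped.append((a, b))
    return train, test, dropped


# Toy example with made-up protein labels.
toy_pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("A", "F")]
train_pairs, test_pairs, dropped_pairs = protein_disjoint_split(toy_pairs, test_fraction=0.5)
```

Splitting pairs at random instead would usually place the same protein on both sides, which is one form of the information leakage the abstract describes. The cost of the stricter split is that mixed pairs are discarded.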


2018 ◽  
Author(s):  
Cen Wan ◽  
Domenico Cozzetto ◽  
Rui Fa ◽  
David T. Jones

Protein-protein interaction network data provide valuable information from which direct links between genes and their biological roles can be inferred. This information supports a fundamental hypothesis for protein function prediction: that interacting proteins tend to have similar functions. With the help of recently developed network embedding feature generation methods and deep maxout neural networks, it is possible to extract functional representations that encode direct links between protein-protein interaction information and protein function. Our novel method, STRING2GO, successfully adopts deep maxout neural networks to learn functional representations simultaneously encoding both protein-protein interactions and functional predictive information. The experimental results show that STRING2GO outperforms other network embedding-based prediction methods and one benchmark method adopted in a recent large-scale protein function prediction competition.
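The hypothesis that interacting proteins tend to have similar functions can be made concrete with a simple neighbour-vote baseline. This sketch is not STRING2GO, which relies on network embeddings and deep maxout networks; it is a deliberately simpler stand-in, and the function name, the `min_fraction` threshold, and the toy labels are assumptions.

```python
from collections import Counter


def predict_functions(protein, network, annotations, min_fraction=0.5):
    """Predict function terms for a protein by voting over its annotated
    interaction partners (a guilt-by-association baseline)."""
    neighbours = network.get(protein, set())
    annotated = [n for n in neighbours if n in annotations]
    if not annotated:
        return set()
    # Count how many annotated partners carry each term.
    counts = Counter(term for n in annotated for term in annotations[n])
    return {t for t, c in counts.items() if c / len(annotated) >= min_fraction}


# Toy network and annotations with made-up labels.
network = {"P1": {"P2", "P3", "P4"}}
annotations = {"P2": {"term_A", "term_B"}, "P3": {"term_A"}, "P4": {"term_C"}}
print(predict_functions("P1", network, annotations))  # {'term_A'}
```

Here two of the three annotated partners of `P1` carry `term_A`, so it passes the 0.5 threshold, while `term_B` and `term_C` (one partner each) do not.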


2020 ◽  
Vol 5 ◽  
pp. 20
Author(s):  
Rachel Cooley ◽  
Neesha Kara ◽  
Ning Sze Hui ◽  
Jonathan Tart ◽  
Chloë Roustan ◽  
...  

Targeting the interaction of proteins with weak binding affinities or low solubility represents a particular challenge for drug screening. The NanoLuc® Binary Technology (NanoBiT®) was originally developed to detect protein-protein interactions in live mammalian cells. Here we report the successful translation of the NanoBiT cellular assay into a biochemical, cell-free format using mammalian cell lysates. We show that the assay is suitable for the detection of both strong and weak protein interactions, such as those involving the binding of RAS oncoproteins to either RAF or phosphoinositide 3-kinase (PI3K) effectors, respectively, and that it is also effective for the study of poorly soluble protein domains such as the RAS binding domain of PI3K. Furthermore, the RAS interaction assay is sensitive and responds to both strong and weak RAS inhibitors. Our data show that the assay is robust, reproducible, cost-effective, and can be adapted for small and large-scale screening approaches. The NanoBiT Biochemical Assay offers an attractive tool for drug screening against challenging protein-protein interaction targets, including the interaction of RAS with PI3K.


Yeast ◽  
2000 ◽  
Vol 1 (2) ◽  
pp. 88-94 ◽  
Author(s):  
Albertha J. M. Walhout ◽  
Simon J. Boulton ◽  
Marc Vidal

The availability of complete genome sequences necessitates the development of standardized functional assays to analyse the tens of thousands of predicted gene products in high-throughput experimental settings. Such approaches are collectively referred to as ‘functional genomics’. One approach to investigate the properties of a proteome of interest is by systematic analysis of protein–protein interactions. So far, the yeast two-hybrid system is the most commonly used method for large-scale, high-throughput identification of potential protein–protein interactions. Here, we discuss several technical features of variants of the two-hybrid systems in light of data recently obtained from different protein interaction mapping projects for the budding yeast Saccharomyces cerevisiae and the nematode Caenorhabditis elegans.


2007 ◽  
Vol 4 (1) ◽  
pp. 40-50 ◽  
Author(s):  
Gautam Chaurasia ◽  
Yasir Iqbal ◽  
Christian Hänig ◽  
Hanspeter Herzel ◽  
Erich E. Wanker ◽  
...  

Summary Protein-protein interactions constitute the backbone of many molecular processes. This has motivated the recent construction of several large-scale human protein-protein interaction maps [1-10]. Although these maps clearly offer a wealth of information, their use is challenging: complexity, rapid growth, and fragmentation of interaction data hamper their usability. To overcome these hurdles, we have developed a publicly accessible database termed UniHI (Unified Human Interactome) for integration of human protein-protein interaction data. This database is designed to provide biomedical researchers a common platform for exploring previously disconnected human interaction maps. UniHI offers researchers flexible integrated tools for accessing comprehensive information about the human interactome. Several features included in the UniHI allow users to perform various types of network-oriented and functional analysis. At present, UniHI contains over 160,000 distinct interactions between 17,000 unique proteins from ten major interaction maps derived by both computational and experimental approaches [1-10]. Here we describe the details of the implementation and maintenance of UniHI and discuss the challenges that have to be addressed for a successful integration of interaction data.


2016 ◽  
Author(s):  
Yu Quan ◽  
Chao Xie ◽  
Rohan B. H. Williams ◽  
Peter F. R Little

Abstract. In this study, we analyse RNA-Seq data from panels of human lymphoblastoid cell lines (LCLs) to identify covariation in the mRNA levels of large numbers of genes. Such large scale covariation may have biological origin or be due to technical variation in analysis (generally referred to as batch effects). We show that batch effects cannot explain this covariation by demonstrating reproducibility across different human populations and across different methods of analysis. This view is also supported by enrichment of single and combinations of transcription factors (TFs) binding to cognate promoter regions, enrichment of genes shown to be sensitive to the knockdown of individual TFs, enrichment of functional pathways, and finally enrichment of protein-protein interactions in proteins encoded by groups of covarying genes. The properties of the groups of covarying genes are therefore most readily explained by the influence of cumulative variations in the effectors of gene expression that act in trans on cognate genes. We suggest that covariation has functional outcomes by showing that covariation of 83 genes involved in the spliceosome pathway accounts for 8–16% of the variation in the alternative splicing patterns of genes expressed in human LCLs.


2020 ◽  
Author(s):  
Salvador Guardiola ◽  
Monica Varese ◽  
Xavier Roig ◽  
Jesús Garcia ◽  
Ernest Giralt

NOTE: This preprint has been retracted by consensus from all authors. See the retraction notice in place above; the original text can be found under "Version 1", accessible from the version selector above.

Peptides, together with antibodies, are among the most potent biochemical tools to modulate challenging protein-protein interactions. However, current structure-based methods are largely limited to natural peptides and are not suitable for designing target-specific binders with improved pharmaceutical properties, such as macrocyclic peptides. Here we report a general framework that leverages the computational power of Rosetta for large-scale backbone sampling and energy scoring, followed by side-chain composition, to design heterochiral cyclic peptides that bind to a protein surface of interest. To showcase the applicability of our approach, we identified two peptides (PD-i3 and PD-i6) that target PD-1, a key immune checkpoint, and work as protein ligand decoys. A comprehensive biophysical evaluation confirmed their binding mechanism to PD-1 and their inhibitory effect on the PD-1/PD-L1 interaction. Finally, elucidation of their solution structures by NMR served as validation of our de novo design approach. We anticipate that our results will provide a general framework for designing target-specific drug-like peptides.

