Deep learning with feature embedding for compound-protein interaction prediction

2016 ◽  
Author(s):  
Fangping Wan ◽  
Jianyang (Michael) Zeng

Abstract: Accurately identifying compound-protein interactions in silico can deepen our understanding of the mechanisms of drug action and significantly facilitate the drug discovery and development process. Traditional similarity-based computational models for compound-protein interaction prediction rarely exploit the latent features in the currently available large-scale unlabelled compound and protein data, and are often limited to relatively small-scale datasets. We propose a new scheme that combines feature embedding (a representation learning technique) with deep learning for predicting compound-protein interactions. Our method automatically learns low-dimensional, implicit but expressive features for compounds and proteins from massive amounts of unlabelled data. Combining effective feature embedding with powerful deep learning techniques, our method provides a general computational pipeline for accurate compound-protein interaction prediction, even when the interaction knowledge of compounds and proteins is entirely unknown. Evaluations on current large-scale databases of measured compound-protein affinities, such as ChEMBL and BindingDB, as well as on known drug-target interactions from DrugBank, have demonstrated the superior prediction performance of our method and suggest that it can offer a useful tool for drug development and drug repositioning.

2021 ◽  
Author(s):  
Jian Wang ◽  
Nikolay V Dokholyan

In recent years, numerous structure-free, deep-learning-based neural networks have emerged that aim to predict compound-protein interactions for drug virtual screening. Although these methods show high prediction accuracy in their own tests, we find that they do not generalize to predicting interactions between unknown proteins and unknown small molecules, which hinders the use of state-of-the-art deep learning techniques in virtual screening. In our work, we develop a compound-protein interaction predictor, YueL, which can predict compound-protein interactions with high generalizability. In comprehensive tests on various data sets, we find that YueL is able to predict interactions between unknown compounds and unknown proteins. We anticipate that our work can motivate broad application of deep learning techniques for drug virtual screening to supersede traditional docking and cheminformatics methods.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Yipin Lei ◽  
Shuya Li ◽  
Ziyi Liu ◽  
Fangping Wan ◽  
Tingzhong Tian ◽  
...  

Abstract: Peptide-protein interactions are involved in various fundamental cellular functions and their identification is crucial for designing efficacious peptide therapeutics. Recently, a number of computational methods have been developed to predict peptide-protein interactions. However, most of the existing prediction approaches heavily depend on high-resolution structure data. Here, we present a deep learning framework for multi-level peptide-protein interaction prediction, called CAMP, including binary peptide-protein interaction prediction and corresponding peptide binding residue identification. Comprehensive evaluation demonstrated that CAMP can successfully capture the binary interactions between peptides and proteins and identify the binding residues along the peptides involved in the interactions. In addition, CAMP outperformed other state-of-the-art methods on binary peptide-protein interaction prediction. CAMP can serve as a useful tool in peptide-protein interaction prediction and identification of important binding residues in the peptides, which can thus facilitate the peptide drug discovery process.


Methods ◽  
2016 ◽  
Vol 110 ◽  
pp. 64-72 ◽  
Author(s):  
Kai Tian ◽  
Mingyu Shao ◽  
Yang Wang ◽  
Jihong Guan ◽  
Shuigeng Zhou

2017 ◽  
Vol 13 (9) ◽  
pp. 1781-1787 ◽  
Author(s):  
Huan Hu ◽  
Chunyu Zhu ◽  
Haixin Ai ◽  
Li Zhang ◽  
Jian Zhao ◽  
...  

RNA–protein interactions are essential for understanding many important cellular processes.


2021 ◽  
Author(s):  
Joseph Szymborski ◽  
Amin Emad

Motivation: Computational methods for the prediction of protein-protein interactions, while important tools for researchers, are plagued by challenges in generalising to unseen proteins. Datasets used for modelling protein-protein predictions are particularly predisposed to information leakage and sampling biases. Results: In this study, we introduce RAPPPID, a method for the Regularised Automatic Prediction of Protein-Protein Interactions using Deep Learning. RAPPPID is a twin AWD-LSTM network which employs multiple regularisation methods during training time to learn generalised weights. Testing on stringent interaction datasets composed of proteins not seen during training, RAPPPID outperforms state-of-the-art methods. Further experiments show that RAPPPID's performance holds regardless of the particular proteins in the testing set and its performance is higher for biologically supported edges. This study serves to demonstrate that appropriate regularisation is an important component of overcoming the challenges of creating models for protein-protein interaction prediction that generalise to unseen proteins. Availability and Implementation: Code and datasets are freely available at https://github.com/jszym/rapppid. Supplementary Information: Online-only supplementary data is available at the journal's website.
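The idea of testing only on proteins not seen during training can be illustrated with a minimal sketch. This is not the RAPPPID authors' dataset construction; the function name `protein_disjoint_split`, its parameters, and the choice to discard pairs that mix a training protein with a held-out protein are all assumptions made for illustration.

```python
import random


def protein_disjoint_split(pairs, test_fraction=0.2, seed=0):
    """Split interaction pairs so that no test protein occurs in training.

    Hypothetical helper for illustration; `pairs` is a list of
    (protein_a, protein_b) tuples.
    """
    # Collect every protein and shuffle deterministically.
    proteins = sorted({p for pair in pairs for p in pair})
    rng = random.Random(seed)
    rng.shuffle(proteins)

    # Hold out a fraction of proteins (at least one) for testing.
    n_test = max(1, int(len(proteins) * test_fraction))
    held_out = set(proteins[:n_test])

    train, test = [], []
    for a, b in pairs:
        if a in held_out and b in held_out:
            test.append((a, b))
        elif a not in held_out and b not in held_out:
            train.append((a, b))
        # Assumption: pairs mixing a training protein with a held-out
        # protein are dropped so that nothing leaks between the two sets.
    return train, test, held_out
```

Splitting by protein instead of by pair usually discards some interactions, which is the price of ruling out this form of leakage.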


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Thi Ngan Dong ◽  
Graham Brogden ◽  
Gisa Gerold ◽  
Megha Khosla

Background: Viral infections cause significant morbidity and mortality worldwide. Understanding the interaction patterns between a particular virus and human proteins plays a crucial role in unveiling the underlying mechanism of viral infection and pathogenesis. This could further help in the prevention and treatment of virus-related diseases. However, the task of predicting protein–protein interactions between a new virus and human cells is extremely challenging due to scarce data on virus-human interactions and the fast mutation rates of most viruses. Results: We developed a multitask transfer learning approach that exploits the information of around 24 million protein sequences and the interaction patterns from the human interactome to counter the problem of small training datasets. Instead of using hand-crafted protein features, we utilize statistically rich protein representations learned by a deep language modeling approach from a massive source of protein sequences. We also employ an additional objective which aims to maximize the probability of observing human protein–protein interactions. This additional task objective acts as a regularizer and also allows domain knowledge to be incorporated into the virus-human protein–protein interaction prediction model. Conclusions: Our approach achieved competitive results on 13 benchmark datasets and in the case study for the SARS-CoV-2 virus receptor. Experimental results show that our proposed model works effectively for both virus-human and bacteria-human protein–protein interaction prediction tasks. We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/multitask-transfer.


2009 ◽  
Vol 9 (4) ◽  
pp. 179-194 ◽  
Author(s):  
Raghuraj Rao ◽  
Kyaw Tun ◽  
Yuko Makita ◽  
Samavedham Lakshminarayanan ◽  
Pawan K. Dhar

Author(s):  
Guofeng Lv ◽  
Zhiqiang Hu ◽  
Yanguang Bi ◽  
Shaoting Zhang

The study of multi-type Protein-Protein Interaction (PPI) is fundamental for understanding biological processes from a systematic perspective and revealing disease mechanisms. Existing methods suffer from significant performance degradation when tested on unseen datasets. In this paper, we investigate the problem and find that it is mainly attributable to poor performance on inter-novel-protein interaction prediction. However, current evaluations overlook inter-novel-protein interactions, and thus fail to give an instructive assessment. We therefore propose to address the problem from both the evaluation and the methodology. Firstly, we design a new evaluation framework that fully respects inter-novel-protein interactions and gives consistent assessment across datasets. Secondly, we argue that correlations between proteins must provide useful information for the analysis of novel proteins, and based on this, we propose a graph neural network based method (GNN-PPI) for better inter-novel-protein interaction prediction. Experimental results on real-world datasets of different scales demonstrate that GNN-PPI significantly outperforms state-of-the-art PPI prediction methods, especially for inter-novel-protein interaction prediction.
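One way to make an evaluation aware of novel proteins is to group test pairs by how many of their proteins were absent from training. The sketch below only illustrates that idea and is not the evaluation framework proposed in the paper; the function name `categorize_test_pairs` and the three group labels are hypothetical.

```python
def categorize_test_pairs(train_pairs, test_pairs):
    """Group test pairs by the number of proteins unseen during training.

    Hypothetical helper for illustration; both arguments are lists of
    (protein_a, protein_b) tuples.
    """
    # Every protein that occurs in at least one training pair.
    seen = {p for pair in train_pairs for p in pair}

    labels = ("both_seen", "one_novel", "both_novel")
    groups = {label: [] for label in labels}
    for a, b in test_pairs:
        # Booleans add as integers: 0, 1, or 2 novel proteins in the pair.
        n_novel = (a not in seen) + (b not in seen)
        groups[labels[n_novel]].append((a, b))
    return groups
```

Reporting a metric separately for each group shows whether a model's score is driven mostly by pairs of already-seen proteins.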

