scholarly journals Multifaceted protein–protein interaction prediction based on Siamese residual RCNN

2019 ◽  
Vol 35 (14) ◽  
pp. i305-i314 ◽  
Author(s):  
Muhao Chen ◽  
Chelsea J -T Ju ◽  
Guangyu Zhou ◽  
Xuelu Chen ◽  
Tianran Zhang ◽  
...  

AbstractMotivationSequence-based protein–protein interaction (PPI) prediction represents a fundamental computational biology problem. To address this problem, extensive research efforts have been made to extract predefined features from the sequences. Based on these features, statistical algorithms are learned to classify the PPIs. However, such explicit features are usually costly to extract, and typically have limited coverage on the PPI information.ResultsWe present an end-to-end framework, PIPR (Protein–Protein Interaction Prediction Based on Siamese Residual RCNN), for PPI predictions using only the protein sequences. PIPR incorporates a deep residual recurrent convolutional neural network in the Siamese architecture, which leverages both robust local features and contextualized information, which are significant for capturing the mutual influence of proteins sequences. PIPR relieves the data pre-processing efforts that are required by other systems, and generalizes well to different application scenarios. Experimental evaluations show that PIPR outperforms various state-of-the-art systems on the binary PPI prediction problem. Moreover, it shows a promising performance on more challenging problems of interaction type prediction and binding affinity estimation, where existing approaches fall short.Availability and implementationThe implementation is available at https://github.com/muhaochen/seq_ppi.git.Supplementary informationSupplementary data are available at Bioinformatics online.

2018 ◽  
Author(s):  
Muhao Chen ◽  
Chelsea Jui-Ting Ju ◽  
Guangyu Zhou ◽  
Tianran Zhang ◽  
Xuelu Chen ◽  
...  

Sequence-based protein-protein interaction (PPI) prediction represents a fundamental computational biology problem. To address this problem, extensive research efforts have been made to extract predefined features from the sequences. Based on these features, statistical algorithms are learned to classify the PPIs. However, such explicit features are usually costly to extract, and typically have limited coverage on the PPI information. Hence, we present an end-to-end framework, Lasagna, for PPI predictions using only the primary sequences of a protein pair. Lasagna incorporates a deep residual recurrent convolutional neural network in the Siamese learning architecture, which leverages both robust local features and contextualized information that are significant for capturing the mutual influence of protein sequences. Our framework relieves the data pre-processing efforts that are required by other systems, and generalizes well to different application scenarios. Experimental evaluations show that Lasagna outperforms various state-of-the-art systems on the binary PPI prediction problem. Moreover, it shows a promising performance on more challenging problems of interaction type prediction and binding affinity estimation, where existing approaches fall short.


2015 ◽  
Vol 2015 ◽  
pp. 1-7 ◽  
Author(s):  
Jianzhuang Yao ◽  
Hong Guo ◽  
Xiaohan Yang

Determining protein-protein interaction (PPI) in biological systems is of considerable importance, and prediction of PPI has become a popular research area. Although different classifiers have been developed for PPI prediction, no single classifier seems to be able to predict PPI with high confidence. We postulated that by combining individual classifiers the accuracy of PPI prediction could be improved. We developed a method called protein-protein interaction prediction classifiers merger (PPCM), and this method combines output from two PPI prediction tools, GO2PPI and Phyloprof, using Random Forests algorithm. The performance of PPCM was tested by area under the curve (AUC) using an assembled Gold Standard database that contains both positive and negative PPI pairs. Our AUC test showed that PPCM significantly improved the PPI prediction accuracy over the corresponding individual classifiers. We found that additional classifiers incorporated into PPCM could lead to further improvement in the PPI prediction accuracy. Furthermore, cross species PPCM could achieve competitive and even better prediction accuracy compared to the single species PPCM. This study established a robust pipeline for PPI prediction by integrating multiple classifiers using Random Forests algorithm. This pipeline will be useful for predicting PPI in nonmodel species.


2021 ◽  
Author(s):  
Joseph Szymborski ◽  
Amin Emad

Motivation: Computational methods for the prediction of protein-protein interactions, while important tools for researchers, are plagued by challenges in generalising to unseen proteins. Datasets used for modelling protein-protein predictions are particularly predisposed to information leakage and sampling biases. Results: In this study, we introduce RAPPPID, a method for the Regularised Automatic Prediction of Protein-Protein Interactions using Deep Learning. RAPPPID is a twin AWD-LSTM network which employs multiple regularisation methods during training time to learn generalised weights. Testing on stringent interaction datasets composed of proteins not seen during training, RAPPPID outperforms state-of-the-art methods. Further experiments show that RAPPPID's performance holds regardless of the particular proteins in the testing set and its performance is higher for biologically supported edges. This study serves to demonstrate that appropriate regularisation is an important component of overcoming the challenges of creating models for protein-protein interaction prediction that generalise to unseen proteins. Availability and Implementation: Code and datasets are freely available at https://github.com/jszym/rapppid. Contact: [email protected] Supplementary Information: Online-only supplementary data is available at the journal's website.


PLoS Genetics ◽  
2021 ◽  
Vol 17 (11) ◽  
pp. e1009869
Author(s):  
Jiajun Qiu ◽  
Kui Chen ◽  
Chunlong Zhong ◽  
Sihao Zhu ◽  
Xiao Ma

The perturbations of protein-protein interactions (PPIs) were found to be the main cause of cancer. Previous PPI prediction methods which were trained with non-disease general PPI data were not compatible to map the PPI network in cancer. Therefore, we established a novel cancer specific PPI prediction method dubbed NECARE, which was based on relational graph convolutional network (R-GCN) with knowledge-based features. It achieved the best performance with a Matthews correlation coefficient (MCC) = 0.84±0.03 and an F1 = 91±2% compared with other methods. With NECARE, we mapped the cancer interactome atlas and revealed that the perturbations of PPIs were enriched on 1362 genes, which were named cancer hub genes. Those genes were found to over-represent with mutations occurring at protein-macromolecules binding interfaces. Furthermore, over 56% of cancer treatment-related genes belonged to hub genes and they were significantly related to the prognosis of 32 types of cancers. Finally, by coimmunoprecipitation, we confirmed that the NECARE prediction method was highly reliable with a 90% accuracy. Overall, we provided the novel network-based cancer protein-protein interaction prediction method and mapped the perturbation of cancer interactome. NECARE is available at: https://github.com/JiajunQiu/NECARE.


Sign in / Sign up

Export Citation Format

Share Document