Inference of Protein-Protein Interactions by Using Co-evolutionary Information

Author(s):  
Tetsuya Sato ◽  
Yoshihiro Yamanishi ◽  
Katsuhisa Horimoto ◽  
Minoru Kanehisa ◽  
Hiroyuki Toh
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yang Li ◽  
Zheng Wang ◽  
Li-Ping Li ◽  
Zhu-Hong You ◽  
Wen-Zhun Huang ◽  
...  

Abstract: Various biochemical functions of organisms are performed by protein–protein interactions (PPIs). Therefore, recognizing protein–protein interactions is very important for understanding most life activities, such as DNA replication and transcription, protein synthesis and secretion, signal transduction and metabolism. Although high-throughput technology makes it possible to generate large-scale PPI data, it is costly in both time and labor and carries the risk of a high false-positive rate. To address this, the biology community is looking for computational methods that can quickly and efficiently discover large amounts of protein interaction data. In this paper, we propose a computational method for predicting PPIs that combines orthogonal locality preserving projections (OLPP) with a rotation forest (RoF) model, using protein sequence information. Specifically, each protein sequence is first converted into a position-specific scoring matrix (PSSM) containing protein evolutionary information by using the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). We then characterize each protein as a fixed-length feature vector by applying OLPP to its PSSM. Finally, we train an RoF classifier to distinguish interacting from non-interacting protein pairs. The proposed method yielded significantly better results than existing methods, with 90.07% and 96.09% prediction accuracy on the Yeast and Human datasets, respectively. Our experiments show that the proposed method can serve as a useful tool to accelerate the process of solving key problems in proteomics.


2019 ◽  
Vol 47 (W1) ◽  
pp. W338-W344 ◽  
Author(s):  
Carlos H M Rodrigues ◽  
Yoochan Myung ◽  
Douglas E V Pires ◽  
David B Ascher

Abstract: Protein–protein interactions are involved in most fundamental biological processes, with disease-causing mutations enriched at their interfaces. Here we present mCSM-PPI2, a novel machine learning computational tool designed to more accurately predict the effects of missense mutations on protein–protein interaction binding affinity. mCSM-PPI2 uses graph-based structural signatures to model effects of variations on the inter-residue interaction network, evolutionary information, complex network metrics and energetic terms to generate an optimised predictor. We demonstrate that our method outperforms previous methods, ranking first among 26 others on CAPRI blind tests. mCSM-PPI2 is freely available as a user-friendly webserver at http://biosig.unimelb.edu.au/mcsm_ppi2/.


2021 ◽  
Author(s):  
JinXuan Zhai ◽  
Ji-Yong An

Abstract: Background: Protein–protein interactions (PPIs) are involved in a number of cellular processes and play a key role inside cells. The prediction of PPIs is an important task towards the understanding of many bioinformatics functions and applications, such as predicting protein functions, gene-disease associations and disease-drug associations. Given that high-throughput methods are expensive and time-consuming, it is a challenging task to develop efficient and accurate computational methods for predicting PPIs. Results: In this study, a novel computational approach named WELM-SURF was developed to predict PPIs. The proposed method used the Position Specific Scoring Matrix (PSSM) to capture protein evolutionary information and employed Speeded-Up Robust Features (SURF) to extract key features from the PSSM of each protein sequence. The Weighted Extreme Learning Machine (WELM) features a short training time and classifies efficiently by optimizing a loss function over the weight matrix. Therefore, a WELM classifier was used to carry out classification. The cross-validation results show that WELM-SURF obtains average accuracies of 97.36% and 95.12% on the yeast and human datasets, respectively. The prediction ability of WELM-SURF was also compared with those of ELM-SURF, SVM-SURF and other existing approaches. The comparison results further indicate that WELM-SURF performs better than the other methods. Conclusion: The experimental results showed that the WELM-SURF method is very useful for predicting PPIs and can also be applied to other bioinformatics studies of proteins.
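
To make the weighted extreme learning machine idea more concrete, here is a minimal sketch of a generic weighted ELM for two classes. It assumes NumPy is available, uses a common inverse-class-size weighting, and trains on synthetic toy points rather than SURF features; the function names `train_welm` and `predict_welm`, the hidden-layer size, and the regularization constant `C` are illustrative assumptions, not the configuration reported in the abstract.

```python
import numpy as np


def train_welm(X, y, hidden=20, C=100.0, seed=0):
    """Fit a generic weighted ELM for labels in {-1, +1}; returns (W, b, beta)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))  # random input weights, never trained
    b = rng.normal(size=hidden)                # random biases, never trained
    H = np.tanh(X @ W + b)                     # hidden-layer outputs
    # Assumed weighting: each sample gets 1 / (size of its class), so both
    # classes contribute equally to the loss despite the imbalance.
    counts = {label: np.sum(y == label) for label in np.unique(y)}
    w = np.array([1.0 / counts[label] for label in y])
    # Closed-form weighted ridge solution:
    # beta = (I/C + H^T diag(w) H)^-1 H^T diag(w) y
    HtW = H.T * w
    beta = np.linalg.solve(np.eye(hidden) / C + HtW @ H, HtW @ y)
    return W, b, beta


def predict_welm(model, X):
    """Return +1 / -1 predictions from the sign of the output score."""
    W, b, beta = model
    scores = np.tanh(X @ W + b) @ beta
    return np.where(scores >= 0, 1, -1)


# Synthetic, imbalanced toy data (40 negatives, 5 positives); not real PPI features.
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(-2.0, 0.5, size=(40, 2)),
               data_rng.normal(2.0, 0.5, size=(5, 2))])
y = np.array([-1] * 40 + [1] * 5)
model = train_welm(X, y)
predictions = predict_welm(model, X)
accuracy = float(np.mean(predictions == y))
print(accuracy)
```

Only the output weights `beta` are solved for, in a single linear solve, which is where the short training time of ELM-style models comes from.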


2016 ◽  
Vol 25 (10) ◽  
pp. 1825-1833 ◽  
Author(s):  
Ji-Yong An ◽  
Fan-Rong Meng ◽  
Zhu-Hong You ◽  
Xing Chen ◽  
Gui-Ying Yan ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Jie Pan ◽  
Li-Ping Li ◽  
Chang-Qing Yu ◽  
Zhu-Hong You ◽  
Zhong-Hao Ren ◽  
...  

Protein-protein interactions (PPIs) in plants are crucial for understanding biological processes. Although high-throughput techniques have produced valuable information for identifying PPIs in plants, they are usually expensive, inefficient, and extremely time-consuming. Hence, there is an urgent need to develop novel computational methods to predict PPIs in plants. In this article, we propose a novel approach to predict PPIs in plants using only the information in protein sequences. Specifically, plant protein sequences are first converted into position-specific scoring matrices (PSSMs); then, the fast Walsh–Hadamard transform (FWHT) algorithm is used to extract feature vectors from each PSSM to obtain evolutionary information about plant proteins. Lastly, a rotation forest (RF) classifier is trained for prediction and evaluated. We named this approach FWHT-RF because FWHT and RF are used for feature extraction and classification, respectively. When FWHT-RF was applied to three plant PPI datasets, Maize, Rice, and Arabidopsis thaliana (Arabidopsis), its average accuracies under 5-fold cross-validation reached 95.20%, 94.42%, and 83.85%, respectively. To further evaluate the predictive power of FWHT-RF, we compared it with the state-of-the-art support vector machine (SVM) and K-nearest neighbor (KNN) classifiers in different aspects. The experimental results demonstrated that FWHT-RF can be a useful supplementary method to predict potential PPIs in plants.
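
As a rough illustration of the feature-extraction step described above, the sketch below applies a fast Walsh–Hadamard transform to each column of a PSSM. The zero-padding to a power-of-two length, the choice to keep the first few coefficients per column, and the names `fwht` and `pssm_fwht_features` are assumptions made for illustration; the abstract does not specify how FWHT-RF builds its feature vectors.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    The input length must be a power of two.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a


def pssm_fwht_features(pssm, keep=4):
    """Hypothetical featurization: transform each column, keep `keep` coefficients."""
    length = len(pssm)
    padded = 1
    while padded < max(length, keep):
        padded *= 2
    features = []
    for col in range(len(pssm[0])):
        # Zero-pad the column so its length is a power of two (an assumption).
        column = [row[col] for row in pssm] + [0] * (padded - length)
        features.extend(fwht(column)[:keep])
    return features


# Toy 3 x 2 matrix standing in for an L x 20 PSSM.
toy_pssm = [[1, 0], [0, 1], [1, 1]]
toy_features = pssm_fwht_features(toy_pssm, keep=4)
print(fwht([1, 0, 1, 0, 0, 1, 1, 0]))  # [4, 2, 0, -2, 0, 2, 0, 2]
print(toy_features)                    # [2, 2, 0, 0, 2, 0, 0, -2]
```

Because the number of kept coefficients is fixed, proteins of different lengths map to vectors of the same size, which is what a downstream classifier needs.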


2019 ◽  
Vol 15 ◽  
pp. 117693431987992 ◽  
Author(s):  
Ji-Yong An ◽  
Yong Zhou ◽  
Yu-Jun Zhao ◽  
Zi-Ji Yan

Background: Increasing evidence has indicated that protein-protein interactions (PPIs) play important roles in various aspects of the structural and functional organization of a cell. Thus, continuing to uncover potential PPIs is an important topic in the biomedical domain. Although various feature extraction methods combined with machine learning approaches have enhanced the prediction of PPIs, there remains room for improvement by developing novel and effective feature extraction methods and classifier approaches to identify PPIs. Method: In this study, we proposed a sequence-based feature extraction method called LCPSSMMF, which combined a local coding position-specific scoring matrix (PSSM) with multifeatures fusion. First, we used a novel local coding method based on the PSSM to build a new PSSM (CPSSM); the advantage of this method is that it incorporates global and local feature extraction, which can account for the interactions between residues in both continuous and discontinuous regions of amino acid sequences. Second, we adopted 2 different feature extraction methods (Local Average Group [LAG] and Bigram Probability [BP]) to capture multiple key features by employing the evolutionary information embedded in the CPSSM matrix. Finally, feature vectors were acquired by using a multifeatures fusion method. Result: To evaluate the performance of the proposed feature extraction approach, we employed a support vector machine (SVM) as the prediction classifier and applied this method to yeast and human PPI datasets. The prediction accuracies of LCPSSMMF were 93.43% and 90.41% on the yeast and human datasets, respectively. Moreover, we also compared the proposed method with previous sequence-based approaches on the yeast dataset by using the same SVM classifier. The experimental results indicated that the performance of LCPSSMMF significantly exceeded that of several other state-of-the-art methods. 
The results indicate that the LCPSSMMF approach can capture more local and global discriminatory information than almost all previous methods and performs well in identifying PPIs. To facilitate extensive research in future proteomics studies, we developed an LCPSSMMFSVM server, which is freely available for academic use at http://219.219.62.123:8888/LCPSSMMFSVM .
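
The two feature extractors named in the abstract above can be sketched in their commonly used textbook forms. This is an illustration under assumptions: the functions below operate on any row-wise matrix of per-residue scores, the local coding step that produces the CPSSM is not shown, the fusion is taken to be simple concatenation, and the names `bigram_probability` and `local_average_group` are hypothetical.

```python
def bigram_probability(pssm):
    """Bigram features: B[m][n] = sum over i of P[i][m] * P[i+1][n], flattened."""
    width = len(pssm[0])
    bigram = [[0.0] * width for _ in range(width)]
    # Pair every row with the row that follows it in the sequence.
    for current, following in zip(pssm, pssm[1:]):
        for m in range(width):
            for n in range(width):
                bigram[m][n] += current[m] * following[n]
    return [value for row in bigram for value in row]


def local_average_group(pssm, groups):
    """Split rows into `groups` consecutive blocks and average each column per block."""
    length, width = len(pssm), len(pssm[0])
    if length < groups:
        raise ValueError("need at least one row per group")
    features = []
    for g in range(groups):
        block = pssm[g * length // groups:(g + 1) * length // groups]
        for col in range(width):
            features.append(sum(row[col] for row in block) / len(block))
    return features


# Toy 4 x 2 matrix standing in for an L x 20 matrix of normalized scores.
toy = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
bp = bigram_probability(toy)
lag = local_average_group(toy, groups=2)
fused = bp + lag  # assumed fusion: plain concatenation
print(bp)   # [0.5, 1.0, 1.0, 0.5]
print(lag)  # [0.75, 0.25, 0.25, 0.75]
```

With 20 columns and 20 groups, each extractor would yield 400 values, so a concatenated vector would have 800 entries per protein regardless of sequence length.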

