Protein–Protein Interactions Prediction Base on Multiple Information Fusion via Graph Representation Learning

2022 ◽  
Vol 12 (4) ◽  
pp. 807-812
Author(s):  
Yan Li ◽  
Yu-Ren Zhang ◽  
Ping Zhang ◽  
Dong-Xu Li ◽  
Tian-Long Xiao

Protein–protein interactions (PPIs) have a critical impact on the biological processes of cells. Traditional experimental methods for identifying PPIs are costly in labor, materials, and time, so there is a strong need for computational methods to predict PPIs. Most existing computational methods rely on either the sequence characteristics or the internal structural characteristics of proteins, and so draw on a single type of feature. We therefore propose a novel method to predict PPIs based on multiple information fusion through graph representation learning. First, the attributes of each protein are computed from its known sequence using k-mers. Then, the known interacting protein pairs are assembled into an adjacency graph, and a graph representation learning method, the graph convolutional network, fuses the attributes of each protein with the graph structure information to produce features containing both kinds of information. Finally, these multi-information features are fed into a random forest classifier for prediction. Experimental results indicate that our method achieves an accuracy of 78.83% and an AUC of 86.10%. In conclusion, our method has excellent application prospects for predicting unknown PPIs.
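
The first two steps of this pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of k = 2, the 20-letter amino acid alphabet, the toy sequences, and the function names `kmer_features` and `gcn_propagate` are all assumptions made for illustration, and the propagation step omits the learned weight matrix and nonlinearity that a trained graph convolutional network would apply.

```python
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)


def kmer_features(sequence, k=2):
    """Return normalized k-mer frequencies for one protein sequence."""
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(vocab)}
    counts = [0.0] * len(vocab)
    for i in range(len(sequence) - k + 1):
        j = index.get(sequence[i:i + k])
        if j is not None:  # skip k-mers containing non-standard residues
            counts[j] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def gcn_propagate(edges, features):
    """One weight-free propagation step: D^-1/2 (A + I) D^-1/2 X."""
    n = len(features)
    neighbors = [{i} for i in range(n)]  # start with self-loops
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = [len(nb) for nb in neighbors]
    dim = len(features[0])
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j in neighbors[i]:
            w = 1.0 / math.sqrt(degree[i] * degree[j])
            for d in range(dim):
                row[d] += w * features[j][d]
        out.append(row)
    return out


# Toy data (hypothetical): three short sequences, one known interaction.
proteins = ["MKTAYIAKQR", "MKTAYIAKQL", "GGGGSGGGGS"]
X = [kmer_features(s, k=2) for s in proteins]
H = gcn_propagate([(0, 1)], X)  # proteins 0 and 1 interact; protein 2 is isolated
```

In this sketch the two interacting proteins end up with the average of their k-mer vectors, while the isolated protein keeps its own vector. The fused rows of `H` would then be paired and passed to a classifier such as a random forest.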

2021 ◽  
Vol 13 (3) ◽  
pp. 526
Author(s):  
Shengliang Pu ◽  
Yuanfeng Wu ◽  
Xu Sun ◽  
Xiaotong Sun

The nascent field of graph representation learning has shown superiority in handling graph data. Compared to conventional convolutional neural networks, graph-based deep learning has the advantages of illustrating class boundaries and modeling feature relationships. For hyperspectral image (HSI) classification, the primary problem might be how to convert hyperspectral data from regular grids into irregular domains. In this regard, we present a novel method that performs localized graph convolutional filtering on HSIs based on spectral graph theory. First, we applied principal component analysis (PCA) preprocessing to create localized hyperspectral data cubes with unsupervised feature reduction. These feature cubes, combined with localized adjacency matrices, were fed into the popular graph convolutional network in a standard supervised learning paradigm. Finally, we succeeded in analyzing diversified land covers by considering local graph structure with graph convolutional filtering. Experiments on real hyperspectral datasets demonstrated that the presented method offers promising classification performance compared with other popular competitors.


Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2111
Author(s):  
Bo-Wei Zhao ◽  
Zhu-Hong You ◽  
Lun Hu ◽  
Zhen-Hao Guo ◽  
Lei Wang ◽  
...  

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates in an instant. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture both the local and the global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. The proposed model is also expected to bring new inspiration and provide novel perspectives to relevant researchers.
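
DeepWalk learns high-order neighbor information by sampling truncated random walks over the graph and treating each walk as a sentence for a skip-gram model. The walk-sampling stage can be sketched as follows. This is an illustrative sketch rather than the LGDTI code: the node names, walk length, number of walks per node, and the helper names `build_adjacency` and `random_walks` are assumptions, and the skip-gram training that turns walks into embeddings is not shown.

```python
import random


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def random_walks(adj, walk_length=5, walks_per_node=2, seed=0):
    """Sample uniform truncated random walks starting from every node."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks


# Hypothetical toy drug-target graph.
edges = [("drugA", "target1"), ("drugA", "target2"), ("drugB", "target2")]
adj = build_adjacency(edges)
walks = random_walks(adj, walk_length=5, walks_per_node=2)
```

Each walk can reach nodes several hops from its start, which is why embeddings trained on such walks reflect more than first-order neighbors.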


2016 ◽  
Vol 5 (4) ◽  
pp. 93-98
Author(s):  
Wen Sun ◽  
Lin Han ◽  
Wenmao Xu ◽  
Yazhen Sun

Objective: The objective of this work is to search for a novel method to explore the disrupted pathways associated with periodontitis (PD) at the network level. Methods: First, the differentially expressed genes (DEGs) between PD patients and cognitively normal subjects were inferred using the LIMMA package. Then, the protein-protein interactions (PPI) in each pathway were explored with the Empirical Bayesian (EB) co-expression program. Specifically, we constructed a random model, took the 100th weight value as the threshold for disrupted PPI pathways, and determined the weight value of each pathway. Pathways whose weight value exceeded the threshold were treated as disrupted. Pathway enrichment analyses of the DEGs were carried out with the Expression Analysis Systematic Explorer (EASE) test. Finally, the two methods were compared, and the better one was selected according to which yielded richer and more significant pathways. Results: Using the LIMMA package, we identified 524 DEGs in all. We determined 0.115222 as the threshold value for disrupted PPI pathways; 258 pathways had a weight value > 0.115222 and were considered disrupted. Additionally, the 524 DEGs were enriched in 4 pathways under EASE = 0.1. Conclusion: We proposed a novel network method for inferring the disrupted pathways for PD. The disrupted pathways might be underlying biomarkers for treatment associated with PD.


2019 ◽  
Vol 20 (S16) ◽  
Author(s):  
Da Zhang ◽  
Mansur Kabuka

Background: Protein-protein interactions (PPIs) are constantly involved in dynamic pathological and biological processes. A thorough understanding of PPIs is therefore crucial for explaining disease occurrence, achieving optimal drug-target therapeutic effects, and describing protein complex structures. However, compared to the protein sequences obtainable from various species and organisms, the number of revealed protein-protein interactions is relatively limited. To address this dilemma, many research efforts have been devoted to facilitating the discovery of novel PPIs. Among these methods, PPI prediction techniques that rely only on protein sequence data are more widespread than methods that require extensive biological domain knowledge. Results: In this paper, we propose a multi-modal deep representation learning structure that incorporates protein physicochemical features with the graph topological features from the PPI networks. Specifically, our method not only takes the protein sequence information into account but also learns a topological representation for each protein node in the PPI networks. We construct a stacked auto-encoder architecture together with a continuous bag-of-words (CBOW) model based on generated metapaths to study PPI prediction. We then use supervised deep neural networks to identify the PPIs and classify the protein families. The PPI prediction accuracy for eight species ranged from 96.76% to 99.77%, which indicates that our multi-modal deep representation learning framework achieves superior performance compared to other computational methods. Conclusion: To the best of our knowledge, this is the first multi-modal deep representation learning framework for examining the PPI networks.


2017 ◽  
Vol 45 (12) ◽  
pp. 7094-7105 ◽  
Author(s):  
Milana Frenkel-Morgenstern ◽  
Alessandro Gorohovski ◽  
Somnath Tagore ◽  
Vaishnovi Sekar ◽  
Miguel Vazquez ◽  
...  

2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Guangyu Zhou ◽  
Muhao Chen ◽  
Chelsea J T Ju ◽  
Zheng Wang ◽  
Jyun-Yu Jiang ◽  
...  

The functional impact of protein mutations is reflected in the alteration of the conformation and thermodynamics of protein–protein interactions (PPIs). Quantifying the changes in two interacting proteins upon mutation is commonly carried out by computational approaches. Hence, extensive research efforts have been devoted to extracting energetic or structural features of proteins, followed by statistical learning methods to estimate the effects of mutations on PPI properties. Nonetheless, such features require extensive human labor and expert knowledge to obtain, and have limited ability to reflect point mutations. We present an end-to-end deep learning framework, MuPIPR (Mutation Effects in Protein–protein Interaction PRediction Using Contextualized Representations), to estimate the effects of mutations on PPIs. MuPIPR incorporates a contextualized representation mechanism of amino acids to propagate the effects of a point mutation to surrounding amino acid representations, thereby amplifying the subtle change in a long protein sequence. On top of that, MuPIPR leverages a Siamese residual recurrent convolutional neural encoder to encode a wild-type protein pair and its mutation pair. Multi-layer perceptron regressors are applied to the protein pair representations to predict the quantifiable changes of PPI properties upon mutation. Experimental evaluations show that, with only sequence information, MuPIPR outperforms various state-of-the-art systems on estimating the changes of binding affinity for SKEMPI v1, and offers comparable performance on SKEMPI v2. MuPIPR also demonstrates state-of-the-art performance on estimating the changes of buried surface areas. The software implementation is available at https://github.com/guangyu-zhou/MuPIPR.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Kanchan Jha ◽  
Sriparna Saha

Proteins are the primary building blocks of living organisms. They interact with other proteins and are thereby involved in various biological processes. Protein–protein interactions (PPIs) help in predicting, and hence understanding, the functionality of proteins, the causes and growth of diseases, and the design of new drugs. However, there is a vast gap between the available protein sequences and the identified protein–protein interactions. To bridge this gap, researchers have proposed several computational methods to reveal the interactions between proteins. These methods depend only on sequence-based information about proteins. With the advancement of technology, other types of protein information have become available, such as 3D structure information. Deep learning techniques have also been adopted successfully in various domains, including bioinformatics. The current work therefore focuses on using different modalities, such as 3D structures and sequence-based information of proteins, together with deep learning algorithms to predict PPIs. The proposed approach is divided into several phases. We first obtain several representations of proteins using their 3D coordinate information and three attributes of amino acids, the building blocks of proteins: hydropathy index, isoelectric point, and charge. A pre-trained ResNet50 model, a type of convolutional neural network, is used to extract features from these representations. Autocovariance and conjoint triad, two widely used sequence-based methods for encoding proteins, are used here as another modality. A stacked autoencoder is used to obtain a compact form of the sequence-based information. Finally, the features obtained from the different modalities are concatenated in pairs and fed into the classifier to predict labels for protein pairs.
We have experimented on a human PPI dataset and a Saccharomyces cerevisiae PPI dataset and compared our results with state-of-the-art deep-learning-based classifiers. The results achieved by the proposed method are superior to those obtained by the existing methods. Extensive experiments on different datasets indicate that our approach to learning and combining features from two different modalities is useful in PPI prediction.
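
Of the two sequence encodings named above, the conjoint triad is the simpler to show. The sketch below uses the commonly cited grouping of the 20 amino acids into seven classes and counts every run of three consecutive classes, giving a 343-dimensional vector per protein. It is a hedged illustration, not the authors' code: scaling by the maximum count is one simple normalization assumed here, and the function name `conjoint_triad` and the toy sequences are hypothetical.

```python
# Seven-class grouping of amino acids commonly used for the conjoint triad.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}


def conjoint_triad(sequence):
    """Encode a protein as 7*7*7 = 343 triad frequencies, scaled by the max count."""
    counts = [0] * 343
    classes = [RESIDUE_TO_CLASS.get(aa) for aa in sequence]
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        if None in (a, b, c):  # skip windows with non-standard residues
            continue
        counts[a * 49 + b * 7 + c] += 1
    peak = max(counts)
    return [x / peak for x in counts] if peak else [0.0] * 343


# Hypothetical toy sequences; a pair is represented by concatenating both vectors.
vec_a = conjoint_triad("AGVAGV")
vec_b = conjoint_triad("MKRDEC")
pair_features = vec_a + vec_b
```

Concatenating the two per-protein vectors mirrors the "concatenated in pairs" step described in the abstract, although the abstract applies it to autoencoder-compressed and ResNet50-derived features rather than to raw triad counts.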


2014 ◽  
Vol 12 (06) ◽  
pp. 1442008 ◽  
Author(s):  
Jung-Hsien Chiang ◽  
Jiun-Huang Ju

Protein–protein interactions (PPIs) are involved in the majority of biological processes. Identification of PPIs is therefore one of the key aims of biological research. Although there are many databases of PPIs, many other unidentified PPIs could be buried in the biomedical literature. Therefore, automated identification of PPIs from biomedical literature repositories could be used to discover otherwise hidden interactions. Search engines, such as Google, have been successfully applied to measure the relatedness among words. Inspired by such approaches, we propose a novel method to identify PPIs through semantic similarity measures among protein mentions. We define six semantic similarity measures as features based on the page counts retrieved from the MEDLINE database. A machine learning classifier, Random Forest, is trained using these features. The proposed approach achieves an averaged micro-F of 71.28% and an averaged macro-F of 64.03% over five PPI corpora, an improvement over the results of using only the conventional co-occurrence feature (averaged micro-F of 68.79% and averaged macro-F of 60.49%). A relation-word reinforcement further improves the averaged micro-F to 71.3% and the averaged macro-F to 65.12%. Comparing the results of the current work with other studies on the AIMed corpus (ranging from 77.58% to 85.1% in micro-F and from 62.18% to 76.27% in macro-F), we show that the proposed approach achieves a micro-F of 81.88% and a macro-F of 64.01% without the use of sophisticated feature extraction. Finally, we manually examine the newly discovered PPI pairs based on a literature review, and the results suggest that our approach could extract novel protein–protein interactions.
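
Because the abstract reports both micro-F and macro-F, it may help to see how the two averages differ. The sketch below follows one common convention, which is an assumption here and may not match the paper's exact evaluation protocol: macro-F averages the F-score of each group, while micro-F pools the true positives, false positives, and false negatives before computing a single F-score. The counts are made up purely for illustration.

```python
def f_score(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f(groups):
    """groups: list of (tp, fp, fn) tuples, one per corpus or class."""
    macro = sum(f_score(*g) for g in groups) / len(groups)
    tp = sum(g[0] for g in groups)
    fp = sum(g[1] for g in groups)
    fn = sum(g[2] for g in groups)
    micro = f_score(tp, fp, fn)
    return micro, macro


# Made-up counts: one large, easy group and one small, hard group.
counts = [(80, 20, 20), (5, 5, 15)]
micro, macro = micro_macro_f(counts)
```

With these counts the micro average is pulled toward the large group and the macro average toward the small one, which is one reason the two figures in an evaluation can diverge.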


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Wenzheng Ma ◽  
Yi Cao ◽  
Wenzheng Bao ◽  
Bin Yang ◽  
Yuehui Chen

Interactions between proteins play important roles in organisms and are involved in almost all activities in the cell. Research on protein-protein interactions (PPIs) can make a major contribution to the prevention and treatment of diseases. Many machine-learning-based methods have been proposed to predict PPIs. In this article, we propose a novel method, ACT-SVM, that can effectively predict PPIs. The ACT-SVM model maps protein sequences to numerical features: it performs feature extraction twice on each protein sequence to obtain vector A and descriptor CT, and combines them into a single vector. The feature vectors of the two proteins in a pair are then merged as the input of a support vector machine (SVM) classifier. We use nonredundant H. pylori and human datasets to verify the prediction performance of our method. The proposed method achieves a prediction accuracy of 0.727897 on the H. pylori dataset and 0.838799 on the human dataset. The results demonstrate that this method is a stable and reliable prediction model for PPIs.

