A Novel Method to Predict Drug-Target Interactions Based on Large-Scale Graph Representation Learning

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with the time-consuming and labor-intensive in vivo experimental methods, the computational models can provide high-quality DTI candidates in an instant. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes can be aggregated by the graph convolutional network (GCN); on the other hand, the high-order neighbor information of nodes can be learned by the graph embedding method called DeepWalk. Finally, the two kinds of feature are fed into the random forest classifier to train and predict potential DTIs. The results show that our method obtained area under the receiver operating characteristic curve (AUROC) of 0.9455 and area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. Moreover, the proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.

Download Full-text

EmbedDTI: Enhancing the Molecular Representations via Sequence Embedding and Graph Convolutional Network for the Prediction of Drug-Target Interaction

Biomolecules ◽

10.3390/biom11121783 ◽

2021 ◽

Vol 11 (12) ◽

pp. 1783

Author(s):

Yuan Jin ◽

Jiarui Lu ◽

Runhan Shi ◽

Yang Yang

Keyword(s):

Drug Target ◽

Large Scale ◽

Structural Information ◽

Representation Learning ◽

Convolutional Network ◽

Drug Discovery And Development ◽

Target Proteins ◽

Target Interaction ◽

Drug Databases ◽

Benchmark Datasets

The identification of drug-target interaction (DTI) plays a key role in drug discovery and development. Benefitting from large-scale drug databases and verified DTI relationships, a lot of machine-learning methods have been developed to predict DTIs. However, due to the difficulty in extracting useful information from molecules, the performance of these methods is limited by the representation of drugs and target proteins. This study proposes a new model called EmbedDTI to enhance the representation of both drugs and target proteins, and improve the performance of DTI prediction. For protein sequences, we leverage language modeling for pretraining the feature embeddings of amino acids and feed them to a convolutional neural network model for further representation learning. For drugs, we build two levels of graphs to represent compound structural information, namely the atom graph and substructure graph, and adopt graph convolutional network with an attention module to learn the embedding vectors for the graphs. We compare EmbedDTI with the existing DTI predictors on two benchmark datasets. The experimental results show that EmbedDTI outperforms the state-of-the-art models, and the attention module can identify the components crucial for DTIs in compounds.

Download Full-text

Hybrid Low-Order and Higher-Order Graph Convolutional Networks

Computational Intelligence and Neuroscience ◽

10.1155/2020/3283890 ◽

2020 ◽

Vol 2020 ◽

pp. 1-9

Author(s):

Fangyuan Lei ◽

Xun Liu ◽

Qingyun Dai ◽

Bingo Wing-Kuen Ling ◽

Huimin Zhao ◽

...

Keyword(s):

Computational Complexity ◽

Large Scale ◽

Representation Learning ◽

Higher Order ◽

Semisupervised Learning ◽

Graph Representation ◽

Convolutional Network ◽

Convolutional Networks ◽

Proposed Model ◽

Low Order

With the higher-order neighborhood information of a graph network, the accuracy of graph representation learning classification can be significantly improved. However, the current higher-order graph convolutional networks have a large number of parameters and high computational complexity. Therefore, we propose a hybrid lower-order and higher-order graph convolutional network (HLHG) learning model, which uses a weight sharing mechanism to reduce the number of network parameters. To reduce the computational complexity, we propose a novel information fusion pooling layer to combine the high-order and low-order neighborhood matrix information. We theoretically compare the computational complexity and the number of parameters of the proposed model with those of the other state-of-the-art models. Experimentally, we verify the proposed model on large-scale text network datasets using supervised learning and on citation network datasets using semisupervised learning. The experimental results show that the proposed model achieves higher classification accuracy with a small set of trainable weight parameters.

Download Full-text

Multi-hop Attention Graph Neural Networks

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/425 ◽

2021 ◽

Author(s):

Guangtao Wang ◽

Rex Ying ◽

Jing Huang ◽

Jure Leskovec

Keyword(s):

Neural Networks ◽

Large Scale ◽

Performance Metrics ◽

State Of The Art ◽

Structural Information ◽

Representation Learning ◽

Attention Mechanism ◽

Graph Representation ◽

Knowledge Graph ◽

Graph Neural Networks

Self-attention mechanism in graph neural networks (GNNs) led to state-of-the-art performance on many graph representation learning tasks. Currently, at every layer, attention is computed between connected pairs of nodes and depends solely on the representation of the two nodes. However, such attention mechanism does not account for nodes that are not directly connected but provide important network context. Here we propose Multi-hop Attention Graph Neural Network (MAGNA), a principled way to incorporate multi-hop context information into every layer of attention computation. MAGNA diffuses the attention scores across the network, which increases the receptive field for every layer of the GNN. Unlike previous approaches, MAGNA uses a diffusion prior on attention values, to efficiently account for all paths between the pair of disconnected nodes. We demonstrate in theory and experiments that MAGNA captures large-scale structural information in every layer, and has a low-pass effect that eliminates noisy high-frequency information from graph data. Experimental results on node classification as well as the knowledge graph completion benchmarks show that MAGNA achieves state-of-the-art results: MAGNA achieves up to 5.7% relative error reduction over the previous state-of-the-art on Cora, Citeseer, and Pubmed. MAGNA also obtains the best performance on a large-scale Open Graph Benchmark dataset. On knowledge graph completion MAGNA advances state-of-the-art on WN18RR and FB15k-237 across four different performance metrics.

Download Full-text

A Drug Target Interaction Prediction Based on LINE-RF Learning

Current Bioinformatics ◽

10.2174/1574893615666191227092453 ◽

2020 ◽

Vol 15 (7) ◽

pp. 750-757

Author(s):

Jihong Wang ◽

Yue Shi ◽

Xiaodan Wang ◽

Huiyou Chang

Keyword(s):

Network Topology ◽

Drug Target ◽

Large Scale ◽

Representation Learning ◽

New Drugs ◽

Combination Method ◽

Learning Methods ◽

Network Representation ◽

On Line ◽

Clinical Experiments

Background: At present, using computer methods to predict drug-target interactions (DTIs) is a very important step in the discovery of new drugs and drug relocation processes. The potential DTIs identified by machine learning methods can provide guidance in biochemical or clinical experiments. Objective: The goal of this article is to combine the latest network representation learning methods for drug-target prediction research, improve model prediction capabilities, and promote new drug development. Methods: We use large-scale information network embedding (LINE) method to extract network topology features of drugs, targets, diseases, etc., integrate features obtained from heterogeneous networks, construct binary classification samples, and use random forest (RF) method to predict DTIs. Results: The experiments in this paper compare the common classifiers of RF, LR, and SVM, as well as the typical network representation learning methods of LINE, Node2Vec, and DeepWalk. It can be seen that the combined method LINE-RF achieves the best results, reaching an AUC of 0.9349 and an AUPR of 0.9016. Conclusion: The learning method based on LINE network can effectively learn drugs, targets, diseases and other hidden features from the network topology. The combination of features learned through multiple networks can enhance the expression ability. RF is an effective method of supervised learning. Therefore, the Line-RF combination method is a widely applicable method.

Download Full-text

Hyperspectral Image Classification with Localized Graph Convolutional Filtering

Remote Sensing ◽

10.3390/rs13030526 ◽

2021 ◽

Vol 13 (3) ◽

pp. 526

Author(s):

Shengliang Pu ◽

Yuanfeng Wu ◽

Xu Sun ◽

Xiaotong Sun

Keyword(s):

Hyperspectral Image ◽

Principal Component ◽

Representation Learning ◽

Classification Performance ◽

Hyperspectral Data ◽

Spectral Graph Theory ◽

Feature Reduction ◽

Graph Representation ◽

Novel Method ◽

Local Graph

The nascent graph representation learning has shown superiority for resolving graph data. Compared to conventional convolutional neural networks, graph-based deep learning has the advantages of illustrating class boundaries and modeling feature relationships. Faced with hyperspectral image (HSI) classification, the priority problem might be how to convert hyperspectral data into irregular domains from regular grids. In this regard, we present a novel method that performs the localized graph convolutional filtering on HSIs based on spectral graph theory. First, we conducted principal component analysis (PCA) preprocessing to create localized hyperspectral data cubes with unsupervised feature reduction. These feature cubes combined with localized adjacent matrices were fed into the popular graph convolution network in a standard supervised learning paradigm. Finally, we succeeded in analyzing diversified land covers by considering local graph structure with graph convolutional filtering. Experiments on real hyperspectral datasets demonstrated that the presented method offers promising classification performance compared with other popular competitors.

Download Full-text

Bistability of somatic pattern memories: stochastic outcomes in bioelectric circuits underlying regeneration

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2019.0765 ◽

2021 ◽

Vol 376 (1821) ◽

pp. 20190765 ◽

Cited By ~ 2

Author(s):

Giovanni Pezzulo ◽

Joshua LaPalme ◽

Fallon Durant ◽

Michael Levin

Keyword(s):

Large Scale ◽

Computational Models ◽

State Of The Art ◽

Memory Representation ◽

Theme Issue ◽

Evolutionary Innovation ◽

New Interpretation ◽

The Brain

Nervous systems’ computational abilities are an evolutionary innovation, specializing and speed-optimizing ancient biophysical dynamics. Bioelectric signalling originated in cells' communication with the outside world and with each other, enabling cooperation towards adaptive construction and repair of multicellular bodies. Here, we review the emerging field of developmental bioelectricity, which links the field of basal cognition to state-of-the-art questions in regenerative medicine, synthetic bioengineering and even artificial intelligence. One of the predictions of this view is that regeneration and regulative development can restore correct large-scale anatomies from diverse starting states because, like the brain, they exploit bioelectric encoding of distributed goal states—in this case, pattern memories. We propose a new interpretation of recent stochastic regenerative phenotypes in planaria, by appealing to computational models of memory representation and processing in the brain. Moreover, we discuss novel findings showing that bioelectric changes induced in planaria can be stored in tissue for over a week, thus revealing that somatic bioelectric circuits in vivo can implement a long-term, re-writable memory medium. A consideration of the mechanisms, evolution and functionality of basal cognition makes novel predictions and provides an integrative perspective on the evolution, physiology and biomedicine of information processing in vivo . This article is part of the theme issue ‘Basal cognition: multicellularity, neurons and the cognitive lens’.

Download Full-text

Multi-modal transportation recommendation with unified route representation learning

Proceedings of the VLDB Endowment ◽

10.14778/3430915.3430924 ◽

2020 ◽

Vol 14 (3) ◽

pp. 342-350

Author(s):

Hao Liu ◽

Jindong Han ◽

Yanjie Fu ◽

Jingbo Zhou ◽

Xinjiang Lu ◽

...

Keyword(s):

Large Scale ◽

Transportation Networks ◽

Representation Learning ◽

Transportation Systems ◽

Graph Representation ◽

Dynamic Graph ◽

Arbitrary Length ◽

Task Learning ◽

Semantic Coherence ◽

Spatio Temporal

Multi-modal transportation recommendation aims to provide the most appropriate travel route with various transportation modes according to certain criteria. After analyzing large-scale navigation data, we find that route representations exhibit two patterns: spatio-temporal autocorrelations within transportation networks and the semantic coherence of route sequences. However, there are few studies that consider both patterns when developing multi-modal transportation systems. To this end, in this paper, we study multi-modal transportation recommendation with unified route representation learning by exploiting both spatio-temporal dependencies in transportation networks and the semantic coherence of historical routes. Specifically, we propose to unify both dynamic graph representation learning and hierarchical multi-task learning for multi-modal transportation recommendations. Along this line, we first transform the multi-modal transportation network into time-dependent multi-view transportation graphs and propose a spatiotemporal graph neural network module to capture the spatial and temporal autocorrelation. Then, we introduce a coherent-aware attentive route representation learning module to project arbitrary-length routes into fixed-length representation vectors, with explicit modeling of route coherence from historical routes. Moreover, we develop a hierarchical multi-task learning module to differentiate route representations for different transport modes, and this is guided by the final recommendation feedback as well as multiple auxiliary tasks equipped in different network layers. Extensive experimental results on two large-scale real-world datasets demonstrate the performance of the proposed system outperforms eight baselines.

Download Full-text

An Attention-Based Graph Neural Network for Heterogeneous Structural Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5833 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4132-4139

Author(s):

Huiting Hong ◽

Hantao Guo ◽

Yucheng Lin ◽

Xiaoqing Yang ◽

Zang Li ◽

...

Keyword(s):

Neural Network ◽

Structural Information ◽

Representation Learning ◽

Graph Representation ◽

Heterogeneous Information ◽

Domain Experts ◽

Proposed Model ◽

Meta Path ◽

Low Dimensional ◽

Public Datasets

In this paper, we focus on graph representation learning of heterogeneous information network (HIN), in which various types of vertices are connected by various types of relations. Most of the existing methods conducted on HIN revise homogeneous graph embedding models via meta-paths to learn low-dimensional vector space of HIN. In this paper, we propose a novel Heterogeneous Graph Structural Attention Neural Network (HetSANN) to directly encode structural information of HIN without meta-path and achieve more informative representations. With this method, domain experts will not be needed to design meta-path schemes and the heterogeneous information can be processed automatically by our proposed model. Specifically, we implicitly represent heterogeneous information using the following two methods: 1) we model the transformation between heterogeneous vertices through a projection in low-dimensional entity spaces; 2) afterwards, we apply the graph neural network to aggregate multi-relational information of projected neighborhood by means of attention mechanism. We also present three extensions of HetSANN, i.e., voices-sharing product attention for the pairwise relationships in HIN, cycle-consistency loss to retain the transformation between heterogeneous entity spaces, and multi-task learning with full use of information. The experiments conducted on three public datasets demonstrate that our proposed models achieve significant and consistent improvements compared to state-of-the-art solutions.

Download Full-text

An effective drug-disease associations prediction model based on graphic representation learning over multi-biomolecular network

BMC Bioinformatics ◽

10.1186/s12859-021-04553-2 ◽

2022 ◽

Vol 23 (1) ◽

Author(s):

Hanjing Jiang ◽

Yabing Huang

Keyword(s):

High Performance ◽

Large Scale ◽

Representation Learning ◽

Biological Data ◽

Graph Representation ◽

Data Set ◽

Validation Experiment ◽

Biomolecular Network ◽

Disease Associations ◽

Drug Reposition

Abstract Background Drug-disease associations (DDAs) can provide important information for exploring the potential efficacy of drugs. However, up to now, there are still few DDAs verified by experiments. Previous evidence indicates that the combination of information would be conducive to the discovery of new DDAs. How to integrate different biological data sources and identify the most effective drugs for a certain disease based on drug-disease coupled mechanisms is still a challenging problem. Results In this paper, we proposed a novel computation model for DDA predictions based on graph representation learning over multi-biomolecular network (GRLMN). More specifically, we firstly constructed a large-scale molecular association network (MAN) by integrating the associations among drugs, diseases, proteins, miRNAs, and lncRNAs. Then, a graph embedding model was used to learn vector representations for all drugs and diseases in MAN. Finally, the combined features were fed to a random forest (RF) model to predict new DDAs. The proposed model was evaluated on the SCMFDD-S data set using five-fold cross-validation. Experiment results showed that GRLMN model was very accurate with the area under the ROC curve (AUC) of 87.9%, which outperformed all previous works in terms of both accuracy and AUC in benchmark dataset. To further verify the high performance of GRLMN, we carried out two case studies for two common diseases. As a result, in the ranking of drugs that were predicted to be related to certain diseases (such as kidney disease and fever), 15 of the top 20 drugs have been experimentally confirmed. Conclusions The experimental results show that our model has good performance in the prediction of DDA. GRLMN is an effective prioritization tool for screening the reliable DDAs for follow-up studies concerning their participation in drug reposition.

Download Full-text

Deep learning with feature embedding for compound-protein interaction prediction

10.1101/086033 ◽

2016 ◽

Cited By ~ 16

Author(s):

Fangping Wan ◽

Jianyang (Michael) Zeng

Keyword(s):

Deep Learning ◽

Protein Interaction ◽

Protein Interactions ◽

Large Scale ◽

Computational Models ◽

Drug Repositioning ◽

Representation Learning ◽

Small Scale ◽

Interaction Prediction ◽

Protein Interaction Prediction

AbstractAccurately identifying compound-protein interactions in silico can deepen our understanding of the mechanisms of drug action and significantly facilitate the drug discovery and development process. Traditional similarity-based computational models for compound-protein interaction prediction rarely exploit the latent features from current available large-scale unlabelled compound and protein data, and often limit their usage on relatively small-scale datasets. We propose a new scheme that combines feature embedding (a technique of representation learning) with deep learning for predicting compound-protein interactions. Our method automatically learns the low-dimensional implicit but expressive features for compounds and proteins from the massive amount of unlabelled data. Combining effective feature embedding with powerful deep learning techniques, our method provides a general computational pipeline for accurate compound-protein interaction prediction, even when the interaction knowledge of compounds and proteins is entirely unknown. Evaluations on current large-scale databases of the measured compound-protein affinities, such as ChEMBL and BindingDB, as well as known drug-target interactions from DrugBank have demonstrated the superior prediction performance of our method, and suggested that it can offer a useful tool for drug development and drug repositioning.

Download Full-text