NEDD: a network embedding based method for predicting drug-disease associations

Abstract Background: Drug discovery is notoriously expensive, slow, and risky. Drug repositioning, which finds novel indications for approved drugs, has therefore become a popular way to save time and cost. To distinguish these novel indications accurately among the many latent associations between drugs and diseases, it is necessary to exploit abundant heterogeneous information about drugs and diseases. Results: In this article, we propose a meta-path-based computational method called NEDD to predict novel associations between drugs and diseases using heterogeneous information. First, we construct a heterogeneous network as an undirected graph by integrating drug-drug similarity, disease-disease similarity, and known drug-disease associations. NEDD uses meta paths of different lengths to explicitly capture the indirect relationships, or high-order proximity, within drugs and diseases, from which low-dimensional representation vectors of drugs and diseases are obtained. NEDD then uses a random forest classifier to predict novel associations between drugs and diseases. Conclusions: Experiments on a gold standard dataset containing 1933 validated drug–disease associations show that NEDD produces superior prediction results compared with state-of-the-art approaches.
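
The idea of capturing indirect relationships through meta paths can be illustrated with a small sketch. It is not NEDD's implementation: the toy matrices, the binarized similarity, and the helper names `matmul` and `transpose` are assumptions made for illustration. Multiplying adjacency matrices along a meta path counts the path instances between each pair of end nodes.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Return the transpose of a dense matrix."""
    return [list(col) for col in zip(*m)]


# Hypothetical toy data: 3 drugs x 2 diseases, 1 = known association.
assoc = [
    [1, 0],
    [1, 1],
    [0, 1],
]

# Hypothetical binarized drug-drug similarity (1 = similar, no self-loops).
drug_sim = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Meta path drug -> disease -> drug: number of diseases two drugs share.
drug_disease_drug = matmul(assoc, transpose(assoc))

# Meta path drug -> drug -> disease: associations reachable through a
# similar drug. Drug 0 has no direct link to disease 1, but one path exists.
drug_drug_disease = matmul(drug_sim, assoc)

print(drug_disease_drug)
print(drug_drug_disease)
```

Rows of such path-count matrices could serve as raw features for an embedding step; how NEDD actually derives its low-dimensional vectors is not specified in the abstract.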

An Attention-Based Graph Neural Network for Heterogeneous Structural Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i04.5833 ◽ 2020 ◽ Vol 34 (04) ◽ pp. 4132-4139

Author(s): Huiting Hong ◽ Hantao Guo ◽ Yucheng Lin ◽ Xiaoqing Yang ◽ Zang Li ◽ ...

Keyword(s): Neural Network ◽ Structural Information ◽ Representation Learning ◽ Graph Representation ◽ Heterogeneous Information ◽ Domain Experts ◽ Proposed Model ◽ Meta Path ◽ Low Dimensional ◽ Public Datasets

In this paper, we focus on graph representation learning for heterogeneous information networks (HINs), in which various types of vertices are connected by various types of relations. Most existing methods for HINs adapt homogeneous graph embedding models via meta-paths to learn a low-dimensional vector space of the HIN. We propose a novel Heterogeneous Graph Structural Attention Neural Network (HetSANN) that directly encodes the structural information of a HIN without meta-paths and achieves more informative representations. With this method, domain experts are no longer needed to design meta-path schemes, and the heterogeneous information is processed automatically by the proposed model. Specifically, we implicitly represent heterogeneous information in two steps: 1) we model the transformation between heterogeneous vertices through a projection in low-dimensional entity spaces; 2) we then apply a graph neural network to aggregate multi-relational information from the projected neighborhood by means of an attention mechanism. We also present three extensions of HetSANN, i.e., voices-sharing product attention for the pairwise relationships in HIN, cycle-consistency loss to retain the transformation between heterogeneous entity spaces, and multi-task learning with full use of information. Experiments conducted on three public datasets demonstrate that our proposed models achieve significant and consistent improvements compared to state-of-the-art solutions.
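
The two steps described above (type-specific projection, then attention-weighted aggregation) can be sketched as follows. This is a simplified illustration, not HetSANN itself: the dot-product scoring, the node-type names, and the function `aggregate` are assumptions, and the paper's own attention function may differ.

```python
import math


def matvec(w, x):
    """Apply matrix w (list of rows) to vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(target, neighbors, projections):
    """Project each neighbor into the target's space, then attend.

    neighbors:   list of (node_type, feature_vector) pairs
    projections: dict mapping node_type -> projection matrix
    Scoring by dot product is an assumption made for brevity.
    """
    projected = [matvec(projections[t], h) for t, h in neighbors]
    scores = [sum(a * b for a, b in zip(target, p)) for p in projected]
    weights = softmax(scores)
    out = [sum(w * p[i] for w, p in zip(weights, projected))
           for i in range(len(target))]
    return out, weights


# Hypothetical example: an "author" node (2-d) with a 3-d "paper"
# neighbor and a 2-d "venue" neighbor.
target = [1.0, 0.0]
neighbors = [("paper", [2.0, 0.0, 1.0]), ("venue", [0.0, 1.0])]
projections = {
    "paper": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # 3-d -> 2-d
    "venue": [[1.0, 0.0], [0.0, 1.0]],            # 2-d -> 2-d
}

out, weights = aggregate(target, neighbors, projections)
print(out, weights)
```

Because every neighbor type gets its own projection matrix, no hand-designed meta-path is needed to decide which neighbors are comparable; the attention weights decide how much each projected neighbor contributes.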

A graph auto-encoder model for miRNA-disease associations prediction

Briefings in Bioinformatics ◽ 10.1093/bib/bbaa240 ◽ 2020

Author(s): Zhengwei Li ◽ Jiashu Li ◽ Ru Nie ◽ Zhu-Hong You ◽ Wenzheng Bao

Keyword(s): Neural Networks ◽ Clinical Medicine ◽ Area Under The Curve ◽ Heterogeneous Information ◽ Source Codes ◽ Differentially Expressed miRNAs ◽ Disease Associations ◽ Graph Neural Networks ◽ New Biomarkers ◽ Low Dimensional

Abstract Emerging evidence indicates that the abnormal expression of miRNAs is involved in the evolution and progression of various complex human diseases. Identifying disease-related miRNAs as new biomarkers can promote the development of disease pathology and clinical medicine. However, designing biological experiments to validate disease-related miRNAs is usually time-consuming and expensive. Therefore, effective computational methods for predicting potential miRNA-disease associations are urgently needed. Inspired by the great progress of graph neural networks in link prediction, we propose a novel graph auto-encoder model, named GAEMDA, to identify potential miRNA-disease associations in an end-to-end manner. More specifically, the GAEMDA model applies a graph neural network-based encoder, which contains an aggregator function and a multi-layer perceptron for aggregating nodes’ neighborhood information, to generate low-dimensional embeddings of miRNA and disease nodes and to fuse heterogeneous information effectively. The embeddings of miRNA and disease nodes are then fed into a bilinear decoder to identify potential links between miRNA and disease nodes. The experimental results indicate that GAEMDA achieves an average area under the curve of $93.56\pm 0.44\%$ under 5-fold cross-validation. We further carried out case studies on colon neoplasms, esophageal neoplasms and kidney neoplasms. As a result, 48 of the top 50 predicted miRNAs associated with these diseases are confirmed by the database of differentially expressed miRNAs in human cancers and the microRNA deregulation in human disease database, respectively. The satisfactory prediction performance suggests that the GAEMDA model could serve as a reliable tool to guide subsequent research on the regulatory role of miRNAs. The source code is available at https://github.com/chimianbuhetang/GAEMDA.
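
A bilinear decoder of the kind mentioned above scores a pair of embeddings through a learned matrix. The sketch below shows the general form only; the sigmoid output, the identity weight matrix, and the name `bilinear_score` are assumptions for illustration and are not taken from the GAEMDA source code.

```python
import math


def bilinear_score(z_mirna, weight, z_disease):
    """Return sigmoid(z_mirna^T * weight * z_disease) for one pair."""
    wz = [sum(w_ij * d_j for w_ij, d_j in zip(row, z_disease))
          for row in weight]
    logit = sum(m_i * v for m_i, v in zip(z_mirna, wz))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical 2-d embeddings and an identity weight matrix, which
# reduces the bilinear form to a plain dot product.
weight = [[1.0, 0.0], [0.0, 1.0]]
z_mirna = [1.0, 0.0]

aligned = bilinear_score(z_mirna, weight, [1.0, 0.0])     # logit = 1
orthogonal = bilinear_score(z_mirna, weight, [0.0, 1.0])  # logit = 0

print(aligned, orthogonal)
```

In a trained model the weight matrix would be learned jointly with the encoder, so that it can express interactions between embedding dimensions that a plain dot product cannot.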

MGRL: Predicting Drug-Disease Associations Based on Multi-Graph Representation Learning

Frontiers in Genetics ◽ 10.3389/fgene.2021.657182 ◽ 2021 ◽ Vol 12

Author(s): Bo-Wei Zhao ◽ Zhu-Hong You ◽ Leon Wong ◽ Ping Zhang ◽ Hao-Yuan Li ◽ ...

Keyword(s): High Efficiency ◽ Drug Repositioning ◽ Representation Learning ◽ Computational Method ◽ Graph Representation ◽ Practical Applications ◽ Study Results ◽ Disease Associations ◽ Better Than

Drug repositioning is an application-oriented approach that mines existing drugs to find new targets, quickly discovers new drug-disease associations, and reduces the risk of drug discovery in traditional medicine and biology. It is therefore of great significance to design a computational model with high efficiency and accuracy. In this paper, we propose a novel computational method, MGRL, to predict drug-disease associations based on multi-graph representation learning. More specifically, MGRL first uses a graph convolution network to learn the graph representation of drugs and diseases from their self-attributes. Then, a graph embedding algorithm is used to represent the relationships between drugs and diseases. Finally, the two kinds of graph representation learning features are fed into a random forest classifier for training. To the best of our knowledge, this is the first work to construct a multi-graph to extract the characteristics of drugs and diseases to predict drug-disease associations. The experiments show that MGRL achieves an AUC of 0.8506 under five-fold cross-validation, which is significantly better than other existing methods. Case study results show the reliability of the proposed method, which is of great significance for practical applications.

Predicting Drug-Disease Association Based on Ensemble Strategy

Frontiers in Genetics ◽ 10.3389/fgene.2021.666575 ◽ 2021 ◽ Vol 12

Author(s): Jianlin Wang ◽ Wenxiu Wang ◽ Chaokun Yan ◽ Junwei Luo ◽ Ge Zhang

Keyword(s): Prediction Models ◽ Drug Repositioning ◽ Predictive Ability ◽ Disease Association ◽ New Drugs ◽ Similarity Network ◽ Disease Similarity ◽ Ensemble Strategy ◽ Disease Associations ◽ Reducing Costs

Drug repositioning is used to find new uses for existing drugs, effectively shortening the drug research and development cycle and reducing costs and risks. This work develops a novel computational drug repositioning approach based on ensemble learning, called CMAF, to discover potential drug-disease associations. First, for new drugs and diseases or unknown drug-disease pairs, an association probability is obtained from their known neighbor information by applying the weighted K nearest known neighbors (WKNKN) method, which improves the drug-disease association information. Then, a new drug similarity network and a new disease similarity network are constructed. Three prediction models are applied and ensembled to produce the final predictions for drug-disease pairs based on the improved drug-disease association information and the constructed similarity networks. The experimental results demonstrate that the developed approach outperforms recent state-of-the-art prediction models. Case studies further confirm the predictive ability of the proposed method.
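
The WKNKN preprocessing step can be sketched as below. The sketch assumes the commonly described form of the method (neighbors ranked by similarity, weights decayed by a factor `eta` per rank, normalization by the summed similarities, known entries never lowered) and fills only the drug side; the toy data and the name `wknkn_rows` are hypothetical, and CMAF's exact variant may differ.

```python
def wknkn_rows(assoc, sim, k=2, eta=0.5):
    """Estimate missing associations for each row from its K nearest
    known neighbors (rows with at least one known association)."""
    n = len(assoc)
    filled = []
    for i in range(n):
        # Candidate neighbors: other rows that have known associations.
        known = [j for j in range(n) if j != i and any(assoc[j])]
        known.sort(key=lambda j: sim[i][j], reverse=True)
        nearest = known[:k]
        norm = sum(sim[i][j] for j in nearest)
        row = []
        for c in range(len(assoc[0])):
            if norm == 0:
                estimate = 0.0
            else:
                # The weight decays with neighbor rank: eta ** rank.
                estimate = sum((eta ** rank) * sim[i][j] * assoc[j][c]
                               for rank, j in enumerate(nearest)) / norm
            # Keep known associations; only raise unknown entries.
            row.append(max(assoc[i][c], estimate))
        filled.append(row)
    return filled


# Hypothetical toy data: drug 0 is new (no known associations).
assoc = [
    [0, 0],
    [1, 0],
    [0, 1],
]
sim = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.1],
    [0.2, 0.1, 1.0],
]

filled = wknkn_rows(assoc, sim)
print(filled[0])
```

The new drug inherits a high score for disease 0 from its most similar neighbor and a much lower score for disease 1 from the weaker, rank-decayed neighbor.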

Overcoming sparseness of biomedical networks to identify drug repositioning candidates

10.1101/2020.06.07.138966 ◽ 2020 ◽ Cited By ~ 1

Author(s): Aleksandar Poleksic

Keyword(s): Biological Networks ◽ Drug Targets ◽ Drug Repositioning ◽ Drug Repurposing ◽ Drug Efficacy ◽ Computational Procedure ◽ Computational Method ◽ Biomedical Data ◽ Disease Associations ◽ Disease Associated Genes

Abstract Modeling complex biological systems is necessary to understand biochemical interactions behind pharmacological effects of drugs. Successful in silico drug repurposing requires a thorough exploration of diverse biochemical concepts and their relationships, including drug’s adverse reactions, drug targets, disease symptoms, as well as disease associated genes and their pathways, to name a few. We present a computational method for inferring drug-disease associations from complex but incomplete and biased biological networks. Our method employs the compressed sensing technique to overcome the sparseness of biomedical data and, in turn, to enrich the set of verified relationships between different biomedical entities. We present a strategy for identifying network paths supportive of drug efficacy as well as a computational procedure capable of combining different network patterns to better distinguish treatments from non-treatments. The data and programs are freely available at http://bioinfo.cs.uni.edu/AEONET.html.

gGATLDA: lncRNA-disease association prediction based on graph-level graph attention network

BMC Bioinformatics ◽ 10.1186/s12859-021-04548-z ◽ 2022 ◽ Vol 23 (1)

Author(s): Li Wang ◽ Cheng Zhong

Keyword(s): Characteristic Curve ◽ Experimental Results ◽ Computational Method ◽ Attention Network ◽ Feature Vectors ◽ Disease Similarity ◽ Potential Association ◽ Receiver Operation Characteristic ◽ Disease Pair ◽ Disease Associations

Abstract Background: Long non-coding RNAs (lncRNAs) are related to human diseases through their regulation of gene expression. Identifying lncRNA-disease associations (LDAs) will contribute to the diagnosis, treatment, and prognosis of diseases. However, identifying LDAs through biological experiments is time-consuming, costly and inefficient. Therefore, the development of efficient and high-accuracy computational methods for predicting LDAs is of great significance. Results: In this paper, we propose a novel computational method (gGATLDA) to predict LDAs based on a graph-level graph attention network. First, we extract the enclosing subgraph of each lncRNA-disease pair. Second, we construct the feature vectors by integrating lncRNA similarity and disease similarity as node attributes in the subgraphs. Finally, we train a graph neural network (GNN) model by feeding it the subgraphs and feature vectors, and use the trained GNN model to predict potential lncRNA-disease association scores. The experimental results show that our method achieves a higher area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AUPR), accuracy and F1-score than state-of-the-art methods in five-fold cross-validation. Case studies show that our method can effectively identify lncRNAs associated with breast cancer, gastric cancer, prostate cancer, and renal cancer. Conclusion: The experimental results indicate that our method is a useful approach for predicting potential LDAs.
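
The first step, extracting the enclosing subgraph of a lncRNA-disease pair, can be sketched with a breadth-first search. The sketch assumes the usual definition (the subgraph induced by all nodes within a fixed number of hops of either endpoint); the toy graph, node names, and function names are hypothetical, and gGATLDA's own extraction and labeling details are not given in the abstract.

```python
from collections import deque


def k_hop(adj, start, hops):
    """Return all nodes within `hops` edges of `start` (inclusive)."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == hops:
            continue  # do not expand beyond the hop limit
        for nb in adj.get(node, ()):
            if nb not in depth:
                depth[nb] = depth[node] + 1
                queue.append(nb)
    return set(depth)


def enclosing_subgraph(adj, u, v, hops=1):
    """Induced subgraph on the union of the k-hop neighborhoods of u and v."""
    nodes = k_hop(adj, u, hops) | k_hop(adj, v, hops)
    edges = {tuple(sorted((a, b)))
             for a in nodes for b in adj.get(a, ()) if b in nodes}
    return nodes, edges


# Hypothetical bipartite graph: l* are lncRNAs, d* are diseases.
adj = {
    "l1": ["d1"],
    "l2": ["d1", "d2"],
    "l3": ["d2"],
    "l4": ["d3"],
    "d1": ["l1", "l2"],
    "d2": ["l2", "l3"],
    "d3": ["l4"],
}

nodes, edges = enclosing_subgraph(adj, "l1", "d2", hops=1)
print(sorted(nodes))
print(sorted(edges))
```

Each extracted subgraph then becomes one training example for a graph-level classifier. When the target pair is itself a known association, link-prediction setups of this kind typically remove that edge from the subgraph before training so the label does not leak.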

Temporal Heterogeneous Information Network Embedding

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2021/203 ◽ 2021

Author(s): Hong Huang ◽ Ruize Shi ◽ Wei Zhou ◽ Xiao Wang ◽ Hai Jin ◽ ...

Keyword(s): Temporal Dynamics ◽ Information Network ◽ Hawkes Process ◽ Heterogeneous Information Network ◽ Heterogeneous Information ◽ Meta Path ◽ Low Dimensional ◽ Type Node ◽ Link Recommendation ◽ Node Embeddings

Heterogeneous information network (HIN) embedding, which learns low-dimensional representations of multi-type nodes, has been applied widely and has achieved excellent performance. However, most previous works focus on static heterogeneous networks or on learning node embeddings within specific snapshots, and little attention has been paid to the whole evolution process or to capturing all temporal dynamics. To fill this gap and obtain multi-type node embeddings that consider all temporal dynamics during the evolution, we propose a novel temporal HIN embedding method (THINE). THINE not only uses an attention mechanism and meta-paths to preserve structures and semantics in the HIN but also incorporates the Hawkes process to simulate the evolution of the temporal network. Our extensive evaluations with various real-world temporal HINs demonstrate that THINE achieves state-of-the-art performance in both static and dynamic tasks, including node classification, link prediction, and temporal link recommendation.
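
The Hawkes process mentioned above models how past events temporarily raise the rate of future events. The sketch below shows the generic univariate form with an exponential kernel, lambda(t) = base + sum over past events of alpha * exp(-decay * (t - t_i)). The parameter values and the name `hawkes_intensity` are assumptions; THINE's own formulation, which ties the excitation to node embeddings, is not reproduced here.

```python
import math


def hawkes_intensity(t, history, base=0.1, alpha=0.5, decay=1.0):
    """Conditional intensity at time t given earlier event times.

    Each past event adds an excitation of size `alpha` that decays
    exponentially at rate `decay`; events at or after t are ignored.
    """
    return base + sum(alpha * math.exp(-decay * (t - t_i))
                      for t_i in history if t_i < t)


# Hypothetical event times, e.g. when edges were formed around a node.
history = [1.0, 2.0]

soon_after = hawkes_intensity(2.5, history)   # recent events still excite
much_later = hawkes_intensity(10.0, history)  # excitation has mostly decayed
no_events = hawkes_intensity(5.0, [])         # falls back to the base rate

print(soon_after, much_later, no_events)
```

In a temporal network setting, the intensity of a new link forming between two nodes rises shortly after related interactions and fades as they recede into the past, which is what lets such a model use the full event history rather than isolated snapshots.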

A Novel Computational Method for Predicting LncRNA-Disease Associations from Heterogeneous Information Network with SDNE Embedding Model

Intelligent Computing Theories and Application - Lecture Notes in Computer Science ◽ 10.1007/978-3-030-60802-6_44 ◽ 2020 ◽ pp. 505-513

Author(s): Ping Zhang ◽ Bo-Wei Zhao ◽ Leon Wong ◽ Zhu-Hong You ◽ Zhen-Hao Guo ◽ ...

Keyword(s): Computational Method ◽ Information Network ◽ Heterogeneous Information Network ◽ Heterogeneous Information ◽ Disease Associations

Predicting miRNA-Disease Associations by Incorporating Projections in Low-Dimensional Space and Local Topological Information

Genes ◽ 10.3390/genes10090685 ◽ 2019 ◽ Vol 10 (9) ◽ pp. 685 ◽ Cited By ~ 1

Author(s): Xuan ◽ Zhang ◽ Li ◽ Zhao

Keyword(s): Dimensional Space ◽ Characteristic Curve ◽ Feature Space ◽ Superior Performance ◽ Topological Information ◽ Heterogeneous Information ◽ Feature Representations ◽ Disease Associations ◽ Precision Recall Curve ◽ Low Dimensional

Predicting the potential microRNA (miRNA) candidates associated with a disease helps in exploring the mechanisms of disease development. Most recent approaches have utilized heterogeneous information about miRNAs and diseases, including miRNA similarities, disease similarities, and miRNA-disease associations. However, these methods do not utilize the projections of miRNAs and diseases in a low-dimensional space. Thus, it is necessary to develop a method that can utilize the effective information in the low-dimensional space to predict potential disease-related miRNA candidates. We proposed a method based on non-negative matrix factorization, named DMAPred, to predict potential miRNA-disease associations. DMAPred exploits the similarities and associations of diseases and miRNAs, and it integrates local topological information of the miRNA network. The likelihood that a miRNA is associated with a disease also depends on their projections in low-dimensional space. Therefore, we project miRNAs and diseases into low-dimensional feature space to yield their low-dimensional and dense feature representations. Moreover, the sparse characteristic of miRNA-disease associations was introduced to make our predictive model more credible. DMAPred achieved superior performance for 15 well-characterized diseases with AUCs (area under the receiver operating characteristic curve) ranging from 0.860 to 0.973 and AUPRs (area under the precision-recall curve) ranging from 0.118 to 0.761. In addition, case studies on breast, prostatic, and lung neoplasms demonstrated the ability of DMAPred to discover potential disease-related miRNAs.
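
Projecting miRNAs and diseases into a shared low-dimensional space with non-negative matrix factorization can be sketched as follows. This is plain NMF with the standard multiplicative update rules, not DMAPred's objective, which additionally incorporates similarity, topology, and sparsity terms; the toy matrix, rank, and helper names are assumptions.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Return the transpose of a dense matrix."""
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius norm of v - w*h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor v (n x m, non-negative) into w (n x rank) and h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # Update h: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # Update w: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical miRNA-disease association matrix (3 miRNAs x 3 diseases).
v = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]

w0, h0 = nmf(v, rank=2, iters=0)  # random starting point
w, h = nmf(v, rank=2)             # after 200 updates

print(squared_error(v, w0, h0), squared_error(v, w, h))
```

The rows of `w` are the low-dimensional miRNA representations and the columns of `h` the disease representations; the entries of their product serve as association scores for unobserved pairs.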

Predicting drug−disease associations via sigmoid kernel-based convolutional neural networks

Journal of Translational Medicine ◽ 10.1186/s12967-019-2127-5 ◽ 2019 ◽ Vol 17 (1) ◽ Cited By ~ 3

Author(s): Han-Jing Jiang ◽ Zhu-Hong You ◽ Yu-An Huang

Keyword(s): Deep Learning ◽ Drug Repositioning ◽ Structural Similarity ◽ Computational Method ◽ Superior Performance ◽ Resource Saving ◽ New Drug ◽ Disease Pair ◽ Disease Associations ◽ Series Of Experiments

Abstract Background: In drug development, computational drug repositioning is an effective and resource-saving way to identify new drug–disease associations. Recent years have witnessed great progress in the field of data mining with the advent of deep learning, and an increasing number of deep learning-based techniques have been proposed to develop computational tools in bioinformatics. Methods: Along this promising direction, we propose a computational drug repositioning method that combines a Sigmoid Kernel with a Convolutional Neural Network (SKCNN) and is able to learn, via its hidden layers, new features that effectively represent drug–disease associations. Specifically, we first construct a similarity metric for drugs using drug sigmoid similarity and drug structural similarity, and one for diseases using disease sigmoid similarity and disease semantic similarity. Based on the combined similarities of drugs and diseases, we then use SKCNN to learn hidden representations for each drug-disease pair, whose labels are finally predicted by a random forest classifier. Results: A series of experiments was conducted for performance evaluation, and the results show that the proposed SKCNN improves prediction accuracy compared with other state-of-the-art approaches. Case studies of two selected diseases were also conducted, which demonstrate the superior performance of our method in the actual discovery of potential drug indications. Conclusion: The aim of this study was to establish an effective predictive model for finding new drug–disease associations. The experimental results show that SKCNN can effectively predict associations between drugs and diseases.
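
A sigmoid kernel similarity between drugs can be sketched from their association profiles. The sketch uses the standard sigmoid (hyperbolic tangent) kernel, tanh(gamma * <x, y> + coef0); the choice of profiles as input, the default `gamma` of 1 / vector length, and the function names are assumptions, since the abstract does not give SKCNN's exact settings.

```python
import math


def sigmoid_kernel(x, y, gamma=None, coef0=0.0):
    """Standard sigmoid kernel: tanh(gamma * <x, y> + coef0)."""
    if gamma is None:
        gamma = 1.0 / len(x)  # assumed default: 1 / number of features
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


def similarity_matrix(profiles):
    """Pairwise sigmoid-kernel similarity between all profile rows."""
    return [[sigmoid_kernel(p, q) for q in profiles] for p in profiles]


# Hypothetical drug profiles: each row marks the diseases a drug treats.
profiles = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

sim = similarity_matrix(profiles)
print(sim[0][1], sim[0][2])
```

Drugs that share known indications get a positive similarity, while drugs with disjoint profiles score zero. A matrix like this could then be combined with a structural similarity before being passed to the network, as the abstract describes.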
