DRIMC: an improved drug repositioning approach using Bayesian inductive matrix completion

2020 ◽  
Vol 36 (9) ◽  
pp. 2839-2847 ◽  
Author(s):  
Wenjuan Zhang ◽  
Hunan Xu ◽  
Xiaozhong Li ◽  
Qiang Gao ◽  
Lin Wang

Abstract Motivation One of the most important problems in drug discovery research is to precisely predict a new indication for an existing drug, i.e. drug repositioning. Recent recommendation system-based methods have tackled this problem using matrix completion models. The models identify latent factors contributing to known drug-disease associations, and then infer novel drug-disease associations by the correlations between latent factors. However, these models have not fully considered the various drug data sources and the sparsity of the drug-disease association matrix. In addition, using the global structure of the drug-disease association data may introduce noise, and consequently limit the prediction power. Results In this work, we propose a novel drug repositioning approach by using Bayesian inductive matrix completion (DRIMC). First, we embed four drug data sources into a drug similarity matrix and two disease data sources into a disease similarity matrix. Then, for each drug or disease, its feature is described by similarity values between it and its nearest neighbors, and these features for drugs and diseases are mapped onto a shared latent space. We model the association probability for each drug-disease pair by inductive matrix completion, where the properties of drugs and diseases are represented by projections of drugs and diseases, respectively. As the known drug-disease associations have been manually verified, they are more trustworthy and important than the unknown pairs. We assign higher confidence levels to known association pairs compared with unknown pairs. We perform comprehensive experiments on three benchmark datasets, and DRIMC improves prediction accuracy compared with six state-of-the-art approaches. Availability and implementation Source code and datasets are available at https://github.com/linwang1982/DRIMC. Supplementary information Supplementary data are available at Bioinformatics online.
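
The neighbor-based features and the bilinear scoring described in this abstract can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names (`knn_features`, `project`, `association_probability`), the neighbor count `k`, and the projection matrices `W` and `H` are hypothetical, and the logistic link used to turn a score into a probability is an assumption, since the abstract does not state which link DRIMC uses.

```python
import math

def knn_features(similarity, k):
    """Keep each row's k largest similarities (excluding self); zero the rest."""
    features = []
    for i, row in enumerate(similarity):
        # Rank all other entities by their similarity to entity i.
        neighbors = sorted((j for j in range(len(row)) if j != i),
                           key=lambda j: row[j], reverse=True)[:k]
        keep = set(neighbors)
        features.append([row[j] if j in keep else 0.0 for j in range(len(row))])
    return features

def project(feature, weights):
    """Map a feature vector into the latent space (feature times weights)."""
    return [sum(f * w_row[c] for f, w_row in zip(feature, weights))
            for c in range(len(weights[0]))]

def association_probability(drug_feature, disease_feature, W, H):
    """Assumed logistic link on the inner product of the two projections."""
    u = project(drug_feature, W)      # drug in the shared latent space
    v = project(disease_feature, H)   # disease in the shared latent space
    score = sum(a * b for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-score))
```

In a full model, `W` and `H` would be learned from the known associations, with known pairs weighted more heavily than unknown ones as the abstract describes; that training step is omitted here.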

2019 ◽  
Vol 35 (14) ◽  
pp. i455-i463 ◽  
Author(s):  
Mengyun Yang ◽  
Huimin Luo ◽  
Yaohang Li ◽  
Jianxin Wang

Abstract Motivation Computational drug repositioning is a cost-effective strategy to identify novel indications for existing drugs. Drug repositioning is often modeled as a recommendation system problem. Taking advantage of the known drug–disease associations, the objective of the recommendation system is to identify new treatments by filling out the unknown entries in the drug–disease association matrix, which is known as matrix completion. Underpinned by the fact that common molecular pathways contribute to many different diseases, the recommendation system assumes that the underlying latent factors determining drug–disease associations are highly correlated. In other words, the drug–disease matrix to be completed is low-rank. Accordingly, matrix completion algorithms efficiently constructing low-rank drug–disease matrix approximations consistent with known associations can be of immense help in discovering the novel drug–disease associations. Results In this article, we propose to use a bounded nuclear norm regularization (BNNR) method to complete the drug–disease matrix under the low-rank assumption. Instead of strictly fitting the known elements, BNNR is designed to tolerate the noisy drug–drug and disease–disease similarities by incorporating a regularization term to balance the approximation error and the rank properties. Moreover, additional constraints are incorporated into BNNR to ensure that all predicted matrix entry values are within the specific interval. BNNR is carried out on an adjacency matrix of a heterogeneous drug–disease network, which integrates the drug–drug, drug–disease and disease–disease networks. It not only makes full use of available drugs, diseases and their association information, but also is capable of dealing with cold start naturally. Our computational results show that BNNR yields higher drug–disease association prediction accuracy than the current state-of-the-art methods. 
The most significant gain is in prediction precision measured as the fraction of the positive predictions that are truly positive, which is particularly useful in drug design practice. Case studies also confirm the accuracy and reliability of BNNR. Availability and implementation The code of BNNR is freely available at https://github.com/BioinformaticsCSU/BNNR. Supplementary information Supplementary data are available at Bioinformatics online.
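
As a rough illustration of the heterogeneous adjacency matrix and the bounded entries mentioned in this abstract, the sketch below stacks the three networks into one square matrix and clips values to an interval. The block layout, the interval [0, 1], and the function names are assumptions made for illustration; the nuclear norm minimization itself is not shown.

```python
def heterogeneous_adjacency(drug_sim, disease_sim, assoc):
    """Stack [[drug_sim, assoc], [assoc transposed, disease_sim]] into one matrix.

    drug_sim is n x n, disease_sim is m x m, assoc is n x m (drugs by diseases).
    """
    n_drugs, n_diseases = len(drug_sim), len(disease_sim)
    # Upper rows: drug-drug similarities followed by drug-disease associations.
    top = [list(drug_sim[i]) + list(assoc[i]) for i in range(n_drugs)]
    # Lower rows: transposed associations followed by disease-disease similarities.
    bottom = [[assoc[i][j] for i in range(n_drugs)] + list(disease_sim[j])
              for j in range(n_diseases)]
    return top + bottom

def clip_entries(matrix, low=0.0, high=1.0):
    """Force every entry into [low, high] (assumed interval)."""
    return [[min(max(x, low), high) for x in row] for row in matrix]
```

Completing this single matrix, rather than the association block alone, is what lets similarity information inform the prediction for a drug with no known associations.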


2020 ◽  
Vol 36 (8) ◽  
pp. 2538-2546 ◽  
Author(s):  
Jin Li ◽  
Sai Zhang ◽  
Tao Liu ◽  
Chenxi Ning ◽  
Zhuoxuan Zhang ◽  
...  

Abstract Motivation Predicting the association between microRNAs (miRNAs) and diseases plays an important role in identifying human disease-related miRNAs. As identification of miRNA-disease associations via biological experiments is time-consuming and expensive, computational methods are currently used as effective complements to determine the potential associations between disease and miRNA. Results We present a novel method of neural inductive matrix completion with graph convolutional network (NIMCGCN) for predicting miRNA-disease association. NIMCGCN first uses graph convolutional networks to learn miRNA and disease latent feature representations from the miRNA and disease similarity networks. Then, the learned features are input into a novel neural inductive matrix completion (NIMC) model to generate a completed association matrix. The parameters of NIMCGCN are learned from the known miRNA-disease association data in a supervised end-to-end way. We compared the proposed method with other state-of-the-art methods. The area under the receiver operating characteristic curve results showed that our method is significantly superior to existing methods. Furthermore, 50, 47 and 48 of the top 50 predicted miRNAs for three high-risk human diseases, namely, colon cancer, lymphoma and kidney cancer, were verified using experimental literature. Finally, 100% prediction accuracy was achieved when breast cancer was used as a case study to evaluate the ability of NIMCGCN for predicting a new disease without any known related miRNAs. Availability and implementation https://github.com/ljatynu/NIMCGCN/ Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Mengyun Yang ◽  
Gaoyan Wu ◽  
Qichang Zhao ◽  
Yaohang Li ◽  
Jianxin Wang

Abstract With the development of high-throughput technology and the accumulation of biomedical data, prior information about biological entities can be calculated from different aspects. Specifically, drug–drug similarities can be measured from target profiles, drug–drug interactions and side effects. Similarly, different methods and data sources to calculate disease ontology can result in multiple measures of pairwise disease similarities. Therefore, in computational drug repositioning, developing a dynamic method to optimize the fusion process of multiple similarities is a crucial and challenging task. In this study, we propose a multi-similarities bilinear matrix factorization (MSBMF) method to predict promising drug-associated indications for existing and novel drugs. Instead of fusing multiple similarities into a single similarity matrix, we concatenate these similarity matrices of drug and disease, respectively. Applying matrix factorization methods, we decompose the drug–disease association matrix into a drug-feature matrix and a disease-feature matrix. At the same time, using these feature matrices as a basis, we extract effective latent features representing the drug and disease similarity matrices to infer missing drug–disease associations. Moreover, these two factored matrices are constrained by non-negative factorization to ensure that the completed drug–disease association matrix is biologically interpretable. In addition, we numerically solve the MSBMF model by an efficient alternating direction method of multipliers algorithm. The computational experiment results show that MSBMF obtains higher prediction accuracy than the state-of-the-art drug repositioning methods in cross-validation experiments. Case studies also demonstrate the effectiveness of our proposed method in practical applications. Availability: The data and code of MSBMF are freely available at https://github.com/BioinformaticsCSU/MSBMF. 
Corresponding author: Jianxin Wang, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China. E-mail: [email protected] Supplementary Data: Supplementary data are available online at https://academic.oup.com/bib.


Molecules ◽  
2020 ◽  
Vol 25 (12) ◽  
pp. 2776
Author(s):  
Xiguang Qi ◽  
Mingzhe Shen ◽  
Peihao Fan ◽  
Xiaojiang Guo ◽  
Tianqi Wang ◽  
...  

A gene expression signature (GES) is a group of genes that shows a unique expression profile as a result of perturbations by drugs, genetic modification or diseases on the transcriptional machinery. Comparisons between GES profiles have been used to investigate the relationships between drugs, their targets and diseases, with quite a few successful cases reported. In the study of GES-guided drug–disease associations in particular, researchers believe that if a GES induced by a drug is opposite to a GES induced by a disease, the drug may have potential as a treatment of that disease. In this study, we data-mined the crowd extracted expression of differential signatures (CREEDS) database to evaluate the similarity between GES profiles from drugs and their indicated diseases. Our study aims to explore the application domains of GES-guided drug–disease associations through the analysis of the similarity of GES profiles on known pairs of drug–disease associations, thereby identifying subgroups of drugs/diseases that are suitable for GES-guided drug repositioning approaches. Our results supported our hypothesis that the GES-guided drug–disease association method is better suited for some subgroups or pathways such as drugs and diseases associated with the immune system, diseases of the nervous system, non-chemotherapy drugs or the mTOR signaling pathway.
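
The idea that a drug signature "opposite" to a disease signature suggests therapeutic potential can be illustrated with a simple similarity score. The sketch below uses cosine similarity over shared genes, which is an assumption: the abstract does not say which similarity measure the study used. The function name and the gene identifiers in the usage note are hypothetical.

```python
import math

def signature_similarity(drug_sig, disease_sig):
    """Cosine similarity of two signatures over the genes they share.

    Each signature maps a gene identifier to an expression change
    (for example a log fold change). Values near -1 indicate opposite
    profiles, values near +1 indicate similar profiles.
    """
    shared = sorted(set(drug_sig) & set(disease_sig))
    a = [drug_sig[g] for g in shared]
    b = [disease_sig[g] for g in shared]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # No shared genes, or an all-zero signature, carries no information.
    return dot / norm if norm else 0.0
```

For example, `signature_similarity({"GENE_A": 1.0, "GENE_B": -2.0}, {"GENE_A": -1.0, "GENE_B": 2.0})` is close to -1, the pattern that the reversal hypothesis treats as a candidate treatment.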


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Shanchen Pang ◽  
Yu Zhuang ◽  
Xinzeng Wang ◽  
Fuyu Wang ◽  
Sibo Qiao

Abstract Background A large number of biological studies have shown that miRNAs are inextricably linked to many complex diseases. Studying miRNA-disease associations could provide a root-cause understanding of the underlying pathogenesis, which in turn promotes progress in drug development. However, traditional biological experiments are very time-consuming and costly. Therefore, we propose an efficient model to address this challenge. Results In this work, we propose a deep learning model called EOESGC to predict potential miRNA-disease associations based on embedding of embedding and simplified convolutional network. Firstly, integrated disease similarity, integrated miRNA similarity, and miRNA-disease association network are used to construct a coupled heterogeneous graph, and the edges with low similarity are removed to simplify the graph structure and ensure the effectiveness of edges. Secondly, the Embedding of embedding model (EOE) is used to learn edge information in the coupled heterogeneous graph. The training rule of the model is that the associated nodes are close to each other and the unassociated nodes are far away from each other. Based on this rule, the learned edge information is added into the node embeddings as supplementary information to enrich node information. Then, the node embeddings produced by training the EOE model are used as new features of miRNAs and diseases, and information aggregation is performed by a simplified graph convolution model, in which each level of convolution can aggregate multi-hop neighbor information. In this step, we only use the miRNA-disease association network to further simplify the graph structure, thus reducing the computational complexity. Finally, feature embeddings of both miRNA and disease are spliced and fed into the MLP for prediction. In the evaluation of EOESGC, the AUC, AUPR and F1-score of our model are 0.9658, 0.8543 and 0.8644, respectively, under 5-fold cross-validation. 
Compared with the latest published models, our model shows better results. In addition, we predict the top 20 potential miRNAs for breast cancer and lung cancer, most of which are validated in the dbDEMC and HMDD3.2 databases. Conclusion The comprehensive experimental results show that EOESGC can effectively identify the potential miRNA-disease associations.


2019 ◽  
Vol 36 (7) ◽  
pp. 2209-2216 ◽  
Author(s):  
Herty Liany ◽  
Anand Jeyasekharan ◽  
Vaibhav Rajan

Abstract Motivation A synthetic lethal (SL) interaction is a relationship between two functional entities where the loss of either one of the entities is viable but the loss of both entities is lethal to the cell. Such pairs can be used as drug targets in targeted anticancer therapies, and so, many methods have been developed to identify potential candidate SL pairs. However, these methods use only a subset of available data from multiple platforms, at genomic, epigenomic and transcriptomic levels; and hence are limited in their ability to learn from complex associations in heterogeneous data sources. Results In this article, we develop techniques that can seamlessly integrate multiple heterogeneous data sources to predict SL interactions. Our approach obtains latent representations by collective matrix factorization-based techniques, which in turn are used for prediction through matrix completion. Our experiments, on a variety of biological datasets, illustrate the efficacy and versatility of our approach, that outperforms state-of-the-art methods for predicting SL interactions and can be used with heterogeneous data sources with minimal feature engineering. Availability and implementation Software available at https://github.com/lianyh. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Mengyun Yang ◽  
Lan Huang ◽  
Yunpei Xu ◽  
Chengqian Lu ◽  
Jianxin Wang

Abstract Motivation Emerging evidence shows that traditional drug discovery experiments are time-consuming and costly. Computational drug repositioning plays a critical role in saving time and resources for drug research and discovery. Therefore, developing more accurate and efficient approaches is imperative. Heterogeneous graph inference is a classical method in computational drug repositioning, which not only has high convergence precision, but also has fast convergence speed. However, the method has not fully considered the sparsity of the heterogeneous association network. In addition, a rough similarity measure can reduce the performance in identifying drug-associated indications. Results In this article, we propose a heterogeneous graph inference with matrix completion (HGIMC) method to predict potential indications for approved and novel drugs. First, we use a bounded matrix completion (BMC) model to prefill a part of the missing entries in the original drug–disease association matrix. This step can add more positive and formative drug–disease edges between the drug network and the disease network. Second, a Gaussian radial basis function (GRB) is employed to improve the drug and disease similarities, since the performance of heterogeneous graph inference relies heavily on similarity measures. Next, based on the updated drug–disease associations and new similarity measures of drug and disease, we construct a novel heterogeneous drug–disease network. Finally, HGIMC utilizes the heterogeneous network to infer the scores of unknown association pairs, and then recommends the promising indications for drugs. To evaluate the performance of our method, HGIMC is compared with five state-of-the-art approaches of drug repositioning in the 10-fold cross-validation and de novo tests. As the numerical results show, HGIMC not only achieves a better prediction performance but also has an excellent computation efficiency. 
In addition, case studies also confirm the effectiveness of our method in practical application. Availability and implementation The HGIMC software and data are freely available at https://github.com/BioinformaticsCSU/HGIMC, https://hub.docker.com/repository/docker/yangmy84/hgimc and http://doi.org/10.5281/zenodo.4285640. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Tian-Ru Wu ◽  
Meng-Meng Yin ◽  
Cui-Na Jiao ◽  
Ying-Lian Gao ◽  
Xiang-Zhen Kong ◽  
...  

Abstract Background MicroRNAs (miRNAs) are non-coding RNAs with regulatory functions. Many studies have shown that miRNAs are closely associated with human diseases. Among the methods to explore the relationship between the miRNA and the disease, traditional methods are time-consuming and the accuracy needs to be improved. In view of the shortcoming of previous models, a method, collaborative matrix factorization based on matrix completion (MCCMF) is proposed to predict the unknown miRNA-disease associations. Results The complete matrix of the miRNA and the disease is obtained by matrix completion. Moreover, Gaussian Interaction Profile kernel is added to the miRNA functional similarity matrix and the disease semantic similarity matrix. Then the Weight K Nearest Known Neighbors method is used to pretreat the association matrix, so the model is close to the reality. Finally, collaborative matrix factorization method is applied to obtain the prediction results. Therefore, the MCCMF obtains a satisfactory result in the fivefold cross-validation, with an AUC of 0.9569 (0.0005). Conclusions The AUC value of MCCMF is higher than other advanced methods in the fivefold cross validation experiment. In order to comprehensively evaluate the performance of MCCMF, accuracy, precision, recall and f-measure are also added. The final experimental results demonstrate that MCCMF outperforms other methods in predicting miRNA-disease associations. In the end, the effectiveness and practicability of MCCMF are further verified by researching three specific diseases.
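
The Gaussian Interaction Profile kernel mentioned in this abstract is commonly defined as K(i, j) = exp(-γ‖a_i − a_j‖²), where a_i is row i of the association matrix and γ is a bandwidth normalized by the mean squared norm of the profiles. The sketch below follows that common definition; the function name and the default `gamma_prime=1.0` are assumptions, and the abstract does not say how MCCMF combines this kernel with the functional and semantic similarities.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel between rows of an association matrix.

    profiles[i] is the binary association vector of entity i, e.g. one
    miRNA's associations across all diseases.
    """
    n = len(profiles)
    # Normalize the bandwidth by the average squared norm of the profiles.
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    return [[math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(pi, pj)))
             for pj in profiles]
            for pi in profiles]
```

Entities with identical profiles get a kernel value of 1, and the value decays toward 0 as the profiles diverge.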


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Hailin Chen ◽  
Zuping Zhang

Increasing evidence indicates that the inappropriate expression of microRNAs (miRNAs) can lead to many kinds of complex diseases and that drugs can regulate the expression level of miRNAs. Therefore, human diseases may be treated by targeting some specific miRNAs with drugs, which provides a new perspective for drug repositioning. However, few studies have attempted to computationally predict associations between drugs and diseases via miRNAs for drug repositioning. In this paper, we developed an inference model to achieve this aim by combining experimentally supported drug-miRNA associations and miRNA-disease associations with the assumption that drugs will form associations with diseases when they share some significant miRNA partners. Experimental results showed excellent performance of our model. Case studies demonstrated that some of the strongly predicted drug-disease associations can be confirmed by the publicly accessible database CTD (www.ctdbase.org), which indicated the usefulness of our inference model. Moreover, candidate miRNAs as molecular hypotheses underpinning the associations were listed to guide future experiments. The predicted results were released for further studies. We expect that this study will provide help in our understanding of drug-disease association prediction and in the roles of miRNAs in drug repositioning.
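
One way to make "share some significant miRNA partners" concrete is a hypergeometric enrichment test on the overlap between a drug's miRNAs and a disease's miRNAs. The sketch below takes that route as an assumption: the abstract does not name the statistical test used, and the function name is hypothetical.

```python
from math import comb

def shared_partner_pvalue(drug_mirnas, disease_mirnas, n_mirnas):
    """Return the shared miRNAs and P(overlap >= observed) under random draws.

    n_mirnas is the size of the miRNA universe; both sets must come from it.
    """
    drug_mirnas, disease_mirnas = set(drug_mirnas), set(disease_mirnas)
    shared = drug_mirnas & disease_mirnas
    k, m, n = len(shared), len(drug_mirnas), len(disease_mirnas)
    # Hypergeometric upper tail: draw n miRNAs, m of the universe are "hits".
    tail = sum(comb(m, x) * comb(n_mirnas - m, n - x)
               for x in range(k, min(m, n) + 1))
    return shared, tail / comb(n_mirnas, n)
```

A small p-value flags a drug-disease pair whose overlap is unlikely by chance, and the returned shared set gives the candidate miRNAs that could serve as the molecular hypothesis for that pair.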


2017 ◽  
Vol 4 (S) ◽  
pp. 76
Author(s):  
Duc-Hau Le

Computational drug repositioning has proven to be a promising and efficient strategy for discovering new uses for existing drugs. To achieve this goal, a number of computational methods have been proposed, which are based on different data sources of drugs and diseases and on different approaches. Depending on where the discovery of drug-disease relationships comes from, proposed computational methods can be categorized as either ‘drug-based’ or ‘disease-based’. The proposed methods usually identify new indications of drugs under the assumption that similar drugs can be used for similar diseases. Therefore, similarities between drugs and between diseases are usually used as inputs. In addition, known drug-disease associations are also needed for the methods. It should be noted that these associations are still not well established because many marketed drugs have been withdrawn, which could affect the outcome of these methods. In this study, instead of using the known drug-disease associations, we rely on known disease-gene and drug-target associations. In addition, similarity between drugs measured by chemical structures of drug compounds and similarity between diseases sharing phenotypes are used. Then, a semi-supervised learning model, Regularized Least Squares (RLS), which can exploit this information effectively, is used to find new uses of drugs. Experimental results demonstrate that our method, namely RLSDR, outperforms several state-of-the-art existing methods in terms of area under the ROC curve (AUC). Novel indications for a number of drugs are identified and validated by evidence from different resources.

