Predicting miRNA-Disease Association Based on Modularity Preserving Heterogeneous Network Embedding

Author(s):  
Wei Peng ◽  
Jielin Du ◽  
Wei Dai ◽  
Wei Lan

MicroRNAs (miRNAs) are a category of small non-coding RNAs that profoundly impact various biological processes related to human disease. Inferring potential miRNA-disease associations benefits the study of human diseases, including disease prevention, disease diagnosis, and drug development. In this work, we propose a novel heterogeneous network embedding-based method called MDN-NMTF (Module-based Dynamic Neighborhood Non-negative Matrix Tri-Factorization) for predicting miRNA-disease associations. MDN-NMTF constructs a heterogeneous network from a disease similarity network, a miRNA similarity network, and a known miRNA-disease association network. It then learns latent vector representations for the miRNAs and diseases in the heterogeneous network. Finally, the association probability is computed as the product of the latent miRNA and disease vectors. MDN-NMTF not only integrates diverse biological information about miRNAs and diseases to predict miRNA-disease associations, but also considers the module properties of miRNAs and diseases while learning the vector representations, which can maximally preserve the structural information and properties of the heterogeneous network. We also extend MDN-NMTF to a new version, called MDN-NMTF2, which uses modular information to improve miRNA-disease association prediction. Our methods and four existing methods are applied to predict miRNA-disease associations in four databases. The prediction results show that our methods improve miRNA-disease association prediction compared with the four existing methods.
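
The final scoring step described above (association probability as the product of latent miRNA and disease vectors) can be sketched as follows. This is a minimal illustration that assumes the latent vectors have already been learned. The function names and the toy vectors are hypothetical and are not taken from MDN-NMTF itself.

```python
def association_scores(mirna_vecs, disease_vecs):
    """Score every miRNA-disease pair as the dot product of their latent vectors."""
    return [
        [sum(m * d for m, d in zip(m_vec, d_vec)) for d_vec in disease_vecs]
        for m_vec in mirna_vecs
    ]


def rank_mirnas_for_disease(scores, disease_idx):
    """Return miRNA indices sorted by descending score for one disease."""
    return sorted(
        range(len(scores)),
        key=lambda i: scores[i][disease_idx],
        reverse=True,
    )


# Toy latent vectors (assumed values, for illustration only).
mirna_vecs = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
disease_vecs = [[1.0, 1.0], [0.0, 1.0]]

scores = association_scores(mirna_vecs, disease_vecs)
ranking = rank_mirnas_for_disease(scores, disease_idx=1)
```

Ranking candidate miRNAs per disease by these scores is the usual way such a score matrix is consumed in case studies.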

2020 ◽  
Author(s):  
Bo-Ya Ji ◽  
Zhu-Hong You ◽  
Zhan-Heng Chen ◽  
Leon Wong ◽  
Hai-Cheng Yi

Abstract Background As an important non-coding RNA newly discovered in recent years, microRNA (miRNA) plays an important role in a series of life processes and is closely associated with a variety of human diseases. Hence, the identification of potential miRNA-disease associations can make great contributions to the research and treatment of human diseases. However, to our knowledge, many of the existing state-of-the-art computational methods utilize only a single type of known association information between miRNAs and diseases to predict their potential associations, without considering their interactions or associations with other types of molecules. Results In this paper, a network embedding-based method built on the tripartite miRNA-protein-disease network (NEMPD) is proposed for the prediction of miRNA-disease associations. First, a tripartite miRNA-protein-disease network is created by integrating known miRNA-protein and protein-disease associations. Then, we use the network representation method Learning Graph Representations with Global Structural Information (GraRep) to obtain the behavior information (associations with proteins in the network) of miRNAs and diseases. Next, the behavior information of miRNAs and diseases is combined with their attribute information (disease semantic similarity and miRNA sequence information) to represent miRNA-disease pairs. Finally, the prediction model is built from the known miRNA-disease pairs using the Random Forest algorithm. Under five-fold cross validation, the average prediction accuracy, sensitivity, and AUC of NEMPD are 85.41%, 80.96%, and 91.58%, respectively. Furthermore, the performance of NEMPD was also validated by case studies. Among the top 50 predicted disease-related miRNAs, 48 (breast neoplasms), 47 (colon neoplasms), and 47 (lung neoplasms) were confirmed by two other databases. Conclusions NEMPD performs well in predicting potential associations between miRNAs and diseases and has great potential in the field of miRNA-disease association prediction.
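
The pair representation and the reported metrics can be illustrated with a short sketch. It assumes that a miRNA-disease pair is represented by concatenating the behavior and attribute vectors of both entities, which is one plausible reading of the description above rather than the authors' confirmed layout. All function names and toy values are hypothetical.

```python
def pair_features(mirna_behavior, mirna_attribute, disease_behavior, disease_attribute):
    """Concatenate behavior and attribute vectors into one pair descriptor (assumed layout)."""
    return (
        list(mirna_behavior)
        + list(mirna_attribute)
        + list(disease_behavior)
        + list(disease_attribute)
    )


def accuracy_and_sensitivity(y_true, y_pred):
    """Compute accuracy and sensitivity (recall on the positive class) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)
    return accuracy, sensitivity


# Toy vectors and labels (assumed values, for illustration only).
features = pair_features([0.1, 0.2], [1.0], [0.3], [0.0, 1.0])
accuracy, sensitivity = accuracy_and_sensitivity([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

A descriptor of this kind would then be passed to any classifier, such as a random forest, together with a 0/1 label for the pair.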


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Bo-Ya Ji ◽  
Zhu-Hong You ◽  
Zhan-Heng Chen ◽  
Leon Wong ◽  
Hai-Cheng Yi

Abstract Background As an important non-coding RNA, microRNA (miRNA) plays a significant role in a series of life processes and is closely associated with a variety of human diseases. Hence, identification of potential miRNA-disease associations can make great contributions to the research and treatment of human diseases. However, to our knowledge, many existing computational methods utilize only a single type of known association information between miRNAs and diseases to predict their potential associations, without considering their interactions or associations with other types of molecules. Results In this paper, we propose a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information. First, a heterogeneous network is constructed by integrating known associations among miRNAs, proteins and diseases, and the network representation method Learning Graph Representations with Global Structural Information (GraRep) is applied to learn the behavior information of miRNAs and diseases in the network. Then, the behavior information of miRNAs and diseases is combined with their attribute information to represent miRNA-disease association pairs. Finally, the prediction model is built using the Random Forest algorithm. Under five-fold cross validation, the proposed NEMPD model achieved an average prediction accuracy of 85.41% and a sensitivity of 80.96%, with an AUC of 91.58%. Furthermore, the performance of NEMPD is also validated by case studies. Among the top 50 predicted disease-related miRNAs, 48 (breast neoplasms), 47 (colon neoplasms), and 47 (lung neoplasms) were confirmed by two other databases. Conclusions The proposed NEMPD model performs well in predicting potential associations between miRNAs and diseases, and has great potential in the field of miRNA-disease association prediction.
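
Five-fold cross validation, as used for the evaluation above, partitions the labeled pairs into five disjoint folds and tests on each fold in turn. The following is a minimal sketch of that split. The function name, the seed, and the sample count are hypothetical, and the authors' exact sampling procedure is not specified in the abstract.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold, so folds are disjoint.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(10, k=5))
```

Metrics such as accuracy, sensitivity, and AUC are then computed per fold and averaged.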


Genes ◽  
2019 ◽  
Vol 10 (8) ◽  
pp. 608 ◽  
Author(s):  
Yan Li ◽  
Junyi Li ◽  
Naizheng Bian

Identifying associations between lncRNAs and diseases can help understand disease-related lncRNAs and facilitate disease diagnosis and treatment. The dual-network integrated logistic matrix factorization (DNILMF) model has been used for drug–target interaction prediction, and good results have been achieved. We applied DNILMF to lncRNA–disease association prediction for the first time (DNILMF-LDA). We combined different similarity kernel matrices of lncRNAs and diseases by using nonlinear fusion to extract the most important information in the fused matrices. Then, lncRNA–disease association networks and similarity networks were built simultaneously. Finally, the Gaussian process mutual information (GP-MI) algorithm of Bayesian optimization was adopted to optimize the model parameters. The 10-fold cross-validation result showed that the area under the receiver operating characteristic (ROC) curve (AUC) of DNILMF-LDA was 0.9202, and the area under the precision-recall (PR) curve (AUPR) was 0.5610. Compared with LRLSLDA, SIMCLDA, BiwalkLDA, and TPGLDA, the AUC value of our method increased by 38.81%, 13.07%, 8.35%, and 6.75%, respectively, and the AUPR value increased by 52.66%, 40.05%, 37.01%, and 44.25%, respectively. These results indicate that DNILMF-LDA is an effective method for predicting the associations between lncRNAs and diseases.
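
The AUC reported above can be read as the probability that a randomly chosen known association is scored higher than a randomly chosen non-association. A minimal sketch of that pairwise definition follows. The function name and the toy labels and scores are hypothetical, and this is not the evaluation code used for DNILMF-LDA.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and scores (assumed values, for illustration only).
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations. Rank-based implementations are the usual choice for full association matrices.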


BMC Genomics ◽  
2020 ◽  
Vol 21 (S10) ◽  
Author(s):  
Huiran Li ◽  
Yin Guo ◽  
Menglan Cai ◽  
Limin Li

Abstract Background Biological evidence has shown that microRNAs (miRNAs) are greatly implicated in various biological processes involved in human diseases. The identification of miRNA-disease associations (MDAs) is beneficial to disease diagnosis as well as treatment. Due to the high cost of biological experiments, predicting MDAs by computational approaches has attracted increasing attention. Results In this work, we propose a novel model, MTFMDA, for miRNA-disease association prediction by matrix tri-factorization, based on the known miRNA-disease associations, two types of miRNA similarities, and two types of disease similarities. The main idea of MTFMDA is to factorize the miRNA-disease association matrix into three matrices: a feature matrix for miRNAs, a feature matrix for diseases, and a low-rank relationship matrix. Our model incorporates Laplacian regularizers, which force the feature matrices to preserve the similarities of miRNAs or diseases. A novel algorithm is proposed to solve the optimization problem. Conclusions We evaluate our model by 5-fold cross validation using known MDAs from HMDD V2.0 and show that it obtains the highest AUCs among the compared state-of-the-art methods. We further validate our method by applying it to colon and breast neoplasms in two different types of experimental settings. The newly identified associated miRNAs for the two diseases could be verified by two other databases, dbDEMC and HMDD V3.0, which further demonstrates the power of our proposed method.
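
A generic form of a Laplacian-regularized tri-factorization objective is written out below to make the description concrete. It is a sketch under stated assumptions and not the exact MTFMDA formulation. The symbols are all introduced here for illustration: A is the miRNA-disease association matrix, U and V are the miRNA and disease feature matrices, S is the low-rank relationship matrix, W_m and W_d are single fused miRNA and disease similarity matrices (assumed symmetric), D_m and D_d are their diagonal degree matrices, and λ_m and λ_d are regularization weights.

```latex
\min_{U,\,S,\,V}\;
\lVert A - U S V^{\top} \rVert_F^{2}
+ \lambda_m \operatorname{tr}\!\left(U^{\top} L_m U\right)
+ \lambda_d \operatorname{tr}\!\left(V^{\top} L_d V\right),
\qquad L_m = D_m - W_m,\quad L_d = D_d - W_d

\operatorname{tr}\!\left(U^{\top} L_m U\right)
= \tfrac{1}{2} \sum_{i,j} (W_m)_{ij}\, \lVert u_i - u_j \rVert_2^{2}
```

The second identity shows why the regularizer preserves similarity: miRNAs with a large similarity weight are penalized for having distant feature rows u_i and u_j, and the same holds for diseases with V.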


2020 ◽  
Vol 15 ◽  
Author(s):  
Xinguo Lu ◽  
Yan Gao ◽  
Zhenghao Zhu ◽  
Li Ding ◽  
Xinyu Wang ◽  
...  

MicroRNA is a type of non-coding RNA molecule about 22 nucleotides in length. Growing evidence shows that microRNAs play critical regulatory roles in the development of complex diseases, such as cancers and cardiovascular diseases. Predicting potential microRNA-disease associations can provide a new perspective for improving disease diagnosis and prognosis. However, it is challenging to predict potential associations for microRNAs with only a few known associations. To tackle this, we propose a novel method, a constrained strategy for predicting microRNA-disease associations called CPMDA, in heterogeneous omics data. We first construct a disease similarity network and a microRNA similarity network to preprocess the microRNAs with no available associations. Then, we apply probabilistic factorization to obtain two feature matrices, one for microRNAs and one for diseases. Meanwhile, we formulate a similarity feature matrix as a constraint in the factorization process. Finally, we use the obtained feature matrices to identify potential associations for all diseases. The results indicate that CPMDA is superior to other methods in predicting potential microRNA-disease associations. Moreover, the evaluation shows that CPMDA is particularly effective for microRNAs with few known associations. In case studies, CPMDA also proved effective in inferring unknown microRNA-disease associations for novel diseases and microRNAs.


2018 ◽  
Vol 19 (11) ◽  
pp. 3410 ◽  
Author(s):  
Xiujuan Lei ◽  
Zengqiang Fang ◽  
Luonan Chen ◽  
Fang-Xiang Wu

CircRNAs have a particular biological structure and have been shown to play important roles in diseases. It is time-consuming and costly to identify circRNA-disease associations by biological experiments. Therefore, it is appealing to develop computational methods for predicting circRNA-disease associations. In this study, we propose a new computational path weighted method for predicting circRNA-disease associations (PWCDA). First, we calculate the functional similarity scores of diseases based on disease-related gene annotations and the semantic similarity scores of circRNAs based on circRNA-related gene ontology. To address missing similarity scores of diseases and circRNAs, we calculate the Gaussian Interaction Profile (GIP) kernel similarity scores for diseases and circRNAs based on the circRNA-disease associations downloaded from the circR2Disease database (http://bioinfo.snnu.edu.cn/CircR2Disease/). Then, we integrate the disease functional similarity scores and circRNA semantic similarity scores with their related GIP kernel similarity scores to construct a heterogeneous network made up of three sub-networks: a disease similarity network, a circRNA similarity network and a circRNA-disease association network. Finally, we compute an association score for each circRNA-disease pair based on the paths connecting them in the heterogeneous network to determine whether the pair is associated. We adopt leave-one-out cross validation (LOOCV) and five-fold cross validation to evaluate the performance of our proposed method. In addition, three common diseases, Breast Cancer, Gastric Cancer and Colorectal Cancer, are used for case studies. Experimental results illustrate the reliability and usefulness of our computational method in terms of different validation measures, which indicates that PWCDA can effectively predict potential circRNA-disease associations.
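
The Gaussian Interaction Profile (GIP) kernel mentioned above can be sketched as follows. The sketch uses the commonly cited form, in which the similarity of two entities is exp(-γ‖IP(i) - IP(j)‖²) and the bandwidth γ is normalized by the mean squared norm of the interaction profiles. The function name, the `gamma_prime` parameter, and the toy profiles are assumptions, and the exact settings used in PWCDA are not stated in the abstract.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel over rows of a binary association matrix."""
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm  # bandwidth normalized by mean profile norm
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
            for q in profiles
        ]
        for p in profiles
    ]


# Rows are circRNAs, columns are diseases (assumed toy associations).
profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
kernel = gip_kernel(profiles)
```

Passing the transposed association matrix yields the disease-side kernel in the same way.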


2019 ◽  
Vol 12 (S10) ◽  
Author(s):  
Bo Xu ◽  
Yu Liu ◽  
Shuo Yu ◽  
Lei Wang ◽  
Jie Dong ◽  
...  

Abstract Background Prediction of pathogenic genes is crucial for disease prevention, diagnosis, and treatment. However, traditional genetic localization methods are often technically difficult and time-consuming. With the development of computer science, computational biology has gradually become one of the main approaches for finding candidate pathogenic genes. Methods We propose a pathogenic gene prediction method based on network embedding, called Multipath2vec. First, we construct a heterogeneous network called the GP-network, built from three kinds of relationships between genes and phenotypes: correlations between phenotypes, interactions between genes, and known gene-phenotype pairs. Then, to embed the network better, we design the multi-path to guide random walks in the GP-network. The multi-path includes multiple paths between genes and phenotypes, which can capture the complex structural information of the heterogeneous network. Finally, we use the learned vector representation of each phenotype and protein to calculate similarities, and rank candidate genes according to their similarity to the target phenotype. Results We implemented Multipath2vec and four baseline approaches (i.e., CATAPULT, PRINCE, Deepwalk and Metapath2vec) on many-genes gene-phenotype data, single-gene gene-phenotype data and whole gene-phenotype data. Experimental results show that Multipath2vec outperformed the state-of-the-art baselines in the pathogenic gene prediction task. Conclusions We propose Multipath2vec, which can be used to predict pathogenic genes, and experimental results show that it achieves higher accuracy in pathogenic gene prediction.
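
A type-guided random walk of the kind described above can be sketched as follows. This is a simplified illustration in the spirit of meta-path-guided walks, in which each step may only move to a neighbor whose node type matches a repeating type pattern. It is not the Multipath2vec multi-path design itself, and the graph, node names, and patterns are hypothetical.

```python
import random


def guided_walk(graph, node_types, start, type_pattern, walk_length, rng):
    """Random walk whose i-th node must have type type_pattern[i % len(type_pattern)]."""
    walk = [start]
    current = start
    for step in range(1, walk_length):
        wanted = type_pattern[step % len(type_pattern)]
        candidates = [n for n in graph[current] if node_types[n] == wanted]
        if not candidates:
            break  # no neighbor of the required type, so the walk ends early
        current = rng.choice(candidates)
        walk.append(current)
    return walk


# Toy gene-phenotype network (assumed edges, for illustration only).
graph = {
    "g1": ["p1", "g2"],
    "g2": ["g1", "p2"],
    "p1": ["g1", "p2"],
    "p2": ["g2", "p1"],
}
node_types = {"g1": "gene", "g2": "gene", "p1": "phenotype", "p2": "phenotype"}

rng = random.Random(0)
patterns = [["gene", "phenotype"], ["gene", "phenotype", "phenotype"]]
walks = [guided_walk(graph, node_types, "g1", p, 7, rng) for p in patterns]
```

Walks collected under several patterns would then be fed to a skip-gram style model to learn node vectors, which is the usual second stage of walk-based embedding methods.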


2021 ◽  
Vol 12 ◽  
Author(s):  
Jianlin Wang ◽  
Wenxiu Wang ◽  
Chaokun Yan ◽  
Junwei Luo ◽  
Ge Zhang

Drug repositioning is used to find new uses for existing drugs, effectively shortening the drug research and development cycle and reducing costs and risks. A new model of drug repositioning based on ensemble learning is proposed. This work develops a novel computational drug repositioning approach called CMAF to discover potential drug-disease associations. First, for new drugs and diseases or unknown drug-disease pairs, based on their known neighbor information, an association probability can be obtained by implementing the weighted K nearest known neighbors (WKNKN) method and improving the drug-disease association information. Then, a new drug similarity network and new disease similarity network can be constructed. Three prediction models are applied and ensembled to enable the final association of drug-disease pairs based on improved drug-disease association information and the constructed similarity network. The experimental results demonstrate that the developed approach outperforms recent state-of-the-art prediction models. Case studies further confirm the predictive ability of the proposed method. Our proposed method can effectively improve the prediction results.
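
The weighted K nearest known neighbors (WKNKN) step can be illustrated with a simplified, row-side-only sketch. It assumes the commonly described scheme in which a drug's association profile is estimated from its K most similar drugs that have known associations, with weights that decay by rank. The published method also processes the disease side and averages the two estimates, which is omitted here. The function name, parameters, and toy matrices are hypothetical.

```python
def wknkn_rows(Y, S, k=2, decay=0.5):
    """Fill in row profiles of Y from the k most similar rows with known associations."""
    n = len(Y)
    known = [i for i in range(n) if any(Y[i])]
    Y_new = []
    for d in range(n):
        neighbors = sorted(
            (i for i in known if i != d), key=lambda i: S[d][i], reverse=True
        )[:k]
        norm = sum(S[d][i] for i in neighbors)
        if norm == 0:
            Y_new.append(list(Y[d]))  # no usable neighbor, keep the row unchanged
            continue
        estimate = [0.0] * len(Y[d])
        for rank, i in enumerate(neighbors):
            weight = (decay ** rank) * S[d][i]  # closer neighbors count more
            for j, y in enumerate(Y[i]):
                estimate[j] += weight * y
        # Known associations are never lowered, only missing ones are raised.
        Y_new.append([max(y, e / norm) for y, e in zip(Y[d], estimate)])
    return Y_new


# Rows are drugs, columns are diseases. The third drug has no known association.
Y = [[1, 0], [0, 1], [0, 0]]
S = [[1.0, 0.2, 0.8], [0.2, 1.0, 0.4], [0.8, 0.4, 1.0]]
Y_filled = wknkn_rows(Y, S, k=2, decay=0.5)
```

In this toy case the third drug receives soft scores that reflect its higher similarity to the first drug than to the second.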

