Predicting MiRNA-disease associations by multiple meta-paths fusion graph embedding model

Abstract Background: Many studies have shown that miRNAs play significant roles in the diagnosis and treatment of complex human diseases. However, conventional biological experiments are too costly and time-consuming to identify unconfirmed miRNA-disease associations. Thus, computational models that efficiently predict unidentified miRNA-disease pairs are becoming a promising research topic. Although existing methods perform well at revealing unidentified miRNA-disease associations, more work is still needed to improve prediction performance. Results: In this work, we present a novel multiple meta-paths fusion graph embedding model to predict unidentified miRNA-disease associations (M2GMDA). Our method takes full advantage of the complex structure and rich semantic information of miRNA-disease interactions in a self-learning way. First, a miRNA-disease heterogeneous network was derived from verified miRNA-disease pairs, miRNA similarity and disease similarity. All meta-path instances connecting miRNAs with diseases were extracted to describe intrinsic information about miRNA-disease interactions. Then, we developed a graph embedding model to predict miRNA-disease associations. The model is composed of linear transformations of miRNAs and diseases, the mean encoder of a single meta-path instance, the attention-aware encoder of a meta-path type and attention-aware fusion of multiple meta-paths. We integrated meta-path instances, meta-path based neighbours, intermediate nodes in meta-paths and further information to strengthen the prediction in our model. In particular, the distinct contributions of different meta-path instances and meta-path types were combined with attention mechanisms. The data sets and source code that support the findings of this study are available at https://github.com/dangdangzhang/M2GMDA. Conclusions: M2GMDA achieved AUCs of 0.9323 and 0.9182 in global leave-one-out cross validation and fivefold cross validation with HMDD V2.0. The results showed that our method outperforms other prediction methods. Three kinds of case studies on lung neoplasms, breast neoplasms, prostate neoplasms, pancreatic neoplasms, lymphoma and colorectal neoplasms demonstrated that 47, 50, 49, 48, 50 and 50 of the top 50 candidate miRNAs predicted by M2GMDA, respectively, were validated by biological experiments. This further confirms the prediction performance of our method.
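The mean encoder and attention-aware fusion named in the abstract can be illustrated roughly as follows. This is not the authors' implementation (the linked repository holds that); it is a minimal NumPy sketch that assumes random toy embeddings, a single attention vector `q` that would normally be learned, and hypothetical helper names `mean_encode` and `attention_fuse`.

```python
import numpy as np

def mean_encode(instance_nodes, emb):
    """Mean encoder: average the embeddings of all nodes on one meta-path instance."""
    return emb[instance_nodes].mean(axis=0)

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def attention_fuse(vectors, q):
    """Weight each vector by softmax(q . tanh(v)) and return the weighted sum."""
    vectors = np.asarray(vectors)
    weights = softmax(np.tanh(vectors) @ q)
    return weights @ vectors, weights

rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 4))  # 6 toy nodes (miRNAs and diseases), 4-dim embeddings
q = rng.normal(size=4)         # hypothetical attention parameter (assumed, not from the paper)

# Two toy instances of one meta-path type, each listed as node indices.
instances = [np.array([0, 3, 1, 4]), np.array([0, 5, 2, 4])]
encoded = [mean_encode(p, emb) for p in instances]
type_vec, weights = attention_fuse(encoded, q)
```

The same `attention_fuse` call could be applied a second time across the vectors of several meta-path types, which is one plausible reading of the two attention levels the abstract describes.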


Deep-belief network for predicting potential miRNA-disease associations

Briefings in Bioinformatics ◽

10.1093/bib/bbaa186 ◽

2020 ◽

Author(s):

Xing Chen ◽

Tian-Hao Li ◽

Yan Zhao ◽

Chun-Chun Wang ◽

Chi-Chi Zhu

Keyword(s):

Computational Models ◽

Cross Validation ◽

Biological Data ◽

Deep Belief Network ◽

Computational Method ◽

Restricted Boltzmann Machines ◽

Belief Network ◽

Disease Associations ◽

Leave One Out ◽

The Impact

Abstract MicroRNA (miRNA) plays an important role in the occurrence, development, diagnosis and treatment of diseases, and a growing number of researchers are paying attention to the relationship between miRNAs and disease. Compared with traditional biological experiments, computational methods that integrate heterogeneous biological data to predict potential associations can save time and cost. Considering the limitations of previous computational models, we developed a deep-belief network model for miRNA-disease association prediction (DBNMDA). We constructed feature vectors for all miRNA-disease pairs to pre-train restricted Boltzmann machines, and then used positive samples and the same number of selected negative samples to fine-tune the DBN and obtain the final predicted scores. Compared with previous supervised models that use only pairs with known labels for training, DBNMDA innovatively utilizes the information of all miRNA-disease pairs during pre-training. This step could reduce, to some extent, the impact of too few known associations on prediction accuracy. DBNMDA achieves an AUC of 0.9104 in global leave-one-out cross validation (LOOCV), an AUC of 0.8232 in local LOOCV and an average AUC of 0.9048 ± 0.0026 in 5-fold cross validation. These AUCs are higher than those of previous models. In addition, three different types of case studies on three diseases were carried out to demonstrate the accuracy of DBNMDA. As a result, 84% (breast neoplasms), 100% (lung neoplasms) and 88% (esophageal neoplasms) of the top 50 predicted miRNAs were verified by recent literature. We therefore conclude that DBNMDA is an effective method for predicting potential miRNA-disease associations.


Adaptive boosting-based computational model for predicting potential miRNA-disease associations

Bioinformatics ◽

10.1093/bioinformatics/btz297 ◽

2019 ◽

Vol 35 (22) ◽

pp. 4730-4738 ◽

Cited By ~ 23

Author(s):

Yan Zhao ◽

Xing Chen ◽

Jun Yin

Keyword(s):

Computational Models ◽

Cross Validation ◽

Learning Algorithm ◽

Area Under The Curve ◽

Human Diseases ◽

Supplementary Information ◽

Weak Classifier ◽

Adaptive Boosting ◽

Disease Associations ◽

Leave One Out

Abstract Motivation: Recent studies have shown that microRNAs (miRNAs) play a critical part in several biological processes and that dysregulation of miRNAs is related to numerous complex human diseases. Thus, in-depth research on miRNAs and their association with human diseases can help us to solve many problems. Results: Given the high cost of traditional experimental methods, revealing disease-related miRNAs through computational models is more economical and efficient. Considering the disadvantages of previous models, in this paper we developed adaptive boosting for miRNA-disease association prediction (ABMDA) to predict potential associations between diseases and miRNAs. We balanced the positive and negative samples by performing random sampling based on k-means clustering on the negative samples. This process was quick and easy, and our model had higher efficiency and scalability for large datasets than previous methods. As a boosting technique, ABMDA was able to improve the accuracy of a given learning algorithm by integrating weak classifiers that score samples into a strong classifier based on their corresponding weights. Here, we used a decision tree as our weak classifier. As a result, the area under the curve (AUC) of global and local leave-one-out cross validation reached 0.9170 and 0.8220, respectively. Moreover, the mean and the standard deviation of the AUCs were 0.9023 and 0.0016, respectively, in 5-fold cross validation. In addition, in the case studies of three important human cancers, 49, 50 and 50 of the top 50 predicted miRNAs for colon neoplasms, hepatocellular carcinoma and breast neoplasms were confirmed by databases and the experimental literature. Availability and implementation: The code and dataset of ABMDA are freely available at https://github.com/githubcode007/ABMDA. Supplementary information: Supplementary data are available at Bioinformatics online.
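The boosting step, in which weighted weak classifiers are combined into a strong scorer, can be sketched as below. This is a generic discrete AdaBoost with depth-1 decision trees (stumps) on toy data, not the ABMDA code itself. The feature values, the function names and the number of rounds are all assumptions made for illustration, and the k-means based negative sampling is not shown.

```python
import numpy as np

def stump_predict(X, j, t, sign):
    """Depth-1 tree: predict -sign when feature j <= t, otherwise +sign."""
    return np.where(X[:, j] <= t, -sign, sign)

def fit_stump(X, y, w):
    """Search every feature/threshold/sign for the lowest weighted error."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                err = w[stump_predict(X, j, t, sign) != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def adaboost_fit(X, y, n_rounds=5):
    """Discrete AdaBoost; labels y must be in {-1, +1}."""
    w = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(n_rounds):
        err, j, t, sign = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # keep the log finite
        alpha = 0.5 * np.log((1 - err) / err)    # weight of this weak classifier
        w = w * np.exp(-alpha * y * stump_predict(X, j, t, sign))
        w = w / w.sum()                          # misclassified samples gain weight
        model.append((alpha, j, t, sign))
    return model

def adaboost_score(model, X):
    """Strong classifier: weighted sum of the weak classifiers' votes."""
    return sum(a * stump_predict(X, j, t, s) for a, j, t, s in model)

# Toy feature vectors for six miRNA-disease pairs (+1 = known association).
X = np.array([[0.1, 0.9], [0.2, 0.8], [0.8, 0.3],
              [0.9, 0.1], [0.4, 0.7], [0.7, 0.2]])
y = np.array([-1, -1, 1, 1, -1, 1])
model = adaboost_fit(X, y)
scores = adaboost_score(model, X)
```

Ranking candidate pairs by `scores` rather than thresholding them is what makes an AUC-style evaluation possible.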


Prediction of Microbe-drug Associations based on Chemical Structures and the KATZ Measure

Current Bioinformatics ◽

10.2174/1574893616666210204144721 ◽

2021 ◽

Vol 16 ◽

Author(s):

Lingzhi Zhu ◽

Guihua Duan ◽

Cheng Yan ◽

Jianxin Wang

Keyword(s):

Computational Models ◽

Cross Validation ◽

Area Under The Curve ◽

Prediction Performance ◽

Complex Mechanism ◽

Chemical Structures ◽

Potential Association ◽

Health And Disease ◽

Leave One Out ◽

Fold Cross Validation

Background: Microbial communities have important influences on our health and disease. Identifying potential human microbe-drug associations will be greatly advantageous for exploring the complex mechanisms of microbes in drug discovery, combinations and repositioning. Until now, the complex mechanism of microbe-drug associations has remained unknown. Objective: Computational models play an important role in discovering hidden microbe-drug associations, because biological experiments are time-consuming and expensive. Based on chemical structures of drugs and the KATZ measure, a new computational model (HMDAKATZ) is proposed for identifying potential Human Microbe-Drug Associations. Methods: In HMDAKATZ, the similarity between microbes is computed using the Gaussian Interaction Profile (GIP) kernel based on known human microbe-drug associations. The similarity between drugs is computed based on known human microbe-drug associations and chemical structures. Then, a microbe-drug heterogeneous network is constructed by integrating the microbe-microbe network, the drug-drug network, and a known microbe-drug association network. Finally, we apply KATZ to identify potential associations between microbes and drugs. Results: The experimental results showed that HMDAKATZ achieved area under the curve (AUC) values of 0.9010±0.0020, 0.9066±0.0015, and 0.9116 in 5-fold cross validation (5-fold CV), 10-fold cross validation (10-fold CV), and leave-one-out cross validation (LOOCV), respectively, which outperformed four other computational models (SNMF, RLS, HGBI, and NBI). Conclusion: HMDAKATZ obtained better prediction performance than four other methods in 5-fold CV, 10-fold CV, and LOOCV. Furthermore, three case studies also illustrated that HMDAKATZ is an effective way to discover hidden microbe-drug associations.
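The Methods pipeline (GIP similarity, heterogeneous network, KATZ scoring) can be sketched as follows. This is a simplified illustration, not the HMDAKATZ implementation. It assumes a toy association matrix, uses GIP similarity for drugs as well (the chemical-structure similarity is omitted), and picks the damping factor `beta` and the maximum walk length `max_len` arbitrarily.

```python
import numpy as np

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of an association matrix."""
    sq_norms = (profiles ** 2).sum(axis=1)
    gamma = gamma_prime / sq_norms.mean()          # bandwidth normalised by mean profile norm
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2 * profiles @ profiles.T
    return np.exp(-gamma * sq_dist)

def katz_scores(A, sim_rows, sim_cols, beta=0.01, max_len=3):
    """Truncated KATZ: sum over l of beta**l times the weighted walks of length l."""
    n_rows = A.shape[0]
    H = np.block([[sim_rows, A], [A.T, sim_cols]])  # heterogeneous network
    S = np.zeros_like(H, dtype=float)
    power = np.eye(H.shape[0])
    for l in range(1, max_len + 1):
        power = power @ H
        S += beta ** l * power
    return S[:n_rows, n_rows:]                      # microbe-by-drug block

# Toy data: 3 microbes x 4 drugs, 1 = known association.
A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 0, 1]], dtype=float)
sim_microbe = gip_kernel(A)
sim_drug = gip_kernel(A.T)
scores = katz_scores(A, sim_microbe, sim_drug)
```

Unknown pairs such as `scores[0, 2]` receive a small positive score from longer walks through similar microbes and drugs, while known pairs are dominated by the length-1 term.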


WBNPMD: weighted bipartite network projection for microRNA-disease association prediction

Journal of Translational Medicine ◽

10.1186/s12967-019-2063-4 ◽

2019 ◽

Vol 17 (1) ◽

Cited By ~ 2

Author(s):

Guobo Xie ◽

Zhiliang Fan ◽

Yuping Sun ◽

Cuiming Wu ◽

Lei Ma

Keyword(s):

Cross Validation ◽

Lung Neoplasm ◽

Disease Association ◽

Computational Techniques ◽

Bipartite Network ◽

Initial Information ◽

Method Transfer ◽

Disease Associations ◽

Association Data ◽

Leave One Out

Abstract Background: Recently, numerous biological experiments have indicated that microRNAs (miRNAs) play critical roles in the pathogenesis of various human diseases. Since traditional experimental methods for detecting miRNA-disease associations are costly and time-consuming, there is an urgent need for efficient and robust computational techniques that identify undiscovered interactions. Methods: In this paper, we propose a computational framework named weighted bipartite network projection for miRNA-disease association prediction (WBNPMD). In this method, transfer weights were constructed by combining the known miRNA and disease similarities, and the initial information was properly configured. Then the two-step bipartite network algorithm was implemented to infer potential miRNA-disease associations. Results: The proposed WBNPMD was applied to the known miRNA-disease association data, and leave-one-out cross-validation (LOOCV) and fivefold cross-validation were implemented to evaluate its performance. Our method achieved AUCs of 0.9321 and 0.9173 ± 0.0005 in LOOCV and fivefold cross-validation, respectively, and outperformed four other state-of-the-art methods. We also carried out two kinds of case studies on prostate neoplasm, colorectal neoplasm, and lung neoplasm, and most of the top 50 predicted miRNAs were confirmed to be associated with the corresponding diseases based on the dbDEMC, miR2Disease, and HMDD V3.0 databases. Conclusions: The experimental results demonstrate that WBNPMD can accurately infer potential miRNA-disease associations. We anticipate that the proposed WBNPMD could serve as a powerful tool for discovering potential miRNA-disease associations.
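For orientation, the unweighted baseline that two-step bipartite projection builds on (often called network-based inference) can be sketched as below. The similarity-based transfer weights and the configured initial information that distinguish WBNPMD are not reproduced here. The toy matrix and the function name `projection_scores` are assumptions.

```python
import numpy as np

def projection_scores(A):
    """Unweighted two-step resource allocation on a miRNA-by-disease bipartite network."""
    A = np.asarray(A, dtype=float)
    k_mirna = A.sum(axis=1, keepdims=True)     # degree of each miRNA
    k_disease = A.sum(axis=0, keepdims=True)   # degree of each disease
    # Step 1: every miRNA splits its resource equally among its diseases.
    to_disease = np.divide(A, k_mirna, out=np.zeros_like(A), where=k_mirna > 0)
    # Step 2: every disease splits what it received equally among its miRNAs.
    to_mirna = np.divide(A, k_disease, out=np.zeros_like(A), where=k_disease > 0)
    W = to_mirna @ to_disease.T                # miRNA-to-miRNA transfer matrix
    return W @ A                               # column d = final resource for disease d

# Toy data: 4 miRNAs x 3 diseases, 1 = known association.
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [0, 0, 1]])
scores = projection_scores(A)
```

Each column starts with one unit of resource on every miRNA known to be linked to that disease; after the two steps, unlinked miRNAs that share diseases with linked ones hold part of that resource and can be ranked by it.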


Improved Prediction of miRNA-Disease Associations Based on Matrix Completion with Network Regularization

Cells ◽

10.3390/cells9040881 ◽

2020 ◽

Vol 9 (4) ◽

pp. 881 ◽

Cited By ~ 1

Author(s):

Jihwan Ha ◽

Chihyun Park ◽

Chanyoung Park ◽

Sanghyun Park

Keyword(s):

Computational Models ◽

Matrix Completion ◽

Area Under The Curve ◽

Excellent Performance ◽

Novel Mirna ◽

Lack Of Information ◽

Disease Associations ◽

Roc Area ◽

Leave One Out ◽

Global And Local

The identification of potential microRNA (miRNA)-disease associations helps to elucidate the pathogenesis of complex human diseases, owing to the crucial role of miRNAs in various biological processes, and it yields insights into novel prognostic markers. Considering the time and costs involved in wet experiments, computational models for finding novel miRNA-disease associations would be a great alternative. However, computational models to date are biased towards known miRNA-disease associations; this is not suitable for rare miRNAs (i.e., miRNAs with few known disease associations) and uncommon diseases (i.e., diseases with few known miRNA associations), and it leads to poor prediction accuracy. The most straightforward way of improving performance is to increase the number of known miRNA-disease associations. However, because such information is lacking, increasing attention has been paid to developing computational models that can handle insufficient data via a technical approach. In this paper, we present a general framework, improved prediction of miRNA-disease associations (IMDN), based on matrix completion with network regularization to discover potential disease-related miRNAs. The success of adopting matrix factorization is demonstrated by its excellent performance in recommender systems. This approach considers a miRNA network as additional implicit feedback and makes predictions for disease associations relevant to a given miRNA based on its direct neighbors. Our experimental results demonstrate that IMDN achieved excellent performance, with reliable area under the receiver operating characteristic (ROC) curve (AUC) values of 0.9162 and 0.8965 in the frameworks of global and local leave-one-out cross-validation (LOOCV), respectively. Further, case studies demonstrated that our method can not only validate true miRNA-disease associations but also suggest novel disease-related miRNA candidates.
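One common way to write matrix factorization with network regularization is sketched below. It pulls the latent vectors of neighbouring miRNAs together through a graph-Laplacian penalty and fits the model by plain gradient descent. The exact objective used by IMDN is not given in the abstract, so the loss, the hyperparameters and the toy data are all assumptions.

```python
import numpy as np

def factorize(R, S, rank=2, lam=0.1, alpha=0.1, lr=0.01, n_iter=500, seed=0):
    """Minimise ||R - U V^T||^2 + lam (||U||^2 + ||V||^2) + alpha tr(U^T L U)."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = 0.1 * rng.normal(size=(m, rank))   # latent miRNA factors
    V = 0.1 * rng.normal(size=(n, rank))   # latent disease factors
    L = np.diag(S.sum(axis=1)) - S         # Laplacian of the (symmetric) miRNA network
    losses = []
    for _ in range(n_iter):
        E = R - U @ V.T
        losses.append((E ** 2).sum()
                      + lam * ((U ** 2).sum() + (V ** 2).sum())
                      + alpha * np.trace(U.T @ L @ U))
        grad_U = -2 * E @ V + 2 * lam * U + 2 * alpha * L @ U
        grad_V = -2 * E.T @ U + 2 * lam * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V, losses

# Toy data: 4 miRNAs x 3 diseases, plus a symmetric miRNA similarity network.
R = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [0, 1, 1]], dtype=float)
S = np.array([[0.0, 0.9, 0.1, 0.2],
              [0.9, 0.0, 0.1, 0.2],
              [0.1, 0.1, 0.0, 0.7],
              [0.2, 0.2, 0.7, 0.0]])
U, V, losses = factorize(R, S)
predicted = U @ V.T                        # scores for all miRNA-disease pairs
```

The Laplacian term is what lets a miRNA with few known associations borrow signal from its direct neighbours in the network.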


A Probabilistic Matrix Factorization Method for Identifying lncRNA-disease Associations

Genes ◽

10.3390/genes10020126 ◽

2019 ◽

Vol 10 (2) ◽

pp. 126 ◽

Cited By ~ 11

Author(s):

Zhanwei Xuan ◽

Jiechen Li ◽

Jingwen Yu ◽

Xiang Feng ◽

Bihai Zhao ◽

...

Keyword(s):

Computational Models ◽

Matrix Decomposition ◽

Factorization Method ◽

Association Network ◽

Disease Associations ◽

Probabilistic Matrix Factorization ◽

Simulation Results ◽

Artery Disease ◽

Leave One Out ◽

Better Than

Recently, an increasing number of studies have indicated that long non-coding RNAs (lncRNAs) participate in various crucial biological processes and can also serve as promising biomarkers for the treatment of certain diseases such as coronary artery disease and various cancers. Because of the cost and time involved, the number of possible disease-related lncRNAs that can be verified by traditional biological experiments is very limited. Therefore, in recent years, it has become very popular to use computational models to predict potential disease-lncRNA associations. In this study, we first constructed three kinds of association networks, namely the lncRNA-miRNA association network, the miRNA-disease association network, and the lncRNA-disease correlation network. Then, by integrating these three newly constructed networks, we built an lncRNA-disease weighted association network, which was further updated by adopting the KNN algorithm based on the semantic similarity of diseases and the functional similarity of lncRNAs. Thereafter, based on the updated lncRNA-disease weighted association network, a novel computational model called PMFILDA was proposed to infer potential lncRNA-disease associations using probabilistic matrix factorization. Finally, to evaluate the new prediction model PMFILDA, we performed Leave One Out Cross-Validation (LOOCV) based on strongly validated data filtered from MNDR, and the simulation results indicated that PMFILDA performed better than some state-of-the-art methods. Moreover, case studies of breast cancer, lung cancer, and colorectal cancer were carried out to further assess the performance of PMFILDA, and the simulation results showed that PMFILDA achieved satisfactory prediction performance there as well.


Prediction of Potential miRNA–Disease Associations Through a Novel Unsupervised Deep Learning Framework with Variational Autoencoder

Cells ◽

10.3390/cells8091040 ◽

2019 ◽

Vol 8 (9) ◽

pp. 1040 ◽

Cited By ~ 3

Author(s):

Li Zhang ◽

Xing Chen ◽

Jun Yin

Keyword(s):

Deep Learning ◽

Cross Validation ◽

Operating Characteristics ◽

Learning Framework ◽

Disease Similarity ◽

Variational Autoencoder ◽

Disease Associations ◽

Unsupervised Deep Learning ◽

Leave One Out

The important role of microRNAs (miRNAs) in the formation, development, diagnosis, and treatment of diseases has attracted much attention among researchers recently. In this study, we present an unsupervised deep learning model of the variational autoencoder for miRNA–disease association prediction (VAEMDA). By combining the integrated miRNA similarity and the integrated disease similarity, respectively, with known miRNA–disease associations, we constructed two spliced matrices. Each matrix was used to train a separate variational autoencoder (VAE). The final predicted association scores between miRNAs and diseases were obtained by integrating the scores from the two trained VAE models. Unlike previous models, VAEMDA can avoid noise introduced by the random selection of negative samples and reveal associations between miRNAs and diseases from the perspective of data distribution. Compared with previous methods, VAEMDA obtained higher areas under the receiver operating characteristics curve (AUCs) of 0.9118, 0.8652, and 0.9091 ± 0.0065 in global leave-one-out cross validation (LOOCV), local LOOCV, and five-fold cross validation, respectively. Further, the AUCs of VAEMDA were 0.8250 and 0.8237 in global leave-one-disease-out cross validation (LODOCV) and local LODOCV, respectively. In three different types of case studies on three important diseases, the results showed that most of the top 50 potentially associated miRNAs were verified by databases and the literature.


A Novel Network-Based Computational Model for Prediction of Potential LncRNA–Disease Association

International Journal of Molecular Sciences ◽

10.3390/ijms20071549 ◽

2019 ◽

Vol 20 (7) ◽

pp. 1549 ◽

Cited By ~ 6

Author(s):

Yang Liu ◽

Xiang Feng ◽

Haochen Zhao ◽

Zhanwei Xuan ◽

Lei Wang

Keyword(s):

Computational Models ◽

Disease Association ◽

Label Propagation ◽

Human Diseases ◽

Weighted Matrix ◽

Disease Associations ◽

Resource Allocation Strategy ◽

Simulation Results ◽

Leave One Out ◽

Treatment Prognosis

Accumulating studies have shown that long non-coding RNAs (lncRNAs) are involved in many biological processes and play important roles in a variety of complex human diseases. Developing effective computational models to identify potential relationships between lncRNAs and diseases can not only help us understand disease mechanisms at the lncRNA molecular level, but also promote the diagnosis, treatment, prognosis, and prevention of human diseases. In this paper, a network-based model called NBLDA was proposed to discover potential lncRNA–disease associations, in which two novel lncRNA–disease weighted networks were constructed. These networks were first built from known lncRNA–disease associations and the topological similarity of the lncRNA–disease association network; an lncRNA–lncRNA weighted matrix and a disease–disease weighted matrix were then obtained using a resource allocation strategy of unequal allocation and unbiased consistence. Finally, a label propagation algorithm was applied to predict associated lncRNAs for the investigated diseases. Moreover, to estimate the prediction performance of NBLDA, the framework of leave-one-out cross validation (LOOCV) was implemented, and simulation results showed that NBLDA can achieve reliable areas under the ROC curve (AUCs) of 0.8846, 0.8273, and 0.8075 on three known lncRNA–disease association datasets downloaded from the lncRNADisease database, respectively. Furthermore, in case studies of lung cancer, leukemia, and colorectal cancer, simulation results demonstrated that NBLDA can be a powerful tool for identifying potential lncRNA–disease associations as well.
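The final label propagation step can be illustrated with the generic iterative scheme below, in which known labels are repeatedly spread over a normalised similarity network and mixed back with the initial labels. This is a textbook formulation rather than the NBLDA code. The weighted matrix `W`, the mixing parameter `alpha` and the helper names are assumptions.

```python
import numpy as np

def sym_normalize(W):
    """Symmetric normalisation D^(-1/2) W D^(-1/2) of a non-negative weight matrix."""
    d = W.sum(axis=1).astype(float)
    d[d == 0] = 1.0                        # isolated nodes: avoid division by zero
    return W / np.sqrt(np.outer(d, d))

def label_propagation(W, Y, alpha=0.8, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * W_norm @ F + (1 - alpha) * Y until F stops changing."""
    W_norm = sym_normalize(W)
    F = Y.astype(float).copy()
    for _ in range(max_iter):
        F_next = alpha * W_norm @ F + (1 - alpha) * Y
        if np.abs(F_next - F).max() < tol:
            return F_next
        F = F_next
    return F

# Toy data: weighted lncRNA-lncRNA network and known lncRNA-disease labels.
W = np.array([[0.0, 0.8, 0.1, 0.0],
              [0.8, 0.0, 0.3, 0.1],
              [0.1, 0.3, 0.0, 0.9],
              [0.0, 0.1, 0.9, 0.0]])
Y = np.array([[1, 0],
              [0, 0],
              [0, 1],
              [0, 0]], dtype=float)
F = label_propagation(W, Y)
```

For `alpha` below 1 the iteration converges to the closed form `(1 - alpha) * inv(I - alpha * W_norm) @ Y`, so unlabeled lncRNAs end up with scores that reflect how strongly they are connected to labeled ones.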


A Novel Approach for Predicting Disease-lncRNA Associations Based on the Distance Correlation Set and Information of the miRNAs

Computational and Mathematical Methods in Medicine ◽

10.1155/2018/6747453 ◽

2018 ◽

Vol 2018 ◽

pp. 1-12 ◽

Cited By ~ 1

Author(s):

Haochen Zhao ◽

Linai Kuang ◽

Lei Wang ◽

Zhanwei Xuan

Keyword(s):

Computational Models ◽

State Of The Art ◽

Experimental Studies ◽

Research Field ◽

Computational Method ◽

Biological Processes ◽

Distance Correlation ◽

Novel Approach ◽

Disease Associations ◽

Leave One Out

Recently, accumulating laboratory studies have indicated that many long noncoding RNAs (lncRNAs) play important roles in various biological processes and are associated with many complex human diseases. Therefore, developing powerful computational models to predict correlations between lncRNAs and diseases based on heterogeneous biological datasets will be important. However, there are few approaches to calculating and analyzing lncRNA-disease associations on the basis of information about miRNAs. In this article, a new computational method based on the distance correlation set is developed to predict lncRNA-disease associations (DCSLDA). Compared with existing state-of-the-art methods, the major novelty of DCSLDA lies in the introduction of the lncRNA-miRNA-disease network and the distance correlation set; thus DCSLDA can be applied to predict potential lncRNA-disease associations without requiring any known disease-lncRNA associations. Simulation results show that DCSLDA can significantly improve on existing models, with a reliable AUC of 0.8517 in leave-one-out cross-validation. Furthermore, when DCSLDA was used to prioritize candidate lncRNAs for three important cancers, 17 predicted associations in the first 0.5% of forecast results were verified by other independent studies and biological experimental studies. Hence, it is anticipated that DCSLDA could be a great addition to the biomedical research field.


Interdatabase Variability in Cortical Thickness Measurements

Cerebral Cortex ◽

10.1093/cercor/bhy197 ◽

2018 ◽

Vol 29 (8) ◽

pp. 3282-3293 ◽

Cited By ~ 2

Author(s):

M Ethan MacDonald ◽

Rebecca J Williams ◽

Nils D Forkert ◽

Avery J L Berman ◽

Cheryl R McCreary ◽

...

Keyword(s):

Large Scale ◽

Cross Validation ◽

Prediction Performance ◽

Rate Of Change ◽

Prediction Modeling ◽

Correlation Matrices ◽

Cortical Thinning ◽

Leave One Out ◽

Thickness Measurements ◽

Age Prediction

Abstract The phenomenon of cortical thinning with age has been well established; however, the measured rate of change varies between studies. The source of this variation could be image acquisition techniques including hardware and vendor specific differences. Databases are often consolidated to increase the number of subjects but underlying differences between these datasets could have undesired effects. We explore differences in cerebral cortex thinning between 4 databases, totaling 1382 subjects. We investigate several aspects of these databases, including: 1) differences between databases of cortical thinning rates versus age, 2) correlation of cortical thinning rates between regions for each database, and 3) regression bootstrapping to determine the effect of the number of subjects included. We also examined the effect of different databases on age prediction modeling. Cortical thinning rates were significantly different between databases in all 68 parcellated regions (ANCOVA, P < 0.001). Subtle differences were observed in correlation matrices and bootstrapping convergence. Age prediction modeling using a leave-one-out cross-validation approach showed varying prediction performance (0.64 < R² < 0.82) between databases. When a database was used to calibrate the model and then applied to another database, prediction performance consistently decreased. We conclude that there are indeed differences in the measured cortical thinning rates between these large-scale databases.
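The leave-one-out evaluation of age prediction can be sketched as follows. The abstract does not state which regression model was used, so this example assumes ordinary least squares on synthetic regional thickness values, and it reports R² as the coefficient of determination, which may differ from the definition used in the study.

```python
import numpy as np

def loocv_r2(X, y):
    """Leave-one-out cross-validation: fit on n-1 subjects, predict the held-out one."""
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])            # add an intercept column
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(X1[train], y[train], rcond=None)
        preds[i] = X1[i] @ coef
    ss_res = ((y - preds) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1 - ss_res / ss_tot

# Synthetic stand-in for one database: 60 subjects, 5 cortical regions.
rng = np.random.default_rng(0)
age = rng.uniform(20, 80, size=60)
thinning_rate = 0.01                                  # assumed mm per year, illustration only
thickness = 3.0 - thinning_rate * age[:, None] + rng.normal(0, 0.05, size=(60, 5))
r2 = loocv_r2(thickness, age)
```

Calibrating on one database and applying the fitted coefficients to another would correspond to a single `lstsq` fit on the first dataset followed by prediction on the second, which is the setting in which the abstract reports reduced performance.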
