Adaptive boosting-based computational model for predicting potential miRNA-disease associations

Yan Zhao; Xing Chen; Jun Yin

doi:10.1093/bioinformatics/btz297

Bioinformatics ◽

10.1093/bioinformatics/btz297 ◽

2019 ◽

Vol 35 (22) ◽

pp. 4730-4738 ◽

Cited By ~ 23

Author(s):

Yan Zhao ◽

Xing Chen ◽

Jun Yin

Keyword(s):

Computational Models ◽

Cross Validation ◽

Learning Algorithm ◽

Area Under The Curve ◽

Human Diseases ◽

Supplementary Information ◽

Weak Classifier ◽

Adaptive Boosting ◽

Disease Associations ◽

Leave One Out

Abstract

Motivation: Recent studies have shown that microRNAs (miRNAs) play a critical part in several biological processes and that dysregulation of miRNAs is related to numerous complex human diseases. Thus, in-depth research on miRNAs and their associations with human diseases can help us to solve many problems.

Results: Because traditional experimental methods are costly, revealing disease-related miRNAs through computational models is a more economical and efficient approach. Considering the disadvantages of previous models, we developed adaptive boosting for miRNA-disease association prediction (ABMDA) to predict potential associations between diseases and miRNAs. We balanced the positive and negative samples by performing random sampling based on k-means clustering on the negative samples, a quick and easy process, and our model had higher efficiency and scalability for large datasets than previous methods. As a boosting technique, ABMDA was able to improve the accuracy of a given learning algorithm by integrating weak classifiers that could score samples into a strong classifier based on their corresponding weights. Here, we used a decision tree as our weak classifier. As a result, the areas under the curve (AUCs) of global and local leave-one-out cross validation reached 0.9170 and 0.8220, respectively. Moreover, the mean and the standard deviation of the AUCs were 0.9023 and 0.0016, respectively, in 5-fold cross validation. In addition, in case studies of three important human cancers, 49, 50 and 50 of the top 50 predicted miRNAs for colon neoplasms, hepatocellular carcinoma and breast neoplasms, respectively, were confirmed by databases and experimental literature.

Availability and implementation: The code and dataset of ABMDA are freely available at https://github.com/githubcode007/ABMDA.

Supplementary information: Supplementary data are available at Bioinformatics online.
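To make the boosting step described in this abstract more concrete, the following is a minimal sketch of adaptive boosting with depth-1 decision trees (stumps) as weak classifiers. It is not the ABMDA code: the function names, the toy feature vectors, and the use of stumps rather than full decision trees are assumptions made for illustration, and the k-means-based negative sampling is not shown.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] >= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] >= thr else -polarity

def adaboost_fit(X, y, n_rounds=10):
    """Fit an ensemble of weighted stumps; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # weight of this weak classifier
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_score(ensemble, row):
    """Weighted vote of all weak classifiers; larger means more likely positive."""
    return sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)

# Hypothetical toy data: each row stands in for a miRNA-disease pair feature vector.
X = [[0.1, 1.0], [0.2, 0.0], [0.3, 1.0], [0.7, 0.0], [0.8, 1.0], [0.9, 0.0]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost_fit(X, y, n_rounds=5)
```

The real-valued output of `adaboost_score` can be used to rank candidate pairs, which is the form of output an AUC evaluation needs.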

Improved Prediction of miRNA-Disease Associations Based on Matrix Completion with Network Regularization

Cells ◽

10.3390/cells9040881 ◽

2020 ◽

Vol 9 (4) ◽

pp. 881 ◽

Cited By ~ 1

Author(s):

Jihwan Ha ◽

Chihyun Park ◽

Chanyoung Park ◽

Sanghyun Park

Keyword(s):

Computational Models ◽

Matrix Completion ◽

Area Under The Curve ◽

Excellent Performance ◽

Novel Mirna ◽

Lack Of Information ◽

Disease Associations ◽

Roc Area ◽

Leave One Out ◽

Global And Local

The identification of potential microRNA (miRNA)-disease associations helps elucidate the pathogenesis of complex human diseases, owing to the crucial role of miRNAs in various biological processes, and yields insights into novel prognostic markers. Given the time and cost of wet-lab experiments, computational models for finding novel miRNA-disease associations are an attractive alternative. However, computational models to date are biased towards known miRNA-disease associations; they are not suitable for rare miRNAs (i.e., miRNAs with few known disease associations) and uncommon diseases (i.e., diseases with few known miRNA associations), which leads to poor prediction accuracy. The most straightforward way of improving performance is to increase the number of known miRNA-disease associations. However, because such information is lacking, increasing attention has been paid to developing computational models that can handle insufficient data. In this paper, we present a general framework, improved prediction of miRNA-disease associations (IMDN), based on matrix completion with network regularization to discover potential disease-related miRNAs. The success of matrix factorization is demonstrated by its excellent performance in recommender systems. This approach treats a miRNA network as additional implicit feedback and makes predictions for disease associations relevant to a given miRNA based on its direct neighbors. Our experimental results demonstrate that IMDN achieved excellent performance, with reliable area under the receiver operating characteristic (ROC) curve (AUC) values of 0.9162 and 0.8965 in the frameworks of global and local leave-one-out cross-validation (LOOCV), respectively. Further, case studies demonstrated that our method can not only validate true miRNA-disease associations but also suggest novel disease-related miRNA candidates.

A Novel Network-Based Computational Model for Prediction of Potential LncRNA–Disease Association

International Journal of Molecular Sciences ◽

10.3390/ijms20071549 ◽

2019 ◽

Vol 20 (7) ◽

pp. 1549 ◽

Cited By ~ 6

Author(s):

Yang Liu ◽

Xiang Feng ◽

Haochen Zhao ◽

Zhanwei Xuan ◽

Lei Wang

Keyword(s):

Computational Models ◽

Disease Association ◽

Label Propagation ◽

Human Diseases ◽

Weighted Matrix ◽

Disease Associations ◽

Resource Allocation Strategy ◽

Simulation Results ◽

Leave One Out ◽

Treatment Prognosis

Accumulating studies have shown that long non-coding RNAs (lncRNAs) are involved in many biological processes and play important roles in a variety of complex human diseases. Developing effective computational models to identify potential relationships between lncRNAs and diseases can not only help us understand disease mechanisms at the lncRNA molecular level, but also promote the diagnosis, treatment, prognosis, and prevention of human diseases. In this paper, a network-based model called NBLDA was proposed to discover potential lncRNA–disease associations, in which two novel lncRNA–disease weighted networks were constructed. The networks were first built from known lncRNA–disease associations and the topological similarity of the lncRNA–disease association network, and then an lncRNA–lncRNA weighted matrix and a disease–disease weighted matrix were obtained based on a resource allocation strategy of unequal allocation and unbiased consistence. Finally, a label propagation algorithm was applied to predict associated lncRNAs for the investigated diseases. To estimate the prediction performance of NBLDA, leave-one-out cross validation (LOOCV) was implemented, and simulation results showed that NBLDA achieved reliable areas under the ROC curve (AUCs) of 0.8846, 0.8273, and 0.8075 on three known lncRNA–disease association datasets downloaded from the lncRNADisease database, respectively. Furthermore, in case studies of lung cancer, leukemia, and colorectal cancer, simulation results demonstrated that NBLDA can be a powerful tool for identifying potential lncRNA–disease associations.
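The label propagation step mentioned in this abstract can be illustrated with a generic iterative scheme. This is only a sketch under stated assumptions: the update rule F ← αWF + (1 − α)Y, the row normalization, the value of `alpha`, and the toy matrices are all illustrative choices, and the abstract does not specify NBLDA's actual weighted matrices or update rule.

```python
def label_propagation(similarity, labels, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * W F + (1 - alpha) * Y until the scores stop changing.

    similarity: n x n non-negative weights between lncRNAs (hypothetical input)
    labels:     n x m known lncRNA-disease associations (1.0 known, 0.0 unknown)
    """
    n, m = len(similarity), len(labels[0])
    # Row-normalize the similarity matrix so each row sums to 1.
    W = []
    for row in similarity:
        s = sum(row)
        W.append([v / s if s else 0.0 for v in row])
    F = [row[:] for row in labels]
    for _ in range(max_iter):
        nxt = [[alpha * sum(W[i][k] * F[k][j] for k in range(n))
                + (1 - alpha) * labels[i][j]
                for j in range(m)] for i in range(n)]
        delta = max(abs(nxt[i][j] - F[i][j]) for i in range(n) for j in range(m))
        F = nxt
        if delta < tol:
            break
    return F

# Toy example: three lncRNAs, one disease; only lncRNA 0 has a known association.
toy_similarity = [[0.0, 0.9, 0.1],
                  [0.9, 0.0, 0.1],
                  [0.1, 0.1, 0.0]]
toy_labels = [[1.0], [0.0], [0.0]]
propagated = label_propagation(toy_similarity, toy_labels)
```

In this toy case, lncRNA 1 is more similar to the labeled lncRNA 0 than lncRNA 2 is, so it receives the higher propagated score.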

Deep-belief network for predicting potential miRNA-disease associations

Briefings in Bioinformatics ◽

10.1093/bib/bbaa186 ◽

2020 ◽

Author(s):

Xing Chen ◽

Tian-Hao Li ◽

Yan Zhao ◽

Chun-Chun Wang ◽

Chi-Chi Zhu

Keyword(s):

Computational Models ◽

Cross Validation ◽

Biological Data ◽

Deep Belief Network ◽

Computational Method ◽

Restricted Boltzmann Machines ◽

Belief Network ◽

Disease Associations ◽

Leave One Out ◽

The Impact

Abstract

MicroRNA (miRNA) plays an important role in the occurrence, development, diagnosis and treatment of diseases, and a growing number of researchers are paying attention to the relationship between miRNAs and disease. Compared with traditional biological experiments, computational methods that integrate heterogeneous biological data to predict potential associations can effectively save time and cost. Considering the limitations of previous computational models, we developed a deep-belief network model for miRNA-disease association prediction (DBNMDA). We constructed feature vectors to pre-train restricted Boltzmann machines on all miRNA-disease pairs, and then used positive samples and the same number of selected negative samples to fine-tune the DBN and obtain the final predicted scores. Compared with previous supervised models that only use pairs with known labels for training, DBNMDA innovatively utilizes the information of all miRNA-disease pairs during pre-training. This step could reduce, to some extent, the impact of too few known associations on prediction accuracy. DBNMDA achieves an AUC of 0.9104 in global leave-one-out cross validation (LOOCV), an AUC of 0.8232 in local LOOCV and an average AUC of 0.9048 ± 0.0026 in 5-fold cross validation. These AUCs are better than those of previous models. In addition, three different types of case studies for three diseases were implemented to demonstrate the accuracy of DBNMDA. As a result, 84% (breast neoplasms), 100% (lung neoplasms) and 88% (esophageal neoplasms) of the top 50 predicted miRNAs were verified by recent literature. Therefore, we conclude that DBNMDA is an effective method for predicting potential miRNA-disease associations.

MDAKRLS: Predicting human microbe-disease association based on Kronecker regularized least squares and similarities

Journal of Translational Medicine ◽

10.1186/s12967-021-02732-6 ◽

2021 ◽

Vol 19 (1) ◽

Author(s):

Da Xu ◽

Hanxiao Xu ◽

Yusen Zhang ◽

Mingyi Wang ◽

Wei Chen ◽

...

Keyword(s):

Least Squares ◽

Cross Validation ◽

Computational Method ◽

Human Diseases ◽

Regularized Least Squares ◽

Comparison Results ◽

Pathological Mechanism ◽

Disease Associations ◽

Inflammatory Bowel ◽

Leave One Out

Abstract

Background: Microbes are closely related to human health and diseases. Identification of disease-related microbes is of great significance for revealing the pathological mechanisms of human diseases and understanding the interaction mechanisms between microbes and humans, which is also useful for the prevention, diagnosis and treatment of human diseases. Because the known disease-related microbes are still insufficient, it is necessary to develop effective computational methods that reduce the time and cost of biological experiments.

Methods: In this work, we developed a novel computational method called MDAKRLS to discover potential microbe-disease associations (MDAs) based on Kronecker regularized least squares. Specifically, we introduced the Hamming interaction profile similarity to measure the similarities of microbes and diseases, in addition to the Gaussian interaction profile kernel similarity. We also used the Kronecker product to construct two kinds of Kronecker similarities between microbe-disease pairs. Then, we applied Kronecker regularized least squares with each Kronecker similarity to obtain separate prediction scores, and calculated the final prediction scores by integrating the contributions of the different similarities.

Results: The AUC values of global leave-one-out cross-validation and 5-fold cross-validation achieved by MDAKRLS were 0.9327 and 0.9023 ± 0.0015, respectively, which were significantly higher than those of five state-of-the-art methods used for comparison. Comparison results demonstrate that MDAKRLS has faster computing speed under both evaluation frameworks. In addition, case studies of inflammatory bowel disease (IBD) and asthma showed that 19 (IBD) and 19 (asthma) of the top 20 predicted disease-related microbes could be verified by previously published biological or medical literature.

Conclusions: All the evaluation results demonstrated that MDAKRLS has effective and reliable prediction performance. It may be a useful tool for finding new disease-related microbes and for helping biomedical researchers carry out follow-up studies.
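As an illustration of the interaction profile similarities named in the Methods, the sketch below computes a Gaussian interaction profile (GIP) kernel in its commonly used form, K(i, j) = exp(−γ‖IP(i) − IP(j)‖²) with γ set to γ′ divided by the mean squared profile norm. The Hamming-based similarity shown next to it (the fraction of positions where two profiles agree) is an assumed definition for illustration; the abstract does not give the exact formula used by MDAKRLS, and the toy profiles are hypothetical.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of a 0/1 association matrix."""
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm else 0.0
    return [[math.exp(-gamma * sum((a - b) ** 2
                                   for a, b in zip(profiles[i], profiles[j])))
             for j in range(n)] for i in range(n)]

def hamming_profile_similarity(profiles):
    """Assumed definition: share of positions where two interaction profiles agree."""
    n, length = len(profiles), len(profiles[0])
    return [[1.0 - sum(a != b for a, b in zip(profiles[i], profiles[j])) / length
             for j in range(n)] for i in range(n)]

# Toy association matrix: rows are microbes, columns are diseases.
toy_profiles = [[1, 1, 0],
                [1, 0, 0],
                [0, 0, 1]]
K_gip = gip_kernel(toy_profiles)
K_ham = hamming_profile_similarity(toy_profiles)
```

Both matrices are symmetric with ones on the diagonal, and microbes that share more disease associations receive higher similarity.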

NCMCMDA: miRNA–disease association prediction through neighborhood constraint matrix completion

Briefings in Bioinformatics ◽

10.1093/bib/bbz159 ◽

2020 ◽

Cited By ~ 6

Author(s):

Xing Chen ◽

Lian-Gang Sun ◽

Yan Zhao

Keyword(s):

Computational Models ◽

Cross Validation ◽

Matrix Completion ◽

Critical Role ◽

Disease Association ◽

Human Diseases ◽

Superior Performance ◽

Main Task ◽

Constraint Matrix ◽

Disease Associations

Abstract

Emerging evidence shows that microRNAs (miRNAs) play a critical role in diverse fundamental and important biological processes associated with human diseases. Inferring potential disease-related miRNAs and employing them as biomarkers or drug targets could contribute to the prevention, diagnosis and treatment of complex human diseases. Given that traditional biological experiments cost much time and resources, computational models can serve as a complementary means to uncover potential miRNA–disease associations. In this study, we proposed a new computational model named Neighborhood Constraint Matrix Completion for MiRNA–Disease Association prediction (NCMCMDA) to predict potential miRNA–disease associations. The main task of NCMCMDA was to recover the missing miRNA–disease associations based on the known miRNA–disease associations and integrated disease (miRNA) similarity. In this model, we innovatively integrated a neighborhood constraint with matrix completion, which provided a novel way of utilizing similarity information to assist the prediction. After the recovery task was transformed into an optimization problem, we solved it with a fast iterative shrinkage-thresholding algorithm. As a result, the AUCs of NCMCMDA in global and local leave-one-out cross validation were 0.9086 and 0.8453, respectively. In 5-fold cross validation, NCMCMDA achieved an average AUC of 0.8942 with a standard deviation of 0.0015, which demonstrated NCMCMDA’s superior performance over many previous computational methods. Furthermore, NCMCMDA was applied to three different types of case studies to further evaluate its prediction reliability and accuracy. As a result, 84% (colon neoplasms), 98% (esophageal neoplasms) and 98% (breast neoplasms) of the top 50 predicted miRNAs were verified by recent literature.

SRMDAP: SimRank and Density-Based Clustering Recommender Model for miRNA-Disease Association Prediction

BioMed Research International ◽

10.1155/2018/5747489 ◽

2018 ◽

Vol 2018 ◽

pp. 1-11

Author(s):

Xiaoying Li ◽

Yaping Lin ◽

Changlong Gu ◽

Zejun Li

Keyword(s):

Cross Validation ◽

Computational Method ◽

Human Diseases ◽

Excellent Performance ◽

Aberrant Expression ◽

Experimental Identification ◽

Density Based Clustering ◽

Disease Associations ◽

Leave One Out ◽

The Relationship

Aberrant expression of microRNAs (miRNAs) can be applied for the diagnosis, prognosis, and treatment of human diseases. Identifying the relationship between miRNA and human disease is important to further investigate the pathogenesis of human diseases. However, experimental identification of the associations between diseases and miRNAs is time-consuming and expensive. Computational methods are efficient approaches to determine the potential associations between diseases and miRNAs. This paper presents a new computational method based on the SimRank and density-based clustering recommender model for miRNA-disease associations prediction (SRMDAP). The AUC of 0.8838 based on leave-one-out cross-validation and case studies suggested the excellent performance of the SRMDAP in predicting miRNA-disease associations. SRMDAP could also predict diseases without any related miRNAs and miRNAs without any related diseases.
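Since the entries in this listing report performance as an AUC under leave-one-out cross validation, a small worked example of how an AUC is obtained from prediction scores may help. This is a generic sketch, not the SRMDAP evaluation code: it uses the rank-based (Mann-Whitney) formulation, in which the AUC is the probability that a held-out known association is scored above a candidate without known evidence, and the function name and scores are hypothetical.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical scores: held-out known associations vs. unlabeled candidate pairs.
held_out_scores = [0.9, 0.3]
candidate_scores = [0.5, 0.1]
example_auc = auc_from_scores(held_out_scores, candidate_scores)  # 3 of 4 pairs ordered correctly
```

An AUC of 1.0 means every held-out association outranks every candidate, while 0.5 corresponds to random ranking.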

Prediction of Microbe-drug Associations based on Chemical Structures and the KATZ Measure

Current Bioinformatics ◽

10.2174/1574893616666210204144721 ◽

2021 ◽

Vol 16 ◽

Author(s):

Lingzhi Zhu ◽

Guihua Duan ◽

Cheng Yan ◽

Jianxin Wang

Keyword(s):

Computational Models ◽

Cross Validation ◽

Area Under The Curve ◽

Prediction Performance ◽

Complex Mechanism ◽

Chemical Structures ◽

Potential Association ◽

Health And Disease ◽

Leave One Out ◽

Fold Cross Validation

Background: Microbial communities have important influences on our health and disease. Identifying potential human microbe-drug associations will be greatly advantageous for exploring the complex mechanisms of microbes in drug discovery, combinations and repositioning. Until now, the complex mechanism of microbe-drug associations remains unknown. Objective: Computational models play an important role in discovering hidden microbe-drug associations, because biological experiments are time-consuming and expensive. Based on chemical structures of drugs and the KATZ measure, a new computational model (HMDAKATZ) is proposed for identifying potential Human Microbe-Drug Associations. Methods: In HMDAKATZ, the similarity between microbes is computed using the Gaussian Interaction Profile (GIP) kernel based on known human microbe-drug associations. The similarity between drugs is computed based on known human microbe-drug associations and chemical structures. Then, a microbe-drug heterogeneous network is constructed by integrating the microbe-microbe network, the drug-drug network, and a known microbe-drug association network. Finally, we apply KATZ to identify potential associations between microbes and drugs. Results: The experimental results showed that HMDAKATZ achieved area under the curve (AUC) values of 0.9010±0.0020, 0.9066±0.0015, and 0.9116 in 5-fold cross validation (5-fold CV), 10-fold cross validation (10-fold CV), and leave-one-out cross validation (LOOCV), respectively, which outperformed four other computational models (SNMF, RLS, HGBI, and NBI). Conclusion: HMDAKATZ obtained better prediction performance than four other methods in 5-fold CV, 10-fold CV, and LOOCV. Furthermore, three case studies also illustrated that HMDAKATZ is an effective way to discover hidden microbe-drug associations.
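The Methods above can be sketched as a truncated Katz measure on a small heterogeneous network. This is an illustration under assumptions, not the HMDAKATZ implementation: the damping factor `beta`, the maximum walk length, the block layout of the adjacency matrix, and the toy similarity values are all hypothetical.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def katz_scores(adj, beta=0.1, max_len=3):
    """Truncated Katz measure: sum over l = 1..max_len of beta**l * adj**l."""
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]            # adj ** 1
    for length in range(1, max_len + 1):
        for i in range(n):
            for j in range(n):
                scores[i][j] += (beta ** length) * power[i][j]
        power = matmul(power, adj)             # adj ** (length + 1)
    return scores

# Toy heterogeneous network with 2 microbes and 2 drugs (all values hypothetical).
microbe_sim = [[1.0, 0.5], [0.5, 1.0]]
drug_sim = [[1.0, 0.2], [0.2, 1.0]]
assoc = [[1.0, 0.0], [0.0, 0.0]]               # microbe 0 is linked to drug 0

# Block adjacency matrix [[microbe_sim, assoc], [assoc^T, drug_sim]].
n_m, n_d = len(microbe_sim), len(drug_sim)
adj = [microbe_sim[i] + assoc[i] for i in range(n_m)] + \
      [[assoc[i][j] for i in range(n_m)] + drug_sim[j] for j in range(n_d)]

katz = katz_scores(adj)
# The microbe-drug block holds the predicted association scores.
microbe_drug_scores = [[katz[i][n_m + j] for j in range(n_d)] for i in range(n_m)]
```

Walks of increasing length are damped by increasing powers of `beta`, so a microbe that is similar to one with a known drug association still receives a nonzero, but smaller, score for that drug.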

SCMFMDA: Predicting microRNA-disease associations based on similarity constrained matrix factorization

PLoS Computational Biology ◽

10.1371/journal.pcbi.1009165 ◽

2021 ◽

Vol 17 (7) ◽

pp. e1009165

Author(s):

Lei Li ◽

Zhen Gao ◽

Yu-Tian Wang ◽

Ming-Wen Zhang ◽

Jian-Cheng Ni ◽

...

Keyword(s):

Matrix Factorization ◽

Cross Validation ◽

Sequence Similarity ◽

Nonnegative Matrix ◽

Functional Similarity ◽

Human Diseases ◽

Mirna Sequence ◽

Fusion Algorithm ◽

Disease Associations ◽

Leave One Out

miRNAs are small non-coding RNAs that are involved in a number of complicated biological processes. Many studies have suggested that miRNAs are closely associated with many human diseases. In this study, we proposed a computational model based on Similarity Constrained Matrix Factorization for miRNA-Disease Association Prediction (SCMFMDA). In order to effectively combine different disease and miRNA similarity data, we applied a similarity network fusion algorithm to obtain integrated disease similarity (composed of disease functional similarity, disease semantic similarity and disease Gaussian interaction profile kernel similarity) and integrated miRNA similarity (composed of miRNA functional similarity, miRNA sequence similarity and miRNA Gaussian interaction profile kernel similarity). In addition, L2 regularization terms and similarity constraint terms were added to the traditional Nonnegative Matrix Factorization algorithm to predict disease-related miRNAs. SCMFMDA achieved AUCs of 0.9675 and 0.9447 based on global leave-one-out cross validation and five-fold cross validation, respectively. Furthermore, case studies on two common human diseases were implemented to demonstrate the prediction accuracy of SCMFMDA. The miRNAs among the top 50 predictions that were confirmed by experimental reports indicated that SCMFMDA was effective at predicting relationships between miRNAs and diseases.

Predicting MiRNA-disease associations by multiple meta-paths fusion graph embedding model

BMC Bioinformatics ◽

10.1186/s12859-020-03765-2 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Lei Zhang ◽

Bailong Liu ◽

Zhengwei Li ◽

Xiaoyan Zhu ◽

Zhizhen Liang ◽

...

Keyword(s):

Computational Models ◽

Cross Validation ◽

Complex Structure ◽

Graph Embedding ◽

Prostate Neoplasms ◽

Prediction Performance ◽

Linear Transformations ◽

Disease Associations ◽

Meta Path ◽

Leave One Out

Abstract

Background: Many studies prove that miRNAs have significant roles in diagnosing and treating complex human diseases. However, conventional biological experiments are too costly and time-consuming to identify unconfirmed miRNA-disease associations. Thus, computational models that predict unidentified miRNA-disease pairs efficiently are becoming promising research topics. Although existing methods have performed well in revealing unidentified miRNA-disease associations, more work is still needed to improve prediction performance.

Results: In this work, we present a novel multiple meta-paths fusion graph embedding model to predict unidentified miRNA-disease associations (M2GMDA). Our method takes full advantage of the complex structure and rich semantic information of miRNA-disease interactions in a self-learning way. First, a miRNA-disease heterogeneous network was derived from verified miRNA-disease pairs, miRNA similarity and disease similarity. All meta-path instances connecting miRNAs with diseases were extracted to describe intrinsic information about miRNA-disease interactions. Then, we developed a graph embedding model to predict miRNA-disease associations. The model is composed of linear transformations of miRNAs and diseases, the means encoder of a single meta-path instance, the attention-aware encoder of meta-path type and attention-aware multiple meta-path fusion. We innovatively integrated meta-path instances, meta-path based neighbours, intermediate nodes in meta-paths and more information to strengthen the prediction in our model. In particular, distinct contributions of different meta-path instances and meta-path types were combined with attention mechanisms. The data sets and source code that support the findings of this study are available at https://github.com/dangdangzhang/M2GMDA.

Conclusions: M2GMDA achieved AUCs of 0.9323 and 0.9182 in global leave-one-out cross validation and fivefold cross validation, respectively, with HMDD v2.0. The results showed that our method outperforms other prediction methods. Three kinds of case studies with lung neoplasms, breast neoplasms, prostate neoplasms, pancreatic neoplasms, lymphoma and colorectal neoplasms demonstrated that 47, 50, 49, 48, 50 and 50 of the top 50 candidate miRNAs predicted by M2GMDA, respectively, were validated by biological experiments. This further confirms the prediction performance of our method.

Prioritizing Disease-Related Microbes Based on the Topological Properties of a Comprehensive Network

Frontiers in Microbiology ◽

10.3389/fmicb.2021.685549 ◽

2021 ◽

Vol 12 ◽

Author(s):

Haixiu Yang ◽

Fan Tong ◽

Changlu Qi ◽

Ping Wang ◽

Jiangyu Li ◽

...

Keyword(s):

High Performance ◽

Large Scale ◽

Cross Validation ◽

Area Under The Curve ◽

Disease Pathogenesis ◽

Physiological Processes ◽

Disease Associations ◽

Inflammatory Bowel ◽

Leave One Out ◽

Insight Into

Many microbes are parasitic within the human body, engaging in various physiological processes and playing an important role in human diseases. The discovery of new microbe–disease associations aids our understanding of disease pathogenesis. Computational methods can be applied in such investigations, thereby avoiding the time-consuming and laborious nature of experimental methods. In this study, we constructed a comprehensive microbe–disease network by integrating known microbe–disease associations from three large-scale databases (Peryton, Disbiome, and gutMDisorder), and extended the random walk with restart to the network for prioritizing unknown microbe–disease associations. The area under the curve values of the leave-one-out cross-validation and the fivefold cross-validation exceeded 0.9370 and 0.9366, respectively, indicating the high performance of this method. In case studies of inflammatory bowel disease, asthma, and obesity, some prioritized disease-related microbes were validated by recent literature, even though these diseases are already widely studied. This suggests that our method is effective at prioritizing novel disease-related microbes and may offer further insight into disease pathogenesis.
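The random walk with restart used for prioritization can be sketched as a fixed-point iteration, p ← (1 − r)Wp + r·p₀, where W is the column-normalized adjacency matrix and p₀ is concentrated on the seed nodes. This is a generic sketch, not the authors' extension of the method: the restart probability, the convergence tolerance, and the four-node toy network are assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at the seeds."""
    n = len(adj)
    # Column-normalize so each column of W sums to 1 (transition probabilities).
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Toy network: a chain 0 - 1 - 2 - 3, with node 0 as the seed (e.g. a query disease).
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
visit_prob = random_walk_with_restart(chain, seeds=[0])
```

Nodes closer to the seed in the network end up with higher steady-state probability, which gives the ranking used to prioritize candidates.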
