A Novel Approach Based on a Weighted Interactive Network to Predict Associations of MiRNAs and Diseases

Accumulating evidence from experimental studies indicates that microRNAs (miRNAs) play a significant role in the pathogenesis of diseases; therefore, developing powerful computational models to identify potential human miRNA–disease associations is vital for understanding disease etiology and pathogenesis. In this paper, a weighted interactive network was first constructed by combining known miRNA–disease associations with the integrated similarity between diseases and the integrated similarity between miRNAs. Then, a new computational method based on this weighted interactive network, WINMDA, was developed for discovering potential miRNA–disease associations by integrating the T most similar neighbors and the shortest path algorithm. Simulation results show that WINMDA achieves reliable area under the receiver operating characteristic (ROC) curve (AUC) values of 0.9183 ± 0.0007 in 5-fold cross-validation, 0.9200 ± 0.0004 in 10-fold cross-validation, 0.9243 in global leave-one-out cross-validation (LOOCV), and 0.8856 in local LOOCV. Furthermore, case studies of colon neoplasms, gastric neoplasms, and prostate neoplasms based on the Human microRNA Disease Database (HMDD) were carried out, in which 94% (colon neoplasms), 96% (gastric neoplasms), and 96% (prostate neoplasms) of the top 50 predicted miRNAs were confirmed by recent experimental reports, further demonstrating that WINMDA can effectively uncover potential miRNA–disease associations.
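The AUC values reported throughout these abstracts can be read as the probability that a randomly chosen known association is scored above a randomly chosen unknown pair. The following is a minimal sketch of that rank-based definition, not the evaluation code of any paper listed here; the function name `auc_from_scores` and the toy scores are assumptions made for illustration.

```python
def auc_from_scores(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores for four candidate pairs (1 = known association).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(auc_from_scores(toy_scores, toy_labels))
```

In a k-fold or leave-one-out setting, the same computation would be applied to the scores of the held-out associations in each fold.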


Fusing multiple biological networks to effectively predict miRNA-disease associations

Current Bioinformatics ◽

10.2174/1574893615999200715165335 ◽

2020 ◽

Vol 15 ◽

Author(s):

Qingqi Zhu ◽

Yongxian Fan ◽

Xiaoyong Pan

Keyword(s):

Random Walk ◽

Case Studies ◽

Biological Networks ◽

Cross Validation ◽

Prostate Neoplasms ◽

Random Walk Model ◽

Gastric Neoplasms ◽

Similarity Matrix ◽

Fusion Algorithm ◽

Disease Associations

Background: MicroRNAs (miRNAs) are a class of endogenous non-coding RNAs of about 22 nucleotides that play a significant role in a variety of complex biological processes. Many studies have shown that miRNAs are closely related to human diseases. Although biological experiments are reliable in identifying miRNA-disease associations, they are time-consuming and costly. Objective: Thus, computational methods are urgently needed to effectively predict miRNA-disease associations. Method: In this paper, we propose a novel method, BIRWMDA, based on a bi-random walk model to predict miRNA-disease associations. Specifically, in BIRWMDA, the similarity network fusion algorithm is used to combine multiple similarity matrices into a miRNA-miRNA similarity matrix and a disease-disease similarity matrix; the miRNA-disease associations are then predicted by the bi-random walk model. Results: To evaluate the performance of BIRWMDA, we ran leave-one-out cross validation and 5-fold cross validation, and the corresponding AUCs were 0.9303 and 0.9223 ± 0.00067, respectively. To further demonstrate the effectiveness of BIRWMDA, from the perspective of exploring disease-related miRNAs, we conducted three case studies of breast neoplasms, prostate neoplasms and gastric neoplasms, where 48, 50 and 50 out of the top 50 predicted miRNAs were confirmed by the literature, respectively. From the perspective of exploring miRNA-related diseases, we conducted two case studies of hsa-mir-21 and hsa-mir-155, where 7 and 5 out of the top 10 predicted diseases were confirmed by the literature, respectively. Conclusion: Fusion of multiple biological networks can effectively predict miRNA-disease associations. We expect BIRWMDA to serve as a biological tool for mining potential miRNA-disease associations.
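The abstract does not state the update rule BIRWMDA uses, so the following is only a sketch of one common way a bi-random walk can be written: association scores are repeatedly propagated through a row-normalised miRNA similarity matrix on the left and a row-normalised disease similarity matrix on the right, with a restart towards the known associations. The function names, the restart weight `alpha`, the step count, and the toy matrices are all assumptions made for illustration.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def row_normalize(m):
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def bi_random_walk(mirna_sim, disease_sim, assoc, alpha=0.8, steps=4):
    """Iterate R <- alpha * M @ R @ D + (1 - alpha) * P, where P is the
    known association matrix scaled to sum to 1 (assumed formulation)."""
    m = row_normalize(mirna_sim)
    d = row_normalize(disease_sim)
    total = sum(sum(row) for row in assoc)
    prior = [[v / total for v in row] for row in assoc]
    scores = [row[:] for row in prior]
    for _ in range(steps):
        walked = matmul(matmul(m, scores), d)
        scores = [[alpha * w + (1 - alpha) * p for w, p in zip(wr, pr)]
                  for wr, pr in zip(walked, prior)]
    return scores


# Toy example: 2 miRNAs x 2 diseases with a single known association.
toy_mirna_sim = [[1.0, 0.5], [0.5, 1.0]]
toy_disease_sim = [[1.0, 0.5], [0.5, 1.0]]
toy_assoc = [[1, 0], [0, 0]]
for row in bi_random_walk(toy_mirna_sim, toy_disease_sim, toy_assoc):
    print(row)
```

Pairs with no known association receive a non-zero score only through similarity to pairs that do have one, which is what allows unknown pairs to be ranked.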


A novel computational model for predicting potential LncRNA-disease associations based on both direct and indirect features of LncRNA-disease pairs

BMC Bioinformatics ◽

10.1186/s12859-020-03906-7 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Yubin Xiao ◽

Zheng Xiao ◽

Xiang Feng ◽

Zhiping Chen ◽

Linai Kuang ◽

...

Keyword(s):

Computational Model ◽

Cross Validation ◽

State Of The Art ◽

Prediction Methods ◽

Good Prediction ◽

Average Case ◽

Comparison Results ◽

Disease Associations ◽

Fold Cross Validation

Abstract Background Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and it is useful for the diagnosis and treatment of diseases to get the relationships between lncRNAs and diseases. Due to the high costs and time complexity of traditional bio-experiments, in recent years, more and more computational methods have been proposed by researchers to infer potential lncRNA-disease associations. However, there exist all kinds of limitations in these state-of-the-art prediction methods as well. Results In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. In FVTLDA, its major novelty lies in the integration of direct and indirect features related to lncRNA-disease associations such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which guarantees that FVTLDA can be utilized to predict diseases without known related-lncRNAs and lncRNAs without known related-diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease nor requires any negative samples, which guarantee that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single model prediction techniques, we combine FVTLDA with the Multiple Linear Regression (MLR) and the Artificial Neural Network (ANN) for data analysis respectively. Simulation experiment results show that FVTLDA with MLR can achieve reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-Fold Cross Validation (fivefold CV), 10-Fold Cross Validation (tenfold CV) and Leave-One-Out Cross Validation (LOOCV), separately, while FVTLDA with ANN can achieve reliable AUCs of 0.8766, 0.8830 and 0.8807 in fivefold CV, tenfold CV, and LOOCV respectively. 
Furthermore, in case studies of gastric cancer, leukemia and lung cancer, experimental results show that 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515 respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA. Conclusion The simulation results show that FVTLDA has good prediction performance, which is a good supplement to future bioinformatics research.


A Novel Computational Model for Predicting Potential LncRNA-Disease Associations based on Both Direct and Indirect Features of LncRNA-Disease Pairs

10.21203/rs.2.18937/v3 ◽

2020 ◽

Author(s):

Yubin Xiao ◽

Zheng Xiao ◽

Xiang Feng ◽

Zhiping Chen ◽

Linai Kuang ◽

...

Keyword(s):

Computational Model ◽

Cross Validation ◽

State Of The Art ◽

Prediction Methods ◽

Good Prediction ◽

Average Case ◽

Comparison Results ◽

Disease Associations ◽

Fold Cross Validation

Abstract Background: Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and it is useful for the diagnosis and treatment of diseases to get the relationships between lncRNAs and diseases. Due to the high costs and time complexity of traditional bio-experiments, in recent years, more and more computational methods have been proposed by researchers to infer potential lncRNA-disease associations. However, there exist all kinds of limitations in these state-of-the-art prediction methods as well. Results: In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. In FVTLDA, its major novelty lies in the integration of direct and indirect features related to lncRNA-disease associations such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which guarantees that FVTLDA can be utilized to predict diseases without known related-lncRNAs and lncRNAs without known related-diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease nor requires any negative samples, which guarantee that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single model prediction techniques, we combine FVTLDA with the Multiple Linear Regression (MLR) and the Artificial Neural Network (ANN) for data analysis respectively. Simulation experiment results show that FVTLDA with MLR can achieve reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-Fold Cross Validation (5-fold CV), 10-Fold Cross Validation (10-fold CV) and Leave-One-Out Cross Validation (LOOCV), separately, while FVTLDA with ANN can achieve reliable AUCs of 0.8766, 0.8830 and 0.8807 in 5-fold CV, 10-fold CV, and LOOCV respectively.
Furthermore, in case studies of gastric cancer, leukemia and lung cancer, experimental results show that 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515 respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA. Conclusion: The simulation results show that FVTLDA has good prediction performance, which is a good supplement to future bioinformatics research.


IPCARF: improving lncRNA-disease association prediction using incremental principal component analysis feature selection and a random forest classifier

BMC Bioinformatics ◽

10.1186/s12859-021-04104-9 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Rong Zhu ◽

Yong Wang ◽

Jin-Xing Liu ◽

Ling-Yun Dai

Keyword(s):

Principal Component Analysis ◽

Random Forest ◽

Cross Validation ◽

Search Algorithm ◽

Principal Component ◽

Component Analysis ◽

Biological Data ◽

Learning Technology ◽

Disease Associations ◽

Fold Cross Validation

Abstract Background Identifying lncRNA-disease associations not only helps to better comprehend the underlying mechanisms of various human diseases at the lncRNA level but also speeds up the identification of potential biomarkers for disease diagnoses, treatments, prognoses, and drug response predictions. However, as the amount of archived biological data continues to grow, it has become increasingly difficult to detect potential human lncRNA-disease associations from these enormous biological datasets using traditional biological experimental methods. Consequently, developing new and effective computational methods to predict potential human lncRNA diseases is essential. Results Using a combination of incremental principal component analysis (IPCA) and random forest (RF) algorithms and by integrating multiple similarity matrices, we propose a new algorithm (IPCARF) based on integrated machine learning technology for predicting lncRNA-disease associations. First, we used two different models to compute a semantic similarity matrix of diseases from a directed acyclic graph of diseases. Second, a characteristic vector for each lncRNA-disease pair is obtained by integrating disease similarity, lncRNA similarity, and Gaussian nuclear similarity. Then, the best feature subspace is obtained by applying IPCA to decrease the dimension of the original feature set. Finally, we train an RF model to predict potential lncRNA-disease associations. The experimental results show that the IPCARF algorithm effectively improves the AUC metric when predicting potential lncRNA-disease associations. Before the parameter optimization procedure, the AUC value predicted by the IPCARF algorithm under 10-fold cross-validation reached 0.8529; after selecting the optimal parameters using the grid search algorithm, the predicted AUC of the IPCARF algorithm reached 0.8611. 
Conclusions We compared IPCARF with the existing LRLSLDA, LRLSLDA-LNCSIM, TPGLDA, NPCMF, and ncPred prediction methods, which have shown excellent performance in predicting lncRNA-disease associations. The compared results of 10-fold cross-validation procedures show that the predictions of the IPCARF method are better than those of the other compared methods.


Bipartite graph-based collaborative matrix factorization method for predicting miRNA-disease associations

BMC Bioinformatics ◽

10.1186/s12859-021-04486-w ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Feng Zhou ◽

Meng-Meng Yin ◽

Cui-Na Jiao ◽

Zhen Cui ◽

Jing-Xiu Zhao ◽

...

Keyword(s):

Bipartite Graph ◽

Matrix Factorization ◽

Cross Validation ◽

Rapid Development ◽

Factorization Method ◽

Computational Method ◽

Human Diseases ◽

Simulation Experiments ◽

Disease Associations ◽

Fold Cross Validation

Abstract Background With the rapid development of various advanced biotechnologies, researchers in related fields have realized that microRNAs (miRNAs) play critical roles in many serious human diseases. However, experimental identification of new miRNA–disease associations (MDAs) is expensive and time-consuming. Practitioners have shown growing interest in methods for predicting potential MDAs. In recent years, an increasing number of computational methods for predicting novel MDAs have been developed, making a huge contribution to the research of human diseases and saving considerable time. In this paper, we proposed an efficient computational method, named bipartite graph-based collaborative matrix factorization (BGCMF), which is highly advantageous for predicting novel MDAs. Results By combining two improved recommendation methods, a new model for predicting MDAs is generated. Based on the idea that some new miRNAs and diseases do not have any associations, we adopt the bipartite graph based on the collaborative matrix factorization method to complete the prediction. The BGCMF achieves a desirable result, with AUC of up to 0.9514 ± (0.0007) in the five-fold cross-validation experiments. Conclusions Five-fold cross-validation is used to evaluate the capabilities of our method. Simulation experiments are implemented to predict new MDAs. More importantly, the AUC value of our method is higher than those of some state-of-the-art methods. Finally, many associations between new miRNAs and new diseases are successfully predicted by performing simulation experiments, indicating that BGCMF is a useful method to predict more potential miRNAs with roles in various diseases.


Prediction of Potential miRNA–Disease Associations Through a Novel Unsupervised Deep Learning Framework with Variational Autoencoder

Cells ◽

10.3390/cells8091040 ◽

2019 ◽

Vol 8 (9) ◽

pp. 1040 ◽

Cited By ~ 3

Author(s):

Li Zhang ◽

Xing Chen ◽

Jun Yin

Keyword(s):

Deep Learning ◽

Cross Validation ◽

Operating Characteristics ◽

Learning Framework ◽

Disease Similarity ◽

Variational Autoencoder ◽

Disease Associations ◽

Unsupervised Deep Learning ◽

Leave One Out

The important role of microRNAs (miRNAs) in the formation, development, diagnosis, and treatment of diseases has attracted much attention among researchers recently. In this study, we present an unsupervised deep learning model of the variational autoencoder for MiRNA–disease association prediction (VAEMDA). Through combining the integrated miRNA similarity and the integrated disease similarity with known miRNA–disease associations, respectively, we constructed two spliced matrices. These matrices were applied to train the variational autoencoder (VAE), respectively. The final predicted association scores between miRNAs and diseases were obtained by integrating the scores from the two trained VAE models. Unlike previous models, VAEMDA can avoid noise introduced by the random selection of negative samples and reveal associations between miRNAs and diseases from the perspective of data distribution. Compared with previous methods, VAEMDA obtained higher area under the receiver operating characteristics curves (AUCs) of 0.9118, 0.8652, and 0.9091 ± 0.0065 in global leave-one-out cross validation (LOOCV), local LOOCV, and five-fold cross validation, respectively. Further, the AUCs of VAEMDA were 0.8250 and 0.8237 in global leave-one-disease-out cross validation (LODOCV), and local LODOCV, respectively. In three different types of case studies on three important diseases, the results showed that most of the top 50 potentially associated miRNAs were verified by databases and the literature.


Network analysis of autistic disease comorbidities in Chinese children based on ICD-10 codes

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01282-z ◽

2020 ◽

Vol 20 (1) ◽

Author(s):

Xiaojun Li ◽

Guangjian Liu ◽

Wenxiong Chen ◽

Zhisheng Bi ◽

Huiying Liang

Keyword(s):

Literature Search ◽

Cross Validation ◽

Medical Center ◽

Comorbid Disease ◽

Disease Network ◽

Disease Patterns ◽

Disease Associations ◽

Icd 10 ◽

Comorbid Diseases ◽

Fold Cross Validation

Abstract Background Autism is a lifelong disability associated with several comorbidities that confound diagnosis and treatment. A better understanding of these comorbidities would facilitate diagnosis and improve treatments. Our aim was to improve the detection of comorbid diseases associated with autism. Methods We used an FP-growth algorithm to retrospectively infer disease associations from the records of 1488 patients with autism treated at the Guangzhou Women and Children’s Medical Center. The disease network was established using Cytoscape 3.7. The rules were internally validated by 10-fold cross-validation. All rules were further verified using the Columbia Open Health Data (COHD) and by literature search. Results We found 148 comorbid diseases including intellectual disability, developmental speech disorder, and epilepsy. The network comprised 76 nodes and 178 directed links. Of these links, 158 were confirmed by literature search and 105 were validated by COHD. Furthermore, we identified 14 links not previously reported. Conclusion We demonstrate that the FP-growth algorithm can detect comorbid disease patterns, including novel ones, in patients with autism.
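To make the notion of a directed disease link concrete, the sketch below derives rules of the form "A implies B" from per-patient diagnosis sets using support and confidence thresholds. It deliberately uses brute-force pair counting rather than FP-growth, which reaches the same frequent itemsets more efficiently through a prefix tree; the function name, the thresholds, and the diagnosis labels are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import permutations


def pairwise_rules(records, min_support=0.3, min_confidence=0.6):
    """Return directed rules (a, b, support, confidence) meaning 'a -> b'.

    support    = share of records containing both a and b
    confidence = share of records containing a that also contain b
    """
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for rec in records:
        items = set(rec)
        item_counts.update(items)
        # Every ordered pair is counted once per record that contains both items.
        pair_counts.update(permutations(sorted(items), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


# Hypothetical diagnosis sets for four patients (labels are placeholders).
toy_records = [["dx_a", "dx_b"], ["dx_a", "dx_b"], ["dx_a"], ["dx_a", "dx_c"]]
for rule in pairwise_rules(toy_records):
    print(rule)
```

Each surviving rule corresponds to one directed link in a disease network; the direction matters because confidence is not symmetric.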


MiRNA-disease association prediction via hypergraph learning based on high-dimensionality features

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01320-w ◽

2021 ◽

Vol 21 (S1) ◽

Author(s):

Yu-Tian Wang ◽

Qing-Wen Wu ◽

Zhen Gao ◽

Jian-Cheng Ni ◽

Chun-Hou Zheng

Keyword(s):

Computational Models ◽

Cross Validation ◽

Nearest Neighbor ◽

Experimental Studies ◽

Prediction Method ◽

Disease Association ◽

High Dimensionality ◽

K Nearest Neighbor ◽

Hypergraph Learning ◽

Disease Associations

Abstract Background MicroRNAs (miRNAs) have been confirmed to have a close relationship with various complex human diseases. The identification of disease-related miRNAs provides great insights into the underlying pathogenesis of diseases. However, it is still a big challenge to identify which miRNAs are related to diseases. As experimental methods are in general expensive and time-consuming, it is important to develop efficient computational models to discover potential miRNA-disease associations. Methods This study presents a novel prediction method called HFHLMDA, which is based on high-dimensionality features and hypergraph learning, to reveal the association between diseases and miRNAs. Firstly, the miRNA functional similarity and the disease semantic similarity are integrated to form an informative high-dimensionality feature vector. Then, a hypergraph is constructed by the K-Nearest-Neighbor (KNN) method, in which each miRNA-disease pair and its k most relevant neighbors are linked as one hyperedge to represent the complex relationships among miRNA-disease pairs. Finally, the hypergraph learning model is designed to learn the projection matrix, which is used to calculate association scores for uncertain miRNA-disease pairs. Result Compared with four state-of-the-art computational models, HFHLMDA achieved the best results of 92.09% and 91.87% in leave-one-out cross validation and fivefold cross validation, respectively. Moreover, in case studies on Esophageal neoplasms, Hepatocellular Carcinoma, and Breast Neoplasms, 90%, 98%, and 96% of the top 50 predictions have been manually confirmed by previous experimental studies. Conclusion MiRNAs have complex connections with many human diseases. In this study, we proposed a novel computational model to predict the underlying miRNA-disease associations. All results show that the proposed method is effective for miRNA–disease association prediction.
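The KNN hyperedge construction described in the Methods can be sketched as follows: each sample (here standing in for a miRNA-disease pair's feature vector) is grouped with its k nearest neighbours into one hyperedge. This is a minimal illustration assuming Euclidean distance, which the abstract does not specify; the function name `knn_hyperedges` and the toy feature vectors are hypothetical.

```python
import math


def knn_hyperedges(features, k):
    """Build one hyperedge per sample: the sample's own index followed by
    the indices of its k nearest neighbours (Euclidean distance assumed)."""
    edges = []
    for i, fi in enumerate(features):
        # Sort all other samples by distance to sample i (ties broken by index).
        dists = sorted((math.dist(fi, fj), j)
                       for j, fj in enumerate(features) if j != i)
        neighbours = [j for _, j in dists[:k]]
        edges.append([i] + neighbours)
    return edges


# Toy feature vectors for four miRNA-disease pairs (values are made up).
toy_features = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
print(knn_hyperedges(toy_features, k=1))
```

Unlike an ordinary graph edge, each hyperedge connects k + 1 samples at once, which is how the hypergraph captures group-level relationships among pairs.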


LOMDA: Linear optimization for miRNA-disease association prediction

10.1101/751651 ◽

2019 ◽

Author(s):

Yan-Li Lee ◽

Ratha Pech ◽

Maryna Po ◽

Dong Hao ◽

Tao Zhou

Keyword(s):

Cross Validation ◽

Linear Optimization ◽

Optimization Technique ◽

Auxiliary Information ◽

Prediction Performance ◽

First Case ◽

Disease Associations ◽

Similarity Information ◽

Fold Cross Validation

Abstract: MicroRNAs (miRNAs) play a crucial role in many important biological processes, e.g., the pathogenesis of diseases. Currently, the validated associations between miRNAs and diseases are insufficient compared to the hidden associations. Testing all these hidden associations by biological experiments is expensive, laborious, and time-consuming. Therefore, computationally inferring hidden associations from biological datasets for further laboratory experiments has attracted increasing interest from different communities ranging from biological to computational science. In this work, we propose an effective and efficient method to predict associations between miRNAs and diseases, namely linear optimization (LOMDA). The proposed method uses a heterogeneous matrix incorporating miRNA functional similarity information, disease similarity information and known miRNA-disease associations. Compared with the other methods, LOMDA performs best in terms of AUC (0.970), precision (0.566), and accuracy (0.971) on average over 15 diseases in local 5-fold cross-validation. Moreover, LOMDA has also been applied to two types of case studies. In the first case study, 30 predictions for breast neoplasms, 24 for colon neoplasms, and 26 for kidney neoplasms among the top 30 predicted miRNAs are confirmed. In the second case study, for new diseases without any known associations, the top 30 predictions for hepatocellular carcinoma and 29 of the top 30 for lung neoplasms are confirmed.

Author summary: Identifying associations between miRNAs and diseases is significant in the investigation of pathogenesis, diagnosis, treatment and prevention of related diseases. Employing computational methods to predict the hidden associations based on known associations, and focusing on those predicted associations, can sharply reduce experimental costs. We developed a computational method, LOMDA, based on the linear optimization technique to predict the hidden associations. In addition to the observed associations, LOMDA can also employ auxiliary information (disease and miRNA similarity information) flexibly and effectively. Numerical experiments on global 5-fold cross validation show that the use of the auxiliary information can greatly improve the prediction performance. Meanwhile, the result on local 5-fold cross validation shows that LOMDA performs best among the seven related methods. We further test the prediction performance of LOMDA for two types of diseases based on HMDD v2.0 (2014), including (i) diseases with all the known associations, and (ii) new diseases without known associations. Three independent or updated databases (dbDEMC, 2010; miR2Disease, 2009; HMDD v3.2, 2019) are introduced to evaluate the prediction results. As a result, most miRNAs for target diseases are confirmed by at least one of the three databases. We therefore believe that LOMDA can guide experiments to identify the hidden miRNA-disease associations.


Predicting Drug-Disease Associations via Using Gaussian Interaction Profile and Kernel-Based Autoencoder

BioMed Research International ◽

10.1155/2019/2426958 ◽

2019 ◽

Vol 2019 ◽

pp. 1-11 ◽

Cited By ~ 4

Author(s):

Han-Jing Jiang ◽

Yu-An Huang ◽

Zhu-Hong You

Keyword(s):

Case Studies ◽

Computational Models ◽

Cross Validation ◽

Drug Repositioning ◽

Feature Learning ◽

Superior Performance ◽

Reliable Model ◽

Disease Associations ◽

The Cost ◽

Fold Cross Validation

Computational drug repositioning, designed to identify new indications for existing drugs, significantly reduces the cost and time involved in drug development. Prediction of drug-disease associations is promising for drug repositioning. Recent years have witnessed an increasing number of machine learning-based methods for computational drug repositioning. In this paper, a novel feature learning method based on the Gaussian interaction profile kernel and an autoencoder (GIPAE) is proposed for drug-disease association prediction. To further reduce the computation cost, both a batch normalization layer and a fully connected layer are introduced to reduce training complexity. The experimental results of 10-fold cross validation indicate that the proposed method achieves superior performance on Fdataset and Cdataset, with AUCs of 93.30% and 96.03%, respectively, which are higher than those of many previous computational models. To further assess the accuracy of GIPAE, we conducted case studies on two complex human diseases. Of the top 20 predicted drugs, 14 obesity-related drugs and 11 drugs related to Alzheimer's disease were validated in the CTD database. The results of cross validation and case studies indicate that GIPAE is a reliable model for predicting drug-disease associations.
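The Gaussian interaction profile (GIP) kernel named above is commonly defined as K(i, j) = exp(-gamma * ||p_i - p_j||^2), where p_i is the binary association profile of entity i and gamma is a bandwidth divided by the mean squared norm of all profiles. The sketch below follows that common definition; whether GIPAE uses exactly this normalisation is an assumption, and the function name, the `gamma_prime` parameter, and the toy profiles are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of a binary
    association matrix: K[i][j] = exp(-gamma * ||p_i - p_j||^2), with
    gamma = gamma_prime / mean(||p_i||^2) (commonly used normalisation)."""
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm else 0.0
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy drug-by-disease association profiles (rows = drugs; values are made up).
toy_profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
for row in gip_kernel(toy_profiles):
    print(row)
```

Drugs with identical association profiles get similarity 1, and similarity decays as the profiles diverge; the same function applied to the transposed matrix would give a disease-disease kernel.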
