miRNA-Disease Association Prediction with Collaborative Matrix Factorization

As members of the noncoding RNA family, microRNAs (miRNAs) are involved in the development and progression of various complex diseases. Experimental identification of miRNA-disease associations is expensive and time-consuming, so efficient algorithms are needed to identify novel miRNA-disease associations. In this paper, we developed a computational method, Collaborative Matrix Factorization for miRNA-Disease Association prediction (CMFMDA), to identify potential miRNA-disease associations by integrating miRNA functional similarity, disease semantic similarity, and experimentally verified miRNA-disease associations. Experiments verified that CMFMDA achieves its intended purpose, with short running time and high prediction accuracy. In addition, we applied CMFMDA to Esophageal Neoplasms and Kidney Neoplasms to reveal their potentially related miRNAs. As a result, 84% and 82% of the top 50 predicted miRNA-disease pairs for these two diseases, respectively, were confirmed by experiment. Moreover, CMFMDA can be applied to new diseases and new miRNAs without any known associations, which overcomes a limitation of many previous computational methods.
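To make the core idea concrete, the following is a minimal sketch of plain regularized matrix factorization on a binary association matrix. It is not the CMFMDA objective: the similarity terms the abstract mentions are omitted, and the names `factorize` and `score`, the toy matrix, and all hyperparameters are assumptions chosen for illustration.

```python
import random

def factorize(Y, rank=2, lr=0.05, reg=0.01, epochs=2000, seed=0):
    """Approximate Y (a list of rows) as A * B^T by stochastic gradient descent.

    Hypothetical helper: minimizes squared error plus an L2 penalty on the factors.
    """
    rng = random.Random(seed)
    n, m = len(Y), len(Y[0])
    # Small random initial factors for rows (e.g. miRNAs) and columns (e.g. diseases).
    A = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n)]
    B = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(m)]
    for _ in range(epochs):
        for i in range(n):
            for j in range(m):
                pred = sum(A[i][k] * B[j][k] for k in range(rank))
                err = Y[i][j] - pred
                for k in range(rank):
                    a, b = A[i][k], B[j][k]
                    A[i][k] += lr * (err * b - reg * a)
                    B[j][k] += lr * (err * a - reg * b)
    return A, B

def score(A, B, i, j):
    """Predicted association score for row entity i and column entity j."""
    return sum(a * b for a, b in zip(A[i], B[j]))

# Toy association matrix: rows 0 and 1 share columns 0 and 1; row 2 pairs with column 2.
Y = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
A, B = factorize(Y)
```

After fitting, `score(A, B, i, j)` can be used to rank unobserved pairs; a collaborative variant would add terms tying the factors to the miRNA and disease similarity matrices.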

DNILMF-LDA: Prediction of lncRNA-Disease Associations by Dual-Network Integrated Logistic Matrix Factorization and Bayesian Optimization

Genes ◽ 10.3390/genes10080608 ◽ 2019 ◽ Vol 10 (8) ◽ pp. 608 ◽ Cited By ~ 3

Author(s): Yan Li ◽ Junyi Li ◽ Naizheng Bian

Keyword(s): Matrix Factorization ◽ Disease Diagnosis ◽ Disease Association ◽ Bayesian Optimization ◽ Model Parameters ◽ Target Interaction ◽ Disease Associations ◽ AUC Value ◽ Similarity Networks ◽ Fold Cross Validation

Identifying associations between lncRNAs and diseases can help understand disease-related lncRNAs and facilitate disease diagnosis and treatment. The dual-network integrated logistic matrix factorization (DNILMF) model has been used for drug–target interaction prediction with good results. We first applied DNILMF to lncRNA–disease association prediction (DNILMF-LDA). We combined different similarity kernel matrices of lncRNAs and diseases by using nonlinear fusion to extract the most important information in the fused matrices. Then, lncRNA–disease association networks and similarity networks were built simultaneously. Finally, the Gaussian process mutual information (GP-MI) algorithm of Bayesian optimization was adopted to optimize the model parameters. The 10-fold cross-validation result showed that the area under the receiver operating characteristic (ROC) curve (AUC) of DNILMF-LDA was 0.9202, and the area under the precision-recall (PR) curve (AUPR) was 0.5610. Compared with LRLSLDA, SIMCLDA, BiwalkLDA, and TPGLDA, the AUC value of our method increased by 38.81%, 13.07%, 8.35%, and 6.75%, respectively, and the AUPR value increased by 52.66%, 40.05%, 37.01%, and 44.25%, respectively. These results indicate that DNILMF-LDA is an effective method for predicting the associations between lncRNAs and diseases.
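For readers unfamiliar with the two metrics reported above, here is a small sketch of how AUC and AUPR can be computed from labels and prediction scores. The function names are hypothetical, and average precision is used as one common estimator of the area under the PR curve; the paper may use a different interpolation.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / k  # precision at the rank of each positive
    return total / hits

# Toy example: two known associations (label 1) and two unknown pairs (label 0).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)             # 3 of 4 positive-negative pairs ordered correctly
aupr = average_precision(labels, scores)  # mean of precision 1/1 and 2/3
```

Because known associations are rare relative to unknown pairs, AUPR is typically much lower than AUC on the same predictions, which is consistent with the gap between the two values reported in the abstract.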

Computational drug repositioning based on multi-similarities bilinear matrix factorization

Briefings in Bioinformatics ◽ 10.1093/bib/bbaa267 ◽ 2020

Author(s): Mengyun Yang ◽ Gaoyan Wu ◽ Qichang Zhao ◽ Yaohang Li ◽ Jianxin Wang

Keyword(s): Matrix Factorization ◽ Drug Repositioning ◽ Disease Association ◽ Biological Entity ◽ Biomedical Data ◽ Supplementary Data ◽ Practical Applications ◽ Disease Associations ◽ Association Matrix ◽ Similarity Matrices

With the development of high-throughput technology and the accumulation of biomedical data, prior information about biological entities can be calculated from different aspects. Specifically, drug–drug similarities can be measured from target profiles, drug–drug interactions and side effects. Similarly, different methods and data sources for calculating disease ontology can result in multiple measures of pairwise disease similarities. Therefore, in computational drug repositioning, developing a dynamic method to optimize the fusion of multiple similarities is a crucial and challenging task. In this study, we propose a multi-similarities bilinear matrix factorization (MSBMF) method to predict promising drug-associated indications for existing and novel drugs. Instead of fusing multiple similarities into a single similarity matrix, we concatenate the similarity matrices of drugs and of diseases, respectively. Applying matrix factorization methods, we decompose the drug–disease association matrix into a drug-feature matrix and a disease-feature matrix. At the same time, using these feature matrices as a basis, we extract effective latent features representing the drug and disease similarity matrices to infer missing drug–disease associations. Moreover, the two factored matrices are constrained to be non-negative to ensure that the completed drug–disease association matrix is biologically interpretable. In addition, we numerically solve the MSBMF model with an efficient alternating direction method of multipliers algorithm. The computational experiment results show that MSBMF obtains higher prediction accuracy than state-of-the-art drug repositioning methods in cross-validation experiments. Case studies also demonstrate the effectiveness of our proposed method in practical applications. Availability: The data and code of MSBMF are freely available at https://github.com/BioinformaticsCSU/MSBMF. Corresponding author: Jianxin Wang, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China. Supplementary Data: Supplementary data are available online at https://academic.oup.com/bib.
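The concatenation step described in the abstract, as opposed to fusing similarities into one matrix, can be illustrated with a short sketch. The helper name `concat_similarities` and the toy matrices are assumptions; this shows only the row-wise concatenation, not the MSBMF factorization itself.

```python
def concat_similarities(mats):
    """Horizontally concatenate several n x n similarity matrices.

    Hypothetical helper: returns an n x (n * len(mats)) matrix in which row i
    holds entity i's similarity profile under every measure, side by side.
    """
    n = len(mats[0])
    return [[value for m in mats for value in m[i]] for i in range(n)]

# Two toy drug-drug similarity measures for the same two drugs.
target_sim = [[1.0, 0.2],
              [0.2, 1.0]]
side_effect_sim = [[1.0, 0.5],
                   [0.5, 1.0]]
combined = concat_similarities([target_sim, side_effect_sim])
```

Keeping the measures side by side preserves each one's information, leaving it to the downstream factorization to weigh them rather than committing to a fixed fusion rule.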

Bipartite graph-based collaborative matrix factorization method for predicting miRNA-disease associations

BMC Bioinformatics ◽ 10.1186/s12859-021-04486-w ◽ 2021 ◽ Vol 22 (1)

Author(s): Feng Zhou ◽ Meng-Meng Yin ◽ Cui-Na Jiao ◽ Zhen Cui ◽ Jing-Xiu Zhao ◽ ...

Keyword(s): Bipartite Graph ◽ Matrix Factorization ◽ Cross Validation ◽ Rapid Development ◽ Factorization Method ◽ Computational Method ◽ Human Diseases ◽ Simulation Experiments ◽ Disease Associations ◽ Fold Cross Validation

Background: With the rapid development of advanced biotechnologies, researchers have realized that microRNAs (miRNAs) play critical roles in many serious human diseases. However, experimental identification of new miRNA–disease associations (MDAs) is expensive and time-consuming. Practitioners have shown growing interest in methods for predicting potential MDAs. In recent years, an increasing number of computational methods for predicting novel MDAs have been developed, contributing substantially to research on human diseases and saving considerable time. In this paper, we propose an efficient computational method, named bipartite graph-based collaborative matrix factorization (BGCMF), for predicting novel MDAs. Results: By combining two improved recommendation methods, a new model for predicting MDAs is generated. Because some new miRNAs and diseases do not have any known associations, we adopt a bipartite graph based on the collaborative matrix factorization method to complete the prediction. BGCMF achieves an AUC of up to 0.9514 ± 0.0007 in five-fold cross-validation experiments. Conclusions: Five-fold cross-validation is used to evaluate the capabilities of our method, and simulation experiments are implemented to predict new MDAs. The AUC value of our method is higher than those of some state-of-the-art methods. Finally, many associations between new miRNAs and new diseases are successfully predicted in the simulation experiments, indicating that BGCMF is a useful method for predicting more potential miRNAs with roles in various diseases.
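The five-fold cross-validation protocol mentioned above can be sketched as follows. This is a generic split of known association pairs into folds; the function name `k_fold_splits` and the toy pair list are assumptions, and the abstract does not specify details such as how unknown pairs are handled during evaluation.

```python
import random

def k_fold_splits(pairs, k=5, seed=0):
    """Yield (train, test) lists of known association pairs for k-fold cross-validation.

    Hypothetical helper: shuffles the pairs once, then holds out each fold in turn.
    """
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    folds = [pairs[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test

# Toy set of 20 known (miRNA index, disease index) associations.
known = [(m, d) for m in range(5) for d in range(4)]
splits = list(k_fold_splits(known, k=5))
```

In each round, the held-out associations are treated as unknown while the model is fitted on the rest, and the model is then scored on how highly it ranks the held-out pairs; repeating the whole procedure with different seeds gives a spread such as the ± value reported for the AUC.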

Benchmark of computational methods for predicting microRNA-disease associations

Genome Biology ◽ 10.1186/s13059-019-1811-3 ◽ 2019 ◽ Vol 20 (1) ◽ Cited By ~ 7

Author(s): Zhou Huang ◽ Leibo Liu ◽ Yuanxu Gao ◽ Jiangcheng Shi ◽ Qinghua Cui ◽ ...

Keyword(s): Performance Improvement ◽ Disease Association ◽ Prediction Methods ◽ Similarity Matrix ◽ Future Directions ◽ Novel miRNA ◽ Disease Associations ◽ Precision Recall Curve ◽ Single Predictor ◽ Recall Curve

Background: A series of miRNA-disease association prediction methods have been proposed to prioritize potential disease-associated miRNAs. Independent benchmarking of these methods is warranted to assess their effectiveness and robustness. Results: Based on more than 8000 novel miRNA-disease associations from the latest HMDD v3.1 database, we perform systematic comparison among 36 readily available prediction methods. Their overall performances are evaluated with rigorous precision-recall curve analysis, where 13 methods show acceptable accuracy (AUPRC > 0.200) while the top two methods achieve a promising AUPRC over 0.300, and most of these methods are also highly ranked when considering only the causal miRNA-disease associations as the positive samples. The potential of performance improvement is demonstrated by combining different predictors or adopting a more updated miRNA similarity matrix, which would result in up to 16% and 46% of AUPRC augmentations compared to the best single predictor and the predictors using the previous similarity matrix, respectively. Our analysis suggests a common issue of the available methods, which is that the prediction results are severely biased toward well-annotated diseases with many associated miRNAs known and cannot further stratify the positive samples by discriminating the causal miRNA-disease associations from the general miRNA-disease associations. Conclusion: Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate miRNA-disease association predictors for their purpose, but also suggest the future directions for the development of more robust miRNA-disease association predictors.

Deep belief network–Based Matrix Factorization Model for MicroRNA-Disease Associations Prediction

Evolutionary Bioinformatics ◽ 10.1177/1176934320919707 ◽ 2020 ◽ Vol 16 ◽ pp. 117693432091970 ◽ Cited By ~ 1

Author(s): Yulian Ding ◽ Fei Wang ◽ Xiujuan Lei ◽ Bo Liao ◽ Fang-Xiang Wu

Keyword(s): Matrix Factorization ◽ Critical Role ◽ Experimental Studies ◽ Score Function ◽ Disease Association ◽ Deep Belief Network ◽ Time Saving ◽ Belief Network ◽ Related Data ◽ Disease Associations

MicroRNAs (miRNAs) are small single-stranded noncoding RNAs that have been shown to play a critical role in regulating gene expression. In past decades, cumulative experimental studies have verified that miRNAs are implicated in many complex human diseases and might be potential biomarkers for various types of diseases. With the increase of miRNA-related data and the development of analysis methodologies, some computational methods have been developed for predicting miRNA-disease associations, which are more economical and time-saving than traditional biological experimental approaches. In this study, a novel computational model, deep belief network (DBN)-based matrix factorization (DBN-MF), is proposed for miRNA-disease association prediction. First, the raw interaction features of miRNAs and diseases were obtained from the miRNA-disease adjacency matrix. Second, 2 DBNs were used for unsupervised learning of the features of miRNAs and diseases, respectively, based on the raw interaction features. Finally, a classifier consisting of 2 DBNs and a cosine score function was trained with the initial weights of the DBNs from the previous step. During the training, the miRNA-disease adjacency matrix was factorized into 2 feature matrices for the representation of miRNAs and diseases, and the final prediction label was obtained according to the feature matrices. The experimental results show that the proposed model outperforms the state-of-the-art approaches in miRNA-disease association prediction based on 10-fold cross-validation. In addition, the effectiveness of our model was further demonstrated by case studies.
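The cosine score function mentioned above compares a miRNA feature vector with a disease feature vector. A minimal sketch is given below; the function name `cosine_score` and the toy vectors are assumptions, and in DBN-MF the feature vectors would come from the trained networks rather than being written by hand.

```python
import math

def cosine_score(u, v):
    """Cosine similarity between two feature vectors; returns 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy learned features for one miRNA and two diseases (values are illustrative only).
mirna = [1.0, 1.0]
disease_a = [1.0, 0.0]
disease_b = [-1.0, 1.0]
score_a = cosine_score(mirna, disease_a)  # vectors 45 degrees apart
score_b = cosine_score(mirna, disease_b)  # orthogonal vectors
```

Because the cosine depends only on the angle between the vectors, the score is insensitive to the overall scale of the learned features.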

A network similarity integration method for predicting microRNA-disease associations

RSC Advances ◽ 10.1039/c7ra05348g ◽ 2017 ◽ Vol 7 (51) ◽ pp. 32216-32224 ◽ Cited By ~ 5

Author(s): Xiaoying Li ◽ Yaping Lin ◽ Changlong Gu

Keyword(s): Integration Method ◽ Disease Association ◽ Association Network ◽ Similarity Network ◽ Novel miRNA ◽ Disease Similarity ◽ Disease Associations ◽ Network Similarity

The NSIM integrates the disease similarity network, miRNA similarity network, and known miRNA-disease association network on the basis of cousin similarity to predict not only novel miRNA-disease associations but also isolated diseases.

GBDTLRL2D Predicts LncRNA–Disease Associations Using MetaGraph2Vec and K-Means Based on Heterogeneous Network

Frontiers in Cell and Developmental Biology ◽ 10.3389/fcell.2021.753027 ◽ 2021 ◽ Vol 9

Author(s): Tao Duan ◽ Zhufang Kuang ◽ Jiaqi Wang ◽ Zhihao Ma

Keyword(s): Heterogeneous Networks ◽ Noncoding RNA ◽ Clustering Algorithm ◽ Characteristic Curve ◽ Feature Learning ◽ Structural Features ◽ Disease Association ◽ Gradient Boosting ◽ Negative Sample ◽ Disease Associations

In recent years, the long noncoding RNA (lncRNA) has been shown to be involved in many disease processes. The prediction of the lncRNA–disease association is helpful to clarify the mechanism of disease occurrence and bring some new methods of disease prevention and treatment. The current methods for predicting the potential lncRNA–disease association seldom consider the heterogeneous networks with complex node paths, and these methods have the problem of unbalanced positive and negative samples. To solve this problem, a method based on the Gradient Boosting Decision Tree (GBDT) and logistic regression (LR) to predict the lncRNA–disease association (GBDTLRL2D) is proposed in this paper. MetaGraph2Vec is used for feature learning, and negative sample sets are selected by using K-means clustering. The innovation of the GBDTLRL2D is that the clustering algorithm is used to select a representative negative sample set, and the use of MetaGraph2Vec can better retain the semantic and structural features in heterogeneous networks. The average area under the receiver operating characteristic curve (AUC) values of GBDTLRL2D obtained on the three datasets are 0.98, 0.98, and 0.96 in 10-fold cross-validation.

SRMDAP: SimRank and Density-Based Clustering Recommender Model for miRNA-Disease Association Prediction

BioMed Research International ◽ 10.1155/2018/5747489 ◽ 2018 ◽ Vol 2018 ◽ pp. 1-11

Author(s): Xiaoying Li ◽ Yaping Lin ◽ Changlong Gu ◽ Zejun Li

Keyword(s): Cross Validation ◽ Computational Method ◽ Human Diseases ◽ Excellent Performance ◽ Aberrant Expression ◽ Experimental Identification ◽ Density Based Clustering ◽ Disease Associations ◽ Leave One Out ◽ The Relationship

Aberrant expression of microRNAs (miRNAs) can be applied for the diagnosis, prognosis, and treatment of human diseases. Identifying the relationship between miRNAs and human diseases is important for further investigating the pathogenesis of human diseases. However, experimental identification of the associations between diseases and miRNAs is time-consuming and expensive. Computational methods are efficient approaches to determine the potential associations between diseases and miRNAs. This paper presents a new computational method based on the SimRank and density-based clustering recommender model for miRNA-disease association prediction (SRMDAP). An AUC of 0.8838 based on leave-one-out cross-validation, together with case studies, suggested the excellent performance of SRMDAP in predicting miRNA-disease associations. SRMDAP could also make predictions for diseases without any related miRNAs and for miRNAs without any related diseases.

A Novel Computational Method for the Identification of Potential miRNA-Disease Association Based on Symmetric Non-negative Matrix Factorization and Kronecker Regularized Least Square

Frontiers in Genetics ◽ 10.3389/fgene.2018.00324 ◽ 2018 ◽ Vol 9 ◽ Cited By ~ 17

Author(s): Yan Zhao ◽ Xing Chen ◽ Jun Yin

Keyword(s): Matrix Factorization ◽ Disease Association ◽ Least Square ◽ Computational Method ◽ Non Negative Matrix Factorization

Prediction of miRNA-Disease Association Using Deep Collaborative Filtering

BioMed Research International ◽ 10.1155/2021/6652948 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-16

Author(s): Li Wang ◽ Cheng Zhong

Keyword(s): Collaborative Filtering ◽ Cross Validation ◽ Kidney Neoplasms ◽ Feature Vector ◽ High Failure Rate ◽ Experimental Identification ◽ Disease Similarity ◽ Disease Associations ◽ Novel Method ◽ Fold Cross Validation

Existing studies have shown that miRNAs are related to human diseases through their regulation of gene expression. Identifying miRNA associations with diseases will contribute to the diagnosis, treatment, and prognosis of diseases. The experimental identification of miRNA-disease associations is time-consuming, tremendously expensive, and has a high failure rate. In recent years, many researchers have predicted potential associations between miRNAs and diseases by computational approaches. In this paper, we propose a novel method using deep collaborative filtering, called DCFMDA, to predict potential miRNA-disease associations. To improve prediction performance, we integrated neural network matrix factorization (NNMF) and a multilayer perceptron (MLP) in a deep collaborative filtering framework. We used known miRNA-disease associations to capture miRNA-disease interaction features with NNMF, and used miRNA similarity and disease similarity to extract the miRNA feature vector and disease feature vector, respectively, with the MLP. Finally, we merged the outputs of the NNMF and the MLP to obtain the prediction matrix. The experimental results indicate that, compared with other existing computational methods, our method can achieve an AUC of 0.9466 based on 10-fold cross-validation. In addition, case studies show that DCFMDA can effectively predict candidate miRNAs for breast neoplasms, colon neoplasms, kidney neoplasms, leukemia, and lymphoma.
