MicroRNA-disease association prediction by matrix tri-factorization

Abstract Background: Biological evidence has shown that microRNAs (miRNAs) are heavily implicated in various biological processes involved in human diseases. The identification of miRNA-disease associations (MDAs) is beneficial to disease diagnosis as well as treatment. Due to the high cost of biological experiments, predicting MDAs by computational approaches is attracting increasing attention. Results: In this work, we propose a novel model, MTFMDA, for miRNA-disease association prediction by matrix tri-factorization, based on the known miRNA-disease associations, two types of miRNA similarities, and two types of disease similarities. The main idea of MTFMDA is to factorize the miRNA-disease association matrix into three matrices: a feature matrix for miRNAs, a feature matrix for diseases, and a low-rank relationship matrix. Our model incorporates Laplacian regularizers that force the feature matrices to preserve the similarities of miRNAs or diseases. A novel algorithm is proposed to solve the optimization problem. Conclusions: We evaluate our model by 5-fold cross validation using known MDAs from HMDD V2.0 and show that our model obtains significantly higher AUCs than all the state-of-the-art methods compared. We further validate our method by applying it to colon and breast neoplasms in two different experimental settings. The newly identified miRNAs associated with the two diseases could be verified in two other databases, dbDEMC and HMDD V3.0, which further shows the power of our proposed method.
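The factorization described in this abstract can be illustrated with a small sketch. It assumes an objective of the form ||Y - U S V^T||_F^2 + alpha * tr(U^T L_m U) + beta * tr(V^T L_d V), where L_m and L_d are unnormalized graph Laplacians of the miRNA and disease similarity matrices. The abstract does not give the exact objective or the authors' solver, so the function names, the weights alpha and beta, and the plain gradient-descent updates below are all assumptions made for illustration, not the MTFMDA algorithm itself.

```python
import numpy as np

def laplacian(sim):
    """Unnormalized graph Laplacian L = D - W of a symmetric similarity matrix."""
    return np.diag(sim.sum(axis=1)) - sim

def objective(Y, U, S, V, Lm, Ld, alpha, beta):
    """Reconstruction error plus the two Laplacian smoothness penalties."""
    resid = Y - U @ S @ V.T
    return (np.sum(resid ** 2)
            + alpha * np.trace(U.T @ Lm @ U)
            + beta * np.trace(V.T @ Ld @ V))

def tri_factorize(Y, sim_m, sim_d, rank=2, alpha=0.1, beta=0.1,
                  lr=0.01, steps=500, seed=0):
    """Hypothetical helper: fit Y ~ U S V^T by plain gradient descent."""
    rng = np.random.default_rng(seed)
    n_m, n_d = Y.shape
    U = 0.1 * rng.standard_normal((n_m, rank))   # miRNA feature matrix
    V = 0.1 * rng.standard_normal((n_d, rank))   # disease feature matrix
    S = np.eye(rank)                             # small relationship matrix
    Lm, Ld = laplacian(sim_m), laplacian(sim_d)
    history = [objective(Y, U, S, V, Lm, Ld, alpha, beta)]
    for _ in range(steps):
        resid = U @ S @ V.T - Y
        # Gradients of the assumed objective (Laplacians are symmetric).
        grad_U = 2 * resid @ V @ S.T + 2 * alpha * Lm @ U
        grad_S = 2 * U.T @ resid @ V
        grad_V = 2 * resid.T @ U @ S + 2 * beta * Ld @ V
        U = U - lr * grad_U
        S = S - lr * grad_S
        V = V - lr * grad_V
        history.append(objective(Y, U, S, V, Lm, Ld, alpha, beta))
    return U, S, V, history
```

In this sketch the predicted association scores would be read from `U @ S @ V.T`; entries that are zero in `Y` but receive high scores are the candidate new associations.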


DNILMF-LDA: Prediction of lncRNA-Disease Associations by Dual-Network Integrated Logistic Matrix Factorization and Bayesian Optimization

Genes ◽

10.3390/genes10080608 ◽

2019 ◽

Vol 10 (8) ◽

pp. 608 ◽

Cited By ~ 3

Author(s):

Yan Li ◽

Junyi Li ◽

Naizheng Bian

Keyword(s):

Matrix Factorization ◽

Disease Diagnosis ◽

Disease Association ◽

Bayesian Optimization ◽

Model Parameters ◽

Target Interaction ◽

Disease Associations ◽

AUC Value ◽

Similarity Networks ◽

Fold Cross Validation

Identifying associations between lncRNAs and diseases can help understand disease-related lncRNAs and facilitate disease diagnosis and treatment. The dual-network integrated logistic matrix factorization (DNILMF) model has been used for drug–target interaction prediction, and good results have been achieved. We applied DNILMF to lncRNA–disease association prediction (DNILMF-LDA) for the first time. We combined different similarity kernel matrices of lncRNAs and diseases by using nonlinear fusion to extract the most important information in the fused matrices. Then, lncRNA–disease association networks and similarity networks were built simultaneously. Finally, the Gaussian process mutual information (GP-MI) algorithm of Bayesian optimization was adopted to optimize the model parameters. The 10-fold cross-validation result showed that the area under the receiver operating characteristic (ROC) curve (AUC) of DNILMF-LDA was 0.9202, and the area under the precision-recall (PR) curve (AUPR) was 0.5610. Compared with LRLSLDA, SIMCLDA, BiwalkLDA, and TPGLDA, the AUC value of our method increased by 38.81%, 13.07%, 8.35%, and 6.75%, respectively. The AUPR value of our method increased by 52.66%, 40.05%, 37.01%, and 44.25%, respectively. These results indicate that DNILMF-LDA is an effective method for predicting the associations between lncRNAs and diseases.
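Several abstracts in this listing report AUC and AUPR values. As a reminder of what those two numbers measure, here is a minimal pure-Python sketch. The function names are hypothetical, AUC is computed as the probability that a random positive outranks a random negative (ties counted as one half), and AUPR is approximated by average precision, which is one common estimate and not necessarily the one used in the paper above.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Mean of precision@k taken at the rank k of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / k
    if hits == 0:
        raise ValueError("need at least one positive")
    return total / hits

# Toy example: five candidate associations, two of them true.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
toy_labels = [1, 0, 1, 0, 0]
print(roc_auc(toy_scores, toy_labels), average_precision(toy_scores, toy_labels))
```

In a k-fold cross-validation setting, these functions would be applied to the scores of the held-out associations in each fold and the results averaged.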


A Novel Model for Predicting Associations between Diseases and LncRNA-miRNA Pairs Based on a Newly Constructed Bipartite Network

Computational and Mathematical Methods in Medicine ◽

10.1155/2018/6789089 ◽

2018 ◽

Vol 2018 ◽

pp. 1-11 ◽

Cited By ~ 1

Author(s):

Shunxian Zhou ◽

Zhanwei Xuan ◽

Lei Wang ◽

Pengyao Ping ◽

Tingrui Pei

Keyword(s):

Computational Models ◽

Disease Diagnosis ◽

Disease Association ◽

Bipartite Network ◽

Association Network ◽

Disease Biomarkers ◽

Validation Framework ◽

Treatment Prognosis ◽

Novel Model ◽

Fold Cross Validation

Motivation. A growing number of studies have demonstrated that many complex human diseases are associated not only with microRNAs (miRNAs) but also with long noncoding RNAs (lncRNAs). LncRNAs and miRNAs play significant roles in various biological processes. Therefore, developing effective computational models for predicting novel associations between diseases and lncRNA-miRNA pairs (LMPairs) will benefit not only the understanding of disease mechanisms at the lncRNA-miRNA level and the detection of disease biomarkers for disease diagnosis, treatment, prognosis, and prevention, but also the understanding of interactions between diseases and LMPairs at the disease level. Results. It is well known that genes with similar functions are often associated with similar diseases. In this article, a novel model named PADLMP for predicting associations between diseases and LMPairs is proposed. In this model, a Disease-LncRNA-miRNA (DLM) tripartite network was first designed by integrating the lncRNA-disease association network and the miRNA-disease association network; then we constructed the disease-LMPair bipartite association network based on the DLM network and the lncRNA-miRNA association network; finally, we predicted potential associations between diseases and LMPairs based on the newly constructed disease-LMPair network. Simulation results show that PADLMP can achieve AUCs of 0.9318, 0.9090 ± 0.0264, and 0.8950 ± 0.0027 in the LOOCV, 2-fold, and 5-fold cross validation frameworks, respectively, which demonstrates the reliable prediction performance of PADLMP.
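The network construction step in this abstract can be sketched as follows. The abstract does not state the exact rule for linking a disease to an LMPair, so this sketch assumes that every known lncRNA-miRNA interaction forms one LMPair and that a disease is linked to a pair when it is associated with both members of that pair. That rule, the function name, and the toy identifiers are assumptions for illustration only.

```python
def build_disease_pair_network(lnc_disease, mir_disease, lnc_mir):
    """Return edges (disease, (lncRNA, miRNA)) of a disease-LMPair bipartite network.

    lnc_disease: set of (lncRNA, disease) associations
    mir_disease: set of (miRNA, disease) associations
    lnc_mir:     set of (lncRNA, miRNA) interactions, each forming one LMPair
    """
    diseases_of_lnc = {}
    for lnc, disease in lnc_disease:
        diseases_of_lnc.setdefault(lnc, set()).add(disease)
    diseases_of_mir = {}
    for mir, disease in mir_disease:
        diseases_of_mir.setdefault(mir, set()).add(disease)

    edges = set()
    for lnc, mir in lnc_mir:
        # Assumed rule: keep diseases shared by both members of the pair.
        shared = diseases_of_lnc.get(lnc, set()) & diseases_of_mir.get(mir, set())
        for disease in shared:
            edges.add((disease, (lnc, mir)))
    return edges

# Toy data with made-up identifiers.
toy_edges = build_disease_pair_network(
    lnc_disease={("lncA", "d1"), ("lncA", "d2"), ("lncB", "d2")},
    mir_disease={("mir1", "d1"), ("mir2", "d2")},
    lnc_mir={("lncA", "mir1"), ("lncA", "mir2"), ("lncB", "mir1")},
)
print(sorted(toy_edges))
```

Prediction would then operate on this bipartite edge set, for example by scoring unlinked disease-pair combinations; that scoring step is not shown here.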


Computational drug repositioning based on multi-similarities bilinear matrix factorization

Briefings in Bioinformatics ◽

10.1093/bib/bbaa267 ◽

2020 ◽

Author(s):

Mengyun Yang ◽

Gaoyan Wu ◽

Qichang Zhao ◽

Yaohang Li ◽

Jianxin Wang

Keyword(s):

Matrix Factorization ◽

Drug Repositioning ◽

Disease Association ◽

Biological Entity ◽

Biomedical Data ◽

Supplementary Data ◽

Practical Applications ◽

Disease Associations ◽

Association Matrix ◽

Similarity Matrices

Abstract With the development of high-throughput technology and the accumulation of biomedical data, prior information about biological entities can be calculated from different aspects. Specifically, drug–drug similarities can be measured from target profiles, drug–drug interactions and side effects. Similarly, different methods and data sources for calculating disease ontology can result in multiple measures of pairwise disease similarity. Therefore, in computational drug repositioning, developing a dynamic method to optimize the fusion of multiple similarities is a crucial and challenging task. In this study, we propose a multi-similarities bilinear matrix factorization (MSBMF) method to predict promising drug-associated indications for existing and novel drugs. Instead of fusing multiple similarities into a single similarity matrix, we concatenate the similarity matrices of drugs and of diseases, respectively. Applying matrix factorization methods, we decompose the drug–disease association matrix into a drug-feature matrix and a disease-feature matrix. At the same time, using these feature matrices as a basis, we extract effective latent features representing the drug and disease similarity matrices to infer missing drug–disease associations. Moreover, the two factor matrices are constrained to be non-negative to ensure that the completed drug–disease association matrix is biologically interpretable. In addition, we numerically solve the MSBMF model by an efficient alternating direction method of multipliers algorithm. The computational experiment results show that MSBMF obtains higher prediction accuracy than the state-of-the-art drug repositioning methods in cross-validation experiments. Case studies also demonstrate the effectiveness of our proposed method in practical applications. Availability: The data and code of MSBMF are freely available at https://github.com/BioinformaticsCSU/MSBMF.
Corresponding author: Jianxin Wang, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China. Supplementary Data: Supplementary data are available online at https://academic.oup.com/bib.


Predicting miRNA-Disease Association Based on Modularity Preserving Heterogeneous Network Embedding

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.603758 ◽

2021 ◽

Vol 9 ◽

Author(s):

Wei Peng ◽

Jielin Du ◽

Wei Dai ◽

Wei Lan

Keyword(s):

Heterogeneous Network ◽

Structural Information ◽

Disease Diagnosis ◽

Disease Association ◽

Biological Information ◽

Vector Representation ◽

Network Embedding ◽

Prediction Ability ◽

Similarity Network ◽

Disease Associations

MicroRNAs (miRNAs) are a category of small non-coding RNAs that profoundly impact various biological processes related to human disease. Inferring potential miRNA-disease associations benefits the study of human diseases, including disease prevention, disease diagnosis, and drug development. In this work, we propose a novel heterogeneous network embedding-based method called MDN-NMTF (Module-based Dynamic Neighborhood Non-negative Matrix Tri-Factorization) for predicting miRNA-disease associations. MDN-NMTF constructs a heterogeneous network from a disease similarity network, a miRNA similarity network and a known miRNA-disease association network. After that, it learns the latent vector representations of miRNAs and diseases in the heterogeneous network. Finally, the association probability is computed as the product of the latent miRNA and disease vectors. MDN-NMTF not only successfully integrates diverse biological information about miRNAs and diseases to predict miRNA-disease associations, but also considers the module properties of miRNAs and diseases while learning the vector representations, which can maximally preserve the structural information and properties of the heterogeneous network. At the same time, we also extend MDN-NMTF to a new version (called MDN-NMTF2) by using modular information to improve the miRNA-disease association prediction ability. Our methods and four other existing methods are applied to predict miRNA-disease associations in four databases. The prediction results show that our methods improve miRNA-disease association prediction compared with the four existing methods.


Hierarchical Extension Based on the Boolean Matrix for LncRNA-Disease Association Prediction

Current Molecular Medicine ◽

10.2174/1566524019666191119104212 ◽

2020 ◽

Vol 20 (6) ◽

pp. 452-460

Author(s):

Lin Tang ◽

Yu Liang ◽

Xin Jin ◽

Lin Liu ◽

Wei Zhou

Keyword(s):

Computational Models ◽

Characteristic Curve ◽

Experimental Studies ◽

Potential Method ◽

Disease Association ◽

Biological Data ◽

Boolean Matrix ◽

Disease Associations ◽

Association Data ◽

Association Matrix

Background: Accumulating experimental studies have demonstrated that long non-coding RNAs (LncRNAs) play crucial roles in the occurrence and development of various complex human diseases. Nonetheless, only a small portion of LncRNA–disease associations have been experimentally verified at present. Automatically predicting LncRNA–disease associations based on computational models can save the huge cost of wet-lab experiments. Methods and Result: To develop effective computational models that integrate various heterogeneous biological data for the identification of potential disease-LncRNA associations, we propose a hierarchical extension based on the Boolean matrix for LncRNA–disease association prediction (HEBLDA). HEBLDA discovers the intrinsic hierarchical correlation based on the property of the Boolean matrix from various relational sources. Then, HEBLDA integrates these hierarchical associated matrices by fusion weights. Finally, HEBLDA uses the hierarchical associated matrix to reconstruct the LncRNA–disease association matrix by hierarchical extending. HEBLDA is able to work for potential diseases or LncRNAs without known association data. In 5-fold cross-validation experiments, HEBLDA obtained an area under the receiver operating characteristic curve (AUC) of 0.8913, improving on previous classical methods. Besides, case studies show that HEBLDA can accurately predict candidate diseases for several LncRNAs. Conclusion: Based on its ability to discover the richer correlated structure of various data sources, we anticipate that HEBLDA is a potential method that can obtain more comprehensive association prediction in a broad field.


MCCMF: Collaborative matrix factorization based on matrix completion for predicting miRNA-disease associations

10.21203/rs.3.rs-36602/v2 ◽

2020 ◽

Author(s):

Tian-Ru Wu ◽

Meng-Meng Yin ◽

Cui-Na Jiao ◽

Ying-Lian Gao ◽

Xiang-Zhen Kong ◽

...

Keyword(s):

Matrix Factorization ◽

Cross Validation ◽

Matrix Completion ◽

Similarity Matrix ◽

Validation Experiment ◽

Disease Associations ◽

AUC Value ◽

Regulatory Functions ◽

Association Matrix ◽

Fold Cross Validation

Abstract Background: MicroRNAs (miRNAs) are non-coding RNAs with regulatory functions. Many studies have shown that miRNAs are closely associated with human diseases. Among the methods used to explore the relationship between miRNAs and diseases, traditional methods are time-consuming and their accuracy needs to be improved. In view of the shortcomings of previous models, a collaborative matrix factorization based on matrix completion (MCCMF) is proposed to predict unknown miRNA-disease associations. Results: The complete matrix of miRNAs and diseases is obtained by matrix completion. Moreover, the Gaussian Interaction Profile (GIP) kernel is added to the miRNA functional similarity matrix and the disease semantic similarity matrix to form the GIP kernel similarity matrix. Then the Weighted K Nearest Known Neighbors (WKNKN) method is used to preprocess the association matrix, so that the model is closer to reality. Finally, the collaborative matrix factorization (CMF) method is applied to obtain the prediction results. MCCMF obtains a satisfactory result in five-fold cross-validation, with an AUC of 0.9569 (0.0005). Conclusions: The AUC value of MCCMF is higher than that of other advanced methods in the 5-fold cross-validation experiment. In order to comprehensively evaluate the performance of MCCMF, accuracy, precision, recall and f-measure are also reported. The final experimental results demonstrate that MCCMF outperforms other methods in predicting miRNA-disease associations. Finally, the effectiveness and practicability of MCCMF are further verified by studying three specific diseases.
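The Gaussian Interaction Profile kernel mentioned in this abstract is commonly defined as K(i, j) = exp(-gamma * ||y_i - y_j||^2), where y_i is the row (or column) of the binary association matrix for entity i and gamma is a bandwidth divided by the mean squared norm of the profiles. The sketch below follows that common definition. The function name and the default bandwidth of 1.0 are assumptions, and how MCCMF combines this kernel with the functional or semantic similarity is not specified in the abstract, so that step is not shown.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel over the rows of a 0/1 matrix.

    profiles: list of equal-length lists, one interaction profile per entity
    gamma_prime: bandwidth parameter (assumed default of 1.0)
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all profiles are empty; kernel bandwidth undefined")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel

# Toy miRNA-by-disease association rows (made-up data).
toy_profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
for row in gip_kernel(toy_profiles):
    print([round(v, 4) for v in row])
```

Applying the same function to the columns of the association matrix (its transpose) would give the corresponding kernel for diseases.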


MCCMF: Collaborative matrix factorization based on matrix completion for predicting miRNA-disease associations

10.21203/rs.3.rs-36602/v3 ◽

2020 ◽

Author(s):

Tian-Ru Wu ◽

Meng-Meng Yin ◽

Cui-Na Jiao ◽

Ying-Lian Gao ◽

Xiang-Zhen Kong ◽

...

Keyword(s):

Matrix Factorization ◽

Cross Validation ◽

Matrix Completion ◽

Similarity Matrix ◽

Evaluation Indexes ◽

Validation Experiment ◽

Disease Associations ◽

Regulatory Functions ◽

Association Matrix ◽

Fold Cross Validation

Abstract Background: MicroRNAs (miRNAs) are non-coding RNAs with regulatory functions. Many studies have shown that miRNAs are closely associated with human diseases. Among the methods used to explore the relationship between miRNAs and diseases, traditional methods are time-consuming and their accuracy needs to be improved. In view of the shortcomings of previous models, a collaborative matrix factorization based on matrix completion (MCCMF) is proposed to predict unknown miRNA-disease associations. Results: The complete matrix of miRNAs and diseases is obtained by matrix completion. Moreover, the Gaussian Interaction Profile (GIP) kernel is added to the miRNA functional similarity matrix and the disease semantic similarity matrix to form the GIP kernel similarity matrix. Then the Weighted K Nearest Known Neighbors (WKNKN) method is used to preprocess the association matrix, so that the model is closer to reality. Finally, the collaborative matrix factorization (CMF) method is applied to obtain the prediction results. MCCMF obtains a satisfactory result in five-fold cross-validation, with an AUC of 0.9569 (0.0005). Conclusions: The AUC value of MCCMF is higher than that of other advanced methods in the 5-fold cross-validation experiment. In order to comprehensively evaluate the performance of MCCMF, f-measure and other evaluation indexes are also reported. The final experimental results demonstrate that MCCMF outperforms other methods in predicting miRNA-disease associations. Finally, the effectiveness and practicability of MCCMF are further verified by studying three specific diseases.


MCCMF: Collaborative matrix factorization based on matrix completion for predicting miRNA-disease associations

10.21203/rs.3.rs-36602/v1 ◽

2020 ◽

Author(s):

Tian-Ru Wu ◽

Meng-Meng Yin ◽

Cui-Na Jiao ◽

Jin-Xing Liu ◽

Ying-Lian Gao ◽

...

Keyword(s):

Matrix Factorization ◽

Cross Validation ◽

Matrix Completion ◽

Similarity Matrix ◽

Evaluation Indexes ◽

Validation Experiment ◽

Disease Associations ◽

Regulatory Functions ◽

Association Matrix ◽

Fold Cross Validation

Abstract Background: MicroRNAs (miRNAs) are non-coding RNAs with regulatory functions. Many studies have shown that miRNAs are closely associated with human diseases. Among the methods used to explore the relationship between miRNAs and diseases, traditional methods are time-consuming and their accuracy needs to be improved. In view of the shortcomings of previous models, a collaborative matrix factorization based on matrix completion (MCCMF) is proposed to predict unknown miRNA-disease associations. Results: The complete matrix of miRNAs and diseases is obtained by matrix completion. Moreover, the Gaussian Interaction Profile (GIP) kernel is added to the miRNA functional similarity matrix and the disease semantic similarity matrix to form the GIP kernel similarity matrix. Then the Weighted K Nearest Known Neighbors (WKNKN) method is used to preprocess the association matrix, so that the model is closer to reality. Finally, the collaborative matrix factorization (CMF) method is applied to obtain the prediction results. MCCMF obtains a satisfactory result in five-fold cross-validation, with an AUC of 0.9569 (0.0005). Conclusions: The AUC value of MCCMF is higher than that of other advanced methods in the 5-fold cross-validation experiment. In order to comprehensively evaluate the performance of MCCMF, f-measure and other evaluation indexes are also reported. The final experimental results demonstrate that MCCMF outperforms other methods in predicting miRNA-disease associations. Finally, the effectiveness and practicability of MCCMF are further verified by studying three specific diseases.


MCCMF: Collaborative matrix factorization based on matrix completion for predicting miRNA-disease associations

10.21203/rs.3.rs-36602/v4 ◽

2020 ◽

Author(s):

Tian-Ru Wu ◽

Meng-Meng Yin ◽

Cui-Na Jiao ◽

Ying-Lian Gao ◽

Xiang-Zhen Kong ◽

...

Keyword(s):

Matrix Factorization ◽

Cross Validation ◽

Matrix Completion ◽

Similarity Matrix ◽

Validation Experiment ◽

Disease Associations ◽

AUC Value ◽

Regulatory Functions ◽

Association Matrix ◽

Fold Cross Validation

Abstract Background: MicroRNAs (miRNAs) are non-coding RNAs with regulatory functions. Many studies have shown that miRNAs are closely associated with human diseases. Among the methods used to explore the relationship between miRNAs and diseases, traditional methods are time-consuming and their accuracy needs to be improved. In view of the shortcomings of previous models, a method called collaborative matrix factorization based on matrix completion (MCCMF) is proposed to predict unknown miRNA-disease associations. Results: The complete matrix of miRNAs and diseases is obtained by matrix completion. Moreover, the Gaussian Interaction Profile (GIP) kernel is added to the miRNA functional similarity matrix and the disease semantic similarity matrix. Then the Weighted K Nearest Known Neighbors (WKNKN) method is used to preprocess the association matrix, so that the model is closer to reality. Finally, the collaborative matrix factorization (CMF) method is applied to obtain the prediction results. MCCMF obtains a satisfactory result in five-fold cross-validation, with an AUC of 0.9569 (0.0005). Conclusions: The AUC value of MCCMF is higher than that of other advanced methods in the 5-fold cross-validation experiment. In order to comprehensively evaluate the performance of MCCMF, accuracy, precision, recall and f-measure are also reported. The final experimental results demonstrate that MCCMF outperforms other methods in predicting miRNA-disease associations. Finally, the effectiveness and practicability of MCCMF are further verified by studying three specific diseases.


Drug repositioning based on bounded nuclear norm regularization

Bioinformatics ◽

10.1093/bioinformatics/btz331 ◽

2019 ◽

Vol 35 (14) ◽

pp. i455-i463 ◽

Cited By ~ 18

Author(s):

Mengyun Yang ◽

Huimin Luo ◽

Yaohang Li ◽

Jianxin Wang

Keyword(s):

Recommendation System ◽

Drug Repositioning ◽

Matrix Completion ◽

Approximation Error ◽

Disease Association ◽

Nuclear Norm ◽

Low Rank ◽

Supplementary Information ◽

Disease Associations ◽

Nuclear Norm Regularization

Abstract Motivation Computational drug repositioning is a cost-effective strategy to identify novel indications for existing drugs. Drug repositioning is often modeled as a recommendation system problem. Taking advantage of the known drug–disease associations, the objective of the recommendation system is to identify new treatments by filling out the unknown entries in the drug–disease association matrix, which is known as matrix completion. Underpinned by the fact that common molecular pathways contribute to many different diseases, the recommendation system assumes that the underlying latent factors determining drug–disease associations are highly correlated. In other words, the drug–disease matrix to be completed is low-rank. Accordingly, matrix completion algorithms efficiently constructing low-rank drug–disease matrix approximations consistent with known associations can be of immense help in discovering the novel drug–disease associations. Results In this article, we propose to use a bounded nuclear norm regularization (BNNR) method to complete the drug–disease matrix under the low-rank assumption. Instead of strictly fitting the known elements, BNNR is designed to tolerate the noisy drug–drug and disease–disease similarities by incorporating a regularization term to balance the approximation error and the rank properties. Moreover, additional constraints are incorporated into BNNR to ensure that all predicted matrix entry values are within the specific interval. BNNR is carried out on an adjacency matrix of a heterogeneous drug–disease network, which integrates the drug–drug, drug–disease and disease–disease networks. It not only makes full use of available drugs, diseases and their association information, but also is capable of dealing with cold start naturally. Our computational results show that BNNR yields higher drug–disease association prediction accuracy than the current state-of-the-art methods. 
The most significant gain is in prediction precision, measured as the fraction of positive predictions that are truly positive, which is particularly useful in drug design practice. Case studies also confirm the accuracy and reliability of BNNR. Availability and implementation The code of BNNR is freely available at https://github.com/BioinformaticsCSU/BNNR. Supplementary information Supplementary data are available at Bioinformatics online.
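Two ingredients named in this abstract, nuclear norm regularization and bounding the predicted entries to an interval, can be illustrated with a short NumPy sketch. Singular value thresholding is the standard proximal step for the nuclear norm, and clipping enforces the interval. The loop below is a simplified soft-impute-style iteration written for illustration; it is not the solver used by BNNR, and the function names, the threshold `tau`, and the interval [0, 1] are assumptions.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: shrink every singular value by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete_bounded(M, mask, tau=0.5, steps=100, low=0.0, high=1.0):
    """Hypothetical helper: low-rank completion with entries kept in [low, high].

    M:    matrix holding the known entries (values at unknown positions are ignored)
    mask: boolean matrix, True where the entry of M is known
    """
    mask = np.asarray(mask, dtype=bool)
    X = np.where(mask, M, 0.0)
    for _ in range(steps):
        filled = np.where(mask, M, X)              # re-insert known entries
        X = np.clip(svt(filled, tau), low, high)   # low-rank step, then bound
    return X

# Toy association matrix with one hidden entry (made-up data).
toy_M = np.array([[1.0, 1.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
toy_mask = np.ones_like(toy_M, dtype=bool)
toy_mask[1, 1] = False
print(np.round(complete_bounded(toy_M, toy_mask), 3))
```

Because the thresholding step is applied last, the result does not reproduce the known entries exactly, which loosely mirrors the abstract's point about tolerating noise instead of strictly fitting the observed values.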
