scholarly journals Fusion of KATZ measure and space projection to fast probe potential lncRNA-disease associations in bipartite graphs

PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0260329
Author(s):  
Yi Zhang ◽  
Min Chen ◽  
Li Huang ◽  
Xiaolan Xie ◽  
Xin Li ◽  
...  

It is well known that numerous long noncoding RNAs (lncRNAs) closely relate to the physiological and pathological processes of human diseases and can serves as potential biomarkers. Therefore, lncRNA-disease associations that are identified by computational methods as the targeted candidates reduce the cost of biological experiments focusing on deep study furtherly. However, inaccurate construction of similarity networks and inadequate numbers of observed known lncRNA–disease associations, such inherent problems make many mature computational methods that have been developed for many years still exit some limitations. It motivates us to explore a new computational method that was fused with KATZ measure and space projection to fast probing potential lncRNA-disease associations (namely KATZSP). KATZSP is comprised of following key steps: combining all the global information with which to change Boolean network of known lncRNA–disease associations into the weighted networks; changing the similarities calculation into counting the number of walks that connect lncRNA nodes and disease nodes in bipartite graphs; obtaining the space projection scores to refine the primary prediction scores. The process to fuse KATZ measure and space projection was simplified and uncomplicated with needing only one attenuation factor. The leave-one-out cross validation (LOOCV) experimental results showed that, compared with other state-of-the-art methods (NCPLDA, LDAI-ISPS and IIRWR), KATZSP had a higher predictive accuracy shown with area-under-the-curve (AUC) value on the three datasets built, while KATZSP well worked on inferring potential associations related to new lncRNAs (or isolated diseases). The results from real cases study (such as pancreas cancer, lung cancer and colorectal cancer) further confirmed that KATZSP is capable of superior predictive ability to be applied as a guide for traditional biological experiments.

2020 ◽  
Vol 21 (4) ◽  
pp. 1508
Author(s):  
Yi Zhang ◽  
Min Chen ◽  
Ang Li ◽  
Xiaohui Cheng ◽  
Hong Jin ◽  
...  

Long non-coding RNAs (long ncRNAs, lncRNAs) of all kinds have been implicated in a range of cell developmental processes and diseases, while they are not translated into proteins. Inferring diseases associated lncRNAs by computational methods can be helpful to understand the pathogenesis of diseases, but those current computational methods still have not achieved remarkable predictive performance: such as the inaccurate construction of similarity networks and inadequate numbers of known lncRNA–disease associations. In this research, we proposed a lncRNA–disease associations inference based on integrated space projection scores (LDAI-ISPS) composed of the following key steps: changing the Boolean network of known lncRNA–disease associations into the weighted networks via combining all the global information (e.g., disease semantic similarities, lncRNA functional similarities, and known lncRNA–disease associations); obtaining the space projection scores via vector projections of the weighted networks to form the final prediction scores without biases. The leave-one-out cross validation (LOOCV) results showed that, compared with other methods, LDAI-ISPS had a higher accuracy with area-under-the-curve (AUC) value of 0.9154 for inferring diseases, with AUC value of 0.8865 for inferring new lncRNAs (whose associations related to diseases are unknown), with AUC value of 0.7518 for inferring isolated diseases (whose associations related to lncRNAs are unknown). A case study also confirmed the predictive performance of LDAI-ISPS as a helper for traditional biological experiments in inferring the potential LncRNA–disease associations and isolated diseases.


Cells ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 881 ◽  
Author(s):  
Jihwan Ha ◽  
Chihyun Park ◽  
Chanyoung Park ◽  
Sanghyun Park

The identification of potential microRNA (miRNA)-disease associations enables the elucidation of the pathogenesis of complex human diseases owing to the crucial role of miRNAs in various biologic processes and it yields insights into novel prognostic markers. In the consideration of the time and costs involved in wet experiments, computational models for finding novel miRNA-disease associations would be a great alternative. However, computational models, to date, are biased towards known miRNA-disease associations; this is not suitable for rare miRNAs (i.e., miRNAs with a few known disease associations) and uncommon diseases (i.e., diseases with a few known miRNA associations). This leads to poor prediction accuracies. The most straightforward way of improving the performance is by increasing the number of known miRNA-disease associations. However, due to lack of information, increasing attention has been paid to developing computational models that can handle insufficient data via a technical approach. In this paper, we present a general framework—improved prediction of miRNA-disease associations (IMDN)—based on matrix completion with network regularization to discover potential disease-related miRNAs. The success of adopting matrix factorization is demonstrated by its excellent performance in recommender systems. This approach considers a miRNA network as additional implicit feedback and makes predictions for disease associations relevant to a given miRNA based on its direct neighbors. Our experimental results demonstrate that IMDN achieved excellent performance with reliable area under the receiver operating characteristic (ROC) area under the curve (AUC) values of 0.9162 and 0.8965 in the frameworks of global and local leave-one-out cross-validations (LOOCV), respectively. Further, case studies demonstrated that our method can not only validate true miRNA-disease associations but also suggest novel disease-related miRNA candidates.


Author(s):  
Xing Chen ◽  
Tian-Hao Li ◽  
Yan Zhao ◽  
Chun-Chun Wang ◽  
Chi-Chi Zhu

Abstract MicroRNA (miRNA) plays an important role in the occurrence, development, diagnosis and treatment of diseases. More and more researchers begin to pay attention to the relationship between miRNA and disease. Compared with traditional biological experiments, computational method of integrating heterogeneous biological data to predict potential associations can effectively save time and cost. Considering the limitations of the previous computational models, we developed the model of deep-belief network for miRNA-disease association prediction (DBNMDA). We constructed feature vectors to pre-train restricted Boltzmann machines for all miRNA-disease pairs and applied positive samples and the same number of selected negative samples to fine-tune DBN to obtain the final predicted scores. Compared with the previous supervised models that only use pairs with known label for training, DBNMDA innovatively utilizes the information of all miRNA-disease pairs during the pre-training process. This step could reduce the impact of too few known associations on prediction accuracy to some extent. DBNMDA achieves the AUC of 0.9104 based on global leave-one-out cross validation (LOOCV), the AUC of 0.8232 based on local LOOCV and the average AUC of 0.9048 ± 0.0026 based on 5-fold cross validation. These AUCs are better than other previous models. In addition, three different types of case studies for three diseases were implemented to demonstrate the accuracy of DBNMDA. As a result, 84% (breast neoplasms), 100% (lung neoplasms) and 88% (esophageal neoplasms) of the top 50 predicted miRNAs were verified by recent literature. Therefore, we could conclude that DBNMDA is an effective method to predict potential miRNA-disease associations.


2018 ◽  
Vol 19 (11) ◽  
pp. 3410 ◽  
Author(s):  
Xiujuan Lei ◽  
Zengqiang Fang ◽  
Luonan Chen ◽  
Fang-Xiang Wu

CircRNAs have particular biological structure and have proven to play important roles in diseases. It is time-consuming and costly to identify circRNA-disease associations by biological experiments. Therefore, it is appealing to develop computational methods for predicting circRNA-disease associations. In this study, we propose a new computational path weighted method for predicting circRNA-disease associations. Firstly, we calculate the functional similarity scores of diseases based on disease-related gene annotations and the semantic similarity scores of circRNAs based on circRNA-related gene ontology, respectively. To address missing similarity scores of diseases and circRNAs, we calculate the Gaussian Interaction Profile (GIP) kernel similarity scores for diseases and circRNAs, respectively, based on the circRNA-disease associations downloaded from circR2Disease database (http://bioinfo.snnu.edu.cn/CircR2Disease/). Then, we integrate disease functional similarity scores and circRNA semantic similarity scores with their related GIP kernel similarity scores to construct a heterogeneous network made up of three sub-networks: disease similarity network, circRNA similarity network and circRNA-disease association network. Finally, we compute an association score for each circRNA-disease pair based on paths connecting them in the heterogeneous network to determine whether this circRNA-disease pair is associated. We adopt leave one out cross validation (LOOCV) and five-fold cross validations to evaluate the performance of our proposed method. In addition, three common diseases, Breast Cancer, Gastric Cancer and Colorectal Cancer, are used for case studies. Experimental results illustrate the reliability and usefulness of our computational method in terms of different validation measures, which indicates PWCDA can effectively predict potential circRNA-disease associations.


2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Haochen Zhao ◽  
Linai Kuang ◽  
Lei Wang ◽  
Zhanwei Xuan

Recently, accumulating laboratorial studies have indicated that plenty of long noncoding RNAs (lncRNAs) play important roles in various biological processes and are associated with many complex human diseases. Therefore, developing powerful computational models to predict correlation between lncRNAs and diseases based on heterogeneous biological datasets will be important. However, there are few approaches to calculating and analyzing lncRNA-disease associations on the basis of information about miRNAs. In this article, a new computational method based on distance correlation set is developed to predict lncRNA-disease associations (DCSLDA). Comparing with existing state-of-the-art methods, we found that the major novelty of DCSLDA lies in the introduction of lncRNA-miRNA-disease network and distance correlation set; thus DCSLDA can be applied to predict potential lncRNA-disease associations without requiring any known disease-lncRNA associations. Simulation results show that DCSLDA can significantly improve previous existing models with reliable AUC of 0.8517 in the leave-one-out cross-validation. Furthermore, while implementing DCSLDA to prioritize candidate lncRNAs for three important cancers, in the first 0.5% of forecast results, 17 predicted associations are verified by other independent studies and biological experimental studies. Hence, it is anticipated that DCSLDA could be a great addition to the biomedical research field.


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Da Xu ◽  
Hanxiao Xu ◽  
Yusen Zhang ◽  
Mingyi Wang ◽  
Wei Chen ◽  
...  

Abstract Background Microbes are closely related to human health and diseases. Identification of disease-related microbes is of great significance for revealing the pathological mechanism of human diseases and understanding the interaction mechanisms between microbes and humans, which is also useful for the prevention, diagnosis and treatment of human diseases. Considering the known disease-related microbes are still insufficient, it is necessary to develop effective computational methods and reduce the time and cost of biological experiments. Methods In this work, we developed a novel computational method called MDAKRLS to discover potential microbe-disease associations (MDAs) based on the Kronecker regularized least squares. Specifically, we introduced the Hamming interaction profile similarity to measure the similarities of microbes and diseases besides Gaussian interaction profile kernel similarity. In addition, we introduced the Kronecker product to construct two kinds of Kronecker similarities between microbe-disease pairs. Then, we designed the Kronecker regularized least squares with different Kronecker similarities to obtain prediction scores, respectively, and calculated the final prediction scores by integrating the contributions of different similarities. Results The AUCs value of global leave-one-out cross-validation and 5-fold cross-validation achieved by MDAKRLS were 0.9327 and 0.9023 ± 0.0015, which were significantly higher than five state-of-the-art methods used for comparison. Comparison results demonstrate that MDAKRLS has faster computing speed under two kinds of frameworks. In addition, case studies of inflammatory bowel disease (IBD) and asthma further showed 19 (IBD), 19 (asthma) of the top 20 prediction disease-related microbes could be verified by previously published biological or medical literature. Conclusions All the evaluation results adequately demonstrated that MDAKRLS has an effective and reliable prediction performance. It may be a useful tool to seek disease-related new microbes and help biomedical researchers to carry out follow-up studies.


2019 ◽  
Vol 35 (22) ◽  
pp. 4730-4738 ◽  
Author(s):  
Yan Zhao ◽  
Xing Chen ◽  
Jun Yin

AbstractMotivationRecent studies have shown that microRNAs (miRNAs) play a critical part in several biological processes and dysregulation of miRNAs is related with numerous complex human diseases. Thus, in-depth research of miRNAs and their association with human diseases can help us to solve many problems.ResultsDue to the high cost of traditional experimental methods, revealing disease-related miRNAs through computational models is a more economical and efficient way. Considering the disadvantages of previous models, in this paper, we developed adaptive boosting for miRNA-disease association prediction (ABMDA) to predict potential associations between diseases and miRNAs. We balanced the positive and negative samples by performing random sampling based on k-means clustering on negative samples, whose process was quick and easy, and our model had higher efficiency and scalability for large datasets than previous methods. As a boosting technology, ABMDA was able to improve the accuracy of given learning algorithm by integrating weak classifiers that could score samples to form a strong classifier based on corresponding weights. Here, we used decision tree as our weak classifier. As a result, the area under the curve (AUC) of global and local leave-one-out cross validation reached 0.9170 and 0.8220, respectively. What is more, the mean and the standard deviation of AUCs achieved 0.9023 and 0.0016, respectively in 5-fold cross validation. Besides, in the case studies of three important human cancers, 49, 50 and 50 out of the top 50 predicted miRNAs for colon neoplasms, hepatocellular carcinoma and breast neoplasms were confirmed by the databases and experimental literatures.Availability and implementationThe code and dataset of ABMDA are freely available at https://github.com/githubcode007/ABMDA.Supplementary informationSupplementary data are available at Bioinformatics online.


2018 ◽  
Vol 2018 ◽  
pp. 1-11
Author(s):  
Xiaoying Li ◽  
Yaping Lin ◽  
Changlong Gu ◽  
Zejun Li

Aberrant expression of microRNAs (miRNAs) can be applied for the diagnosis, prognosis, and treatment of human diseases. Identifying the relationship between miRNA and human disease is important to further investigate the pathogenesis of human diseases. However, experimental identification of the associations between diseases and miRNAs is time-consuming and expensive. Computational methods are efficient approaches to determine the potential associations between diseases and miRNAs. This paper presents a new computational method based on the SimRank and density-based clustering recommender model for miRNA-disease associations prediction (SRMDAP). The AUC of 0.8838 based on leave-one-out cross-validation and case studies suggested the excellent performance of the SRMDAP in predicting miRNA-disease associations. SRMDAP could also predict diseases without any related miRNAs and miRNAs without any related diseases.


2020 ◽  
Vol 27 (3) ◽  
pp. 419-428 ◽  
Author(s):  
Max Moldovan ◽  
Jyoti Khadka ◽  
Renuka Visvanathan ◽  
Steve Wesselingh ◽  
Maria C Inacio

Abstract Objectives To (1) use an elastic net (EN) algorithm to derive a frailty measure from a national aged care eligibility assessment program; (2) compare the ability of EN-based and a traditional cumulative deficit (CD) based frailty measures to predict mortality and entry into permanent residential care; (3) assess if the predictive ability can be improved by using weighted frailty measures. Materials and Methods A Cox proportional hazard model based EN algorithm was applied to the 2003–2013 cohort of 903 996 participants for selecting items to enter an EN based frailty measure. The out-of-sample predictive accuracy was measured by the area under the curve (AUC) from Cox models fitted to 80% training and validated on 20% testing samples. Results The EN approach resulted in a 178-item frailty measure including items excluded from the 44-item CD-based measure. The EN based measure was not statistically significantly different from the CD-based approach in terms of predicting mortality (AUC 0.641, 95% CI: 0.637–0.644 vs AUC 0.637, 95% CI: 0.634–0.641) and permanent care entry (AUC 0.626, 95% CI: 0.624–0.629 vs AUC 0.627, 95% CI: 0.625–0.63). However, the weighted EN based measure statistically outperforms the weighted CD measure for predicting mortality (AUC 0.774, 95% CI: 0.771–0.777 vs AUC 0.757, 95% CI: 0.754–0.760) and permanent care entry (AUC 0.676, 95% CI: 0.673–0.678 vs AUC 0.671, 95% CI: 0.668–0.674). Conclusions The weighted EN and CD-based measures demonstrated similar prediction performance. The CD-based measure items are relevant to frailty measurement and easier to interpret. We recommend using the weighted and unweighted CD-based frailty measures.


2021 ◽  
Vol 12 ◽  
Author(s):  
Haixiu Yang ◽  
Fan Tong ◽  
Changlu Qi ◽  
Ping Wang ◽  
Jiangyu Li ◽  
...  

Many microbes are parasitic within the human body, engaging in various physiological processes and playing an important role in human diseases. The discovery of new microbe–disease associations aids our understanding of disease pathogenesis. Computational methods can be applied in such investigations, thereby avoiding the time-consuming and laborious nature of experimental methods. In this study, we constructed a comprehensive microbe–disease network by integrating known microbe–disease associations from three large-scale databases (Peryton, Disbiome, and gutMDisorder), and extended the random walk with restart to the network for prioritizing unknown microbe–disease associations. The area under the curve values of the leave-one-out cross-validation and the fivefold cross-validation exceeded 0.9370 and 0.9366, respectively, indicating the high performance of this method. Despite being widely studied diseases, in case studies of inflammatory bowel disease, asthma, and obesity, some prioritized disease-related microbes were validated by recent literature. This suggested that our method is effective at prioritizing novel disease-related microbes and may offer further insight into disease pathogenesis.


Sign in / Sign up

Export Citation Format

Share Document