Predicting LncRNA–Disease Association by a Random Walk With Restart on Multiplex and Heterogeneous Networks

Studies have found that long non-coding RNAs (lncRNAs) play important roles in many human biological processes, so it is critical to explore potential lncRNA–disease associations, especially for cancer-associated lncRNAs. However, traditional biological experiments are costly and time-consuming, which makes effective computational models valuable. We developed MHRWRLDA, a random walk with restart algorithm on multiplex and heterogeneous networks of lncRNAs and diseases, to predict lncRNA–disease associations. First, multiple disease similarity networks are constructed by using different approaches to calculate similarity scores between diseases, and multiple lncRNA similarity networks are constructed in the same way for lncRNAs. Then, a multiplex and heterogeneous network is constructed by integrating these similarity networks with the lncRNA–disease associations, and a random walk with restart is performed on it to predict lncRNA–disease associations. Leave-one-out cross-validation (LOOCV) gave an area under the curve (AUC) of 0.68736, an improvement over the classical algorithms of recent years. Finally, we confirmed by literature mining a few novel predicted lncRNAs associated with specific diseases such as colon cancer. In summary, MHRWRLDA contributes to the prediction of lncRNA–disease associations.
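
As an illustration of the pipeline described in this abstract, the following minimal sketch runs a random walk with restart on a toy heterogeneous network. It is not the MHRWRLDA implementation: it uses a single similarity layer per node type rather than a multiplex network, and the toy matrices, the restart probability of 0.7, and the function names column_normalize and rwr are all assumptions made for the example.

```python
import numpy as np


def column_normalize(m):
    """Scale each column to sum to 1; all-zero columns stay zero."""
    m = np.asarray(m, dtype=float)
    sums = m.sum(axis=0, keepdims=True)
    return np.divide(m, sums, out=np.zeros_like(m), where=sums > 0)


def rwr(transition, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * seed until the L1 change is below tol."""
    p = seed.astype(float).copy()
    for _ in range(max_iter):
        nxt = (1 - restart) * transition @ p + restart * seed
        if np.abs(nxt - p).sum() < tol:
            return nxt
        p = nxt
    return p


# Toy data (hypothetical): 3 lncRNAs, 2 diseases.
lnc_sim = np.array([[0.0, 0.8, 0.1],
                    [0.8, 0.0, 0.2],
                    [0.1, 0.2, 0.0]])
dis_sim = np.array([[0.0, 0.6],
                    [0.6, 0.0]])
assoc = np.array([[1, 0],
                  [0, 0],
                  [0, 1]], dtype=float)  # rows: lncRNAs, columns: diseases

# Heterogeneous adjacency: similarity blocks on the diagonal,
# association blocks off the diagonal.
adjacency = np.block([[lnc_sim, assoc],
                      [assoc.T, dis_sim]])
W = column_normalize(adjacency)

# Seed the walk at disease 0 (index 3 in the combined node ordering).
seed = np.zeros(5)
seed[3] = 1.0

scores = rwr(W, seed)
lnc_scores = scores[:3]  # steady-state scores of the lncRNA nodes
```

Ranking lnc_scores in descending order gives the candidate lncRNAs for the seed disease. A multiplex variant would add one similarity block per similarity measure together with inter-layer jump probabilities, which this sketch leaves out.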


Prediction of lncRNA-disease association based on a Laplace normalized random walk with restart algorithm on heterogeneous networks

BMC Bioinformatics ◽

10.1186/s12859-021-04538-1 ◽

2022 ◽

Vol 23 (1) ◽

Author(s):

Liugen Wang ◽

Min Shang ◽

Qi Dai ◽

Ping-an He

Keyword(s):

Random Walk ◽

Heterogeneous Networks ◽

Global Network ◽

Random Walk With Restart ◽

Lncrna Gene ◽

Similarity Network ◽

Disease Similarity ◽

Disease Associations ◽

Universal Network ◽

Gene Similarity

Abstract. Background: More and more evidence shows that long non-coding RNAs (lncRNAs) play important roles in the development and progression of complex human diseases. Predicting human lncRNA-disease associations is therefore a challenging and urgent task in bioinformatics for research on these diseases. Results: In this work, a global, universal network-based computational framework called LRWRHLDA is proposed. First, four isomorphic networks were constructed: an lncRNA similarity network, a disease similarity network, a gene similarity network, and a miRNA similarity network. Then, six heterogeneous networks, comprising the known lncRNA-disease, lncRNA-gene, lncRNA-miRNA, disease-gene, disease-miRNA, and gene-miRNA association networks, were used to design a multi-layer network. Finally, a Laplace normalized random walk with restart algorithm on this global network is proposed to predict the relationships between lncRNAs and diseases. Conclusions: Ten-fold cross-validation is used to evaluate the performance of LRWRHLDA. LRWRHLDA achieves an AUC of 0.98402, which is higher than that of the compared methods. Furthermore, LRWRHLDA can predict isolated disease-related lncRNAs (and isolated lncRNA-related diseases). The results for colorectal cancer, lung adenocarcinoma, stomach cancer, and breast cancer have been verified by other studies. The case studies indicate that the method is effective.
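
The abstract does not spell out the normalization, so the sketch below makes an assumption: it reads "Laplace normalized" as the symmetric normalization D^(-1/2) A D^(-1/2) that is common in the network propagation literature. The toy adjacency matrix, the restart probability of 0.5, and the function names laplacian_normalize and rwr_closed_form are hypothetical and are not taken from LRWRHLDA.

```python
import numpy as np


def laplacian_normalize(adj):
    """Return D^(-1/2) A D^(-1/2); isolated nodes (degree 0) keep zero rows/columns."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


def rwr_closed_form(norm_adj, seed, restart=0.5):
    """Solve p = (1 - r) * S p + r * seed directly as a linear system."""
    n = norm_adj.shape[0]
    return restart * np.linalg.solve(np.eye(n) - (1 - restart) * norm_adj, seed)


# Toy symmetric network (assumed): node 0 is linked to nodes 1 and 2.
adj = np.array([[0.0, 1.0, 1.0],
                [1.0, 0.0, 0.0],
                [1.0, 0.0, 0.0]])
S = laplacian_normalize(adj)

seed = np.array([0.0, 1.0, 0.0])  # restart at node 1
p = rwr_closed_form(S, seed)
```

Because the eigenvalues of the symmetrically normalized matrix lie in [-1, 1], the linear system is solvable for any restart probability in (0, 1]. Unlike the column-stochastic case, the resulting scores are not guaranteed to sum to one, so they are best read as a ranking.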


A model based on random walk with restart to predict circRNA-disease associations on heterogeneous network

Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining ◽

10.1145/3341161.3343514 ◽

2019 ◽

Author(s):

Hüseyin Vural ◽

Mehmet Kaya ◽

Reda Alhajj

Keyword(s):

Random Walk ◽

Heterogeneous Network ◽

Random Walk With Restart ◽

Model Based ◽

Disease Associations


A novel target convergence set based random walk with restart for prediction of potential LncRNA-disease associations

BMC Bioinformatics ◽

10.1186/s12859-019-3216-4 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Jiechen Li ◽

Xueyong Li ◽

Xiang Feng ◽

Bing Wang ◽

Bihai Zhao ◽

...

Keyword(s):

Random Walk ◽

Case Studies ◽

Computational Models ◽

Stable State ◽

Random Walk With Restart ◽

Disease Network ◽

Disease Associations ◽

Novel Target ◽

Comparative Results ◽

Leave One Out

Abstract. Background: In recent years, lncRNAs (long non-coding RNAs) have been shown to be closely related to the occurrence and development of many serious diseases that harm human health. However, most lncRNA-disease associations have not yet been found because of the high cost and time demands of traditional bio-experiments. Hence, it is urgent and necessary to establish efficient and reasonable computational models to predict potential associations between lncRNAs and diseases. Results: In this manuscript, a novel prediction model called TCSRWRLD is proposed to predict potential lncRNA-disease associations based on an improved random walk with restart. In TCSRWRLD, a heterogeneous lncRNA-disease network is first constructed by combining the integrated similarity of lncRNAs and the integrated similarity of diseases. Then, for each lncRNA/disease node in the newly constructed heterogeneous network, a node set called the TCS (Target Convergence Set) is established, consisting of the top 100 disease/lncRNA nodes with minimum average network distance to the disease/lncRNA nodes that have known associations with that node. Finally, an improved random walk with restart is run on the heterogeneous lncRNA-disease network to infer potential lncRNA-disease associations. The major contribution of this manuscript is the concept of the TCS, which effectively speeds up the convergence of TCSRWRLD: the walker can stop its random walk once the walking probability vectors at the nodes in the TCS, rather than at all nodes in the whole network, have reached a stable state. Simulation results show that TCSRWRLD achieves a reliable AUC of 0.8712 in leave-one-out cross-validation (LOOCV), which outperforms previous state-of-the-art results. Moreover, case studies of lung cancer and leukemia also demonstrate the satisfactory prediction performance of TCSRWRLD.
Conclusions: Both comparative results and case studies demonstrate that TCSRWRLD achieves excellent performance in predicting potential lncRNA-disease associations, which also implies that TCSRWRLD may be a good addition to future bioinformatics research.
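
The early-stopping idea behind the TCS can be sketched as follows. This is a simplified illustration, not the TCSRWRLD code: the target set is passed in directly instead of being built from average network distances, and the ring network, the restart probability of 0.7, the tolerance, and the function name rwr_with_target_set are assumptions made for the example.

```python
import numpy as np


def rwr_with_target_set(transition, seed, target_idx,
                        restart=0.7, tol=1e-8, max_iter=1000):
    """Random walk with restart that checks convergence only on target_idx.

    Returns the probability vector and the number of iterations used.
    """
    p = seed.astype(float).copy()
    for step in range(1, max_iter + 1):
        nxt = (1 - restart) * transition @ p + restart * seed
        # Stop as soon as the entries in the target set are stable,
        # even if other nodes are still changing.
        if np.abs(nxt[target_idx] - p[target_idx]).sum() < tol:
            return nxt, step
        p = nxt
    return p, max_iter


# Toy 4-node ring (assumed); every column sums to 1.
W = np.array([[0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0]])
seed = np.array([1.0, 0.0, 0.0, 0.0])

target = np.array([0, 1])        # hypothetical target convergence set
all_nodes = np.arange(4)

p_tcs, steps_tcs = rwr_with_target_set(W, seed, target)
p_all, steps_all = rwr_with_target_set(W, seed, all_nodes)
```

Since the change measured on a subset of nodes can never exceed the change measured on all nodes, the target-set check stops no later than the full check. Any saving in practice depends on how the set is chosen, and a set containing only nodes the walk has not yet reached could stop the iteration prematurely.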


Biased Random Walk With Restart on Multilayer Heterogeneous Networks for MiRNA–Disease Association Prediction

Frontiers in Genetics ◽

10.3389/fgene.2021.720327 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jia Qu ◽

Chun-Chun Wang ◽

Shu-Bin Cai ◽

Wen-Di Zhao ◽

Xiao-Long Cheng ◽

...

Keyword(s):

Random Walk ◽

Heterogeneous Networks ◽

Disease Association ◽

Random Walk With Restart ◽

Candidate Mirnas ◽

Biased Random Walk ◽

Disease Associations ◽

And Performance ◽

Function Similarity ◽

Leave One Out

Numerous experiments have shown that microRNAs (miRNAs) can be used as diagnostic biomarkers for many complex diseases. Predicting unobserved associations between miRNAs and diseases is therefore highly significant for the medical field. Here, based on heterogeneous networks built from known miRNA–disease associations, miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity for miRNAs and diseases, we developed a computational model of biased random walk with restart on multilayer heterogeneous networks for miRNA–disease association prediction (BRWRMHMDA) by applying a degree-based biased random walk with restart (BRWR). The evaluation yielded an AUC of 0.8310 in local leave-one-out cross-validation (LOOCV), which indicates the algorithm’s good performance. In addition, we applied BRWRMHMDA to prioritize candidate miRNAs for esophageal neoplasms based on HMDD v2.0, and for breast neoplasms based on HMDD v1.0. The local LOOCV results and the performance analysis of the case studies both showed that the proposed model has good and stable performance.
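
One of the similarity measures named in this abstract, the Gaussian interaction profile (GIP) kernel, can be sketched in a few lines. The bandwidth convention used here (gamma equal to gamma_prime divided by the mean squared norm of the profiles, with gamma_prime = 1) is a common choice in the literature and is assumed rather than taken from the abstract; the toy association matrix and the function name gip_kernel are hypothetical.

```python
import numpy as np


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a 0/1 matrix.

    K[i, j] = exp(-gamma * ||profile_i - profile_j||^2), where gamma is
    gamma_prime divided by the mean squared norm of the rows.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_norm = sq_norms.mean()
    if mean_norm == 0:
        # No known associations at all: every profile is identical (all zeros).
        return np.ones((len(profiles), len(profiles)))
    gamma = gamma_prime / mean_norm
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


# Toy miRNA x disease association matrix (assumed).
assoc = np.array([[1, 0, 1],
                  [1, 0, 1],
                  [0, 1, 0]])

mirna_sim = gip_kernel(assoc)       # similarity between miRNAs (rows)
disease_sim = gip_kernel(assoc.T)   # similarity between diseases (columns)
```

Rows with identical association profiles get similarity 1, and the similarity decays with the number of diseases on which two miRNAs differ. Applying the same function to the transposed matrix gives the disease-side kernel.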


A scalable random walk with restart on heterogeneous networks with Apache Spark for ranking disease-causing genes using type-2 fuzzy data fusion

10.1101/844159 ◽

2019 ◽

Author(s):

Mehdi Joodaki ◽

Nasser Ghadiri ◽

Zeinab Maleki ◽

Maryam Lotfi Shahreza

Keyword(s):

Random Walk ◽

Heterogeneous Networks ◽

Heterogeneous Network ◽

Gene Network ◽

Parallel Execution ◽

Apache Spark ◽

Random Walk With Restart ◽

Integrated Network ◽

New Genes ◽

Gene Similarity

Abstract. Prediction and discovery of disease-causing genes are among the main missions of biology and medicine. In recent years, researchers have developed several methods based on gene/protein networks for the detection of causative genes. However, because of the presence of false positives in these networks, the results of these methods often lack accuracy and reliability. This problem can be addressed by using multiple genomic sources to reduce noise in the data. However, network integration can also affect the quality of the integrated network. In this paper, we present a method named RWRHN-FF: RWRHN (random walk with restart on a heterogeneous network) with fuzzy fusion. In this method, four gene-gene similarity networks are first constructed from different genomic sources and then integrated using the type-II fuzzy voter scheme. The resulting gene-gene network is then linked, through a two-part disease-gene network, to a disease-disease similarity network that is itself constructed by integrating four sources. The product of this process is a reliable heterogeneous network, which is analyzed by the RWRHN algorithm. Leave-one-out cross-validation shows that RWRHN-FF outperforms both RWRHN and RWRH. The proposed method is used to predict new genes for prostate, breast, gastric, and colon cancers. To reduce the algorithm's run time, Apache Spark is used as a platform for parallel execution of the RWRHN algorithm on heterogeneous networks. In tests conducted on heterogeneous networks of different sizes, this solution converges faster than non-distributed implementations.


Predicting miRNA–disease associations using improved random walk with restart and integrating multiple similarities

Scientific Reports ◽

10.1038/s41598-021-00677-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Van Tinh Nguyen ◽

Thi Tu Kien Le ◽

Khoat Than ◽

Dang Hung Tran

Keyword(s):

Random Walk ◽

Heterogeneous Networks ◽

Negative Impact ◽

Statistical Tests ◽

Computational Method ◽

Random Walk With Restart ◽

New Associations ◽

Computer Scientists ◽

Disease Associations ◽

Top 40

Abstract. Predicting beneficial and valuable miRNA–disease associations (MDAs) through biological laboratory experiments is costly and time-consuming. Proposing a powerful and meaningful computational method for predicting MDAs is essential and has attracted many computer scientists in recent years. In this paper, we propose a new computational method to predict miRNA–disease associations using improved random walk with restart and integrating multiple similarities (RWRMMDA). We used a WKNKN algorithm as a pre-processing step to address the sparsity and incompleteness of the data and to reduce the negative impact of a large number of missing associations. Two heterogeneous networks, one in disease space and one in miRNA space, were built by integrating multiple similarity networks, and different walk probabilities could be assigned to each linked neighbor node of a disease or miRNA node according to its degree in the respective network. Finally, an improved extended random walk with restart algorithm based on the miRNA similarity-based and disease similarity-based heterogeneous networks was used to calculate miRNA–disease association prediction probabilities. The experiments showed that the proposed method achieved strong performance, with global LOOCV AUC (area under the ROC curve) and AUPR (area under the precision-recall curve) values of 0.9882 and 0.9066, respectively. The best AUC and AUPR values under fivefold cross-validation were 0.9855 and 0.8642, respectively, as supported by statistical tests. Compared with previous related methods, it outperformed NTSHMDA, PMFMDA, IMCMDA, and MCLPMDA in both AUC and AUPR. In case studies of breast neoplasms, hepatocellular carcinoma, and stomach neoplasms, it inferred 1, 12, and 7 new associations, respectively, among the top 40 predicted miRNAs for each disease. All of these newly inferred associations have been confirmed in various databases or in the literature.
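
The AUC values quoted in these abstracts summarize how well prediction scores rank known associations above unknown ones. As a small illustration of the metric itself (not of any paper's evaluation code), the sketch below computes the AUC as the fraction of positive/negative pairs that are ranked correctly, counting ties as half. The scores, labels, and the function name auc_score are made up for the example.

```python
import numpy as np


def auc_score(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    diff = pos[:, None] - neg[None, :]   # every positive/negative pair
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (pos.size * neg.size)


# Hypothetical prediction scores; label 1 marks a held-out known association.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]

auc = auc_score(scores, labels)  # 3 of the 4 positive/negative pairs are ordered correctly
```

In a leave-one-out setting, each known association would be hidden in turn, the model rerun, and the held-out pair scored against the unknown pairs before the results are pooled into one such calculation. The pairwise form uses memory proportional to the number of positive/negative pairs, so a rank-based formulation is preferable for large candidate sets.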


A scalable random walk with restart on heterogeneous networks with Apache Spark for ranking disease-related genes through type-II fuzzy data fusion

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2021.103688 ◽

2021 ◽

Vol 115 ◽

pp. 103688

Author(s):

Mehdi Joodaki ◽

Nasser Ghadiri ◽

Zeinab Maleki ◽

Maryam Lotfi Shahreza

Keyword(s):

Random Walk ◽

Data Fusion ◽

Heterogeneous Networks ◽

Apache Spark ◽

Fuzzy Data ◽

Type Ii ◽

Random Walk With Restart ◽

Disease Related Genes


Inferring MicroRNA-Disease Associations by Random Walk on a Heterogeneous Network with Multiple Data Sources

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2016.2550432 ◽

2017 ◽

Vol 14 (4) ◽

pp. 905-915 ◽

Cited By ~ 138

Author(s):

Yuansheng Liu ◽

Xiangxiang Zeng ◽

Zengyou He ◽

Quan Zou

Keyword(s):

Random Walk ◽

Heterogeneous Network ◽

Data Sources ◽

Multiple Data Sources ◽

Multiple Data ◽

Disease Associations


A novel approach for predicting microRNA-disease associations by unbalanced bi-random walk on heterogeneous network

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2017.01.008 ◽

2017 ◽

Vol 66 ◽

pp. 194-203 ◽

Cited By ~ 53

Author(s):

Jiawei Luo ◽

Qiu Xiao

Keyword(s):

Random Walk ◽

Heterogeneous Network ◽

Novel Approach ◽

Disease Associations


MultiSourcDSim: an integrated approach for exploring disease similarity

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-019-0968-8 ◽

2019 ◽

Vol 19 (S6) ◽

Author(s):

Lei Deng ◽

Danyi Ye ◽

Junmin Zhao ◽

Jingpu Zhang

Keyword(s):

Integrated Approach ◽

Data Sources ◽

Related Data ◽

The Past ◽

Multiple Data ◽

Disease Similarity ◽

Disease Associations ◽

Novel Method ◽

Similarity Networks ◽

Few Data

Abstract. Background: Collections of disease-associated data contribute to the study of associations between diseases. Discovering closely related diseases plays a crucial role in revealing their common pathogenic mechanisms, and may further suggest treatments that can be carried over from one disease to another. During the past decades, a number of approaches for calculating disease similarity have been developed. However, most of them are designed to use a single data source or only a few, which results in low accuracy. Methods: In this paper, we propose a novel method, called MultiSourcDSim, to calculate disease similarity by integrating multiple data sources, namely gene-disease associations, GO biological process-disease associations, and symptom-disease associations. First, we establish three disease similarity networks, one for each of the three disease-related data sources. Second, the representation of each node is obtained by integrating the three small disease similarity networks. Finally, the learned representations are used to calculate the similarity between diseases. Results: Our approach shows the best performance compared with the other three popular methods. In addition, the similarity network built by MultiSourcDSim suggests that our method can also uncover latent relationships between diseases. Conclusions: MultiSourcDSim is an efficient approach for predicting similarity between diseases.
