GCAEMDA: Predicting miRNA-disease associations via graph convolutional autoencoder

2021 ◽  
Vol 17 (12) ◽  
pp. e1009655
Author(s):  
Lei Li ◽  
Yu-Tian Wang ◽  
Cun-Mei Ji ◽  
Chun-Hou Zheng ◽  
Jian-Cheng Ni ◽  
...  

MicroRNAs (miRNAs) are small non-coding RNAs involved in many complex biological processes. A growing body of studies has suggested that miRNAs are closely associated with many human diseases. Disease-related miRNAs are therefore worth considering as potential biomarkers, which could greatly contribute to understanding the mechanisms of complex diseases and benefit the prevention, detection, diagnosis and treatment of extraordinary diseases. In this study, we present a novel model named Graph Convolutional Autoencoder for miRNA-Disease Association Prediction (GCAEMDA). In the proposed model, we use miRNA-miRNA similarities, disease-disease similarities and verified miRNA-disease associations to construct a heterogeneous network, which is used to learn the embeddings of miRNAs and diseases. In addition, we separately construct miRNA-based and disease-based sub-networks. Combining the embeddings of miRNAs and diseases, a graph convolutional autoencoder (GCAE) is used to calculate miRNA-disease association scores on each of the two sub-networks. The final prediction scores between miRNAs and diseases are then obtained by averaging the prediction scores from the two types of sub-networks. To assess the accuracy of GCAEMDA, we evaluated the model with different cross-validation methods, and its performance was better than that of state-of-the-art models. Case studies on common human diseases were also carried out to demonstrate the effectiveness of GCAEMDA. The results demonstrate that GCAEMDA is useful for inferring potential miRNA-disease associations.
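
As a rough illustration of two ingredients named in this abstract, the sketch below shows one graph convolution step and the averaging of two score matrices. The abstract does not specify the layer definition, so the symmetric normalization with self-loops used here is an assumption (a commonly used propagation rule), and all function names are hypothetical.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a non-negative adjacency matrix A.

    Assumption: this common normalization is used only for illustration;
    the abstract does not state which propagation rule GCAEMDA applies.
    """
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1)                  # degrees are >= 1 after self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(adj_norm, features, weight):
    """One graph convolution step followed by ReLU."""
    return np.maximum(adj_norm @ features @ weight, 0.0)

def average_ensemble(scores_mirna_view, scores_disease_view):
    """Average two (n_mirna x n_disease) score matrices element-wise."""
    return (scores_mirna_view + scores_disease_view) / 2.0

# Toy example with made-up numbers.
toy_adj = np.array([[0.0, 1.0], [1.0, 0.0]])
toy_hidden = gcn_layer(normalize_adjacency(toy_adj), np.eye(2), np.ones((2, 3)))
toy_final = average_ensemble(np.array([[0.2, 0.9]]), np.array([[0.6, 0.7]]))
print(toy_hidden.shape, toy_final)
```

Averaging treats the miRNA-based and disease-based views as equally reliable; a weighted combination would be a different design choice that the abstract does not describe.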

Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-12 ◽  
Author(s):  
Xiujuan Lei ◽  
Wenxiang Zhang

Circular RNAs (circRNAs) have significant effects on a variety of biological processes, and their dysfunction is closely related to the emergence and development of diseases. Therefore, identifying circRNA-disease associations will contribute to analysing the pathogenesis of diseases. Here, we present a computational model called BRWSP to predict circRNA-disease associations, which searches paths on a multiple heterogeneous network based on biased random walk. Firstly, BRWSP constructs a multiple heterogeneous network by using circRNAs, diseases, and genes. Then, the biased random walk algorithm runs on the multiple heterogeneous network to search paths between circRNAs and diseases. Finally, the performance of BRWSP is significantly better than that of state-of-the-art algorithms. Furthermore, BRWSP contributes to the discovery of novel circRNA-disease associations.
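
The path search can be pictured with a minimal sketch of a biased random walk on a weighted graph. The abstract does not give BRWSP's actual bias rule, so this sketch simply assumes that the next node is drawn with probability proportional to edge weight and that a path never revisits a node. The graph, node names and function name are hypothetical.

```python
import random

def biased_random_walk(graph, start, length, rng=None):
    """Walk up to `length` steps, choosing neighbors in proportion to edge weight.

    graph: dict mapping node -> dict of neighbor -> positive weight.
    Assumption: visited nodes are skipped so the result is a simple path.
    """
    if rng is None:
        rng = random.Random(0)
    path = [start]
    current = start
    for _ in range(length):
        neighbors = graph.get(current, {})
        candidates = [n for n in neighbors if n not in path]
        if not candidates:
            break  # dead end: every neighbor was already visited
        weights = [neighbors[n] for n in candidates]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        path.append(current)
    return path

# Hypothetical toy network with circRNA, gene and disease nodes.
toy_graph = {
    "circ1": {"geneA": 2.0, "geneB": 1.0},
    "geneA": {"circ1": 2.0, "dis1": 1.0},
    "geneB": {"circ1": 1.0, "dis1": 3.0},
    "dis1": {"geneA": 1.0, "geneB": 3.0},
}
print(biased_random_walk(toy_graph, "circ1", 2))
```

Repeating such walks from a circRNA node and recording the paths that end in disease nodes is one way candidate circRNA-disease pairs could be collected.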


2021 ◽  
Vol 12 ◽  
Author(s):  
Cunmei Ji ◽  
Yutian Wang ◽  
Jiancheng Ni ◽  
Chunhou Zheng ◽  
Yansen Su

In recent years, growing evidence has shown that microRNAs (miRNAs) play an important role in the regulation of post-transcriptional gene expression and are closely related to human diseases. Many studies have also revealed that miRNAs can serve as promising biomarkers for the potential diagnosis and treatment of human diseases. However, the interactions between miRNAs and human diseases have rarely been demonstrated, and the underlying mechanisms of miRNAs are not clear. Therefore, computational approaches have attracted the attention of researchers, as they can not only save time and money but also improve the efficiency and accuracy of biological experiments. In this work, we propose a Heterogeneous Graph Attention Network (GAT) based method for miRNA-disease association prediction, named HGATMDA. We construct a heterogeneous graph for miRNAs and diseases and introduce weighted DeepWalk and GAT methods to extract features of miRNAs and diseases from the graph. Moreover, a fully connected neural network is used to predict correlation scores between miRNA-disease pairs. Experimental results under five-fold cross validation (five-fold CV) showed that HGATMDA achieved better prediction performance than other state-of-the-art methods. In addition, we performed three case studies on breast neoplasms, lung neoplasms and kidney neoplasms. The results showed that for the three diseases mentioned above, 50 out of the top 50 candidates were confirmed by the validation datasets. Therefore, HGATMDA is suitable as an effective tool to identify potential disease-related miRNAs.
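
To make the graph attention step more concrete, here is a minimal single-head attention layer in NumPy. It follows the generic GAT formulation (LeakyReLU-scored pairs, softmax over neighbors) and is not taken from HGATMDA itself. The shapes, the self-loop handling and the name `gat_layer` are assumptions for illustration.

```python
import numpy as np

def gat_layer(features, adj, weight, attn_vec, slope=0.2):
    """Single-head graph attention aggregation (generic form, not HGATMDA's code).

    features: (n, d_in), adj: (n, n), weight: (d_in, d_out), attn_vec: (2 * d_out,)
    Returns the aggregated features and the attention matrix.
    """
    h = features @ weight                          # projected node features
    d_out = h.shape[1]
    src = h @ attn_vec[:d_out]                     # contribution of the target node i
    dst = h @ attn_vec[d_out:]                     # contribution of the neighbor j
    logits = src[:, None] + dst[None, :]           # a^T [h_i || h_j] for all pairs
    logits = np.where(logits > 0, logits, slope * logits)  # LeakyReLU
    mask = (adj + np.eye(adj.shape[0])) > 0        # attend to neighbors and self
    logits = np.where(mask, logits, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    alpha = np.exp(logits)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)       # softmax per row
    return alpha @ h, alpha

# Toy run on a 3-node graph with random features.
rng = np.random.default_rng(0)
toy_adj = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
out, attention = gat_layer(rng.normal(size=(3, 4)), toy_adj,
                           rng.normal(size=(4, 2)), rng.normal(size=4))
print(out.shape, attention.sum(axis=1))
```

Each row of the attention matrix sums to one and is zero for non-neighbors, so a node's new feature vector is a weighted mix of its own and its neighbors' projected features.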


Author(s):  
Cunmei Ji ◽  
Zhen Gao ◽  
Xu Ma ◽  
Qingwen Wu ◽  
Jiancheng Ni ◽  
...  

Motivation: MicroRNAs (miRNAs) are a class of non-coding RNAs that play critical roles in various biological processes. Many studies have shown that miRNAs are closely related to the occurrence, development and diagnosis of human diseases. Traditional biological experiments are costly and time consuming. As a result, effective computational models have become increasingly popular for predicting associations between miRNAs and diseases, which could effectively boost human disease diagnosis and prevention. Results: We propose a novel computational framework, called AEMDA, to identify associations between miRNAs and diseases. AEMDA applies a learning-based method to extract dense and high-dimensional representations of diseases and miRNAs from integrated disease semantic similarity, miRNA functional similarity and heterogeneous related interaction data. In addition, AEMDA adopts a deep autoencoder that does not need negative samples to retrieve the underlying associations between miRNAs and diseases. Furthermore, the reconstruction error is used as a measurement to predict disease-associated miRNAs. Our experimental results indicate that AEMDA can effectively predict disease-related miRNAs and outperforms state-of-the-art methods. Availability and implementation: The source code and data are available at https://github.com/CunmeiJi/AEMDA. Supplementary information: Supplementary data are available at Bioinformatics online.
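
The idea of scoring pairs by reconstruction error without negative samples can be sketched as follows. This is not AEMDA's deep autoencoder: a linear autoencoder fitted by truncated SVD (equivalent to PCA) is swapped in to keep the example short. The sketch also assumes that a lower error means a pair looks more like the known associations, a direction the abstract does not spell out. All names are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(positives, n_components):
    """Fit a rank-limited linear encoder/decoder on positive pair vectors only.

    Stand-in for a deep autoencoder: the principal directions act as tied
    encoder/decoder weights.
    """
    mean = positives.mean(axis=0)
    _, _, vt = np.linalg.svd(positives - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(samples, mean, components):
    """Squared error between each sample and its encode-decode reconstruction."""
    centered = samples - mean
    reconstructed = centered @ components.T @ components
    return np.sum((centered - reconstructed) ** 2, axis=1)

# Toy data: known pairs lie on a line; one candidate follows it, one does not.
known_pairs = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
mean, components = fit_linear_autoencoder(known_pairs, n_components=1)
candidates = np.array([[4.0, 8.0], [2.0, 0.0]])
print(reconstruction_error(candidates, mean, components))
```

Because the model is fitted on known associations only, no negative pairs have to be sampled; candidates are simply ranked by how well they are reconstructed.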


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ying Li ◽  
Hang Sun ◽  
Shiyao Feng ◽  
Qi Zhang ◽  
Siyu Han ◽  
...  

Background: Long noncoding RNAs (lncRNAs) play important roles in multiple biological processes. Identifying lncRNA–protein interactions (LPIs) is key to understanding lncRNA functions. Although some computational methods for LPIs have been developed, the LPI prediction problem remains challenging. How to integrate multimodal features from more perspectives and how to build deep learning architectures with better recognition performance have always been the focus of research on LPIs. Results: We present Capsule-LPI, a novel multichannel capsule network framework that integrates multimodal features for LPI prediction. Capsule-LPI integrates four groups of multimodal features, including sequence features, motif information, physicochemical properties and secondary structure features. Capsule-LPI is composed of four feature-learning subnetworks and one capsule subnetwork. Through comprehensive experimental comparisons and evaluations, we demonstrate that both multimodal features and the architecture of the multichannel capsule network can significantly improve the performance of LPI prediction. The experimental results show that Capsule-LPI performs better than the existing state-of-the-art tools. The precision of Capsule-LPI is 87.3%, which represents a 1.7% improvement. The F-value of Capsule-LPI is 92.2%, which represents a 1.4% improvement. Conclusions: This study provides a novel and feasible LPI prediction tool based on the integration of multimodal features and a capsule network. A webserver (http://csbg-jlu.site/lpc/predict) has been developed for the convenience of users.
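
For readers unfamiliar with the two metrics quoted above, the following sketch computes precision, recall and an F-value from confusion counts. It assumes that "F-value" means the F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The counts in the example are made up and are not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts, for illustration only.
print(precision_recall_f1(tp=80, fp=20, fn=20))
```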


2020 ◽  
Vol 17 (3) ◽  
pp. 849-865
Author(s):  
Zhongqin Bi ◽  
Shuming Dou ◽  
Zhe Liu ◽  
Yongbin Li

Neural network methods have been trained to satisfactorily learn user/product representations from textual reviews. A representation can be considered as a multiaspect attention weight vector. However, in several existing methods, it is assumed that the user representation remains unchanged even when the user interacts with products having diverse characteristics, which leads to inaccurate recommendations. To overcome this limitation, this paper proposes a novel model to capture the varying attention of a user for different products by using a multilayer attention framework. First, two individual hierarchical attention networks are used to encode the users and products to learn the user preferences and product characteristics from review texts. Then, we design an attention network to reflect the adaptive change in the user preferences for each aspect of the targeted product in terms of the rating and review. The results of experiments performed on three public datasets demonstrate that the proposed model notably outperforms the other state-of-the-art baselines, thereby validating the effectiveness of the proposed approach.


Author(s):  
Hongying Zhao ◽  
Jian Shi ◽  
Yunpeng Zhang ◽  
Aimin Xie ◽  
Lei Yu ◽  
...  

Long non-coding RNAs (lncRNAs) are associated with human diseases. Although lncRNA–disease associations have received significant attention, no online repository is available to collect lncRNA-mediated regulatory mechanisms, key downstream targets, and important biological functions driven by disease-related lncRNAs in human diseases. We thus developed LncTarD (http://biocc.hrbmu.edu.cn/LncTarD/ or http://bio-bigdata.hrbmu.edu.cn/LncTarD), a manually-curated database that provides a comprehensive resource of key lncRNA–target regulations, lncRNA-influenced functions, and lncRNA-mediated regulatory mechanisms in human diseases. LncTarD offers (i) 2822 key lncRNA–target regulations involving 475 lncRNAs and 1039 targets associated with 177 human diseases; (ii) 1613 experimentally-supported functional regulations and 1209 expression associations in human diseases; (iii) important biological functions driven by disease-related lncRNAs in human diseases; (iv) lncRNA–target regulations responsible for drug resistance or sensitivity in human diseases and (v) lncRNA microarray, lncRNA sequence data and transcriptome data of an 11 373 pan-cancer patient cohort from TCGA to help characterize the functional dynamics of these lncRNA–target regulations. LncTarD also provides a user-friendly interface to conveniently browse, search, and download data. LncTarD will be a useful resource platform for the further understanding of functions and molecular mechanisms of lncRNA deregulation in human disease, which will help to identify novel and sensitive biomarkers and therapeutic targets.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Yahui Long ◽  
Jiawei Luo

Background: A growing body of biological and clinical evidence indicates that microorganisms are significantly involved in the pathological mechanisms of a wide variety of complex human diseases. Inferring potential disease-related microbes can not only promote disease prevention, diagnosis and treatment, but also provide valuable information for drug development. Considering that experimental methods are expensive and time-consuming, developing computational methods is an alternative choice. However, most existing methods are biased towards well-characterized diseases and microbes. Furthermore, existing computational methods are limited in predicting potential microbes for new diseases. Results: Here, we developed a novel computational model to predict potential human microbe-disease associations (MDAs) based on Weighted Meta-Graph (WMGHMDA). We first constructed a heterogeneous information network (HIN) by combining the integrated microbe similarity network, the integrated disease similarity network and the known microbe-disease bipartite network. We then iteratively ran a pre-designed Weighted Meta-Graph search algorithm on the HIN to uncover possible microbe-disease pairs by accumulating the contribution values of weighted meta-graphs to the pairs as their probability scores. Depending on contribution potential, we described the contribution degree of different types of meta-graphs to a microbe-disease pair with a bias rating. A meta-graph with a higher bias rating is assigned a greater weight when calculating probability scores. Conclusions: The experimental results showed that WMGHMDA outperformed some state-of-the-art methods, with average AUCs of 0.9288 and 0.9068 ± 0.0031 in global leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV), respectively. In the case studies, 9, 19, 37 and 10, 20, 45 out of the top-10, 20, 50 candidate microbes were manually verified by previous reports for asthma and inflammatory bowel disease (IBD), respectively. Furthermore, three common human diseases (Crohn’s disease, liver cirrhosis, type 1 diabetes) were adopted to demonstrate that WMGHMDA could be efficiently applied to make predictions for new diseases. In summary, WMGHMDA has high potential for predicting microbe-disease associations.
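
The accumulation of weighted contributions can be illustrated with a deliberately simplified sketch. Instead of general meta-graphs, it uses three meta-paths through the heterogeneous network, each computed as a matrix product and added to the score matrix with its own weight. The path types, weight values and function name are hypothetical and are not taken from WMGHMDA.

```python
import numpy as np

def weighted_path_scores(sim_microbe, assoc, sim_disease, weights):
    """Accumulate weighted meta-path contributions into microbe-disease scores.

    Simplification: meta-paths (chains) stand in for meta-graphs.
    M = microbe, D = disease; e.g. "M-M-D" goes through a similar microbe.
    """
    contributions = {
        "M-M-D": sim_microbe @ assoc,
        "M-D-D": assoc @ sim_disease,
        "M-M-D-D": sim_microbe @ assoc @ sim_disease,
    }
    scores = np.zeros(assoc.shape, dtype=float)
    for name, contribution in contributions.items():
        scores += weights[name] * contribution
    return scores

# Toy data: two microbes, two diseases, one known association.
toy_sim_m = np.array([[1.0, 0.5], [0.5, 1.0]])
toy_sim_d = np.eye(2)
toy_assoc = np.array([[1.0, 0.0], [0.0, 0.0]])
toy_weights = {"M-M-D": 0.5, "M-D-D": 0.3, "M-M-D-D": 0.2}
print(weighted_path_scores(toy_sim_m, toy_assoc, toy_sim_d, toy_weights))
```

In the toy output, the second microbe receives a non-zero score for the first disease only through its similarity to the first microbe, which is the kind of indirect evidence such path-based scoring is meant to capture.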


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Yubin Xiao ◽  
Zheng Xiao ◽  
Xiang Feng ◽  
Zhiping Chen ◽  
Linai Kuang ◽  
...  

Background: Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and identifying the relationships between lncRNAs and diseases is useful for the diagnosis and treatment of diseases. Due to the high cost and time consumption of traditional bio-experiments, more and more computational methods have been proposed in recent years to infer potential lncRNA-disease associations. However, these state-of-the-art prediction methods still have various limitations. Results: In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. The major novelty of FVTLDA lies in the integration of direct and indirect features related to lncRNA-disease associations, such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which guarantees that FVTLDA can be used to predict diseases without known related lncRNAs and lncRNAs without known related diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease associations nor requires any negative samples, which guarantees that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single-model prediction techniques, we combine FVTLDA with Multiple Linear Regression (MLR) and with an Artificial Neural Network (ANN) for data analysis, respectively. Simulation results show that FVTLDA with MLR achieves reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-fold cross validation (fivefold CV), 10-fold cross validation (tenfold CV) and leave-one-out cross validation (LOOCV), respectively, while FVTLDA with ANN achieves reliable AUCs of 0.8766, 0.8830 and 0.8807 in fivefold CV, tenfold CV and LOOCV, respectively. Furthermore, in case studies of gastric cancer, leukemia and lung cancer, 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515, respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA. Conclusion: The simulation results show that FVTLDA has good prediction performance, making it a good supplement to future bioinformatics research.
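
The combination of pair feature vectors with multiple linear regression, evaluated by k-fold cross validation, can be sketched as below. The feature construction used by FVTLDA is not described in enough detail here to reproduce, so the features are random placeholders, and the helper names (`fit_mlr`, `predict_mlr`, `k_fold_indices`) are hypothetical.

```python
import numpy as np

def fit_mlr(features, targets):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef

def predict_mlr(coef, features):
    """Apply the fitted linear model to new feature vectors."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    order = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(order, k)

# Placeholder data: 20 lncRNA-disease pair vectors with 3 features each.
rng = np.random.default_rng(1)
x = rng.normal(size=(20, 3))
y = 0.5 + x @ np.array([1.0, -2.0, 0.3])
for fold in k_fold_indices(len(x), k=5):
    train = np.setdiff1d(np.arange(len(x)), fold)   # all indices outside the fold
    model = fit_mlr(x[train], y[train])
    held_out_scores = predict_mlr(model, x[fold])   # scores for the held-out pairs
print(held_out_scores.shape)
```

Setting `k` equal to the number of samples turns the same loop into leave-one-out cross validation.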


2020 ◽  
Vol 49 (D1) ◽  
pp. D86-D91
Author(s):  
Bailing Zhou ◽  
Baohua Ji ◽  
Kui Liu ◽  
Guodong Hu ◽  
Fei Wang ◽  
...  

Long non-coding RNAs (lncRNAs) play important functional roles in many diverse biological processes. However, not all expressed lncRNAs are functional. Thus, it is necessary to manually collect all experimentally validated functional lncRNAs (EVlncRNA) with their sequences, structures, and functions annotated in a central database. The first release of such a database (EVLncRNAs) was made using the literature prior to 1 May 2016. Since then (until 15 May 2020), 19 245 articles related to lncRNAs have been published. In EVLncRNAs 2.0, these articles were manually examined for a major expansion of the data collected. Specifically, the numbers of annotated EVlncRNAs, associated diseases, lncRNA-disease associations, and interaction records were increased by 260%, 320%, 484% and 537%, respectively. Moreover, the database has added several new categories: 8 lncRNA structures, 33 exosomal lncRNAs, 188 circular RNAs, and 1079 drug-resistant, chemoresistant, and stress-resistant lncRNAs. All records have been checked against known retracted and fake articles. This release also comes with a highly interactive visual interaction network that helps users track the underlying relations among lncRNAs, miRNAs, proteins, genes and other functional elements. Furthermore, it provides links to four new bioinformatics tools with improved data browsing and searching functionality. EVLncRNAs 2.0 is freely available at https://www.sdklab-biophysics-dzu.net/EVLncRNAs2/.


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Yingli Zhong ◽  
Ping Xuan ◽  
Ke Han ◽  
Weiping Zhang ◽  
Jianzhong Li

MicroRNAs (miRNAs) play important roles in the diverse biological processes of animals and plants. Although prediction methods based on machine learning can identify nonhomologous and species-specific miRNAs, they suffer from severe class imbalance between real and pseudo pre-miRNAs. We propose a pre-miRNA classification method based on cost-sensitive ensemble learning and refer to it as MiRNAClassify. Through a series of iterations, the information of all the positive and negative samples is fully exploited. In each iteration, a new classification instance is trained on an equal number of positive and negative samples. In this way, the negative effect of class imbalance is efficiently relieved. The new instance primarily focuses on those samples that are easily misclassified. In addition, the positive samples are assigned a higher cost weight than the negative samples. MiRNAClassify is compared with several state-of-the-art methods and some well-known classification models on human, animal, and plant datasets. The cross-validation results indicate that MiRNAClassify significantly outperforms other methods and models. In addition, newly added pre-miRNAs are used to further evaluate the ability of these methods to discover novel pre-miRNAs. MiRNAClassify still achieves consistently superior performance and can discover more pre-miRNAs.
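
The iterative balanced-sampling idea can be sketched with a small ensemble. In each round, all positives are paired with an equally sized sample of negatives, and negatives that the round's model gets wrong become more likely to be drawn next time. This is a simplified illustration rather than MiRNAClassify itself: the base learner is a nearest-centroid rule chosen for brevity, the weight-doubling update is an assumption, and the cost weighting of positive samples is omitted.

```python
import numpy as np

def score(x, pos_centroid, neg_centroid):
    """Positive score means x is closer to the positive centroid."""
    d_pos = np.linalg.norm(x - pos_centroid, axis=1)
    d_neg = np.linalg.norm(x - neg_centroid, axis=1)
    return d_neg - d_pos

def train_balanced_ensemble(x_pos, x_neg, n_rounds=5, seed=0):
    """Train one nearest-centroid model per round on a balanced sample."""
    rng = np.random.default_rng(seed)
    neg_weights = np.ones(len(x_neg))
    pos_centroid = x_pos.mean(axis=0)
    size = min(len(x_pos), len(x_neg))     # equal number of negatives per round
    models = []
    for _ in range(n_rounds):
        probs = neg_weights / neg_weights.sum()
        chosen = rng.choice(len(x_neg), size=size, replace=False, p=probs)
        neg_centroid = x_neg[chosen].mean(axis=0)
        models.append((pos_centroid, neg_centroid))
        # Negatives misclassified as positive get sampled more often later.
        wrong = score(x_neg, pos_centroid, neg_centroid) > 0
        neg_weights[wrong] *= 2.0
    return models

def predict(models, x):
    """Average the per-model scores and threshold at zero (1 = positive)."""
    votes = np.mean([score(x, p, n) for p, n in models], axis=0)
    return (votes > 0).astype(int)

# Toy imbalanced data: 3 positives, 12 negatives.
toy_pos = np.array([[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
toy_neg = np.array([[i * 0.1, j * 0.1] for i in range(4) for j in range(3)])
ensemble = train_balanced_ensemble(toy_pos, toy_neg)
print(predict(ensemble, np.array([[5.5, 5.5], [0.1, 0.1]])))
```

Over several rounds every negative has a chance of being drawn, so the ensemble sees the whole majority class while each individual model is trained on balanced data.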

