DeepLncLoc: a deep learning framework for long non-coding RNA subcellular localization prediction based on subsequence embedding

Author(s):  
Min Zeng ◽  
Yifan Wu ◽  
Chengqian Lu ◽  
Fuhao Zhang ◽  
Fang-Xiang Wu ◽  
...  

Abstract Long non-coding RNAs (lncRNAs) are a class of RNA molecules with more than 200 nucleotides. A growing amount of evidence reveals that subcellular localization of lncRNAs can provide valuable insights into their biological functions. Existing computational methods for predicting lncRNA subcellular localization use k-mer features to encode lncRNA sequences. However, the sequence order information is lost when only k-mer features are used. We proposed a deep learning framework, DeepLncLoc, to predict lncRNA subcellular localization. In DeepLncLoc, we introduced a new subsequence embedding method that keeps the order information of lncRNA sequences. The subsequence embedding method first divides a sequence into several consecutive subsequences, then extracts the patterns of each subsequence, and finally combines these patterns to obtain a complete representation of the lncRNA sequence. After that, a text convolutional neural network is employed to learn high-level features and perform the prediction task. Compared with traditional machine learning models, popular representation methods and existing predictors, DeepLncLoc achieved better performance, which shows that DeepLncLoc could effectively predict lncRNA subcellular localization. Our study not only presented a novel computational model for predicting lncRNA subcellular localization but also introduced a new subsequence embedding method which is expected to be applied in other sequence-based prediction tasks. The DeepLncLoc web server is freely accessible at http://bioinformatics.csu.edu.cn/DeepLncLoc/, and source code and datasets can be downloaded from https://github.com/CSUBioGroup/DeepLncLoc.
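The contrast between whole-sequence k-mer features and an order-preserving subsequence representation can be illustrated with a minimal Python sketch. This is not the DeepLncLoc implementation: the abstract does not say how the per-subsequence patterns are extracted, so plain k-mer counts stand in for them here, and the function names, the RNA alphabet "ACGU", and the equal-length splitting rule are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k=2, alphabet="ACGU"):
    """Count overlapping k-mers and return them as a fixed-order vector."""
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts[m] for m in vocab]


def split_consecutive(seq, n_parts):
    """Split seq into n_parts consecutive, non-overlapping chunks of near-equal length."""
    base, extra = divmod(len(seq), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + base + (1 if i < extra else 0)
        parts.append(seq[start:end])
        start = end
    return parts


def subsequence_representation(seq, n_parts=2, k=2):
    """Hypothetical stand-in: one k-mer count vector per subsequence, kept in sequence order."""
    return [kmer_counts(part, k) for part in split_consecutive(seq, n_parts)]


# Two toy sequences that contain the same 2-mers in a different order.
a, b = "ACAGA", "AGACA"
same_global_features = kmer_counts(a) == kmer_counts(b)
same_ordered_features = subsequence_representation(a) == subsequence_representation(b)
```

In this toy case the global 2-mer vectors of the two sequences are identical, while the ordered list of per-subsequence vectors differs, which is the kind of order information the abstract says is lost with k-mer features alone.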

2021 ◽  
Author(s):  
Min Zeng ◽  
Yifan Wu ◽  
Chengqian Lu ◽  
Fuhao Zhang ◽  
Fang-Xiang Wu ◽  
...  

Abstract Motivation: Long non-coding RNAs (lncRNAs) are a class of RNA molecules with more than 200 nucleotides. A growing amount of evidence reveals that subcellular localization of lncRNAs can provide valuable insights into their biological functions. Existing computational methods for predicting lncRNA subcellular localization use k-mer features to encode lncRNA sequences. However, the sequence order information is lost when only k-mer features are used. Results: We proposed a deep learning framework, DeepLncLoc, to predict lncRNA subcellular localization. In DeepLncLoc, we introduced a new subsequence embedding method that keeps the order information of lncRNA sequences. The subsequence embedding method first divides a sequence into several consecutive subsequences, then extracts the patterns of each subsequence, and finally combines these patterns to obtain a complete representation of the lncRNA sequence. After that, a text convolutional neural network is employed to learn high-level features and perform the prediction task. Compared to traditional machine learning models with k-mer features and existing predictors, DeepLncLoc achieved better performance, which shows that DeepLncLoc could effectively predict lncRNA subcellular localization. Our study not only presented a novel computational model for predicting lncRNA subcellular localization but also provided a new subsequence embedding method which is expected to be applied in other sequence-based prediction tasks. Availability: The DeepLncLoc web server, source code and datasets are freely available at http://bioinformatics.csu.edu.cn/DeepLncLoc/, and https://github.com/CSUBioGroup/[email protected]


Author(s):  
Katarzyna Piórkowska ◽  
Kacper Żukowski ◽  
Katarzyna Ropka-Molik ◽  
Mirosław Tyra

Obesity has become a major problem in recent decades, as technological development has imposed a faster pace of life and, in turn, changed nutrition styles. Domestic pigs are an excellent animal model for studying adiposity-related processes owing to their comparable organ size, body fat distribution, and metabolism. The present study applied next-generation sequencing to identify adipose tissue (AT) transcriptomic signals related to increased fat content by identifying differentially expressed genes (DEGs), including long non-coding RNA molecules. The Freiburg RNA tool was applied to predict the hybridisation energy of RNA-RNA interactions. The results indicated several long non-coding RNAs (lncRNAs) whose expression was significantly positively or negatively associated with fat deposition. lncRNAs play an essential role in regulating gene expression by sponging miRNA, binding transcripts, facilitating translation, or coding other smaller RNA regulatory elements. In the fat tissue of the obese pig group, increased expression of lncRNAs corresponding to human MALAT1, which was previously recognised in the obesity-related context, was observed. Moreover, hybridisation energy analyses pinpointed numerous potential interactions between the identified differentially expressed lncRNAs and obesity-related genes and miRNAs expressed in AT.


2020 ◽  
Vol 21 (15) ◽  
pp. 5222 ◽  
Author(s):  
Xiao-Nan Fan ◽  
Shao-Wu Zhang ◽  
Song-Yao Zhang ◽  
Jin-Jie Ni

Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes and human complex diseases. Distinguishing lncRNAs from protein-coding transcripts is a fundamental step for analyzing the lncRNA functional mechanism. However, the experimental identification of lncRNAs is expensive and time-consuming. In this study, we presented an alignment-free multimodal deep learning framework (namely lncRNA_Mdeep) to distinguish lncRNAs from protein-coding transcripts. LncRNA_Mdeep incorporated three different input modalities, on which a multimodal deep learning framework was built to learn high-level abstract representations and predict the probability that a transcript was an lncRNA. LncRNA_Mdeep achieved 98.73% prediction accuracy in a 10-fold cross-validation test on human data. In an independent test on human data, lncRNA_Mdeep showed 93.12% prediction accuracy, which was 0.94% to 15.41% higher than that of eight other state-of-the-art methods. In addition, the results on 11 cross-species datasets showed that lncRNA_Mdeep was a powerful predictor for identifying lncRNAs.
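For readers unfamiliar with the evaluation protocol mentioned above, the following Python sketch shows one common way to generate 10-fold cross-validation splits. It is a generic illustration, not the authors' pipeline: the function name, the shuffling seed, and the round-robin fold assignment are assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample is used for testing exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Round-robin assignment keeps fold sizes within one sample of each other.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test


# Toy usage with a hypothetical dataset of 25 transcripts.
splits = list(k_fold_indices(25, k=10))
```

A model would be trained on each `train` list and scored on the matching `test` list, and the reported cross-validation accuracy is typically the average over the ten folds.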


2019 ◽  
Vol 5 (1) ◽  
pp. 13 ◽  
Author(s):  
Romana Butova ◽  
Petra Vychytilova-Faltejskova ◽  
Adela Souckova ◽  
Sabina Sevcikova ◽  
Roman Hajek

Multiple myeloma (MM) is the second most common hematooncological disease of malignant plasma cells in the bone marrow. While new treatments have brought an unprecedented increase in patient survival, MM pathogenesis is yet to be clarified. Increasing evidence that the expression of long non-coding RNA molecules (lncRNAs) is linked to the development and progression of many tumors suggests their important role in tumorigenesis. To date, over 15,000 lncRNA molecules characterized by diversity of function and specificity of cell distribution have been identified in the human genome. Due to their involvement in proliferation, apoptosis, metabolism, and differentiation, they have a key role in the biological processes and pathogenesis of many diseases, including MM. This review summarizes current knowledge of non-coding RNAs (ncRNAs), especially lncRNAs, and their role in MM pathogenesis. The undeniable involvement of lncRNAs in MM development suggests their potential as biomarkers.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yu Zhang ◽  
Yahui Long ◽  
Chee Keong Kwoh

Abstract Background: Long non-coding RNAs (lncRNAs) play significant roles in a variety of physiological and pathological processes. The premise of lncRNA functional studies is that the lncRNAs are identified correctly. Recently, deep learning methods such as the convolutional neural network (CNN) have been successfully applied to identify lncRNAs. However, the traditional CNN considers few relationships among samples, and only in an indirect way. Results: Inspired by the Siamese Neural Network (SNN), here we propose a novel network named Class Similarity Network for coding RNA and lncRNA classification. Class Similarity Network considers more relationships among input samples in a direct way. It focuses on exploring the potential relationships between input samples and samples from both the same class and the different classes. To achieve this, Class Similarity Network trains parameters specific to each class to obtain high-level features and represents the general similarity to each class in a node. The comparison results on the validation dataset under the same conditions illustrate the superiority of our Class Similarity Network over the baseline CNN. Besides, our method performs effectively and achieves state-of-the-art performance on two test datasets. Conclusions: We construct Class Similarity Network for coding RNA and lncRNA classification, which is shown to work effectively on two different datasets, achieving accuracy, precision, and F1-score of 98.43%, 0.9247, and 0.9374 on one and 97.54%, 0.9990, and 0.9860 on the other, respectively.
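The three reported metrics follow the standard definitions for binary classification, which the short Python sketch below spells out. The function name and the confusion-matrix counts in the usage line are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

Note that the abstract reports accuracy as a percentage and precision and F1-score as fractions, whereas this sketch returns all values as fractions between 0 and 1.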


2021 ◽  
Vol 01 (1) ◽  
pp. 9-15
Author(s):  
Imad Matouk

Increasing evidence has indicated that non-coding RNA molecules play central roles in almost all biological processes and many pathological conditions, including carcinogenesis. This review focuses on the pathological tumorigenic role of the first discovered long non-coding RNA gene, called H19, and its pivotal contribution to the cancer axis of evil. H19 RNA utilizes a variety of mechanisms to perform its pathological function. Some key unanswered questions are presented at the end. Understanding the H19 RNA mechanisms of action will shed light on the class of long non-coding RNAs, which contains thousands of members, mostly with unknown function, and will help in delineating the pathological role played by at least some of them.


2020 ◽  
Author(s):  
Xiao-Nan Fan ◽  
Shao-Wu Zhang ◽  
Song-Yao Zhang ◽  
Jin-Jie Ni

Abstract Background: Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes and human complex diseases. Distinguishing lncRNAs from protein-coding transcripts is a fundamental step for analyzing the lncRNA functional mechanism. However, the experimental identification of lncRNAs is expensive and time-consuming. Results: In this study, we present an alignment-free multimodal deep learning framework (namely lncRNA_Mdeep) to distinguish lncRNAs from protein-coding transcripts. LncRNA_Mdeep incorporates three different input modalities (i.e. OFH modality, k-mer modality, and sequence modality), on which a multimodal deep learning framework is built to learn high-level abstract representations and predict the probability that a transcript is an lncRNA. Conclusions: LncRNA_Mdeep achieves 98.73% prediction accuracy in a 10-fold cross-validation test on human data. In an independent test on human data, lncRNA_Mdeep shows 93.12% prediction accuracy, which is 0.94% to 15.41% higher than that of eight other state-of-the-art methods. In addition, the results on 11 cross-species datasets show that lncRNA_Mdeep is a powerful predictor for identifying lncRNAs. The source code can be downloaded from https://github.com/NWPU-903PR/lncRNA_Mdeep.


2020 ◽  
Vol 25 (41) ◽  
pp. 4368-4378 ◽  
Author(s):  
Mahesh Mundalil Vasu ◽  
Puthiripadath S. Sumitha ◽  
Parakkal Rahna ◽  
Ismail Thanseem ◽  
Ayyappan Anitha

Background: Efforts to unravel the extensive impact of the non-coding elements of the human genome on cell homeostasis and pathological processes have gained momentum over the last couple of decades. miRNAs are short non-coding RNA molecules, often 18-25 nucleotides long, which can regulate gene expression. Each miRNA can regulate several mRNAs. Methods: This article reviews the literature on the roles of miRNAs in autism. Results: Considering that ~1% of the human DNA encodes different families of miRNAs, their overall impact as critical regulators of gene expression in the mammalian brain should be immense. Though autism spectrum disorders (ASDs) are predominantly genetic in nature and several candidate genes have already been identified, the highly heterogeneous and multifactorial nature of the disorder makes it difficult to identify common genetic risk factors. Several studies have suggested that environmental factors may interact with genetic factors to increase the risk. miRNAs could possibly be one of the factors that explain this link between genetics and the environment. Conclusion: In the present review, we have summarized our current knowledge on miRNAs and their complex roles in ASD, and also on their therapeutic applications.


Database ◽  
2018 ◽  
Vol 2018 ◽  
Author(s):  
Xiao Wen ◽  
Lin Gao ◽  
Xingli Guo ◽  
Xing Li ◽  
Xiaotai Huang ◽  
...  
