circDeep: deep learning approach for circular RNA classification from other long non-coding RNA

2019 ◽  
Vol 36 (1) ◽  
pp. 73-80 ◽  
Author(s):  
Mohamed Chaabane ◽  
Robert M Williams ◽  
Austin T Stephens ◽  
Juw Won Park

Abstract Motivation: Over the past two decades, a circular form of RNA (circular RNA), produced through alternative splicing, has become the focus of scientific studies due to its major role as a microRNA (miRNA) activity modulator and its association with various diseases, including cancer. Therefore, the detection of circular RNAs is vital to understanding their biogenesis and purpose. Prediction of circular RNA can be achieved in three steps: distinguishing non-coding RNAs from protein-coding gene transcripts, separating short and long non-coding RNAs, and predicting circular RNAs from other long non-coding RNAs (lncRNAs). However, the available tools are less than 80 percent accurate at distinguishing circular RNAs from other lncRNAs because of the difficulty of the classification. Therefore, a more accurate and faster machine learning method for identifying circular RNAs, one that considers the specific features of circular RNA, is essential to the development of systematic annotation. Results: Here we present an end-to-end deep learning framework, circDeep, to classify circular RNA from other lncRNA. circDeep fuses an RCM descriptor, an ACNN-BLSTM sequence descriptor and a conservation descriptor into high-level abstraction descriptors, where the shared representations across different modalities are integrated. The experiments show that circDeep is not only faster than existing tools but also more accurate, achieving a 12 percent increase in accuracy over the other tools. Availability and implementation: https://github.com/UofLBioinformatics/circDeep. Supplementary information: Supplementary data are available at Bioinformatics online.
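The three-step prediction cascade described in this abstract can be sketched as follows. This is only an illustration of the control flow, not circDeep's implementation: the function name `classify_transcript`, the two predicate arguments, and the 200-nucleotide cutoff (a common convention for lncRNAs, not a value stated in this abstract) are assumptions introduced here.

```python
from typing import Callable

# Assumption: transcripts longer than 200 nt count as "long" non-coding RNAs.
LONG_NCRNA_MIN_LENGTH = 200


def classify_transcript(
    seq: str,
    is_coding: Callable[[str], bool],
    is_circular: Callable[[str], bool],
    min_long_length: int = LONG_NCRNA_MIN_LENGTH,
) -> str:
    """Run the three-step cascade on one transcript sequence.

    `is_coding` and `is_circular` are hypothetical stand-ins for trained
    classifiers; any callable that maps a sequence to a bool will do.
    """
    # Step 1: separate protein-coding transcripts from non-coding RNAs.
    if is_coding(seq):
        return "protein-coding"
    # Step 2: separate short from long non-coding RNAs by length.
    if len(seq) <= min_long_length:
        return "short non-coding"
    # Step 3: separate circular RNAs from other lncRNAs.
    return "circular RNA" if is_circular(seq) else "other lncRNA"
```

In this sketch, only the third step would correspond to the task circDeep addresses; the first two steps are left to whatever upstream predictors are available.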

Author(s):  
Erik S Wright

Abstract Summary: Non-coding RNAs are often neglected during genome annotation because they are harder to detect than protein-coding genes. FindNonCoding takes a pattern mining approach to capture the essential sequence motifs and hairpin loops representing a non-coding RNA family and quickly identify matches in genomes. FindNonCoding was designed for ease of use and accurately finds non-coding RNAs with a low false discovery rate. Availability: FindNonCoding is implemented within the DECIPHER package (v2.19.3) for R (v4.1), available from Bioconductor. Pre-trained models of common non-coding RNA families are included for bacteria, archaea, and eukarya. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 21 (15) ◽  
pp. 5222 ◽  
Author(s):  
Xiao-Nan Fan ◽  
Shao-Wu Zhang ◽  
Song-Yao Zhang ◽  
Jin-Jie Ni

Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes and human complex diseases. Distinguishing lncRNAs from protein-coding transcripts is a fundamental step for analyzing the lncRNA functional mechanism. However, the experimental identification of lncRNAs is expensive and time-consuming. In this study, we presented an alignment-free multimodal deep learning framework (namely lncRNA_Mdeep) to distinguish lncRNAs from protein-coding transcripts. LncRNA_Mdeep incorporated three different input modalities, and a multimodal deep learning framework was then built to learn high-level abstract representations and predict the probability that a transcript was an lncRNA. LncRNA_Mdeep achieved 98.73% prediction accuracy in a 10-fold cross-validation test on human data. Compared with eight other state-of-the-art methods, lncRNA_Mdeep showed 93.12% prediction accuracy in an independent test on human data, which was 0.94%~15.41% higher than that of the other eight methods. In addition, the results on 11 cross-species datasets showed that lncRNA_Mdeep was a powerful predictor of lncRNAs.
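For readers unfamiliar with the evaluation protocol, the 10-fold cross-validation mentioned above can be sketched in plain Python. This is a generic illustration under stated assumptions, not the authors' evaluation code: the helper names `k_fold_splits` and `mean_accuracy` and the fixed shuffling seed are introduced here for the example.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def k_fold_splits(
    n_samples: int, k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if k < 2 or n_samples < k:
        raise ValueError("need k >= 2 and at least k samples")
    indices = list(range(n_samples))
    # Shuffle once so that each fold is a random subset of the data.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [
            idx for j, fold in enumerate(folds) if j != held_out for idx in fold
        ]
        yield train_idx, test_idx


def mean_accuracy(fold_accuracies: Sequence[float]) -> float:
    """Average the per-fold accuracies into a single reported figure."""
    return sum(fold_accuracies) / len(fold_accuracies)
```

Each sample is held out exactly once, so a cross-validated accuracy is an average over k models that were each tested on data they did not see during training.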


2021 ◽  
Vol 11 ◽  
Author(s):  
Soudeh Ghafouri-Fard ◽  
Tayyebeh Khoshbakht ◽  
Mohammad Taheri ◽  
Elena Jamali

Circular RNAs (circRNAs) are a group of long non-coding RNAs with an enclosed structure generated by back-splicing events. Numerous members of this group of transcripts have been shown to affect carcinogenesis. Circular RNA itchy E3 ubiquitin protein ligase (circITCH) is a circRNA created by back-splicing events in the ITCH gene, a protein-coding gene in the 20q11.22 region. ITCH acts as a catalyst for ubiquitination through both proteolytic and non-proteolytic routes. CircITCH is involved in the pathoetiology of cancers through regulation of the linear isoform as well as by serving as a sponge for several microRNAs, namely miR-17, miR-224, miR-214, miR-93-5p, miR-22, miR-7, miR-106a, miR-10a, miR-145, miR-421, miR-224-5p, miR-197 and miR-199a-5p. CircITCH is also involved in the modulation of the Wnt/β-catenin and PTEN/PI3K/AKT pathways. Except for a single study in osteosarcoma, circITCH has been found to exert a tumor suppressor role in diverse cancers. In the present manuscript, we provide a comprehensive review of investigations that reported the function of circITCH in carcinogenesis.


2020 ◽  
Author(s):  
Xiao-Nan Fan ◽  
Shao-Wu Zhang ◽  
Song-Yao Zhang ◽  
Jin-Jie Ni

Abstract Background: Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes and human complex diseases. Distinguishing lncRNAs from protein-coding transcripts is a fundamental step for analyzing the lncRNA functional mechanism. However, the experimental identification of lncRNAs is expensive and time-consuming. Results: In this study, we present an alignment-free multimodal deep learning framework (namely lncRNA_Mdeep) to distinguish lncRNAs from protein-coding transcripts. LncRNA_Mdeep incorporates three different input modalities (i.e. OFH modality, k-mer modality, and sequence modality), and a multimodal deep learning framework is then built to learn high-level abstract representations and predict the probability that a transcript is an lncRNA. Conclusions: LncRNA_Mdeep achieves 98.73% prediction accuracy in a 10-fold cross-validation test on human data. Compared with eight other state-of-the-art methods, lncRNA_Mdeep shows 93.12% prediction accuracy in an independent test on human data, which is 0.94%~15.41% higher than that of the other eight methods. In addition, the results on 11 cross-species datasets show that lncRNA_Mdeep is a powerful predictor for identifying lncRNAs. The source code can be downloaded from https://github.com/NWPU-903PR/lncRNA_Mdeep.
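As an illustration of what a k-mer input modality might look like, the sketch below turns a transcript into a normalized k-mer frequency vector. It is a generic, alignment-free encoding written for this example and is not taken from the lncRNA_Mdeep source: the function name `kmer_frequencies`, the default `k=3`, and the choice to skip windows containing ambiguous bases are all assumptions.

```python
from itertools import product
from typing import List


def kmer_frequencies(seq: str, k: int = 3, alphabet: str = "ACGT") -> List[float]:
    """Return the relative frequency of every possible k-mer in `seq`.

    The output has len(alphabet) ** k entries, ordered lexicographically
    by the alphabet (AAA, AAC, AAG, ... for k=3).
    """
    # Treat RNA and DNA spellings alike by mapping U to T.
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k over the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # assumption: skip windows with N or other symbols
            counts[window] += 1
            total += 1
    # Normalize so that sequences of different lengths are comparable.
    return [counts[km] / total if total else 0.0 for km in kmers]
```

Because the vector length depends only on k and not on transcript length, such an encoding can be fed to a fixed-size network input without aligning sequences to one another.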


Author(s):  
Olga Wawrzyniak ◽  
Żaneta Zarębska ◽  
Katarzyna Rolle ◽  
Anna Gotz-Więckowska

Long non-coding RNAs are >200-nucleotide-long RNA molecules which lack or have limited protein-coding potential. They can regulate protein formation through several different mechanisms. Similarly, circular RNAs are reported to play a critical role in post-transcriptional gene regulation. Changes in the expression pattern of these molecules are known to underlie various diseases, including cancer, cardiovascular, neurological and immunological disorders (Rinn & Chang, 2012; Sun & Kraus, 2015). Recent studies suggest that they are differentially expressed both in healthy ocular tissues and in eye pathologies, such as neovascularization, proliferative vitreoretinopathy, glaucoma, cataract, ocular malignancy or even strabismus (Li et al., 2016). The aetiology of ocular diseases is multifactorial and combines genetic and environmental factors, including epigenetic factors and non-coding RNAs. In addition, disorders like diabetic retinopathy or age-related macular degeneration lack biomarkers for early detection as well as effective treatment methods that would allow the disease progression to be controlled at its early stages. The newly discovered non-coding RNAs seem to be ideal candidates for novel molecular markers and therapeutic strategies. In this review, we summarize the current knowledge about two classes of gene expression regulators, long non-coding and circular RNA molecules, in eye diseases.


2021 ◽  
Author(s):  
Marianne C Kramer ◽  
Hee Jong Kim ◽  
Kyle R Palos ◽  
Benjamin A Garcia ◽  
Eric Lyons ◽  
...  

Long non-coding RNAs (lncRNAs) are an increasingly studied group of non-protein-coding transcripts with a wide variety of molecular functions, gaining attention for their roles in numerous biological processes. Nearly 6,000 lncRNAs have been identified in Arabidopsis thaliana, but many have yet to be studied. Here, we examine a class of previously uncharacterized lncRNAs termed CONSERVED IN BRASSICA RAPA (lncCOBRA) transcripts that were previously identified for their high level of sequence conservation in the related crop species Brassica rapa, their nuclear localization and their protein-bound nature. In particular, we focus on lncCOBRA1 and demonstrate that its abundance is highly tissue- and development-specific, with particularly high levels early in germination. lncCOBRA1 contains two snoRNA domains, making it the first sno-lincRNA example in a non-mammalian system. However, we find that it is processed differently than its mammalian counterparts. We further show that plants lacking lncCOBRA1 display delayed germination and are overall smaller than wild-type plants. Lastly, we identify the proteins that interact with lncCOBRA1 and examine the protein-protein interaction network of lncCOBRA1-interacting proteins.


2019 ◽  
Author(s):  
Xiao-Nan Fan ◽  
Shao-Wu Zhang ◽  
Song-Yao Zhang ◽  
Jin-Jie Ni

Abstract Background: Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes and human complex diseases. Distinguishing lncRNAs from protein-coding transcripts is a fundamental step for analyzing the lncRNA functional mechanism. However, the experimental identification of lncRNAs is expensive and time-consuming. Results: In this study, we present an alignment-free multimodal deep learning framework (namely lncRNA_Mdeep) to distinguish lncRNAs from protein-coding transcripts. LncRNA_Mdeep incorporates three different input modalities (i.e. OFH modality, k-mer modality, and sequence modality), and a multimodal deep learning framework is then built to learn high-level abstract representations and predict the probability that a transcript is an lncRNA. Conclusions: LncRNA_Mdeep achieves 98.73% prediction accuracy in a 10-fold cross-validation test on human data. Compared with eight other state-of-the-art methods, lncRNA_Mdeep shows 93.12% prediction accuracy in an independent test on human data, which is 0.94%~15.41% higher than that of the other eight methods. In addition, the results on 11 cross-species datasets show that lncRNA_Mdeep is a powerful predictor for identifying lncRNAs. The source code can be downloaded from https://github.com/NWPU-903PR/lncRNA_Mdeep.


2018 ◽  
Vol 27 (12) ◽  
pp. 1763-1777 ◽  
Author(s):  
Sheng-Wen Wang ◽  
Zhong Liu ◽  
Zhong-Song Shi

Non-coding RNAs (ncRNAs) are a class of functional RNAs that regulate gene expression in a post-transcriptional manner. NcRNAs include microRNAs, long non-coding RNAs and circular RNAs. They are highly expressed in the brain and are involved in the regulation of physiological and pathophysiological processes, including cerebral ischemic injury, neurodegeneration, neural development, and plasticity. Stroke is one of the leading causes of death and physical disability worldwide. Acute ischemic stroke (AIS) occurs when brain blood flow stops, and that stoppage results in reduced oxygen and glucose supply to cells in the brain. In this article, we review the latest progress on ncRNAs in relation to their implications in AIS, as well as their potential as diagnostic and prognostic biomarkers. We also review ncRNAs as possible therapeutic targets in future precision medicine. Finally, we conclude with a brief discussion of current challenges and future directions for ncRNA studies in AIS, which may facilitate the translation of ncRNA research into clinical practice to improve the clinical outcome of AIS.


2018 ◽  
Vol 4 (4) ◽  
pp. 40 ◽  
Author(s):  
Carolyn Klinge

Non-coding RNAs (ncRNAs) are regulators of intracellular and intercellular signaling in breast cancer. ncRNAs modulate intracellular signaling to control diverse cellular processes, including levels and activity of estrogen receptor α (ERα), proliferation, invasion, migration, apoptosis, and stemness. In addition, ncRNAs can be packaged into exosomes to provide intercellular communication by the transmission of microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) to cells locally or systemically. This review provides an overview of the biogenesis and roles of ncRNAs: small nucleolar RNA (snRNA), circular RNAs (circRNAs), PIWI-interacting RNAs (piRNAs), miRNAs, and lncRNAs in breast cancer. Since more is known about the miRNAs and lncRNAs that are expressed in breast tumors, their established targets as oncogenic drivers and tumor suppressors will be reviewed. The focus is on miRNAs and lncRNAs identified in breast tumors, since a number of ncRNAs identified in breast cancer cells are not dysregulated in breast tumors. The identity and putative function of selected lncRNAs whose levels are increased in breast tumors, namely nuclear paraspeckle assembly transcript 1 (NEAT1), metastasis-associated lung adenocarcinoma transcript 1 (MALAT1), steroid receptor RNA activator 1 (SRA1), colon cancer associated transcript 2 (CCAT2), colorectal neoplasia differentially expressed (CRNDE), myocardial infarction associated transcript (MIAT), and long intergenic non-protein coding RNA, Regulator of Reprogramming (LINC-ROR), or decreased, namely maternally-expressed 3 (MEG3), are reviewed as well. miRNAs and lncRNAs are considered targets of therapeutic intervention in breast cancer, but further work is needed to bring the promise of regulating their activities to clinical use.


2018 ◽  
Vol 35 (13) ◽  
pp. 2326-2328 ◽  
Author(s):  
Tobias Jakobi ◽  
Alexey Uvarovskii ◽  
Christoph Dieterich

Abstract Motivation: Circular RNAs (circRNAs) originate through back-splicing events from linear primary transcripts, are resistant to exonucleases, are not polyadenylated and have been shown to be highly specific for cell type and developmental stage. CircRNA detection starts from high-throughput sequencing data and is a multi-stage bioinformatics process yielding sets of potential circRNA candidates that require further analyses. While a number of tools for the prediction process already exist, publicly available analysis tools for further characterization are rare. Our work provides researchers with a harmonized workflow that covers different stages of in silico circRNA analyses, from prediction to first functional insights. Results: Here, we present circtools, a modular, Python-based framework for computational circRNA analyses. The software includes modules for circRNA detection, internal sequence reconstruction, quality checking, statistical testing, screening for enrichment of RBP binding sites, differential exon RNase R resistance and circRNA-specific primer design. circtools supports researchers with visualization options and data export into commonly used formats. Availability and implementation: circtools is available via https://github.com/dieterich-lab/circtools and http://circ.tools under GPLv3.0. Supplementary information: Supplementary data are available at Bioinformatics online.

