High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing

2017 ◽  
Vol 49 (12) ◽  
pp. 1731-1740 ◽  
Author(s):  
Julien Lagarde ◽  
Barbara Uszczynska-Ratajczak ◽  
Silvia Carbonell ◽  
Sílvia Pérez-Lluch ◽  
Amaya Abad ◽  
...  

2017 ◽  
Author(s):  
Julien Lagarde ◽  
Barbara Uszczynska-Ratajczak ◽  
Silvia Carbonell ◽  
Sílvia Pérez-Lluch ◽  
Amaya Abad ◽  
...  

Abstract: Accurate annotation of genes and their transcripts is a foundation of genomics, but no annotation technique currently combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, while thousands more remain uncatalogued, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. We present an experimental re-annotation of the GENCODE intergenic lncRNA population in matched human and mouse tissues, resulting in novel transcript models for 3574 and 561 gene loci, respectively. CLS approximately doubles the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enable us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a longstanding bottleneck of transcriptome annotation, generating manual-quality full-length transcript models at high-throughput scales.

Abbreviations: bp, base pair; FL, full length; nt, nucleotide; ROI, read of insert, i.e. PacBio read; SJ, splice junction; SMRT, single-molecule real-time; TM, transcript model.



Author(s):  
Sílvia Carbonell Sala ◽  
Barbara Uszczyńska-Ratajczak ◽  
Julien Lagarde ◽  
Rory Johnson ◽  
Roderic Guigó


2019 ◽  
Vol 20 (5) ◽  
pp. 1853-1864 ◽  
Author(s):  
Seo-Won Choi ◽  
Hyun-Woo Kim ◽  
Jin-Wu Nam

Abstract: Long noncoding RNAs (lncRNAs) are a group of transcripts that are longer than 200 nucleotides (nt) and lack coding potential. Over the past decade, tens of thousands of novel lncRNAs have been annotated in animal and plant genomes, thanks to advances in high-throughput RNA sequencing technologies and with the aid of coding transcript classifiers. Further, a considerable number of reports have revealed the existence of stable, functional small peptides (also known as micropeptides) translated from lncRNAs. In this review, we discuss the methods of lncRNA classification, the investigations regarding their coding potential and the functional significance of the peptides they encode.



2020 ◽  
Author(s):  
Xiaomin Zheng ◽  
Yanjun Chen ◽  
Yifan Zhou ◽  
Danyang Li ◽  
Keke Shi ◽  
...  

Abstract: Long noncoding RNAs (lncRNAs) are crucial factors during plant development and environmental responses. High-throughput and accurate identification of lncRNAs is still lacking in plants. To build an accurate atlas of lncRNAs in cotton, we combined Isoform-sequencing (Iso-seq), strand-specific RNA-seq (ssRNA-seq), cap analysis of gene expression (CAGE-seq) and PolyA-seq, and compiled a pipeline named plant full-length lncRNA (PULL) to integrate the multi-omics data. A total of 9240 lncRNAs from 21 tissue samples of the diploid cotton Gossypium arboreum were identified. We revealed that alternative usage of the transcription start site (TSS) and transcription end site (TES) of lncRNAs occurs pervasively during plant growth and responses to stress. We identified the lncRNAs that are co-expressed with, or linked to, protein-coding genes (PCGs) or GWAS-identified SNPs associated with ovule and fiber development. We also mapped the genome-wide binding sites of two lncRNAs with chromatin isolation by RNA purification sequencing (ChIRP-seq) and validated the trans transcriptional regulation of lnc-Ga13g0352 via a virus-induced gene silencing (VIGS) assay. These findings provide valuable research resources for the plant research community and broaden our understanding of the biogenesis and regulatory functions of plant lncRNAs.

One-sentence summary: The full-length annotation and transcriptional regulation of long noncoding RNAs in cotton.





BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Ying Wan ◽  
Xiaoyang Liu ◽  
Dongwang Zheng ◽  
Yuying Wang ◽  
Huan Chen ◽  
...  


2020 ◽  
Author(s):  
Ying-Feng Zheng ◽  
Zhi-Chao Chen ◽  
Zhuo-Xing Shi ◽  
Kun-Hua Hu ◽  
Jia-Yong Zhong ◽  
...  

Abstract: Single-cell isoform sequencing can reveal transcriptomic dynamics in individual cells that are invisible to bulk and single-cell RNA analysis based on short-read sequencing. However, current long-read single-cell sequencing technologies have been limited by low throughput and high error rates. Here we introduce HIT-scISOseq for high-throughput single-cell isoform sequencing. This method was made possible by full-length cDNA capture using biotinylated PCR primers, and by our novel library preparation procedure that combines head-to-tail concatemeric full-length cDNAs into a long SMRTbell insert for high-accuracy PacBio sequencing. HIT-scISOseq yields > 10 million high-accuracy full-length isoforms in a single PacBio Sequel II 8M SMRT Cell, providing > 8 times more data output than the standard single-cell isoform PacBio sequencing protocol. We exemplified HIT-scISOseq by first studying transcriptome profiles of 4,000 normal and 8,000 injured corneal epitheliums from cynomolgus monkeys. We constructed dynamic transcriptome landscapes of known and rare cell types, revealed novel isoforms, and identified injury-related splicing and switching events that were previously inaccessible with low-throughput isoform sequencing. HIT-scISOseq represents a high-throughput, cost-effective, and technically simple method to accelerate the burgeoning field of long-read single-cell transcriptomics.


