The full-length transcriptome of C. elegans using direct RNA sequencing

2020 ◽  
Vol 30 (2) ◽  
pp. 299-312 ◽  
Author(s):  
Nathan P. Roach ◽  
Norah Sadowski ◽  
Amelia F. Alessi ◽  
Winston Timp ◽  
James Taylor ◽  
...  
2019 ◽  
Author(s):  
Nathan P. Roach ◽  
Norah Sadowski ◽  
Amelia F. Alessi ◽  
Winston Timp ◽  
James Taylor ◽  
...  

Abstract: Current transcriptome annotations have largely relied on the short read lengths intrinsic to the most widely used high-throughput cDNA sequencing technologies. For example, in the annotation of the Caenorhabditis elegans transcriptome, more than half of the transcript isoforms lack full-length support and instead rely on inference from short reads that do not span the full length of the isoform. We applied nanopore-based direct RNA sequencing to characterize the developmental polyadenylated transcriptome of C. elegans. Taking advantage of long reads spanning the full length of mRNA transcripts, we provide support for 20,902 splice isoforms across 14,115 genes, without the need for computational reconstruction of gene models. Of the isoforms identified, 2,188 are novel splice isoforms not present in the Wormbase WS265 annotation. Furthermore, we identified 16,325 3’ untranslated region (3’UTR) isoforms, 2,304 of which are novel and do not fall within 10 bp of existing 3’UTR datasets and annotations. Combining 3’UTRs and splice isoforms, we identified 25,944 full-length isoforms. We also determined that poly(A) tail lengths of transcripts vary across development, as do the strengths of previously reported correlations between poly(A) tail length and expression level, and between poly(A) tail length and 3’UTR length. Finally, we have formatted these data as a publicly accessible track hub, enabling researchers to explore this dataset easily in a genome browser.
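The 10 bp novelty criterion mentioned in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the function name `is_novel`, the inclusive treatment of the 10 bp window, and the simplification to end positions on a single chromosome and strand are all assumptions made for illustration.

```python
from bisect import bisect_left

def is_novel(site, annotated_sorted, window=10):
    """Return True if no annotated 3' end lies within `window` bp of `site`.

    `annotated_sorted` is a sorted list of annotated 3'UTR end positions on
    one chromosome and strand (hypothetical input format). A distance of
    exactly `window` bp is counted as "within" (an assumption).
    """
    # Index of the first annotated position >= site - window.
    i = bisect_left(annotated_sorted, site - window)
    # The site is known if that position also falls at or below site + window.
    return not (i < len(annotated_sorted) and annotated_sorted[i] <= site + window)

# Made-up coordinates, for illustration only.
annotated = [100, 250, 400]
results = {pos: is_novel(pos, annotated) for pos in (105, 111, 260, 261)}
```

With these made-up coordinates, positions 105 and 260 sit within 10 bp of an annotated end and are not novel, while 111 and 261 fall outside every window. A real comparison would repeat this per chromosome and strand.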


2012 ◽  
Vol 110 (2) ◽  
pp. 594-599 ◽  
Author(s):  
X. Pan ◽  
R. E. Durrett ◽  
H. Zhu ◽  
Y. Tanaka ◽  
Y. Li ◽  
...  

2019 ◽  
Author(s):  
Anne Deslattes Mays ◽  
Marcel O. Schmidt ◽  
Garrett T. Graham ◽  
Elizabeth Tseng ◽  
Primo Baybayan ◽  
...  

Abstract: Hematopoietic cells are continuously replenished from progenitor cells that reside in the bone marrow. To evaluate molecular changes during this process, we analyzed the transcriptomes of freshly harvested human bone marrow progenitor (lineage-negative) and differentiated (lineage-positive) cells by single-molecule, real-time (SMRT) full-length RNA sequencing. This analysis revealed a ∼5-fold higher number of transcript isoforms than previously detected and showed a distinct composition of individual transcript isoforms characteristic of bone marrow subpopulations. A detailed analysis of mRNA isoforms transcribed from the ANXA1 and EEF1A1 loci confirmed their distinct composition. The expression of proteins predicted from the transcriptome analysis was validated by mass spectrometry, which also confirmed previously unknown protein isoforms predicted, e.g., for EEF1A1. These protein isoforms distinguished the lineage-negative cell population from the lineage-positive cell population. Finally, transcript isoforms expressed from paralogous gene loci (e.g. CFD, GATA2, HLA-A, B & C) also distinguished cell subpopulations but were only detectable by full-length RNA sequencing. Thus, qualitatively distinct transcript isoforms from individual genomic loci separate bone marrow cell subpopulations, indicating complex transcriptional regulation and protein isoform generation during hematopoiesis.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Matthew T Parker ◽  
Katarzyna Knop ◽  
Anna V Sherwood ◽  
Nicholas J Schurch ◽  
Katarzyna Mackinnon ◽  
...  

Understanding genome organization and gene regulation requires insight into RNA transcription, processing and modification. We adapted nanopore direct RNA sequencing to examine RNA from a wild-type accession of the model plant Arabidopsis thaliana and a mutant defective in mRNA methylation (m6A). Here we show that m6A can be mapped in full-length mRNAs transcriptome-wide and reveal the combinatorial diversity of cap-associated transcription start sites, splicing events, poly(A) site choice and poly(A) tail length. Loss of m6A from 3’ untranslated regions is associated with decreased relative transcript abundance and defective RNA 3′ end formation. A functional consequence of disrupted m6A is a lengthening of the circadian period. We conclude that nanopore direct RNA sequencing can reveal the complexity of mRNA processing and modification in full-length single molecule reads. These findings can refine Arabidopsis genome annotation. Further, applying this approach to less well-studied species could transform our understanding of what their genomes encode.


2018 ◽  
Vol 9 (1) ◽  
Author(s):  
Tetsutaro Hayashi ◽  
Haruka Ozaki ◽  
Yohei Sasagawa ◽  
Mana Umeda ◽  
Hiroki Danno ◽  
...  

Development ◽  
2000 ◽  
Vol 127 (4) ◽  
pp. 821-830 ◽  
Author(s):  
J. Larrain ◽  
D. Bachiller ◽  
B. Lu ◽  
E. Agius ◽  
S. Piccolo ◽  
...  

A number of genetic and molecular studies have implicated Chordin in the regulation of dorsoventral patterning during gastrulation. Chordin, a BMP antagonist of 120 kDa, contains four small (about 70 amino acids each) cysteine-rich domains (CRs) of unknown function. In this study, we show that the Chordin CRs define a novel protein module for the binding and regulation of BMPs. The biological activity of Chordin resides in the CRs, especially in CR1 and CR3, which have dorsalizing activity in Xenopus embryo assays and bind BMP4 with dissociation constants in the nanomolar range. The activity of individual CRs, however, is 5- to 10-fold lower than that of full-length Chordin. These results shed light on the molecular mechanism by which Chordin/BMP complexes are regulated by the metalloprotease Xolloid, which cleaves in the vicinity of CR1 and CR3 and would release CR/BMP complexes with lower anti-BMP activity than intact Chordin. CR domains are found in other extracellular proteins such as procollagens. Full-length Xenopus procollagen IIA mRNA has dorsalizing activity in embryo microinjection assays and the CR domain is required for this activity. Similarly, a C. elegans cDNA containing five CR domains induces secondary axes in injected Xenopus embryos. These results suggest that CR modules may function in a number of extracellular proteins to regulate growth factor signalling.


Author(s):  
Ramiro Lorenzo ◽  
Michiho Onizuka ◽  
Matthieu Defrance ◽  
Patrick Laurent

Abstract: Single-cell RNA-sequencing (scRNA-seq) of the Caenorhabditis elegans nervous system offers the unique opportunity to obtain a partial expression profile for each neuron within a known connectome. Building on recent scRNA-seq data and on a molecular atlas describing the expression pattern of ∼800 genes at single-cell resolution, we designed an iterative clustering analysis aiming to match each cell cluster to the ∼100 anatomically defined neuron classes of C. elegans. This heuristic approach successfully assigned 97 of the 118 neuron classes to a cluster. Sixty-two clusters were assigned to a single neuron class and 15 clusters grouped neuron classes sharing close molecular signatures. Pseudotime analysis revealed a maturation process occurring in some neurons (e.g. PDA) during the L2 stage. Based on the molecular profiles of all identified neurons, we predicted cell fate regulators and experimentally validated unc-86 for the normal differentiation of RMG neurons. Furthermore, we observed that different classes of genes functionally diversify sensory neurons, interneurons and motorneurons. Finally, we designed 15 new neuron class-specific promoters validated in vivo. Amongst them, 10 represent the only specific promoter reported to this day, expanding the list of neurons amenable to genetic manipulations.


2017 ◽  
Author(s):  
Belinda Phipson ◽  
Luke Zappia ◽  
Alicia Oshlack

Abstract: Single-cell RNA sequencing (scRNA-seq) has rapidly gained popularity for profiling transcriptomes of hundreds to thousands of single cells. This technology has led to the discovery of novel cell types and revealed insights into the development of complex tissues. However, many technical challenges need to be overcome during data generation. Due to minute amounts of starting material, samples undergo extensive amplification, increasing technical variability. A solution for mitigating amplification biases is to include Unique Molecular Identifiers (UMIs), which tag individual molecules. Transcript abundances are then estimated from the number of unique UMIs aligning to a specific gene, and PCR duplicates resulting in copies of the UMI are not included in expression estimates. Here we investigate the effect of gene length bias in scRNA-seq across a variety of datasets differing in terms of capture technology, library preparation, cell types and species. We find that scRNA-seq datasets that have been sequenced using a full-length transcript protocol exhibit gene length bias akin to bulk RNA-seq data. Specifically, shorter genes tend to have lower counts and a higher rate of dropout. In contrast, protocols that include UMIs do not exhibit gene length bias, and have a mostly uniform rate of dropout across genes of varying length. Across four different scRNA-seq datasets profiling mouse embryonic stem cells (mESCs), we found the subset of genes that are only detected in the UMI datasets tended to be shorter, while the subset of genes detected only in the full-length datasets tended to be longer. We briefly discuss the role of these genes in the context of differential expression testing and GO analysis. In addition, despite clear differences between UMI and full-length transcript data, we illustrate that full-length and UMI data can be combined to reveal underlying biology influencing expression of mESCs.
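The UMI-based abundance estimate described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `umi_counts`, the `(cell, gene, umi)` tuple format, and the example reads are assumptions, and the sketch ignores UMI sequencing errors, which real tools typically correct for.

```python
from collections import defaultdict

def umi_counts(alignments):
    """Count unique UMIs per (cell, gene) pair.

    `alignments` is an iterable of (cell_barcode, gene, umi) tuples, one per
    aligned read (hypothetical input format). Reads that repeat a UMI for the
    same cell and gene are treated as PCR duplicates and counted once.
    """
    seen = defaultdict(set)
    for cell, gene, umi in alignments:
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}

# Made-up reads, for illustration only.
reads = [
    ("cell1", "GeneA", "AACG"),
    ("cell1", "GeneA", "AACG"),  # same cell, gene and UMI: a PCR duplicate
    ("cell1", "GeneA", "TTGC"),
    ("cell1", "GeneB", "AACG"),
    ("cell2", "GeneA", "AACG"),
]
counts = umi_counts(reads)
```

Here three reads map to GeneA in cell1, but only two distinct UMIs are present, so the estimate is two molecules. Because each molecule contributes one count regardless of transcript length, this kind of counting is consistent with the abstract's finding that UMI protocols do not show gene length bias.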

