Mapping and modeling the genomic basis of differential RNA isoform expression at single-cell resolution with LR-Split-seq

Abstract Alternative RNA isoforms are defined by promoter choice, alternative splicing, and polyA site selection. Although differential isoform expression is known to play a large regulatory role in eukaryotes, it has proved challenging to study with standard short-read RNA-seq because of the uncertainties it leaves about the full-length structure and precise termini of transcripts. The rise in throughput and quality of long-read sequencing now makes it possible, in principle, to unambiguously identify most transcript isoforms from beginning to end. However, its application to single-cell RNA-seq has been limited by throughput and expense. Here, we develop and characterize long-read Split-seq (LR-Split-seq), which uses a combinatorial barcoding-based method for sequencing single cells and nuclei with long reads. We show that LR-Split-seq can associate isoforms with cell types with relative economy and design flexibility. We characterize LR-Split-seq for whole cells and nuclei by using the well-studied mouse C2C12 system in which mononucleated myoblast cells differentiate and fuse into multinucleated myotubes. We show that the overall results are reproducible when comparing long- and short-read data from the same cell or nucleus. We find substantial evidence of differential isoform expression during differentiation including alternative transcription start site (TSS) usage. We integrate the resulting isoform expression dynamics with snATAC-seq chromatin accessibility to validate TSS-driven isoform choices. LR-Split-seq provides an affordable method for identifying cluster-specific isoforms in single cells that can be further quantified with companion deep short-read scRNA-seq from the same cell populations.

Mapping and modeling the genomic basis of differential RNA isoform expression at single-cell resolution with LR-Split-seq

Genome Biology ◽

10.1186/s13059-021-02505-w ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Elisabeth Rebboah ◽

Fairlie Reese ◽

Katherine Williams ◽

Gabriela Balderrama-Gutierrez ◽

Cassandra McGill ◽

...

Keyword(s):

Single Cell ◽

Single Cells ◽

Cell Types ◽

Transcription Start Sites ◽

Design Flexibility ◽

Long Reads ◽

Internal Exon ◽

Long Read ◽

Isoform Expression ◽

Full Length Transcript

Abstract The rise in throughput and quality of long-read sequencing should allow unambiguous identification of full-length transcript isoforms. However, its application to single-cell RNA-seq has been limited by throughput and expense. Here we develop and characterize long-read Split-seq (LR-Split-seq), which uses combinatorial barcoding to sequence single cells with long reads. Applied to the C2C12 myogenic system, LR-Split-seq associates isoforms with cell types with relative economy and design flexibility. We find widespread evidence of changing isoform expression during differentiation including alternative transcription start sites (TSS) and/or alternative internal exon usage. LR-Split-seq provides an affordable method for identifying cluster-specific isoforms in single cells.

Simultaneous trimodal single-cell measurement of transcripts, epitopes, and chromatin accessibility using TEA-seq

eLife ◽

10.7554/elife.63632 ◽

2021 ◽

Vol 10 ◽

Author(s):

Elliott Swanson ◽

Cara Lord ◽

Julian Reading ◽

Alexander T Heubeck ◽

Palak C Genge ◽

...

Keyword(s):

Gene Regulation ◽

Single Cell ◽

Human Peripheral Blood ◽

Single Cells ◽

Cell Types ◽

Chromatin Accessibility ◽

Specific Gene ◽

Test Case ◽

Cell Assays ◽

Paired Measurement

Single-cell measurements of cellular characteristics have been instrumental in understanding the heterogeneous pathways that drive differentiation, cellular responses to signals, and human disease. Recent advances have allowed paired capture of protein abundance and transcriptomic state, but a lack of epigenetic information in these assays has left a missing link to gene regulation. Using the heterogeneous mixture of cells in human peripheral blood as a test case, we developed a novel scATAC-seq workflow that increases signal-to-noise and allows paired measurement of cell surface markers and chromatin accessibility: integrated cellular indexing of chromatin landscape and epitopes, called ICICLE-seq. We extended this approach using a droplet-based multiomics platform to develop a trimodal assay that simultaneously measures transcriptomics (scRNA-seq), epitopes, and chromatin accessibility (scATAC-seq) from thousands of single cells, which we term TEA-seq. Together, these multimodal single-cell assays provide a novel toolkit to identify type-specific gene regulation and expression grounded in phenotypically defined cell types.

Single-cell RNA cap and tail sequencing (scRCAT-seq) reveals subtype-specific isoforms differing in transcript demarcation

Nature Communications ◽

10.1038/s41467-020-18976-7 ◽

2020 ◽

Vol 11 (1) ◽

Author(s):

Youjin Hu ◽

Jiawei Zhong ◽

Yuhua Xiao ◽

Zheng Xing ◽

Katherine Sheu ◽

...

Keyword(s):

Single Cell ◽

Single Cells ◽

Single Gene ◽

Cell Types ◽

Machine Learning Algorithms ◽

Translation Efficiency ◽

Transcription Start Sites ◽

Long Read ◽

Mrna Gene ◽

Gene Isoforms

Abstract The differences in transcription start sites (TSS) and transcription end sites (TES) among gene isoforms can affect the stability, localization, and translation efficiency of mRNA. Gene isoforms allow a single gene diverse functions across different cell types, and isoform dynamics allow different functions over time. However, methods to efficiently identify and quantify RNA isoforms genome-wide in single cells are still lacking. Here, we introduce single cell RNA Cap And Tail sequencing (scRCAT-seq), a method to demarcate the boundaries of isoforms based on short-read sequencing, with higher efficiency and lower cost than existing long-read sequencing methods. In conjunction with machine learning algorithms, scRCAT-seq demarcates RNA transcripts with unprecedented accuracy. We identified hundreds of previously uncharacterized transcripts and thousands of alternative transcripts for known genes, revealed cell-type specific isoforms for various cell types across different species, and generated a cell atlas of isoform dynamics during the development of retinal cones.

SAT-298 Integrative Single-Cell Transcriptomic and Epigenomic Landscape of Mouse Anterior Pituitary Cell Types

Journal of the Endocrine Society ◽

10.1210/jendso/bvaa046.593 ◽

2020 ◽

Vol 4 (Supplement_1) ◽

Author(s):

Frederique Murielle Ruf-Zamojski ◽

Michel A Zamojski ◽

German Nudelman ◽

Yongchao Ge ◽

Natalia Mendelev ◽

...

Keyword(s):

Single Cell ◽

Cell Line ◽

Anterior Pituitary ◽

Cell Types ◽

Chromatin Accessibility ◽

Pituitary Cell ◽

Integrated Analysis ◽

Pituitary Cells ◽

Rna Seq ◽

Cell Type

Abstract The pituitary gland is a critical regulator of the neuroendocrine system. To further our understanding of the classification, cellular heterogeneity, and regulatory landscape of pituitary cell types, we performed and computationally integrated single cell (SC)/single nucleus (SN) resolution experiments capturing RNA expression, chromatin accessibility, and DNA methylation state from mouse dissociated whole pituitaries. Both SC and SN transcriptome analysis and promoter accessibility identified the five classical hormone-producing cell types (somatotropes, gonadotropes (GT), lactotropes, thyrotropes, and corticotropes). GT cells distinctively expressed transcripts for Cga, Fshb, Lhb, Nr5a1, and Gnrhr in SC RNA-seq and SN RNA-seq. This was matched in SN ATAC-seq with GTs specifically showing open chromatin at the promoter regions for the same genes. Similarly, the other classically defined anterior pituitary cells displayed transcript expression and chromatin accessibility patterns characteristic of their own cell type. This integrated analysis identified additional cell types, such as a stem cell cluster expressing transcripts for Sox2, Sox9, Mia, and Rbpms, and a broadly accessible chromatin state. In addition, we performed bulk ATAC-seq in the LβT2b gonadotrope-like cell line. While the FSHB promoter region was closed in the cell line, we identified a region upstream of Fshb that became accessible by the synergistic actions of GnRH and activin A, and that corresponded to a conserved region identified by a polycystic ovary syndrome (PCOS) single nucleotide polymorphism (SNP). Although this locus appears closed in deep sequencing bulk ATAC-seq of dissociated mouse pituitary cells, SN ATAC-seq of the same preparation showed that this site was specifically open in mouse GT, but closed in 14 other pituitary cell type clusters. This discrepancy highlighted the detection limit of a bulk ATAC-seq experiment in a subpopulation, as GT represented ~5% of this dissociated anterior pituitary sample. These results identified this locus as a candidate for explaining the dual dependence of Fshb expression on GnRH and activin/TGFβ signaling, and potential new evidence for upstream regulation of Fshb. The pituitary epigenetic landscape provides a resource for improved cell type identification and for the investigation of the regulatory mechanisms driving cell-to-cell heterogeneity. Additional authors not listed due to abstract submission restrictions: N. Seenarine, M. Amper, N. Jain (ISMMS).

Computational approaches towards reducing contamination in single-cell RNA-seq data

10.1101/2020.07.15.205062 ◽

2020 ◽

Author(s):

Siamak Yousefi ◽

Hao Chen ◽

Jesse F. Ingels ◽

Melinda S. McCarty ◽

Arthur G. Centeno ◽

...

Keyword(s):

Single Cell ◽

Single Cells ◽

Real Life ◽

Cell Types ◽

Cell Capture ◽

Rna Seq ◽

Sequence Analyses ◽

Cell Functions ◽

Biological Interpretation ◽

Different Cell Types

SUMMARY Single cell RNA sequencing has enabled quantification of single cells and identification of different cell types and subtypes as well as cell functions in different tissues. Single cell RNA sequence analyses assume that acquired RNAs correspond to cells; however, RNAs from contamination within the input data are also captured by these assays. The sequencing of background contamination, as well as unwanted cells making their way into the final assay, can potentially confound the correct biological interpretation of single cell transcriptomic data. Here we demonstrate two approaches to deal with background contamination as well as profiling of unwanted cells in the assays. We use three real-life datasets of whole-cell capture and nucleotide single-cell captures generated by Fluidigm and 10x technologies and show that these methods reduce the effect of contamination, strengthen clustering of cells, and improve biological interpretation.

Single cell profiling of total RNA using Smart-seq-total

10.1101/2020.06.02.131060 ◽

2020 ◽

Author(s):

Alina Isakova ◽

Norma Neff ◽

Stephen R. Quake

Keyword(s):

Single Cell ◽

Single Cells ◽

Embryonic Stem ◽

Cell Types ◽

Embryoid Bodies ◽

High Yield ◽

Rna Seq ◽

Mcf7 Cells ◽

Total Rna ◽

Non Coding Rna

ABSTRACT The ability to interrogate total RNA content of single cells would enable better mapping of the transcriptional logic behind emerging cell types and states. However, current RNA-seq methods are unable to simultaneously monitor both short and long, poly(A)+ and poly(A)- transcripts at the single-cell level, and thus deliver only a partial snapshot of the cellular RNAome. Here, we describe Smart-seq-total, a method capable of assaying a broad spectrum of coding and non-coding RNA from a single cell. Built upon the template-switch mechanism, Smart-seq-total bears the key feature of its predecessor, Smart-seq2, namely, the ability to capture full-length transcripts with high yield and quality. It also outperforms current poly(A)-independent total RNA-seq protocols by capturing transcripts of a broad size range, thus allowing us to simultaneously analyze protein-coding, long non-coding, microRNA and other non-coding RNA transcripts from single cells. We used Smart-seq-total to analyze the total RNAome of human primary fibroblasts, HEK293T and MCF7 cells as well as that of induced murine embryonic stem cells differentiated into embryoid bodies. We show that simultaneous measurement of non-coding RNA and mRNA from the same cell enables elucidation of new roles of non-coding RNA throughout essential processes such as cell cycle or lineage commitment. Moreover, we show that cell types can be distinguished based on the abundance of non-coding transcripts alone.

Single-cell quantification of a broad RNA spectrum reveals unique noncoding patterns associated with cell types and states

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2113568118 ◽

2021 ◽

Vol 118 (51) ◽

pp. e2113568118

Author(s):

Alina Isakova ◽

Norma Neff ◽

Stephen R. Quake

Keyword(s):

Single Cell ◽

Noncoding Rna ◽

Single Cells ◽

Embryonic Stem ◽

Cell Types ◽

Embryoid Bodies ◽

Rna Seq ◽

Total Rna ◽

Rna Transcripts ◽

Rna Content

The ability to interrogate total RNA content of single cells would enable better mapping of the transcriptional logic behind emerging cell types and states. However, current single-cell RNA-sequencing (RNA-seq) methods are unable to simultaneously monitor all forms of RNA transcripts at the single-cell level, and thus deliver only a partial snapshot of the cellular RNAome. Here we describe Smart-seq-total, a method capable of assaying a broad spectrum of coding and noncoding RNA from a single cell. Smart-seq-total does not require splitting the RNA content of a cell and allows the incorporation of unique molecular identifiers into short and long RNA molecules for absolute quantification. It outperforms current poly(A)-independent total RNA-seq protocols by capturing transcripts of a broad size range, thus enabling simultaneous analysis of protein-coding, long-noncoding, microRNA, and other noncoding RNA transcripts from single cells. We used Smart-seq-total to analyze the total RNAome of human primary fibroblasts, HEK293T, and MCF7 cells, as well as that of induced murine embryonic stem cells differentiated into embryoid bodies. By analyzing the coexpression patterns of both noncoding RNA and mRNA from the same cell, we were able to discover new roles of noncoding RNA throughout essential processes, such as cell cycle and lineage commitment during embryonic development. Moreover, we show that independent classes of short-noncoding RNA can be used to determine cell-type identity.

TEA-seq: a trimodal assay for integrated single cell measurement of transcription, epitopes, and chromatin accessibility

10.1101/2020.09.04.283887 ◽

2020 ◽

Cited By ~ 3

Author(s):

Elliott Swanson ◽

Cara Lord ◽

Julian Reading ◽

Alexander T. Heubeck ◽

Adam K. Savage ◽

...

Keyword(s):

Cell Surface ◽

Single Cell ◽

Human Peripheral Blood ◽

Signal To Noise Ratio ◽

Single Cells ◽

Cell Types ◽

Chromatin Accessibility ◽

Specific Gene ◽

Test Case ◽

Cell Assays

Abstract Single-cell measurements of cellular characteristics have been instrumental in understanding the heterogeneous pathways that drive differentiation, cellular responses to extracellular signals, and human disease states. scATAC-seq has been particularly challenging due to the large size of the human genome and processing artefacts resulting from DNA damage that are an inherent source of background signal. Downstream analysis and integration of scATAC-seq with other single-cell assays is complicated by the lack of clear phenotypic information linking chromatin state and cell type. Using the heterogeneous mixture of cells in human peripheral blood as a test case, we developed a novel scATAC-seq workflow that increases the signal-to-noise ratio and allows simultaneous measurement of cell surface markers: Integrated Cellular Indexing of Chromatin Landscape and Epitopes (ICICLE-seq). We extended this approach using a droplet-based multiomics platform to develop a trimodal assay to simultaneously measure Transcriptomic state (scRNA-seq), cell surface Epitopes, and chromatin Accessibility (scATAC-seq) from thousands of single cells, which we term TEA-seq. Together, these multimodal single-cell assays provide a novel toolkit to identify type-specific gene regulation and expression grounded in phenotypically defined cell types.

Genome annotation with long RNA reads reveals new patterns of gene expression and improves single-cell analyses in an ant brain

BMC Biology ◽

10.1186/s12915-021-01188-w ◽

2021 ◽

Vol 19 (1) ◽

Author(s):

Emily J. Shields ◽

Masato Sorida ◽

Lihong Sheng ◽

Bogdan Sieriebriennikov ◽

Long Ding ◽

...

Keyword(s):

Single Cell ◽

Cell Types ◽

Untranslated Regions ◽

Specific Cell ◽

Rna Seq ◽

Splice Isoforms ◽

Long Read ◽

Genome Annotations ◽

Genome Assemblies ◽

Traditional Approaches

Abstract Background: Functional genomic analyses rely on high-quality genome assemblies and annotations. Highly contiguous genome assemblies have become available for a variety of species, but accurate and complete annotation of gene models, inclusive of alternative splice isoforms and transcription start and termination sites, remains difficult with traditional approaches. Results: Here, we utilized full-length isoform sequencing (Iso-Seq), a long-read RNA sequencing technology, to obtain a comprehensive annotation of the transcriptome of the ant Harpegnathos saltator. The improved genome annotations include additional splice isoforms and extended 3′ untranslated regions for more than 4000 genes. Reanalysis of RNA-seq experiments using these annotations revealed several genes with caste-specific differential expression and tissue- or caste-specific splicing patterns that were missed in previous analyses. The extended 3′ untranslated regions afforded great improvements in the analysis of existing single-cell RNA-seq data, resulting in the recovery of the transcriptomes of 18% more cells. The deeper single-cell transcriptomes obtained with these new annotations allowed us to identify additional markers for several cell types in the ant brain, as well as genes differentially expressed across castes in specific cell types. Conclusions: Our results demonstrate that Iso-Seq is an efficient and effective approach to improve genome annotations and maximize the amount of information that can be obtained from existing and future genomic datasets in Harpegnathos and other organisms.

scTPA: A web tool for single-cell transcriptome analysis of pathway activation signatures

10.1101/2020.01.15.907592 ◽

2020 ◽

Cited By ~ 1

Author(s):

Yan Zhang ◽

Yaru Zhang ◽

Jun Hu ◽

Ji Zhang ◽

Fangjie Guo ◽

...

Keyword(s):

Data Analysis ◽

Single Cell ◽

Single Cells ◽

Cell Types ◽

Biological Pathways ◽

Rna Seq ◽

Web Tool ◽

Functional Interpretation ◽

Cell Functions ◽

Pathway Activation

ABSTRACT The most fundamental challenge in current single-cell RNA-seq data analysis is functional interpretation and annotation of cell clusters. The biological pathways in distinct cell types have different activation patterns, which facilitates understanding cell functions in single-cell transcriptomics. However, no effective web tool has been implemented for single-cell transcriptomic data analysis based on prior biological pathway knowledge. Here, we introduce scTPA (http://sctpa.bio-data.cn/sctpa), which is a web-based platform providing pathway-based analysis of single-cell RNA-seq data in human and mouse. scTPA incorporates four widely-used gene set enrichment methods to estimate the pathway activation scores of single cells based on a collection of available biological pathways with different functional and taxonomic classifications. Clustering analysis and cell-type-specific activation pathway identification are provided for the functional interpretation of cell types from a pathway-oriented perspective. An intuitive interface allows users to conveniently visualize and download single-cell pathway signatures. Together, scTPA is a comprehensive tool to identify pathway activation signatures for dissecting single cell heterogeneity.
