Comprehensive characterization of single cell full-length isoforms in human and mouse with long-read sequencing

Abstract: Alternative splicing shapes the phenotype of cells in development and disease. Long-read RNA-sequencing recovers full-length transcripts but has limited throughput at the single-cell level. Here we developed single-cell full-length transcript sequencing by sampling (FLT-seq), together with the computational pipeline FLAMES, to overcome these issues and perform isoform discovery and quantification, splicing analysis and mutation detection in single cells. With FLT-seq and FLAMES, we performed the first comprehensive characterization of the full-length isoform landscape in single cells of different types and species and identified thousands of unannotated isoforms. We found conserved functional modules that were enriched for alternative transcript usage in different cell populations, including ribosome biogenesis and mRNA splicing. Analysis at the transcript level allowed data integration with scATAC-seq on individual promoters, improved correlation with protein expression data and linked mutations known to confer drug resistance to transcriptome heterogeneity. Our methods reveal previously unseen isoform complexity and provide a better framework for multi-omics data integration.


Comprehensive characterization of single-cell full-length isoforms in human and mouse with long-read sequencing

Genome Biology ◽

10.1186/s13059-021-02525-6 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Luyi Tian ◽

Jafar S. Jabbari ◽

Rachel Thijssen ◽

Quentin Gouil ◽

Shanika L. Amarasinghe ◽

...

Keyword(s):

Ribosome Biogenesis ◽

Single Cells ◽

Transcript Level ◽

Cell Types ◽

Alternative Transcript ◽

Long Read ◽

Different Cell Types ◽

Human And Mouse ◽

Comprehensive Characterization

Abstract: A modified Chromium 10x droplet-based protocol that subsamples cells for both short-read and long-read (nanopore) sequencing, together with a new computational pipeline (FLAMES), is developed to enable isoform discovery, splicing analysis, and mutation detection in single cells. We identify thousands of unannotated isoforms and find conserved functional modules that are enriched for alternative transcript usage in different cell types and species, including ribosome biogenesis and mRNA splicing. Analysis at the transcript level allows data integration with scATAC-seq on individual promoters, improves correlation with protein expression data, and links mutations known to confer drug resistance to transcriptome heterogeneity.


Single-cell isoform RNA sequencing (ScISOr-Seq) across thousands of cells reveals isoforms of cerebellar cell types

10.1101/364950 ◽

2018 ◽

Cited By ~ 1

Author(s):

Ishaan Gupta ◽

Paul G Collier ◽

Bettina Haase ◽

Ahmed Mahfouz ◽

Anoushka Joglekar ◽

...

Keyword(s):

Single Cell ◽

Single Cells ◽

Cell Types ◽

Full Length ◽

Cell Of Origin ◽

Cell Type ◽

Long Reads ◽

Long Read ◽

The Individual ◽

Bulk Tissue

Abstract: Full-length isoform sequencing has advanced our knowledge of isoform biology [1–11]. However, apart from applying full-length isoform sequencing to very few single cells [12,13], isoform sequencing has been limited to bulk tissue, cell lines, or sorted cells. Single splicing events have been described for ≤200 single cells with great statistical success [14,15], but these methods do not describe full-length mRNAs. Single-cell short-read 3’ sequencing has allowed identification of many cell sub-types [16–23], but full-length isoforms for these cell types have not been profiled. Using our new method of single-cell-isoform-RNA-sequencing (ScISOr-Seq) we determine isoform expression in thousands of individual cells from a heterogeneous bulk tissue (cerebellum), without specific antibody-fluorescence activated cell sorting. We elucidate isoform usage in high-level cell types such as neurons, astrocytes and microglia and finer sub-types, such as Purkinje cells and Granule cells, including the combination patterns of distant splice sites [6–9,24,25], which for individual molecules requires long reads. We produce an enhanced genome annotation revealing cell-type specific expression of known and 16,872 novel (with respect to mouse Gencode version 10) isoforms (see isoformatlas.com).

ScISOr-Seq describes isoforms from >1,000 single cells from bulk tissue without cell sorting by leveraging two technologies in three steps. In step one, we employ microfluidics to produce amplified full-length cDNAs barcoded for their cell of origin. This cDNA is split into two pools: one pool for 3’ sequencing to measure gene expression (step two) and another pool for long-read sequencing and isoform expression (step three). In step two, short-read 3’ sequencing provides molecular counts for each gene and cell, which allows clustering cells and assigning a cell type using cell-type specific markers. In step three, an aliquot of the same cDNAs (each barcoded for the individual cell of origin) is sequenced using Pacific Biosciences (“PacBio”) [1,2,4,5,26] or Oxford Nanopore [3]. Since these long reads carry the single-cell barcodes identified in step two, one can determine the individual cell from which each long read originates. Since most single cells are assigned to a named cluster, we can also assign the cell’s cluster name (e.g. “Purkinje cell” or “astrocyte”) to the long read in question (Fig 1A), without losing the cell of origin of each long read.
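The barcode lookup described in step three can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `assign_reads_to_clusters`, the `(read_id, cell_barcode, isoform_id)` tuple layout, and the toy barcodes and isoform names are all hypothetical, and the sketch assumes barcodes have already been extracted from the long reads and match exactly.

```python
from collections import Counter, defaultdict


def assign_reads_to_clusters(long_reads, barcode_to_cluster):
    """Attach a cell barcode and cluster label to each long read.

    long_reads: iterable of (read_id, cell_barcode, isoform_id) tuples
        (hypothetical layout; barcode extraction is assumed done upstream).
    barcode_to_cluster: dict mapping a cell barcode to the cluster name
        obtained from the short-read clustering.
    Reads whose barcode was not seen in the short-read data are skipped.
    """
    assignments = {}
    isoform_counts = defaultdict(Counter)
    for read_id, barcode, isoform in long_reads:
        cluster = barcode_to_cluster.get(barcode)
        if cluster is None:
            continue  # barcode not assigned to any cluster
        # Keep both the cell of origin and its cluster label.
        assignments[read_id] = (barcode, cluster)
        isoform_counts[cluster][isoform] += 1
    return assignments, dict(isoform_counts)


# Toy input: two clustered cells and one read with an unknown barcode.
barcode_to_cluster = {"AAAC": "Purkinje cell", "GGTT": "astrocyte"}
long_reads = [
    ("r1", "AAAC", "isoA"),
    ("r2", "GGTT", "isoB"),
    ("r3", "AAAC", "isoA"),
    ("r4", "TTTT", "isoC"),
]
assignments, isoform_counts = assign_reads_to_clusters(long_reads, barcode_to_cluster)
```

Because each read keeps its barcode alongside the cluster label, per-cell and per-cluster isoform counts can both be derived from the same table.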


scCAT-seq: single-cell identification and quantification of mRNA isoforms by cost-effective short-read sequencing of cap and tail

10.1101/2019.12.11.873505 ◽

2019 ◽

Author(s):

Youjin Hu ◽

Jiawei Zhong ◽

Yuhua Xiao ◽

Zheng Xing ◽

Katherine Sheu ◽

...

Keyword(s):

Single Cell ◽

Learning Algorithm ◽

Single Cells ◽

Full Length ◽

Translation Efficiency ◽

Mrna Isoforms ◽

Short Read ◽

Short Read Sequencing ◽

Long Read ◽

Identification And Quantification

Abstract: The differences in transcription start sites (TSS) and transcription end sites (TES) among gene isoforms can affect the stability, localization, and translation efficiency of mRNA. Isoforms also allow a single gene to perform different functions across various tissues and cells. However, methods for efficient genome-wide identification and quantification of RNA isoforms in single cells are still lacking. Here, we introduce single cell Cap And Tail sequencing (scCAT-seq). In conjunction with a novel machine learning algorithm developed for TSS/TES characterization, scCAT-seq can demarcate the boundaries of RNA transcripts, providing an unprecedented way to identify and quantify single-cell full-length RNA isoforms based on short-read sequencing. Compared with existing long-read sequencing methods, scCAT-seq has higher efficiency with lower cost. Using scCAT-seq, we identified hundreds of previously uncharacterized full-length transcripts and thousands of alternative transcripts for known genes, quantitatively revealed cell-type specific isoforms with alternative TSSs/TESs in dorsal root ganglion (DRG) neurons, mature oocytes and ageing oocytes, and generated the first atlas of the non-human primate cornea. The approach described here can be widely adapted to other short-read or long-read methods to improve accuracy and efficiency in assessing RNA isoform dynamics among single cells.


FlsnRNA-seq: protoplasting-free full-length single-nucleus RNA profiling in plants

Genome Biology ◽

10.1186/s13059-021-02288-0 ◽

2021 ◽

Vol 22 (1) ◽

Cited By ~ 2

Author(s):

Yanping Long ◽

Zhijian Liu ◽

Jinbu Jia ◽

Weipeng Mo ◽

Liang Fang ◽

...

Keyword(s):

Single Cell ◽

Cell Walls ◽

Large Scale ◽

Full Length ◽

Cell Level ◽

Root Cells ◽

Rna Profiling ◽

Different Types ◽

Long Read ◽

Single Nucleus

Abstract: The broad application of single-cell RNA profiling in plants has been hindered by the prerequisite of protoplasting, which requires digesting the cell walls from different types of plant tissues. Here, we present a protoplasting-free approach, flsnRNA-seq, for large-scale full-length RNA profiling at a single-nucleus level in plants using isolated nuclei. Combined with 10x Genomics and Nanopore long-read sequencing, we validate the robustness of this approach in Arabidopsis root cells and the developing endosperm. Sequencing results demonstrate that it allows for uncovering alternative splicing and polyadenylation-related RNA isoform information at the single-cell level, which facilitates characterizing cell identities.


Nanopore sequencing of single-cell transcriptomes with scCOLOR-seq

Nature Biotechnology ◽

10.1038/s41587-021-00965-w ◽

2021 ◽

Author(s):

Martin Philpott ◽

Jonathan Watson ◽

Anjan Thakurta ◽

Tom Brown ◽

...

Keyword(s):

Single Cell ◽

Error Detection ◽

Single Cells ◽

Fusion Transcript ◽

Building Blocks ◽

Myeloma Cell ◽

Nanopore Sequencing ◽

Long Read ◽

Unique Molecular Identifier ◽

Transcript Detection

Abstract: Here we describe single-cell corrected long-read sequencing (scCOLOR-seq), which enables error correction of barcode and unique molecular identifier oligonucleotide sequences and permits standalone cDNA nanopore sequencing of single cells. Barcodes and unique molecular identifiers are synthesized using dimeric nucleotide building blocks that allow error detection. We illustrate the use of the method for evaluating barcode assignment accuracy, differential isoform usage in myeloma cell lines, and fusion transcript detection in a sarcoma cell line.
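To make the idea of error detection with dimeric building blocks concrete, here is a minimal Python sketch. It rests on an assumption the abstract does not spell out: that each building block is a pair of identical nucleotides (for example `AA` or `GG`), so any block whose two bases disagree, or any sequence of odd length, signals a sequencing error. The function names are hypothetical and this is not the scCOLOR-seq software.

```python
def split_blocks(seq, size=2):
    """Cut a barcode or UMI sequence into consecutive blocks of `size` bases."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def has_detectable_error(seq):
    """Return True if the sequence violates the assumed homodimer pattern.

    Assumption (not stated in the abstract): every block is two identical
    bases. An odd length suggests an insertion or deletion; a block with
    two different bases suggests a substitution.
    """
    if len(seq) % 2 != 0:
        return True
    return any(block[0] != block[1] for block in split_blocks(seq))


def collapse(seq):
    """Collapse an error-free dimer sequence to one base per block."""
    if has_detectable_error(seq):
        raise ValueError("sequence contains a detectable error")
    return "".join(block[0] for block in split_blocks(seq))


clean = "AACCGGTT"        # toy barcode built from homodimer blocks
substituted = "AACCGTTT"  # one base changed inside the third block
truncated = "AACCGGT"     # one base lost
```

Under this assumption, `collapse(clean)` yields the underlying barcode, while the substituted and truncated reads are flagged rather than silently assigned to the wrong cell.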


RA3 is a reference-guided approach for epigenetic characterization of single cells

Nature Communications ◽

10.1038/s41467-021-22495-4 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Shengquan Chen ◽

Guanao Yan ◽

Wenyu Zhang ◽

Jinzhao Li ◽

Rui Jiang ◽

...

Keyword(s):

Single Cell ◽

Computational Analysis ◽

Reference Data ◽

Single Cells ◽

Chromatin Accessibility ◽

Biological Variation ◽

Superior Performance ◽

High Dimensionality ◽

High Degree

Abstract: The recent advancements in single-cell technologies, including single-cell chromatin accessibility sequencing (scCAS), have enabled profiling the epigenetic landscapes for thousands of individual cells. However, the characteristics of scCAS data, including high dimensionality, high degree of sparsity and high technical variation, make the computational analysis challenging. Reference-guided approaches, which utilize the information in existing datasets, may facilitate the analysis of scCAS data. Here, we present RA3 (Reference-guided Approach for the Analysis of single-cell chromatin Accessibility data), which utilizes the information in massive existing bulk chromatin accessibility and annotated scCAS data. RA3 simultaneously models (1) the shared biological variation among scCAS data and the reference data, and (2) the unique biological variation in scCAS data that identifies distinct subpopulations. We show that RA3 achieves superior performance when used on several scCAS datasets, and on references constructed using various approaches. Altogether, these analyses demonstrate the wide applicability of RA3 in analyzing scCAS data.


High Throughput Immunophenotyping and Expression Profiling at Single Cell Level Reveal BCR-ABL1 Dependent Surface Markers of Chronic Myeloid Leukemia Stem Cells

Blood ◽

10.1182/blood-2019-126924 ◽

2019 ◽

Vol 134 (Supplement_1) ◽

pp. 2920-2920

Author(s):

Marianna Romzova ◽

Dagmar Smitalova ◽

Peter Taus ◽

Jiri Mayer ◽

Martin Culen

Keyword(s):

Stem Cells ◽

Single Cell ◽

Expression Profiling ◽

Myeloid Leukemia ◽

Single Cells ◽

Transcript Level ◽

Surface Expression ◽

Surface Markers ◽

Cell Level ◽

Tki Treatment

BACKGROUND: BCR-ABL1 oncogene-targeted treatment with tyrosine kinase inhibitors (TKI) has shown impressive efficacy against proliferating chronic myeloid leukemia (CML) cells. However, rapid relapses in more than half of CML patients after discontinuation of the treatment suggest the presence of quiescent leukemic stem cells inherently resistant to BCR-ABL1 inhibition. Understanding the heterogeneity of the CML stem cell compartment is crucial for preventing treatment failure. The specificity of already established leukemic stem cell (LSC) markers has been tested mainly in bulk CD34+CD38- populations at diagnosis. The phenotypes and molecular signatures of therapy-resistant BCR-ABL1-positive stem cells are, however, yet to be established.

AIMS: To identify BCR-ABL1-dependent LSC markers at the single-cell level by directly comparing their surface and transcript expression with the levels and presence of the BCR-ABL1 transcript at diagnosis and after administration of TKI treatment.

METHODS: A total of 375 cells were obtained from the bone marrow and peripheral blood of 4 chronic-phase CML patients. Cells were collected prior to any treatment and three months after TKI treatment initiation. Normal bone marrow cells and the BCR-ABL1-positive K562 cell line were used as controls. Indexed immunophenotyping and sorting of CD34+CD38- single cells was performed using a panel of 11 specific surface markers. Collected single cells were lysed and cDNA was enriched for 11 targets using 22-cycle pre-amplification. Expression profiling was carried out on the SmartChip real-time PCR system (Takara Bio), detecting the following genes: BCR-ABL1, CD26, CD25, IL-1RAP, CD56, CD90, CD93, CD69, KI67, and the control genes GUS and HPRT. Unsupervised clustering was performed using principal component analysis (PCA). Correlations were measured by the Spearman rank method.

RESULTS: At diagnosis, the majority of BCR-ABL1+ CD34+CD38- stem cells co-express IL-1RAP, CD26, and CD69 on their surface (88%, 82%, 78% overlap). Only 56% of BCR-ABL1+ cells positive for the aforementioned markers co-express CD25, 28% CD93 and 16% CD56. The expression of these markers could also be detected in 4-11% of BCR-ABL1- cells, although this could be a technical inaccuracy caused by the single-cell profiling. The CD90 marker did not show any correlation with BCR-ABL1 expression. At the transcript level, the expression of IL-1RAP, CD26, CD25 and CD56 was observed in 62%, 52%, 45% and 16% of BCR-ABL1+ cells, and in up to 7% of BCR-ABL1- cells. CD69 expression was observed in 90% of BCR-ABL1+ cells at the transcript level, but also in 71% of BCR-ABL1- cells. BCR-ABL1-independent expression was observed for cKIT (60% vs. 76% in positive vs. negative cells). Finally, the proliferation marker KI67 was expressed in only 6% of the BCR-ABL1+ cells. PCA divided cells into several distinct clusters with BCR-ABL1 as the main contributor, and cKIT, CD69, CD26 and IL-1RAP as other significant factors. Interestingly, BCR-ABL1+ cells collected during TKI treatment showed persistent surface expression of IL-1RAP and CD26, while CD56, CD69 and CD93 were present on only part of the BCR-ABL1+ cells. CD25 was significantly deregulated during TKI treatment.

CONCLUSION: At diagnosis, up to 80% of LSC co-express 3 specific surface markers: IL-1RAP, CD26 and CD69. A variable portion of LSC co-express additional markers such as CD25, CD56 and CD93. During TKI treatment the surface expression of the majority of markers is decreased; the best correlated LSC marker is IL-1RAP, followed by CD26 and CD69. The CD56 marker seems to persist in the same proportion of cells, while CD25 disappears. cKIT is highly expressed in normal BM and HSC from CML patients, but also in some LSC. CD34+CD38- cells show a non-proliferating phenotype.

Disclosures: Mayer: AOP Orphan Pharmaceuticals AG: Research Funding.
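The abstract states that correlations were measured by the Spearman rank method. As a reminder of what that computes, the following is a minimal pure-Python sketch: Spearman's coefficient is the Pearson correlation of the rank-transformed values, with tied values sharing their average rank. The function names and the toy numbers are hypothetical and are not data from the study; in practice an established statistics library would normally be used.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values only (not from the study): a marker that rises monotonically
# with a transcript level, and one that falls as it rises.
transcript = [1.0, 2.0, 3.0, 4.0]
marker_up = [10.0, 20.0, 30.0, 40.0]
marker_down = [4.0, 3.0, 2.0, 1.0]
rho_up = spearman(transcript, marker_up)
rho_down = spearman(transcript, marker_down)
```

Because only ranks enter the calculation, any monotonic relationship gives a coefficient of magnitude 1 regardless of scale, which suits expression measurements that are not linearly related.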


Characterization of CRISPR/Cas9 RANKL knockout mesenchymal stem cell clones based on single-cell printing technology and emulsion coupling assay as a low-cellularity workflow for single-cell cloning

10.1101/2020.08.17.253559 ◽

2020 ◽

Author(s):

Tobias Groß ◽

Csaba Jeney ◽

Darius Halm ◽

Günter Finkenzeller ◽

G. Björn Stark ◽

...

Keyword(s):

Single Cell ◽

Cell Line ◽

Large Scale ◽

Single Cells ◽

Protein Detection ◽

Cell Line Development ◽

Cell Printing ◽

Cell Clones ◽

The University

Abstract: The homogeneity of genetically modified single cells is a necessity for many applications such as cell line development, gene therapy, and tissue engineering, and in particular for regenerative medical applications. The lack of tools to effectively isolate and characterize CRISPR/Cas9 engineered cells is considered a significant bottleneck in these applications. In particular, the inability of protein detection technologies to confirm protein expression changes without a prior large-scale clonal expansion creates a gridlock in many applications. To improve the characterization of engineered cells, we propose an improved workflow, including single-cell printing/isolation technology based on fluorescent properties with high yield, a genomic edit screen (surveyor assay), mRNA rtPCR assessing altered gene expression and a versatile protein detection tool called emulsion-coupling to deliver a high-content, unified single-cell workflow. The workflow was exemplified by engineering and functionally validating RANKL knockout immortalized mesenchymal stem cells showing altered bone formation capacity of these cells. The resulting workflow is economical, without the requirement of large-scale clonal expansions of the cells, with overall cloning efficiency above 30% of CRISPR/Cas9 edited cells. Moreover, as the single-cell clones are comprehensively characterized at an early, highly parallel phase of the development of cells, including DNA, RNA, and protein levels, the workflow delivers a higher number of successfully edited cells for further characterization, lowering the chance of late failures in the development process.

Author summary: I completed my undergraduate degree in biochemistry at the University of Ulm and finished my master's degree in pharmaceutical biotechnology at the University of Ulm and University of Applied Science of Biberach with a focus on biotechnology, toxicology and molecular biology. For my master's thesis, I went to the University of Freiburg to the department of microsystems engineering, where I developed a novel workflow for cell line development. I stayed at the institute for my doctorate, but changed my scientific focus to the development of the emulsion coupling technology, which is a powerful tool for the quantitative and highly parallel measurement of protein and protein interactions. I am generally interested in being involved in the development of innovative molecular biological methods that can be used to gain new insights about biological issues. I am particularly curious to unravel the complex and often poorly understood protein interaction pathways that are the cornerstone of understanding cellular functionality and are a fundamental necessity to describe life mechanistically.


Single-cell RNA cap and tail sequencing (scRCAT-seq) reveals subtype-specific isoforms differing in transcript demarcation

Nature Communications ◽

10.1038/s41467-020-18976-7 ◽

2020 ◽

Vol 11 (1) ◽

Author(s):

Youjin Hu ◽

Jiawei Zhong ◽

Yuhua Xiao ◽

Zheng Xing ◽

Katherine Sheu ◽

...

Keyword(s):

Single Cell ◽

Single Cells ◽

Single Gene ◽

Cell Types ◽

Machine Learning Algorithms ◽

Translation Efficiency ◽

Transcription Start Sites ◽

Long Read ◽

Mrna Gene ◽

Gene Isoforms

Abstract: The differences in transcription start sites (TSS) and transcription end sites (TES) among gene isoforms can affect the stability, localization, and translation efficiency of mRNA. Gene isoforms allow a single gene to serve diverse functions across different cell types, and isoform dynamics allow different functions over time. However, methods to efficiently identify and quantify RNA isoforms genome-wide in single cells are still lacking. Here, we introduce single cell RNA Cap And Tail sequencing (scRCAT-seq), a method to demarcate the boundaries of isoforms based on short-read sequencing, with higher efficiency and lower cost than existing long-read sequencing methods. In conjunction with machine learning algorithms, scRCAT-seq demarcates RNA transcripts with unprecedented accuracy. We identified hundreds of previously uncharacterized transcripts and thousands of alternative transcripts for known genes, revealed cell-type specific isoforms for various cell types across different species, and generated a cell atlas of isoform dynamics during the development of retinal cones.
