Benchmarking Single-Cell mRNA–Sequencing Technologies Uncovers Differences in Sensitivity and Reproducibility in Cell Types With Low RNA Content

Abstract: A fundamental task in single-cell RNA-seq (scRNA-seq) analysis is the identification of transcriptionally distinct groups of cells. Numerous methods have been proposed for this problem, with a recent focus on methods for the cluster analysis of ultra-large scRNA-seq data sets produced by droplet-based sequencing technologies. Most existing methods rely on a sampling step to bridge the gap between algorithm scalability and volume of the data. Ignoring large parts of the data, however, often yields inaccurate groupings of cells and risks overlooking rare cell types. We propose Specter, a method that adopts and extends recent algorithmic advances in (fast) spectral clustering. In contrast to methods that cluster a (random) subsample of the data, we adopt the idea of landmarks that are used to create a sparse representation of the full data from which a spectral embedding can then be computed in linear time. We exploit Specter’s speed in a cluster ensemble scheme that achieves a substantial improvement in accuracy over existing methods and that is sensitive to rare cell types. Its linear time complexity allows Specter to scale to millions of cells and leads to fast computation times in practice. Furthermore, on CITE-seq data, which simultaneously measure gene and protein marker expression, we demonstrate that Specter is able to utilize multimodal omics measurements to resolve subtle transcriptomic differences between subpopulations of cells. Specter is open source and available at https://github.com/canzarlab/Specter.

Splatter: simulation of single-cell RNA sequencing data

10.1101/133173 ◽ 2017 ◽ Cited By ~ 8

Author(s): Luke Zappia ◽ Belinda Phipson ◽ Alicia Oshlack

Keyword(s): Single Cell ◽ RNA Sequencing ◽ Real Data ◽ Cell Types ◽ RNA Seq ◽ Sequencing Data ◽ Sequencing Technologies ◽ Simulation Based ◽ Single Cell RNA Sequencing ◽ Multiple Cell

Abstract: As single-cell RNA sequencing technologies have rapidly developed, so have analysis methods. Many methods have been tested, developed and validated using simulated datasets. Unfortunately, current simulations are often poorly documented, their similarity to real data is not demonstrated, or reproducible code is not available. Here we present the Splatter Bioconductor package for simple, reproducible and well-documented simulation of single-cell RNA-seq data. Splatter provides an interface to multiple simulation methods including Splat, our own simulation, based on a gamma-Poisson distribution. Splat can simulate single populations of cells, populations with multiple cell types or differentiation paths.
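To make the gamma-Poisson idea in this abstract concrete, here is a minimal Python sketch. It is not the Splat model itself, which is an R/Bioconductor implementation with many more components. It only shows the core hierarchy, under the assumption that each gene's mean expression is drawn from a gamma distribution and each cell's count for that gene is drawn from a Poisson with that mean. The function names, the `shape` and `scale` defaults, and the hand-rolled Poisson sampler are all hypothetical choices made for illustration.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method.

    Suitable for the small means used in this sketch; it becomes slow
    for large lam.
    """
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_counts(n_genes, n_cells, shape=0.6, scale=3.0, seed=0):
    """Return a genes x cells count matrix from a gamma-Poisson hierarchy.

    shape and scale are illustrative values, not parameters estimated
    from real data.
    """
    rng = random.Random(seed)
    # One mean per gene, drawn from Gamma(shape, scale).
    gene_means = [rng.gammavariate(shape, scale) for _ in range(n_genes)]
    # Each cell's count for a gene is Poisson with that gene's mean.
    return [
        [poisson_sample(rng, mean) for _ in range(n_cells)]
        for mean in gene_means
    ]


if __name__ == "__main__":
    for row in simulate_counts(n_genes=5, n_cells=8, seed=42):
        print(row)
```

Fixing the seed makes a run reproducible, which is the property the abstract emphasises for simulated benchmarks.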

Single-cell quantification of a broad RNA spectrum reveals unique noncoding patterns associated with cell types and states

Proceedings of the National Academy of Sciences ◽ 10.1073/pnas.2113568118 ◽ 2021 ◽ Vol 118 (51) ◽ pp. e2113568118

Author(s): Alina Isakova ◽ Norma Neff ◽ Stephen R. Quake

Keyword(s): Single Cell ◽ Noncoding RNA ◽ Single Cells ◽ Embryonic Stem ◽ Cell Types ◽ Embryoid Bodies ◽ RNA Seq ◽ Total RNA ◽ RNA Transcripts ◽ RNA Content

The ability to interrogate total RNA content of single cells would enable better mapping of the transcriptional logic behind emerging cell types and states. However, current single-cell RNA-sequencing (RNA-seq) methods are unable to simultaneously monitor all forms of RNA transcripts at the single-cell level, and thus deliver only a partial snapshot of the cellular RNAome. Here we describe Smart-seq-total, a method capable of assaying a broad spectrum of coding and noncoding RNA from a single cell. Smart-seq-total does not require splitting the RNA content of a cell and allows the incorporation of unique molecular identifiers into short and long RNA molecules for absolute quantification. It outperforms current poly(A)-independent total RNA-seq protocols by capturing transcripts of a broad size range, thus enabling simultaneous analysis of protein-coding, long-noncoding, microRNA, and other noncoding RNA transcripts from single cells. We used Smart-seq-total to analyze the total RNAome of human primary fibroblasts, HEK293T, and MCF7 cells, as well as that of induced murine embryonic stem cells differentiated into embryoid bodies. By analyzing the coexpression patterns of both noncoding RNA and mRNA from the same cell, we were able to discover new roles of noncoding RNA throughout essential processes, such as cell cycle and lineage commitment during embryonic development. Moreover, we show that independent classes of short-noncoding RNA can be used to determine cell-type identity.

Predicting cell-to-cell communication networks using NATMI

Nature Communications ◽ 10.1038/s41467-020-18873-z ◽ 2020 ◽ Vol 11 (1)

Author(s): Rui Hou ◽ Elena Denisenko ◽ Huan Ting Ong ◽ Jordan A. Ramilowski ◽ Alistair R. R. Forrest

Keyword(s): Single Cell ◽ Communication Networks ◽ Cell Communication ◽ Cost Effective ◽ Cell Types ◽ Sequencing Technologies ◽ Multiple Cell ◽ Multicellular Interactions ◽ Autocrine Signalling ◽ Different Cell Types

Abstract: Development of high throughput single-cell sequencing technologies has made it cost-effective to profile thousands of cells from diverse samples containing multiple cell types. To study how these different cell types work together, here we develop NATMI (Network Analysis Toolkit for Multicellular Interactions). NATMI uses connectomeDB2020 (a database of 2293 manually curated ligand-receptor pairs with literature support) to predict and visualise cell-to-cell communication networks from single-cell (or bulk) expression data. Using multiple published single-cell datasets we demonstrate how NATMI can be used to identify (i) the cell-type pairs that are communicating the most (or most specifically) within a network, (ii) the most active (or specific) ligand-receptor pairs active within a network, (iii) putative highly-communicating cellular communities and (iv) differences in intercellular communication when profiling given cell types under different conditions. Furthermore, analysis of the Tabula Muris (organism-wide) atlas confirms our previous prediction that autocrine signalling is a major feature of cell-to-cell communication networks, while also revealing that hundreds of ligands and their cognate receptors are co-expressed in individual cells suggesting a substantial potential for self-signalling.
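The kind of network this abstract describes can be sketched in a few lines of Python. This is not NATMI's code or API. It assumes one simple scoring choice: the weight of an edge from a sending to a receiving cell type is the product of the sender's mean ligand expression and the receiver's mean receptor expression. The function name `edge_weights`, the toy expression values, and the single ligand-receptor pair are all hypothetical. A real analysis would draw its pairs from a curated resource such as connectomeDB2020.

```python
def edge_weights(mean_expr, lr_pairs):
    """Score directed cell-type edges for each ligand-receptor pair.

    mean_expr maps cell type -> {gene: mean expression}.
    lr_pairs is a list of (ligand, receptor) gene-name tuples.
    Returns {(sender, receiver, ligand, receptor): weight} for
    non-zero weights only. sender == receiver is an autocrine edge.
    """
    weights = {}
    for ligand, receptor in lr_pairs:
        for sender, sender_expr in mean_expr.items():
            for receiver, receiver_expr in mean_expr.items():
                # Assumed scoring rule: product of the two mean expressions.
                w = sender_expr.get(ligand, 0.0) * receiver_expr.get(receptor, 0.0)
                if w > 0:
                    weights[(sender, receiver, ligand, receptor)] = w
    return weights


# Toy input: the expression values are made up for illustration.
toy_expr = {
    "T cell": {"IL2": 2.0, "IL2RA": 3.0},
    "B cell": {"IL2RA": 1.5},
}
toy_pairs = [("IL2", "IL2RA")]

toy_weights = edge_weights(toy_expr, toy_pairs)
# Rank edges from strongest to weakest.
for edge, weight in sorted(toy_weights.items(), key=lambda kv: -kv[1]):
    print(edge, weight)
```

In the toy input the T cell expresses both the ligand and its receptor, so the sketch also yields a T cell to T cell edge, the autocrine case the abstract highlights.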

The new technologies of high-throughput single-cell RNA sequencing

Vavilov Journal of Genetics and Breeding ◽ 10.18699/vj19.520 ◽ 2019 ◽ Vol 23 (5) ◽ pp. 508-518

Author(s): E. A. Vodiasova ◽ E. S. Chelebieva ◽ O. N. Kuleshova

Keyword(s): Single Cell ◽ RNA Sequencing ◽ High Performance ◽ New Technologies ◽ Bioinformatics Analysis ◽ Single Cells ◽ Rapid Development ◽ Cell Types ◽ Regulatory Pathways ◽ Sequencing Technologies

A wealth of genome and transcriptome data obtained with next-generation sequencing (NGS) technologies for whole organisms has left many questions unanswered in oncology, immunology, physiology, neurobiology, zoology and other fields of science and medicine. Since the cell is the basic unit of life in all unicellular and multicellular organisms, biological processes need to be studied at the cellular level. This understanding gave impetus to a new direction: the creation of technologies for working with individual cells (single-cell technology). The rapid development of both instruments and advanced protocols for working with single cells reflects the relevance of these studies to many fields of science and medicine. Single-cell technologies make it possible to study the stages of ontogenesis, to identify patterns of cell differentiation and subsequent tissue development, to conduct genomic and transcriptomic analyses in various areas of medicine (especially in immunology and oncology), and to identify cell types and states as well as patterns of biochemical and physiological processes, which brings comprehensive research to a new level. The first single-cell RNA-sequencing (scRNA-seq) technologies captured no more than one hundred cells at a time, which proved insufficient given the high cell heterogeneity, the existence of minor cell types (not detectable by morphology) and complex regulatory pathways. Techniques for isolating, capturing and sequencing the transcripts of tens of thousands of cells at a time are now evolving. However, the new technologies differ both at the sample preparation stage and during bioinformatics analysis. In this paper we consider the most effective methods of massively parallel scRNA-seq using the example of 10x Genomics, as well as the specifics of such an experiment, the subsequent bioinformatics analysis of the data, and the future outlook and applications of new high-performance technologies.

ACME dissociation: a versatile cell fixation-dissociation method for single-cell transcriptomics

10.1101/2020.05.26.117234 ◽ 2020

Author(s): Helena García-Castro ◽ Nathan J Kenny ◽ Patricia Álvarez-Campos ◽ Vincent Mason ◽ Anna Schönauer ◽ ...

Keyword(s): Single Cell ◽ Cell Sorting ◽ Cell Types ◽ Cell Dissociation ◽ RNA Integrity ◽ Single Cell Sequencing ◽ Sequencing Technologies ◽ A Cell ◽ Dissociated Cells ◽ Cell Fixation

Abstract: Single-cell sequencing technologies are revolutionizing biology, but are limited by the need to dissociate fresh samples that can only be fixed at later stages. We present ACME (ACetic-MEthanol) dissociation, a cell dissociation approach that fixes cells as they are being dissociated. ACME-dissociated cells have high RNA integrity, can be cryopreserved multiple times, can be sorted by Fluorescence-Activated Cell Sorting (FACS) and are permeable, enabling combinatorial single-cell transcriptomic approaches. As a proof of principle, we have performed SPLiT-seq with ACME cells to obtain around 34K single cell transcriptomes from two planarian species and identified all previously described cell types in similar proportions. ACME is based on affordable reagents, can be done in most laboratories and even in the field, and thus will accelerate our knowledge of cell types across the tree of life.

An integrated transcriptomic and epigenomic atlas of mouse primary motor cortex cell types

10.1101/2020.02.29.970558 ◽ 2020 ◽ Cited By ~ 15

Author(s): Zizhen Yao ◽ Hanqing Liu ◽ Fangming Xie ◽ Stephan Fischer ◽ A. Sina Booeshaghi ◽ ...

Keyword(s): Motor Cortex ◽ Single Cell ◽ Primary Motor Cortex ◽ Neuronal Cell ◽ Cell Types ◽ Regulatory Elements ◽ Brain Cell ◽ Molecular Signatures ◽ Sequencing Technologies ◽ Cell Transcriptome

Abstract: Single cell transcriptomics has transformed the characterization of brain cell identity by providing quantitative molecular signatures for large, unbiased samples of brain cell populations. With the proliferation of taxonomies based on individual datasets, a major challenge is to integrate and validate results toward defining biologically meaningful cell types. We used a battery of single-cell transcriptome and epigenome measurements generated by the BRAIN Initiative Cell Census Network (BICCN) to comprehensively assess the molecular signatures of cell types in the mouse primary motor cortex (MOp). We further developed computational and statistical methods to integrate these multimodal data and quantitatively validate the reproducibility of the cell types. The reference atlas, based on more than 600,000 high quality single-cell or -nucleus samples assayed by six molecular modalities, is a comprehensive molecular account of the diverse neuronal and non-neuronal cell types in MOp. Collectively, our study indicates that the mouse primary motor cortex contains over 55 neuronal cell types that are highly replicable across analysis methods, sequencing technologies, and modalities. We find many concordant multimodal markers for each cell type, as well as thousands of genes and gene regulatory elements with discrepant transcriptomic and epigenomic signatures. These data highlight the complex molecular regulation of brain cell types and will directly enable design of reagents to target specific MOp cell types for functional analysis.

Vertical flow array chips reliably identify cell types from single-cell mRNA sequencing experiments

Scientific Reports ◽ 10.1038/srep36014 ◽ 2016 ◽ Vol 6 (1) ◽ Cited By ~ 3

Author(s): Masataka Shirai ◽ Koji Arikawa ◽ Kiyomi Taniguchi ◽ Maiko Tanabe ◽ Tomoyuki Sakai

Keyword(s): Single Cell ◽ Cell Types ◽ Vertical Flow ◽ mRNA Sequencing

Uncovering hypergraphs of cell-cell interaction from single cell RNA-sequencing data

10.1101/566182 ◽ 2019 ◽ Cited By ~ 10

Author(s): Koki Tsuyuzaki ◽ Manabu Ishii ◽ Itoshi Nikaido

Keyword(s): Single Cell ◽ RNA Sequencing ◽ Cell Types ◽ Cell Interaction ◽ Receptor Expression ◽ Sequencing Data ◽ Sequencing Technologies ◽ Single Cell RNA Sequencing ◽ Ligand Expression ◽ Cell Cell

Abstract: Complex biological systems can be described as a multitude of cell-cell interactions (CCIs). Recent single-cell RNA-sequencing technologies have enabled the detection of CCIs and related ligand-receptor (L-R) gene expression simultaneously. However, previous data analysis methods have focused on only one-to-one CCIs between two cell types. To also detect many-to-many CCIs, we propose scTensor, a novel method for extracting representative triadic relationships (hypergraphs), which include (i) ligand-expression, (ii) receptor-expression, and (iii) L-R pairs. When applied to simulated and empirical datasets, scTensor was able to detect some hypergraphs including paracrine/autocrine CCI patterns, which cannot be detected by previous methods.

Automated quality control and cell identification of droplet-based single-cell data using dropkick

10.1101/2020.10.08.332288 ◽ 2020

Author(s): Cody N. Heiser ◽ Victoria M. Wang ◽ Bob Chen ◽ Jacob J. Hughey ◽ Ken S. Lau

Keyword(s): Quality Control ◽ Single Cell ◽ Single Cell Analysis ◽ Software Tool ◽ Cell Types ◽ Computational Method ◽ High Background ◽ Cell Identification ◽ Sequencing Technologies ◽ Downstream Analysis

Abstract: A major challenge for droplet-based single-cell sequencing technologies is distinguishing true cells from uninformative barcodes in datasets with disparate library sizes confounded by high technical noise (i.e. batch-specific ambient RNA). We present dropkick, a fully automated software tool for quality control and filtering of single-cell RNA sequencing (scRNA-seq) data with a focus on excluding ambient barcodes and recovering real cells bordering the quality threshold. By automatically determining dataset-specific training labels based on predictive global heuristics, dropkick learns a gene-based representation of real cells and ambient noise, calculating a cell probability score for each barcode. Using simulated and real-world scRNA-seq data, we benchmarked dropkick against a conventional thresholding approach and EmptyDrops, a popular computational method, demonstrating greater recovery of rare cell types and exclusion of empty droplets and noisy, uninformative barcodes. We show for both low and high-background datasets that dropkick’s weakly supervised model reliably learns which genes are enriched in ambient barcodes and draws a multidimensional boundary that is more robust to dataset-specific variation than existing filtering approaches. dropkick provides a fast, automated tool for reproducible cell identification from scRNA-seq data that is critical to downstream analysis and compatible with popular single-cell analysis Python packages.
