scPower accelerates and optimizes the design of multi-sample single cell transcriptomic studies

Katharina T. Schmid; Barbara Höllbacher; Cristiana Cruceanu; Anika Böttcher; Heiko Lickert; Elisabeth B. Binder; Fabian J. Theis; Matthias Heinig

doi:10.1038/s41467-021-26779-7


Nature Communications ◽

10.1038/s41467-021-26779-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Katharina T. Schmid ◽

Barbara Höllbacher ◽

Cristiana Cruceanu ◽

Anika Böttcher ◽

Heiko Lickert ◽

...

Keyword(s):

Single Cell ◽

Power Analysis ◽

Optimal Parameter ◽

Cell Types ◽

R Package ◽

Statistical Framework ◽

Limited Budget ◽

Single Cell Profiling ◽

Differential Gene ◽

Efficient Power

Abstract Single cell RNA-seq has revolutionized transcriptomics by providing cell type resolution for differential gene expression and expression quantitative trait loci (eQTL) analyses. However, efficient power analysis methods for single cell data and inter-individual comparisons are lacking. Here, we present scPower, a statistical framework for the design and power analysis of multi-sample single cell transcriptomic experiments. We modelled the relationship between sample size, the number of cells per individual, sequencing depth, and the power of detecting differentially expressed genes within cell types. We systematically evaluated the optimal combinations of these parameters for several single cell profiling platforms and generated broad recommendations. In general, shallow sequencing of high numbers of cells leads to higher overall power than deep sequencing of fewer cells. The model, including priors, is implemented as an R package and is accessible as a web tool. scPower is a highly customizable tool that experimentalists can use to quickly compare a multitude of experimental designs and optimize for a limited budget.


Design and power analysis for multi-sample single cell genomics experiments

10.21203/rs.3.rs-331370/v1 ◽

2021 ◽

Author(s):

Katharina Schmid ◽

Cristiana Cruceanu ◽

Anika Böttcher ◽

Heiko Lickert ◽

Elisabeth Binder ◽

...

Keyword(s):

Single Cell ◽

Power Analysis ◽

Cell Types ◽

R Package ◽

Single Cell Genomics ◽

Statistical Framework ◽

Limited Budget ◽

Size Number ◽

Differential Gene ◽

Efficient Power

Abstract Single cell RNA-seq revolutionizes transcriptomics by providing cell type resolution for interindividual differential gene expression and expression quantitative trait loci analyses. However, efficient power analysis methods that account for the characteristics of single cell data and interindividual comparisons are missing. Here we present a statistical framework for the design and power analysis of multi-sample single cell genomics experiments. The model relates sample size, number of cells per individual and sequencing depth to the power of detecting differentially expressed genes within cell types. It enables fast, systematic comparison of alternative experimental designs and optimization for a limited budget. We evaluated data-driven priors for a range of applications and single cell platforms. In many settings, shallow sequencing of high numbers of cells leads to higher overall power than deep sequencing of fewer cells. The model, including priors, is implemented as the R package scPower and is accessible as a web tool.


Faculty Opinions recommendation of Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.732909626.793544340 ◽

2018 ◽

Author(s):

Bruce Appel

Keyword(s):

Single Cell ◽

Cell Types ◽

Vertebrate Brain ◽

Single Cell Profiling


Single-cell mapping of focused ultrasound-transfected brain

Gene Therapy ◽

10.1038/s41434-021-00226-0 ◽

2021 ◽

Author(s):

A. S. Mathew ◽

C. M. Gorick ◽

R. J. Price

Keyword(s):

Single Cell ◽

Focused Ultrasound ◽

Cell Types ◽

Brain Cell ◽

Therapeutic Modality ◽

Full Potential ◽

Stress Genes ◽

Multiple Cell ◽

Differential Gene

Abstract Gene delivery via focused ultrasound (FUS)-mediated blood-brain barrier (BBB) opening is a disruptive therapeutic modality. Unlocking its full potential will require an understanding of how FUS parameters (e.g., peak-negative pressure (PNP)) affect transfected cell populations. Following plasmid (mRuby) delivery across the BBB with 1 MHz FUS, we used single-cell RNA-sequencing to ascertain that distributions of transfected cell types were highly dependent on PNP. Cells of the BBB (i.e., endothelial cells, pericytes, and astrocytes) were enriched at 0.2 MPa PNP, while transfection of cells distal to the BBB (i.e., neurons, oligodendrocytes, and microglia) was augmented at 0.4 MPa PNP. PNP-dependent differential gene expression was observed for multiple cell types. Cell stress genes were upregulated in proportion to PNP, independent of cell type. Our results underscore how FUS may be tuned to bias transfection toward specific brain cell types in vivo and predict how those cells will respond to transfection.


Selecting single cell clustering parameter values using subsampling-based robustness metrics

BMC Bioinformatics ◽

10.1186/s12859-021-03957-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Ryan B. Patterson-Cross ◽

Ariel J. Levine ◽

Vilas Menon

Keyword(s):

Single Cell ◽

Optimal Parameter ◽

Clustering Algorithms ◽

Cell Types ◽

Parameter Selection ◽

Data Set ◽

Biologically Relevant ◽

Cell Clustering ◽

Parameter Values ◽

Robustness Metrics

Abstract Background: Generating and analysing single-cell data has become a widespread approach to examine tissue heterogeneity, and numerous algorithms exist for clustering these datasets to identify putative cell types with shared transcriptomic signatures. However, many of these clustering workflows rely on user-tuned parameter values, tailored to each dataset, to identify a set of biologically relevant clusters. Whereas users often develop their own intuition as to the optimal range of parameters for clustering on each data set, the lack of systematic approaches to identify this range can be daunting to new users of any given workflow. In addition, an optimal parameter set does not guarantee that all clusters are equally well-resolved, given the heterogeneity in transcriptomic signatures in most biological systems. Results: Here, we illustrate a subsampling-based approach (chooseR) that simultaneously guides parameter selection and characterizes cluster robustness. Through bootstrapped iterative clustering across a range of parameters, chooseR was used to select parameter values for two distinct clustering workflows (Seurat and scVI). In each case, chooseR identified parameters that produced biologically relevant clusters from both well-characterized (human PBMC) and complex (mouse spinal cord) datasets. Moreover, it provided a simple “robustness score” for each of these clusters, facilitating the assessment of cluster quality. Conclusion: chooseR is a simple, conceptually understandable tool that can be used flexibly across clustering algorithms, workflows, and datasets to guide clustering parameter selection and characterize cluster robustness.


A Novel Method to Identify the Differences Between Two Single Cell Groups at Single Gene, Gene Pair, and Gene Module Levels

Frontiers in Genetics ◽

10.3389/fgene.2021.648898 ◽

2021 ◽

Vol 12 ◽

Author(s):

Lingyu Cui ◽

Bo Wang ◽

Changjing Ren ◽

Ailan Wang ◽

Hong An ◽

...

Keyword(s):

Single Cell ◽

Molecular Mechanisms ◽

Single Gene ◽

Cell Types ◽

Human Pancreas ◽

Biological Difference ◽

Cell Clusters ◽

Cell Groups ◽

Network Modules ◽

Differential Gene

Single-cell sequencing technology not only reveals the heterogeneity of cells from a molecular perspective, but also enables the discovery of new cell types. Although there are many effective methods for dropout imputation, cell clustering, and lineage reconstruction based on single cell RNA sequencing (RNA-seq) data, there is no systematic pipeline for comparing two single cell clusters at the molecular level. In this study, we present a novel pipeline for comparing two single cell clusters, including calling differential gene expression and identifying differential gene coexpression and network modules. The pipeline can reveal mechanisms behind the biological differences between cell clusters and cell types, and identify cell type specific molecular mechanisms. We applied the pipeline to two well-known single-cell datasets, Usoskin from mouse brain and Xin from human pancreas, which contained 622 and 1,600 cells, respectively, and each of which was composed of four types of cells. As a result, we identified many significant differential genes, differential gene coexpression relationships, and network modules among the cell clusters, which confirmed that different cell clusters might perform different functions.


STACAS: Sub-Type Anchor Correction for Alignment in Seurat to integrate single-cell RNA-seq data

Bioinformatics ◽

10.1093/bioinformatics/btaa755 ◽

2020 ◽

Cited By ~ 1

Author(s):

Massimo Andreatta ◽

Santiago J Carmona

Keyword(s):

Single Cell ◽

Distance Measure ◽

Source Code ◽

Cell Types ◽

R Package ◽

Computational Method ◽

Biological Variability ◽

RNA-seq ◽

Batch Effects ◽

Guide Trees

Abstract Summary: STACAS is a computational method for the identification of integration anchors in the Seurat environment, optimized for the integration of single-cell (sc) RNA-seq datasets that share only a subset of cell types. We demonstrate that by (i) correcting batch effects while preserving relevant biological variability across datasets, (ii) filtering aberrant integration anchors with a quantitative distance measure and (iii) constructing optimal guide trees for integration, STACAS can accurately align scRNA-seq datasets composed of only partially overlapping cell populations. Availability and implementation: Source code and R package available at https://github.com/carmonalab/STACAS; Docker image available at https://hub.docker.com/repository/docker/mandrea1/stacas_demo.


rPanglaoDB: an R package to download and merge labeled single-cell RNA-seq data from the PanglaoDB database

10.1101/2021.05.28.446161 ◽

2021 ◽

Author(s):

Daniel Osorio ◽

Marieke Lydia Kuijjer ◽

James J. Cai

Keyword(s):

Single Cell ◽

Cell Types ◽

R Package ◽

RNA-seq ◽

Cell Type ◽

Sequencing Data ◽

Single Experiment ◽

Tissue Samples ◽

Molecular Phenotypes ◽

Public Datasets

Motivation: Characterizing cells with rare molecular phenotypes is one of the promises of high throughput single-cell RNA sequencing (scRNA-seq) techniques. However, collecting enough cells with the desired molecular phenotype in a single experiment is challenging, requiring several sample preprocessing steps to filter and collect the desired cells experimentally before sequencing. Data integration of multiple public single-cell experiments offers a solution to this problem, allowing the collection of enough cells exhibiting the desired molecular signatures. By increasing the sample size of the desired cell type, this approach enables a robust cell type transcriptome characterization. Results: Here, we introduce rPanglaoDB, an R package to download and merge the uniformly processed and annotated scRNA-seq data provided by the PanglaoDB database. To show the potential of rPanglaoDB for collecting rare cell types by integrating multiple public datasets, we present a biological application collecting and characterizing a set of 157 fibrocytes. Fibrocytes are a rare monocyte-derived cell type that exhibits both the inflammatory features of macrophages and the tissue remodeling properties of fibroblasts. This constitutes the first report of an unbiased transcriptome profile of fibrocytes. We compared the transcriptomic profile of the fibrocytes against the fibroblasts collected from the same tissue samples and confirmed their association with healing processes in tissue damage and infection through the activation of the prostaglandin biosynthesis and regulation pathway. Availability and Implementation: rPanglaoDB is implemented as an R package available through the CRAN repositories https://CRAN.R-project.org/package=rPanglaoDB.


LRcell: detecting the source of differential expression at the sub-cell type level from bulk RNA-seq data

10.1101/2021.08.10.455821 ◽

2021 ◽

Author(s):

Wenjing Ma ◽

Sumeet Sharma ◽

Peng Jin ◽

Shannon L Gourley ◽

Zhaohui Qin

Keyword(s):

Single Cell ◽

Cell Types ◽

Marker Genes ◽

Bioconductor Package ◽

RNA-seq ◽

Cell Type ◽

Reference Dataset ◽

Cell Type Composition ◽

Type Composition ◽

Differential Gene

The rapid proliferation of single-cell RNA-sequencing (scRNA-seq) datasets has revealed cell heterogeneity at unprecedented scales. Several deconvolution methods have been developed to decompose bulk experiments to reveal cell type contributions. However, these methods lack power in identifying the accurate cell type composition when the reference dataset contains a considerable number of sub-cell types. Here, we present LRcell, an R Bioconductor package (http://bioconductor.org/packages/release/bioc/html/LRcell.html) that aims to identify the specific sub-cell type(s) that drive the changes observed in a bulk RNA-seq differential gene expression experiment. In addition, LRcell provides pre-embedded marker genes computed from putative single-cell RNA-seq experiments as options to execute the analyses.


Urinary Single-Cell Profiling Captures the Cellular Diversity of the Kidney

Journal of the American Society of Nephrology ◽

10.1681/asn.2020050757 ◽

2021 ◽

Vol 32 (3) ◽

pp. 614-627

Author(s):

Amin Abedini ◽

Yuan O. Zhu ◽

Shatakshee Chatterjee ◽

Gabor Halasz ◽

Kishor Devalaraja-Narashimha ◽

...

Keyword(s):

Single Cell ◽

Collecting Duct ◽

Microscopic Analysis ◽

Human Kidney ◽

Cell Types ◽

Urine Sediment ◽

Type Composition ◽

Single Cell Profiling ◽

Bladder Cells ◽

Almost All

Background: Microscopic analysis of urine sediment is probably the most commonly used diagnostic procedure in nephrology. The urinary cells, however, have not yet undergone careful unbiased characterization. Methods: Single-cell transcriptomic analysis was performed on 17 urine samples obtained from five subjects on two different occasions, using both spot and 24-hour urine collection. A pooled urine sample from multiple healthy individuals served as a reference control. In total, 23,082 cells were analyzed. Urinary cells were compared with human kidney and human bladder datasets to understand similarities and differences among the observed cell types. Results: Almost all kidney cell types can be identified in urine, such as podocyte, proximal tubule, loop of Henle, and collecting duct, in addition to macrophages, lymphocytes, and bladder cells. The urinary cell-type composition was subject specific and reasonably stable using different collection methods and over time. Urinary cells clustered with kidney and bladder cells, such as urinary podocytes with kidney podocytes, and principal cells of the kidney and urine, indicating their similarities in gene expression. Conclusions: A reference dataset for cells in human urine was generated. Single-cell transcriptomics enables detection and quantification of almost all types of cells in the kidney and urinary tract.


Destin: toolkit for single-cell analysis of chromatin accessibility

Bioinformatics ◽

10.1093/bioinformatics/btz141 ◽

2019 ◽

Vol 35 (19) ◽

pp. 3818-3820 ◽

Cited By ~ 10

Author(s):

Eugene Urrutia ◽

Li Chen ◽

Haibo Zhou ◽

Yuchao Jiang

Keyword(s):

Single Cell ◽

Single Cell Analysis ◽

New Technology ◽

R Package ◽

Chromatin Accessibility ◽

Supplementary Information ◽

Cell Type ◽

Statistical Framework ◽

Specific Association ◽

Accessible Chromatin

Abstract Summary: Single-cell assay of transposase-accessible chromatin followed by sequencing (scATAC-seq) is an emerging technology for the study of gene regulation with single-cell resolution. The data from scATAC-seq are unique: sparse, binary and highly variable even within the same cell type. As such, methods developed for neither bulk ATAC-seq nor single-cell RNA-seq data are appropriate. Here, we present Destin, a bioinformatic and statistical framework for comprehensive scATAC-seq data analysis. Destin performs cell-type clustering via weighted principal component analysis, weighting accessible chromatin regions by existing genomic annotations and publicly available regulomic datasets. The weights and additional tuning parameters are determined via model-based likelihood. We evaluated the performance of Destin using downsampled bulk ATAC-seq data of purified samples and scATAC-seq data from seven diverse experiments. Compared to existing methods, Destin was shown to outperform across all datasets and platforms. For demonstration, we further applied Destin to 2088 adult mouse forebrain cells and identified cell-type-specific association of previously reported schizophrenia GWAS loci. Availability and implementation: Destin toolkit is freely available as an R package at https://github.com/urrutiag/destin. Supplementary information: Supplementary data are available at Bioinformatics online.
