Scalable analysis of cell-type composition from single-cell transcriptomics using deep recurrent learning

The rapid proliferation of single-cell RNA-sequencing (scRNA-seq) datasets has revealed cell heterogeneity at unprecedented scales. Several deconvolution methods have been developed to decompose bulk experiments and reveal cell type contributions. However, these methods lack power to identify the accurate cell type composition when the reference dataset contains a considerable number of sub-cell types. Here, we present LRcell, an R Bioconductor package (http://bioconductor.org/packages/release/bioc/html/LRcell.html) that aims to identify the specific sub-cell type(s) that drive the changes observed in a bulk RNA-seq differential gene expression experiment. In addition, LRcell provides pre-embedded marker genes computed from putative single-cell RNA-seq experiments as options to execute the analyses.

Comprehensive benchmarking of computational deconvolution of transcriptomics data

10.1101/2020.01.10.897116, 2020, Cited By ~ 2

Author(s): Francisco Avila Cobos, José Alquicira-Hernandez, Joseph Powell, Pieter Mestdagh, Katleen De Preter

Keyword(s): Single Cell, Cell Types, Cell Type, Factors Affecting, Marker Selection, Cell Type Composition, Type Composition, Comparable Performance, Transcriptomics Data, Combined Impact

Abstract: Many computational methods to infer cell type proportions from bulk transcriptomics data have been developed. Attempts to compare these methods revealed that the choice of reference marker signatures is far more important than the method itself. However, a thorough evaluation of the combined impact of data transformation, pre-processing, marker selection, cell type composition and choice of methodology on the results is still lacking. Using different single-cell RNA-sequencing (scRNA-seq) datasets, we generated hundreds of pseudo-bulk mixtures to evaluate the combined impact of these factors on the deconvolution results. Along with methods to perform deconvolution of bulk RNA-seq data, we also included five methods specifically designed to infer the cell type composition of bulk data using scRNA-seq data as reference. Both bulk and single-cell deconvolution methods perform best when applied to data in linear scale, and the choice of normalization can have a dramatic impact on the performance of some, but not all, methods. Overall, single-cell methods have comparable performance to the best performing bulk methods, and bulk methods based on semi-supervised approaches showed higher error and lower correlation values between the computed and the expected proportions. Moreover, failure to include cell types in the reference that are present in a mixture always led to substantially worse results, regardless of any of the previous choices. Taken together, we provide a thorough evaluation of the combined impact of the different factors affecting the computational deconvolution task across different datasets and propose general guidelines to maximize its performance.
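The evaluation design described in this abstract (pseudo-bulk mixtures with known proportions, scored by error and correlation) can be illustrated with a minimal sketch. The toy cells, the function names and the sampling scheme below are assumptions made for illustration; they are not taken from the benchmarked study, and no deconvolution method is run here.

```python
import math
import random


def make_pseudo_bulk(cells, n_cells, rng):
    """Sum linear-scale counts of randomly drawn cells.

    `cells` is a list of (cell_type, counts) pairs. Returns the summed
    mixture and the true cell type proportions among the drawn cells.
    """
    drawn = [rng.choice(cells) for _ in range(n_cells)]
    n_genes = len(drawn[0][1])
    mixture = [sum(counts[g] for _, counts in drawn) for g in range(n_genes)]
    types = sorted({cell_type for cell_type, _ in cells})
    proportions = {
        t: sum(1 for cell_type, _ in drawn if cell_type == t) / n_cells
        for t in types
    }
    return mixture, proportions


def rmse(expected, computed):
    """Root-mean-square error between expected and computed proportions."""
    squared = [(e - c) ** 2 for e, c in zip(expected, computed)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical toy reference: two cell types, three genes.
toy_cells = [
    ("A", [10, 0, 1]),
    ("A", [8, 1, 0]),
    ("B", [0, 9, 2]),
    ("B", [1, 7, 3]),
]
mixture, true_props = make_pseudo_bulk(toy_cells, 100, random.Random(0))
```

The proportions returned by `make_pseudo_bulk` play the role of the "expected" values; the output of any deconvolution method on `mixture` could then be scored against them with `rmse` and `pearson`.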

Genetic design automation for autonomous formation of multicellular shapes from a single cell progenitor

10.1101/807107, 2019

Author(s): Evan Appleton, Noushin Mehdipour, Tristan Daifuku, Demarcus Briers, Iman Haghighi, ...

Keyword(s): Single Cell, Cell Biology, Computer Aided Design, Complex Structures, Cell Type, Genetic Circuits, Cell Type Composition, Type Composition, Starting Point, Aided Design

Abstract: Multi-cellular organisms originate from a single cell, ultimately giving rise to mature organisms of heterogeneous cell type composition in complex structures. Recent work in the areas of stem cell biology and tissue engineering has laid major groundwork in the ability to convert certain types of cells into other types, but there has been limited progress in the ability to control the morphology of cellular masses as they grow. Contemporary approaches to this problem have included the use of artificial scaffolds, 3D bioprinting, and complex media formulations; however, there are no existing approaches to controlling this process purely through genetics and from a single-cell starting point. Here we describe a computer-aided design approach for designing recombinase-based genetic circuits for controlling the formation of multi-cellular masses into arbitrary shapes in human cells.

propeller: testing for differences in cell type proportions in single cell data

10.1101/2021.11.28.470236, 2021

Author(s): Belinda Phipson, Choon Boon Sim, Enzo R. Porrello, Alex W Hewitt, Joseph Powell, ...

Keyword(s): Single Cell, Single Cells, R Package, Cell Type, Experimental Conditions, Cell Type Composition, Type Composition, Biological Replication, Cell Data, Different Sources

Single cell RNA sequencing (scRNA-seq) has rapidly gained popularity over the last few years for profiling the transcriptomes of thousands to millions of single cells. To date, more than a thousand software packages have been developed to analyse scRNA-seq data. These focus predominantly on visualization, dimensionality reduction and cell type identification. Single cell technology is now being used to analyse experiments with complex designs, including biological replication. One question that can be asked of single cell experiments, and that has not been possible to address with bulk RNA-seq data, is whether the cell type proportions differ between two or more experimental conditions. As well as gene expression changes, the relative depletion or enrichment of a particular cell type can be the functional consequence of disease or treatment. However, cell type proportion estimates from scRNA-seq data are variable, and statistical methods that can correctly account for different sources of variability are needed to confidently identify statistically significant shifts in cell type composition between experimental conditions. We present propeller, a robust and flexible method that leverages biological replication to find statistically significant differences in cell type proportions between groups. The propeller method is publicly available in the open source speckle R package (https://github.com/Oshlack/speckle).
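To make the role of biological replication concrete, the sketch below compares one cell type's proportion between two groups using per-replicate proportions. It is a simplified stand-in and not the propeller method: the arcsine square-root transform, the Welch t-statistic, and all sample counts are assumptions chosen for illustration, since the abstract does not describe how propeller models variability.

```python
import math
from statistics import mean, variance


def proportions(counts):
    """Convert per-cell-type cell counts of one sample into proportions."""
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


def asin_sqrt(p):
    """Variance-stabilising transform for a proportion in [0, 1]."""
    return math.asin(math.sqrt(p))


def welch_t(a, b):
    """Welch t-statistic for two groups of replicate-level values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Hypothetical cell counts per biological replicate.
control = [{"T": 100, "B": 900}, {"T": 120, "B": 880}, {"T": 90, "B": 910}]
treated = [{"T": 300, "B": 700}, {"T": 280, "B": 720}, {"T": 330, "B": 670}]

control_vals = [asin_sqrt(proportions(s)["T"]) for s in control]
treated_vals = [asin_sqrt(proportions(s)["T"]) for s in treated]
t_stat = welch_t(treated_vals, control_vals)
```

Each replicate contributes one proportion, so the variability between samples, rather than the number of cells, determines how large the statistic is.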

SingleCellNet: a computational tool to classify single cell RNA-Seq data across platforms and across species

10.1101/508085, 2018, Cited By ~ 8

Author(s): Yuqi Tan, Patrick Cahan

Keyword(s): Single Cell, Cell Fate, Marker Genes, Rna Seq, Cell Type, Computational Tool, Cell Type Composition, Type Composition, Cell Type Specific

Single cell RNA-Seq has emerged as a powerful tool in diverse applications, ranging from determining the cell-type composition of tissues to uncovering the regulators of developmental programs. A near-universal step in the analysis of single cell RNA-Seq data is to hypothesize the identity of each cell. Often, this is achieved by finding cells that express combinations of marker genes that had previously been implicated as being cell-type specific, an approach that is not quantitative and does not explicitly take advantage of other single cell RNA-Seq studies. Here, we describe our tool, SingleCellNet, which addresses these issues and enables the classification of query single cell RNA-Seq data in comparison to reference single cell RNA-Seq data. SingleCellNet compares favorably to other methods, and it is notably able to make sensitive and accurate classifications across platforms and species. We demonstrate how SingleCellNet can be used to classify previously undetermined cells, and how it can be used to assess the outcome of cell fate engineering experiments.

Strategies for cellular deconvolution in human brain RNA sequencing data

F1000Research, 10.12688/f1000research.50858.1, 2021, Vol 10, pp. 750

Author(s): Olukayode A. Sosina, Matthew N. Tran, Kristen R. Maynard, Ran Tao, Margaret A. Taub, ...

Keyword(s): Single Cell, Expression Data, Moderate Correlation, Rna Seq, Cell Type, Sequencing Data, Reference Dataset, The Past, Cell Type Composition, Type Composition

Background: Statistical deconvolution strategies have emerged over the past decade to estimate the proportion of various cell populations in homogenate tissue sources like brain using gene expression data. However, no study has been undertaken to assess the extent to which expression-based and DNAm-based cell type composition estimates agree. Results: Using estimated neuronal fractions from DNAm data, from the same brain region (i.e., matched) as our bulk RNA-Seq dataset, as proxies for the true unobserved cell-type fractions (i.e., as the gold standard), we assessed the accuracy (RMSE) and concordance (R²) of four reference-based deconvolution algorithms: Houseman, CIBERSORT, non-negative least squares (NNLS)/MIND, and MuSiC. We did this for two cell-type populations, neurons and non-neurons/glia, using matched single nuclei RNA-Seq and mismatched single cell RNA-Seq reference datasets. With the mismatched single cell RNA-Seq reference dataset, Houseman, MuSiC, and NNLS produced concordant (high correlation; Houseman R² = 0.51, 95% CI [0.39, 0.65]; MuSiC R² = 0.56, 95% CI [0.43, 0.69]; NNLS R² = 0.54, 95% CI [0.32, 0.68]) but biased (high RMSE, >0.35) neuronal fraction estimates. CIBERSORT produced more discordant (moderate correlation; R² = 0.25, 95% CI [0.15, 0.38]) neuronal fraction estimates, but with less bias (low RMSE, 0.09). Using the matched single nuclei RNA-Seq reference dataset did not eliminate bias (MuSiC RMSE = 0.17). Conclusions: Our results together suggest that many existing RNA deconvolution algorithms estimate the RNA composition of homogenate tissue, e.g. the amount of RNA attributable to each cell type, and not the cellular composition, which relates to the underlying fraction of cells.
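The distinction drawn in the conclusions, RNA composition versus cellular composition, can be shown with a small worked example. The cell fractions and the per-cell RNA amounts below are hypothetical values chosen only to make the arithmetic visible; they are not measurements from the study.

```python
def rna_fractions(cell_fractions, rna_per_cell):
    """Fraction of total RNA contributed by each cell type.

    Each cell type's share is its cell fraction weighted by the
    (relative) amount of RNA one cell of that type contains.
    """
    weighted = {k: cell_fractions[k] * rna_per_cell[k] for k in cell_fractions}
    total = sum(weighted.values())
    return {k: w / total for k, w in weighted.items()}


# Hypothetical tissue: 40% neurons, each holding 3x the RNA of a glial cell.
cell_fractions = {"neuron": 0.4, "glia": 0.6}
rna_per_cell = {"neuron": 3.0, "glia": 1.0}

fractions = rna_fractions(cell_fractions, rna_per_cell)
# neuron: 1.2 / 1.8, about 0.667 of the RNA despite being 0.4 of the cells
```

Under these assumed values, a method that recovers the RNA composition perfectly would still report a neuronal fraction of about 0.67 against a cellular fraction of 0.4, which is the kind of bias the abstract describes.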

Spatial cell type composition in normal and Alzheimers human brains is revealed using integrated mouse and human single cell RNA sequencing

Scientific Reports, 10.1038/s41598-020-74917-w, 2020, Vol 10 (1)

Author(s): Travis S. Johnson, Shunian Xiang, Bryan R. Helm, Zachary B. Abrams, Peter Neidecker, ...

Keyword(s): Human Brain, Single Cell, Rna Sequencing, Brain Tissue, Cell Types, Brain Atlas, Cell Type, Cell Type Composition, Type Composition, Single Cell Rna Sequencing

Abstract: Single-cell RNA sequencing (scRNA-seq) resolves heterogeneous cell populations in tissues and helps to reveal single-cell level function and dynamics. In neuroscience, the rarity of brain tissue is the bottleneck for such studies. Evidence shows that mouse and human share similar cell type gene markers. We hypothesized that the scRNA-seq data of mouse brain tissue can be used to complete human data to infer cell type composition in human samples. Here, we supplement the cell type information of human scRNA-seq data with mouse data. The resulting data were used to infer the spatial cellular composition of 3702 human brain samples from the Allen Human Brain Atlas. We then mapped the cell types back to the corresponding brain regions. Most cell types were localized to the correct regions. We also compared the mapping results to those derived from neuronal nuclei locations. They were consistent after accounting for changes in neural connectivity between regions. Furthermore, we applied this approach to Alzheimer’s brain data and successfully captured cell pattern changes in AD brains. We believe this integrative approach can solve the sample rarity issue in neuroscience.

scDC: single cell differential composition analysis

BMC Bioinformatics, 10.1186/s12859-019-3211-9, 2019, Vol 20 (S19), Cited By ~ 6

Author(s): Yue Cao, Yingxin Lin, John T. Ormerod, Pengyi Yang, Jean Y.H. Yang, ...

Keyword(s): Single Cell, Confidence Intervals, Biological Significance, Composition Analysis, Bootstrap Resampling, Cell Type, Cell Type Composition, Type Composition, Differential Cell, Synthetic Datasets

Abstract: Background: Differences in cell-type composition across subjects and conditions often carry biological significance. Recent advancements in single cell sequencing technologies enable cell types to be identified at the single cell level, and as a result, the cell-type composition of tissues can now be studied in exquisite detail. However, a number of challenges remain with cell-type composition analysis: none of the existing methods can identify cell types perfectly, and variability related to cell sampling exists in any single cell experiment. This necessitates the development of methods for estimating uncertainty in cell-type composition. Results: We developed a novel single cell differential composition (scDC) analysis method that performs differential cell-type composition analysis via bootstrap resampling. scDC captures the uncertainty associated with the cell-type proportions of each subject via bias-corrected and accelerated bootstrap confidence intervals. We assessed the performance of our method using a number of simulated datasets and synthetic datasets curated from publicly available single cell datasets. In simulated datasets, scDC correctly recovered the true cell-type proportions. In synthetic datasets, the cell-type compositions returned by scDC were highly concordant with reference cell-type compositions from the original data. Since the majority of datasets tested in this study have only 2 to 5 subjects per condition, the addition of confidence intervals enabled better comparisons of compositional differences between subjects and across conditions. Conclusions: scDC is a novel statistical method for performing differential cell-type composition analysis for scRNA-seq data. It uses bootstrap resampling to estimate the standard errors associated with cell-type proportion estimates and performs significance testing through GLM and GLMM models. We have made this method available to the scientific community as part of the scdney (Single Cell Data Integrative Analysis) R package, available from https://github.com/SydneyBioX/scdney.
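The idea of attaching bootstrap confidence intervals to a subject's cell-type proportion can be sketched as follows. Note the swap: the abstract states that scDC uses bias-corrected and accelerated (BCa) intervals, whereas this sketch uses the simpler percentile interval. The function name, defaults and toy labels are assumptions for illustration and are not part of the scdney package.

```python
import random


def bootstrap_proportion_ci(labels, cell_type, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for one cell type's proportion.

    `labels` holds one cell-type label per cell of a single subject.
    Cells are resampled with replacement and the proportion recomputed.
    """
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(labels, k=n)
        estimates.append(resample.count(cell_type) / n)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical subject with 30 T cells and 70 B cells.
toy_labels = ["T"] * 30 + ["B"] * 70
ci_low, ci_high = bootstrap_proportion_ci(toy_labels, "T")
```

With only a handful of subjects per condition, an interval like this per subject gives a sense of how much of an observed difference could be explained by cell sampling alone.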

SpiceMix: Integrative single-cell spatial modeling for inferring cell identity

10.1101/2020.11.29.383067, 2020

Author(s): Benjamin Chidester, Tianming Zhou, Jian Ma

Keyword(s): Single Cell, Spatial Organization, Spatial Information, Single Cells, Joint Analysis, Transcriptome Data, Cell Type, Cell Identity, Cell Type Composition, Type Composition

Abstract: Spatial transcriptomics technologies promise to reveal spatial relationships of cell-type composition in complex tissues. However, the development of computational methods that capture the unique properties of single-cell spatial transcriptome data to unveil cell identities remains a challenge. Here, we report SpiceMix, a new probabilistic model that enables effective joint analysis of spatial information and gene expression of single cells based on spatial transcriptome data. Both simulation and real data evaluations demonstrate that SpiceMix consistently improves upon the inference of the intrinsic cell types compared with existing approaches. As a proof-of-principle, we use SpiceMix to analyze single-cell spatial transcriptome data of the mouse primary visual cortex acquired by seqFISH+ and STARmap. We find that SpiceMix can improve cell identity assignments and uncover potentially new cell subtypes. SpiceMix is a generalizable framework for analyzing spatial transcriptome data that may provide critical insights into the cell-type composition and spatial organization of cells in complex tissues.

Single Cell Transcriptional Signatures of the Human Placenta in Term and Preterm Parturition

10.1101/738658, 2019, Cited By ~ 2

Author(s): Roger Pique-Regi, Roberto Romero, Adi L. Tarca, Edward D. Sendler, Yi Xu, ...

Keyword(s): Gene Expression, T Cell, Single Cell, Human Placenta, Preterm Labor, Cell Types, Cell Type, Cell Type Composition, Type Composition, Activated T Cell

Abstract: More than 135 million births occur each year; yet, the molecular underpinnings of human parturition in gestational tissues, and in particular the placenta, are still poorly understood. The placenta is a complex heterogeneous organ including cells of both maternal and fetal origin, and insults that disrupt the maternal-fetal dialogue could result in adverse pregnancy outcomes such as preterm birth. There is limited knowledge of the cell type composition and transcriptional activity of the placenta and its compartments during physiologic and pathologic parturition. To fill this knowledge gap, we used scRNA-seq to profile the placental villous tree, basal plate, and chorioamniotic membranes of women with or without labor at term and those with preterm labor. Significant differences in cell type composition and transcriptional profiles were found among placental compartments and across study groups. For the first time, two cell types were identified: 1) lymphatic endothelial decidual cells in the chorioamniotic membranes, and 2) non-proliferative interstitial cytotrophoblasts in the placental villi. Maternal macrophages from the chorioamniotic membranes displayed the largest differences in gene expression (e.g. NFKB1) in both processes of labor; yet, specific gene expression changes were also detected in preterm labor. Importantly, several placental scRNA-seq transcriptional signatures were modulated with advancing gestation in the maternal circulation, and specific immune cell type signatures were increased with labor at term (NK-cell and activated T-cell) and with preterm labor (macrophage, monocyte, and activated T-cell). Herein, we provide a catalogue of cell types and transcriptional profiles in the human placenta, shedding light on the molecular underpinnings and non-invasive prediction of physiologic and pathologic parturition.

One sentence summary: The common molecular pathway of parturition for both term and preterm spontaneous labor is characterized using single cell gene expression analysis of the human placenta.
