Clustering single cells: a review of approaches on high-and low-depth single-cell RNA-seq data

Multimodal single-cell RNA sequencing enables the precise mapping of transcriptional and phenotypic features of cellular differentiation states but does not allow for simultaneous integration of critical posttranslational modification data. Here, we describe SUrface-protein Glycan And RNA-seq (SUGAR-seq), a method that enables detection and analysis of N-linked glycosylation, extracellular epitopes, and the transcriptome at the single-cell level. Integrated SUGAR-seq and glycoproteome analysis identified tumor-infiltrating T cells with unique surface glycan properties that report their epigenetic and functional state.

Leveraging high-powered RNA-Seq datasets to improve inference of regulatory activity in single-cell RNA-Seq data

10.1101/553040 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ning Wang ◽

Andrew E. Teschendorff

Keyword(s):

Transcription Factors ◽

Single Cell ◽

Cell Fate ◽

Regulatory Networks ◽

Large Scale ◽

Single Cells ◽

Differential Expression Analysis ◽

Dropout Rate ◽

Rna Seq ◽

Regulatory Activity

Abstract: Inferring the activity of transcription factors in single cells is a key task to improve our understanding of development and complex genetic diseases. This task is, however, challenging due to the relatively large dropout rate and noisy nature of single-cell RNA-Seq data. Here we present a novel statistical inference framework called SCIRA (Single Cell Inference of Regulatory Activity), which leverages the power of large-scale bulk RNA-Seq datasets to infer high-quality tissue-specific regulatory networks, from which regulatory activity estimates in single cells can be subsequently obtained. We show that SCIRA can correctly infer regulatory activity of transcription factors affected by high technical dropouts. In particular, SCIRA can improve sensitivity by as much as 70% compared to differential expression analysis and current state-of-the-art methods. Importantly, SCIRA can reveal novel regulators of cell fate in tissue development, even for cell types that only make up 5% of the tissue, and can identify key novel tumor suppressor genes in cancer at single-cell resolution. In summary, SCIRA will be an invaluable tool for single-cell studies aiming to accurately map activity patterns of key transcription factors during development, and how these are altered in disease.
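
The abstract above does not spell out how an activity estimate is obtained from a regulatory network, so the following is only an illustrative sketch of one simple approach, not the SCIRA implementation: regress a single cell's expression values for a transcription factor's target genes on the signed regulon weights and use the t-statistic of the slope as the activity score. The function name `regulon_activity` and the toy numbers are hypothetical.

```python
import math

def regulon_activity(expr, weights):
    """t-statistic of the slope when regressing one cell's target-gene
    expression (expr) on signed regulon weights (+1 activated, -1 repressed).
    Illustrative assumption only; requires at least 3 targets and some
    residual variation."""
    n = len(expr)
    mean_w = sum(weights) / n
    mean_e = sum(expr) / n
    sxx = sum((w - mean_w) ** 2 for w in weights)
    sxy = sum((w - mean_w) * (e - mean_e) for w, e in zip(weights, expr))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_w
    # Residual sum of squares around the fitted line
    rss = sum((e - intercept - slope * w) ** 2 for w, e in zip(weights, expr))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err

# Hypothetical regulon of six targets and one cell's z-scored expression
weights = [1, 1, 1, -1, -1, 1]
cell_expr = [1.2, 0.8, 1.5, -0.9, -1.1, 0.7]
print(regulon_activity(cell_expr, weights))
```

Because the score aggregates over many targets, it can remain informative even when the transcription factor's own transcript is lost to dropout, which is the situation the abstract highlights.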

Characterizing Intercellular Communication of Pan-Cancer Reveals SPP1+ Tumor-Associated Macrophage Expanded in Hypoxia and Promoting Cancer Malignancy Through Single-Cell RNA-Seq Data

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.749210 ◽

2021 ◽

Vol 9 ◽

Author(s):

Jinfen Wei ◽

Zixi Chen ◽

Meiling Hu ◽

Ziqing He ◽

Dawei Jiang ◽

...

Keyword(s):

Single Cell ◽

Cancer Cells ◽

Cancer Progression ◽

Epithelial Mesenchymal Transition ◽

Single Cells ◽

Matrix Remodeling ◽

Rna Seq ◽

Mesenchymal Transition ◽

Tumor Associated Macrophage ◽

Cancer Types

Hypoxia is a characteristic of the tumor microenvironment (TME) and a major contributor to tumor progression. Yet subtype identification of tumor-associated non-malignant cells at single-cell resolution, and how these cells influence cancer progression in the hypoxic TME, remain largely unexplored. Here, we used RNA-seq data of 424,194 single cells from 108 patients to identify the subtypes of cancer cells, stromal cells, and immune cells; to evaluate their hypoxia score; and to uncover potential interaction signals between these cells in vivo across six cancer types. We identified an SPP1+ tumor-associated macrophage (TAM) subpopulation that potentially enhanced epithelial–mesenchymal transition (EMT) by interacting with cancer cells in a paracrine manner. We prioritized SPP1 as a TAM-secreted factor acting on cancer cells and found that recombinant SPP1 protein induced significantly enhanced migration and invasion in A549 lung cancer cells. In addition, prognostic analysis indicated that higher expression of SPP1 was related to worse clinical outcome in six cancer types. SPP1 expression was higher in hypoxia-high macrophages based on single-cell data, which was further validated by an in vitro experiment showing that SPP1 was upregulated in macrophages cultured under hypoxic compared with normoxic conditions. Additionally, a differential analysis demonstrated that hypoxia potentially influences extracellular matrix remodeling, glycolysis, and interleukin-10 signal activation in various cancer types. Our work clarifies the mechanism underlying the intricate interaction between different cell subtypes within the hypoxic TME and proposes guidelines for the development of therapeutic targets specifically for patients with a high proportion of SPP1+ TAMs in hypoxic lesions.

DEsingle for detecting three types of differential expression in single-cell RNA-seq data

10.1101/173997 ◽

2017 ◽

Cited By ~ 1

Author(s):

Zhun Miao ◽

Ke Deng ◽

Xiaowo Wang ◽

Xuegong Zhang

Keyword(s):

Single Cell ◽

Differential Expression ◽

Negative Binomial ◽

Single Cells ◽

R Package ◽

Supplementary Information ◽

Binomial Model ◽

Supplementary Data ◽

Rna Seq ◽

Real Zeros

Summary: The excess zeros in single-cell RNA-seq data include “real” zeros due to the on-off nature of gene transcription in single cells and “dropout” zeros due to technical reasons. Existing differential expression (DE) analysis methods cannot distinguish these two types of zeros. We developed an R package, DEsingle, which employs a Zero-Inflated Negative Binomial model to estimate the proportion of real and dropout zeros and to define and detect 3 types of DE genes in single-cell RNA-seq data with higher accuracy. Availability and Implementation: The R package DEsingle is freely available at https://github.com/miaozhun/DEsingle and is under Bioconductor’s consideration. Contact: [email protected] Supplementary information: Supplementary data are available at bioRxiv online.
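
To make the Zero-Inflated Negative Binomial (ZINB) idea concrete, here is a minimal sketch of the generic ZINB probability mass function, assuming a mean/size parameterization of the negative binomial. It is written in Python for illustration and is not taken from the DEsingle R package; the function names and parameter values are hypothetical. The helper `zero_split` shows how the probability of observing a zero decomposes into a zero-inflation part and a negative binomial part; which part is read as "real" or "dropout" zeros is a modeling interpretation that this sketch does not settle.

```python
import math

def nb_pmf(k, mean, size):
    """Negative binomial pmf with mean `mean` > 0 and dispersion `size` > 0."""
    p = size / (size + mean)
    log_pmf = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
               + size * math.log(p) + k * math.log(1 - p))
    return math.exp(log_pmf)

def zinb_pmf(k, theta, mean, size):
    """ZINB pmf: a constant zero with probability theta, otherwise NB."""
    point_mass = theta if k == 0 else 0.0
    return point_mass + (1 - theta) * nb_pmf(k, mean, size)

def zero_split(theta, mean, size):
    """Split P(count = 0) into (zero-inflation part, NB part)."""
    return theta, (1 - theta) * nb_pmf(0, mean, size)

# Hypothetical gene: 30% constant zeros, NB mean 5, dispersion size 2
inflated, from_nb = zero_split(0.3, 5.0, 2.0)
print(inflated, from_nb, zinb_pmf(0, 0.3, 5.0, 2.0))
```

Under this formulation, two groups of cells can differ in the zero-inflation proportion, in the negative binomial parameters, or in both, which is one way to see why more than one type of differential expression can be defined.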

Single cell census of human kidney organoids shows reproducibility and diminished off-target cells after transplantation

Nature Communications ◽

10.1038/s41467-019-13382-0 ◽

2019 ◽

Vol 10 (1) ◽

Cited By ~ 17

Author(s):

Ayshwarya Subramanian ◽

Eriene-Heidi Sidhom ◽

Maheswarareddy Emani ◽

Katherine Vernon ◽

Nareh Sahakian ◽

...

Keyword(s):

Single Cell ◽

Single Cells ◽

Human Kidney ◽

Mouse Kidney ◽

Target Cells ◽

Rna Seq ◽

Kidney Capsule ◽

Time Points ◽

Human Ipsc ◽

Kidney Organoids

Abstract: Human iPSC-derived kidney organoids have the potential to revolutionize discovery, but assessing their consistency and reproducibility across iPSC lines, and reducing the generation of off-target cells, remain open challenges. Here, we profile four human iPSC lines for a total of 450,118 single cells to show how organoid composition and development are comparable to human fetal and adult kidneys. Although cell classes are largely reproducible across time points, protocols, and replicates, we detect variability in cell proportions between different iPSC lines, largely due to off-target cells. To address this, we analyze organoids transplanted under the mouse kidney capsule and find diminished off-target cells. Our work shows how single cell RNA-seq (scRNA-seq) can score organoids for reproducibility, faithfulness and quality, that kidney organoids derived from different iPSC lines are comparable surrogates for human kidney, and that transplantation enhances their formation by diminishing off-target cells.

Computational approaches towards reducing contamination in single-cell RNA-seq data

10.1101/2020.07.15.205062 ◽

2020 ◽

Author(s):

Siamak Yousefi ◽

Hao Chen ◽

Jesse F. Ingels ◽

Melinda S. McCarty ◽

Arthur G. Centeno ◽

...

Keyword(s):

Single Cell ◽

Single Cells ◽

Real Life ◽

Cell Types ◽

Cell Capture ◽

Rna Seq ◽

Sequence Analyses ◽

Cell Functions ◽

Biological Interpretation ◽

Different Cell Types

Summary: Single cell RNA sequencing has enabled quantification of single cells and identification of different cell types and subtypes as well as cell functions in different tissues. Single cell RNA sequence analyses assume that acquired RNAs correspond to cells; however, RNAs from contamination within the input data are also captured by these assays. The sequencing of background contamination, as well as unwanted cells making their way to the final assay, can potentially confound the correct biological interpretation of single cell transcriptomic data. Here we demonstrate two approaches to deal with background contamination as well as profiling of unwanted cells in the assays. We use three real-life datasets of whole-cell capture and nucleotide single-cell captures generated by Fluidigm and 10x technologies and show that these methods reduce the effect of contamination, strengthen clustering of cells and improve biological interpretation.

Single Cell RNA-Seq Characterises Pre-Leukemic Transformation Driven By CEBPA N321D in the Hoxb8-FL Cell Line

Blood ◽

10.1182/blood-2018-99-110626 ◽

2018 ◽

Vol 132 (Supplement 1) ◽

pp. 3887-3887

Author(s):

Moosa Qureshi ◽

Fernando Calero-Nieto ◽

Iwo Kucinski ◽

Sarah Kinston ◽

George Giotopoulos ◽

...

Keyword(s):

Dendritic Cell ◽

Single Cell ◽

Cell Line ◽

Single Cells ◽

Mutant Form ◽

Rna Seq ◽

Wild Type ◽

Leukemic Transformation

Abstract The C/EBPα transcription factor plays a pivotal role in myeloid differentiation and E2F-mediated cell cycle regulation. Although CEBPA mutations are common in acute myeloid leukaemia (AML), little is known regarding pre-leukemic alterations caused by mutated CEBPA. Here, we investigated early events involved in pre-leukemic transformation driven by CEBPA N321D in the LMPP-like cell line Hoxb8-FL (Redecke et al., Nat Methods 2013), which can be maintained in vitro as a self-renewing LMPP population using Flt3L and estradiol, as well as differentiated both in vitro and in vivo into myeloid and lymphoid cell types. Hoxb8-FL cells were retrovirally transduced with Empty Vector (EV), wild-type CEBPA (CEBPA WT) or its N321D mutant form (CEBPA N321D). CEBPA WT-transduced cells showed increased expression of cd11b and SIRPα and downregulation of c-kit, suggesting that wild-type CEBPA was sufficient to promote differentiation even under LMPP growth conditions. Interestingly, we did not observe the same phenotype in CEBPA N321D-transduced cells. Upon withdrawal of estradiol, both EV and CEBPA WT-transduced cells differentiated rapidly into a conventional dendritic cell (cDC) phenotype by day 7 and died within 12 days. By contrast, CEBPA N321D-transduced cells continued to grow for in excess of 56 days, with an initial cDC phenotype but by day 30 demonstrating a plasmacytoid dendritic cell precursor phenotype. CEBPA N321D-transduced cells were morphologically distinct from EV-transduced cells. To test leukemogenic potential in vivo, we performed transplantation experiments in lethally irradiated mice. Serial monitoring of peripheral blood demonstrated that Hoxb8-FL derived cells had disappeared by 4 weeks, and did not reappear. However, at 6 months CEBPA N321D-transduced cells could still be detected in bone marrow in contrast to EV-transduced cells but without any leukemic phenotype. 
To identify early events involved in pre-leukemic transformation, the differentiation profiles of EV, CEBPA WT and CEBPA N321D-transduced cells were examined with single cell RNA-seq (scRNA-seq). 576 single cells were taken from 3 biological replicates at days 0 and 5 post-differentiation, and analysed using the Automated Single-Cell Analysis Pipeline (Gardeux et al., Bioinformatics 2017). Visualisation by t-SNE (Fig 1) demonstrated: (i) CEBPA WT-transduced cells formed a distinct cluster at day 0 before withdrawal of estradiol; (ii) CEBPA N321D-transduced cells separated from EV and CEBPA WT-transduced cells after 5 days of differentiation; (iii) two subpopulations could be identified within the CEBPA N321D-transduced cells at day 5, with a cluster of five CEBPA N321D-transduced single cells distributed amongst or very close to the day 0 non-differentiated cells. Differential expression analysis identified 224 genes upregulated and 633 genes downregulated specifically in the CEBPA N321D-transduced cells when compared to EV cells after 5 days of differentiation. This gene expression signature revealed that CEBPA N321D-transduced cells switched on a HSC/MEP/CMP transcriptional program and switched off a myeloid dendritic cell program. Finally, in order to further dissect the effect of the N321D mutation, the binding profile of endogenous and CEBPA N321D was compared by ChIP-seq before and after 5 days of differentiation. Integration with scRNA-seq data identified 160 genes specifically downregulated in CEBPA N321D-transduced cells which were associated with the binding of the mutant protein. This list of genes included genes previously implicated in dendritic cell differentiation (such as NOTCH2, JAK2), as well as a number of genes not previously implicated in the evolution of AML, representing potentially novel therapeutic targets.
Disclosures: No relevant conflicts of interest to declare.

SHERRY2: A method for rapid and sensitive single cell RNA-seq

10.1101/2021.12.25.474161 ◽

2021 ◽

Author(s):

Lin Di ◽

Bo Liu ◽

Yuzhu Lyu ◽

Shihui Zhao ◽

Yuhong Pang ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Dynamic Range ◽

Single Cells ◽

Rna Seq ◽

Wide Dynamic Range ◽

Uniform Coverage ◽

Optimized Protocol ◽

Tn5 Transposase ◽

Higher Sensitivity

Many single cell RNA-seq applications aim to probe a wide dynamic range of gene expression, but most still have difficulty accurately quantifying low-abundance transcripts. Based on our previous finding that Tn5 transposase can directly cut-and-tag DNA/RNA hetero-duplexes, we present SHERRY2, an optimized protocol for sequencing transcriptomes of single cells or single nuclei. SHERRY2 is robust and scalable, and it has higher sensitivity and more uniform coverage in comparison with prevalent scRNA-seq methods. With throughput of a few thousand cells per batch, SHERRY2 can reveal the subtle transcriptomic differences between cells and facilitate important biological discoveries.

scBASE: A Bayesian mixture model for the analysis of allelic expression in single cells

10.1101/383224 ◽

2018 ◽

Author(s):

Kwangbom Choi ◽

Narayanan Raghupathy ◽

Gary A. Churchill

Keyword(s):

Single Cell ◽

Single Cells ◽

Allelic Expression ◽

Biological Variability ◽

Rna Seq ◽

Dynamic Features ◽

Specific Expression ◽

Bayesian Mixture Model ◽

Bayesian Mixture ◽

Allele Specific

Allele-specific expression (ASE) at single-cell resolution is a critical tool for understanding the stochastic and dynamic features of gene expression. However, low read coverage and high biological variability present challenges for analyzing ASE. We propose a new method for ASE analysis from single cell RNA-Seq data that accurately classifies allelic expression states and improves estimation of allelic proportions by pooling information across cells.
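
As a rough illustration of what "pooling information across cells" can buy when read coverage is low, the sketch below shrinks each cell's observed allelic proportion toward the proportion pooled over all cells, using a simple beta-binomial posterior mean. This is a deliberately simplified stand-in and not the scBASE mixture model; the function name, the `prior_strength` value, and the read counts are all hypothetical.

```python
def shrink_allelic_proportions(cells, prior_strength=10.0):
    """cells: list of (maternal_reads, total_reads) for one gene, one per cell.
    Returns per-cell posterior-mean maternal proportions under a Beta prior
    centred on the pooled proportion (illustrative assumption)."""
    pooled_maternal = sum(m for m, _ in cells)
    pooled_total = sum(t for _, t in cells)
    pooled_p = pooled_maternal / pooled_total
    # Beta(a, b) prior worth `prior_strength` pseudo-reads
    a = prior_strength * pooled_p
    b = prior_strength * (1.0 - pooled_p)
    return [(a + m) / (a + b + t) for m, t in cells]

# Hypothetical gene: two well-covered cells and two with very few reads
cells = [(1, 1), (30, 60), (25, 50), (0, 2)]
for (m, t), est in zip(cells, shrink_allelic_proportions(cells)):
    print(f"raw={m / t:.2f} reads={t} shrunk={est:.2f}")
```

Cells with only one or two reads have raw proportions of exactly 0 or 1, which mostly reflects sampling noise; after shrinkage they sit close to the pooled value, while well-covered cells barely move.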

Bayesian inference of the gene expression states of single cells from scRNA-seq data

10.1101/2019.12.28.889956 ◽

2019 ◽

Cited By ~ 3

Author(s):

Jérémie Breda ◽

Mihaela Zavolan ◽

Erik van Nimwegen

Keyword(s):

Gene Expression ◽

Single Cell ◽

Single Cells ◽

Downstream Processing ◽

Noise Removal ◽

Rna Seq ◽

Expression Of Genes ◽

Normalization Methods ◽

Quantify Gene Expression ◽

Selection Of

Abstract: In spite of a large investment in the development of methodologies for analysis of single-cell RNA-seq data, there is still little agreement on how to best normalize such data, i.e. how to quantify gene expression states of single cells from such data. Starting from a few basic requirements such as that inferred expression states should correct for both intrinsic biological fluctuations and measurement noise, and that changes in expression state should be measured in terms of fold-changes rather than changes in absolute levels, we here derive a unique Bayesian procedure for normalizing single-cell RNA-seq data from first principles. Our implementation of this normalization procedure, called Sanity (SAmpling Noise corrected Inference of Transcription activitY), estimates log expression values and associated error bars directly from raw UMI counts without any tunable parameters. Comparison of Sanity with other recent normalization methods on a selection of scRNA-seq datasets shows that Sanity outperforms other methods on basic downstream processing tasks such as clustering cells into subtypes and identification of differentially expressed genes. More importantly, we show that all other normalization methods present severely distorted pictures of the data. By failing to account for biological and technical Poisson noise, many methods systematically predict the lowest expressed genes to be most variable in expression, whereas in reality these genes provide least evidence of true biological variability. In addition, by confounding noise removal with lower-dimensional representation of the data, many methods introduce strong spurious correlations of expression levels with the total UMI count of each cell as well as spurious co-expression of genes.
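
The point about Poisson noise can be checked with a small simulation. In the sketch below, every gene has a fixed true expression rate in all cells, so there is no biological variability at all, yet the squared coefficient of variation of the sampled counts is largest for the lowest expressed gene (for a Poisson count with mean λ it is 1/λ). This is only a demonstration of the sampling-noise effect under assumed rates and cell numbers; it is not the Sanity procedure.

```python
import math
import random
import statistics

def poisson_sample(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def squared_cv(counts):
    """Variance divided by squared mean of a list of counts."""
    mean = statistics.fmean(counts)
    return statistics.pvariance(counts) / (mean * mean)

def simulate(true_rates, n_cells=4000, seed=0):
    """Squared CV of pure sampling noise for genes with constant true rates."""
    rng = random.Random(seed)
    return {lam: squared_cv([poisson_sample(rng, lam) for _ in range(n_cells)])
            for lam in true_rates}

# Hypothetical genes with mean UMI counts per cell of 0.1, 1, 10 and 100
for lam, cv2 in simulate([0.1, 1.0, 10.0, 100.0]).items():
    print(f"true rate {lam:>5}: squared CV {cv2:.3f} (Poisson theory {1 / lam:.3f})")
```

A method that ranks genes by apparent variability without correcting for this effect would therefore tend to favor lowly expressed genes, even though their counts carry the least evidence of real differences between cells.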
