Single-cell multi-omic profiling of chromatin conformation and DNA methylation

Abstract: The ability to profile epigenomic features in single cells is facilitating the study of variation in transcriptional regulation at the single cell level. Single cell methods have also facilitated the generation of cell-type resolved transcriptomic and epigenetic profiles of lineages derived from complex heterogeneous samples. However, integrating different epigenetic features remains challenging, as many current methods profile a single data type at a time. Furthermore, some epigenetic features, such as 3D genome organization, are intrinsically variable between single cells of the same lineage, so it remains unclear how well these methods can resolve cell types from complex mixtures. Here we describe a method for profiling 3D genome organization and DNA methylation in single cells. This protocol accompanies Lee et al. (Nature Methods 2019) after peer review to aid potential users in applying the method to their own samples.

Genomic Architecture of Cells in Tissues (GeACT): Study of Human Mid-gestation Fetus

10.1101/2020.04.12.038000 ◽

2020 ◽

Author(s):

Feng Tian ◽

Fan Zhou ◽

Xiang Li ◽

Wenping Ma ◽

Honggui Wu ◽

...

Keyword(s):

Transcription Factors ◽

Single Cell ◽

Human Cell ◽

Expression Profiles ◽

Single Cells ◽

Cell Types ◽

List Type ◽

Cell Type ◽

Genomic Architecture ◽

Gene Modules

Summary: By circumventing cellular heterogeneity, single cell omics have been widely used for cell typing in human tissues, culminating in the undertaking of the Human Cell Atlas, which aims to characterize all human cell types. However, more important is the probing of the gene regulatory networks, underlying chromatin architecture, and critical transcription factors of each cell type. Here we report the Genomic Architecture of Cells in Tissues (GeACT), a comprehensive genomic database that collectively addresses these needs, with the goal of understanding the functional genome in action. GeACT was made possible by our novel single-cell RNA-seq (MALBAC-DT) and ATAC-seq (METATAC) methods, which offer high detectability and precision. We exemplified GeACT by first studying representative organs of the human mid-gestation fetus. In particular, correlated gene modules (CGMs) are observed and found to be cell-type-dependent. We linked gene expression profiles to the underlying chromatin states and found the key transcription factors for representative CGMs.

Highlights:
Genomic Architecture of Cells in Tissues (GeACT) data for human mid-gestation fetus
Determining correlated gene modules (CGMs) in different cell types by MALBAC-DT
Measuring chromatin open regions in single cells with high detectability by METATAC
Integrating transcriptomics and chromatin accessibility to reveal key TFs for a CGM

A computational method to aid the design and analysis of single cell RNA-seq experiments for cell type identification

10.1101/247114 ◽

2018 ◽

Cited By ~ 1

Author(s):

Douglas Abrams ◽

Parveen Kumar ◽

R. Krishna Murthy Karuturi ◽

Joshy George

Keyword(s):

Experimental Design ◽

Single Cell ◽

Single Cells ◽

Cell Types ◽

Cell Number ◽

Fold Change ◽

Computational Method ◽

Marker Genes ◽

Cell Type ◽

Estimate Sample Size

Abstract

Background: The advent of single cell RNA sequencing (scRNA-seq) has enabled researchers to study transcriptomic activity within individual cells and to identify the cell types present in a sample. Although numerous computational tools have been developed to analyze single cell transcriptomes, no published studies or analytical packages are available to guide experimental design and to devise a suitable analysis procedure for cell type identification.

Results: We have developed an empirical methodology to address this gap in single cell experimental design and analysis, and packaged it as an easy-to-use tool called SCEED (Single Cell Empirical Experimental Design and analysis). With SCEED, users can combine a variety of analysis tools, evaluate the performance of analytical procedures and choose the best one, and estimate the sample size (number of cells to be profiled) required for a given analytical procedure at varying levels of cell type rarity and other experimental parameters. Using SCEED, we examined 3 single cell algorithms on 48 simulated single cell datasets that were generated for varying numbers of cell types and their proportions, numbers of genes expressed per cell, numbers of marker genes and their fold changes, and numbers of single cells successfully profiled in the experiment.

Conclusions: Based on our study, we found that when marker genes are expressed at a fold change of 4 or more relative to the rest of the genes, either the Seurat or the Simlr algorithm can be used to analyze a single cell dataset for any number of single cells isolated (a minimum of 1000 single cells was tested). However, when marker genes are expected to be upregulated by a fold change of at most 2, the choice of single cell algorithm depends on the number of single cells isolated and the proportion of the rare cell type to be identified. In conclusion, our work allows the assessment of various single cell methods and also aids in examining single cell experimental design.
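
To make the link between cell type rarity and sample size concrete, here is a minimal sketch of a back-of-the-envelope calculation. It is not SCEED's empirical procedure, which the abstract does not spell out. It only assumes that cells are sampled independently, so the number of rare cells captured follows a binomial distribution. The function names, the 95% confidence default, and the step size are illustrative assumptions.

```python
from math import comb


def prob_at_least(n_cells: int, fraction: float, min_cells: int) -> float:
    """P(X >= min_cells) for X ~ Binomial(n_cells, fraction)."""
    p_fewer = sum(
        comb(n_cells, k) * fraction**k * (1 - fraction) ** (n_cells - k)
        for k in range(min_cells)
    )
    return 1.0 - p_fewer


def cells_needed(fraction, min_cells, confidence=0.95, step=50, max_cells=20_000):
    """Smallest multiple of `step` cells that captures at least `min_cells`
    cells of a type present at `fraction`, with the requested confidence.

    All defaults are hypothetical choices for illustration only.
    """
    n = step
    while n <= max_cells:
        if prob_at_least(n, fraction, min_cells) >= confidence:
            return n
        n += step
    raise ValueError("no sample size up to max_cells reaches the confidence")


if __name__ == "__main__":
    # Cells to profile so that a cell type at 1% abundance contributes >= 10 cells.
    print(cells_needed(fraction=0.01, min_cells=10))
```

This captures only the sampling side of the question. Whether a clustering algorithm then separates those cells also depends on marker-gene fold change, which is what the simulations described above evaluate.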

HiCluster: A Robust Single-Cell Hi-C Clustering Method Based on Convolution and Random Walk

10.1101/506717 ◽

2018 ◽

Cited By ~ 2

Author(s):

Jingtian Zhou ◽

Jianzhu Ma ◽

Yusi Chen ◽

Chuankai Cheng ◽

Bokan Bao ◽

...

Keyword(s):

Random Walk ◽

Single Cell ◽

Clustering Algorithm ◽

Single Cell Analysis ◽

Single Cells ◽

Genome Structure ◽

Real Data ◽

Cell Types ◽

3D Genome ◽

Cell Clustering

3D genome structure plays a pivotal role in gene regulation and cellular function. Single-cell analysis of genome architecture has been achieved using imaging and chromatin conformation capture methods such as Hi-C. To study variation in chromosome structure between different cell types, computational approaches are needed that can utilize sparse and heterogeneous single-cell Hi-C data. However, few methods exist that are able to accurately and efficiently cluster such data into constituent cell types. Here, we describe HiCluster, a single-cell clustering algorithm for Hi-C contact matrices that is based on imputations using linear convolution and random walk. Using both simulated and real data as benchmarks, HiCluster significantly improves clustering accuracy when applied to low coverage Hi-C datasets compared to existing methods. After imputation by HiCluster, structures similar to topologically associating domains (TADs) could be identified within single cells, and their consensus boundaries among cells were enriched at the TAD boundaries observed in bulk samples. In summary, HiCluster facilitates visualization and comparison of single-cell 3D genomes.
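
The two imputation steps named in the abstract, linear convolution and random walk, can be sketched on a toy contact matrix as follows. This is a simplified illustration rather than the HiCluster implementation: the window size `pad`, the restart probability, the fixed number of walk steps, and all function names are assumptions, and the later steps of a clustering pipeline (such as dimensionality reduction) are omitted.

```python
def smooth(matrix, pad=1):
    """Linear convolution: replace each entry with the mean of its
    (2*pad+1) x (2*pad+1) neighbourhood, clipped at the matrix border."""
    n = len(matrix)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            vals = [
                matrix[a][b]
                for a in range(max(0, i - pad), min(n, i + pad + 1))
                for b in range(max(0, j - pad), min(n, j + pad + 1))
            ]
            out[i][j] = sum(vals) / len(vals)
    return out


def row_normalize(matrix):
    """Turn contact counts into transition probabilities (rows sum to 1)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(matrix, restart=0.5, n_steps=30):
    """Iterate Q <- (1 - restart) * Q @ E + restart * I, starting from Q = I."""
    n = len(matrix)
    transition = row_normalize(matrix)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    q = identity
    for _ in range(n_steps):
        walked = matmul(q, transition)
        q = [[(1 - restart) * walked[i][j] + restart * identity[i][j]
              for j in range(n)] for i in range(n)]
    return q


def impute(contacts, pad=1, restart=0.5, n_steps=30):
    """Smooth a sparse contact matrix, then propagate contacts by random walk."""
    return random_walk_with_restart(smooth(contacts, pad), restart, n_steps)


if __name__ == "__main__":
    # Toy 4-bin contact matrix with many zero entries (hypothetical data).
    contacts = [[0, 2, 0, 0], [2, 0, 1, 0], [0, 1, 0, 3], [0, 0, 3, 0]]
    for row in impute(contacts):
        print([round(v, 3) for v in row])
```

The point of both steps is to share signal between neighbouring bins and between bins connected through intermediate contacts, so that entries that were zero in the sparse single-cell matrix receive non-zero imputed values.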

RNA splicing programs define tissue compartments and cell types at single cell resolution

10.1101/2021.05.01.442281 ◽

2021 ◽

Author(s):

Julia Eve Olivieri ◽

Roozbeh Dehghannasiri ◽

Peter Wang ◽

SoRi Jang ◽

Antoine de Morree ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

High Throughput ◽

Rna Splicing ◽

Single Cells ◽

Cell Types ◽

Mouse Lemur ◽

Cell Type ◽

Multiple Organs ◽

Single Cell Pcr

More than 95% of human genes are alternatively spliced. Yet the extent to which splicing is regulated at single-cell resolution has remained controversial, owing to limitations of both the available data and the methods to interpret it. We apply the SpliZ, a new statistical approach that is agnostic to transcript annotation, to detect cell-type-specific regulated splicing in > 110K carefully annotated single cells from 12 human tissues. Using 10x data for discovery, 9.1% of genes with computable SpliZ scores are cell-type specifically spliced. These results are validated with RNA FISH, single cell PCR, and in high throughput with Smart-seq2. Regulated splicing is found in ubiquitously expressed genes such as actin light chain subunit MYL6 and ribosomal protein RPS24, which has an epithelial-specific microexon. 13% of the statistically most variable splice sites in cell-type specifically regulated genes are also most variable in mouse lemur or mouse. SpliZ analysis further reveals 170 genes with regulated splicing during sperm development, 10 of which are conserved in mouse and mouse lemur. The statistical properties of the SpliZ allow model-based identification of subpopulations within cells that are otherwise indistinguishable based on gene expression, illustrated by subpopulations of classical monocytes with stereotyped splicing, including an un-annotated exon, in SAT1, a diamine acetyltransferase. Together, this unsupervised and annotation-free analysis of differential splicing in ultra high throughput droplet-based sequencing of human cells across multiple organs establishes that splicing is regulated cell-type-specifically, independent of gene expression.

Phenotypic convergence in the brain: distinct transcription factors regulate common terminal neuronal characters

10.1101/243113 ◽

2018 ◽

Cited By ~ 2

Author(s):

Nikos Konstantinides ◽

Katarina Kapuralin ◽

Chaimaa Fadil ◽

Luendreo Barboza ◽

Rahul Satija ◽

...

Keyword(s):

Transcription Factors ◽

Single Cell ◽

Large Scale ◽

Single Cells ◽

Deep Understanding ◽

Cell Types ◽

Marker Genes ◽

Cell Type ◽

Functional Specification ◽

Phenotypic Convergence

Summary: Transcription factors regulate the molecular, morphological, and physiological characters of neurons and generate their impressive cell type diversity. To gain insight into general principles that govern how transcription factors regulate cell type diversity, we used large-scale single-cell mRNA sequencing to characterize the extensive cellular diversity in the Drosophila optic lobes. We sequenced 55,000 single optic lobe neurons and glia and assigned them to 52 clusters of transcriptionally distinct single cells. We validated the clustering and annotated many of the clusters using RNA sequencing of characterized FACS-sorted single cell types, as well as marker genes specific to given clusters. To identify transcription factors responsible for inducing specific terminal differentiation features, we used machine learning to generate a ‘random forest’ model. The predictive power of the model was confirmed by showing that two transcription factors expressed specifically in cholinergic (apterous) and glutamatergic (traffic-jam) neurons are necessary for the expression of ChAT and VGlut in many, but not all, cholinergic or glutamatergic neurons, respectively. We used a transcriptome-wide approach to show that the same terminal characters, including but not restricted to neurotransmitter identity, can be regulated by different transcription factors in different cell types, arguing for extensive phenotypic convergence. Our data provide a deep understanding of the developmental and functional specification of a complex brain structure.

Heterogeneity and Intrinsic Variation in Spatial Genome Organization

10.1101/171801 ◽

2017 ◽

Cited By ~ 7

Author(s):

Elizabeth H. Finn ◽

Gianluca Pegoraro ◽

Hugo B. Brandão ◽

Anne-Laure Valton ◽

Marlies E. Oomen ◽

...

Keyword(s):

Genome Organization ◽

Single Cells ◽

Human Fibroblasts ◽

Intrinsic Variability ◽

Cell Level ◽

3D Genome ◽

3D Space ◽

A Cell ◽

Spatial Genome Organization ◽

Independent Behavior

Abstract: The genome is hierarchically organized in 3D space and its architecture is altered in differentiation, development and disease. Some of the general principles that determine global 3D genome organization have been established. However, the extent and nature of cell-to-cell and cell-intrinsic variability in genome architecture are poorly characterized. Here, we systematically probe the heterogeneity in genome organization in human fibroblasts by combining high-resolution Hi-C datasets and high-throughput genome imaging. Optical mapping of several hundred genome interaction pairs at the single cell level demonstrates low steady-state frequencies of colocalization in the population and independent behavior of individual alleles in single nuclei. Association frequencies are determined by genomic distance, higher-order chromatin architecture and chromatin environment. These observations reveal extensive variability and heterogeneity in genome organization at the level of single cells and alleles, and they demonstrate the coexistence of a broad spectrum of chromatin and genome conformations in a cell population.

Saturating Single-Cell atlas Datasets

10.1101/218370 ◽

2017 ◽

Cited By ~ 2

Author(s):

Aparna Bhaduri ◽

Tomasz J. Nowakowski ◽

Alex A. Pollen ◽

Arnold R. Kriegstein

Keyword(s):

Population Structure ◽

Single Cell ◽

Mouse Brain ◽

Large Scale ◽

Single Cells ◽

Cost Effective ◽

Cell Types ◽

Cell Number ◽

Cell Type ◽

The Relationship

Abstract: High throughput methods for profiling the transcriptomes of single cells have recently emerged as transformative approaches for large-scale population surveys of cellular diversity in heterogeneous primary tissues. Efficient generation of such an atlas will depend on sufficient sampling of the diverse cell types while remaining cost-effective to enable a comprehensive examination of organs, developmental stages, and individuals. To examine the relationship between cell number and transcriptional heterogeneity in the context of unbiased cell type classification, we explicitly explored the population structure of a publicly available 1.3 million cell dataset from the E18.5 mouse brain. We propose a computational framework for inferring the saturation point of cluster discovery in a single cell mRNA-seq experiment, centered around cluster preservation in downsampled datasets. In addition, we introduce a “complexity index”, which characterizes the heterogeneity of cells in a given dataset. Using Cajal-Retzius cells as an example of a limited complexity dataset, we explored whether biological distinctions relate to technical clustering. Surprisingly, we found that clustering distinctions carrying biologically interpretable meaning are achieved with far fewer cells (20,000). Together, these findings suggest that most of the biologically interpretable insights from the 1.3 million cells can be recapitulated by analyzing 50,000 randomly selected cells, indicating that instead of profiling few individuals at high “cellular coverage”, the much anticipated cell atlasing studies may instead benefit from profiling more individuals, or many time points, at lower cellular coverage.

Recent efforts seek to create a comprehensive cell atlas of the human body [1,2]. Current technology, however, makes it prohibitively expensive to perform analysis of every cell. Therefore, designing effective sampling strategies will be critical to generate a working atlas in an efficient, cost-effective, and streamlined manner. The advent of single cell and single nucleus mRNA sequencing (RNAseq) in droplet format [3,4] now enables large scale sampling of cells from any tissue, and a recently released publicly available dataset of 1.3 million single cells from the E18.5 mouse brain generated with the 10X Chromium [5] provides an opportunity to explore the relationship between population structure and the number of sampled cells necessary to reveal the underlying diversity of cell types. Here, we present a framework for how researchers can evaluate whether a dataset has reached saturation, and we estimate how many cells would be required to generate an atlas of the sample analyzed here. This framework can be applied to any organ or cell type specific atlas for any organism.
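
The idea of judging saturation by cluster preservation in downsampled data can be illustrated with a deliberately simplified proxy. The sketch below does not re-cluster the downsampled cells, as a full framework would. It only counts how many reference clusters still contribute at least `min_cells` cells after random downsampling. The function names, `min_cells`, `n_repeats`, and `tolerance` are hypothetical choices, and the labels in the example are made up.

```python
import random
from collections import Counter


def clusters_retained(labels, n_sample, min_cells=10, seed=0):
    """Number of reference clusters with >= min_cells cells in a random subsample."""
    rng = random.Random(seed)
    counts = Counter(rng.sample(labels, n_sample))
    return sum(1 for c in counts.values() if c >= min_cells)


def saturation_curve(labels, sample_sizes, min_cells=10, n_repeats=20):
    """Mean number of retained clusters at each downsampling depth."""
    curve = {}
    for n in sample_sizes:
        runs = [clusters_retained(labels, n, min_cells, seed=r)
                for r in range(n_repeats)]
        curve[n] = sum(runs) / len(runs)
    return curve


def saturation_point(curve, total_clusters, tolerance=0.95):
    """Smallest sampled depth that retains at least `tolerance` of all clusters."""
    for n in sorted(curve):
        if curve[n] >= tolerance * total_clusters:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical labels: one abundant, one intermediate, one rare cluster.
    labels = ["A"] * 900 + ["B"] * 90 + ["C"] * 10
    curve = saturation_curve(labels, [100, 500, 1000])
    print(curve, saturation_point(curve, total_clusters=3, tolerance=1.0))
```

In this proxy the rarest cluster sets the saturation depth, which is why a dataset dominated by a few abundant types can appear saturated at a small fraction of its full size.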

Cell type-specific aging clocks to quantify aging and rejuvenation in regenerative regions of the brain

10.1101/2022.01.10.475747 ◽

2022 ◽

Author(s):

Matthew T Buckley ◽

Eric Sun ◽

Benson M. George ◽

Ling Liu ◽

Nicholas Schaum ◽

...

Keyword(s):

Single Cell ◽

Cell Types ◽

Rna Seq ◽

Cell Type ◽

Cell Level ◽

Transcriptomic Data ◽

Precise Quantification ◽

Cell Type Specific ◽

Tissue Aging ◽

The Brain

Aging manifests as progressive dysfunction culminating in death. The diversity of cell types is a challenge to the precise quantification of aging and its reversal. Here we develop a suite of 'aging clocks' based on single cell transcriptomic data to characterize cell type-specific aging and rejuvenation strategies. The subventricular zone (SVZ) neurogenic region contains many cell types and provides an excellent system to study cell-level tissue aging and regeneration. We generated 21,458 single-cell transcriptomes from the neurogenic regions of 28 mice, tiling ages from young to old. With these data, we trained a suite of single cell-based regression models (aging clocks) to predict both chronological age (passage of time) and biological age (fitness, in this case the proliferative capacity of the neurogenic region). Both types of clocks perform well on independent cohorts of mice. Genes underlying the single cell-based aging clocks are mostly cell-type specific, but also include a few shared genes in the interferon and lipid metabolism pathways. We used these single cell-based aging clocks to measure transcriptomic rejuvenation, by generating single cell RNA-seq datasets of SVZ neurogenic regions for two interventions - heterochronic parabiosis (young blood) and exercise. Interestingly, the use of aging clocks reveals that both heterochronic parabiosis and exercise reverse transcriptomic aging in the niche, but in different ways across cell types and genes. This study represents the first development of high-resolution aging clocks from single cell transcriptomic data and demonstrates their application to quantify transcriptomic rejuvenation.

ProtAnno, an Automated Cell Type Annotation Tool for Single Cell Proteomics Data that Integrates Information from Multiple Reference Sources

10.1101/2021.09.13.460162 ◽

2021 ◽

Author(s):

Wenxuan Deng ◽

Biqing Zhu ◽

Seyoung Park ◽

Tomokazu S. Sumida ◽

Avraham Unterman ◽

...

Keyword(s):

Single Cell ◽

Single Cells ◽

Earth Metal ◽

Cell Types ◽

Specific Cell ◽

Cell Type ◽

Proteomics Data ◽

Data Annotation ◽

Different Cell Types ◽

Unique Molecular Identifier

Compared with sequencing-based global genomic profiling, cytometry labels targeted surface markers on millions of cells in parallel either by conjugated rare earth metal particles or Unique Molecular Identifier (UMI) barcodes. Correct annotation of these cells to specific cell types is a key step in the analysis of these data. However, there is no computational tool that automatically annotates single cell proteomics data for cell type inference. In this manuscript, we propose an automated single cell proteomics data annotation approach called ProtAnno to facilitate cell type assignments without laborious manual gating. ProtAnno is designed to incorporate information from annotated single cell RNA-seq (scRNA-seq), CITE-seq, and prior data knowledge (which can be imprecise) on biomarkers for different cell types. We have performed extensive simulations to demonstrate the accuracy and robustness of ProtAnno. For several single cell proteomics datasets that have been manually labeled, ProtAnno was able to correctly label most single cells. In summary, ProtAnno offers an accurate and robust tool to automate cell type annotations for large single cell proteomics datasets, and the analysis of such annotated cell types can offer valuable biological insights.

Whole-organism eQTL mapping at cellular resolution with single-cell sequencing

10.1101/2020.08.23.263798 ◽

2020 ◽

Author(s):

Eyal Ben-David ◽

James Boocock ◽

Longhua Guo ◽

Stefan Zdraljevic ◽

Joshua S Bloom ◽

...

Keyword(s):

Single Cell ◽

Complex Traits ◽

Disease Risk ◽

Single Cells ◽

Large Population ◽

Regulation Of Gene Expression ◽

Cell Types ◽

Cell Type ◽

One Pot ◽

Cellular Resolution

Genetic regulation of gene expression underlies variation in disease risk and other complex traits. The effect of expression quantitative trait loci (eQTLs) varies across cell types; however, the complexity of mammalian tissues makes studying cell-type eQTLs highly challenging. We developed a novel approach in the model nematode Caenorhabditis elegans that uses single cell RNA sequencing to map eQTLs at cellular resolution in a single one-pot experiment. We mapped eQTLs across cell types in an extremely large population of genetically distinct C. elegans individuals. We found cell-type-specific trans-eQTL hotspots that affect the expression of core pathways in the relevant cell types. Finally, we found single-cell-specific eQTL effects in the nervous system, including an eQTL with opposite effects in two individual neurons. Our results show that eQTL effects can be specific down to the level of single cells.
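
As a toy illustration of what a cell-type-specific eQTL effect looks like numerically, the sketch below fits a separate least-squares slope of expression on genotype within each cell type. It is not the mapping method used in the study. The 0/1 genotype coding, the function names, and the example values are all assumptions made for illustration.

```python
from collections import defaultdict


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("genotype does not vary within this cell type")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def per_cell_type_effects(cells):
    """cells: iterable of (cell_type, genotype, expression) tuples.

    Returns {cell_type: slope}, i.e. the estimated effect of carrying the
    alternative allele (genotype 1 vs 0) on expression in that cell type.
    """
    genotypes = defaultdict(list)
    expression = defaultdict(list)
    for cell_type, genotype, value in cells:
        genotypes[cell_type].append(genotype)
        expression[cell_type].append(value)
    return {ct: slope(genotypes[ct], expression[ct]) for ct in genotypes}


if __name__ == "__main__":
    # Made-up values: the same variant raises expression in one neuron type
    # and lowers it in another.
    cells = [
        ("neuron_A", 0, 1.0), ("neuron_A", 1, 3.0),
        ("neuron_B", 0, 4.0), ("neuron_B", 1, 2.0),
    ]
    print(per_cell_type_effects(cells))
```

With these made-up values the two slopes are 2.0 and -2.0, the kind of sign flip between cell types that pooling all cells into a single regression would average away.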
