Rapid Single Cell Evaluation of Human Disease and Disorder Targets Using REVEAL: SingleCell™

Abstract: Single-cell (sc) sequencing performs unbiased profiling of individual cells and enables evaluation of less prevalent cellular populations that are often missed by bulk sequencing. However, the scale and complexity of sc datasets pose a great challenge to their utility, and this problem is further exacerbated when working with the larger datasets typically generated by consortium efforts. As the scale of single-cell datasets continues to increase exponentially, there is an unmet technological need for database platforms that can evaluate key biological hypotheses by querying extensive single-cell datasets. Large single-cell datasets such as the Human Cell Atlas and the COVID-19 Cell Atlas (collections of annotated sc datasets from various human organs) are excellent resources for profiling target genes involved in human diseases and disorders ranging from oncology and autoimmunity to infectious diseases such as COVID-19, caused by the SARS-CoV-2 virus. SARS-CoV-2 infections have led to a worldwide pandemic with massive loss of life, with infections exceeding 7 million cases. The virus relies on ACE2 and TMPRSS2, key viral entry-associated proteins expressed in human cells, for infection. Evaluating the expression profile of key genes in large single-cell datasets can facilitate testing for diagnostic, therapeutic, and vaccine targets as the world struggles to cope with the ongoing spread of COVID-19 infections. In this manuscript we describe REVEAL: SingleCell, which enables storage, retrieval, and rapid querying of single-cell datasets comprising millions of cells. The analytical database described here enables selecting and analyzing cells across multiple studies. Cells can be selected using individual metadata tags, more complex hierarchical ontology filtering, and gene expression threshold ranges, including co-expression of multiple genes. The tags on selected cells can be further evaluated to test biological hypotheses; one example is identifying the most prevalent cell type annotation tag on the returned cells. We used REVEAL: SingleCell to evaluate the expression of key SARS-CoV-2 entry-associated genes, querying the current database (2.2 million cells, 32 projects) and obtaining results in <60 seconds. We found that COVID-19 associated genes are expressed in multiple tissue types, which in part explains the multi-organ involvement observed in infected patients worldwide during the ongoing COVID-19 pandemic.

Rapid single cell evaluation of human disease and disorder targets using REVEAL: SingleCell™

BMC Genomics ◽

10.1186/s12864-020-07300-8 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Namit Kumar ◽

Ryan Golhar ◽

Kriti Sen Sharma ◽

James L. Holloway ◽

Srikant Sarangi ◽

...

Keyword(s):

Precision Medicine ◽

Single Cell ◽

Viral Entry ◽

Target Genes ◽

Main Body ◽

Cell Type ◽

Human Organs ◽

The World ◽

Associated Proteins ◽

Current Database

Abstract. Background: Single-cell (sc) sequencing performs unbiased profiling of individual cells and enables evaluation of less prevalent cellular populations that are often missed by bulk sequencing. However, the scale and complexity of sc datasets pose a great challenge to their utility, and this problem is further exacerbated when working with the larger datasets typically generated by consortium efforts. As the scale of single-cell datasets continues to increase exponentially, there is an unmet technological need for database platforms that can evaluate key biological hypotheses by querying extensive single-cell datasets. Large single-cell datasets such as the Human Cell Atlas and the COVID-19 Cell Atlas (collections of annotated sc datasets from various human organs) are excellent resources for profiling target genes involved in human diseases and disorders ranging from oncology and autoimmunity to infectious diseases such as COVID-19, caused by the SARS-CoV-2 virus. SARS-CoV-2 infections have led to a worldwide pandemic with massive loss of life, with infections exceeding 7 million cases. The virus relies on ACE2 and TMPRSS2, key viral entry-associated proteins expressed in human cells, for infection. Evaluating the expression profile of key genes in large single-cell datasets can facilitate testing for diagnostic, therapeutic, and vaccine targets as the world struggles to cope with the ongoing spread of COVID-19 infections. Main body: In this manuscript we describe REVEAL: SingleCell, which enables storage, retrieval, and rapid querying of single-cell datasets comprising millions of cells. The array-native database described here enables selecting and analyzing cells across multiple studies. Cells can be selected using individual metadata tags, more complex hierarchical ontology filtering, and gene expression threshold ranges, including co-expression of multiple genes. The tags on selected cells can be further evaluated to test biological hypotheses; one example is identifying the most prevalent cell type annotation tag on the returned cells. We used REVEAL: SingleCell to evaluate the expression of key SARS-CoV-2 entry-associated genes, querying the current database (2.2 million cells, 32 projects) and obtaining results in < 60 s. We found that COVID-19 associated genes are expressed in multiple tissue types, which in part explains the multi-organ involvement observed in infected patients worldwide during the ongoing COVID-19 pandemic. Conclusion: In this paper, we introduce the REVEAL: SingleCell database, which addresses immediate needs for SARS-CoV-2 research and has the potential to be used more broadly for many precision medicine applications. We used the REVEAL: SingleCell database as a reference to ask questions relevant to drug development and precision medicine regarding cell type and co-expression for genes that encode proteins necessary for SARS-CoV-2 to enter and reproduce in cells.
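
The selection workflow described in this abstract (filter cells by metadata tags and gene expression ranges, require co-expression of several genes, then report the most prevalent cell type tag) can be illustrated with a small sketch. This is not REVEAL: SingleCell's API or schema: the record layout, the toy expression values, and the function names `select_cells` and `most_prevalent_tag` are all hypothetical, and hierarchical ontology filtering is not covered.

```python
from collections import Counter

# Hypothetical toy records; field names and values are illustrative only,
# not REVEAL: SingleCell's schema or real measurements.
cells = [
    {"id": "c1", "tissue": "lung", "cell_type": "AT2",
     "expr": {"ACE2": 2.0, "TMPRSS2": 3.1}},
    {"id": "c2", "tissue": "lung", "cell_type": "AT2",
     "expr": {"ACE2": 1.2, "TMPRSS2": 0.8}},
    {"id": "c3", "tissue": "ileum", "cell_type": "enterocyte",
     "expr": {"ACE2": 4.0, "TMPRSS2": 2.2}},
    {"id": "c4", "tissue": "lung", "cell_type": "macrophage",
     "expr": {"ACE2": 0.0, "TMPRSS2": 0.1}},
    {"id": "c5", "tissue": "heart", "cell_type": "cardiomyocyte",
     "expr": {"ACE2": 1.5}},
]


def select_cells(cells, metadata=None, thresholds=None):
    """Return cells matching every metadata tag and every (low, high) expression range."""
    metadata = metadata or {}
    thresholds = thresholds or {}
    selected = []
    for cell in cells:
        # Every requested metadata tag must match exactly.
        if any(cell.get(key) != value for key, value in metadata.items()):
            continue
        # Co-expression: every listed gene must fall inside its range.
        # A gene absent from a cell's record is treated as zero expression.
        if all(low <= cell["expr"].get(gene, 0.0) <= high
               for gene, (low, high) in thresholds.items()):
            selected.append(cell)
    return selected


def most_prevalent_tag(cells, tag):
    """Return (value, count) for the most common value of `tag`, or None if empty."""
    counts = Counter(cell[tag] for cell in cells)
    return counts.most_common(1)[0] if counts else None


# Cells co-expressing both entry-associated genes above assumed thresholds.
co_expressing = select_cells(
    cells,
    thresholds={"ACE2": (1.0, float("inf")), "TMPRSS2": (0.5, float("inf"))},
)
top_cell_type = most_prevalent_tag(co_expressing, "cell_type")
tissues = sorted({cell["tissue"] for cell in co_expressing})
```

Passing `metadata={"tissue": "lung"}` restricts the same query to a single tissue; a production system would push these filters into the database rather than scanning records in memory.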

Single-cell RNA sequencing of the mammalian pineal gland identifies two pinealocyte subtypes and cell type-specific daily patterns of gene expression

PLoS ONE ◽

10.1371/journal.pone.0205883 ◽

2018 ◽

Vol 13 (10) ◽

pp. e0205883 ◽

Cited By ~ 9

Author(s):

Joseph C. Mays ◽

Michael C. Kelly ◽

Steven L. Coon ◽

Lynne Holtzclaw ◽

Martin F. Rath ◽

...

Keyword(s):

Gene Expression ◽

Pineal Gland ◽

Single Cell ◽

RNA Sequencing ◽

Cell Type ◽

Single Cell RNA Sequencing ◽

Cell Type Specific ◽

Mammalian Pineal Gland ◽

Daily Patterns

New insights into promoter–enhancer communication mechanisms revealed by dynamic single-molecule imaging

Biochemical Society Transactions ◽

10.1042/bst20200963 ◽

2021 ◽

Author(s):

Jieru Li ◽

Alexandros Pertsinidis

Keyword(s):

Gene Expression ◽

Single Molecule ◽

Target Genes ◽

Regulatory Elements ◽

Specific Gene ◽

Single Molecule Imaging ◽

Cell Type ◽

Regulatory Information ◽

Cell Type Specific ◽

Target Promoters

Establishing cell-type-specific gene expression programs relies on the action of distal enhancers, cis-regulatory elements that can activate target genes over large genomic distances, up to megabases away. How distal enhancers physically relay regulatory information to target promoters has remained a mystery. Here, we review the latest developments and insights into promoter–enhancer communication mechanisms revealed by live-cell, real-time single-molecule imaging approaches.

Putative cell type discovery from single-cell gene expression data

Nature Methods ◽

10.1038/s41592-020-0825-9 ◽

2020 ◽

Vol 17 (6) ◽

pp. 621-628 ◽

Cited By ~ 4

Author(s):

Zhichao Miao ◽

Pablo Moreno ◽

Ni Huang ◽

Irene Papatheodorou ◽

Alvis Brazma ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Gene Expression Data ◽

Expression Data ◽

Cell Type ◽

Cell Gene Expression ◽

Cell Gene

Identifying gene expression programs of cell-type identity and cellular activity with single-cell RNA-Seq

eLife ◽

10.7554/elife.43803 ◽

2019 ◽

Vol 8 ◽

Cited By ~ 37

Author(s):

Dylan Kotliar ◽

Adrian Veres ◽

M Aurel Nagy ◽

Shervin Tabrizi ◽

Eran Hodis ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Matrix Factorization ◽

Cell Types ◽

Environmental Cues ◽

RNA Seq ◽

Cell Type ◽

Type Identity ◽

Brain Organoid ◽

Non Negative Matrix Factorization

Identifying gene expression programs underlying both cell-type identity and cellular activities (e.g. life-cycle processes, responses to environmental cues) is crucial for understanding the organization of cells and tissues. Although single-cell RNA-Seq (scRNA-Seq) can quantify transcripts in individual cells, each cell’s expression profile may be a mixture of both types of programs, making them difficult to disentangle. Here, we benchmark and enhance the use of matrix factorization to solve this problem. We show with simulations that a method we call consensus non-negative matrix factorization (cNMF) accurately infers identity and activity programs, including their relative contributions in each cell. To illustrate the insights this approach enables, we apply it to published brain organoid and visual cortex scRNA-Seq datasets; cNMF refines cell types and identifies both expected (e.g. cell cycle and hypoxia) and novel activity programs, including programs that may underlie a neurosecretory phenotype and synaptogenesis.
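
As a rough illustration of the matrix factorization underlying this approach, the sketch below factors a small cells-by-genes matrix V into non-negative usages W and programs H using the standard Lee-Seung multiplicative updates. It is plain NMF only, not the consensus procedure (cNMF) the abstract describes, which additionally combines many factorization runs; the toy matrix, the function names, and the iteration count are assumptions made for the example.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


def frobenius_error(V, W, H):
    """Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2
               for row_v, row_x in zip(V, WH)
               for v, x in zip(row_v, row_x)) ** 0.5


def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates: V (n x m) ~ W (n x k) @ H (k x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random initialization.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H


# Toy expression matrix (4 cells x 4 genes) built from two made-up programs.
V = [
    [2.0, 2.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 3.0, 3.0],
    [2.0, 2.0, 1.0, 1.0],
]
W, H = nmf(V, k=2)
error = frobenius_error(V, W, H)
```

Each row of H plays the role of a gene expression program and each row of W gives one cell's usage of those programs, which is how a single cell's profile can be read as a mixture of identity and activity programs.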

A single cell brain atlas in human Alzheimer’s disease

10.1101/628347 ◽

2019 ◽

Cited By ~ 4

Author(s):

Alexandra Grubman ◽

Gabriel Chew ◽

John F. Ouyang ◽

Guizhi Sun ◽

Xin Yi Choo ◽

...

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Single Cell ◽

Cell Fate ◽

Expression Patterns ◽

Cell Types ◽

Gene Expression Patterns ◽

Cell Type ◽

Web Resource ◽

Cell Type Specific

Abstract: Alzheimer’s disease (AD) is a heterogeneous disease that is largely dependent on the complex cellular microenvironment in the brain. This complexity impedes our understanding of how individual cell types contribute to disease progression and outcome. To characterize the molecular and functional cell diversity in the human AD brain, we applied single-nucleus RNA-seq to AD and control patient brains to map the landscape of cellular heterogeneity in AD. We detail gene expression changes at the level of cells and cell subclusters, highlighting specific cellular contributions to global gene expression patterns between control and Alzheimer’s patient brains. We observed distinct cellular regulation of APOE, which was repressed in oligodendrocyte progenitor cell (OPC) and astrocyte AD subclusters and highly enriched in a microglial AD subcluster. In addition, oligodendrocyte and microglia AD subclusters show discordant expression of APOE. Integration of transcription factor regulatory modules with downstream GWAS gene targets revealed subcluster-specific control of AD cell fate transitions. For example, this analysis uncovered that astrocyte diversity in AD was under the control of transcription factor EB (TFEB), a master regulator of lysosomal function, which initiated a regulatory cascade containing multiple AD GWAS genes. These results establish functional links between specific cellular sub-populations in AD and provide new insights into the coordinated control of AD GWAS genes and their cell-type-specific contribution to disease susceptibility. Finally, we created an interactive reference web resource that will help brain and AD researchers explore the molecular architecture of subtype and AD-specific cell identity, and molecular and functional diversity, at the single-cell level.

Highlights:
We generated the first human single cell transcriptome in AD patient brains.
Our study unveiled 9 clusters of cell-type specific and common gene expression patterns between control and AD brains, including clusters of genes that present properties of different cell types (i.e. astrocytes and oligodendrocytes).
Our analyses also uncovered functionally specialized sub-cellular clusters: 5 microglial clusters, 8 astrocyte clusters, 6 neuronal clusters, 6 oligodendrocyte clusters, 4 OPC and 2 endothelial clusters, each enriched for specific ontological gene categories.
Our analyses found manifold AD GWAS genes specifically associated with one cell-type, and sets of AD GWAS genes co-ordinately and differentially regulated between different brain cell-types in AD sub-cellular clusters.
We mapped the regulatory landscape driving transcriptional changes in AD brain, and identified transcription factor networks which we predict to control cell fate transitions between control and AD sub-cellular clusters.
Finally, we provide an interactive web-resource that allows the user to further visualise and interrogate our dataset.

Data resource web interface: http://adsn.ddnetbio.com

Interpretable factor models of single-cell RNA-seq via variational autoencoders

10.1101/737601 ◽

2019 ◽

Cited By ~ 2

Author(s):

Valentine Svensson ◽

Lior Pachter

Keyword(s):

Gene Expression ◽

Single Cell ◽

Statistical Inference ◽

Factor Models ◽

RNA Seq ◽

Cell Type ◽

Massive Datasets ◽

Domain Specific ◽

Variational Autoencoder ◽

Inference Methods

Single cell RNA-seq makes possible the investigation of variability in gene expression among cells, and dependence of variation on cell type. Statistical inference methods for such analyses must be scalable, and ideally interpretable. We present an approach based on a modification of a recently published highly scalable variational autoencoder framework that provides interpretability without sacrificing much accuracy. We demonstrate that our approach enables identification of gene programs in massive datasets. Our strategy, namely the learning of factor models with the auto-encoding variational Bayes framework, is not domain specific and may be of interest for other applications.

SnapHiC: a computational pipeline to map chromatin contacts from single cell Hi-C data

10.1101/2020.12.13.422543 ◽

2020 ◽

Author(s):

Miao Yu ◽

Armen Abnousi ◽

Yanxiao Zhang ◽

Guoqiang Li ◽

Lindsay Lee ◽

...

Keyword(s):

High Resolution ◽

Single Cell ◽

Target Genes ◽

Embryonic Stem ◽

Cell Type ◽

Cortical Cells ◽

Chromatin Loops ◽

Chromatin Architecture ◽

Prefrontal Cortical ◽

Cell Type Specific

Single cell Hi-C (scHi-C) analysis has been increasingly used to map the chromatin architecture in diverse tissue contexts, but computational tools to define chromatin contacts at high resolution from scHi-C data are still lacking. Here, we describe SnapHiC, a method that can identify chromatin loops at high resolution and accuracy from scHi-C data. We benchmark SnapHiC against HiCCUPS, a common tool for mapping chromatin contacts in bulk Hi-C data, using scHi-C data from 742 mouse embryonic stem cells. We further demonstrate its utility by analyzing single-nucleus methyl-3C-seq data from 2,869 human prefrontal cortical cells. We uncover cell-type-specific chromatin loops and predict putative target genes for non-coding sequence variants associated with neuropsychiatric disorders. Our results suggest that SnapHiC could facilitate the analysis of cell-type-specific chromatin architecture and gene regulatory programs in complex tissues.

The single-cell epigenetic regulatory landscape in mammalian perinatal testis development

10.1101/2021.03.17.435776 ◽

2021 ◽

Author(s):

Jinyue Liao ◽

Hoi Ching Suen ◽

Shitao Rao ◽

Alfred Chun Shui Luk ◽

Ruoyu Zhang ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Cell Fate ◽

Germ Cells ◽

Somatic Cells ◽

Cell Types ◽

Regulatory Elements ◽

Cellular Heterogeneity ◽

Cell Populations ◽

Cell Type

Abstract: Spermatogenesis depends on an orchestrated series of developmental events in germ cells and full maturation of the somatic microenvironment. To date, the majority of efforts to study cellular heterogeneity in the testis have focused on single-cell gene expression rather than the chromatin landscape shaping gene expression. To advance our understanding of the regulatory programs underlying testicular cell types, we analyzed single-cell chromatin accessibility profiles in more than 25,000 cells from the developing mouse testis. We showed that scATAC-Seq allowed us to deconvolve distinct cell populations and identify cis-regulatory elements (CREs) underlying cell type specification. We identified sets of transcription factors associated with cell type-specific accessibility, revealing novel regulators of cell fate specification and maintenance. Pseudotime reconstruction revealed detailed regulatory dynamics coordinating the sequential developmental progressions of germ cells and somatic cells. These high-resolution data also revealed putative stem cells within the Sertoli and Leydig cell populations. Further, we defined candidate target cell types and genes of several GWAS signals, including those associated with testosterone levels and coronary artery disease. Collectively, our data provide a blueprint of the ‘regulon’ of the mouse male germline and supporting somatic cells.

RNA splicing programs define tissue compartments and cell types at single cell resolution

10.1101/2021.05.01.442281 ◽

2021 ◽

Author(s):

Julia Eve Olivieri ◽

Roozbeh Dehghannasiri ◽

Peter Wang ◽

SoRi Jang ◽

Antoine de Morree ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

High Throughput ◽

RNA Splicing ◽

Single Cells ◽

Cell Types ◽

Mouse Lemur ◽

Cell Type ◽

Multiple Organs ◽

Single Cell PCR

More than 95% of human genes are alternatively spliced. Yet the extent to which splicing is regulated at single-cell resolution has remained controversial, owing to both the available data and the methods used to interpret them. We apply the SpliZ, a new statistical approach that is agnostic to transcript annotation, to detect cell-type-specific regulated splicing in > 110K carefully annotated single cells from 12 human tissues. Using 10x data for discovery, 9.1% of genes with computable SpliZ scores are cell-type specifically spliced. These results are validated with RNA FISH, single cell PCR, and in high throughput with Smart-seq2. Regulated splicing is found in ubiquitously expressed genes such as actin light chain subunit MYL6 and ribosomal protein RPS24, which has an epithelial-specific microexon. 13% of the statistically most variable splice sites in cell-type specifically regulated genes are also most variable in mouse lemur or mouse. SpliZ analysis further reveals 170 genes with regulated splicing during sperm development, 10 of which are conserved in mouse and mouse lemur. The statistical properties of the SpliZ allow model-based identification of subpopulations within cells that are otherwise indistinguishable based on gene expression, illustrated by subpopulations of classical monocytes with stereotyped splicing, including an un-annotated exon, in SAT1, a diamine acetyltransferase. Together, this unsupervised and annotation-free analysis of differential splicing in ultra-high-throughput droplet-based sequencing of human cells across multiple organs establishes that splicing is regulated cell-type-specifically, independent of gene expression.
