Manifold learning analysis reveals the functional genomics at the cell-type level for neuronal electrophysiology in the mouse brain

10.1101/2020.12.03.410555 ◽

2020 ◽

Author(s):

Jiawei Huang ◽

Daifeng Wang

Keyword(s):

Gene Expression ◽

Functional Genomics ◽

Mouse Brain ◽

Manifold Learning ◽

Single Cells ◽

Cell Types ◽

Functional Enrichment ◽

Cell Type ◽

Network Analyses ◽

Different Characteristics

Abstract: Recent single-cell multimodal data reveal different characteristics of single cells, such as transcriptomics, morphology, and electrophysiology. However, our understanding of the functional genomics and gene regulation that give rise to these cellular characteristics remains limited. To address this, we used manifold learning to align gene expression and electrophysiological data of single neuronal cells in the mouse brain. After manifold alignment, the cell clusters correspond closely to transcriptomic and morphological cell types, suggesting a strong nonlinear link between gene expression and electrophysiology at the cell-type level. Additional functional enrichment and gene regulatory network analyses revealed potential novel mechanistic insights linking genes to electrophysiology at cellular resolution.

Manifold learning analysis suggests strategies to align single-cell multimodal data of neuronal electrophysiology and transcriptomics

Communications Biology ◽

10.1038/s42003-021-02807-6 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Jiawei Huang ◽

Jie Sheng ◽

Daifeng Wang

Keyword(s):

Gene Expression ◽

Single Cell ◽

Manifold Learning ◽

Molecular Mechanisms ◽

Single Cells ◽

Cell Types ◽

Functional Enrichment ◽

Multimodal Data ◽

Network Analyses ◽

Manifold Alignment

Abstract: Recent single-cell multimodal data reveal multi-scale characteristics of single cells, such as transcriptomics, morphology, and electrophysiology. However, integrating and analyzing such multimodal data to gain a deeper understanding of functional genomics and gene regulation across cellular characteristics remains challenging. To address this, we applied and benchmarked multiple machine learning methods to align gene expression and electrophysiological data of single neuronal cells in the mouse brain from the BRAIN Initiative. We found that nonlinear manifold learning outperforms the other methods. After manifold alignment, the cells form clusters that correspond closely to transcriptomic and morphological cell types, suggesting a strong nonlinear relationship between gene expression and electrophysiology at the cell-type level. In addition, the electrophysiological features can be predicted well from gene expression in the latent space produced by manifold alignment. The aligned cells further show continuous changes in electrophysiological features, implying cross-cluster gene expression transitions. Functional enrichment and gene regulatory network analyses of those cell clusters revealed potential genome functions and molecular mechanisms linking gene expression to neuronal electrophysiology.

CCPLS reveals cell-type-specific spatial dependence of transcriptomes in single cells

10.1101/2022.01.12.476034 ◽

2022 ◽

Author(s):

Takaho Tsuchiya ◽

Hiroki Hori ◽

Haruka Ozaki

Keyword(s):

Gene Expression ◽

Single Cells ◽

Cell Types ◽

Regression Modeling ◽

Transcriptome Data ◽

Cell Type ◽

Neighboring Cell ◽

Expression Variability ◽

Cell Expression ◽

Cell Cell

Motivation: Cell-cell communications regulate the internal states of cells, e.g., gene expression and cell functions, and play pivotal roles in normal development and disease. Single-cell RNA sequencing has also revealed cell-to-cell expression variability of highly variable genes (HVGs), which is likewise crucial. Nevertheless, how cell-cell communications regulate the cell-to-cell expression variability of HVGs is still unexplored. The recent advent of spatial transcriptome measurement methods has linked gene expression profiles to the spatial context of single cells, providing an opportunity to reveal this regulation. Existing computational methods use spatial transcriptome data to extract genes whose expression levels are influenced by neighboring cell types. However, they remain limited in quantitativeness and interpretability: they do not focus on HVGs, do not consider cooperation among neighboring cell types, and do not quantify the degree of regulation by each neighboring cell type. Results: Here, we propose CCPLS (Cell-Cell communications analysis by Partial Least Square regression modeling), a statistical framework for identifying cell-cell communications as the effects of multiple neighboring cell types on the cell-to-cell expression variability of HVGs, based on spatial transcriptome data. For each cell type, CCPLS performs PLS regression modeling and reports the coefficients as a quantitative index of the cell-cell communications. Evaluation using simulated data showed that our method accurately estimates the effects of multiple neighboring cell types on HVGs. Furthermore, by applying CCPLS to two real datasets, we demonstrate that CCPLS can be used to extract biologically interpretable insights from the inferred cell-cell communications.
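To make the idea of "PLS regression coefficients as a quantitative index" concrete, here is a minimal sketch of a one-component PLS regression for a single response. It is not the CCPLS implementation: the function name `pls1_one_component`, the toy feature matrix (rows are cells, columns are hypothetical neighboring-cell-type scores), and the restriction to one component and one gene are all assumptions made for illustration.

```python
import math


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pls1_one_component(X, y):
    """One-component PLS regression for a single response.

    X: list of rows (cells x neighboring-cell-type features, hypothetical).
    y: list with the expression of one gene per cell.
    Returns one regression coefficient per feature (on centered data).
    """
    n, p = len(X), len(X[0])
    cols = [center([row[j] for row in X]) for j in range(p)]
    yc = center(y)

    # Weight vector: direction of maximal covariance between X and y.
    w = [sum(col[i] * yc[i] for i in range(n)) for col in cols]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        return [0.0] * p  # no feature covaries with the response
    w = [v / norm for v in w]

    # Latent score per cell, then regress the response on that score.
    t = [sum(w[j] * cols[j][i] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)

    # Map the latent regression back to per-feature coefficients.
    return [wj * q for wj in w]


# Toy data: the gene responds only to the first neighboring cell type.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2, -2, 2, -2]
coefficients = pls1_one_component(X, y)
```

In this toy case the first coefficient carries the whole effect and the second is zero, which is the kind of per-neighbor quantification the abstract describes. A full analysis would use several components and many genes.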

RNA splicing programs define tissue compartments and cell types at single cell resolution

10.1101/2021.05.01.442281 ◽

2021 ◽

Author(s):

Julia Eve Olivieri ◽

Roozbeh Dehghannasiri ◽

Peter Wang ◽

SoRi Jang ◽

Antoine de Morree ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

High Throughput ◽

Rna Splicing ◽

Single Cells ◽

Cell Types ◽

Mouse Lemur ◽

Cell Type ◽

Multiple Organs ◽

Single Cell Pcr

More than 95% of human genes are alternatively spliced. Yet the extent to which splicing is regulated at single-cell resolution has remained controversial, owing both to the available data and to the methods used to interpret it. We apply the SpliZ, a new statistical approach that is agnostic to transcript annotation, to detect cell-type-specific regulated splicing in >110K carefully annotated single cells from 12 human tissues. Using 10x data for discovery, 9.1% of genes with computable SpliZ scores are cell-type specifically spliced. These results are validated with RNA FISH, single-cell PCR, and, in high throughput, with Smart-seq2. Regulated splicing is found in ubiquitously expressed genes such as actin light chain subunit MYL6 and ribosomal protein RPS24, which has an epithelial-specific microexon. 13% of the statistically most variable splice sites in cell-type specifically regulated genes are also most variable in mouse lemur or mouse. SpliZ analysis further reveals 170 genes with regulated splicing during sperm development, 10 of which are conserved in mouse and mouse lemur. The statistical properties of the SpliZ allow model-based identification of subpopulations within cells that are otherwise indistinguishable based on gene expression, illustrated by subpopulations of classical monocytes with stereotyped splicing, including an un-annotated exon, in SAT1, a diamine acetyltransferase. Together, this unsupervised and annotation-free analysis of differential splicing in ultra-high-throughput droplet-based sequencing of human cells across multiple organs establishes that splicing is regulated cell-type-specifically, independent of gene expression.

A human cell atlas of fetal chromatin accessibility

Science ◽

10.1126/science.aba7612 ◽

2020 ◽

Vol 370 (6518) ◽

pp. eaba7612 ◽

Cited By ~ 1

Author(s):

Silvia Domcke ◽

Andrew J. Hill ◽

Riza M. Daza ◽

Junyue Cao ◽

Diana R. O’Day ◽

...

Keyword(s):

Gene Expression ◽

Human Cell ◽

Single Cells ◽

Complex Trait ◽

Cell Types ◽

Regulatory Elements ◽

Chromatin Accessibility ◽

Cell Type ◽

Cell Type Specific

The chromatin landscape underlying the specification of human cell types is of fundamental interest. We generated human cell atlases of chromatin accessibility and gene expression in fetal tissues. For chromatin accessibility, we devised a three-level combinatorial indexing assay and applied it to 53 samples representing 15 organs, profiling ~800,000 single cells. We leveraged cell types defined by gene expression to annotate these data and cataloged hundreds of thousands of candidate regulatory elements that exhibit cell type–specific chromatin accessibility. We investigated the properties of lineage-specific transcription factors (such as POU2F1 in neurons), organ-specific specializations of broadly distributed cell types (such as blood and endothelial), and cell type–specific enrichments of complex trait heritability. These data represent a rich resource for the exploration of in vivo human gene regulation in diverse tissues and cell types.

Transposon-mediated, cell type-specific transcription factor recording in the mouse brain

10.1101/538504 ◽

2019 ◽

Cited By ~ 1

Author(s):

Alexander J. Cammack ◽

Arnav Moudgil ◽

Tomas Lagunas ◽

Michael J. Vasek ◽

Mark Shabsovich ◽

...

Keyword(s):

Gene Expression ◽

Mouse Brain ◽

Regulation Of Gene Expression ◽

Cell Types ◽

Mouse Tissue ◽

Specific Cell ◽

Cell Type ◽

Cell Fate Decisions ◽

Cell Type Specific ◽

The Brain

Abstract: Transcription factors (TFs) play a central role in the regulation of gene expression, controlling everything from cell fate decisions to activity-dependent gene expression. However, widely used methods for TF profiling in vivo (e.g. ChIP-seq) yield only an aggregated picture of TF binding across all cell types present within the harvested tissue; thus, it is challenging or impossible to determine how the same TF might bind different portions of the genome in different cell types, or even to identify its binding events at all in rare cell types in a complex tissue such as the brain. Here we present a versatile methodology, FLEX Calling Cards, for mapping TF occupancy in specific cell types from heterogeneous tissues. In this method, the TF of interest is fused to a hyperactive piggyBac transposase (hypPB), and this bipartite gene is delivered, along with donor transposons, to mouse tissue via a Cre-dependent adeno-associated virus (AAV). The fusion protein is expressed in Cre-expressing cells, where it inserts transposon “Calling Cards” near TF binding sites. These transposons permanently mark TF binding events and can be mapped using high-throughput sequencing. Alternatively, unfused hypPB interacts with and records the binding of the super enhancer (SE)-associated bromodomain protein, Brd4. To demonstrate the FLEX Calling Card method, we first show that donor transposon and transposase constructs can be efficiently delivered to the postnatal day 1 (P1) mouse brain with AAV and that insertion profiles report TF occupancy. Then, using a Cre-dependent hypPB virus, we show the utility of this tool in defining cell type-specific TF profiles in multiple cell types of the brain. This approach will enable important cell type-specific studies of TF-mediated gene regulation in the brain and will provide valuable insights into brain development, homeostasis, and disease.

Saturating Single-Cell atlas Datasets

10.1101/218370 ◽

2017 ◽

Cited By ~ 2

Author(s):

Aparna Bhaduri ◽

Tomasz J. Nowakowski ◽

Alex A. Pollen ◽

Arnold R. Kriegstein

Keyword(s):

Population Structure ◽

Single Cell ◽

Mouse Brain ◽

Large Scale ◽

Single Cells ◽

Cost Effective ◽

Cell Types ◽

Cell Number ◽

Cell Type ◽

The Relationship

Abstract: High-throughput methods for profiling the transcriptomes of single cells have recently emerged as transformative approaches for large-scale population surveys of cellular diversity in heterogeneous primary tissues. Efficient generation of such an atlas will depend on sufficient sampling of the diverse cell types while remaining cost-effective, to enable a comprehensive examination of organs, developmental stages, and individuals. To examine the relationship between cell number and transcriptional heterogeneity in the context of unbiased cell type classification, we explicitly explored the population structure of a publicly available 1.3 million cell dataset from the E18.5 mouse brain. We propose a computational framework for inferring the saturation point of cluster discovery in a single-cell mRNA-seq experiment, centered around cluster preservation in downsampled datasets. In addition, we introduce a “complexity index”, which characterizes the heterogeneity of cells in a given dataset. Using Cajal-Retzius cells as an example of a limited-complexity dataset, we explored whether biological distinctions relate to technical clustering. Surprisingly, we found that clustering distinctions carrying biologically interpretable meaning are achieved with far fewer cells (20,000). Together, these findings suggest that most of the biologically interpretable insights from the 1.3 million cells can be recapitulated by analyzing 50,000 randomly selected cells, indicating that instead of profiling few individuals at high “cellular coverage”, the much-anticipated cell atlasing studies may benefit from profiling more individuals, or many time points, at lower cellular coverage.

Recent efforts seek to create a comprehensive cell atlas of the human body [1,2]. Current technology, however, makes it prohibitively expensive to analyze every cell. Therefore, designing effective sampling strategies will be critical to generating a working atlas in an efficient, cost-effective, and streamlined manner. The advent of single-cell and single-nucleus mRNA sequencing (RNA-seq) in droplet format [3,4] now enables large-scale sampling of cells from any tissue, and a recently released publicly available dataset of 1.3 million single cells from the E18.5 mouse brain generated with the 10X Chromium [5] provides an opportunity to explore the relationship between population structure and the number of sampled cells necessary to reveal the underlying diversity of cell types. Here, we present a framework for how researchers can evaluate whether a dataset has reached saturation, and we estimate how many cells would be required to generate an atlas of the sample analyzed here. This framework can be applied to any organ- or cell-type-specific atlas for any organism.
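The downsampling idea can be sketched in a few lines. This is a simplified illustration, not the authors' framework. It assumes that cluster labels from a full-dataset clustering are already available and treats a cluster as "recovered" in a subsample when at least `min_cells` of its cells are present. The real analysis re-clusters each downsampled dataset and measures cluster preservation. The names `saturation_curve`, `saturation_point`, and `min_cells` are hypothetical.

```python
import random
from collections import Counter


def recovered_clusters(labels, min_cells):
    """Count clusters represented by at least `min_cells` cells."""
    counts = Counter(labels)
    return sum(1 for c in counts.values() if c >= min_cells)


def saturation_curve(labels, sizes, min_cells=10, seed=0):
    """Return (subsample size, recovered clusters) for nested subsamples.

    The cells are shuffled once and each subsample is a prefix of that
    shuffle, so larger subsamples always contain the smaller ones.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return [(s, recovered_clusters(shuffled[:s], min_cells))
            for s in sorted(sizes)]


def saturation_point(curve):
    """Smallest subsample size that recovers as many clusters as the largest."""
    final = curve[-1][1]
    for size, n_clusters in curve:
        if n_clusters == final:
            return size


# Toy dataset: three clusters of unequal abundance.
labels = ["a"] * 500 + ["b"] * 300 + ["c"] * 200
curve = saturation_curve(labels, [10, 100, 500, 1000], min_cells=10)
point = saturation_point(curve)
```

Because the subsamples are nested, the number of recovered clusters can only stay flat or grow with sample size, so the first size that matches the final count is a natural stand-in for the saturation point.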

Cell Type Assignments for Spatial Transcriptomics Data

10.1101/2021.02.25.432887 ◽

2021 ◽

Author(s):

Haotian Teng ◽

Ye Yuan ◽

Ziv Bar-Joseph

Keyword(s):

Gene Expression ◽

Single Cells ◽

Cell Types ◽

New Method ◽

Inhibitory Neurons ◽

Cell Type ◽

Inference Algorithms ◽

Probabilistic Function ◽

Transcriptomics Data

Motivation: Recent advancements in fluorescence in situ hybridization (FISH) techniques make it possible to concurrently obtain information on the location and gene expression of single cells. A key question in the initial analysis of such spatial transcriptomics data is the assignment of cell types. To date, most studies have used methods that rely only on the expression levels of the genes in each cell for such assignments. To fully utilize the data and to improve the ability to identify novel sub-types, we developed a new method, FICT, which combines both expression and neighborhood information when assigning cell types. Results: FICT optimizes a probabilistic function that we formalize and for which we provide learning and inference algorithms. We used FICT to analyze both simulated and several real spatial transcriptomics datasets. As we show, FICT can accurately identify cell types and sub-types, improving on expression-only methods and other methods proposed for clustering spatial transcriptomics data. Some of the spatial sub-types identified by FICT provide novel hypotheses about new functions for excitatory and inhibitory neurons. Availability: FICT is available at: https://github.com/haotianteng/[email protected]
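As a rough illustration of what "combining expression and neighborhood information" can mean, the sketch below scores each candidate cell type by adding an expression log-likelihood to a neighborhood log-likelihood. This is not FICT's model or API. The unit-variance Gaussian expression term, the multinomial neighborhood term, the `weight` parameter, and all names and numbers are assumptions chosen for illustration.

```python
import math


def expression_loglik(expr, mean, var=1.0):
    """Independent Gaussian log-likelihood of an expression vector."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
               for x, m in zip(expr, mean))


def neighborhood_loglik(neighbor_counts, type_probs):
    """Multinomial log-likelihood of neighbor types, up to a constant
    that does not depend on the candidate cell type."""
    return sum(n * math.log(type_probs[t]) for t, n in neighbor_counts.items())


def assign_type(expr, neighbor_counts, expr_means, neighbor_probs, weight=1.0):
    """Pick the cell type with the highest combined score."""
    scores = {
        t: expression_loglik(expr, expr_means[t])
        + weight * neighborhood_loglik(neighbor_counts, neighbor_probs[t])
        for t in expr_means
    }
    return max(scores, key=scores.get)


# Hypothetical parameters for two cell types and two genes.
expr_means = {"exc": [1.0, 0.0], "inh": [0.0, 1.0]}
neighbor_probs = {
    "exc": {"exc": 0.8, "inh": 0.2},  # neighbors expected around an "exc" cell
    "inh": {"exc": 0.3, "inh": 0.7},  # neighbors expected around an "inh" cell
}

# Expression is equidistant from both types; the neighborhood breaks the tie.
label = assign_type([0.5, 0.5], {"exc": 1, "inh": 4}, expr_means, neighbor_probs)
```

Setting `weight=0.0` reduces the score to an expression-only assignment, which makes it easy to compare the two behaviors on the same cell.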

Single-cell regulatory landscape and disease vulnerability map of adult Macaque cortex

10.1101/2020.05.14.087601 ◽

2020 ◽

Author(s):

Ying Lei ◽

Mengnan Cheng ◽

Zihao Li ◽

Zhenkun Zhuang ◽

Liang Wu ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Primary Motor Cortex ◽

Neurological Diseases ◽

Single Cells ◽

Cell Types ◽

Regulatory Elements ◽

Chromatin Accessibility ◽

Nucleotide Polymorphisms ◽

Cell Type

Non-human primates (NHP) provide a unique opportunity to study human neurological diseases, yet detailed characterization of the cell types and transcriptional regulatory features in the NHP brain is lacking. We applied a combinatorial indexing assay, sci-ATAC-seq, as well as single-nucleus RNA-seq, to profile chromatin accessibility in 43,793 single cells and transcriptomics in 11,477 cells, respectively, from the prefrontal cortex, primary motor cortex, and primary visual cortex of the adult cynomolgus monkey Macaca fascicularis. Integrative analysis of these two datasets resolved regulatory elements and transcription factors that specify cell type distinctions, and uncovered area-specific diversity in chromatin accessibility and gene expression within excitatory neurons. We also constructed the dynamic landscape of chromatin accessibility and gene expression during oligodendrocyte maturation to characterize adult remyelination. Furthermore, we identified cell type-specific enrichment of differentially spliced gene isoforms and disease-associated single nucleotide polymorphisms. Our datasets permit integrative exploration of complex regulatory dynamics in macaque brain tissue at single-cell resolution.

PESCA: A scalable platform for the development of cell-type-specific viral drivers

10.1101/570895 ◽

2019 ◽

Cited By ~ 3

Author(s):

Sinisa Hrvatin ◽

Christopher P. Tzeng ◽

M. Aurel Nagy ◽

Hume Stroud ◽

Charalampia Koutsioumpa ◽

...

Keyword(s):

Gene Expression ◽

Heterologous Gene Expression ◽

Single Cells ◽

Cell Types ◽

Regulatory Elements ◽

Functional Evaluation ◽

Cell Type ◽

Cell Type Specificity ◽

Enhancer Activity ◽

Cell Type Specific

Abstract: Enhancers are the primary DNA regulatory elements that confer cell type specificity of gene expression. Recent studies characterizing individual enhancers have revealed their potential to direct heterologous gene expression in a highly cell-type-specific manner. However, it has not yet been possible to systematically identify and test the function of enhancers for each of the many cell types in an organism. We have developed PESCA, a scalable and generalizable method that leverages ATAC- and single-cell RNA-sequencing protocols to characterize cell-type-specific enhancers that should enable genetic access and perturbation of gene function across mammalian cell types. Focusing on the highly heterogeneous mammalian cerebral cortex, we apply PESCA to find enhancers and generate viral reagents capable of accessing and manipulating a subset of somatostatin-expressing cortical interneurons with high specificity. This study demonstrates the utility of this platform for developing new cell-type-specific viral reagents, with significant implications for both basic and translational research. One-sentence summary: Highly parallelized functional evaluation of enhancer activity in single cells generates new cell-type-specific tools with broad medical and scientific applications.

alona: a web server for single-cell RNA-seq analysis

Bioinformatics ◽

10.1093/bioinformatics/btaa269 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3910-3912 ◽

Cited By ~ 6

Author(s):

Oscar Franzén ◽

Johan L M Björkegren

Keyword(s):

Gene Expression ◽

Single Cell ◽

Single Cell Analysis ◽

Single Cells ◽

Cluster Structure ◽

Web Server ◽

Cell Types ◽

Supplementary Information ◽

Marker Genes ◽

Cell Type

Summary: Single-cell RNA sequencing (scRNA-seq) is a technology to measure gene expression in single cells. It has enabled discovery of new cell types and established cell type atlases of tissues and organs. The widespread adoption of scRNA-seq has created a need for user-friendly software for data analysis. We have developed a web server, alona, that incorporates several of the most popular single-cell analysis algorithms into a flexible pipeline. alona can perform quality filtering, normalization, batch correction, clustering, cell type annotation and differential gene expression analysis. Data are visualized in the web browser using an interface based on JavaScript, allowing the user to query genes of interest and visualize the cluster structure. alona accepts a compressed gene expression matrix and identifies cell clusters with a graph-based clustering strategy. Cell types are identified from a comprehensive collection of marker genes or by specifying a custom set of marker genes. Availability and implementation: The service runs at https://alona.panglaodb.se and the Python package can be downloaded from https://oscar-franzen.github.io/adobo/. Supplementary information: Supplementary data are available at Bioinformatics online.
