Single nucleus multi-omics regulatory landscape of the murine pituitary

Abstract: To provide a multi-omics resource and investigate transcriptional regulatory mechanisms, we profile the transcriptome, chromatin accessibility, and methylation status of over 70,000 single nuclei (sn) from adult mouse pituitaries. Paired snRNAseq and snATACseq datasets from individual animals highlight a continuum between developmental epigenetically-encoded cell types and transcriptionally-determined transient cell states. Co-accessibility analysis-based identification of a putative Fshb cis-regulatory domain that overlaps the fertility-linked rs11031006 human polymorphism, followed by experimental validation, illustrates the use of this resource for hypothesis generation. We also identify transcriptional and chromatin accessibility programs distinguishing each major cell type. Regulons, which are co-regulated gene sets sharing binding sites for a common transcription factor driver, recapitulate cell type clustering. We identify both cell type-specific and sex-specific regulons that are highly correlated with promoter accessibility, but not with methylation state, supporting the centrality of chromatin accessibility in shaping cell-defining transcriptional programs. The sn multi-omics atlas is accessible at snpituitaryatlas.princeton.edu.

Single-Cell Epigenomics and Functional Fine-Mapping of Atherosclerosis GWAS Loci

Circulation Research ◽ 10.1161/circresaha.121.318971 ◽ 2021

Author(s): Tiit Örd ◽ Kadri Õunap ◽ Lindsey Stolze ◽ Rédouane Aherrahrou ◽ Valtteri Nurminen ◽ ...

Keyword(s): Smooth Muscle ◽ Smooth Muscle Cells ◽ Muscle Cells ◽ Cell Types ◽ Regulatory Elements ◽ Chromatin Accessibility ◽ Cell Type ◽ Atherosclerotic Lesions ◽ Genome Wide ◽ Single Nucleus

Rationale: Genome-wide association studies (GWAS) have identified hundreds of loci associated with coronary artery disease (CAD). Many of these loci are enriched in cis-regulatory elements (CREs) but are linked neither to cardiometabolic risk factors nor to candidate causal genes, complicating their functional interpretation. Objective: Single nucleus chromatin accessibility profiling of human atherosclerotic lesions was used to investigate cell type-specific patterns of CREs, to understand transcription factors establishing cell identity and to interpret CAD-relevant, non-coding genetic variation. Methods and Results: We used single nucleus ATAC-seq to generate DNA accessibility maps in > 7,000 cells derived from human atherosclerotic lesions. We identified five major lesional cell types including endothelial cells, smooth muscle cells, monocyte/macrophages, NK/T-cells and B-cells and further investigated subtype characteristics of macrophages and smooth muscle cells transitioning into fibromyocytes. We demonstrated that CAD associated genetic variants are particularly enriched in endothelial and smooth muscle cell-specific open chromatin. Using single cell co-accessibility and cis-eQTL information, we prioritized putative target genes and candidate regulatory elements for ~30% of all known CAD loci. Finally, we performed genome-wide experimental fine-mapping of the CAD GWAS variants using epigenetic QTL analysis in primary human aortic endothelial cells and STARR-Seq massively parallel reporter assay in smooth muscle cells. This analysis identified potential causal SNP(s) and the associated target gene for over 30 CAD loci. We present several examples where the chromatin accessibility and gene expression could be assigned to one cell type, predicting the cell type of action for CAD loci.
Conclusions: These findings highlight the potential of applying snATAC-seq to human tissues in revealing relative contributions of distinct cell types to diseases and in identifying genes likely to be influenced by non-coding GWAS variants.

Mapping genetic effects on cell type-specific chromatin accessibility and annotating complex trait variants using single nucleus ATAC-seq

10.1101/2020.12.03.387894 ◽ 2020

Author(s): Paola Benaglio ◽ Jacklyn Newsome ◽ Jee Yun Han ◽ Joshua Chiou ◽ Anthony Aylward ◽ ...

Keyword(s): Complex Traits ◽ Immune Cell ◽ Cell Types ◽ Chromatin Accessibility ◽ Genetic Effects ◽ Specific Cell ◽ Gene Promoters ◽ Cell Type ◽ Single Nucleus ◽ Cell Type Specific

Abstract: Gene regulation is highly cell type-specific, and understanding the function of non-coding genetic variants associated with complex traits requires molecular phenotyping at cell type resolution. In this study we performed single nucleus ATAC-seq (snATAC-seq) and genotyping in peripheral blood mononuclear cells from 10 individuals. Clustering chromatin accessibility profiles of 66,843 total nuclei identified 14 immune cell types and sub-types. We mapped chromatin accessibility QTLs (caQTLs) in each immune cell type and sub-type, which identified 6,248 total caQTLs, including those obscured in assays of bulk tissue, such as caQTLs with divergent effects on different cell types. For 3,379 caQTLs we further annotated putative target genes of variant activity using single cell co-accessibility, and caQTL variants were significantly correlated with the accessibility level of linked gene promoters. We fine-mapped loci associated with 16 complex immune traits and identified immune cell caQTLs at 517 candidate causal variants, including those with cell type-specific effects. At the 6q15 locus associated with type 1 diabetes, in line with previous reports, variant rs72928038 was a naïve CD4+ T cell caQTL linked to BACH2 and we validated the allelic effects of this variant on regulatory activity in Jurkat T cells. These results highlight the utility of snATAC-seq for mapping genetic effects on accessible chromatin in specific cell types and provide a resource for annotating complex immune trait loci.
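
The idea of linking a caQTL peak to a target gene through single cell co-accessibility can be illustrated with a minimal sketch. It assumes co-accessibility is approximated by the Pearson correlation of binarized peak calls across nuclei; this is a simplification for illustration, not the authors' pipeline, and published co-accessibility tools typically add regularization and genomic-distance constraints. The peak vectors and variable names are hypothetical toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A peak that never varies carries no co-accessibility signal.
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: one entry per nucleus, 1 = peak accessible, 0 = not.
caqtl_peak = [1, 1, 0, 0, 1, 0, 1, 0]
promoter_peak = [1, 1, 0, 0, 1, 0, 0, 0]
unrelated_peak = [0, 1, 0, 1, 0, 1, 0, 1]

linked_score = pearson(caqtl_peak, promoter_peak)
unlinked_score = pearson(caqtl_peak, unrelated_peak)
print(round(linked_score, 3), round(unlinked_score, 3))
```

In this toy case the caQTL peak and the promoter peak open in largely the same nuclei, so their score is high, while the unrelated peak scores below zero. A real analysis would also need to control for sequencing depth and cell type composition.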

Human and rat skeletal muscle single-nuclei multi-omic integrative analyses nominate causal cell types, regulatory elements, and SNPs for complex traits

Genome Research ◽ 10.1101/gr.268482.120 ◽ 2021

Author(s): Peter Orchard ◽ Nandini Manickam ◽ Christa Ventresca ◽ Swarooparani Vadlamudi ◽ Arushi Varshney ◽ ...

Keyword(s): Skeletal Muscle ◽ Muscle Cell ◽ Muscle Fiber ◽ Cell Types ◽ Chromatin Accessibility ◽ Skeletal Muscle Cell ◽ RNA Seq ◽ Rat Skeletal Muscle ◽ Integrative Analyses ◽ Single Nucleus

Skeletal muscle accounts for the largest proportion of human body mass, on average, and is a key tissue in complex diseases and mobility. It is composed of several different cell and muscle fiber types. Here, we optimize single-nucleus ATAC-seq (snATAC-seq) to map skeletal muscle cell–specific chromatin accessibility landscapes in frozen human and rat samples, and single-nucleus RNA-seq (snRNA-seq) to map cell-specific transcriptomes in human. We additionally perform multi-omics profiling (gene expression and chromatin accessibility) on human and rat muscle samples. We capture type I and type II muscle fiber signatures, which are generally missed by existing single-cell RNA-seq methods. We perform cross-modality and cross-species integrative analyses on 33,862 nuclei and identify seven cell types ranging in abundance from 59.6% to 1.0% of all nuclei. We introduce a regression-based approach to infer cell types by comparing transcription start site–distal ATAC-seq peaks to reference enhancer maps and show consistency with RNA-based marker gene cell type assignments. We find heterogeneity in enrichment of genetic variants linked to complex phenotypes from the UK Biobank and diabetes genome-wide association studies in cell-specific ATAC-seq peaks, with the most striking enrichment patterns in muscle mesenchymal stem cells (∼3.5% of nuclei). Finally, we overlay these chromatin accessibility maps on GWAS data to nominate causal cell types, SNPs, transcription factor motifs, and target genes for type 2 diabetes signals. These chromatin accessibility profiles for human and rat skeletal muscle cell types are a useful resource for nominating causal GWAS SNPs and cell types.

CellMap: Characterizing the types and composition of iPSC-derived cells from RNA-seq data

10.1101/2021.05.24.445360 ◽ 2021

Author(s): Zhengyu Ouyang ◽ Nathanael Bourgeois ◽ Eugenia Lyashenko ◽ Paige Cundiff ◽ Patrick F Cullen ◽ ...

Keyword(s): Single Cell ◽ Induced Pluripotent Stem Cell ◽ Cell Types ◽ Model Systems ◽ RNA Seq ◽ Cell Type ◽ Fine Grained ◽ Single Nucleus ◽ Induced Pluripotent

Induced pluripotent stem cell (iPSC) derived cell types are increasingly employed as in vitro model systems for drug discovery. For these studies to be meaningful, it is important to understand the reproducibility of the iPSC-derived cultures and their similarity to equivalent endogenous cell types. Single-cell and single-nucleus RNA sequencing (RNA-seq) are useful to gain such understanding, but they are expensive and time consuming, while bulk RNA-seq data can be generated quicker and at lower cost. In silico cell type decomposition is an efficient, inexpensive, and convenient alternative that can leverage bulk RNA-seq to derive more fine-grained information about these cultures. We developed CellMap, a computational tool that derives cell type profiles from publicly available single-cell and single-nucleus datasets to infer cell types in bulk RNA-seq data from iPSC-derived cell lines.
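
In silico cell type decomposition can be sketched as a constrained regression: bulk expression is modeled as a non-negative mixture of reference cell type profiles. The example below is a generic illustration using projected gradient descent on a least-squares objective, not CellMap's actual algorithm, which the abstract does not describe. The function name `decompose` and the reference matrix are hypothetical.

```python
def decompose(reference, bulk, iterations=500):
    """Estimate non-negative cell type proportions from bulk expression.

    reference: one row per gene, each row holding per-cell-type expression.
    bulk: one bulk expression value per gene.
    Minimizes 0.5 * ||reference @ x - bulk||^2 subject to x >= 0 by
    projected gradient descent, then rescales x to sum to 1.
    """
    n_genes = len(reference)
    n_types = len(reference[0])
    frobenius_sq = sum(v * v for row in reference for v in row)
    if frobenius_sq == 0:
        raise ValueError("reference profiles are all zero")
    # 1 / (sum of squared entries) is a safe step size: the sum bounds the
    # largest eigenvalue of reference^T reference from above.
    step = 1.0 / frobenius_sq
    x = [1.0 / n_types] * n_types
    for _ in range(iterations):
        residual = [
            sum(reference[g][t] * x[t] for t in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        grad = [
            sum(reference[g][t] * residual[g] for g in range(n_genes))
            for t in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[t] - step * grad[t]) for t in range(n_types)]
    total = sum(x)
    return [v / total for v in x] if total > 0 else x


# Hypothetical reference: 3 genes x 2 cell types, and a bulk sample mixed 70/30.
reference = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
bulk = [7.0, 3.0, 5.0]
proportions = decompose(reference, bulk)
print([round(p, 3) for p in proportions])
```

Projected gradient descent is used here only to keep the sketch dependency-free; a dedicated non-negative least-squares solver would be the usual choice in practice.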

TRlnc: a comprehensive database for human transcriptional regulatory information of lncRNAs

Briefings in Bioinformatics ◽ 10.1093/bib/bbaa011 ◽ 2020 ◽ Cited By ~ 3

Author(s): Yanyu Li ◽ Xuecang Li ◽ Yongsan Yang ◽ Meng Li ◽ Fengcui Qian ◽ ...

Keyword(s): Chromatin Accessibility ◽ Process Data ◽ Regulatory Regions ◽ Comprehensive Database ◽ Super Enhancer ◽ Annotation Information ◽ Transcriptional Regulatory ◽ Regulatory Information ◽ Risk SNPs ◽ Transcriptional Regulatory Mechanisms

Abstract Long noncoding RNAs (lncRNAs) have been proven to play important roles in transcriptional processes and biological functions. With the increasing study of human diseases and biological processes, information in human H3K27ac ChIP-seq, ATAC-seq and DNase-seq datasets is accumulating rapidly, resulting in an urgent need to collect and process data to identify transcriptional regulatory regions of lncRNAs. We therefore developed a comprehensive database for human regulatory information of lncRNAs (TRlnc, http://bio.licpathway.net/TRlnc), which aimed to collect available resources of transcriptional regulatory regions of lncRNAs and to annotate and illustrate their potential roles in the regulation of lncRNAs in a cell type-specific manner. The current version of TRlnc contains 8 683 028 typical enhancers/super-enhancers and 32 348 244 chromatin accessibility regions associated with 91 906 human lncRNAs. These regions are identified from over 900 human H3K27ac ChIP-seq, ATAC-seq and DNase-seq samples. Furthermore, TRlnc provides the detailed genetic and epigenetic annotation information within transcriptional regulatory regions (promoter, enhancer/super-enhancer and chromatin accessibility regions) of lncRNAs, including common SNPs, risk SNPs, eQTLs, linkage disequilibrium SNPs, transcription factors, methylation sites, histone modifications and 3D chromatin interactions. It is anticipated that the use of TRlnc will help users to gain in-depth and useful insights into the transcriptional regulatory mechanisms of lncRNAs.

SAT-298 Integrative Single-Cell Transcriptomic and Epigenomic Landscape of Mouse Anterior Pituitary Cell Types

Journal of the Endocrine Society ◽ 10.1210/jendso/bvaa046.593 ◽ 2020 ◽ Vol 4 (Supplement_1)

Author(s): Frederique Murielle Ruf-Zamojski ◽ Michel A Zamojski ◽ German Nudelman ◽ Yongchao Ge ◽ Natalia Mendelev ◽ ...

Keyword(s): Single Cell ◽ Cell Line ◽ Anterior Pituitary ◽ Cell Types ◽ Chromatin Accessibility ◽ Pituitary Cell ◽ Integrated Analysis ◽ Pituitary Cells ◽ RNA Seq ◽ Cell Type

Abstract The pituitary gland is a critical regulator of the neuroendocrine system. To further our understanding of the classification, cellular heterogeneity, and regulatory landscape of pituitary cell types, we performed and computationally integrated single cell (SC)/single nucleus (SN) resolution experiments capturing RNA expression, chromatin accessibility, and DNA methylation state from mouse dissociated whole pituitaries. Both SC and SN transcriptome analysis and promoter accessibility identified the five classical hormone-producing cell types (somatotropes, gonadotropes (GT), lactotropes, thyrotropes, and corticotropes). GT cells distinctively expressed transcripts for Cga, Fshb, Lhb, Nr5a1, and Gnrhr in SC RNA-seq and SN RNA-seq. This was matched in SN ATAC-seq with GTs specifically showing open chromatin at the promoter regions for the same genes. Similarly, the other classically defined anterior pituitary cells displayed transcript expression and chromatin accessibility patterns characteristic of their own cell type. This integrated analysis identified additional cell-types, such as a stem cell cluster expressing transcripts for Sox2, Sox9, Mia, and Rbpms, and a broadly accessible chromatin state. In addition, we performed bulk ATAC-seq in the LβT2b gonadotrope-like cell line. While the FSHB promoter region was closed in the cell line, we identified a region upstream of Fshb that became accessible by the synergistic actions of GnRH and activin A, and that corresponded to a conserved region identified by a polycystic ovary syndrome (PCOS) single nucleotide polymorphism (SNP). Although this locus appears closed in deep sequencing bulk ATAC-seq of dissociated mouse pituitary cells, SN ATAC-seq of the same preparation showed that this site was specifically open in mouse GT, but closed in 14 other pituitary cell type clusters. 
This discrepancy highlighted the detection limit of a bulk ATAC-seq experiment in a subpopulation, as GT represented ~5% of this dissociated anterior pituitary sample. These results identified this locus as a candidate for explaining the dual dependence of Fshb expression on GnRH and activin/TGFβ signaling, and potential new evidence for upstream regulation of Fshb. The pituitary epigenetic landscape provides a resource for improved cell type identification and for the investigation of the regulatory mechanisms driving cell-to-cell heterogeneity. Additional authors not listed due to abstract submission restrictions: N. Seenarine, M. Amper, N. Jain (ISMMS).
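
The detection limit described above follows from simple dilution arithmetic: a bulk measurement averages signal over all cells, weighted by abundance. The sketch below assumes an idealized case in which the locus is fully open in gonadotropes and fully closed elsewhere. The ~5% abundance comes from the abstract; the signal values of 1.0 and 0.0 and the function name are hypothetical.

```python
def bulk_signal(fractions, per_type_signal):
    """Expected bulk signal as the abundance-weighted mean of per-type signals."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("cell type fractions must sum to 1")
    return sum(fractions[name] * per_type_signal[name] for name in fractions)


# Gonadotropes (GT) at ~5% of the sample; all other clusters pooled.
fractions = {"gonadotrope": 0.05, "other": 0.95}
# Idealized accessibility: fully open in GT, closed everywhere else.
signal = {"gonadotrope": 1.0, "other": 0.0}

diluted = bulk_signal(fractions, signal)
print(diluted)
```

Under these assumptions the bulk assay sees only about one twentieth of the signal present in the gonadotrope population, which can fall below the background of a bulk ATAC-seq track even at high sequencing depth.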

Accurate imputation of histone modifications using transcription

10.1101/2020.04.08.032730 ◽ 2020 ◽ Cited By ~ 3

Author(s): Zhong Wang ◽ Alexandra G. Chivu ◽ Lauren A. Choate ◽ Edward J. Rice ◽ Donald C. Miller ◽ ...

Keyword(s): Transcription Initiation ◽ Cell Types ◽ Chromatin Accessibility ◽ Cell Type ◽ Active Chromatin ◽ Histone Marks ◽ Repressive Mark ◽ Cell Type Specific ◽ The Relationship ◽ Machine Learning Tool

Abstract: We trained a sensitive machine learning tool to infer the distribution of histone marks using maps of nascent transcription. Transcription captured the variation in active histone marks and complex chromatin states, like bivalent promoters, down to single-nucleosome resolution and at an accuracy that rivaled the correspondence between independent ChIP-seq experiments. The relationship between active histone marks and transcription was conserved in all cell types examined, allowing individual labs to annotate active functional elements in mammals with a richness similar to that of major consortia. Using imputation as an interpretative tool uncovered cell-type specific differences in how the PRC2-dependent repressive mark, H3K27me3, corresponds to transcription, and revealed that transcription initiation requires both chromatin accessibility and an active chromatin environment, demonstrating that initiation is less promiscuous than previously thought.

DNA Methylation Atlas of the Mouse Brain at Single-Cell Resolution

10.1101/2020.04.30.069377 ◽ 2020 ◽ Cited By ~ 1

Author(s): Hanqing Liu ◽ Jingtian Zhou ◽ Wei Tian ◽ Chongyuan Luo ◽ Anna Bartlett ◽ ...

Keyword(s): DNA Methylation ◽ Mouse Brain ◽ Spatial Organization ◽ Brain Area ◽ Cell Types ◽ Regulatory Elements ◽ Mammalian Brain ◽ Open Chromatin ◽ Cell Type ◽ Single Nucleus

Summary: Mammalian brain cells are remarkably diverse in gene expression, anatomy, and function, yet the regulatory DNA landscape underlying this extensive heterogeneity is poorly understood. We carried out a comprehensive assessment of the epigenomes of mouse brain cell types by applying single nucleus DNA methylation sequencing to profile 110,294 nuclei from 45 regions of the mouse cortex, hippocampus, striatum, pallidum, and olfactory areas. We identified 161 cell clusters with distinct spatial locations and projection targets. We constructed taxonomies of these epigenetic types, annotated with signature genes, regulatory elements, and transcription factors. These features indicate the potential regulatory landscape supporting the assignment of putative cell types, and reveal repetitive usage of regulators in excitatory and inhibitory cells for determining subtypes. The DNA methylation landscape of excitatory neurons in the cortex and hippocampus varied continuously along spatial gradients. Using this deep dataset, an artificial neural network model was constructed that precisely predicts single neuron cell-type identity and brain area spatial location. Integration of high-resolution DNA methylomes with single-nucleus chromatin accessibility data allowed prediction of high-confidence enhancer-gene interactions for all identified cell types, which were subsequently validated by cell-type-specific chromatin conformation capture experiments. By combining multi-omic datasets (DNA methylation, chromatin contacts, and open chromatin) from single nuclei and annotating the regulatory genome of hundreds of cell types in the mouse brain, our DNA methylation atlas establishes the epigenetic basis for neuronal diversity and spatial organization throughout the mouse brain.

Enhancement and Imputation of Peak Signal Enables Accurate Cell-Type Classification in scATAC-seq

Frontiers in Genetics ◽ 10.3389/fgene.2021.658352 ◽ 2021 ◽ Vol 12

Author(s): Zhe Cui ◽ Ya Cui ◽ Yan Gao ◽ Tao Jiang ◽ Tianyi Zang ◽ ...

Keyword(s): Single Cell ◽ Mononuclear Cells ◽ Cell Types ◽ Chromatin Accessibility ◽ Support Vector ◽ Cell Type ◽ Peripheral Blood Mononuclear ◽ Copy Numbers ◽ Genome Wide ◽ Accessible Chromatin

Single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) has been widely used to profile genome-wide chromatin accessibility in thousands of individual cells. However, compared with single-cell RNA-seq, the peaks of scATAC-seq are much sparser due to the lower copy numbers (diploid in humans) and the inherent missing signals, which makes it more challenging to classify cell types based on specifically expressed genes or other canonical markers. Here, we present svmATAC, a support vector machine (SVM)-based method for accurately identifying cell types in scATAC-seq datasets by enhancing peak signal strength and imputing signals through patterns of co-accessibility. We applied svmATAC to several scATAC-seq datasets from human immune cells, human hematopoietic system cells, and peripheral blood mononuclear cells. The benchmark results showed that svmATAC does not depend on literature-based markers and is robust across datasets from different libraries and platforms. The source code of svmATAC is available at https://github.com/mrcuizhe/svmATAC under the MIT license.

Exploiting marker genes for robust classification and characterization of single-cell chromatin accessibility

10.1101/2021.04.01.438068 ◽ 2021

Author(s): Risa Karakida Kawaguchi ◽ Ziqi Tang ◽ Stephan Fischer ◽ Rohit Tripathy ◽ Peter K. Koo ◽ ...

Keyword(s): Single Cell ◽ Marker Gene ◽ Cell Types ◽ Chromatin Accessibility ◽ Marker Genes ◽ Cell Type ◽ Gene Sets ◽ Typing Methods ◽ Cell Type Specific ◽ Cell Typing

Background: Single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) measures genome-wide chromatin accessibility for the discovery of cell-type specific regulatory networks. ScATAC-seq combined with single-cell RNA sequencing (scRNA-seq) offers important avenues for ongoing research, such as novel cell-type specific activation of enhancer and transcription factor binding sites as well as chromatin changes specific to cell states. On the other hand, scATAC-seq data is known to be challenging to interpret due to its high number of zeros as well as the heterogeneity derived from different protocols. Because of the stochastic lack of marker gene activities, cell type identification by scATAC-seq remains difficult even at a cluster level. Results: In this study, we exploit reference knowledge obtained from external scATAC-seq or scRNA-seq datasets to define existing cell types and uncover the genomic regions which drive cell-type specific gene regulation. To investigate the robustness of existing cell-typing methods, we collected 7 scATAC-seq datasets targeting mouse brain for a meta-analytic comparison of neuronal cell-type annotation, including a reference atlas generated by the BRAIN Initiative Cell Census Network (BICCN). By comparing the area under the receiver operating characteristics curves (AUROCs) for the three major cell types (inhibitory, excitatory, and non-neuronal cells), cell-typing performance by single markers is found to be highly variable even for known marker genes due to study-specific biases. However, the signal aggregation of a large and redundant marker gene set, optimized via multiple scRNA-seq data, achieves the highest cell-typing performances among 5 existing marker gene sets, from the individual cell to cluster level. That gene set also shows a high consistency with the cluster-specific genes from inhibitory subtypes in two well-annotated datasets, suggesting applicability to rare cell types. 
Next, we demonstrate a comprehensive assessment of scATAC-seq cell typing using exhaustive combinations of the marker gene sets with supervised learning methods including machine learning classifiers and joint clustering methods. Our results show that the combinations using robust marker gene sets systematically ranked at the top, not only with model based prediction using a large reference data but also with a simple summation of expression strengths across markers. To demonstrate the utility of this robust cell typing approach, we trained a deep neural network to predict chromatin accessibility in each subtype using only DNA sequence. Through model interpretation methods, we identify key motifs enriched about robust gene sets for each neuronal subtype. Conclusions: Through the meta-analytic evaluation of scATAC-seq cell-typing methods, we develop a novel method set to exploit the BICCN reference atlas. Our study strongly supports the value of robust marker gene selection as a feature selection tool and cross-dataset comparison between scATAC-seq datasets to improve alignment of scATAC-seq to known biology. With this novel, high quality epigenetic data, genomic analysis of regulatory regions can reveal sequence motifs that drive cell type-specific regulatory programs.
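
The comparison between single markers and an aggregated marker set can be illustrated with a small AUROC calculation. The sketch below assumes the "simple summation" score is the per-cell sum of gene activity across markers and computes AUROC as the probability that a positive cell outranks a negative one. The toy activity matrix and labels are hypothetical and are not taken from the study's datasets.

```python
def auroc(scores, labels):
    """AUROC as P(score of a positive cell > score of a negative cell).

    Ties contribute 0.5, which matches the rank-based (Mann-Whitney) definition.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative cell")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical binarized gene activity: rows are cells, columns are 3 markers.
activity = [
    [1, 0, 1],  # inhibitory
    [0, 1, 1],  # inhibitory, first marker dropped out
    [1, 0, 0],  # inhibitory, sparse profile
    [0, 0, 0],  # other
    [1, 0, 0],  # other, spurious signal at first marker
    [0, 0, 0],  # other
]
is_inhibitory = [True, True, True, False, False, False]

single_marker_auroc = auroc([row[0] for row in activity], is_inhibitory)
aggregated_auroc = auroc([sum(row) for row in activity], is_inhibitory)
print(round(single_marker_auroc, 3), round(aggregated_auroc, 3))
```

In this toy example a dropout in one inhibitory cell and a spurious signal in one other cell pull the single-marker AUROC down, while summing across the redundant markers recovers most of the separation. The pairwise loop is quadratic in the number of cells, so a rank-based implementation would be preferable for real datasets.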
