SnapHiC: a computational pipeline to map chromatin contacts from single cell Hi-C data

2020 ◽  
Author(s):  
Miao Yu ◽  
Armen Abnousi ◽  
Yanxiao Zhang ◽  
Guoqiang Li ◽  
Lindsay Lee ◽  
...  

Single cell Hi-C (scHi-C) analysis has been increasingly used to map the chromatin architecture in diverse tissue contexts, but computational tools to define chromatin contacts at high resolution from scHi-C data are still lacking. Here, we describe SnapHiC, a method that can identify chromatin loops at high resolution and accuracy from scHi-C data. We benchmark SnapHiC against HiCCUPS, a common tool for mapping chromatin contacts in bulk Hi-C data, using scHi-C data from 742 mouse embryonic stem cells. We further demonstrate its utility by analyzing single-nucleus methyl-3C-seq data from 2,869 human prefrontal cortical cells. We uncover cell-type-specific chromatin loops and predict putative target genes for non-coding sequence variants associated with neuropsychiatric disorders. Our results suggest that SnapHiC could facilitate the analysis of cell-type-specific chromatin architecture and gene regulatory programs in complex tissues.

2021 ◽  
Vol 18 (9) ◽  
pp. 1056-1059
Author(s):  
Miao Yu ◽  
Armen Abnousi ◽  
Yanxiao Zhang ◽  
Guoqiang Li ◽  
Lindsay Lee ◽  
...  

Abstract Single-cell Hi-C (scHi-C) analysis has been increasingly used to map chromatin architecture in diverse tissue contexts, but computational tools to define chromatin loops at high resolution from scHi-C data are still lacking. Here, we describe Single-Nucleus Analysis Pipeline for Hi-C (SnapHiC), a method that can identify chromatin loops at high resolution and accuracy from scHi-C data. Using scHi-C data from 742 mouse embryonic stem cells, we benchmark SnapHiC against a number of computational tools developed for mapping chromatin loops and interactions from bulk Hi-C. We further demonstrate its use by analyzing single-nucleus methyl-3C-seq data from 2,869 human prefrontal cortical cells, which uncovers cell type-specific chromatin loops and predicts putative target genes for noncoding sequence variants associated with neuropsychiatric disorders. Our results indicate that SnapHiC could facilitate the analysis of cell type-specific chromatin architecture and gene regulatory programs in complex tissues.


2019 ◽  
Author(s):  
Joshua Chiou ◽  
Chun Zeng ◽  
Zhang Cheng ◽  
Jee Yun Han ◽  
Michael Schlichting ◽  
...  

Abstract Genetic risk variants for complex, multifactorial diseases are enriched in cis-regulatory elements. Single cell epigenomic technologies create new opportunities to dissect cell type-specific mechanisms of risk variants, yet this approach has not been widely applied to disease-relevant tissues. Given the central role of pancreatic islets in type 2 diabetes (T2D) pathophysiology, we generated accessible chromatin profiles from 14.2k islet cells and identified 13 cell clusters including multiple alpha, beta and delta cell clusters which represented hormone-producing and signal-responsive cell states. We cataloged 244,236 islet cell type accessible chromatin sites and identified transcription factors (TFs) underlying both lineage- and state-specific regulation. We measured the enrichment of T2D and glycemic trait GWAS for the accessible chromatin profiles of single cells, which revealed heterogeneity in the effects of beta cell states and TFs on fasting glucose and T2D risk. We further used machine learning to predict the cell type-specific regulatory function of genetic variants, and single cell co-accessibility to link distal sites to putative cell type-specific target genes. We localized 239 fine-mapped T2D risk signals to islet accessible chromatin, and further prioritized variants at these signals with predicted regulatory function and co-accessibility with target genes. At the KCNQ1 locus, the causal T2D variant rs231361 had predicted effects on an enhancer with beta cell-specific, long-range co-accessibility to the insulin promoter, and deletion of this enhancer reduced insulin gene and protein expression in human embryonic stem cell-derived beta cells. Our findings provide a cell type- and state-resolved map of gene regulation in human islets, illuminate likely mechanisms of T2D risk at hundreds of loci, and demonstrate the power of single cell epigenomics for interpreting complex disease genetics.


2020 ◽  
Vol 48 (W1) ◽  
pp. W275-W286 ◽  
Author(s):  
Anjun Ma ◽  
Cankun Wang ◽  
Yuzhou Chang ◽  
Faith H Brennan ◽  
Adam McDermaid ◽  
...  

Abstract A group of genes controlled as a unit, usually by the same repressor or activator gene, is known as a regulon. The ability to identify active regulons within a specific cell type, i.e., cell-type-specific regulons (CTSR), provides an extraordinary opportunity to pinpoint crucial regulators and target genes responsible for complex diseases. However, the identification of CTSRs from single-cell RNA-Seq (scRNA-Seq) data is computationally challenging. We introduce IRIS3, the first-of-its-kind web server for CTSR inference from scRNA-Seq data for human and mouse. IRIS3 is an easy-to-use server empowered by over 20 functionalities to support comprehensive interpretations and graphical visualizations of identified CTSRs. CTSR data can be used to reliably characterize and distinguish the corresponding cell type from others and can be combined with other computational or experimental analyses for biomedical studies. CTSRs can, therefore, aid in the discovery of major regulatory mechanisms and allow reliable construction of global transcriptional regulation networks encoded in a specific cell type. The broader impact of IRIS3 includes, but is not limited to, investigation of complex disease hierarchies and heterogeneity, causal gene regulatory network construction, and drug development. IRIS3 is freely accessible from https://bmbl.bmi.osumc.edu/iris3/ with no login requirement.


2019 ◽  
Author(s):  
Ashley G. Anderson ◽  
Ashwinikumar Kulkarni ◽  
Matthew Harper ◽  
Genevieve Konopka

Abstract The striatum is a critical forebrain structure for integrating cognitive, sensory, and motor information from diverse brain regions into meaningful behavioral output. However, the transcriptional mechanisms that underlie striatal development and organization at single-cell resolution remain unknown. Here, we show that Foxp1, a transcription factor strongly linked to autism and intellectual disability, regulates organizational features of striatal circuitry in a cell-type-dependent fashion. Using single-cell RNA-sequencing, we examine the cellular diversity of the early postnatal striatum and find that cell-type-specific deletion of Foxp1 in striatal projection neurons alters the cellular composition and neurochemical architecture of the striatum. Importantly, using this approach, we identify the non-cell autonomous effects produced by disrupting Foxp1 in one cell-type and the molecular compensation that occurs in other populations. Finally, we identify Foxp1-regulated target genes within distinct cell-types and connect these molecular changes to functional and behavioral deficits relevant to phenotypes described in patients with FOXP1 loss-of-function mutations. These data reveal cell-type-specific transcriptional mechanisms underlying distinct features of striatal circuitry and identify Foxp1 as a key regulator of striatal development.


2018 ◽  
Author(s):  
Anders S. Hansen ◽  
Tsung-Han S. Hsieh ◽  
Claudia Cattoglio ◽  
Iryna Pustova ◽  
Xavier Darzacq ◽  
...  

Mammalian genomes are folded into Topologically Associating Domains (TADs), consisting of cell-type specific chromatin loops anchored by CTCF and cohesin. Since CTCF and cohesin are expressed ubiquitously, how cell-type specific CTCF-mediated loops are formed poses a paradox. Here we show RNase-sensitive CTCF self-association in vitro and that an RNA-binding region (RBR) mediates CTCF clustering in vivo. Intriguingly, deleting the RBR abolishes or impairs almost half of all chromatin loops in mouse embryonic stem cells. Disrupted loop formation correlates with abrogated clustering and diminished chromatin binding of the RBR mutant CTCF protein, which in turn results in a failure to halt cohesin-mediated extrusion. Thus, CTCF loops fall into at least 2 classes: RBR-independent and RBR-dependent loops. We suggest that evidence for distinct classes of RBR-dependent loops may provide a mechanism for establishing cell-specific CTCF loops regulated by RNAs and other RBR partners.


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Suvi Linna Kuosmanen ◽  
Eloi Schmauch ◽  
Kyriakitsa Galani ◽  
Carles Boix ◽  
Yongjin P Park ◽  
...  

Genome-wide association studies have uncovered over 200 genetic loci underlying coronary artery disease (CAD), providing great hope for a deeper understanding of the causal mechanisms leading to this disease. However, in order to understand CAD at the molecular level, it is necessary to uncover cell-type-specific circuits and to use these circuits to dissect driver variants, genes, pathways, and cell types, in normal and diseased tissues. Here, we provide the most detailed single-cell dissection of human heart cell types, using cardiac biopsies collected during open-heart surgery from healthy, CAD, and CAD-related heart failure donors, and profiling both transcriptional (scRNA-seq) and epigenomic (scATAC-seq) changes. Using this approach, we identify 12 major heart cell types, including typical cardiovascular cells (cardiomyocytes, endothelial cells, fibroblasts), rarer cell types (B cells, neurons, Schwann cells), and previously-unrecognized layer-specific epithelial and endothelial cell types. We define markers for each cell type, providing the first extensive reference set for the living human heart. In addition, we define differential gene expression patterns in CAD relative to control samples, revealing substantial differences in cell-type-specific expression of disease-related genes, emphasizing, for example, the importance of the vascular endothelium in the pathogenesis of CAD. Strikingly, further clustering of the cell types based on specific subtypes revealed important differences in their expression patterns of disease-associated genes. These changes enrich in known CAD genetic loci, enabling us to recognize their likely target genes from scRNA-seq expression changes, candidate driver variants based on scATAC-seq localization and differential DNA accessibility, and candidate upstream regulators based on their enriched motif occurrences in scATAC loci. 
Overall, our results highlight the relevance and potential of single-cell transcriptional and epigenomic analyses to gain new biological insights into cardiovascular disease, and to recognize novel therapeutic target genes, pathways, and the cell types where they act.


2021 ◽  
Author(s):  
Su Chun ◽  
Long Gao ◽  
Catherine L May ◽  
James A Pippin ◽  
Keith Boehm ◽  
...  

Three-dimensional (3D) chromatin organization maps help to dissect cell type-specific gene regulatory programs. Furthermore, 3D chromatin maps have contributed to elucidating the pathogenesis of complex genetic diseases by connecting distal regulatory regions and genetic risk variants to their respective target genes. To understand the cell type-specific regulatory architecture of diabetes risk, we generated transcriptomic and 3D epigenomic profiles of human pancreatic acinar, alpha, and beta cells using single-cell RNA-seq, single-cell ATAC-seq, and high-resolution Hi-C of sorted cells. Comparisons of these profiles revealed differential A/B (open/closed) chromatin compartmentalization, chromatin looping, and control of cell type-specific gene regulatory programs. We identified a total of 1,094 putative causal-variant-target-gene pairs at 129 type 2 diabetes GWAS signals using pancreatic 3D chromatin maps. We found that the connections between candidate causal variants and their putative target effector genes are cell-type stratified and emphasize previously underappreciated roles for alpha and acinar cells in diabetes pathogenesis.


2020 ◽  
Vol 36 (Supplement_2) ◽  
pp. i610-i617
Author(s):  
Mohammad Lotfollahi ◽  
Mohsen Naghipourfar ◽  
Fabian J Theis ◽  
F Alexander Wolf

Abstract
Motivation: While generative models have shown great success in sampling high-dimensional samples conditional on low-dimensional descriptors (stroke thickness in MNIST, hair color in CelebA, speaker identity in WaveNet), their generation out-of-distribution poses fundamental problems due to the difficulty of learning a compact joint distribution across conditions. The canonical example of the conditional variational autoencoder (CVAE), for instance, does not explicitly relate conditions during training and, hence, has no explicit incentive to learn such a compact representation.
Results: We overcome the limitation of the CVAE by matching distributions across conditions using maximum mean discrepancy in the decoder layer that follows the bottleneck. This introduces a strong regularization both for reconstructing samples within the same condition and for transforming samples across conditions, resulting in much improved generalization. As this amounts to solving a style-transfer problem, we refer to the model as transfer VAE (trVAE). Benchmarking trVAE on high-dimensional image and single-cell RNA-seq data, we demonstrate higher robustness and higher accuracy than existing approaches. We also show qualitatively improved predictions by tackling previously problematic minority classes and multiple conditions in the context of cellular perturbation response to treatment and disease based on high-dimensional single-cell gene expression data. For generic tasks, we improve Pearson correlations of high-dimensional estimated means and variances with their ground truths from 0.89 to 0.97 and 0.75 to 0.87, respectively. We further demonstrate that trVAE learns cell-type-specific responses after perturbation and improves the prediction of most cell-type-specific genes by 65%.
Availability and implementation: The trVAE implementation is available via github.com/theislab/trvae. The results of this article can be reproduced via github.com/theislab/trvae_reproducibility.
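
The distribution matching described in the trVAE abstract rests on maximum mean discrepancy (MMD). As a minimal sketch, and not the trVAE implementation, the following Python computes the biased empirical estimate of squared MMD between two small samples under a Gaussian (RBF) kernel. The function names, the `gamma` bandwidth, and the toy data are illustrative assumptions; how trVAE applies the penalty to decoder-layer activations during training is not shown here.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(a: Vector, b: Vector, gamma: float = 1.0) -> float:
    """Gaussian kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float) -> float:
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float = 1.0) -> float:
    """Biased estimate: mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
    )


# Toy samples standing in for activations under two conditions (hypothetical data).
condition_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
condition_b = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]

same = mmd_squared(condition_a, condition_a)
different = mmd_squared(condition_a, condition_b)
print(f"MMD^2 (same sample): {same:.6f}")
print(f"MMD^2 (shifted sample): {different:.6f}")
```

The estimate is zero when both samples are identical and grows as the two samples separate, which is why adding it to a training loss pushes representations from different conditions toward a shared distribution.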


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Rongxin Fang ◽  
Sebastian Preissl ◽  
Yang Li ◽  
Xiaomeng Hou ◽  
Jacinta Lucero ◽  
...  

Abstract Identification of the cis-regulatory elements controlling cell-type specific gene expression patterns is essential for understanding the origin of cellular diversity. Conventional assays to map regulatory elements via open chromatin analysis of primary tissues are hindered by sample heterogeneity. Single cell analysis of accessible chromatin (scATAC-seq) can overcome this limitation. However, the high-level noise of each single cell profile and the large volume of data pose unique computational challenges. Here, we introduce SnapATAC, a software package for analyzing scATAC-seq datasets. SnapATAC dissects cellular heterogeneity in an unbiased manner and maps the trajectories of cellular states. Using the Nyström method, SnapATAC can process data from up to a million cells. Furthermore, SnapATAC incorporates existing tools into a comprehensive package for analyzing single cell ATAC-seq datasets. As a demonstration of its utility, SnapATAC is applied to 55,592 single-nucleus ATAC-seq profiles from the mouse secondary motor cortex. The analysis reveals ~370,000 candidate regulatory elements in 31 distinct cell populations in this brain region and infers candidate cell-type specific transcriptional regulators.


PLoS ONE ◽  
2018 ◽  
Vol 13 (10) ◽  
pp. e0205883 ◽  
Author(s):  
Joseph C. Mays ◽  
Michael C. Kelly ◽  
Steven L. Coon ◽  
Lynne Holtzclaw ◽  
Martin F. Rath ◽  
...  
