scDALI: modeling allelic heterogeneity in single cells reveals context-specific genetic regulation

2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Tobias Heinen ◽  
Stefano Secchia ◽  
James P. Reddington ◽  
Bingqing Zhao ◽  
Eileen E. M. Furlong ◽  
...  

Abstract While it is established that the functional impact of genetic variation can vary across cell types and states, capturing this diversity remains challenging. Current studies using bulk sequencing either ignore this heterogeneity or use sorted cell populations, reducing discovery and explanatory power. Here, we develop scDALI, a versatile computational framework that integrates information on cellular states with allelic quantifications of single-cell sequencing data to characterize cell-state-specific genetic effects. We apply scDALI to scATAC-seq profiles from developing F1 Drosophila embryos and scRNA-seq from differentiating human iPSCs, uncovering heterogeneous genetic effects in specific lineages, developmental stages, or cell types.

2021 ◽  
Author(s):  
Tobias Heinen ◽  
Stefano Secchia ◽  
James P. Reddington ◽  
Bingqing Zhao ◽  
Eileen E. M. Furlong ◽  
...  

While the functional impact of genetic variation can vary across cell types and states, capturing this diversity remains challenging. Current studies, using bulk sequencing, ignore much of this heterogeneity, reducing discovery and explanatory power. Single-cell approaches combined with F1 genetic designs provide a new opportunity to address this problem; however, suitable computational methods to model these complex relationships are lacking. Here, we developed scDALI, an analysis framework that integrates single-cell chromatin accessibility for unbiased cell state identification with allelic quantifications to assay genetic effects. scDALI builds on Gaussian process regression and can differentiate between homogeneous (pervasive) allelic imbalances and cell state-specific regulation. As a proof of principle, we applied scDALI to whole Drosophila embryos from F1 crosses, profiling sciATAC-seq at three embryonic stages. Even in these very complex samples, scDALI discovered hundreds of peaks with heterogeneous allelic imbalance, with effects in specific lineages and/or developmental stages. Our study provides a general strategy to identify the cellular context of allelic imbalance, a crucial step in linking genetic traits to cellular phenotypes.
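The distinction between homogeneous (pervasive) and cell-state-specific allelic imbalance can be illustrated with a deliberately simplified sketch. This is not scDALI's Gaussian process model: it assumes cells have already been grouped into discrete states and swaps in an exact binomial test for pooled imbalance plus a Pearson chi-square statistic for differences between states. The function names and the example counts are hypothetical.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k alternative-allele reads out of n.

    Sums the probabilities of all outcomes that are no more likely than k.
    Intended for modest read counts only (large n overflows float conversion).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def heterogeneity_chi2(counts: list[tuple[int, int]]) -> float:
    """Pearson chi-square statistic for an equal allelic rate across cell states.

    counts holds one (alt_reads, total_reads) pair per cell state.
    Assumes the pooled allelic rate is strictly between 0 and 1.
    """
    alt = sum(a for a, _ in counts)
    total = sum(n for _, n in counts)
    rate = alt / total
    stat = 0.0
    for a, n in counts:
        expected_alt = n * rate
        expected_ref = n * (1 - rate)
        stat += (a - expected_alt) ** 2 / expected_alt
        stat += ((n - a) - expected_ref) ** 2 / expected_ref
    return stat


# Hypothetical peaks: the first is imbalanced to a similar degree in every
# state, the second is balanced in two states and imbalanced in the third.
pervasive = [(70, 100), (72, 100), (68, 100)]
state_specific = [(50, 100), (50, 100), (90, 100)]
```

Under these assumptions, a peak with a small pooled p-value and a small heterogeneity statistic would be read as pervasive imbalance, while a large heterogeneity statistic points to state-specific regulation. A continuous cell-state representation, as used by scDALI, avoids the need to discretize cells first.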


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Vikram Agarwal ◽  
Sereno Lopez-Darwin ◽  
David R. Kelley ◽  
Jay Shendure

Abstract 3′ untranslated regions (3′ UTRs) post-transcriptionally regulate mRNA stability, localization, and translation rate. While 3′-UTR isoforms have been globally quantified in limited cell types using bulk measurements, their differential usage among cell types during mammalian development remains poorly characterized. In this study, we examine a dataset comprising ~2 million nuclei spanning E9.5–E13.5 of mouse embryonic development to quantify transcriptome-wide changes in alternative polyadenylation (APA). We observe a global lengthening of 3′ UTRs across embryonic stages in all cell types, although we detect shorter 3′ UTRs in hematopoietic lineages and longer 3′ UTRs in neuronal cell types within each stage. An analysis of RNA-binding protein (RBP) dynamics identifies ELAV-like family members, which are concomitantly induced in neuronal lineages and developmental stages experiencing 3′-UTR lengthening, as putative regulators of APA. By measuring 3′-UTR isoforms in an expansive single cell dataset, our work provides a transcriptome-wide and organism-wide map of the dynamic landscape of alternative polyadenylation during mammalian organogenesis.


2013 ◽  
Vol 52 (1) ◽  
pp. R35-R49 ◽  
Author(s):  
Nils Wierup ◽  
Frank Sundler ◽  
R Scott Heller

The islets of Langerhans are key regulators of glucose homeostasis and have been known as a structure for almost one and a half centuries. During the twentieth century, several different cell types were described in the islets of different species and at different developmental stages. Six cell types with identified hormonal products have been described so far using histochemical staining methods, transmission electron microscopy, and immunohistochemistry. Thus, glucagon-producing α-cells, insulin-producing β-cells, somatostatin-producing δ-cells, pancreatic polypeptide-producing PP-cells, serotonin-producing enterochromaffin-cells, and gastrin-producing G-cells have all been found in the mammalian pancreas at least at some developmental stage. Species differences exist, and age-related differences must also be considered. Eleven years ago, a novel cell type, the ghrelin cell, was discovered in human islets. Subsequent studies have shown the presence of islet ghrelin cells in several animals, including mouse, rat, gerbils, and fish. The developmental regulation of ghrelin cells in the islets of mice has attracted considerable interest, and several studies have added important pieces to the puzzle of the molecular mechanisms and genetic regulation that lead to differentiation into mature ghrelin cells. A body of evidence has shown that ghrelin is an insulinostatic hormone, and the potential for blockade of ghrelin signalling as a therapeutic avenue for type 2 diabetes is intriguing. Furthermore, ghrelin-expressing pancreatic tumours have been reported, and ghrelin needs to be taken into account when diagnosing pancreatic tumours. In this review article, we summarise the knowledge about islet ghrelin cells obtained so far.


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 531-531
Author(s):  
Erik L. Bao ◽  
Jacob C. Ulirsch ◽  
Caleb A. Lareau ◽  
Leif S. Ludwig ◽  
Michael H. Guo ◽  
...  

Abstract Hematopoiesis is a well-characterized paradigm of cellular differentiation that is highly regulated to ensure balanced proportions of mature blood cells. However, many aspects of this process remain poorly understood in humans. For example, there is extensive variation in commonly measured blood cell traits, which can manifest as diseases at extreme ends of the spectrum, yet the vast majority of genetic loci responsible for driving these differences are currently unknown. Here, we integrate fine-mapped population genetics with high-resolution chromatin landscapes to gain novel insights into regulatory mechanisms critical for human blood cell production and disease. First, we conducted a genome-wide association study in 115,000 individuals from the UK Biobank, measuring the effects of genetic variation on 16 blood traits spanning 7 hematopoietic lineages (erythroid, platelet, lymphocyte, monocyte, neutrophil, eosinophil, basophil). Within each region of association (n = 2,056), we performed Bayesian fine-mapping on all common variants to resolve the most likely causal hits. Going further, we were interested in whether genetic variants predominantly act in terminal cell states or less differentiated progenitors. To this end, we overlapped fine-mapped variants with chromatin accessibility profiles (ATAC-seq) of 18 primary hematopoietic populations sorted from healthy donors. Across all lineages, 21% of regulatory variants were restricted to accessible chromatin (AC) peaks in terminal progenitors. Interestingly, 59% of variants fell in AC regions of one or more upstream progenitor states, suggesting that a significant amount of variation in blood traits stems from regulatory signaling in earlier stages of hematopoiesis. Motivated by this finding, we hypothesized that different branches of hematopoiesis (e.g., monocyte and red blood cell count) could be co-regulated by pleiotropic variants acting in common progenitor populations. 
Therefore, we investigated variants associated with 2 or more of the 7 blood cell types for which phenotypes were available. Remarkably, across 172 such variants, there was an average of 60% more open chromatin in progenitors than terminal cell types (mean 4.01 vs. 2.44 counts per million; p = 0.025). Examining the directional effects of these variants on distinct lineages, we discovered that 91% of pleiotropic variants exhibited a tune mechanism by changing the levels of different blood cells in the same direction. One such example was rs17758695 located in intron 1 of BCL2, an anti-apoptotic protein known to regulate cell death similarly across multiple hematopoietic cell types. In contrast, the remaining 9% of pleiotropic variants favored one lineage at the expense of others (switch mechanism), including novel variants near key myeloid-determining transcription factors CEBPA and MYC (rs78744187 and rs562240450). Together, these results suggest that pleiotropic variants 1) preferentially act in common progenitor rather than terminal cell types, and 2) predominantly tune multiple traits in the same direction, but may favor one at the expense of others when influencing lineage commitment. Finally, given the enrichment of fine-mapped variants in common progenitor states, we set out to determine whether classically defined hematopoietic populations could be divided into lineage-biased subpopulations based on differential genetic regulation of blood traits. To do so, we measured the enrichment of fine-mapped variants in the chromatin landscapes of 2,034 single cells isolated from 8 hematopoietic progenitor populations. 
Strikingly, we discovered significant heterogeneity within the common myeloid progenitor (CMP) population, in which one subset of cells exhibited greater open-chromatin enrichment for myeloid trait variants and relevant transcription factor (TF) binding (CEBPA, IRF8), whereas the other subset showed enrichment for erythroid trait variants and TFs (GATA1, KLF1). By integrating genetic fine-mapping with chromatin data, we identified hundreds of causal variants regulating 16 blood traits, characterized novel mechanisms of pleiotropic effects, and discovered cell states enriched for blood trait regulation. These findings provide new insights into the importance of genetic regulation in progenitor cell states and will contribute to knowledge of how these processes go awry in diseases of blood cell production. Disclosures: No relevant conflicts of interest to declare.
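The tune versus switch distinction described above can be expressed as a sign check on a variant's per-trait effect estimates. The following is a minimal sketch under the assumption that each pleiotropic variant comes with a signed effect size per associated blood trait; the function name, the trait labels, and the effect values are hypothetical and are not taken from the study.

```python
def classify_pleiotropy(effects: dict[str, float]) -> str:
    """Label a variant by the direction of its effects on several traits.

    effects maps a trait name to a signed effect size for one variant.
    Returns "tune" if all non-zero effects share a sign, "switch" if the
    signs are mixed, and "undetermined" with fewer than two non-zero effects.
    """
    nonzero = [beta for beta in effects.values() if beta != 0]
    if len(nonzero) < 2:
        return "undetermined"
    signs = {beta > 0 for beta in nonzero}
    return "tune" if len(signs) == 1 else "switch"


# Hypothetical effect sizes for two illustrative variants.
same_direction = {"monocyte_count": 0.03, "red_cell_count": 0.02}
opposite_direction = {"monocyte_count": 0.04, "red_cell_count": -0.03}
```

A real analysis would also need to account for uncertainty in each effect estimate before assigning a direction; this sketch treats the signs as known.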


2019 ◽  
Vol 21 (4) ◽  
pp. 1209-1223 ◽  
Author(s):  
Raphael Petegrosso ◽  
Zhuliu Li ◽  
Rui Kuang

Abstract Single-cell RNA sequencing (scRNA-seq) technologies have enabled large-scale whole-transcriptome profiling of individual cells in a cell population. A core analysis of scRNA-seq transcriptome profiles is to cluster the single cells to reveal cell subtypes and infer cell lineages based on the relations among the cells. This article reviews the machine learning and statistical methods for clustering scRNA-seq transcriptomes developed in the past few years. The review focuses on how conventional clustering techniques such as hierarchical clustering, graph-based clustering, mixture models, k-means, ensemble learning, neural networks and density-based clustering are modified or customized to tackle the unique challenges in scRNA-seq data analysis, such as the dropout of low-expression genes, low and uneven read coverage of transcripts, highly variable total mRNAs from single cells and ambiguous cell markers in the presence of technical biases and irrelevant confounding biological variations. We review how cell-specific normalization, the imputation of dropouts and dimension reduction methods can be applied with new statistical or optimization strategies to improve the clustering of single cells. We also introduce more advanced approaches to cluster scRNA-seq transcriptomes in time series data and multiple cell populations and to detect rare cell types. Several software packages developed to support the cluster analysis of scRNA-seq data are also reviewed and experimentally compared to evaluate their performance and efficiency. Finally, we conclude with useful observations and possible future directions in scRNA-seq data analytics. Availability: All the source code and data are available at https://github.com/kuanglab/single-cell-review.
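As an illustration of the cell-specific normalization step mentioned above, the sketch below rescales each cell's counts to a common total and applies a log transform, which reduces the influence of highly variable total mRNA per cell before clustering. It is one common convention and not necessarily the approach of any method covered in the review; the function name and the scale factor of 10,000 are assumptions made for this example.

```python
from math import log1p


def normalize_cells(counts: list[list[int]], scale: float = 10_000.0) -> list[list[float]]:
    """Library-size normalize and log-transform a cells-by-genes count matrix.

    Each row is one cell. Counts are divided by the cell's total count,
    multiplied by a shared scale factor, and passed through log(1 + x).
    Cells with no reads are returned as all zeros.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            normalized.append([0.0] * len(cell))
            continue
        normalized.append([log1p(x / total * scale) for x in cell])
    return normalized


# Two hypothetical cells with the same expression proportions but a
# tenfold difference in sequencing depth, plus one empty cell.
example = normalize_cells([[1, 3], [10, 30], [0, 0]])
```

After this transform, the first two cells have matching profiles despite their different depths, so a distance-based clustering method would no longer separate them by read coverage alone.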


2021 ◽  
Author(s):  
Ryn Cuddleston ◽  
Junhao Li ◽  
Xuanjia Fan ◽  
Alexey Kozenkov ◽  
Matthew Lalli ◽  
...  

Posttranscriptional adenosine-to-inosine modifications amplify the functionality of RNA molecules in the brain, yet the cellular and genetic regulation of RNA editing is poorly described. We quantified base-specific RNA editing across three major cell populations from the human prefrontal cortex: glutamatergic neurons, medial ganglionic eminence GABAergic neurons, and oligodendrocytes. We found more selective editing and RNA hyper-editing in neurons relative to oligodendrocytes. The pattern of RNA editing was highly cell type-specific, with 189,229 cell type-associated sites. The cellular specificity for thousands of sites was confirmed by single nucleus RNA-sequencing. Importantly, cell type-associated sites were enriched in GTEx RNA-sequencing data, edited ~twentyfold higher than all other sites, and variation in RNA editing was predominantly explained by neuronal proportions in bulk brain tissue. Finally, we discovered 661,791 cis-editing quantitative trait loci across thirteen brain regions, including hundreds with cell type-associated features. These data reveal an expansive repertoire of highly regulated RNA editing sites across human brain cell types and provide a resolved atlas linking cell types to editing variation and genetic regulatory effects.


2021 ◽  
Author(s):  
Jake Yeung ◽  
Maria Florescu ◽  
Peter Zeller ◽  
Buys Anton de Barbanson ◽  
Alexander van Oudenaarden

Recent advances have enabled mapping of histone modifications in single cells, but current methods are constrained to profile only one histone modification per cell. Here we present an integrated experimental and computational framework, scChIX (single-cell chromatin immunocleavage and unmixing), to map multiple histone modifications in single cells. We first validate this method using purified blood cells and show that although the two repressive marks, H3K27me3 and H3K9me3, are generally mutually exclusive, the transitions between the two regions can vary between cell types. Next, we apply scChIX to a heterogeneous cell population from mouse bone marrow to generate linked maps of active (H3K4me1) and repressive (H3K27me3) chromatin landscapes in single cells, where coordinates in the active modification map correspond to coordinates in the repressive map. Linked analysis reveals that immunoglobulin genes in the region are in a repressed chromatin state in pro-B cells, but become activated in B cells. Overall, scChIX unlocks systematic interrogation of the interplay between histone modifications in single cells.


2014 ◽  
Author(s):  
Irene Gallego Romero ◽  
Bryan J Pavlovic ◽  
Irene Hernando-Herraez ◽  
Nicholas E Banovich ◽  
Courtney L Kagan ◽  
...  

Comparative genomics studies in primates are extremely restricted because we only have access to a few types of cell lines from non-human apes and to a limited collection of frozen tissues. In order to gain better insight into regulatory processes that underlie variation in complex phenotypes, we must have access to faithful model systems for a wide range of tissues and cell types. To facilitate this, we have generated a panel of 7 fully characterized chimpanzee (Pan troglodytes) induced pluripotent stem cell (iPSC) lines derived from fibroblasts of healthy donors. All lines appear to be free of integration from exogenous reprogramming vectors, can be maintained using standard iPSC culture techniques, and have proliferative and differentiation potential similar to human and mouse lines. To begin demonstrating the utility of comparative iPSC panels, we collected RNA sequencing data and methylation profiles from the chimpanzee iPSCs and their corresponding fibroblast precursors, as well as from 7 human iPSCs and their precursors, which were of multiple cell type and population origins. Overall, we observed much less regulatory variation within species in the iPSCs than in the somatic precursors, indicating that the reprogramming process has erased many of the differences observed between somatic cells of different origins. We identified 4,918 differentially expressed genes and 3,598 differentially methylated regions between iPSCs of the two species, many of which are novel inter-species differences that were not observed between the somatic cells of the two species. Our panel will help realise the potential of iPSCs in primate studies, and in combination with genomic technologies, transform studies of comparative evolution.


2019 ◽  
Author(s):  
Smriti Chawla ◽  
Sudhagar Samydurai ◽  
Say Li Kong ◽  
Zhenxun Wang ◽  
Wai Leong Tam ◽  
...  

Abstract Here, we introduce UniPath, a method for representing single cells using pathway and gene-set enrichment scores obtained by transforming their open-chromatin or gene-expression profiles. Besides being robust to variability in dropout, UniPath provides consistency and scalability in estimating gene-set enrichment scores for every cell. UniPath's approach of predicting the temporal order of single cells using their gene-set activity scores enables suppression of known covariates. UniPath-based analysis of the mouse cell atlas yielded surprising, albeit biologically meaningful, co-clustering of cell types from distant organs and helped in annotating many unlabeled cells. By enabling unconventional analysis, UniPath also proves useful in inferring context-specific regulation in cancer cells.


2021 ◽  
Vol 12 ◽  
Author(s):  
Simon Haile ◽  
Richard D. Corbett ◽  
Veronique G. LeBlanc ◽  
Lisa Wei ◽  
Stephen Pleasance ◽  
...  

RNA sequencing (RNAseq) has been widely used to generate bulk gene expression measurements collected from pools of cells. Only relatively recently have single-cell RNAseq (scRNAseq) methods provided opportunities for gene expression analyses at the single-cell level, allowing researchers to study heterogeneous mixtures of cells at unprecedented resolution. Tumors tend to be composed of heterogeneous cellular mixtures and are frequently the subjects of such analyses. Extensive method developments have led to several protocols for scRNAseq but, owing to the small amounts of RNA in single cells, technical constraints have required compromises. For example, the majority of scRNAseq methods are limited to sequencing only the 3′ or 5′ termini of transcripts. Other protocols that facilitate full-length transcript profiling tend to capture only polyadenylated mRNAs and are generally limited to processing only 96 cells at a time. Here, we address these limitations and present a novel protocol that allows for the high-throughput sequencing of full-length, total RNA at single-cell resolution. We demonstrate that our method produced strand-specific sequencing data for both polyadenylated and non-polyadenylated transcripts, enabled the profiling of transcript regions beyond only transcript termini, and yielded data rich enough to allow identification of cell types from heterogeneous biological samples.

