Fine-Mapping and Genome Editing Reveal An Essential Erythroid Enhancer At The HbF-Associated BCL11A Locus

Blood ◽  
2013 ◽  
Vol 122 (21) ◽  
pp. 437-437 ◽  
Author(s):  
Daniel E. Bauer ◽  
Sophia C. Kamran ◽  
Samuel Lessard ◽  
Jian Xu ◽  
Yuko Fujiwara ◽  
...  

Abstract

Introduction: Genome-wide association studies (GWAS) have ascertained numerous trait-associated common genetic variants localized to regulatory DNA. The hypothesis that regulatory variation accounts for substantial heritability has undergone scarce experimental evaluation. Common variation at BCL11A is estimated to explain ∼15% of the trait variance in fetal hemoglobin (HbF) level but the functional variants remain unknown.

Materials and Methods: We use chromatin immunoprecipitation (ChIP), DNase I sensitivity and chromosome conformation capture to evaluate the BCL11A locus in mouse and human primary erythroblasts. We extensively genotype 1,263 samples from the Collaborative Study of Sickle Cell Disease within three HbF-associated erythroid DNase I hypersensitive sites (DHSs) at BCL11A. We pyrosequence heterozygous erythroblasts to assess allele-specific transcription factor binding and gene expression. We conduct transgenic analysis by mouse zygotic microinjection and genome editing with transcription activator-like effector nucleases (TALENs) and clustered, regularly interspaced, short palindromic repeats (CRISPR)/CRISPR-associated nuclease 9 (Cas9) RNA-guided nucleases.

Results: Common genetic variation at BCL11A associated with HbF level lies in noncoding sequences decorated by an erythroid enhancer chromatin signature. Fine-mapping this putative regulatory DNA uncovers a motif-disrupting common variant associated with reduced GATA1 and TAL1 transcription factor binding, modestly diminished BCL11A expression and elevated HbF. This variant, rs1427407, accounts for the HbF association of the previously reported sentinel SNPs. The composite element functions in vivo as a developmental stage-specific lineage-restricted enhancer. Genome editing reveals that the enhancer is required in erythroid but dispensable in B-lymphoid cells for expression of BCL11A. We demonstrate species-specific functional components of the composite enhancer in mouse as compared to human erythroid precursor cells. The mouse sequences homologous to the human DHS sufficient to drive reporter activity are dispensable from the mouse composite element, whereas the adjacent DHS, whose human homolog does not direct reporter activity, is absolutely required for BCL11A expression.

Conclusions: We describe a comprehensive and widely applicable approach, including chromatin mapping followed by fine-mapping, allele-specific ChIP and gene expression studies, and functional analyses, to reveal causal variants and critical elements. We assert that functional validation of regulatory DNA ought to include perturbation of the endogenous genomic context by genome editing and not solely rely on in vitro or ectopic surrogate assays. These results validate the hypothesis that common variation modulates cell type-specific regulatory elements, and reveal that although functional variants themselves may be of modest impact, their harboring elements may be critical for appropriate gene expression. We speculate that species-level functional differences in components of the composite enhancer might partially account for differences in timing of globin gene expression among animals. We suggest that the GWAS-marked BCL11A enhancer represents a highly attractive target for therapeutic genome editing for the major β-hemoglobin disorders.

Disclosures: No relevant conflicts of interest to declare.

Author(s):  
Jeff Vierstra ◽  
John Lazar ◽  
Richard Sandstrom ◽  
Jessica Halow ◽  
Kristen Lee ◽  
...  

Abstract Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits [1], yet it remains challenging to distinguish variants that impact regulatory function [2]. Genomic DNase I footprinting enables quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin [3–5]. However, to date only a small fraction of such sites have been precisely resolved on the human genome sequence [5]. To enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate at nucleotide resolution ~4.5 million compact genomic elements encoding transcription factor occupancy. We map the fine-scale structure of ~1.6 million DHS and show that the overwhelming majority is populated by well-spaced sites of single transcription factor:DNA interaction. Cell context-dependent cis-regulation is chiefly executed by wholesale actuation of accessibility at regulatory DNA versus by differential transcription factor occupancy within accessible elements. We show further that the well-described enrichment of disease- and phenotypic trait-associated genetic variants in regulatory regions [1,6] is almost entirely attributable to variants localizing within footprints, and that functional variants impacting transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find that the global density of human genetic variation is markedly increased within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a new framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.


2016 ◽  
Vol 135 (5) ◽  
pp. 485-497 ◽  
Author(s):  
Marco Cavalli ◽  
Gang Pan ◽  
Helena Nord ◽  
Ola Wallerman ◽  
Emelie Wallén Arzt ◽  
...  

2021 ◽  
Author(s):  
Thomas Hartwig ◽  
Michael Banf ◽  
Gisele Prietsch ◽  
Julia Engelhorn ◽  
Jinliang Yang ◽  
...  

Abstract Variation in transcriptional regulation is a major cause of phenotypic diversity. Genome-wide association studies (GWAS) have shown that most functional variants reside in non-coding regions, where they potentially affect transcription factor (TF) binding and chromatin accessibility to alter gene expression. Pinpointing such regulatory variations, however, remains challenging. Here, we developed a hybrid allele-specific chromatin binding sequencing (HASCh-seq) approach and identified variations in target binding of the brassinosteroid (BR) responsive transcription factor ZmBZR1 in maize. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) in B73xMo17 F1s identified thousands of target genes of ZmBZR1. Allele-specific ZmBZR1 binding (ASB) was observed for about 14.3% of target genes. It correlated with over 550 loci containing sequence variation in BZR1-binding motifs and over 340 loci with haplotype-specific DNA methylation, linking genetic and epigenetic variations to ZmBZR1 occupancy. Comparison with GWAS data linked hundreds of ASB loci to important yield, growth, and disease-related traits. Our study provides a robust method for analyzing genome-wide variations of transcription factor occupancy and identified genetic and epigenetic variations of the BR response transcription network in maize.
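As an illustration of how allele-specific binding at a single heterozygous site might be scored, the following is a minimal sketch of an exact two-sided binomial test on allelic read counts. It is not the HASCh-seq pipeline described in the abstract: the function name `binom_two_sided_p`, the read-count inputs, and the null ratio of 0.5 (which assumes no reference mapping bias) are all assumptions made for illustration.

```python
from math import comb


def binom_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    ref_reads / alt_reads are ChIP-seq read counts over the two alleles of a
    heterozygous variant (hypothetical inputs). p_null is the expected
    reference-allele fraction under no allele-specific binding; 0.5 is an
    assumption and would normally be calibrated against input DNA.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    # Probability of every possible reference-allele count under the null.
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the outcomes that are at most as likely as the observed one.
    p = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, p)
```

In a genome-wide setting the per-site p-values would still need a multiple-testing correction, and sites with few reads have little power, so a minimum coverage filter is a common design choice.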


2020 ◽  
Author(s):  
Swann Floc’hlay ◽  
Emily Wong ◽  
Bingqing Zhao ◽  
Rebecca R. Viales ◽  
Morgane Thomas-Chollier ◽  
...  

Abstract Precise patterns of gene expression are driven by interactions between transcription factors, regulatory DNA sequence, and chromatin. How DNA mutations affecting any one of these regulatory ‘layers’ are buffered or propagated to gene expression remains unclear. To address this, we quantified allele-specific changes in chromatin accessibility, histone modifications, and gene expression in F1 embryos generated from eight Drosophila crosses, at three embryonic stages, yielding a comprehensive dataset of 240 samples spanning multiple regulatory layers. Genetic variation in cis-regulatory elements is common, highly heritable, and surprisingly consistent in its effects across embryonic stages. Much of this variation does not propagate to gene expression. When it does, it acts through H3K4me3 or alternatively through chromatin accessibility and H3K27ac. The magnitude and evolutionary impact of mutations is influenced by a gene’s regulatory complexity (i.e. enhancer number), with transcription factors being most robust to cis-acting, and most influenced by trans-acting, variation. Overall, the impact of genetic variation on regulatory phenotypes appears context-dependent even within the constraints of embryogenesis.


2020 ◽  
Author(s):  
Yanyu Liang ◽  
François Aguet ◽  
Alvaro Barbeira ◽  
Kristin Ardlie ◽  
Hae Kyung Im

Abstract Genome-wide association studies (GWAS) have been highly successful in identifying genomic loci associated with complex traits. However, identification of the causal genes that mediate these associations remains challenging, and many approaches integrating transcriptomic data with GWAS have been proposed. Yet no computationally scalable methods currently exist that integrate total and allele-specific gene expression to maximize power to detect genetic effects on gene expression. Here, we describe a unified framework that is scalable to studies with thousands of samples. Using simulations and data from GTEx, we demonstrate an average power gain equivalent to a 29% increase in sample size for genes with sufficient allele-specific read coverage. We provide a suite of freely available tools, mixQTL, mixFine, and mixPred, that apply this framework for mapping of quantitative trait loci, fine-mapping, and prediction.


2014 ◽  
Author(s):  
Nicholas E. Banovich ◽  
Xun Lan ◽  
Graham McVicker ◽  
Bryce van de Geijn ◽  
Jacob F. Degner ◽  
...  

Abstract DNA methylation is an important epigenetic regulator of gene expression. Recent studies have revealed widespread associations between genetic variation and methylation levels. However, the mechanistic links between genetic variation and methylation remain unclear. To begin addressing this gap, we collected methylation data at ∼300,000 loci in lymphoblastoid cell lines (LCLs) from 64 HapMap Yoruba individuals, and genome-wide bisulfite sequence data in ten of these individuals. We identified (at an FDR of 10%) 13,915 cis methylation QTLs (meQTLs), i.e., CpG sites in which changes in DNA methylation are associated with genetic variation at proximal loci. We found that meQTLs are frequently associated with changes in methylation at multiple CpGs across regions of up to 3 kb. Interestingly, meQTLs are also frequently associated with variation in other properties of gene regulation, including histone modifications, DNase I accessibility, chromatin accessibility, and expression levels of nearby genes. These observations suggest that genetic variants may lead to coordinated molecular changes in all of these regulatory phenotypes. One plausible driver of coordinated changes in different regulatory mechanisms is variation in transcription factor (TF) binding. Indeed, we found that SNPs that change predicted TF binding affinities are significantly enriched for associations with DNA methylation at nearby CpGs.

Author Summary DNA methylation is an important epigenetic mark that contributes to many biological processes including the regulation of gene expression. Genetic variation has been associated with quantitative changes in DNA methylation (meQTLs). We identified thousands of meQTLs using an assay that allowed us to measure methylation levels at around 300 thousand cytosines. We found that meQTLs are enriched with loci that are also associated with quantitative changes in gene expression, DNase I hypersensitivity, PolII occupancy, and a number of histone marks. This suggests that many molecular events are likely regulated in concert. Finally, we found that changes in transcription factor binding as well as transcription factor abundance are associated with changes in DNA methylation near transcription factor binding sites. This work contributes to our understanding of the regulation of DNA methylation in the larger context of the gene regulatory landscape.
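To make the "FDR of 10%" threshold concrete, here is a minimal sketch of how a set of per-site association p-values could be filtered at a chosen false discovery rate. The abstract does not state which FDR procedure was used; the Benjamini-Hochberg step-up procedure is swapped in here as an assumption, and the function name `benjamini_hochberg` and its inputs are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.10):
    """Return a list of booleans marking which tests pass at the given FDR.

    Implements the Benjamini-Hochberg step-up rule: find the largest rank k
    such that the k-th smallest p-value is <= (k / m) * fdr, then accept the
    k smallest p-values. The choice of this procedure is an assumption; the
    source only reports the FDR level.
    """
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank  # keep the largest rank that satisfies the rule
    passed = [False] * m
    for idx in order[:cutoff_rank]:
        passed[idx] = True
    return passed
```

For example, `benjamini_hochberg([0.001, 0.04, 0.03, 0.5])` accepts the first three tests and rejects the last, because the third-smallest p-value (0.04) is below its threshold of 0.075 while the largest (0.5) exceeds 0.10.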


2014 ◽  
Vol 34 (suppl_1) ◽  
Author(s):  
Avanthi Raghavan ◽  
Derek Peters ◽  
Nicolas Kuperwasser ◽  
Alexandre Melnikov ◽  
Peter Rogov ◽  
...  

Genome-wide association studies (GWASs) have discovered many novel genetic loci linked to serum lipid levels, yet pinpointing the causal variants remains a major challenge. Expression quantitative trait locus (eQTL) analysis of liver and adipose indicates that many lipid-associated variants influence gene expression in a cis-regulatory manner. To identify causal variants at 57 lipid-associated eQTL loci, we performed a massively parallel reporter assay (MPRA) in which the genomic region surrounding each candidate SNP was coupled to a reporter gene with a unique barcode identifier. This construct pool was transfected into 3T3-L1 adipocytes, and the copy number of each expressed barcode determined by RNAseq. Variants were prioritized according to allele-specific regulatory activity. The top-ranked variant, rs2277862, is associated with total cholesterol and is flanked by the ERGIC3, CPNE1, and CEP250 genes, none of which have previous connections to lipid metabolism. We hypothesized that rs2277862 is causal for the eQTL at the 20q11 locus, and that it lies within a transcriptional regulatory site to influence local gene expression. We identified human pluripotent stem cell (hPSC) lines with different genotypes at rs2277862 and, using CRISPR/Cas genome-editing technology, either deleted the putative regulatory site encompassing rs2277862 or knocked in the alternate SNP allele. Analysis of undifferentiated rs2277862 homozygous major deletion mutants revealed diminished expression of ERGIC3, CPNE1, and CEP250; conversely, heterozygous deletion mutants had increased expression of the same three genes. Our results suggest that in hPSCs the major and minor alleles of rs2277862 have opposing effects on gene expression. In ongoing work, we are validating these findings in differentiated adipocytes and seek to identify the factor(s) that binds the putative regulatory site to mediate allele-specific gene expression. 
We anticipate that this work will offer fresh insight into the mechanisms by which GWAS-implicated SNPs influence expression of causal genes for lipid metabolism and, more broadly, establish a novel methodology for finding causal variants for any phenotype of interest.
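One way to picture "allele-specific regulatory activity" in a barcoded reporter assay is as a log ratio of the two alleles' reporter activities. The sketch below is an illustration under stated assumptions, not the analysis used in this study: it assumes each allele's activity is its RNA barcode count normalized by a plasmid DNA barcode count (the abstract mentions only RNA-seq counts), and the function name `allelic_skew`, its arguments, and the pseudocount are hypothetical.

```python
from math import log2


def allelic_skew(ref_rna, ref_dna, alt_rna, alt_dna, pseudocount=1.0):
    """log2(alt activity / ref activity) for one candidate SNP.

    Each argument is a list of barcode read counts for that allele's
    construct(s). Activity is taken as summed RNA counts over summed DNA
    counts; normalizing by DNA counts is an assumption of this sketch.
    The pseudocount guards against division by zero at low coverage.
    """
    ref_activity = (sum(ref_rna) + pseudocount) / (sum(ref_dna) + pseudocount)
    alt_activity = (sum(alt_rna) + pseudocount) / (sum(alt_dna) + pseudocount)
    return log2(alt_activity / ref_activity)
```

A positive value indicates higher reporter activity for the alternate allele and a negative value indicates higher activity for the reference allele; ranking variants by the absolute value is one simple way to prioritize candidates.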


2021 ◽  
Author(s):  
Carlos A. Villarroel ◽  
Paulo Canessa ◽  
Macarena Bastias ◽  
Francisco A Cubillos

Saccharomyces cerevisiae rewires its transcriptional output to survive stressful environments, such as nitrogen scarcity under fermentative conditions. Although divergence in nitrogen metabolism has been described among natural yeast populations, the impact of regulatory genetic variants modulating gene expression and nitrogen consumption remains to be investigated. Here, we employed an F1 hybrid from two contrasting S. cerevisiae strains, providing a controlled genetic environment to map cis factors involved in the divergence of gene expression regulation in response to nitrogen scarcity. We used a dual approach to obtain genome-wide allele-specific profiles of chromatin accessibility, transcription factor binding, and gene expression through ATAC-seq and RNA-seq. We observed large variability in allele-specific expression and accessibility between the two genetic backgrounds, with a third of these differences specific to a deficient nitrogen environment. Furthermore, we discovered events of allelic bias in gene expression correlating with allelic bias in transcription factor binding solely under nitrogen scarcity, where the majority of these transcription factors orchestrate the Nitrogen Catabolite Repression regulatory pathway and demonstrate a cis x environment-specific response. Our approach allowed us to find cis variants modulating gene expression, chromatin accessibility and allelic differences in transcription factor binding in response to low nitrogen culture conditions.

