Global reference mapping and dynamics of human transcription factor footprints

Author(s):  
Jeff Vierstra ◽  
John Lazar ◽  
Richard Sandstrom ◽  
Jessica Halow ◽  
Kristen Lee ◽  
...  

Abstract Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits [1], yet it remains challenging to distinguish variants that impact regulatory function [2]. Genomic DNase I footprinting enables quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin [3–5]. However, to date only a small fraction of such sites have been precisely resolved on the human genome sequence [5]. To enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate at nucleotide resolution ~4.5 million compact genomic elements encoding transcription factor occupancy. We map the fine-scale structure of ~1.6 million DHSs and show that the overwhelming majority are populated by well-spaced sites of single transcription factor:DNA interaction. Cell context-dependent cis-regulation is chiefly executed by wholesale actuation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We show further that the well-described enrichment of disease- and phenotypic trait-associated genetic variants in regulatory regions [1,6] is almost entirely attributable to variants localizing within footprints, and that functional variants impacting transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find that the global density of human genetic variation is markedly increased within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a new framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.

Blood ◽  
2013 ◽  
Vol 122 (21) ◽  
pp. 437-437 ◽  
Author(s):  
Daniel E. Bauer ◽  
Sophia C. Kamran ◽  
Samuel Lessard ◽  
Jian Xu ◽  
Yuko Fujiwara ◽  
...  

Abstract Introduction Genome-wide association studies (GWAS) have ascertained numerous trait-associated common genetic variants localized to regulatory DNA. The hypothesis that regulatory variation accounts for substantial heritability has undergone scarce experimental evaluation. Common variation at BCL11A is estimated to explain ∼15% of the trait variance in fetal hemoglobin (HbF) level but the functional variants remain unknown. Materials and Methods We use chromatin immunoprecipitation (ChIP), DNase I sensitivity and chromosome conformation capture to evaluate the BCL11A locus in mouse and human primary erythroblasts. We extensively genotype 1,263 samples from the Collaborative Study of Sickle Cell Disease within three HbF-associated erythroid DNase I hypersensitive sites (DHSs) at BCL11A. We pyrosequence heterozygous erythroblasts to assess allele-specific transcription factor binding and gene expression. We conduct transgenic analysis by mouse zygotic microinjection and genome editing with transcription activator-like effector nucleases (TALENs) and clustered, regularly interspaced, short palindromic repeats (CRISPR)/CRISPR-associated nuclease 9 (Cas9) RNA-guided nucleases. Results Common genetic variation at BCL11A associated with HbF level lies in noncoding sequences decorated by an erythroid enhancer chromatin signature. Fine-mapping this putative regulatory DNA uncovers a motif-disrupting common variant associated with reduced GATA1 and TAL1 transcription factor binding, modestly diminished BCL11A expression and elevated HbF. This variant, rs1427407, accounts for the HbF association of the previously reported sentinel SNPs. The composite element functions in vivo as a developmental stage-specific lineage-restricted enhancer. Genome editing reveals that the enhancer is required in erythroid but dispensable in B-lymphoid cells for expression of BCL11A. 
We demonstrate species-specific functional components of the composite enhancer in mouse as compared to human erythroid precursor cells. The mouse sequences homologous to the human DHS sufficient to drive reporter activity are dispensable from the mouse composite element, whereas the adjacent DHS, whose human homolog does not direct reporter activity, is absolutely required for BCL11A expression. Conclusions We describe a comprehensive and widely applicable approach, including chromatin mapping followed by fine-mapping, allele-specific ChIP and gene expression studies, and functional analyses, to reveal causal variants and critical elements. We assert that functional validation of regulatory DNA ought to include perturbation of the endogenous genomic context by genome editing and not solely rely on in vitro or ectopic surrogate assays. These results validate the hypothesis that common variation modulates cell type-specific regulatory elements, and reveal that although functional variants themselves may be of modest impact, their harboring elements may be critical for appropriate gene expression. We speculate that species-level functional differences in components of the composite enhancer might partially account for differences in timing of globin gene expression among animals. We suggest that the GWAS-marked BCL11A enhancer represents a highly attractive target for therapeutic genome editing for the major β-hemoglobin disorders. Disclosures: No relevant conflicts of interest to declare.


Blood ◽  
2004 ◽  
Vol 104 (11) ◽  
pp. 1610-1610
Author(s):  
Paresh Vyas ◽  
Boris Guyot ◽  
Veronica Valverde-Garduno ◽  
Eduardo Anguita ◽  
Isla Hamlett ◽  
...  

Abstract Normal differentiation of red cells, platelets and eosinophils from a myeloid progenitor requires expression of the transcription factor GATA1. Moreover, GATA1 expression level influences lineage output; higher levels promote erythromegakaryocytic differentiation and lower levels eosinophil maturation. Conversely, repression of GATA1 expression is required for monocyte/neutrophil development. GATA1 expression is principally controlled transcriptionally. Thus, dissecting the molecular basis of transcriptional control of GATA1 expression will be one important facet in understanding how myeloid lineages are specified. To address this question we sought to identify all DNA sequences important for GATA1 expression. Previous analysis identified 3 murine (m) Gata1 cis-elements (an upstream enhancer, mHS-3.5; a haematopoietic IE promoter; and elements in a GATA1 intron, mHS+3.5) conserved in sequence between human (h) and mouse. These studies also suggested that additional unidentified elements were required for erythroid and eosinophil GATA1 expression. We compared sequence, mapped DNase I hypersensitive sites (HS) and determined histone H3/H4 acetylation over ~120 kb flanking the hGATA1 locus and the corresponding region in mouse to pinpoint cis-elements. Remarkably, despite lying in a ~10 Mb conserved syntenic segment, the chromatin structures of the two GATA1 loci are strikingly different. Two previously unidentified haematopoietic cis-elements, one in each species (mHS-25 and hHS+14), are not conserved in position and sequence and have enhancer activity in erythroid cells. Chromatin immunoprecipitation studies show that both mHS-25 and hHS+14 are bound in vivo in red cells by the transcription factors GATA1, SCL, LMO2 and Ldb1. These findings suggest that some cis-elements regulating human and mouse GATA1 genes differ. 
Further analysis of in vivo transcription factor occupancy at GATA1 cis-elements in primary mouse eosinophils and red cells, megakaryocytic cells (L8057) and control fibroblasts shows lineage- and cis-element-specific patterns of regulator binding (see table below). In red cells and megakaryocytes, GATA1, SCL, LMO2 and Ldb1 bind at two regulatory elements (mHS-25 and mHS-3.5). Interestingly, the megakaryocyte transcriptional regulator Fli1 binds to mHS+3.5 specifically in megakaryocytes. In eosinophils, a different pattern of DNase I HS and transcription factor binding is seen. GATA1, PU.1 and C/EBPε (all of which regulate eosinophil gene expression) bind the IE promoter and/or mHS+3.5. Collectively, these results suggest that lineage-specific GATA1 expression is dependent on combinations of cis-elements and haematopoietic trans-acting factors that are unique for each lineage.

Table. DNase I hypersensitive sites and transcription factor occupancy at mGATA1 cis-elements. m: mouse, h: human, *: HS identified in this study, TF: transcription factor.

Cell type | mHS-26/-25* | mHS-3.5 | mIE | mHS+3.5
Primary erythroid cells | HS present; GATA1, SCL, LMO2, Ldb1 | HS present; GATA1, SCL, LMO2, Ldb1 | HS present; GATA1 | HS present; GATA1
Megakaryocytic cells | HS present; GATA1, SCL, LMO2, Ldb1 | HS present; GATA1, SCL, LMO2, Ldb1 | HS present; GATA1 | HS present; GATA1 and Fli1
Primary eosinophils | HS absent | HS present; no TF detected | HS present; GATA1 and C/EBPε | HS present; GATA1, C/EBPε and PU.1
Fibroblasts | HS absent | HS absent | HS absent | HS absent


Blood ◽  
2006 ◽  
Vol 108 (11) ◽  
pp. 1813-1813
Author(s):  
James G. Taylor ◽  
Gila Idelman ◽  
Ron Tongbai ◽  
Renee A. Chen ◽  
Cynthia M. Haggerty ◽  
...  

Abstract Cell adhesion molecules direct the inflammatory response through cell-cell interactions between leukocytes and endothelial cells, leukocyte trafficking, and cell signaling. Vascular cell adhesion molecule 1 (VCAM1) is a member of the CAM immunoglobulin superfamily whose expression at the cell surface is highly tissue and mitogen specific. Because genetic variation in VCAM1 has been implicated in the pathogenesis of a variety of human diseases, we sought a method for rapidly identifying functional simple nucleotide polymorphisms (SNPs) present within discrete non-coding regulatory regions of this gene. Using a novel transfection-based transcriptional pathway profiling method, we show that several uncommon variant haplotypes are functionally hyperactive. Initially, DNA sequencing across the 2.5 kb VCAM1 promoter in a screening population of 40 healthy African Americans identified 21 SNPs that define 18 different promoter haplotypes. Eight of these promoter haplotypes, defined by 13 SNPs and representing 80% of the haplotypes present in the screening population, were then evaluated for response to 5-hour stimulation with known mitogens in transient transfections of Jurkat T cells. Transcriptional reporter activities in response to combinations of phorbol ester, lectins and ionophore were clustered by inducibility using principal component analysis (PCA). Three uncommon haplotypes, each with a frequency of less than 5% in the controls, were clearly identified by PCA as hyperactive in response to mitogens. Next, in vitro haplotype activity was correlated with a bioinformatic analysis of transcription factor binding site gains or losses created by 11 of the 13 variant nucleotide sites and unique to each haplotype. Using this approach, a low frequency regulatory allele (A-540G), present on one of the hyperactive VCAM1 promoter haplotypes (haplotype 5), was identified as a putative binding site for the transcription factor ETS2. 
The selective gain of the ETS binding site was confirmed in vivo by chromatin immunoprecipitation experiments comparing two lymphoblastoid cell lines (LCLs) of known genotype. An LCL heterozygous for the hyperactive VCAM1 haplotype 5 demonstrated a nearly 8-fold enrichment in ETS2 complexes at the VCAM1 promoter relative to a cell line homozygous for the wild type alleles. Together, these results suggest that some variants in the VCAM1 promoter alter function by changing the affinity of specific transcription factors for their DNA binding sites. This study provides the first functional evaluation of VCAM1 promoter polymorphisms and establishes a hypothetical foundation for investigating specific VCAM1 functional variants in the pathogenesis of complex genetic diseases that disproportionately afflict African Americans, including hemoglobinopathies, asthma, and hematopoietic malignancies. Overall, this study demonstrates the feasibility of combining a series of genetic, bioinformatic, and wet lab methodologies for rapid identification of functional genetic variants within a larger pool of SNPs present across non-coding and regulatory regions of human genes.


PLoS Genetics ◽  
2013 ◽  
Vol 9 (12) ◽  
pp. e1003994 ◽  
Author(s):  
Angelika Feldmann ◽  
Robert Ivanek ◽  
Rabih Murr ◽  
Dimos Gaidatzis ◽  
Lukas Burger ◽  
...  

2014 ◽  
Author(s):  
Nicholas E. Banovich ◽  
Xun Lan ◽  
Graham McVicker ◽  
Bryce van de Geijn ◽  
Jacob F. Degner ◽  
...  

Abstract DNA methylation is an important epigenetic regulator of gene expression. Recent studies have revealed widespread associations between genetic variation and methylation levels. However, the mechanistic links between genetic variation and methylation remain unclear. To begin addressing this gap, we collected methylation data at ∼300,000 loci in lymphoblastoid cell lines (LCLs) from 64 HapMap Yoruba individuals, and genome-wide bisulfite sequence data in ten of these individuals. We identified (at an FDR of 10%) 13,915 cis methylation QTLs (meQTLs), i.e., CpG sites in which changes in DNA methylation are associated with genetic variation at proximal loci. We found that meQTLs are frequently associated with changes in methylation at multiple CpGs across regions of up to 3 kb. Interestingly, meQTLs are also frequently associated with variation in other properties of gene regulation, including histone modifications, DNase I accessibility, chromatin accessibility, and expression levels of nearby genes. These observations suggest that genetic variants may lead to coordinated molecular changes in all of these regulatory phenotypes. One plausible driver of coordinated changes in different regulatory mechanisms is variation in transcription factor (TF) binding. Indeed, we found that SNPs that change predicted TF binding affinities are significantly enriched for associations with DNA methylation at nearby CpGs. Author Summary DNA methylation is an important epigenetic mark that contributes to many biological processes including the regulation of gene expression. Genetic variation has been associated with quantitative changes in DNA methylation (meQTLs). We identified thousands of meQTLs using an assay that allowed us to measure methylation levels at around 300 thousand cytosines. We found that meQTLs are enriched for loci that are also associated with quantitative changes in gene expression, DNase I hypersensitivity, PolII occupancy, and a number of histone marks. 
This suggests that many molecular events are likely regulated in concert. Finally, we found that changes in transcription factor binding as well as transcription factor abundance are associated with changes in DNA methylation near transcription factor binding sites. This work contributes to our understanding of the regulation of DNA methylation in the larger context of the gene regulatory landscape.
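
The abstract above reports meQTLs "at an FDR of 10%" but does not say which procedure was used. As a rough illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], fdr: float = 0.10) -> List[bool]:
    """Flag each test as significant (True) or not (False) at the given FDR.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract does not
    state which FDR method the authors applied.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the tests, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # All tests ranked at or below the cutoff are called significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical p-values for five SNP-CpG pairs (illustrative only).
example_p = [0.001, 0.04, 0.03, 0.20, 0.60]
print(benjamini_hochberg(example_p, fdr=0.10))
```

Because the procedure is step-up, a test can be called significant even when its own p-value exceeds its rank threshold, provided a larger p-value further down the ranking passes its threshold.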


2020 ◽  
Author(s):  
Charles E. Breeze ◽  
John Lazar ◽  
Tim Mercer ◽  
Jessica Halow ◽  
Ida Washington ◽  
...  

Abstract Early mammalian development is orchestrated by genome-encoded regulatory elements populated by a changing complement of regulatory factors, creating a dynamic chromatin landscape. To define the spatiotemporal organization of regulatory DNA landscapes during mouse development and maturation, we generated nucleotide-resolution DNA accessibility maps from 15 tissues sampled at 9 intervals spanning post-conception day 9.5 through early adult, and integrated these with 41 adult-stage DNase-seq profiles to create a global atlas of mouse regulatory DNA. Collectively, we delineated >1.8 million DNase I hypersensitive sites (DHSs), with the vast majority displaying temporal and tissue-selective patterning. Here we show that tissue regulatory DNA compartments show sharp embryonic-to-fetal transitions characterized by wholesale turnover of DHSs and progressive domination by a diminishing number of transcription factors. We show further that aligning mouse and human fetal development on a regulatory axis exposes disease-associated variation enriched in early intervals lacking human samples. Our results provide an expansive new resource for decoding mammalian developmental regulatory programs.


Science ◽  
2020 ◽  
Vol 368 (6498) ◽  
pp. 1449-1454 ◽  
Author(s):  
Andrew B. Stergachis ◽  
Brian M. Debo ◽  
Eric Haugen ◽  
L. Stirling Churchman ◽  
John A. Stamatoyannopoulos

Gene regulation is chiefly determined at the level of individual linear chromatin molecules, yet our current understanding of cis-regulatory architectures derives from fragmented sampling of large numbers of disparate molecules. We developed an approach for precisely stenciling the structure of individual chromatin fibers onto their composite DNA templates using nonspecific DNA N6-adenine methyltransferases. Single-molecule long-read sequencing of chromatin stencils enabled nucleotide-resolution readout of the primary architecture of multikilobase chromatin fibers (Fiber-seq). Fiber-seq exposed widespread plasticity in the linear organization of individual chromatin fibers and illuminated principles guiding regulatory DNA actuation, the coordinated actuation of neighboring regulatory elements, single-molecule nucleosome positioning, and single-molecule transcription factor occupancy. Our approach and results open new vistas on the primary architecture of gene regulation.


2018 ◽  
Author(s):  
Jia-Hsin Huang ◽  
Ryan Shun-Yuen Kwan ◽  
Zing Tsung-Yeh Tsai ◽  
Huai-Kuang Tsai

Abstract Changes in cis-regulatory DNA sequences and transcription factor (TF) repertoires are major sources of gene regulatory evolution in eukaryotes. However, it is currently unclear how dynamic changes in DNA sequences introduce varying levels of divergence in TF binding motifs across the genome over evolutionary time. Here, we estimated the evolutionary divergence level of TF binding motifs and quantified their occurrences in DNase I hypersensitive sites. Results from our in silico motif scan and empirical TF-ChIP (chromatin immunoprecipitation) data demonstrate that divergent motifs tend to be introduced at the borders of cis-regulatory regions, likely accompanying the expansion of these regions over evolutionary time. Accordingly, we propose that expansion of cis-regulatory regions through the incorporation of divergent motifs provides a rationale for the evolutionary divergence of regulatory circuits.


Circulation ◽  
2014 ◽  
Vol 129 (suppl_1) ◽  
Author(s):  
Marco Dauriz ◽  
Belinda K Cornes ◽  
Jennifer A Brody ◽  
Naghmeh Nikpoor ◽  
Alanna C Morrison ◽  
...  

Aim: Common variation at the polygenic 11p11.2 locus has been associated with fasting glucose (FG) and insulin (FI) in genome-wide association studies. Further insights into the genetic pathways involved in glucose homeostasis and type 2 diabetes pathogenesis might rely on discovery of functional variants in genes or regulatory regions. Hypothesis: We hypothesized that high-throughput next-generation deep sequencing at the polygenic 11p11.2 locus might identify additional rare, potentially functional variants influencing FG and/or FI levels. Methods: We deeply sequenced (mean depth 38X) 16.1 kb across the 11p11.2 locus in 3,566 non-diabetic individuals enrolled in the CHARGE Consortium (http://web.chargeconsortium.com/). We analyzed rare variants (minor allele frequency [MAF] <1%) in five gene regions, including MADD, ACP2, NR1H3, MYBPC3 and SPI1, for association with FI or FG using the Sequence Kernel Association Test (SKAT). Predicted regulatory variants were then analyzed by conditioning in SKAT on two previously known variants at the MADD locus (rs7944584 and rs10838687, associated with FG and FI, respectively). All analyses were adjusted for age, sex and study design variables. FI (adjusted for BMI) was natural log-transformed to improve normality. Further functional studies were performed in human HepG2 hepatoma cells to unravel possible mechanistic pathways linked to functional variants. Results: We identified 653 allelic variants (including the known rs7944584 and rs10838687), 79.9% of which were rare and novel. At NR1H3, 53 rare variants were jointly associated with FI (p = 2.7 × 10⁻³); of these, seven were predicted to have regulatory function. Conditional analysis suggested more than two independent signals at the 11p11.2-MADD locus. 
One predicted regulatory variant, chr11:47227430 (hg18; MAF=0.0007), contributed 20.6% to the overall SKAT score at NR1H3 and lies in intron 2 of NR1H3, in a predicted binding site for FOXA1, a transcription factor associated with insulin regulation. Functional studies in HepG2 cells showed that the chr11:47227430 variant disrupts FOXA1 binding and significantly reduces FOXA1-dependent transcriptional activity. Conclusions/interpretation: We confirmed known common FI-associated variants near the MADD gene and identified rare variation in an intron of NR1H3 associated with FI. Functional in vitro studies showed that the rare A allele of the chr11:47227430 variant at the NR1H3 locus might theoretically affect insulin regulation by interfering with transcription factor FOXA1 binding and, consequently, FOXA1-dependent transcriptional activity. Our targeted deep resequencing approach proved valuable in identifying new rare functional variants; quantitation of their actual impact on glucose homeostasis needs further confirmation.


Diabetes ◽  
2019 ◽  
Vol 68 (Supplement 1) ◽  
pp. 1715-P
Author(s):  
YUNHUA L. MULLER ◽  
SAMANTHA E. DAY ◽  
SAYUKO KOBES ◽  
WILLIAM C. KNOWLER ◽  
ROBERT L. HANSON ◽  
...  
