The transcription factor STAT5 binds to distinct super-enhancer sites and controls Lrrc32 expression in a prominent autoimmune and allergic disease risk locus

Mapping Intimacies ◽

10.1101/2020.06.13.150177 ◽

2020 ◽

Author(s):

Lothar Hennighausen ◽

Hye Kyung Lee

Keyword(s):

Transcription Factor ◽

Genetic Variants ◽

Binding Sites ◽

Disease Risk ◽

Mammary Epithelium ◽

Cell Type ◽

Super Enhancer ◽

Functional Understanding ◽

Risk Locus ◽

Definition Of

Summary: Genetic variants associated with diseases are enriched in genomic sequences linked to regulatory regions, such as enhancers, super-enhancers and possibly repressors, that control nearby and distant genes. A known allergic and autoimmune risk locus at chromosome 11q13.5 [1,2] is associated with the LRRC32 gene, which encodes GARP, a protein critical for TGF-β delivery [3]. This region coincides with a candidate enhancer that was predicted by the presence of activating chromatin marks and contains a polymorphism significantly associated with GARP expression on CD4+CD127-CD25+ Treg cells [4]. In the mouse, binding of the cytokine-induced transcription factor STAT5 was detected at two sites within the expansive candidate enhancer region, and a 2.3 kb deletion resulted in reduced Lrrc32 expression [4]. However, a clear definition of the enhancer units controlled by STAT5 and a functional understanding of STAT5 in the regulation of Lrrc32 are needed. Here we use high-resolution ChIP-seq and identify three STAT5 binding sites within the Lrrc32 super-enhancer, one shared between Treg cells and mammary epithelium and one specific to each respective cell type. Using mice that express only 10% of normal STAT5 levels, we demonstrate the defining contribution of STAT5 in the activation of the Lrrc32 super-enhancer.


Definition of the transcription factor TdIF1 consensus-binding sequence through genomewide mapping of its binding sites

Genes to Cells ◽

10.1111/gtc.12216 ◽

2015 ◽

Vol 20 (3) ◽

pp. 242-254 ◽

Cited By ~ 3

Author(s):

Kotaro Koiwai ◽

Takashi Kubota ◽

Nobuhisa Watanabe ◽

Katsutoshi Hori ◽

Osamu Koiwai ◽

...

Keyword(s):

Transcription Factor ◽

Binding Sites ◽

Consensus Binding Sequence ◽

Definition Of ◽

Binding Sequence


Transcription Factor Binding Sites Are Genetic Determinant of Retroviral Integration in the Human Genome

Blood ◽

10.1182/blood.v112.11.820.820 ◽

2008 ◽

Vol 112 (11) ◽

pp. 820-820

Author(s):

Claudia Cattoglio ◽

Barbara Felice ◽

Davide Cittaro ◽

Annarita Miccio ◽

Giuliana Ferrari ◽

...

Keyword(s):

Transcription Factor ◽

Site Selection ◽

Binding Sites ◽

High Frequency ◽

Cell Type ◽

Retroviral Integration ◽

Target Site Selection ◽

Integration Sites ◽

Cell Type Specific ◽

Genomic Regions

Abstract Gamma-retroviruses and lentiviruses integrate non-randomly in mammalian genomes, with specific preferences for active chromatin, promoters and regulatory regions. Gamma-retroviral gene transfer vectors show a high frequency of insertional gene activation, and may induce neoplastic or pre-neoplastic clonal expansions in patients treated with genetically modified cells. An analysis of >10,000 integration sites of a Moloney leukemia virus (MLV)-derived vector in human CD34+ hematopoietic progenitors showed that genes involved in the control of growth, differentiation and development of the hematopoietic and immune system are targeted at high frequency and enriched in integration hot spots, suggesting that the cell gene expression program is instrumental in directing MLV integration. To investigate the role of transcriptional regulatory networks in retroviral target site selection, we analyzed the distribution of transcription factor binding sites (TFBSs) flanking >4,000 MLV- and HIV-derived proviruses in human hematopoietic and non-hematopoietic cells. We show that MLV vectors integrate in genomic regions enriched in cell-type specific subsets of TFBSs, independently of their relative position with respect to genes and transcription start sites. Conversely, HIV vectors appear to avoid TFBS-rich genomic regions. Hierarchical clustering and a principal component analysis of TFBSs flanking retroviral integration sites in CD34+ and HeLa cells showed that TFBS subsets are vector- and cell type-specific. Analysis of sequences flanking the integration sites of vectors carrying mutations in their long terminal repeats (LTRs), and of HIV vectors packaged with an MLV integrase, indicates that the MLV integrase and LTR enhancer are the viral determinants of the selection of TFBS-rich regions in the genome. 
Chromatin immunoprecipitation analysis shows that transcription factors bind the LTR enhancers in the cell nucleus before integration, and may therefore synergize with the integrase in tethering retroviral pre-integration complexes to transcriptionally active regulatory regions. This study identifies TFBSs as differential genomic determinants of retroviral target site selection in the human genome, and indicates that gamma-retroviruses and lentiviruses have evolved dramatically different strategies to interact with the host cell chromatin. These differences predict a higher risk in using gamma-retroviral vs. lentiviral vectors for human gene therapy applications, independently from the design of the vector and the transgene expression cassette.


Deep Mining of Natural Genetic Variation in Erythroid Cells Reveals New Insights about In Vivo Transcription Factor Binding and Chromatin Accessibility

Blood ◽

10.1182/blood.v128.22.3879.3879 ◽

2016 ◽

Vol 128 (22) ◽

pp. 3879-3879

Author(s):

Vivek Behera ◽

Perry Evans ◽

Carolyne J Face ◽

Laavanya Sankaranarayanan ◽

Gerd A. Blobel

Keyword(s):

Transcription Factor ◽

Genetic Variation ◽

Transcription Factors ◽

Cell Line ◽

Cell Lines ◽

Genetic Variants ◽

Binding Sites ◽

Chromatin Accessibility ◽

Erythroid Cell

Abstract Erythroid transcription factors (TFs) control gene expression programs, lineage decisions, and disease outcomes. How transcription factors contact DNA has been studied extensively in vitro, but in vivo binding characteristics are less well understood as they are influenced in a reciprocal manner by chromatin accessibility and neighboring transcription factors. Here, we present a comparative analysis approach that takes advantage of non-coding sequence variation between functionally equivalent erythroid cell lines to conduct an in-depth analysis of erythroid TF binding profiles and chromatin features. Specifically, we analyzed ChIP-seq datasets to identify millions of genetic non-coding variants between the mouse erythroleukemia cell line (MEL), a GATA1-inducible erythroid progenitor cell line (G1E-ER4), and primary murine erythroblast cells. We found that while these cell lines are highly positively correlated in chromatin features, larger differences in TF binding intensity are correlated with higher degrees of genetic variation between cell lines. We next examined discriminatory genetic variants between the cell lines that are located in ChIP-seq peaks of the erythroid transcription factor GATA1. Hundreds of such variants fall within GATA1 motifs. Differential GATA1 binding intensities associated with the variants revealed nucleotide positions that contribute most to in vivo GATA1 chromatin occupancy and identified which alternative nucleotides are most likely to disrupt binding. Notably, this additional information about GATA1's in vivo nucleotide binding preferences improved prediction of GATA1 binding sites genome-wide. We applied similar approaches to determine the bp-resolution in vivo binding preferences of TAL1/SCL and CTCF. We additionally identified thousands of discriminatory genetic variants within GATA1 sites that fall outside canonical GATA elements but within binding sites of other known TFs. 
Association of these variants with differential GATA1 binding intensities revealed that the hematopoietic transcription factors TAL1/SCL and KLF1 positively regulate GATA1 chromatin occupancy. Strikingly, we identified a number of motifs not previously implicated in cooperating with GATA1 that positively impact GATA1 chromatin binding. Notably, we also defined motifs associated with negative regulation of GATA1 chromatin occupancy. Applying a similar analysis to TAL1/SCL and CTCF revealed additional motifs involved in regulating the chromatin occupancy of these TFs. Finally, we associated discriminatory genetic variation between erythroid cell lines with large changes in sub-kb-scale DNase hypersensitivity. We found that single base pair substitutions within or near a number of erythroid TF motifs, including that for the RUNX family of nuclear factors, are strongly associated with changes in chromatin accessibility. Our findings use novel methods in comparative ChIP-seq and DNase-seq analysis to reveal new insights about the genetic basis for erythroid TF chromatin occupancy and chromatin accessibility. Disclosures No relevant conflicts of interest to declare.


Cell-type specificity of ChIP-predicted transcription factor binding sites

BMC Genomics ◽

10.1186/1471-2164-13-372 ◽

2012 ◽

Vol 13 (1) ◽

pp. 372 ◽

Cited By ~ 12

Author(s):

Tony Håndstad ◽

Morten Rye ◽

Rok Močnik ◽

Finn Drabløs ◽

Pål Sætrom

Keyword(s):

Transcription Factor ◽

Binding Sites ◽

Transcription Factor Binding Sites ◽

Transcription Factor Binding ◽

Cell Type ◽

Cell Type Specificity ◽

Factor Binding


The qBED track: a novel genome browser visualization for point processes

Bioinformatics ◽

10.1093/bioinformatics/btaa771 ◽

2020 ◽

Author(s):

Arnav Moudgil ◽

Daofeng Li ◽

Silas Hsu ◽

Deepak Purushotham ◽

Ting Wang ◽

...

Keyword(s):

Transcription Factor ◽

Single Cell ◽

Binding Sites ◽

Point Processes ◽

Source Code ◽

Transcription Factor Binding Sites ◽

Genome Browser ◽

Genomic Context ◽

Factor Binding ◽

Definition Of

Abstract Summary: Transposon calling cards is a genomic assay for identifying transcription factor binding sites in both bulk and single cell experiments. Here, we describe the qBED format, an open, text-based standard for encoding and analyzing calling card data. In parallel, we introduce the qBED track on the WashU Epigenome Browser, a novel visualization that enables researchers to inspect calling card data in their genomic context. Finally, through examples, we demonstrate that qBED files can be used to visualize non-calling card datasets, such as Combined Annotation-Dependent Depletion scores and GWAS/eQTL hits, and thus may have broad utility to the genomics community. Availability and implementation: The qBED track is available on the WashU Epigenome Browser (http://epigenomegateway.wustl.edu/browser), beginning with version 46. Source code for the WashU Epigenome Browser with qBED support is available on GitHub (http://github.com/arnavm/eg-react and http://github.com/lidaof/eg-react). A complete definition of the qBED format is available as part of the WashU Epigenome Browser documentation (https://eg.readthedocs.io/en/latest/tracks.html#qbed-track). We have also released a tutorial on how to upload qBED data to the browser (http://dx.doi.org/10.17504/protocols.io.bca8ishw).
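As a rough illustration of what a text-based, BED-like point-process format could look like to a program, the sketch below parses tab-delimited records. The column layout used here (chromosome, start, end, value, then optional strand and annotation) is an assumption made for this example and is not taken from the abstract; the browser documentation linked above is the authoritative definition. The names `QBedRecord` and `parse_qbed` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class QBedRecord:
    """One record; the field layout is an assumption for illustration."""
    chrom: str
    start: int
    end: int
    value: float
    strand: Optional[str] = None
    annotation: Optional[str] = None


def parse_qbed(lines: Iterable[str]) -> Iterator[QBedRecord]:
    """Yield records from tab-delimited lines, skipping blanks and '#' comments."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumed layout: 4 required columns plus up to 2 optional ones.
        if not 4 <= len(fields) <= 6:
            raise ValueError(f"expected 4 to 6 columns, got {len(fields)}: {line!r}")
        yield QBedRecord(
            chrom=fields[0],
            start=int(fields[1]),
            end=int(fields[2]),
            value=float(fields[3]),
            strand=fields[4] if len(fields) > 4 else None,
            annotation=fields[5] if len(fields) > 5 else None,
        )
```

A call such as `list(parse_qbed(open("example.qbed")))` would load a whole file into memory (the file name is a placeholder); for large datasets, iterating over the generator directly avoids holding every record at once.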


Disease-associated genetic variants in the regulatory regions of human genes: mechanisms of action on transcription and genomic resources for dissecting these mechanisms

Vavilov Journal of Genetics and Breeding ◽

10.18699/vj21.003 ◽

2021 ◽

Vol 25 (1) ◽

pp. 18-29

Author(s):

E. V. Ignatieva ◽

E. A. Matrosova

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Genetic Variants ◽

Functional Activity ◽

Binding Sites ◽

Regulatory Elements ◽

Transcription Factor Binding ◽

Omics Technologies ◽

Noncoding Regions ◽

Factor Binding

Whole genome and whole exome sequencing technologies play a very important role in studies of the genetic aspects of the pathogenesis of various diseases. The extensive use of genome-wide and exome-wide association study methodology (GWAS and EWAS) has made it possible to identify a large number of genetic variants associated with diseases. This information is accumulated in databases such as GWAS Central, GWAS Catalog, OMIM, ClinVar, etc. Most of the variants identified by the GWAS technique are located in the noncoding regions of the human genome. According to the ENCODE project, the fraction of regions in the human genome potentially involved in transcriptional control is many times greater than the fraction of coding regions. Thus, genetic variation in noncoding regions of the genome can increase susceptibility to diseases by disrupting various regulatory elements (promoters, enhancers, silencers, insulator regions, etc.). However, identifying the mechanisms by which pathogenic genetic variants influence disease risk is difficult due to the wide variety of regulatory elements. The present review focuses on the molecular genetic mechanisms by which pathogenic genetic variants affect gene expression. Attention is concentrated on the transcriptional level of regulation as the initial step in the expression of any gene. A triggering event mediating the effect of a pathogenic genetic variant on the level of gene expression can be, for example, a change in the functional activity of transcription factor binding sites (TFBSs) or a change in DNA methylation, which, in turn, affects the functional activity of promoters or enhancers. Dissecting the regulatory roles of polymorphic loci has been impossible without close integration of modern experimental approaches with computer analysis of a growing wealth of genetic and biological data obtained using omics technologies. 
The review provides a brief description of a number of the most well-known public genomic information resources containing data obtained using omics technologies, including (1) resources that accumulate data on the chromatin states and the regions of transcription factor binding derived from ChIP-seq experiments; (2) resources containing data on genomic loci, for which allele-specific transcription factor binding was revealed based on ChIP-seq technology; (3) resources containing in silico predicted data on the potential impact of genetic variants on the transcription factor binding sites.


Definition of the binding sites of individual zinc fingers in the transcription factor IIIA-5S RNA gene complex.

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.89.22.10822 ◽

1992 ◽

Vol 89 (22) ◽

pp. 10822-10826 ◽

Cited By ~ 53

Author(s):

K. R. Clemens ◽

X. Liao ◽

V. Wolf ◽

P. E. Wright ◽

J. M. Gottesfeld

Keyword(s):

Transcription Factor ◽

Binding Sites ◽

Zinc Fingers ◽

Gene Complex ◽

5S Rna ◽

Transcription Factor Iiia ◽

Definition Of


Faculty Opinions recommendation of Distinct properties of cell-type-specific and shared transcription factor binding sites.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718122928.793485028 ◽

2013 ◽

Author(s):

Andrew D Sharrocks

Keyword(s):

Transcription Factor ◽

Binding Sites ◽

Transcription Factor Binding Sites ◽

Transcription Factor Binding ◽

Cell Type ◽

Factor Binding ◽

Cell Type Specific


Regulatory SNPs: Altered Transcription Factor Binding Sites Implicated in Complex Traits and Diseases

International Journal of Molecular Sciences ◽

10.3390/ijms22126454 ◽

2021 ◽

Vol 22 (12) ◽

pp. 6454

Author(s):

Arina O. Degtyareva ◽

Elena V. Antontseva ◽

Tatiana I. Merkulova

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Genetic Variants ◽

Binding Sites ◽

Transcription Factor Binding Sites ◽

Transcription Factor Binding ◽

Genome Wide Association Studies ◽

Factor Binding ◽

Genome Wide ◽

Regulatory Snps

The vast majority of the genetic variants (mainly SNPs) associated with various human traits and diseases map to the noncoding part of the genome and are enriched in its regulatory compartment, suggesting that many causal variants may affect gene expression. The leading mechanism of action of these SNPs is the alteration of transcription factor binding, either through creation or disruption of transcription factor binding sites (TFBSs) or through a change in the affinity of these regulatory proteins for their cognate sites. In this review, we first focus on the history of the discovery of regulatory SNPs (rSNPs) and give a systematic description of the existing methodological approaches to their study. Then, we briefly review recent comprehensive examples of rSNPs that have been studied from the discovery of a change in the TFBS sequence caused by a nucleotide substitution, to identification of its effect on target gene expression and, eventually, on phenotype. We also describe state-of-the-art genome-wide approaches to the identification of regulatory variants, including both those that make molecular sense of genome-wide association studies (GWAS) and alternative approaches whose primary goal is to determine the functionality of genetic variants. Among these approaches, special attention is paid to expression quantitative trait loci (eQTL) analysis and the search for allele-specific events in RNA-seq (ASE events) as well as in ChIP-seq, DNase-seq, and ATAC-seq (ASB events) data.
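The idea that a single nucleotide substitution can create or disrupt a TFBS can be made concrete with a small sketch that scores a sequence against a position weight matrix before and after the substitution. This is only an illustration of the general principle, not a method described in the review: the matrix `TOY_PWM` is a hypothetical GATA-like motif with made-up probabilities, the uniform background of 0.25 is an assumption, and the function names are invented for this example.

```python
import math

# Hypothetical toy motif (per-position base probabilities), loosely GATA-like.
# The numbers are illustrative, not measured binding preferences.
TOY_PWM = [
    {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.60, "C": 0.10, "G": 0.20, "T": 0.10},
]
BACKGROUND = 0.25  # assumed uniform base frequency


def log_odds_score(site, pwm=TOY_PWM):
    """Log2 odds of one motif-length site against the uniform background."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal motif length")
    return sum(math.log2(col[base] / BACKGROUND) for base, col in zip(site.upper(), pwm))


def best_score(seq, pwm=TOY_PWM):
    """Highest log-odds score over all forward-strand windows of the sequence."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence shorter than motif")
    return max(log_odds_score(seq[i:i + width], pwm) for i in range(len(seq) - width + 1))


def allele_effect(ref_seq, pos, alt_base, pwm=TOY_PWM):
    """Change in best motif score when ref_seq[pos] (0-based) becomes alt_base."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return best_score(alt_seq, pwm) - best_score(ref_seq, pwm)
```

For instance, `allele_effect("CCAGATAAGG", 5, "C")` is negative because the substitution breaks the `AGATAA` match, which corresponds to the "disruption" case; a positive value would correspond to the creation or strengthening of a site. The sketch scans only the forward strand, so a realistic analysis would also need to score the reverse complement.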
