Systematic Evaluation of DNA Sequence Variations on in vivo Transcription Factor Binding Affinity

2021 ◽  
Vol 12 ◽  
Author(s):  
Yutong Jin ◽  
Jiahui Jiang ◽  
Ruixuan Wang ◽  
Zhaohui S. Qin

The majority of the single nucleotide variants (SNVs) identified by genome-wide association studies (GWAS) fall outside of the protein-coding regions. Elucidating the functional implications of these variants has been a major challenge. A possible mechanism for functional non-coding variants is that they disrupt canonical transcription factor (TF) binding sites and thereby affect the in vivo binding of the TF. However, their impact varies since many positions within a TF binding motif are not well conserved. Therefore, simply annotating all variants located in putative TF binding sites may overestimate the functional impact of these SNVs. We conducted a comprehensive survey to study the effect of SNVs on TF binding affinity. A sequence-based machine learning method was used to estimate the change in binding affinity for each SNV located inside a putative motif site. From the results obtained on 18 TF binding motifs, we found that there is substantial variation in an SNV’s impact on TF binding affinity. We found that only about 20% of SNVs located inside putative TF binding sites are likely to have a significant impact on TF-DNA binding.
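The abstract does not name the machine learning method, so the following is only a minimal sketch of the underlying idea, using a position weight matrix (PWM) as a simpler stand-in: a substitution at a well-conserved motif position changes the predicted binding score far more than one at a degenerate position. The motif counts, function names and the uniform background frequency are all hypothetical.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds matrix."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, site):
    """Sum the log-odds of the observed base at each motif position."""
    return sum(pwm[i][base] for i, base in enumerate(site))

def snv_delta(pwm, site, pos, alt):
    """Score change when the base at `pos` is replaced by `alt`."""
    mutated = site[:pos] + alt + site[pos + 1:]
    return score(pwm, mutated) - score(pwm, site)

# Hypothetical 4-bp motif counts (illustrative only, not real data).
toy_counts = [
    {"A": 18, "C": 1, "G": 1, "T": 0},   # strongly conserved A
    {"A": 5, "C": 5, "G": 5, "T": 5},    # unconstrained position
    {"A": 0, "C": 0, "G": 20, "T": 0},   # strongly conserved G
    {"A": 6, "C": 4, "G": 4, "T": 6},    # weakly constrained position
]
toy_pwm = log_odds_pwm(toy_counts)

delta_conserved = snv_delta(toy_pwm, "ACGT", 0, "T")   # hits the conserved A
delta_degenerate = snv_delta(toy_pwm, "ACGT", 1, "T")  # hits the unconstrained position
print(delta_conserved, delta_degenerate)
```

In this toy example the SNV at the conserved position lowers the score by more than four log2 units, while the SNV at the unconstrained position leaves it unchanged, which is the kind of variation in impact the abstract describes.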

2019 ◽  
Author(s):  
Sierra S Nishizaki ◽  
Natalie Ng ◽  
Shengcheng Dong ◽  
Robert S Porter ◽  
Cody Morterud ◽  
...  

Abstract Motivation: Genome-wide association studies have revealed that 88% of disease-associated single-nucleotide polymorphisms (SNPs) reside in noncoding regions. However, noncoding SNPs remain understudied, partly because they are challenging to prioritize for experimental validation. To address this deficiency, we developed the SNP effect matrix pipeline (SEMpl). Results: SEMpl estimates transcription factor-binding affinity by observing differences in chromatin immunoprecipitation followed by deep sequencing signal intensity for SNPs within functional transcription factor-binding sites (TFBSs) genome-wide. By cataloging the effects of every possible mutation within the TFBS motif, SEMpl can predict the consequences of SNPs to transcription factor binding. This knowledge can be used to identify potential disease-causing regulatory loci. Availability and implementation: SEMpl is available from https://github.com/Boyle-Lab/SEM_CPP. Supplementary information: Supplementary data are available at Bioinformatics online.
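To illustrate what "cataloging the effects of every possible mutation within the TFBS motif" could look like, here is a minimal sketch. It is not SEMpl's actual algorithm or code: the function names, the toy signal values and the use of a log2 ratio of mean signal are assumptions made purely for illustration.

```python
import math
from statistics import mean

BASES = "ACGT"

def single_nucleotide_variants(site):
    """Yield (position, alt_base, mutated_site) for every possible substitution."""
    for pos, ref in enumerate(site):
        for alt in BASES:
            if alt != ref:
                yield pos, alt, site[:pos] + alt + site[pos + 1:]

def effect_matrix(site, signals):
    """Log2 ratio of mean signal for each observed variant relative to the reference site.

    `signals` maps a sequence to a list of hypothetical ChIP-seq signal values.
    Variants with no observations are skipped.
    """
    ref_mean = mean(signals[site])
    effects = {}
    for pos, alt, mutated in single_nucleotide_variants(site):
        if mutated in signals:
            effects[(pos, alt)] = math.log2(mean(signals[mutated]) / ref_mean)
    return effects

# Hypothetical signal intensities (illustrative only, not real data).
toy_signals = {
    "GATA": [8.0, 8.0],   # reference site
    "GATC": [2.0, 2.0],   # variant with weaker signal
    "CATA": [8.0],        # variant with unchanged signal
}
toy_effects = effect_matrix("GATA", toy_signals)
print(toy_effects)
```

A site of length n has 3n possible single-nucleotide substitutions, so the catalog for a typical motif stays small enough to tabulate exhaustively.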


1996 ◽  
Vol 16 (4) ◽  
pp. 1659-1667 ◽  
Author(s):  
J Karlseder ◽  
H Rotheneder ◽  
E Wintersberger

Within the region around 150 bp upstream of the initiation codon, which was previously shown to suffice for growth-regulated expression, the murine thymidine kinase gene carries a single binding site for transcription factor Sp1; about 10 bp downstream of this site, there is a binding motif for transcription factor E2F. The latter protein appears to be responsible for growth regulation of the promoter. Mutational inactivation of either the Sp1 or the E2F site almost completely abolishes promoter activity, suggesting that the two transcription factors interact directly in delivering an activation signal to the basic transcription machinery. This was verified by demonstrating with the use of glutathione S-transferase fusion proteins that E2F and Sp1 bind to each other in vitro. For this interaction, the C-terminal part of Sp1 and the N terminus of E2F1, a domain also present in E2F2 and E2F3 but absent in E2F4 and E2F5, were essential. Accordingly, E2F1 to E2F3 but not E2F4 and E2F5 were found to bind Sp1 in vitro. Coimmunoprecipitation experiments showed that complexes exist in vivo, and it was established that the distance between the binding sites for the two transcription factors was critical for optimal promoter activity. Finally, in vivo footprinting experiments indicated that both the Sp1 and E2F binding sites are occupied throughout the cell cycle. Mutation of either binding motif abolished binding of both transcription factors in vivo, which may indicate cooperative binding of the two proteins to chromatin-organized DNA. Our data are in line with the hypothesis that E2F functions as a growth- and cell cycle-regulated tethering factor between Sp1 and the basic transcription machinery.


Author(s):  
Moritz von Scheidt ◽  
Yuqi Zhao ◽  
Thomas Q. de Aguiar Vallim ◽  
Nam Che ◽  
Michael Wierer ◽  
...  

Background: Coronary artery disease (CAD) is a multifactorial condition with both genetic and exogenous causes. The contribution of tissue specific functional networks to the development of atherosclerosis remains largely unclear. The aim of this study was to identify and characterise central regulators and networks leading to atherosclerosis. Methods: Based on several hundred genes known to affect atherosclerosis risk in mouse (as demonstrated in knock-out models) and human (as shown by genome-wide association studies (GWAS)), liver gene regulatory networks were modeled. The hierarchical order and regulatory directions of genes within the network were based on Bayesian prediction models as well as experimental studies including chromatin immunoprecipitation DNA-Sequencing (ChIP-Seq), ChIP mass spectrometry (ChIP-MS), overexpression, siRNA knockdown in mouse and human liver cells, and knockout mouse experiments. Bioinformatics and correlation analyses were used to clarify associations between central genes and CAD phenotypes in both human and mouse. Results: The transcription factor MAFF interacted as a key driver of a liver network with three human genes at CAD GWAS loci and eleven atherosclerotic murine genes. Most importantly, expression levels of the low-density lipoprotein receptor (LDLR) gene correlated with MAFF in 600 CAD patients undergoing bypass surgery (STARNET) and a hybrid mouse diversity panel involving 105 different inbred mouse strains. Molecular mechanisms of MAFF were tested under non-inflammatory conditions showing a positive correlation between MAFF and LDLR in vitro and in vivo. Interestingly, after LPS stimulation (inflammatory conditions) an inverse correlation between MAFF and LDLR in vitro and in vivo was observed. ChIP-MS revealed that the human CAD GWAS candidate BACH1 assists MAFF in the presence of LPS stimulation with respective heterodimers binding at the MAF recognition element (MARE) of the LDLR promoter to transcriptionally downregulate LDLR expression. Conclusions: The transcription factor MAFF was identified as a novel central regulator of an atherosclerosis/CAD relevant liver network. MAFF triggered context specific expression of LDLR and other genes known to affect CAD risk. Our results suggest that MAFF is a missing link between inflammation, lipid and lipoprotein metabolism and a possible treatment target.


2021 ◽  
Author(s):  
Jiayu Zhu ◽  
Chih-Fan Yeh ◽  
Ru-Ting Huang ◽  
Tzu-Han Lee ◽  
Tzu-Pin Shentu ◽  
...  

Genome-wide association studies (GWAS) have suggested new molecular mechanisms in vascular cells driving atherosclerotic diseases such as coronary artery disease (CAD) and ischemic stroke (IS). Nevertheless, a major challenge in developing new therapeutic approaches is to spatiotemporally manipulate these GWAS-identified genes in specific vascular tissues in vivo. YAP (Yes-associated protein) and TAZ (transcriptional coactivator with PDZ-binding motif) have emerged as critical transcriptional regulators in cells responding to biomechanical stimuli, such as in athero-susceptible endothelial cells activated by disturbed flow (DF). The molecular mechanisms by which DF activates while unidirectional flow (UF) inactivates YAP/TAZ remain incompletely understood. Recent studies demonstrated that DF and genetic predisposition (risk allele) of CAD/IS locus 1p32.2 converge to reduce phospholipid phosphatase 3 (PLPP3) expression in vascular endothelium. Restoration of endothelial PLPP3 in vivo, although it remains challenging and unexplored, is hypothesized to reduce atherosclerosis. We devised a nanomedicine system integrating nanoparticles and Cdh5 promoter-driven plasmids to successfully restore PLPP3 expression in activated endothelium, resulting in suppressed YAP/TAZ activity and reduced DF-induced atherosclerosis in mice. Mechanistically, our studies discovered a molecular paradigm by which CAD/IS GWAS gene PLPP3 inactivates YAP/TAZ by reducing lysophosphatidic acid (LPA)-induced myosin II and ROCK in endothelium under UF. These results highlight a new mechanistic link between GWAS and YAP/TAZ mechano-regulation and, moreover, establish a proof of concept of vascular wall-based therapies employing targeted nanomedicine to manipulate CAD/IS GWAS genes in vivo.


2020 ◽  
Vol 36 (9) ◽  
pp. 2936-2937 ◽  
Author(s):  
Gareth Peat ◽  
William Jones ◽  
Michael Nuhn ◽  
José Carlos Marugán ◽  
William Newell ◽  
...  

Abstract Motivation: Genome-wide association studies (GWAS) are a powerful method to detect even weak associations between variants and phenotypes; however, many of the identified associated variants are in non-coding regions, and presumably influence gene expression regulation. Identifying potential drug targets, i.e. causal protein-coding genes, therefore, requires crossing the genetics results with functional data. Results: We present a novel data integration pipeline that analyses GWAS results in the light of experimental epigenetic and cis-regulatory datasets, such as ChIP-Seq, Promoter-Capture Hi-C or eQTL, and presents them in a single report, which can be used for inferring likely causal genes. This pipeline was then fed into an interactive data resource. Availability and implementation: The analysis code is available at www.github.com/Ensembl/postgap and the interactive data browser at postgwas.opentargets.io.


2021 ◽  
Vol 49 (7) ◽  
pp. 3856-3875
Author(s):  
Marina Kulik ◽  
Melissa Bothe ◽  
Gözde Kibar ◽  
Alisa Fuchs ◽  
Stefanie Schöne ◽  
...  

Abstract The glucocorticoid (GR) and androgen (AR) receptors execute unique functions in vivo, yet have nearly identical DNA binding specificities. To identify mechanisms that facilitate functional diversification among these transcription factor paralogs, we studied them in an equivalent cellular context. Analysis of chromatin and sequence suggests that divergent binding, and corresponding gene regulation, are driven by different abilities of AR and GR to interact with relatively inaccessible chromatin. Divergent genomic binding patterns can also be the result of subtle differences in DNA binding preference between AR and GR. Furthermore, the sequence composition of large regions (>10 kb) surrounding selectively occupied binding sites differs significantly, indicating a role for the sequence environment in guiding AR and GR to distinct binding sites. The comparison of binding sites that are shared shows that the specificity paradox can also be resolved by differences in the events that occur downstream of receptor binding. Specifically, shared binding sites display receptor-specific enhancer activity, cofactor recruitment and changes in histone modifications. Genomic deletion of shared binding sites demonstrates their contribution to directing receptor-specific gene regulation. Together, these data suggest that differences in genomic occupancy as well as divergence in the events that occur downstream of receptor binding direct functional diversification among transcription factor paralogs.


2013 ◽  
Vol 368 (1632) ◽  
pp. 20130018 ◽  
Author(s):  
Andrea I. Ramos ◽  
Scott Barolo

In the era of functional genomics, the role of transcription factor (TF)–DNA binding affinity is of increasing interest: for example, it has recently been proposed that low-affinity genomic binding events, though frequent, are functionally irrelevant. Here, we investigate the role of binding site affinity in the transcriptional interpretation of Hedgehog (Hh) morphogen gradients. We noted that enhancers of several Hh-responsive Drosophila genes have low predicted affinity for Ci, the Gli family TF that transduces Hh signalling in the fly. Contrary to our initial hypothesis, improving the affinity of Ci/Gli sites in enhancers of dpp, wingless and stripe, by transplanting optimal sites from the patched gene, did not result in ectopic responses to Hh signalling. Instead, we found that these enhancers require low-affinity binding sites for normal activation in regions of relatively low signalling. When Ci/Gli sites in these enhancers were altered to improve their binding affinity, we observed patterning defects in the transcriptional response that are consistent with a switch from Ci-mediated activation to Ci-mediated repression. Synthetic transgenic reporters containing isolated Ci/Gli sites confirmed this finding in imaginal discs. We propose that the requirement for gene activation by Ci in the regions of low-to-moderate Hh signalling results in evolutionary pressure favouring weak binding sites in enhancers of certain Hh target genes.


Blood ◽  
2005 ◽  
Vol 106 (6) ◽  
pp. 1938-1947 ◽  
Author(s):  
Tomohiko Tamura ◽  
Pratima Thotakura ◽  
Tetsuya S. Tanaka ◽  
Minoru S. H. Ko ◽  
Keiko Ozato

Abstract Interferon regulatory factor-8 (IRF-8)/interferon consensus sequence–binding protein (ICSBP) is a transcription factor that controls myeloid-cell development. Microarray gene expression analysis of Irf-8-/- myeloid progenitor cells expressing an IRF-8/estrogen receptor chimera (which differentiate into macrophages after addition of estradiol) was used to identify 69 genes altered by IRF-8 during early differentiation (62 up-regulated and 7 down-regulated). Among them, 4 lysosomal/endosomal enzyme-related genes (cystatin C, cathepsin C, lysozyme, and prosaposin) did not require de novo protein synthesis for induction, suggesting that they were direct targets of IRF-8. We developed a reporter assay system employing a self-inactivating retrovirus and analyzed the cystatin C and cathepsin C promoters. We found that a unique cis element mediates IRF-8–induced activation of both promoters. Similar elements were also found in other IRF-8 target genes with a consensus sequence (GAAANN[N]GGAA) comprising a core IRF-binding motif and an Ets-binding motif; this sequence is similar to but distinct from the previously reported Ets/IRF composite element. Chromatin immunoprecipitation assays demonstrated that IRF-8 and the PU.1 Ets transcription factor bind to this element in vivo. Collectively, these data indicate that IRF-8 stimulates transcription of target genes through a novel cis element to specify macrophage differentiation.
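As a rough illustration of how the consensus sequence GAAANN[N]GGAA could be searched for in a promoter sequence, the sketch below uses a regular expression. It assumes that N stands for any base and that the bracketed [N] marks an optional third spacer base; the abstract does not spell this notation out. The function name and test sequences are hypothetical, and only the forward strand is scanned.

```python
import re

# Assumed reading of GAAANN[N]GGAA: N = any base, [N] = an optional extra base.
# The lookahead lets overlapping candidate elements be reported.
IRF_ETS_ELEMENT = re.compile(r"(?=(GAAA[ACGT]{2,3}GGAA))")

def find_elements(sequence):
    """Return (start, matched_sequence) pairs on the forward strand only."""
    return [
        (m.start(), m.group(1))
        for m in IRF_ETS_ELEMENT.finditer(sequence.upper())
    ]

# Hypothetical test sequences (not taken from any real promoter).
two_base_spacer = find_elements("ttGAAACTGGAAcc")
three_base_spacer = find_elements("GAAACTAGGAA")
one_base_spacer = find_elements("GAAACGGAA")
print(two_base_spacer, three_base_spacer, one_base_spacer)
```

A fuller scan would also search the reverse complement, since a binding element can sit on either strand.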


Author(s):  
Fernanda M Bosada ◽  
Mathilde R Rivaud ◽  
Jae-Sun Uhm ◽  
Sander Verheule ◽  
Karel van Duijvenboden ◽  
...  

Rationale: Atrial Fibrillation (AF) is the most common cardiac arrhythmia diagnosed in clinical practice. Genome-wide association studies have identified AF-associated common variants across 100+ genomic loci, but the mechanism underlying the impact of these variant loci on AF susceptibility in vivo has remained largely undefined. One such variant region, highly associated with AF, is found at 1q24, close to PRRX1, encoding the Paired Related Homeobox 1 transcription factor. Objective: To identify the mechanistic link between the variant region at 1q24 and AF predisposition. Methods and Results: The mouse orthologue of the noncoding variant genomic region (R1A) at 1q24 was deleted using CRISPR genome editing. Among the genes sharing the topologically associated domain with the deleted R1A region (Kifap3, Prrx1, Fmo2, Prrc2c), only the broadly expressed gene Prrx1 was downregulated in mutants, and only in cardiomyocytes. Expression and epigenetic profiling revealed that a cardiomyocyte lineage-specific gene program (Mhrt, Myh6, Rbm20, Tnnt2, Ttn, Ckm) was upregulated in R1A-/- atrial cardiomyocytes, and that Mef2 binding motifs were significantly enriched at differentially accessible chromatin sites. Consistently, Prrx1 suppressed Mef2-activated enhancer activity in HL-1 cells. Mice heterozygous or homozygous for the R1A deletion were susceptible to atrial arrhythmia induction, had atrial conduction slowing and more irregular RR intervals. Isolated R1A-/- mouse left atrial cardiomyocytes showed lower action potential upstroke velocities and sodium current, as well as increased systolic and diastolic calcium concentrations compared to controls. Conclusions: The noncoding AF variant region at 1q24 modulates Prrx1 expression in cardiomyocytes. Cardiomyocyte-specific reduction of Prrx1 expression upon deletion of the noncoding region leads to a profound induction of a cardiac lineage-specific gene program and to propensity for AF. These data indicate that AF-associated variants in humans may exert AF predisposition through reduced PRRX1 expression in cardiomyocytes.


Development ◽  
2020 ◽  
Vol 147 (14) ◽  
pp. dev190330
Author(s):  
Brett R. Lancaster ◽  
James D. McGhee

Abstract We define a quantitative relationship between the affinity with which the intestine-specific GATA factor ELT-2 binds to cis-acting regulatory motifs and the resulting transcription of asp-1, a target gene representative of genes involved in Caenorhabditis elegans intestine differentiation. By establishing an experimental system that allows unknown parameters (e.g. the influence of chromatin) to effectively cancel out, we show that levels of asp-1 transcripts increase monotonically with increasing binding affinity of ELT-2 to variant promoter TGATAA sites. The shape of the response curve reveals that the product of the unbound ELT-2 concentration in vivo [i.e. (ELT-2free) or ELT-2 ‘activity’] and the largest ELT-XXTGATAAXX association constant (Kmax) lies between five and ten. We suggest that this (unitless) product [Kmax×(ELT-2free) or the equivalent product for any other transcription factor] provides an important quantitative descriptor of transcription-factor/regulatory-motif interaction in development, evolution and genetic disease. A more complicated model than simple binding affinity is necessary to explain the fact that ELT-2 appears to discriminate in vivo against equal-affinity binding sites that contain AGATAA instead of TGATAA.
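To give a feel for what a unitless product Kmax×(ELT-2free) between five and ten would imply, the sketch below evaluates fractional site occupancy under a simple single-site equilibrium binding model, occupancy = x / (1 + x) with x = K×[TF free]. Treating occupancy this way is an assumption made here for illustration; the abstract does not state the functional form the authors fitted, and the function name is hypothetical.

```python
def fractional_occupancy(k_times_free_tf):
    """Equilibrium occupancy of one site: theta = x / (1 + x), with x = K * [TF_free]."""
    x = float(k_times_free_tf)
    return x / (1.0 + x)

# The abstract places Kmax * (ELT-2free) between five and ten.
occupancy_low = fractional_occupancy(5)     # 5 / 6
occupancy_high = fractional_occupancy(10)   # 10 / 11

# A hypothetical site with one tenth of the maximal affinity, at product = 10.
occupancy_weak_site = fractional_occupancy(10 * 0.1)

print(occupancy_low, occupancy_high, occupancy_weak_site)
```

Under this assumed model the strongest site would be occupied roughly 83% to 91% of the time, while a site with ten-fold lower affinity would be half occupied at the upper end of that range. Occupancy rises monotonically with affinity, in line with the monotonic increase in asp-1 transcript levels reported above.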

