MAGGIE: leveraging genetic variation to identify DNA sequence motifs mediating transcription factor binding and function

Author(s):  
Zeyang Shen ◽  
Marten A Hoeksema ◽  
Zhengyu Ouyang ◽  
Christopher Benner ◽  
Christopher K Glass

Abstract: Genetic variation in regulatory elements can alter transcription factor (TF) binding by mutating a TF binding motif, which in turn may affect the activity of the regulatory elements. However, it is unclear which TFs are prone to be affected by a given variant. Current motif analysis tools either prioritize TFs based on motif enrichment without linking to a function or are limited in their applications due to the assumption of linearity between motifs and their functional effects. Here, we present MAGGIE, a novel method for identifying motifs mediating TF binding and function. By leveraging measurements from diverse genotypes, MAGGIE uses a statistical approach to link mutation of a motif to changes of an epigenomic feature without assuming a linear relationship. We benchmark MAGGIE across various applications using both simulated and biological datasets and demonstrate its improvement in sensitivity and specificity compared to the state-of-the-art motif analysis approaches. We use MAGGIE to reveal insights into the divergent functions of distinct NF-κB factors in the pro-inflammatory macrophages, showing its promise in discovering novel functions of TFs. The Python package for MAGGIE is freely available at https://github.com/zeyang-shen/maggie.

2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i84-i92
Author(s):  
Zeyang Shen ◽  
Marten A Hoeksema ◽  
Zhengyu Ouyang ◽  
Christopher Benner ◽  
Christopher K Glass

Abstract Motivation: Genetic variation in regulatory elements can alter transcription factor (TF) binding by mutating a TF binding motif, which in turn may affect the activity of the regulatory elements. However, it is unclear which motifs are prone to impact transcriptional regulation if mutated. Current motif analysis tools either prioritize TFs based on motif enrichment without linking to a function or are limited in their applications due to the assumption of linearity between motifs and their functional effects. Results: We present MAGGIE (Motif Alteration Genome-wide to Globally Investigate Elements), a novel method for identifying motifs mediating TF binding and function. By leveraging measurements from diverse genotypes, MAGGIE uses a statistical approach to link mutations of a motif to changes of an epigenomic feature without assuming a linear relationship. We benchmark MAGGIE across various applications using both simulated and biological datasets and demonstrate its improvement in sensitivity and specificity compared with the state-of-the-art motif analysis approaches. We use MAGGIE to gain novel insights into the divergent functions of distinct NF-κB factors in pro-inflammatory macrophages, revealing the association of p65–p50 co-binding with transcriptional activation and the association of p50 binding lacking p65 with transcriptional repression. Availability and implementation: The Python package for MAGGIE is freely available at https://github.com/zeyang-shen/maggie. The accession number for the NF-κB ChIP-seq data generated for this study is Gene Expression Omnibus: GSE144070. Supplementary information: Supplementary data are available at Bioinformatics online.
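The idea of linking motif mutations to changes in an epigenomic feature without assuming linearity can be illustrated with a paired, sign-based comparison. The sketch below is not the MAGGIE implementation or its API: the toy motif matrix, the allele sequences, and every function name are hypothetical, and the exact sign test is used here only because it is a simple non-parametric test that fits in a few lines.

```python
import math

BACKGROUND = 0.25  # assumed uniform base frequency
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def toy_pfm(consensus, match=0.85):
    """Build a hypothetical position frequency matrix from a consensus string."""
    other = (1 - match) / 3
    return [{b: (match if b == c else other) for b in "ACGT"} for c in consensus]


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def window_score(window, pfm):
    """Log-odds score of one window against the motif matrix."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(pfm, window))


def best_motif_score(seq, pfm):
    """Best log-odds score over all windows on both strands."""
    k = len(pfm)
    return max(
        window_score(strand[i:i + k], pfm)
        for strand in (seq, revcomp(seq))
        for i in range(len(strand) - k + 1)
    )


def motif_score_differences(pairs, pfm):
    """Score difference for each (higher-signal allele, lower-signal allele) pair."""
    return [best_motif_score(hi, pfm) - best_motif_score(lo, pfm) for hi, lo in pairs]


def sign_test(diffs):
    """Two-sided exact sign test; zero differences are dropped."""
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    n = pos + neg
    if n == 0:
        return 1.0
    tail = sum(math.comb(n, i) for i in range(min(pos, neg) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical data: in each pair the first allele carries the stronger
# epigenomic signal (for example, more TF binding) than the second.
TOY_PFM = toy_pfm("GATA")
PAIRS = [
    ("CCGATACC", "CCGCTACC"),
    ("TTGATATT", "TTGAGATT"),
    ("ACGATAGT", "ACGTTAGT"),
    ("GGTATCGG", "GGTACCGG"),
    ("CAGATAAC", "CAGACAAC"),
    ("TCGATAGA", "TCGGTAGA"),
]

if __name__ == "__main__":
    diffs = motif_score_differences(PAIRS, TOY_PFM)
    print("score differences:", [round(d, 2) for d in diffs])
    print("sign test p-value:", sign_test(diffs))
```

Because the test looks only at the direction of each score difference, it makes no assumption that a larger change in motif score produces a proportionally larger change in the measured signal.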


Biomedicines ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 1742
Author(s):  
Stefania Mantziou ◽  
Georgios S. Markopoulos

Long non-coding RNAs (lncRNAs) have emerged during the post-genomic era as significant epigenetic regulators. Viral-like 30 elements (VL30s) are a family of mouse retrotransposons that are transcribed into functional lncRNAs. Recent data suggest that VL30 RNAs are efficiently packaged in small extracellular vesicles (SEVs) through an SEV enrichment sequence. We analysed VL30 elements for the presence of the distinct 26 nt SEV enrichment motif and found that SEV enrichment is an inherent hallmark of the VL30 family, contained in 36 full-length elements, with a widespread chromosomal distribution. Among them, 25 elements represent active, present-day integrations and contain an abundance of regulatory sequences. Phylogenetic analysis revealed a recent spread of SEV-VL30s from 4.4 million years ago till today. Importantly, 39 elements contain an SFPQ-binding motif, associated with the transcriptional induction of oncogenes. Most SEV-VL30s reside in transcriptionally active regions, as characterised by their distribution adjacent to candidate cis-regulatory elements (cCREs). Network analysis of SEV-VL30-associated genes suggests a distinct transcriptional footprint associated with embryonal abnormalities and neoplasia. Given the established role of VL30s in oncogenesis, we conclude that their potential to spread through SEVs represents a novel mechanism for non-coding RNA biology with numerous implications for cellular homeostasis and disease.


2019 ◽  
Vol 115 (10) ◽  
pp. 1487-1499 ◽  
Author(s):  
Olga Bondareva ◽  
Roman Tsaryk ◽  
Vesna Bojovic ◽  
Maria Odenthal-Schnittler ◽  
Arndt F Siekmann ◽  
...  

Abstract Aims: Oscillatory shear stress (OSS) is an atheroprone haemodynamic force that occurs in areas of vessel irregularities and is implicated in the pathogenesis of atherosclerosis. Changes in signalling and transcriptional programme in response to OSS have been vigorously studied; however, the underlying changes in the chromatin landscape controlling transcription remain to be elucidated. Here, we investigated the changes in the regulatory element (RE) landscape of endothelial cells under atheroprone OSS conditions in an in vitro model. Methods and results: Analyses of H3K27ac chromatin immunoprecipitation-Seq enrichment and RNA-Seq in primary human umbilical vein endothelial cells 6 h after onset of OSS identified 2806 differential responsive REs and 33 differentially expressed genes compared with control cells kept under static conditions. Furthermore, gene ontology analyses of putative RE-associated genes uncovered enrichment of WNT/HIPPO pathway and cytoskeleton reorganization signatures. Transcription factor (TF) binding motif analysis within RE sequences identified over-representation of ETS, Zinc finger, and activator protein 1 TF families that regulate cell cycle, proliferation, and apoptosis, implicating them in the development of atherosclerosis. Importantly, we confirmed the activation of EGR1 as well as the YAP/TAZ complex early (6 h) after onset of OSS in both cultured human vein and artery endothelial cells and, by undertaking luciferase assays, functionally verified their role in RE activation in response to OSS. Conclusions: Based on the identification and verification of specific responsive REs early upon OSS exposure, we propose an expanded mechanism of how OSS might contribute to the development of atherosclerosis.


2016 ◽  
Vol 113 (16) ◽  
pp. 4434-4439 ◽  
Author(s):  
Aoi Wakabayashi ◽  
Jacob C. Ulirsch ◽  
Leif S. Ludwig ◽  
Claudia Fiorini ◽  
Makiko Yasuda ◽  
...  

Whole-exome sequencing has been incredibly successful in identifying causal genetic variants and has revealed a number of novel genes associated with blood and other diseases. One limitation of this approach is that it overlooks mutations in noncoding regulatory elements. Furthermore, the mechanisms by which mutations in transcriptional cis-regulatory elements result in disease remain poorly understood. Here we used CRISPR/Cas9 genome editing to interrogate three such elements harboring mutations in human erythroid disorders, which in all cases are predicted to disrupt a canonical binding motif for the hematopoietic transcription factor GATA1. Deletions of as few as two to four nucleotides resulted in a substantial decrease (>80%) in target gene expression. Isolated deletions of the canonical GATA1 binding motif completely abrogated binding of the cofactor TAL1, which binds to a separate motif. Having verified the functionality of these three GATA1 motifs, we demonstrate strong evolutionary conservation of GATA1 motifs in regulatory elements proximal to other genes implicated in erythroid disorders, and show that targeted disruption of such elements results in altered gene expression. By modeling transcription factor binding patterns, we show that multiple transcription factors are associated with erythroid gene expression, and have created predictive maps modeling putative disruptions of their binding sites at key regulatory elements. Our study provides insight into GATA1 transcriptional activity and may prove a useful resource for investigating the pathogenicity of noncoding variants in human erythroid disorders.
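As a rough illustration of how a deletion of a few nucleotides can remove a GATA binding motif, the sketch below counts matches to the consensus (A/T)GATA(A/G) on both strands before and after a deletion. It is only a consensus-string scan, not the binding model used in the study; the sequence, the deletion coordinates, and the function names are hypothetical.

```python
import re

# Consensus GATA site, (A/T)GATA(A/G); the lookahead allows overlapping matches.
GATA_SITE = re.compile(r"(?=([AT]GATA[AG]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_gata_sites(seq):
    """Count consensus GATA sites on the forward and reverse strands."""
    seq = seq.upper()
    return len(GATA_SITE.findall(seq)) + len(GATA_SITE.findall(revcomp(seq)))


def delete_bases(seq, start, length):
    """Return seq with `length` bases removed, starting at 0-based `start`."""
    return seq[:start] + seq[start + length:]


# Hypothetical regulatory fragment containing a single TGATAA site.
REFERENCE = "CCTGATAAGGC"
# Hypothetical 2-nt deletion inside the motif core.
EDITED = delete_bases(REFERENCE, 4, 2)

if __name__ == "__main__":
    print("reference sites:", count_gata_sites(REFERENCE))
    print("edited sites:", count_gata_sites(EDITED))
```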


2013 ◽  
Vol 42 (5) ◽  
pp. 3059-3072 ◽  
Author(s):  
Montse Gustems ◽  
Anne Woellmer ◽  
Ulrich Rothbauer ◽  
Sebastian H. Eck ◽  
Thomas Wieland ◽  
...  

Abstract CpG methylation in mammalian DNA is known to interfere with gene expression by inhibiting the binding of transactivators to their cognate sequence motifs or recruiting proteins involved in gene repression. An Epstein–Barr virus-encoded transcription factor, Zta, was the first example of a sequence-specific transcription factor that preferentially recognizes and selectively binds DNA sequence motifs with methylated CpG residues, reverses epigenetic silencing and activates gene transcription. The DNA binding domain of Zta is homologous to c-Fos, a member of the cellular AP-1 (activator protein 1) transcription factor family, which regulates cell proliferation and survival, apoptosis, transformation and oncogenesis. We have identified a novel AP-1 binding site termed meAP-1, which contains a CpG dinucleotide. If methylated, meAP-1 sites are preferentially bound by the AP-1 heterodimer c-Jun/c-Fos in vitro and in cellular chromatin in vivo. In activated human primary B cells, c-Jun/c-Fos locates to these methylated elements in promoter regions of transcriptionally activated genes. Reminiscent of the viral Zta protein, c-Jun/c-Fos is the first identified cellular member of the AP-1 family of transactivators that can induce expression of genes with methylated, hence repressed promoters, reversing epigenetic silencing.


2015 ◽  
Vol 35 (suppl_1) ◽  
Author(s):  
Nicholas T Hogan ◽  
Casey E Romanoski ◽  
Michael T Lam ◽  
Christopher K Glass

Introduction: Sequence-specific transcription factors bind DNA regulatory elements and play a key role in establishing cellular identity. Studies comparing macrophages to B cells have revealed that small numbers of such collaborative or lineage-determining transcription factors (LDTF) establish distinct enhancers in each cell type. These factors also allow for the binding of signal dependent transcription factors. Here we present data which suggest members of the AP-1, ETS, and STAT transcription factor families serve as collaborative transcriptional regulators in human aortic endothelial cells (HAEC). Hypothesis: We hypothesize that a set of AP-1 and ETS transcription factors collaborate to establish key endothelial cell enhancers. Methods: Working in HAEC, we measured poised and active enhancers using ChIP-seq for the epigenetic histone modifications H3K4me2 and H3K27Ac, performed motif analysis, and measured transcription factor binding for candidate factors. Knockdowns of JUN, ERG, and STAT3 followed by RNA-seq were used to evaluate altered enhancer function and gene targets of candidate factors. Results: Our de novo motif analysis revealed that motifs for ETS and AP-1 transcription factors are highly enriched at HAEC enhancers. ChIP-seq experiments for JUN, JUNB, ERG, and STAT3 showed between 8,000 and 55,000 intergenic peaks for each factor. Together these peaks bind 50% of poised enhancers, with a subset co-localizing at these sites. Gene ontology analysis showed that gene targets of these enhancers are involved in endothelial-specific functions. Further, knockdown of JUN, ERG, and STAT3 resulted in a twofold or greater change in expression of hundreds of HAEC transcripts. Conclusion: The genome-wide pattern of JUN, JUNB, ERG, and STAT3 co-localization at enhancers in HAEC suggests these factors serve as key regulators that collaboratively modulate endothelial-specific gene expression. 
Further investigation of candidate lineage-determining transcription factors using pro-atherogenic signals could reveal regulatory mechanisms of disease-relevant endothelial transcriptional programs.


2017 ◽  
Vol 114 (32) ◽  
pp. E6710-E6719 ◽  
Author(s):  
Julie M. Pelletier ◽  
Raymond W. Kwong ◽  
Soomin Park ◽  
Brandon H. Le ◽  
Russell Baden ◽  
...  

LEAFY COTYLEDON1 (LEC1), an atypical subunit of the nuclear transcription factor Y (NF-Y) CCAAT-binding transcription factor, is a central regulator that controls many aspects of seed development including the maturation phase during which seeds accumulate storage macromolecules and embryos acquire the ability to withstand desiccation. To define the gene networks and developmental processes controlled by LEC1, genes regulated directly by and downstream of LEC1 were identified. We compared the mRNA profiles of wild-type and lec1-null mutant seeds at several stages of development to define genes that are down-regulated or up-regulated by the lec1 mutation. We used ChIP and differential gene-expression analyses in Arabidopsis seedlings overexpressing LEC1 and in developing Arabidopsis and soybean seeds to identify globally the target genes that are transcriptionally regulated by LEC1 in planta. Collectively, our results show that LEC1 controls distinct gene sets at different developmental stages, including those that mediate the temporal transition between photosynthesis and chloroplast biogenesis early in seed development and seed maturation late in development. Analyses of enriched DNA sequence motifs that may act as cis-regulatory elements in the promoters of LEC1 target genes suggest that LEC1 may interact with other transcription factors to regulate distinct gene sets at different stages of seed development. Moreover, our results demonstrate strong conservation in the developmental processes and gene networks regulated by LEC1 in two dicotyledonous plants that diverged ∼92 Mya.


2019 ◽  
Author(s):  
Cristina Tavera-Montañez ◽  
Sarah J. Hainer ◽  
Daniella Cangussu ◽  
Shellaina J.V. Gordon ◽  
Yao Xiao ◽  
...  

Abstract: MTF1 is a conserved metal-binding transcription factor in eukaryotes that binds to conserved DNA sequence motifs, termed metal response elements (MREs). MTF1 responds to metal excess and deprivation, protects cells from oxidative and hypoxic stresses, and is required for embryonic development in vertebrates. We used multiple strategies to identify an unappreciated role for MTF1 and copper (Cu) in cell differentiation. Upon initiation of myogenesis from primary myoblasts, MTF1 expression increased, as did nuclear localization. Mtf1 knockdown impaired differentiation, while addition of non-toxic concentrations of Cu+ enhanced MTF1 expression and promoted myogenesis. Cu+ bound stoichiometrically to a C-terminus tetra-cysteine of MTF1. MTF1 bound to chromatin at the promoter regions of myogenic genes and binding was stimulated by copper. At myogenic promoters, MTF1 formed a complex with MyoD, the master transcriptional regulator of the myogenic lineage. These studies establish novel mechanisms by which copper and MTF1 regulate gene expression in myoblast differentiation.


2000 ◽  
Vol 74 (13) ◽  
pp. 6213-6216 ◽  
Author(s):  
Patricio Meneses ◽  
Kenneth I. Berns ◽  
Ernest Winocour

ABSTRACT The DNA sequence motifs which direct adeno-associated virus type 2 site-specific integration are being investigated using a shuttle vector, propagated as a stable episome in cultured cell lines, as the target for integration. Previously, we reported that the minimum episomal targeting elements comprise a 16-bp binding motif (Rep binding site [RBS]) for a viral regulatory protein (Rep) separated by a short DNA spacer from a sequence (terminal resolution site [TRS]) that can serve as a substrate for Rep-mediated nicking activity (R. M. Linden, P. Ward, C. Giraud, E. Winocour, and K. I. Berns, Proc. Natl. Acad. Sci. USA 93:11288–11294, 1996; R. M. Linden, E. Winocour, and K. I. Berns, Proc. Natl. Acad. Sci. USA 93:7966–7972, 1996). We now report that episomal integration depends upon both the sequence and the position of the spacer DNA separating the RBS and TRS motifs. The spacer thus constitutes a third element required for site-specific episomal integration.


2020 ◽  
Vol 127 (12) ◽  
pp. 1502-1518
Author(s):  
Giselle Galang ◽  
Ravi Mandla ◽  
Hongmei Ruan ◽  
Catherine Jung ◽  
Tanvi Sinha ◽  
...  

Rationale: Cardiac pacemaker cells (PCs) in the sinoatrial node (SAN) have a distinct gene expression program that allows them to fire automatically and initiate the heartbeat. Although critical SAN transcription factors, including Isl1 (Islet-1), Tbx3 (T-box transcription factor 3), and Shox2 (short-stature homeobox protein 2), have been identified, the cis-regulatory architecture that governs PC-specific gene expression is not understood, and discrete enhancers required for gene regulation in the SAN have not been identified. Objective: To define the epigenetic profile of PCs using comparative ATAC-seq (assay for transposase-accessible chromatin with sequencing) and to identify novel enhancers involved in SAN gene regulation, development, and function. Methods and Results: We used ATAC-seq on sorted neonatal mouse SAN to compare regions of accessible chromatin in PCs and right atrial cardiomyocytes. PC-enriched assay for transposase-accessible chromatin peaks, representing candidate SAN regulatory elements, were located near established SAN genes and were enriched for distinct sets of TF (transcription factor) binding sites. Among several novel SAN enhancers that were experimentally validated using transgenic mice, we identified a 2.9-kb regulatory element at the Isl1 locus that was active specifically in the cardiac inflow at embryonic day 8.5 and throughout later SAN development and maturation. Deletion of this enhancer from the genome of mice resulted in SAN hypoplasia and sinus arrhythmias. The mouse SAN enhancer also directed reporter activity to the inflow tract in developing zebrafish hearts, demonstrating deep conservation of its upstream regulatory network. Finally, single nucleotide polymorphisms in the human genome that occur near the region syntenic to the mouse enhancer exhibit significant associations with resting heart rate in human populations.
Conclusions: (1) PCs have distinct regions of accessible chromatin that correlate with their gene expression profile and contain novel SAN enhancers, (2) cis-regulation of Isl1 specifically in the SAN depends upon a conserved SAN enhancer that regulates PC development and SAN function, and (3) a corresponding human ISL1 enhancer may regulate human SAN function.

