ChIP-GSM: Inferring active transcription factor modules to predict functional regulatory elements

2021 ◽  
Vol 17 (7) ◽  
pp. e1009203
Author(s):  
Xi Chen ◽  
Andrew F. Neuwald ◽  
Leena Hilakivi-Clarke ◽  
Robert Clarke ◽  
Jianhua Xuan

Transcription factors (TFs) often function as a module including both master factors and mediators binding at cis-regulatory regions to modulate nearby gene transcription. ChIP-seq profiling of multiple TFs makes it feasible to infer functional TF modules. However, when inferring TF modules based on co-localization of ChIP-seq peaks, often many weak binding events are missed, especially for mediators, resulting in incomplete identification of modules. To address this problem, we develop a ChIP-seq data-driven Gibbs Sampler to infer Modules (ChIP-GSM) using a Bayesian framework that integrates ChIP-seq profiles of multiple TFs. ChIP-GSM samples read counts of module TFs iteratively to estimate the binding potential of a module to each region and, across all regions, estimates the module abundance. Using inferred module-region probabilistic bindings as feature units, ChIP-GSM then employs logistic regression to predict active regulatory elements. Validation of ChIP-GSM predicted regulatory regions on multiple independent datasets sharing the same context confirms the advantage of using TF modules for predicting regulatory activity. In a case study of K562 cells, we demonstrate that the ChIP-GSM inferred modules form as groups, activate gene expression at different time points, and mediate diverse functional cellular processes. Hence, ChIP-GSM infers biologically meaningful TF modules and improves the prediction accuracy of regulatory region activities.
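
The final prediction step described in this abstract, logistic regression on module-region binding probabilities, can be illustrated with a minimal sketch. This is not the ChIP-GSM implementation: the Gibbs sampling stage is not shown, the probability values are hypothetical toy numbers, and the function names `fit_logistic` and `predict_proba` are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # derivative of the log-loss w.r.t. the linear score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(weights, bias, x):
    """Predicted probability that a region is an active regulatory element."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical toy data (not from the paper): one row per genomic region,
# one column per TF module, each entry a module-region binding probability.
module_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.7, 0.3],
    [0.2, 0.1],
    [0.1, 0.4],
    [0.3, 0.2],
]
# Hypothetical labels: 1 = active regulatory region, 0 = inactive.
active = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic(module_probs, active)
scores = [predict_proba(weights, bias, x) for x in module_probs]
```

In this toy setting only the first module separates active from inactive regions, so its fitted weight is positive. In practice the feature matrix would come from the sampler's inferred binding probabilities rather than hand-written values.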

Blood ◽  
1995 ◽  
Vol 85 (11) ◽  
pp. 3199-3207 ◽  
Author(s):  
F Ishimaru ◽  
MA Shipp

The cell surface zinc metalloproteinase CD10/neutral endopeptidase 24.11 (NEP) is expressed on normal and malignant lymphoid progenitors, granulocytes, and a variety of epithelial cells. To further define the tissue-specific and developmentally related expression of CD10/NEP, we have characterized two separate regulatory regions that control the transcription of 5′ alternatively spliced CD10/NEP transcripts. These type 1 and 2 CD10/NEP regulatory regions are both characterized by the presence of multiple transcription initiation sites and the absence of classic TATA boxes and consensus initiator elements. The purine-rich type 1 regulatory region, which includes 5′ UTR exon 1 sequence, is characterized by multiple putative PU.1 binding sites and consensus ets-binding motifs. In marked contrast, the GC-rich type 2 regulatory region contains multiple putative Sp1 binding sites, a potential consensus retinoblastoma control element (RCE), and an inverted CCAAT box. In the majority of tissues examined to date, type 2 CD10/NEP transcripts were more abundant; the abundance of type 1 transcripts was more variable, with the highest type 1 levels in fetal thymus and certain lymphoblastic leukemia cell lines.


Development ◽  
1989 ◽  
Vol 107 (3) ◽  
pp. 575-583 ◽  
Author(s):  
A. Busturia ◽  
J. Casanova ◽  
E. Sanchez-Herrero ◽  
R. Gonzalez ◽  
G. Morata

We report the embryonic and adult phenotypes of a number of mutations of the abd-A gene of the bithorax complex. Some of them result in loss of abd-A function in the whole abd-A domain and are usually lethal. These probably eliminate or inactivate abd-A protein products. Other mutations affect only part of the abd-A domain. These are viable, appear to map outside the abd-A transcription unit, and presumably alter the normal spatial regulation of abd-A products. We propose a model of abd-A structure based on a protein-coding region and two cis-regulatory regions. Regulatory region 1, 3′ to the transcription unit, contains positive and negative regulatory elements. Regulatory region 2, 5′ to the transcription unit, establishes the correct level of abd-A activity in the abdominal metameres.


2006 ◽  
Vol 26 (10) ◽  
pp. 3942-3954 ◽  
Author(s):  
Francesca Bosè ◽  
Cristina Fugazza ◽  
Maura Casalgrandi ◽  
Alessia Capelli ◽  
John M. Cunningham ◽  
...  

ABSTRACT We observed that binding sites for the ubiquitously expressed transcription factor CP2 were present in regulatory regions of multiple erythroid genes. In these regions, the CP2 binding site was adjacent to a site for the erythroid factor GATA-1. Using three such regulatory regions (from genes encoding the transcription factors GATA-1, EKLF, and p45 NF-E2), we demonstrated the functional importance of the adjacent CP2/GATA-1 sites. In particular, CP2 binds to the GATA-1 HS2 enhancer, generating a ternary complex with GATA-1 and DNA. Mutations in the CP2 consensus greatly impaired HS2 activity in transient transfection assays with K562 cells. Similar results were obtained by transfection of EKLF and p45 NF-E2 mutant constructs. Chromatin immunoprecipitation with K562 cells showed that CP2 binds in vivo to all three regulatory elements and that both GATA-1 and CP2 were present on the same GATA-1 and EKLF regulatory elements. Adjacent CP2/GATA-1 sites may represent a novel module for erythroid expression of a number of genes. Additionally, coimmunoprecipitation and glutathione S-transferase pull-down experiments demonstrated a physical interaction between GATA-1 and CP2. This may contribute to the functional cooperation between these factors and provide an explanation for the important role of ubiquitous CP2 in the regulation of erythroid genes.


2019 ◽  
Author(s):  
Xi Chen ◽  
Andrew F. Neuwald ◽  
Leena Hilakivi-Clarke ◽  
Robert Clarke ◽  
Jianhua Xuan

Abstract: Transcription factors (TFs) often function as cis-regulatory modules (CRMs) including both master factors and mediator coactivators to activate enhancers or promoters and regulate target gene transcription. Cell type-specific ChIP-seq profiling of multiple TFs makes it feasible to infer functional CRMs for a particular cell type. Approaches that infer CRMs from co-localization of TF ChIP-seq peaks are in use, yet peak callers miss many weak binding events, especially those of mediators, resulting in an incomplete identification of CRMs. We developed a ChIP-seq data-based CRM inference approach with Gibbs-Sampling (ChIP-GSM). In a Bayesian framework, ChIP-GSM samples read counts of TFs iteratively for the joint effect of each potential TF combination. Using inferred CRMs as novel features, ChIP-GSM employs a logistic regression model to predict active regulatory elements. Performance validation on FANTOM5 enhancer or promoter regions showed that CRMs predict regulatory region activity better than individual TFs. Finally, integrating CRMs inferred for K562 cells with gene expression data, we found that CRMs are likely to activate regulatory regions or genes at different time points to mediate distinct cellular functions.

Author Summary: Accurately inferring cis-regulatory modules (CRMs) from a large set of TFs is a challenging task because the binding signals of TFs are often weak, noisy and sensitive to the cellular environment. Nevertheless, investigating TF associations may help understand the difference between enhancer and promoter activation mechanisms. In this paper, we develop a computational method (ChIP-GSM) to infer CRMs acting on regulatory elements at enhancer and promoter regions. The novel method is built upon a Bayesian framework with Gibbs sampling that can be used to infer CRMs reliably and hence to predict regulatory elements. The performance of ChIP-GSM is compared to that of existing methods, demonstrating its improved performance. Experimental results demonstrate that CRMs identified by ChIP-GSM are likely activating regulatory regions at different time points to mediate distinct cellular functions.


2013 ◽  
Vol 368 (1632) ◽  
pp. 20130028 ◽  
Author(s):  
David L. Stern ◽  
Nicolás Frankel

In this paper, we provide a historical account of the contribution of a single line of research to our current understanding of the structure of cis-regulatory regions and the genetic basis for morphological evolution. We revisit the experiments that shed light on the evolution of larval cuticular patterns within the genus Drosophila and the evolution and structure of the shavenbaby gene. We describe the experiments that led to the discovery that multiple genetic changes in the cis-regulatory region of shavenbaby caused the loss of dorsal cuticular hairs (quaternary trichomes) in first instar larvae of Drosophila sechellia. We also discuss the experiments that showed that the convergent loss of quaternary trichomes in D. sechellia and Drosophila ezoana was generated by parallel genetic changes in orthologous enhancers of shavenbaby. We discuss the observation that multiple shavenbaby enhancers drive overlapping patterns of expression in the embryo and that these apparently redundant enhancers ensure robust shavenbaby expression and trichome morphogenesis under stressful conditions. All together, these data, collected over 13 years, provide a fundamental case study in the fields of gene regulation and morphological evolution, and highlight the importance of prolonged, detailed studies of single genes.


2021 ◽  
Author(s):  
Jian Ming Khor ◽  
Jennifer Guerrero-Santoro ◽  
Charles A Ettensohn

The gene regulatory network (GRN) that underlies echinoderm skeletogenesis is a prominent model of GRN architecture and evolution. KirrelL is an essential downstream effector gene in this network and encodes an Ig-superfamily protein required for the fusion of skeletogenic cells and the formation of the skeleton. In this study, we dissected the transcriptional control region of the kirrelL gene of the purple sea urchin, Strongylocentrotus purpuratus. Using plasmid- and BAC-based transgenic reporter assays, we identified key cis-regulatory elements (CREs) and transcription factor inputs that regulate Sp-kirrelL, including direct, positive inputs from two key transcription factors in the skeletogenic GRN, Alx1 and Ets1. We next identified kirrelL cis-regulatory regions from seven other echinoderm species that together represent all classes within the phylum. By introducing these heterologous regulatory regions into developing sea urchin embryos we provide evidence of their remarkable conservation across ~500 million years of evolution. We dissected in detail the kirrelL regulatory region of the sea star, Patiria miniata, and demonstrated that it also receives direct inputs from Alx1 and Ets1. Our findings identify kirrelL as a component of the ancestral echinoderm skeletogenic GRN. They support the view that GRN sub-circuits, including specific transcription factor-CRE interactions, can remain stable over vast periods of evolutionary history. Lastly, our analysis of kirrelL establishes direct linkages between a developmental GRN and an effector gene that controls a key morphogenetic cell behavior, cell-cell fusion, providing a paradigm for extending the explanatory power of GRNs.


2021 ◽  
Author(s):  
Tyler S. Klann ◽  
Alejandro Barrera ◽  
Adarsh R. Ettyreddy ◽  
Ryan A. Rickels ◽  
Julien Bryois ◽  
...  

Abstract: Noncoding regulatory elements control gene expression and govern all biological processes. Epigenomic profiling assays have identified millions of putative regulatory elements, but systematically determining the function of each of those regulatory elements remains a substantial challenge. Here we adapt CRISPR-dCas9-based epigenomic regulatory element screening (CERES) technology to screen all >100,000 putative non-coding regulatory elements defined by open chromatin sites in human K562 leukemia cells for their role in regulating essential cellular processes. In an initial screen containing more than 1 million gRNAs, we discovered approximately 12,000 regulatory elements with evidence of impact on cell fitness. We validated many of the screen hits in K562 cells, evaluated cell-type specificity in a second cancer cell line, and identified target genes of regulatory elements using CERES perturbations combined with single cell RNA-seq. This comprehensive and quantitative genome-wide map of essential regulatory elements represents a framework for extensive characterization of noncoding regulatory elements that drive complex cell phenotypes and for prioritizing non-coding genetic variants that likely contribute to common traits and disease risk.


2021 ◽  
Vol 22 (14) ◽  
pp. 7390
Author(s):  
Nicole Wesch ◽  
Frank Löhr ◽  
Natalia Rogova ◽  
Volker Dötsch ◽  
Vladimir V. Rogov

Ubiquitin fold modifier 1 (UFM1) is a member of the ubiquitin-like protein family. UFM1 undergoes a cascade of enzymatic reactions including activation by UBA5 (E1), transfer to UFC1 (E2) and selective conjugation to a number of target proteins via UFL1 (E3) enzymes. Despite the importance of ufmylation in a variety of cellular processes and its role in the pathogenicity of many human diseases, the molecular mechanisms of the ufmylation cascade remain unclear. In this study we focused on the biophysical and biochemical characterization of the interaction between UBA5 and UFC1. We explored the hypothesis that the unstructured C-terminal region of UBA5 serves as a regulatory region, controlling cellular localization of the elements of the ufmylation cascade and effective interaction between them. We found that the last 20 residues in UBA5 are pivotal for binding to UFC1 and can accelerate the transfer of UFM1 to UFC1. We solved the structure of a complex of UFC1 and a peptide spanning the last 20 residues of UBA5 by NMR spectroscopy. This structure in combination with additional NMR titration and isothermal titration calorimetry experiments revealed the mechanism of interaction and confirmed the importance of the C-terminal unstructured region in UBA5 for the ufmylation cascade.


eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Andrew R Bassett ◽  
Asifa Akhtar ◽  
Denise P Barlow ◽  
Adrian P Bird ◽  
Neil Brockdorff ◽  
...  

Although a small number of the vast array of animal long non-coding RNAs (lncRNAs) have known effects on cellular processes examined in vitro, the extent of their contributions to normal cell processes throughout development, differentiation and disease for the most part remains less clear. Phenotypes arising from deletion of an entire genomic locus cannot be unequivocally attributed either to the loss of the lncRNA per se or to the associated loss of other overlapping DNA regulatory elements. The distinction between cis- or trans-effects is also often problematic. We discuss the advantages and challenges associated with the current techniques for studying the in vivo function of lncRNAs in the light of different models of lncRNA molecular mechanism, and reflect on the design of experiments to mutate lncRNA loci. These considerations should assist in the further investigation of these transcriptional products of the genome.


1997 ◽  
Vol 323 (2) ◽  
pp. 511-519 ◽  
Author(s):  
Chad K. OH ◽  
Markus NEURATH ◽  
Jeong-Je CHO ◽  
Tekli SEMERE ◽  
Dean D. METCALFE

T-cell activation gene 3 (TCA3) encodes a β-chemokine that is transcriptionally regulated in mast cells; the gene has a functional NF-κB element at positions -194 to -185. The 5′-flanking region of this gene is also known to have a negative regulatory region between -2057 and -1342. To characterize the negative regulatory elements (NREs), this region was sequenced and then digested by HindIII enzyme into two fragments, NRE-1 (-2057 to -1493) and NRE-2 (-1492 to -1342). Both NRE-1 and NRE-2 in the 5′–3′ orientation inhibited chloramphenicol acetyltransferase (CAT)-protein synthesis by a TCA3–CAT construct transfected into mast cells that were then activated. Only NRE-1 inhibited CAT-protein synthesis in the 3′–5′ orientation. Further deletion of the 5′ region of NRE-1 partially abolished the inhibitory activity. Both NRE-1 and NRE-2 inhibited the activity of a CD20–CAT construct independent of cell activation. Electrophoretic mobility shift assays showed DNA–protein complex formation with subsequences (CCCCCATTCT) of NRE-1 (NRE-1a) and (CCATGA) of NRE-2 (NRE-2b). NRE-1a appears to be novel. NRE-2b is identical with a putative silencer motif in the αIIb integrin gene. Site-directed mutagenesis demonstrated that both NRE-1a and NRE-2b are important in the negative regulation of TCA3 promoter activity. In vivo ligation-mediated PCR footprinting of the NRE-2 region revealed protection between -1372 and -1354, which contains NRE-2b. The data thus demonstrate identity of a silencer motif, here termed NRE-2b, in both the αIIb integrin gene and TCA3, and that this silencer region in mast cells is functional both in vivo and in vitro. Further, evidence is presented that the promoter for TCA3 contains a novel silencer motif, termed NRE-1a, characterized by a CT-rich sequence.

