Synthetic and genomic regulatory elements reveal aspects of cis-regulatory grammar in mouse embryonic stem cells

eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Dana M King ◽  
Clarice Kit Yee Hong ◽  
James L Shepherdson ◽  
David M Granas ◽  
Brett B Maricque ◽  
...  

In embryonic stem cells (ESCs), a core transcription factor (TF) network establishes the gene expression program necessary for pluripotency. To address how interactions between four key TFs contribute to cis-regulation in mouse ESCs, we assayed two massively parallel reporter assay (MPRA) libraries composed of binding sites for SOX2, POU5F1 (OCT4), KLF4, and ESRRB. Comparisons between synthetic cis-regulatory elements and genomic sequences with comparable binding site configurations revealed some aspects of a regulatory grammar. The expression of synthetic elements is influenced by both the number and arrangement of binding sites. This grammar plays only a small role for genomic sequences, as the relative activities of genomic sequences are best explained by the predicted occupancy of binding sites, regardless of binding site identity and positioning. Our results suggest that the effects of transcription factor binding sites (TFBS) are influenced by the order and orientation of sites, but that in the genome the overall occupancy of TFs is the primary determinant of activity.
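
To make the idea of "predicted occupancy" concrete, the following is a minimal Python sketch under a simple two-state (bound/unbound) binding assumption, in which each site's occupancy is a logistic function of a motif score and an element's total is the sum over its sites. The function names, the score scale, and the `mu` offset are illustrative assumptions and are not taken from the paper; the occupancy model actually used in the study may differ.

```python
import math


def site_occupancy(score, mu=0.0):
    """Two-state occupancy for a single binding site.

    `score` is a hypothetical log-scale motif score (higher = stronger site).
    `mu` is a hypothetical offset standing in for TF concentration.
    """
    return 1.0 / (1.0 + math.exp(-(score - mu)))


def total_occupancy(scores, mu=0.0):
    """Sum per-site occupancies, ignoring site identity and position."""
    return sum(site_occupancy(s, mu) for s in scores)


if __name__ == "__main__":
    strong_sites = [3.0, 2.5, 3.2]   # made-up scores for illustration
    weak_sites = [-1.0, 0.0, -0.5]   # made-up scores for illustration
    print(total_occupancy(strong_sites) > total_occupancy(weak_sites))
```

Summing over sites without regard to which TF binds or where the site sits mirrors the abstract's statement that genomic activity tracks overall occupancy "regardless of binding site identity and positioning"; any grammar term would have to be added on top of this baseline.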

2018 ◽  
Author(s):  
Dana M. King ◽  
Brett B. Maricque ◽  
Barak A. Cohen

In embryonic stem cells (ESCs), a core network of transcription factors establishes and maintains the gene expression program necessary to grow indefinitely in cell culture and generate all three primary germ layers. To understand how interactions between four key pluripotency transcription factors (TFs), SOX2, POU5F1 (OCT4), KLF4, and ESRRB, contribute to cis-regulation in mouse ESCs, we assayed two massively parallel reporter assay (MPRA) libraries composed of different combinations of binding sites for these TFs. One library was an exhaustive set of synthetic cis-regulatory elements and the second was a set of genomic sequences with comparable configurations of binding sites. Comparisons between the libraries allowed us to determine the regulatory grammar requirements for these binding sites in constrained synthetic contexts versus genomic sequence contexts. We found that binding site quality is a common attribute for active elements in both the synthetic and genomic contexts. For synthetic regulatory elements, the level of expression is mostly determined by the number of binding sites but is tuned by a grammar that includes position effects. Surprisingly, this grammar appears to play only a small role in setting the output levels of genomic sequences. The relative activity of genomic sequences is best explained by the predicted affinity of binding sites, regardless of identity, and optimized spacing between sites. Our findings highlight the need for detailed examinations of complex sequence space when trying to understand cis-regulatory grammar in the genome.
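
As an illustration of what an "exhaustive set" of synthetic elements can look like, the Python sketch below enumerates every ordered, oriented arrangement of the four binding sites. The design rules encoded here (two to four sites per element, each site used at most once, two possible orientations per site) are assumptions made for the example; the abstract does not spell out the library design.

```python
from itertools import permutations, product

SITES = ("SOX2", "OCT4", "KLF4", "ESRRB")
ORIENTATIONS = ("+", "-")  # forward or reverse orientation of a site


def enumerate_elements(sites=SITES, min_sites=2, max_sites=4):
    """Yield every ordered, oriented arrangement using each site at most once."""
    for k in range(min_sites, max_sites + 1):
        for order in permutations(sites, k):
            for strands in product(ORIENTATIONS, repeat=k):
                yield tuple(zip(order, strands))


def count_by_size(elements):
    """Count elements grouped by how many binding sites they contain."""
    counts = {}
    for element in elements:
        counts[len(element)] = counts.get(len(element), 0) + 1
    return counts


if __name__ == "__main__":
    print(count_by_size(enumerate_elements()))
    # prints {2: 48, 3: 192, 4: 384}
```

Under these assumed rules the design space is small enough (624 arrangements) to assay exhaustively, which is what makes it possible to separate the effect of site number from the effects of order and orientation.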


2012 ◽  
Vol 9 (2) ◽  
pp. 88-100 ◽  
Author(s):  
Yuriy Orlov ◽  
Han Xu ◽  
Dmitri Afonnikov ◽  
Bing Lim ◽  
Jian-Chien Heng ◽  
...  

Summary Advances in high-throughput sequencing technology have enabled the identification of transcription factor (TF) binding sites at genome scale. TF binding studies are important for medical applications and stem cell research. Somatic cells can be reprogrammed to a pluripotent state by the combined introduction of factors such as Oct4, Sox2, c-Myc, and Klf4. These reprogrammed cells share many characteristics with embryonic stem cells (ESCs) and are known as induced pluripotent stem cells (iPSCs). The signaling requirements for maintenance of human and murine ESCs differ considerably. Genome-wide ChIP-seq TF binding maps in mouse stem cells include Oct4, Sox2, Nanog, Tbx3, and Smad2, as well as a group of other factors. ChIP-seq allows the study of new candidate transcription factors for reprogramming. It was shown that Nr5a2 could replace Oct4 for reprogramming. Epigenetic modifications play an important role in the regulation of gene expression, adding complexity to the functioning of the transcription network. We have studied associations between different histone modifications using published data together with RNA Pol II sites. We found strong associations between activation marks and TF binding sites and present them qualitatively. To address the statistical analysis of genome-wide ChIP-seq maps, we developed a computer program to filter out noise signals and find significant associations between binding site affinity and the number of sequence reads. The data provide new insights into the function of chromatin organization and regulation in stem cells.


2012 ◽  
Vol 26 (5) ◽  
pp. 859-872 ◽  
Author(s):  
Rangan Gupta ◽  
Toshihiko Ezashi ◽  
R. Michael Roberts

Abstract The subunit genes encoding human chorionic gonadotropin, CGA and CGB, are up-regulated in human trophoblast. However, they are effectively silenced in choriocarcinoma cells by ectopically expressed POU domain class 5 transcription factor 1 (POU5F1). Here we show that POU5F1 represses activity of the CGA promoter through its interactions with ETS2, a transcription factor required for both placental development and human chorionic gonadotropin subunit gene expression, by forming a complex that precludes ETS2 from interacting with the CGA promoter. Mutation of a POU5F1 binding site proximal to the ETS2 binding site does not alter the ability of POU5F1 to act as a repressor but causes a drop in basal promoter activity due to overlap with the binding site for DLX3. DLX3 has only a modest ability to raise basal CGA promoter activity, but its coexpression with ETS2 can up-regulate it 100-fold or more. The two factors form a complex, and both must bind to the promoter for the combination to be transcriptionally effective, a synergy compromised by POU5F1. Similarly, in human embryonic stem cells, which express ETS2 but not CGA, ETS2 does not occupy its binding site on the CGA promoter but is found instead as a soluble complex with POU5F1. When human embryonic stem cells differentiate in response to bone morphogenetic protein-4, concentrations of POU5F1 fall and those of hCG and DLX3 rise, and ETS2 then occupies its binding site on the CGA promoter. Hence, a squelching mechanism underpins the transcriptional silencing of CGA by POU5F1 and could have general relevance to how pluripotency is maintained and how the trophoblast lineage emerges from pluripotent precursor cells.


2011 ◽  
Author(s):  
Μαρία Καπασά

Mammalian development occurs by the progressive determination of cells from a pluripotent undifferentiated state through successive states of gradually restricted developmental potential, until the full complement of mature terminally differentiated cells has been specified. Embryonic development is a complex and highly orchestrated process during which multiple cell movements and changes in gene expression must be spatially and temporally coordinated to ensure that embryogenesis proceeds correctly. Complex genetic regulatory networks receive input in the form of extracellular signals and output instructions on the regulated expression of specific genes. The linchpins of the regulatory networks are the cis-regulatory elements that directly control gene expression through interpretation of the tissue-specific transcription factors (trans-elements). Embryonic stem cells are orientated across the dorso-ventral and the anterior-posterior axes of the early embryo. The orientation of progenitor cells along these two axes is thought to influence their fate by defining the identity and concentration of inductive signals to which they are exposed. In an effort to develop cell-based therapies (e.g. for diabetes), experimental protocols aim to mimic the biological processes that take place during embryonic development in order to differentiate embryonic stem cells towards specific cell types. One of the foremost challenges in developing cell therapies for people with diabetes is to achieve the directed differentiation of cells capable of producing insulin. Elucidation of the genetic networks involved in endocrine pancreas specification is thought to be essential for devising rational protocols to efficiently differentiate embryonic stem cells or pancreas progenitor cells into fully differentiated endocrine subtypes.
Computational approaches allow the unravelling of complex regulatory networks including genomic (cis-cis) or proteomic (trans-trans) interactions or a combination (cis-trans) of both. In this study the genomic regulatory regions (cis elements) of several genes that are known or putative targets of the transcription factor NGN3 were analyzed. The NGN3 transcription factor is the major regulator of “insulin-producing cell” formation. Taking into account data from microarray experiments on pancreas progenitor cells in which NGN3 had been induced, genes shown to be co-regulated (upregulated or downregulated) by this transcription factor were selected for analysis. Using a combination of sophisticated computational tools for exploiting and analyzing genomic data, and developing suitable algorithms, an extensive in silico analysis of the regulatory regions of these genes was performed. Evolutionarily conserved regions are linked with experimentally identified regulatory elements. Comparative genomics is commonly used to identify transcription factor binding sites, which are functionally important regions that are thought to be well conserved. Analysis of genomic regulatory regions included not only genes co-regulated by NGN3, but also their orthologs in several species, including the most phylogenetically distant species that have a pancreas (fish). In parallel, housekeeping genes, like B-ACTIN, and those not expressed in embryos and stem cells, like B-GLOBIN, were used as negative controls. Regulatory region analysis revealed the presence of a highly conserved regulatory element, where many transcription factors with established involvement in pancreas development bind, in all the orthologs of several genes co-regulated by NGN3. Furthermore, motif identification in separate clusters of the regulatory elements of either upregulated or downregulated genes revealed the presence of additional binding motifs for the factor AP4 only in downregulated genes.
In parallel, regulatory region analysis of the entire mouse genome and statistical analysis of the resulting data showed that both types of regulatory elements (with and without AP4) were non-randomly identified inside the regulatory regions of genes whose transcription is controlled by NGN3. Moreover, the selective presence of the AP4 binding sequence in this region renders it a highly specific suppressor found in only a small number of genes downregulated by NGN3. Taking into account that both these regulatory elements were identified at considerable distances from each gene’s transcription start site, it was assumed that they represent enhancers, and those capable of binding AP4 were considered silencers. This conclusion was reinforced by the compositional analysis of these regions showing low GC levels, similar to the majority of the regulatory regions implicated in embryonic development, something that has not been reported for promoter sequences. Moreover, analysis of protein-protein interactions showed that some of the transcription factors predicted to bind to these elements, together with other non-specific transcription factors, constitute a core transcription control complex. This protein complex interacts with the remaining members of the predicted cluster of transcription regulators and works either as an inducer or a suppressor of transcription. This is determined by the presence of a HAT and/or an HDAC in this protein complex, which is assumed to locally control chromatin acetylation. Based on these data, we constructed a model of the complex regulatory network that describes how, through the transcriptional regulation of the analyzed genes mainly guided by NGN3, the gradual differentiation of cells capable of producing insulin takes place.


Author(s):  
Gurdeep Singh ◽  
Shanelle Mullany ◽  
Sakthi D Moorthy ◽  
Richard Zhang ◽  
Tahmid Mehdi ◽  
...  

ABSTRACT Transcriptional enhancers are critical for development and phenotype evolution and are often mutated in disease contexts; however, even in well-studied cell types, the sequence code conferring enhancer activity remains unknown. We found that genomic regions with conserved binding of multiple transcription factors in mouse and human embryonic stem cells (ESCs) contain on average 12.6 conserved transcription factor binding sites (TFBS). These TFBS are a diverse repertoire of 70 different sequences representing the binding sites of both known and novel ESC regulators. Remarkably, using a diverse set of TFBS from this repertoire was sufficient to construct short synthetic enhancers with activity comparable to native enhancers. Site-directed mutagenesis of conserved TFBS in endogenous enhancers or TFBS deletion from synthetic sequences revealed a requirement for more than ten different TFBS. Furthermore, specific TFBS, including the OCT4:SOX2 co-motif, are dispensable, despite co-binding the OCT4, SOX2 and NANOG master regulators of pluripotency. These findings reveal that a TFBS diversity threshold overrides the need for optimized regulatory grammar and individual TFBS that bind specific master regulators.


1994 ◽  
Vol 14 (5) ◽  
pp. 3108-3114
Author(s):  
M H Baron ◽  
S M Farrington

The zinc finger transcription factor GATA-1 is a major regulator of gene expression in erythroid, megakaryocyte, and mast cell lineages. GATA-1 binds to WGATAR consensus motifs in the regulatory regions of virtually all erythroid cell-specific genes. Analyses with cultured cells and cell-free systems have provided strong evidence that GATA-1 is involved in control of globin gene expression during erythroid differentiation. Targeted mutagenesis of the GATA-1 gene in embryonic stem cells has demonstrated its requirement in normal erythroid development. Efficient rescue of the defect requires an intact GATA element in the distal promoter, suggesting autoregulatory control of GATA-1 transcription. To examine whether GATA-1 expression involves additional regulatory factors or is maintained entirely by an autoregulatory loop, we have used a transient heterokaryon system to test the ability of erythroid factors to activate the GATA-1 gene in nonerythroid nuclei. We show here that proerythroblasts and mature erythroid cells contain a diffusible activity (TAG) capable of transcriptional activation of GATA-1 and that this activity decreases during the terminal differentiation of erythroid cells. Nuclei from GATA-1⁻ mutant embryonic stem cells can still be reprogrammed to express their globin genes in erythroid heterokaryons, indicating that de novo induction of GATA-1 is not required for globin gene activation following cell fusion.
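
The WGATAR consensus mentioned above uses IUPAC ambiguity codes (W = A or T, R = A or G). As a minimal sketch of how such a consensus can be located in a DNA sequence, the Python example below expands the codes into a regular expression and scans both strands. The function names and the example sequences are hypothetical and serve only as an illustration; this is not the method used in the study.

```python
import re

# IUPAC codes needed for the WGATAR consensus: W = A or T, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "R": "[AG]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Expand an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, consensus="WGATAR"):
    """Return sorted (start, strand) hits; start is 0-based on the forward strand."""
    seq = seq.upper()
    # A lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    width = len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    # Convert reverse-strand coordinates back to forward-strand positions.
    hits += [(len(seq) - m.start() - width, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


if __name__ == "__main__":
    print(find_motif("CCAGATAGCC"))  # made-up sequence containing AGATAG
```

A consensus match like this only flags candidate sites; whether GATA-1 actually binds a given site depends on context that a string search does not capture.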


Circulation ◽  
2014 ◽  
Vol 130 (suppl_2) ◽  
Author(s):  
Kiwon Ban ◽  
Brian Wile ◽  
Kyu-Won Cho ◽  
Sangsung Kim ◽  
Jason Singerd ◽  
...  

Background: Ventricular cardiomyocytes (CMs) are an ideal cell type for cardiac cell therapy since they are the main cells generating cardiac forces. However, isolating them from differentiating pluripotent stem cells (PSCs) has been challenging due to the lack of specific surface markers. Here we show that ventricular CMs can be purified from differentiating mouse embryonic stem cells (mESCs) using molecular beacons (MBs) targeting specific intracellular mRNAs. MBs are dual-labeled oligonucleotide hairpin probes that emit a fluorescence signal when hybridized to target mRNAs, allowing isolation of specific target cells by fluorescence-activated cell sorting (FACS) with high specificity and sensitivity. Methods and Results: We generated three different MBs (IRX4-1, -2, -3) designed to target specific regions of the mRNA of iroquois homeobox protein 4 (Irx4), a transcription factor specific to ventricular CMs. Among the three IRX4 MBs, IRX4-2 MB demonstrated the highest sensitivity and specificity and was therefore selected to purify mESC-derived ventricular CMs. Subsequently, IRX4-2 MBs were delivered into cardiomyogenically differentiating mESC cultures, and cells showing strong signals from IRX4-2 MBs were FACS-sorted. Flow cytometry demonstrated that 92-97% of IRX4-2 MB-positive cells expressed a marker for ventricular CMs, myosin light chain 2 ventricular isoform (Myl2), as well as cardiac troponin T2 (Tnnt2). Importantly, more than 98% of IRX4-2 MB-positive cells displayed ventricular CM-like action potentials during electrophysiological analyses. These IRX4-2 MB-purified ventricular CMs continuously maintained their CM characteristics, as verified by synchronous beating, Ca2+ transients, and expression of ventricular CM-specific proteins. Conclusions: We established a novel MB-based cell sorting system targeting a transcription factor specific to ventricular CMs to generate homogeneous and functional ventricular CMs. This is the first report to show the feasibility of isolating pure ventricular CMs without modifying host genes, and this platform will be useful for therapeutic applications, disease modeling, and drug discovery.

