Functional signatures of evolutionarily young CTCF binding sites

2020 ◽  
Author(s):  
Dhoyazan Azazi ◽  
Jonathan M. Mudge ◽  
Duncan T. Odom ◽  
Paul Flicek

ABSTRACT The introduction of novel CTCF binding sites in gene regulatory regions in the rodent lineage is partly the effect of transposable element expansion. The exact mechanism and functional impact of evolutionarily novel CTCF binding sites are not yet fully understood. We investigated the impact of novel species-specific CTCF binding sites in two Mus genus subspecies, Mus musculus domesticus and Mus musculus castaneus, that diverged 0.5 million years ago. The activity of the B2-B4 family of transposable elements independently in both lineages leads to the proliferation of novel CTCF binding sites. A subset of evolutionarily young sites may harbour transcriptional functionality, as evidenced by the stability of their binding across multiple tissues in M. musculus domesticus (BL6), while overall the distance of species-specific CTCF binding to the nearest transcription start sites and/or topologically associated domains (TADs) is largely similar to musculus-common CTCF sites. Remarkably, we discovered a recurrent regulatory architecture consisting of a CTCF binding site and an interferon gene that appears to have been tandemly duplicated to create a 15-gene cluster on chromosome 4, thus forming a novel BL6-specific immune locus, in which CTCF may play a regulatory role. Our results demonstrate that thousands of CTCF binding sites show multiple functional signatures rapidly after incorporation into the genome.

BMC Biology ◽  
2020 ◽  
Vol 18 (1) ◽  
Author(s):  
Dhoyazan Azazi ◽  
Jonathan M. Mudge ◽  
Duncan T. Odom ◽  
Paul Flicek

Abstract Background: The introduction of novel CTCF binding sites in gene regulatory regions in the rodent lineage is partly the effect of transposable element expansion, particularly in the murine lineage. The exact mechanism and functional impact of evolutionarily novel CTCF binding sites are not yet fully understood. We investigated the impact of novel subspecies-specific CTCF binding sites in two Mus genus subspecies, Mus musculus domesticus and Mus musculus castaneus, that diverged 0.5 million years ago. Results: CTCF binding site evolution is influenced by the action of the B2-B4 family of transposable elements independently in both lineages, leading to the proliferation of novel CTCF binding sites. A subset of evolutionarily young sites may harbour transcriptional functionality as evidenced by the stability of their binding across multiple tissues in M. musculus domesticus (BL6), while overall the distance of subspecies-specific CTCF binding to the nearest transcription start sites and/or topologically associated domains (TADs) is largely similar to musculus-common CTCF sites. Remarkably, we discovered a recurrent regulatory architecture consisting of a CTCF binding site and an interferon gene that appears to have been tandemly duplicated to create a 15-gene cluster on chromosome 4, thus forming a novel BL6-specific immune locus in which CTCF may play a regulatory role. Conclusions: Our results demonstrate that thousands of CTCF binding sites show multiple functional signatures rapidly after incorporation into the genome.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Mayank NK Choudhary ◽  
Ryan Z. Friedman ◽  
Julia T. Wang ◽  
Hyo Sik Jang ◽  
Xiaoyu Zhuo ◽  
...  

Abstract Background: Transposable elements (TEs) make up half of mammalian genomes and shape genome regulation by harboring binding sites for regulatory factors. These include binding sites for architectural proteins, such as CTCF, RAD21, and SMC3, that are involved in tethering chromatin loops and marking domain boundaries. The 3D organization of the mammalian genome is intimately linked to its function and is remarkably conserved. However, the mechanisms by which these structural intricacies emerge and evolve have not been thoroughly probed. Results: Here, we show that TEs contribute extensively to both the formation of species-specific loops in humans and mice through deposition of novel anchoring motifs, as well as to the maintenance of conserved loops across both species through CTCF binding site turnover. The latter function demonstrates the ability of TEs to contribute to genome plasticity and reinforce conserved genome architecture as redundant loop anchors. Deleting such candidate TEs in human cells leads to the collapse of conserved loop and domain structures. These TEs are also marked by reduced DNA methylation and bear mutational signatures of hypomethylation through evolutionary time. Conclusions: TEs have long been considered a source of genetic innovation. By examining their contribution to genome topology, we show that TEs can contribute to regulatory plasticity by inducing redundancy and potentiating genetic drift locally while conserving genome architecture globally, revealing a paradigm for defining regulatory conservation in the noncoding genome beyond classic sequence-level conservation.


2018 ◽  
Vol 3 ◽  
pp. 105 ◽  
Author(s):  
Michi Miura ◽  
Paola Miyazato ◽  
Yorifumi Satou ◽  
Yuetsu Tanaka ◽  
Charles R.M. Bangham

Background: The human retrovirus HTLV-1 inserts the viral complementary DNA of 9 kb into the host genome. Both plus- and minus-strands of the provirus are transcribed, respectively from the 5′ and 3′ long terminal repeats (LTR). Plus-strand expression is rapid and intense once activated, whereas the minus-strand is transcribed at a lower, more constant level. To identify how HTLV-1 transcription is regulated, we investigated the epigenetic modifications associated with the onset of spontaneous plus-strand expression and the potential impact of the host factor CTCF. Methods: Patient-derived peripheral blood mononuclear cells (PBMCs) and in vitro HTLV-1-infected T cell clones were examined. Cells were stained for the plus-strand-encoded viral protein Tax, and sorted into Tax+ and Tax– populations. Chromatin immunoprecipitation and methylated DNA immunoprecipitation were performed to identify epigenetic modifications in the provirus. Bisulfite-treated DNA fragments from the HTLV-1 LTRs were sequenced. Single-molecule RNA-FISH was performed, targeting HTLV-1 transcripts, for the estimation of transcription kinetics. The CRISPR/Cas9 technique was applied to alter the CTCF-binding site in the provirus, to test the impact of CTCF on the epigenetic modifications. Results: Changes in the histone modifications H3K4me3, H3K9Ac and H3K27Ac were strongly correlated with plus-strand expression. DNA in the body of the provirus was largely methylated except for the pX and 3′ LTR regions, regardless of Tax expression. The plus-strand promoter was hypomethylated when Tax was expressed. Removal of CTCF had no discernible impact on the viral transcription or epigenetic modifications. Conclusions: The histone modifications H3K4me3, H3K9Ac and H3K27Ac are highly dynamic in the HTLV-1 provirus: they show rapid change with the onset of Tax expression, and are reversible. The HTLV-1 provirus has an intrinsic pattern of epigenetic modifications that is independent of both the provirus insertion site and the chromatin architectural protein CTCF which binds to the HTLV-1 provirus.


2020 ◽  
Vol 64 (4) ◽  
pp. R45-R56 ◽  
Author(s):  
Andrea Hanel ◽  
Henna-Riikka Malmberg ◽  
Carsten Carlberg

Molecular endocrinology of vitamin D is based on the activation of the transcription factor vitamin D receptor (VDR) by the vitamin D metabolite 1α,25-dihydroxyvitamin D3. This nuclear vitamin D-sensing process causes epigenome-wide effects, such as changes in chromatin accessibility as well as in the contact of VDR and its supporting pioneer factors with thousands of genomic binding sites, referred to as vitamin D response elements. VDR binding enhancer regions loop to transcription start sites of hundreds of vitamin D target genes resulting in changes of their expression. Thus, vitamin D signaling is based on epigenome- and transcriptome-wide shifts in VDR-expressing tissues. Monocytes are the most responsive cell type of the immune system and serve as a paradigm for uncovering the chromatin model of vitamin D signaling. In this review, an alternative approach for selecting vitamin D target genes is presented, which are most relevant for understanding the impact of vitamin D endocrinology on innate immunity. Different scenarios of the regulation of primary upregulated vitamin D target genes are presented, in which vitamin D-driven super-enhancers comprise a cluster of persistent (constant) and/or inducible (transient) VDR-binding sites. In conclusion, the spatio-temporal VDR binding in the context of chromatin is most critical for the regulation of vitamin D target genes.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Elissavet Kentepozidou ◽  
Sarah J. Aitken ◽  
Christine Feig ◽  
Klara Stefflova ◽  
Ximena Ibarra-Soria ◽  
...  

Abstract Background: CTCF binding contributes to the establishment of a higher-order genome structure by demarcating the boundaries of large-scale topologically associating domains (TADs). However, despite the importance and conservation of TADs, the role of CTCF binding in their evolution and stability remains elusive. Results: We carry out an experimental and computational study that exploits the natural genetic variation across five closely related species to assess how CTCF binding patterns stably fixed by evolution in each species contribute to the establishment and evolutionary dynamics of TAD boundaries. We perform CTCF ChIP-seq in multiple mouse species to create genome-wide binding profiles and associate them with TAD boundaries. Our analyses reveal that CTCF binding is maintained at TAD boundaries by a balance of selective constraints and dynamic evolutionary processes. Regardless of their conservation across species, CTCF binding sites at TAD boundaries are subject to stronger sequence and functional constraints compared to other CTCF sites. TAD boundaries frequently harbor dynamically evolving clusters containing both evolutionarily old and young CTCF sites as a result of the repeated acquisition of new species-specific sites close to conserved ones. The overwhelming majority of clustered CTCF sites colocalize with cohesin and are significantly closer to gene transcription start sites than nonclustered CTCF sites, suggesting that CTCF clusters particularly contribute to cohesin stabilization and transcriptional regulation. Conclusions: Dynamic conservation of CTCF site clusters is an apparently important feature of CTCF binding evolution that is critical to the functional stability of a higher-order chromatin structure.


2020 ◽  
Author(s):  
Emilia Puig Lombardi ◽  
Madalena Tarsounas

ABSTRACT Topologically associating domains (TADs) are units of the genome architecture defined by binding sites for the CTCF transcription factor and cohesin-mediated loop extrusion. Genomic regions containing DNA replication initiation sites have been mapped in the proximity of TAD boundaries. However, the factors that determine this positioning have not been identified. Moreover, the impact of TADs on the directionality of replication fork progression remains unknown. Here we use EdU-seq technology to map origin firing sites at 10 kb resolution and to monitor replication fork progression after restart from hydroxyurea arrest. We show that origins firing in early/mid S-phase within TAD boundaries map to two distinct peaks flanking the centre of the boundary, which is occupied by CTCF and cohesin. When transcription is inhibited chemically or deregulated by oncogene overexpression, replication origins become repositioned to the centre of the TAD. Furthermore, we demonstrate the strikingly asymmetric fork progression initiating from origins located within TAD boundaries. Divergent CTCF binding sites and neighbouring TADs with different replication timing (RT) cause fork stalling in regions external to the TAD. Thus, our work assigns for the first time a role to transcription within TAD boundaries in promoting replication origin firing and demonstrates how genomic regions adjacent to the TAD boundaries could restrict replication progression.


2017 ◽  
Author(s):  
David Thybert ◽  
Maša Roller ◽  
Fábio C.P. Navarro ◽  
Ian Fiddes ◽  
Ian Streeter ◽  
...  

ABSTRACT Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 MYA, but that are absent in the Hominidae. In fact, Hominidae show between four- and seven-fold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. For example, recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli. This process resulted in thousands of novel, species-specific CTCF binding sites. Our results demonstrate that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology.


2019 ◽  
Author(s):  
Adam G Diehl ◽  
Ningxin Ouyang ◽  
Alan P Boyle

Abstract Background: Chromatin looping is exceedingly important to gene regulation and a host of other nuclear processes. Many recent insights into 3D chromatin structure across species and cell types have contributed to our understanding of the principles governing chromatin looping. However, 3D genome evolution and how it relates to Mendelian selection remain largely unexplored. CTCF, an insulator protein found at most loop anchors, has been described as the “master weaver” of mammalian genomes, and variations in CTCF occupancy are known to influence looping divergence. A large fraction of mammalian CTCF binding sites fall within transposable elements (TEs) but their contributions to looping variation are unknown. Here we investigated the effect of TE-driven CTCF binding site expansions on chromatin looping in human and mouse. Results: TEs have broadly contributed to CTCF binding and loop boundary specification, primarily forming variable loops across species and cell types and contributing nearly 1/3 of species-specific and cell-specific loops. Conclusions: Our results demonstrate that TE activity is a major source of looping variability across species and cell types. Thus, TE-mediated CTCF expansions explain a large fraction of population-level looping variation and may play a role in adaptive evolution.


2019 ◽  
Author(s):  
Jessica E. Davis ◽  
Kimberly D. Insigne ◽  
Eric M. Jones ◽  
Quinn B Hastings ◽  
Sriram Kosuri

Abstract In eukaryotes, transcription factors orchestrate gene expression by binding to TF-Binding Sites (TFBSs) and localizing transcriptional co-regulators and RNA Polymerase II to cis-regulatory elements. The strength and regulation of transcription can be modulated by a variety of factors including TFBS composition, TFBS affinity and number, distance between TFBSs, distance of TFBSs to transcription start sites, and epigenetic modifications. We still lack a basic comprehension of how such variables shaping cis-regulatory architecture culminate in quantitative transcriptional responses. Here we explored how such factors determine the transcriptional activity of a model transcription factor, the c-AMP Response Element (CRE) binding protein. We measured expression driven by 4,602 synthetic regulatory elements in a massively parallel reporter assay (MPRA) exploring the impact of CRE number, affinity, distance to the promoter, and spacing between multiple CREs. We found that the number and affinity of CREs within regulatory elements largely determine overall expression, and this relationship is shaped by the proximity of each CRE to the downstream promoter. In addition, while we observed expression periodicity as the CRE distance to the promoter varied, the spacing between multiple CREs altered this periodicity. Finally, we compare library expression between an episomal MPRA and a new, genomically integrated MPRA in which a single synthetic regulatory element is present per cell at a defined locus. We observe that these largely recapitulate each other, although weaker, non-canonical CREs exhibited greater activity in the genomic context.


2019 ◽  
Author(s):  
Mayilaadumveettil Nishana ◽  
Caryn Ha ◽  
Javier Rodriguez-Hernaez ◽  
Ali Ranjbaran ◽  
Erica Chio ◽  
...  

Background: Ubiquitously expressed CTCF is involved in numerous cellular functions, such as organizing chromatin into TAD structures. In contrast, its paralog, CTCFL is normally only present in testis. However, it is also aberrantly expressed in many cancers. While it is known that shared and unique zinc finger sequences in CTCF and CTCFL enable CTCFL to bind competitively to a subset of CTCF binding sites as well as its own unique locations, the impact of CTCFL on chromosome organization and gene expression has not been comprehensively analyzed in the context of CTCF function. Using an inducible complementation system, we analyze the impact of expressing CTCFL and CTCF-CTCFL chimeric proteins in the presence or absence of endogenous CTCF to clarify the relative and combined contribution of CTCF and CTCFL to chromosome organization and transcription. Results: We demonstrate that the N terminus of CTCF interacts with cohesin which explains the requirement for convergent CTCF binding sites in loop formation. By analyzing CTCF and CTCFL binding in tandem we identify phenotypically distinct sites with respect to motifs, targeting to promoter/intronic intergenic regions and chromatin folding. Finally, we reveal that the N, C and zinc finger terminal domains play unique roles in targeting each paralog to distinct binding sites, to regulate transcription, chromatin looping and insulation. Conclusion: This study clarifies the unique and combined contribution of CTCF and CTCFL to chromosome organization and transcription, with direct implications for understanding how their co-expression deregulates transcription in cancer.

