ZCCHC8 is required for the degradation of pervasive transcripts originating from multiple genomic regulatory features

2021 ◽  
Author(s):  
Joshua W. Collins ◽  
Daniel Martin ◽  
Shaohe Wang ◽  
Kenneth M. Yamada ◽  

ABSTRACT The vast majority of mammalian genomes are transcribed as non-coding RNA in what is referred to as “pervasive transcription.” Recent studies have uncovered various families of non-coding RNA transcribed upstream of transcription start sites. In particular, highly unstable promoter upstream transcripts known as PROMPTs have been shown to be targeted for exosomal degradation by the nuclear exosome targeting complex (NEXT) consisting of the RNA helicase MTR4, the zinc-knuckle scaffold ZCCHC8, and the RNA binding protein RBM7. Here, we report that in addition to its known RNA substrates, ZCCHC8 is required for the targeted degradation of pervasive transcripts produced at CTCF binding sites, open chromatin regions, promoters, promoter flanking regions, and transcription factor binding sites. Additionally, we report that a significant number of RIKEN cDNAs and predicted genes display the hallmarks of PROMPTs and are also substrates for ZCCHC8 and/or NEXT complex regulation, suggesting these are unlikely to be functional genes. Our results suggest that ZCCHC8 and/or the NEXT complex may play a larger role in the global regulation of pervasive transcription than previously reported.

2015 ◽  
Vol 112 (47) ◽  
pp. E6456-E6465 ◽  
Author(s):  
Adrian L. Sanborn ◽  
Suhas S. P. Rao ◽  
Su-Chen Huang ◽  
Neva C. Durand ◽  
Miriam H. Huntley ◽  
...  

We recently used in situ Hi-C to create kilobase-resolution 3D maps of mammalian genomes. Here, we combine these maps with new Hi-C, microscopy, and genome-editing experiments to study the physical structure of chromatin fibers, domains, and loops. We find that the observed contact domains are inconsistent with the equilibrium state for an ordinary condensed polymer. Combining Hi-C data and novel mathematical theorems, we show that contact domains are also not consistent with a fractal globule. Instead, we use physical simulations to study two models of genome folding. In one, intermonomer attraction during polymer condensation leads to formation of an anisotropic “tension globule.” In the other, CCCTC-binding factor (CTCF) and cohesin act together to extrude unknotted loops during interphase. Both models are consistent with the observed contact domains and with the observation that contact domains tend to form inside loops. However, the extrusion model explains a far wider array of observations, such as why loops tend not to overlap and why the CTCF-binding motifs at pairs of loop anchors lie in the convergent orientation. Finally, we perform 13 genome-editing experiments examining the effect of altering CTCF-binding sites on chromatin folding. The convergent rule correctly predicts the affected loops in every case. Moreover, the extrusion model accurately predicts in silico the 3D maps resulting from each experiment using only the location of CTCF-binding sites in the WT. Thus, we show that it is possible to disrupt, restore, and move loops and domains using targeted mutations as small as a single base pair.
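The convergent rule described in this abstract can be illustrated with a minimal sketch. It assumes each CTCF site is given as a hypothetical `(position, strand)` tuple and treats every forward-strand site paired with a downstream reverse-strand site as a candidate loop. The function name and data layout are illustrative assumptions, not the authors' code, and the sketch ignores distance limits, cohesin dynamics, and the extrusion simulations themselves.

```python
from itertools import combinations


def convergent_pairs(sites):
    """Return (left, right) position pairs whose CTCF motifs point toward each other.

    `sites` is a list of (position, strand) tuples; this layout is an
    assumption made for illustration. A pair is convergent when the
    upstream motif is on the '+' strand and the downstream motif is on
    the '-' strand.
    """
    ordered = sorted(sites)  # sort by genomic position
    return [
        (left[0], right[0])
        for left, right in combinations(ordered, 2)
        if left[1] == "+" and right[1] == "-"
    ]


if __name__ == "__main__":
    wild_type = [(100, "+"), (500, "-"), (900, "+")]
    # Hypothetical edit: invert the motif at position 500.
    edited = [(100, "+"), (500, "+"), (900, "+")]
    print("wild type:", convergent_pairs(wild_type))
    print("edited:   ", convergent_pairs(edited))
```

Comparing the candidate pairs before and after a simulated motif inversion gives a rough picture of which loops the rule would flag as affected; a real prediction would need the full extrusion model.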


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Mayank NK Choudhary ◽  
Ryan Z. Friedman ◽  
Julia T. Wang ◽  
Hyo Sik Jang ◽  
Xiaoyu Zhuo ◽  
...  

Abstract
Background: Transposable elements (TEs) make up half of mammalian genomes and shape genome regulation by harboring binding sites for regulatory factors. These include binding sites for architectural proteins, such as CTCF, RAD21, and SMC3, that are involved in tethering chromatin loops and marking domain boundaries. The 3D organization of the mammalian genome is intimately linked to its function and is remarkably conserved. However, the mechanisms by which these structural intricacies emerge and evolve have not been thoroughly probed.
Results: Here, we show that TEs contribute extensively both to the formation of species-specific loops in humans and mice through deposition of novel anchoring motifs and to the maintenance of conserved loops across both species through CTCF binding site turnover. The latter function demonstrates the ability of TEs to contribute to genome plasticity and reinforce conserved genome architecture as redundant loop anchors. Deleting such candidate TEs in human cells leads to the collapse of conserved loop and domain structures. These TEs are also marked by reduced DNA methylation and bear mutational signatures of hypomethylation through evolutionary time.
Conclusions: TEs have long been considered a source of genetic innovation. By examining their contribution to genome topology, we show that TEs can contribute to regulatory plasticity by inducing redundancy and potentiating genetic drift locally while conserving genome architecture globally, revealing a paradigm for defining regulatory conservation in the noncoding genome beyond classic sequence-level conservation.


2020 ◽  
Author(s):  
Dhoyazan Azazi ◽  
Jonathan M. Mudge ◽  
Duncan T. Odom ◽  
Paul Flicek

ABSTRACT The introduction of novel CTCF binding sites in gene regulatory regions in the rodent lineage is partly the effect of transposable element expansion. The exact mechanism and functional impact of evolutionarily novel CTCF binding sites are not yet fully understood. We investigated the impact of novel species-specific CTCF binding sites in two Mus genus subspecies, Mus musculus domesticus and Mus musculus castaneus, that diverged 0.5 million years ago. The activity of the B2-B4 family of transposable elements independently in both lineages leads to the proliferation of novel CTCF binding sites. A subset of evolutionarily young sites may harbour transcriptional functionality, as evidenced by the stability of their binding across multiple tissues in M. musculus domesticus (BL6), while overall the distance of species-specific CTCF binding to the nearest transcription start sites and/or topologically-associated domains (TADs) is largely similar to musculus-common CTCF sites. Remarkably, we discovered a recurrent regulatory architecture consisting of a CTCF binding site and an interferon gene that appears to have been tandemly duplicated to create a 15-gene cluster on chromosome 4, thus forming a novel BL6-specific immune locus, in which CTCF may play a regulatory role. Our results demonstrate that thousands of CTCF binding sites show multiple functional signatures rapidly after incorporation into the genome.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Martin Wegner ◽  
Valentina Diehl ◽  
Verena Bittl ◽  
Rahel de Bruyn ◽  
Svenja Wiechmann ◽  
...  

Current technologies used to generate CRISPR/Cas gene perturbation reagents are labor intense and require multiple ligation and cloning steps. Furthermore, increasing gRNA sequence diversity negatively affects gRNA distribution, leading to libraries of heterogeneous quality. Here, we present a rapid and cloning-free mutagenesis technology that can efficiently generate covalently-closed-circular-synthesized (3Cs) CRISPR/Cas gRNA reagents and that uncouples sequence diversity from sequence distribution. We demonstrate the fidelity and performance of 3Cs reagents by tailored targeting of all human deubiquitinating enzymes (DUBs) and identify their essentiality for cell fitness. To explore high-content screening, we aimed to generate the largest up-to-date gRNA library that can be used to interrogate the coding and noncoding human genome and simultaneously to identify genes, predicted promoter flanking regions, transcription factors and CTCF binding sites that are linked to doxorubicin resistance. Our 3Cs technology enables fast and robust generation of bias-free gene perturbation libraries with yet unmatched diversities and should be considered an alternative to established technologies.


Genes ◽  
2018 ◽  
Vol 9 (9) ◽  
pp. 426 ◽  
Author(s):  
Xing Zhao ◽  
Danze Chen ◽  
Yujie Cai ◽  
Fan Zhang ◽  
Jianzhen Xu

Post-transcriptional gene regulation involves several critical regulators such as microRNAs (miRNAs) and RNA-binding proteins (RBPs). Accumulated experimental evidence has shown that miRNAs and RBPs can competitively regulate shared target transcripts. Although this establishes a novel post-transcriptional regulation mechanism, there are currently no computational tools to scan for possible competing miRNA and RBP pairs. Here, we developed a novel computational pipeline, RBPvsMIR, that enables us to statistically evaluate the competing relationship between miRNAs and RBPs. RBPvsMIR first combines previously successful miRNA and RBP motif discovery applications to search for overlapping or adjacent binding sites along a given RNA sequence. A permutation test is then performed to select the miRNA and RBP pairs with significantly enriched binding sites. As an example, we used RBPvsMIR to identify 235 competing RBP-miRNA pairs for the long non-coding RNA (lncRNA) MALAT1. Wet lab experiments verified that splicing factor SRSF2 competes with miR-383, miR-502 and miR-101 to regulate MALAT1 in esophageal squamous carcinoma cells. Our study also revealed a global, mutually exclusive pattern for miRNAs and RBPs regulating human lncRNAs. In addition, we provided a convenient web server (http://bmc.med.stu.edu.cn/RBPvsMIR), which should accelerate the exploration of competing miRNA and RBP pairs regulating shared target transcripts.
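The permutation step mentioned in this abstract can be sketched in a few lines. This is a generic illustration under stated assumptions, not the RBPvsMIR implementation: binding sites are represented as hypothetical integer start positions, "adjacent" is taken to mean within a fixed `window` of nucleotides, and the null model simply redraws RBP positions uniformly along the transcript. The abstract does not specify the statistic or null model the pipeline actually uses.

```python
import random


def count_close_pairs(mirna_sites, rbp_sites, window):
    """Count miRNA/RBP site pairs whose start positions lie within `window` nt."""
    return sum(
        1 for m in mirna_sites for r in rbp_sites if abs(m - r) <= window
    )


def permutation_p_value(mirna_sites, rbp_sites, seq_len,
                        window=10, n_perm=1000, seed=0):
    """Estimate how often random RBP placements co-localize at least as
    strongly as the observed sites (illustrative null model: uniform
    positions along a transcript of length `seq_len`)."""
    rng = random.Random(seed)  # seeded for reproducibility
    observed = count_close_pairs(mirna_sites, rbp_sites, window)
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(seq_len) for _ in rbp_sites]
        if count_close_pairs(mirna_sites, shuffled, window) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up positions for illustration only.
    mirna = [100, 500, 900]
    rbp = [102, 498, 905]
    obs, p = permutation_p_value(mirna, rbp, seq_len=8000)
    print(f"close pairs: {obs}, permutation p-value: {p:.4f}")
```

A small p-value here means the observed sites sit closer together than random placement would usually produce, which is the kind of enrichment the abstract says is used to select candidate competing pairs.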


2020 ◽  
Author(s):  
Mitra Ansariola ◽  
Valerie N. Fraser ◽  
Sergei A. Filichkin ◽  
Maria G. Ivanchenko ◽  
Zachary A. Bright ◽  
...  

Abstract Across tissues, gene expression is regulated by a combination of determinants, including the binding of transcription factors (TFs), along with other aspects of cellular state. Recent studies emphasize the importance of both genetic and epigenetic states – TF binding sites and binding site chromatin accessibility have emerged as potentially causal determinants of tissue specificity. To investigate the relative contributions of these determinants, we constructed three genome-scale datasets for both root and shoot tissues of the same Arabidopsis thaliana plants: TSS-seq data to identify Transcription Start Sites, OC-seq data to identify regions of Open Chromatin, and RNA-seq data to assess gene expression levels. For genes that are differentially expressed between root and shoot, we constructed a machine learning model predicting tissue of expression from chromatin accessibility and TF binding information upstream of TSS locations. The resulting model was highly accurate (over 90% auROC and auPRC), and our analysis of model contributions (feature weights) strongly suggests that patterns of TF binding sites within ∼500 nt TSS-proximal regions are predominant explainers of tissue of expression in most cases. Thus, in plants, cis-regulatory control of tissue-specific gene expression appears to be primarily determined by TSS-proximal sequences, and rarely by distal enhancer-like accessible chromatin regions. This study highlights the exciting future possibility of a native TF site-based design process for the tissue-specific targeting of plant gene promoters.


2019 ◽  
Vol 12 (1) ◽  
Author(s):  
Divyesh Patel ◽  
Manthan Patel ◽  
Subhamoy Datta ◽  
Umashankar Singh

Abstract
Background: CGGBP1 is a repeat-binding protein with diverse functions in the regulation of gene expression, cytosine methylation, repeat silencing and genomic integrity. CGGBP1 has also been identified as a cooperator of histone-modifying enzymes and as a component of CTCF-containing complexes that regulate the enhancer–promoter looping. CGGBP1–CTCF cross talk in chromatin regulation has been hitherto unknown.
Results: Here, we report that the occupancy of CTCF at repeats depends on CGGBP1. Using ChIP-sequencing for CTCF, we describe its occupancy at repetitive DNA. Our results show that the endogenous level of CGGBP1 ensures CTCF occupancy preferentially on repeats over canonical CTCF motifs. By combining CTCF ChIP-sequencing results with ChIP-sequencing for three different kinds of histone modifications (H3K4me3, H3K9me3 and H3K27me3), we show that the CGGBP1-dependent repeat-rich CTCF-binding sites regulate histone marks in flanking regions.
Conclusion: CGGBP1 affects the pattern of CTCF occupancy. Our results posit CGGBP1 as a regulator of CTCF and its binding sites in interspersed repeats.


Author(s):  
Eric A J Simko ◽  
Honghe Liu ◽  
Tao Zhang ◽  
Adan Velasquez ◽  
Shraddha Teli ◽  
...  

Abstract The long non-coding RNA NEAT1 serves as a scaffold for the assembly of paraspeckles, membraneless nuclear organelles involved in gene regulation. Paraspeckle assembly requires NEAT1 recruitment of the RNA-binding protein NONO, however the NEAT1 elements responsible for recruitment are unknown. Herein we present evidence that previously unrecognized structural features of NEAT1 serve an important role in these interactions. Led by the initial observation that NONO preferentially binds the G-quadruplex conformation of G-rich C9orf72 repeat RNA, we find that G-quadruplex motifs are abundant and conserved features of NEAT1. Furthermore, we determine that NONO binds NEAT1 G-quadruplexes with structural specificity and provide evidence that G-quadruplex motifs mediate NONO-NEAT1 association, with NONO binding sites on NEAT1 corresponding largely to G-quadruplex motifs, and treatment with a G-quadruplex-disrupting small molecule causing dissociation of native NONO-NEAT1 complexes. Together, these findings position G-quadruplexes as a primary candidate for the NONO-recruiting elements of NEAT1 and provide a framework for further investigation into the role of G-quadruplexes in paraspeckle formation and function.


2021 ◽  
Author(s):  
Laura E M Dunn ◽  
Fang Lu ◽  
Paul M. Lieberman ◽  
Joel D Baines

The ability of Epstein-Barr Virus (EBV) to switch between latent and lytic infection is key to its long-term persistence, yet the molecular mechanisms behind this switch remain unclear. To investigate transcriptional events during the latent to lytic switch we utilized Precision nuclear Run On followed by deep Sequencing (PRO-Seq) to map cellular RNA polymerase (Pol) activity to single-nucleotide resolution on the host and EBV genome in latently infected Mutu-I cells at 1, 4 and 12 h post-reactivation. During latency, Pol activity was primarily limited to the EBNA1 transcript initiating at the Qp promoter, the EBER and RPMS1/BART regions and the BHLF1 transcript. Unexpectedly at early time-points post-reactivation, the EBV transcripts with the largest increase in Pol activity were LMP-2A, EBER-1 and RPMS1. Closer analysis of the PRO-Seq data at these regions revealed a distinct pattern of high Pol activity with bidirectional transcription and strong peaks indicative of Pol pausing. Alignment to ChIP-Seq data revealed a strong correlation with CTCF binding sites on the EBV genome. In addition, alignment to ATAC-Seq data indicated that many of these transcription regulatory regions were sites of accessible chromatin. Similar features were observed in Akata cells activated from latency with anti-IgG. Overall, these data suggest that during reactivation, EBV recruits RNA polymerase to CTCF binding sites where it transcribes short distances and pauses. These activities likely help open chromatin on the viral genome to initiate productive replication.


2020 ◽  
Author(s):  
Anushua Biswas ◽  
Leelavati Narlikar

Abstract High-throughput sequencing-based assays measure different biochemical activities pertaining to gene regulation, genome-wide. These activities include protein-DNA binding, enhancer-activity, open chromatin, and more. A major goal is to understand underlying sequence components, or motifs, that can explain the measured activity. It is usually not one motif, but a combination of motifs bound by cooperatively acting proteins that confers activity to such regions. Furthermore, although having a single type of activity, the regions can still be diverse, governed by different combinations of proteins/motifs. Current approaches do not take into account this issue of combinatorial diversity. We present a new statistical framework cisDiversity, which models regions as diverse modules characterized by combinations of motifs, while simultaneously learning the motifs themselves. We show that ChIP-seq data for the CTCF protein in fly contains diverse sequence structures, with most direct CTCF-binding sites situated far from promoters, giving insights into its co-factors and potential role in looping. Human CTCF-bound regions, on the other hand, have a different architecture. Because cisDiversity does not rely on knowledge of motifs, modules, cell-type, or organism, it is general enough to be applied to regions reported by most high-throughput assays. Indeed, enhancer predictions resulting from different assays (GRO-cap, STARR-seq, and those measuring chromatin structure) show distinct modules and combinations of TF binding sites, some specific to the assay. No module occurs universally in all enhancer-assays. Finally, analysis of accessible chromatin suggests that regions open in one cell-state encode information about future states, with certain modules staying open and others closing down later. The code is freely available at https://github.com/NarlikarLab/cisDIVERSITY.

