Expression of human snRNA genes from beginning to end

2008 ◽  
Vol 36 (4) ◽  
pp. 590-594 ◽  
Author(s):  
Sylvain Egloff ◽  
Dawn O'Reilly ◽  
Shona Murphy

In addition to protein-coding genes, mammalian pol II (RNA polymerase II) transcribes independent genes for some non-coding RNAs, including the spliceosomal U1 and U2 snRNAs (small nuclear RNAs). snRNA genes differ from protein-coding genes in several key respects and some of the mechanisms involved in expression are gene-type-specific. For example, snRNA gene promoters contain an essential PSE (proximal sequence element) unique to these genes, the RNA-encoding regions contain no introns, elongation of transcription is P-TEFb (positive transcription elongation factor b)-independent and RNA 3′-end formation is directed by a 3′-box rather than a cleavage and polyadenylation signal. However, the CTD (C-terminal domain) of pol II closely couples transcription with RNA 5′ and 3′ processing in expression of both gene types. Recently, it was shown that snRNA promoter-specific recognition of the 3′-box RNA processing signal requires a novel phosphorylation mark on the pol II CTD. This new mark plays a critical role in the recruitment of the snRNA gene-specific RNA-processing complex, Integrator. These new findings provide the first example of a phosphorylation mark on the CTD heptapeptide that can be read in a gene-type-specific manner, reinforcing the notion of a CTD code. Here, we review the control of expression of snRNA genes from initiation to termination of transcription.

2021 ◽  
Author(s):  
Mirko Brüggemann

Most cellular processes are regulated by RNA-binding proteins (RBPs). These RBPs usually use defined binding sites to recognize and directly interact with their target RNA molecule. Individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) experiments are an important tool to describe such interactions in cell cultures in vivo. This experimental protocol yields millions of individual sequencing reads from which the binding spectrum of the RBP under study can be deduced. In this PhD thesis I studied how RNA processing is driven by RBP binding by analyzing iCLIP-derived sequencing datasets. First, I described a complete data analysis pipeline to detect RBP binding sites from iCLIP sequencing reads. This workflow covers all essential processing steps, from the first quality control to the final annotation of binding sites. I described the accurate integration of biological iCLIP replicates to boost the initial peak calling step while ensuring high specificity through replicate reproducibility analysis. Further, I proposed a routine to level binding site width to streamline downstream analysis processes. This was exemplified in the reanalysis of the binding spectrum of the U2 small nuclear RNA auxiliary factor 2 (U2AF2, U2AF65). I recaptured the known tendency of U2AF65 to bind predominantly to intronic sequences of protein-coding genes, where it likely recognizes the polypyrimidine tract as part of the core spliceosome machinery. In the second part of my thesis, I analyzed the binding spectrum of the serine and arginine rich splicing factor 6 (SRSF6) in the context of diabetes. In pancreatic beta-cells, the expression of SRSF6 is regulated by the transcription factor GLIS3, the product of a diabetes susceptibility gene. It is known that SRSF6 promotes beta-cell death through the splicing dysregulation of genes essential to beta-cell function and survival.
However, the exact mechanism of how these RNAs are targeted by SRSF6 remains poorly understood. Here, I applied the defined iCLIP processing pipeline to describe the binding landscape of the splicing factor SRSF6 in the human pancreatic beta-cell line EndoC-βH1. The initial binding site definition revealed predominant binding to coding sequences (CDS) of protein-coding genes. This was followed up by extensive motif analysis, which revealed a purine-rich binding motif not previously described in human. SRSF6 seemed to specifically recognize repetitions of the triplet GAA. I also showed that the number of contiguous triplets correlated with increasing binding site strength. I further integrated RNA-sequencing data from the same cell type, under SRSF6 knockdown (KD) and basal conditions, to analyze SRSF6-related splicing changes. I showed that the exact positioning of SRSF6 on alternatively spliced exons regulates the produced transcript isoforms. This mechanism seemed to control exons in several known susceptibility genes for diabetes. In summary, in my PhD thesis, I presented a comprehensive workflow for the processing of iCLIP-derived sequencing data. I applied this pipeline to a dataset from pancreatic beta-cells to unveil the impact of SRSF6-mediated splicing changes. Thus, my analysis provides novel insights into the regulation of diabetes susceptibility genes.
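
The reported link between contiguous GAA triplets and binding site strength can be illustrated with a minimal sketch. This is not the thesis pipeline: the function name `longest_gaa_run` and the example sequences are hypothetical, and the code only counts the longest uninterrupted run of GAA triplets in a sequence, a quantity one might then compare against a measured binding strength.

```python
import re

def longest_gaa_run(seq):
    """Return the number of GAA triplets in the longest contiguous GAA run.

    Hypothetical helper for illustration; works on DNA or RNA strings
    because the triplet GAA contains neither T nor U.
    """
    runs = re.findall(r"(?:GAA)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)

# Hypothetical binding site sequences, not taken from the thesis data
for site in ["CCGAAGAAGAAUU", "AGAAGA", "CCCUUU"]:
    print(site, longest_gaa_run(site))
```

Sorting binding sites by this count would be one simple way to inspect the trend described in the abstract, under the assumption that a per-site strength value is available.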


2017 ◽  
Author(s):  
Anastasia McKinlay ◽  
Ram Podicheti ◽  
Jered M. Wendte ◽  
Ross Cocklin ◽  
Douglas B. Rusch

Abstract Nuclear multisubunit RNA polymerases IV and V (Pol IV and Pol V) evolved in plants as specialized forms of Pol II. Their functions are best understood in the context of RNA-directed DNA methylation (RdDM), a process in which Pol IV-dependent 24 nt siRNAs direct the de novo cytosine methylation of regions transcribed by Pol V. Pol V has additional functions, independent of Pol IV and 24 nt siRNA biogenesis, in maintaining the repression of transposons and genomic repeats whose silencing depends on maintenance cytosine methylation. Here we report that Pol IV and Pol V play unexpected roles in defining the 3′ boundaries of Pol II transcription units. Nuclear run-on assays reveal that in the absence of Pol IV or Pol V, Pol II occupancy downstream of poly A sites increases for approximately 12% of protein-coding genes. This effect is most pronounced for convergently transcribed gene pairs. Although Pols IV and V are detected near transcript ends of the affected Pol II-transcribed genes, their role in limiting Pol II read-through is independent of siRNA biogenesis or cytosine methylation. We speculate that Pols IV and V (and/or their associated factors) play roles in Pol II transcription termination by influencing polymerase bypass or release at collision sites for convergent genes.


Blood ◽  
2016 ◽  
Vol 128 (22) ◽  
pp. 2705-2705 ◽  
Author(s):  
Lara Rizzotto ◽  
Arianna Bottoni ◽  
Tzung-Huei Lai ◽  
Chaomei Liu ◽  
Pearlly S Yan ◽  
...  

Abstract Chronic lymphocytic leukemia (CLL) follows a variable clinical course mostly dependent upon genomic factors, with a subset of patients having low risk disease and others displaying rapid progression associated with clonal evolution. Epigenetic mechanisms such as DNA promoter hypermethylation were shown to have a role in CLL evolution where the acquisition of increasingly heterogeneous DNA methylation patterns occurred in conjunction with clonal evolution of genetic aberrations and was associated with disease progression. However, the role of epigenetic mechanisms regulated by the histone deacetylase group of transcriptional repressors in the progression of CLL has not been well characterized. The histone deacetylases (HDACs) 1 and 2 are recruited onto gene promoters and form a complex with the histone demethylase KDM1. Once recruited, the complex mediates the removal of acetyl groups from specific lysines on histones (H3K9 and H3K14), thus triggering the demethylation of lysine 4 (H3K4me3) and the silencing of gene expression. CLL is characterized by the dysregulation of numerous coding and non coding genes, many of which have key roles in regulating the survival or progression of CLL. For instance, our group showed that the levels of HDAC1 were elevated in high risk as compared to low risk CLL or normal lymphocytes and this over-expression was responsible for the silencing of miR-106b, miR-15, miR-16, and miR-29b, which affected CLL survival by modulating the expression of key anti-apoptotic proteins Bcl-2 and Mcl-1. To characterize the HDAC-repressed gene signature in high risk CLL, we conducted chromatin immunoprecipitation (ChIP) of the nuclear lysates from 3 high risk and 3 low risk CLL patients using antibodies against HDAC1, HDAC2 and KDM1 or non-specific IgG, sequenced and aligned the eluted DNA to a reference genome and determined the binding of HDAC1, HDAC2 and KDM1 at the promoters for all protein coding and microRNA genes.
Preliminary results from this ChIP-seq showed a strong recruitment of HDAC1, HDAC2 and KDM1 to the promoters of several microRNA as well as protein coding genes in high risk CLL. To further corroborate these data we performed ChIP-Seq in the same 6 CLL samples to analyze the levels of H3K4me2 and H3K4me3 around gene promoters before and after 6h exposure to the HDACi panobinostat. Our goal was to demonstrate that HDAC inhibition elicited an increase in the levels of acetylation on histones and triggered the accrual of H3K4me2 at the repressed promoter, events likely to facilitate the recruitment of RNA polymerase II to this promoter. Initial analysis confirmed a robust accumulation of H3K4me2 and H3K4me3 marks at the gene promoters of representative genes that recruited HDAC1 and its co-repressors in the previous ChIP-Seq analysis in high risk CLL patients. Finally, 5 aggressive CLL samples were treated with the HDACi abexinostat for 48h and RNA before and after treatment was subjected to RNA-seq for small and large RNA to confirm that the regions of chromatin uncoiled by HDACi treatment were actively transcribed. HDAC inhibition induced the expression of a large number of miRNA genes as well as key protein coding genes, such as miR-29b, miR-210, miR-182, miR-183, miR-95, miR-940, FOXO3, EBF1 and BCL2L11. Of note, some of the predicted or validated targets of the induced miRNAs were key facilitators in the progression of CLL, such as BTK, SYK, MCL-1, BCL-2, TCL1, and ROR1. Moreover, RNA-seq showed that the expression of these protein coding genes was reduced 2- to 33-fold upon HDAC inhibition. We plan to extend the RNA-seq to 5 CLL samples with indolent disease and combine all the data to identify a common signature of protein coding and miRNA genes that recruited the HDAC1 complex, accumulated activating histone modifications upon treatment with HDACi and altered gene and miRNA expression after HDAC inhibition in high risk CLL versus low risk CLL.
The signature will then be validated on a large cohort of indolent and aggressive CLL patients. Our final goal is to define a signature of coding and non coding genes silenced by HDACs in high risk CLL and its role in facilitating disease progression. Disclosures: Woyach: Acerta: Research Funding; Karyopharm: Research Funding; Morphosys: Research Funding.


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 3298-3298 ◽  
Author(s):  
Eric R. Londin ◽  
Eleftheria Hatzimichael ◽  
Phillipe Loher ◽  
Yue Zhao ◽  
Yi Jing ◽  
...  

Abstract 3298 The anucleate platelets play a critical role in the formation of thrombi and prevention of bleeding. While the repertoire of platelet transcripts is a reflection of the megakaryocyte at the time of platelet differentiation, post-transcriptional events are known to occur. Furthermore, a strong correlation between the expressed mRNAs and proteome has been identified. Having a complete understanding of the platelet transcriptome is important for generating insights into the genetic basis of platelet disease traits. To capture the complexity of the platelet transcriptome, we performed RNA sequencing (RNA-seq) in leukocyte-depleted platelets from 10 males, with median age of 24.5 yrs and unremarkable medical history. Their short and long RNA platelet transcriptomes were analyzed on the SOLiD 5500xl sequencing platform. We generated ∼3.5 billion sequence reads, ∼40% of which could be mapped uniquely to the human genome. Our analysis revealed that ∼9,000 distinct protein-coding mRNAs and ∼800 microRNAs (miRNAs) were present in the transcriptome of each of the 10 sequenced individuals. Comparison of the levels of mRNA expression across the 10 individuals showed an exceptional level of consistency with pair-wise Pearson correlation values ≥0.98. The miRNA expression profiles across the 10 individuals showed a similar consistency with pair-wise Pearson correlation values ≥0.98. Surprisingly, we found that these mRNAs and miRNAs accounted for a little over 1/2 of all of the uniquely mapped sequence reads, suggesting the abundant presence of additional non-protein coding RNA (ncRNA) transcripts. Using the annotated entries of the latest release of the ENSEMBL database, we investigated the genetic make-up of these other transcripts. We found that ∼25% of each individual's uniquely mapped reads corresponded to non-protein coding transcripts from mRNA-coding loci. These reads accounted for more than 10,000 distinct such transcripts.
In addition, each of the individuals in our cohort expressed an average of ∼1,500 pseudogenes and ∼200 long intergenic non-coding RNAs (lincRNAs). The short RNA profiles of the ten individuals revealed an abundance of diverse categories of ncRNAs including the signal recognition particle RNA (srpRNA), small nuclear RNA (snRNA) and small cytoplasmic RNAs (scRNA). These ncRNAs are involved in the processing of pre-mRNAs and their presence and prevalence in the anucleate platelet suggests the existence of a complex network of mRNA processing that persists after the megakaryocyte fragmentation. We also investigated the RNA-omes of the ten individuals for evidence of transcription of the pyknon category of ncRNAs. Pyknons are of particular interest because each has numerous intergenic and intronic copies whereas nearly all known human protein-coding genes contain one or more pyknons in their mRNA. Recent experimental work has shown that intergenic instances of the pyknons are transcribed in a tissue- and cell-state specific manner. An average of ∼100,000 pyknons are transcribed in each of the 10 sequenced individuals, suggesting the possibility of a far-reaching network of interactions that link exonic space to distant non-exonic regions and are active in platelets. Lastly, we found that a large variety of distinct repeat element categories are expressed in the RNA-omes (both short and long) of these individuals. Among the most abundantly represented categories of repeat elements were DNA transposons, long terminal repeat (LTR) retrotransposons, and non-LTR retrotransposons such as long interspersed elements (LINEs) and short interspersed elements (SINEs). In summary, our RNA-seq analyses have revealed a spectrum of platelet transcripts that transcends protein-coding genes and miRNAs.
Indeed, the transcripts that have their source in genomic features not previously discussed or analyzed in the platelet context represent a very significant portion of all platelet transcripts. This in turn suggests an unanticipated richness, and presumably commensurate complexity, for the platelet transcriptome. While the role of these novel non-protein coding RNAs is currently unknown, it is expected that at least some of them may be of functional significance, which will in turn permit a better understanding of the molecular mechanisms that regulate platelet physiology and may contribute to processes beyond thrombosis and hemostasis. Disclosures: No relevant conflicts of interest to declare.
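
The pair-wise Pearson comparison of expression profiles reported in this abstract can be sketched as follows. This is an illustrative outline only, not the authors' analysis code: the function names `pearson` and `pairwise_pearson`, the sample labels, and the expression values are all hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def pairwise_pearson(profiles):
    """Map each pair of sample names to the correlation of their profiles."""
    return {
        (i, j): pearson(profiles[i], profiles[j])
        for i, j in combinations(sorted(profiles), 2)
    }

# Hypothetical expression values for three samples over four transcripts
profiles = {
    "donor_1": [10.0, 200.0, 35.0, 0.5],
    "donor_2": [12.0, 190.0, 30.0, 0.7],
    "donor_3": [9.0, 210.0, 40.0, 0.4],
}
for pair, r in pairwise_pearson(profiles).items():
    print(pair, round(r, 3))
```

With 10 individuals this yields 45 pairs, and a consistency statement of the kind made in the abstract corresponds to the minimum value over those pairs.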


1993 ◽  
Vol 13 (10) ◽  
pp. 6403-6415 ◽  
Author(s):  
S Connelly ◽  
W Filipowicz

Formation of the 3' ends of RNA polymerase II (Pol II)-specific U small nuclear RNAs (U snRNAs) in vertebrate cells is dependent upon transcription initiation from the U snRNA gene promoter. Moreover, U snRNA promoters are unable to direct the synthesis of functional polyadenylated mRNAs. In this work, we have investigated whether U snRNA 3'-end formation and transcription initiation are also coupled in plants. We have first characterized the requirements for 3'-end formation of an Arabidopsis U2 snRNA expressed in transfected protoplasts of Nicotiana plumbaginifolia. We found that the 3'-end-adjacent sequence CA(N)3-10AGTNNAA, conserved in plant Pol II-specific U snRNA genes, is essential for the 3'-end formation of U2 transcripts and, similar to the vertebrate 3' box, is highly tolerant to mutation. The 3'-flanking regions of an Arabidopsis U5 and a maize U2 snRNA gene can effectively substitute for the Arabidopsis U2 3'-end formation signal, indicating that these signals are functionally equivalent among different Pol II-transcribed snRNA genes. The plant U snRNA 3'-end formation signal can be recognized irrespective of whether transcription initiation occurs at U snRNA or mRNA gene promoters, although efficiency of 3' box utilization is higher when transcription initiation occurs at the U snRNA promoter. Moreover, transcripts initiated from the U2 gene promoter can be spliced and polyadenylated. Transcription from a Pol III-specific plant U snRNA gene promoter is not compatible with polyadenylation. Finally, we reveal that initiation at a Pol II-specific plant U snRNA gene promoter can occur in the absence of the snRNA coding region and a functional snRNA 3'-end formation signal, demonstrating that these sequences play no role in determining the RNA polymerase specificity of plant U snRNA genes.
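
The consensus CA(N)3-10AGTNNAA quoted in this abstract can be expressed as a regular expression, as in the minimal sketch below. The pattern name `PLANT_3_BOX`, the function `find_3_box`, and the test sequence are hypothetical. Because the abstract notes that the signal is highly tolerant to mutation, a strict consensus match like this one would likely miss functional variants and is meant only to show how the notation reads.

```python
import re

# N is any base; (N)3-10 is a spacer of 3 to 10 bases between CA and AGT.
PLANT_3_BOX = re.compile(r"CA[ACGT]{3,10}AGT[ACGT]{2}AA")

def find_3_box(seq):
    """Return (start, matched_text) for each non-overlapping consensus match.

    Scans the given strand only; positions are 0-based.
    """
    return [(m.start(), m.group()) for m in PLANT_3_BOX.finditer(seq.upper())]

# Hypothetical DNA sequence containing one consensus match
print(find_3_box("GGGCATTTTAGTCCAAGGG"))
```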


Open Biology ◽  
2017 ◽  
Vol 7 (6) ◽  
pp. 170073 ◽  
Author(s):  
Joana Guiro ◽  
Shona Murphy

In addition to protein-coding genes, RNA polymerase II (pol II) transcribes numerous genes for non-coding RNAs, including the small-nuclear (sn)RNA genes. snRNAs are an important class of non-coding RNAs, several of which are involved in pre-mRNA splicing. The molecular mechanisms underlying expression of human pol II-transcribed snRNA genes are less well characterized than for protein-coding genes and there are important differences in expression of these two gene types. Here, we review the DNA features and proteins required for efficient transcription of snRNA genes and co-transcriptional 3′ end formation of the transcripts.


2004 ◽  
Vol 24 (21) ◽  
pp. 9610-9618 ◽  
Author(s):  
Jia-peng Ruan ◽  
George K. Arhin ◽  
Elisabetta Ullu ◽  
Christian Tschudi

ABSTRACT Transcriptional mechanisms remain poorly understood in trypanosomatid protozoa. In particular, there is no knowledge about the function of basal transcription factors, and there is an apparent rarity of promoters for protein-coding genes transcribed by RNA polymerase (Pol) II. Here we describe a Trypanosoma brucei factor related to the TATA-binding protein (TBP). Although this TBP-related factor (TBP-related factor 4 [TRF4]) has about 31% identity to the TBP core domain, several key residues involved in TATA box binding are not conserved. Depletion of the T. brucei TRF4 (TbTRF4) by RNA interference revealed an essential role in RNA Pol I, II, and III transcription. Using chromatin immunoprecipitation, we further showed that TRF4 is recruited to the Pol I-transcribed procyclic acidic repetitive genes, Pol II-transcribed spliced leader RNA genes, and Pol III-transcribed U-snRNA and 7SL RNA genes, thus supporting a role for TbTRF4 in transcription performed by all three nuclear RNA polymerases. Finally, a search for TRF4 binding sites in the T. brucei genome led to the identification of such sites in the 3′ portion of certain protein-coding genes, indicating a unique aspect of Pol II transcription in these organisms.


1994 ◽  
Vol 14 (9) ◽  
pp. 5910-5919
Author(s):  
S Connelly ◽  
C Marshallsay ◽  
D Leader ◽  
J W Brown ◽  
W Filipowicz

RNA polymerase (Pol) II- and RNA Pol III-transcribed small nuclear RNA (snRNA) genes of dicotyledonous plants contain two essential upstream promoter elements, the USE and TATA. The USE is a highly conserved plant snRNA gene-specific element, and its distance from the -30 TATA box, corresponding to approximately three and four helical DNA turns in Pol III and Pol II genes, respectively, is crucial for determining RNA Pol specificity of transcription. Sequences upstream of the USE play no role in snRNA gene transcription in dicot plants. Here we show that for expression of snRNA genes in maize, a monocotyledonous plant, the USE and TATA elements are essential, but not sufficient, for transcription. Efficient expression of both Pol II- and Pol III-specific snRNA genes in transfected maize protoplasts requires an additional element(s) positioned upstream of the USE. This element, named MSP (for monocot-specific promoter; consensus, RGCCCR), is present in one to three copies in monocot snRNA genes and is interchangeable between Pol II- and Pol III-specific genes. The efficiency of snRNA gene expression in maize protoplast is determined primarily by the strength of the MSP element(s); this contrasts with the situation in protoplasts of a dicot plant, Nicotiana plumbaginifolia, where promoter strength is a function of the quality of the USE element. Interestingly, the organization of monocot Pol III-specific snRNA gene promoters closely resembles those of equivalent vertebrate promoters. The data are discussed in the context of the coevolution of Pol II- and Pol III-specific snRNA gene promoters within many eukaryotic organisms.
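
The MSP consensus RGCCCR uses the IUPAC code R for a purine (A or G). As a minimal sketch under that reading, the code below lists every position where the consensus occurs on one strand of a sequence. The names `MSP_CONSENSUS` and `find_msp_sites` and the example sequence are hypothetical and are not drawn from the paper.

```python
import re

# R = A or G (IUPAC purine). A lookahead is used so that overlapping
# occurrences (sharing a terminal purine) are also reported.
MSP_CONSENSUS = re.compile(r"(?=([AG]GCCC[AG]))")

def find_msp_sites(seq):
    """Return (start, hexamer) for each RGCCCR occurrence on the given strand."""
    return [(m.start(), m.group(1)) for m in MSP_CONSENSUS.finditer(seq.upper())]

# Hypothetical upstream region with two copies of the consensus
print(find_msp_sites("TTAGCCCGTTTGGCCCATT"))
```

Counting the returned sites gives the copy number on one strand; checking the reverse complement as well would be a natural extension if orientation is not assumed.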


2008 ◽  
Vol 36 (3) ◽  
pp. 537-539 ◽  
Author(s):  
Sylvain Egloff ◽  
Shona Murphy

Pol II (RNA polymerase II) transcribes the genes encoding proteins and non-coding snRNAs (small nuclear RNAs). The largest subunit of Pol II contains a distinctive CTD (C-terminal domain) comprising a repetitive heptad amino acid sequence, Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. This domain is now known to play a major role in the processes of transcription and co-transcriptional RNA processing in expression of both snRNA and protein-coding genes. The heptapeptide repeat unit can be extensively modified in vivo and covalent modifications of the CTD during the transcription cycle result in the ordered recruitment of RNA-processing factors. The most studied modifications are the phosphorylation of the serine residues in positions 2 and 5 (Ser2 and Ser5), which play an important role in the co-transcriptional processing of both mRNA and snRNA. An additional, recently identified CTD modification, phosphorylation of the serine residue in position 7 (Ser7) of the heptapeptide, is, however, specifically required for expression of snRNA genes. These findings provide interesting insights into the control of gene-specific Pol II function.


Cancers ◽  
2018 ◽  
Vol 10 (8) ◽  
pp. 256 ◽  
Author(s):  
Xinling Hu ◽  
Liu Yang ◽  
Yin-Yuan Mo

Functional genomics has provided evidence that the human genome transcribes a large number of non-coding genes in addition to protein-coding genes, including microRNAs and long non-coding RNAs (lncRNAs). Among the lncRNAs are pseudogenes, which have received little attention in the past compared to other members of this group. However, increasing evidence points to the important role of pseudogenes in diverse cellular functions, and dysregulation of pseudogenes is often associated with various human diseases including cancer. Like other types of lncRNAs, pseudogenes can also function as master regulators of gene expression and thus can play a critical role in various aspects of tumorigenesis. In this review we discuss the latest developments in pseudogene research, focusing on how pseudogenes impact tumorigenesis through different gene regulation mechanisms. Given the high sequence homology with the corresponding parent genes, we also discuss challenges for pseudogene research.

