ReCappable-seq: Comprehensive Determination of Transcription Start Sites derived from all RNA polymerases

2019
Author(s):
Bo Yan
George Tzertzinis
Ira Schildkraut
Laurence Ettwiller

Methodologies for determining eukaryotic Transcription Start Sites (TSS) rely on the selection of the 5’ canonical cap structure of Pol-II transcripts and consequently ignore entire classes of TSS derived from other RNA polymerases, which play critical roles in various cell functions. To overcome this limitation, we developed ReCappable-seq and identified TSS from Pol-II and non-Pol-II transcripts at nucleotide resolution. Applied to the human transcriptome, ReCappable-seq identifies Pol-II TSS with higher specificity than CAGE and reveals a rich landscape of TSS associated notably with Pol-III transcripts, which have not previously been possible to study on a genome-wide scale. Novel TSS consistent with non-Pol-II transcripts can be found in the nuclear and mitochondrial genomes. By identifying TSS derived from all RNA polymerases, ReCappable-seq reveals distinct epigenetic marks among Pol-II and non-Pol-II TSS and provides a unique opportunity to concurrently interrogate the regulatory landscape of coding and non-coding RNA.

2021
pp. gr.275784.121
Author(s):
Bo Yan
George Tzertzinis
Ira Schildkraut
Laurence Ettwiller

Determination of eukaryotic Transcription Start Sites (TSS) has been based on methods that require the cap structure at the 5-prime end of transcripts derived from Pol-II RNA polymerase. Consequently, these methods do not reveal TSS derived from the other RNA polymerases, which also play critical roles in various cell functions. To address this limitation, we developed ReCappable-seq, which comprehensively identifies TSS for both Pol-II and non-Pol-II transcripts at single-nucleotide resolution. The method relies on specific enzymatic exchange of 5-prime m7G caps and 5-prime triphosphates with a selectable tag. When applied to human transcriptomes, ReCappable-seq identifies Pol-II TSS that are in agreement with orthogonal methods such as CAGE. Additionally, ReCappable-seq reveals a rich landscape of TSS associated with Pol-III transcripts, which have not previously been amenable to study at genome-wide scale. Novel TSS from non-Pol-II transcription can be located in the nuclear and mitochondrial genomes. ReCappable-seq interrogates the regulatory landscape of coding and noncoding RNA concurrently and enables the classification of epigenetic profiles associated with Pol-II and non-Pol-II TSS.


2019
Author(s):
Stepan Pachganov
Khalimat Murtazalieva
Alexei Zarubin
Dmitry Sokolov
Duane Chartier
...  

As interest in genetic resequencing increases, so does the need for effective mathematical, computational, and statistical approaches. One of the difficult problems in genome annotation is determining the precise positions of transcription start sites. In this paper we present TransPrise, an efficient deep learning tool for predicting the positions of eukaryotic transcription start sites. TransPrise offers significant improvement over existing promoter-prediction methods. To illustrate this, we compared the predictions of TransPrise with those of the TSSPlant approach for the well-annotated genome of Oryza sativa. On a computer equipped with a graphics processing unit, the run time of TransPrise is 250 minutes for a 374 Mb genome. We provide the full basis for the comparison and encourage users to freely access a set of our computational tools to facilitate and streamline their own analyses. The ready-to-use Docker image with all necessary packages, models, and code, as well as the source code of the TransPrise algorithm, are available at http://compubioverne.group/. The source code is ready to use and customizable to predict TSS in any eukaryotic organism.
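As a rough worked example of the run-time figure quoted above, the following Python sketch converts the reported 250 minutes on a 374 Mb genome into an average throughput. The function names are hypothetical, and the projection helper assumes run time scales linearly with genome length, which the abstract does not state.

```python
# Figures quoted in the abstract above.
GENOME_SIZE_MB = 374   # genome length used in the comparison, in Mb
RUN_TIME_MIN = 250     # reported GPU run time, in minutes


def throughput_mb_per_min(genome_mb: float, minutes: float) -> float:
    """Average number of megabases processed per minute."""
    return genome_mb / minutes


def estimated_minutes(target_genome_mb: float, rate_mb_per_min: float) -> float:
    """Projected run time for another genome.

    ASSUMPTION: run time grows linearly with genome length; this is an
    illustrative simplification, not a claim made by the authors.
    """
    return target_genome_mb / rate_mb_per_min


rate = throughput_mb_per_min(GENOME_SIZE_MB, RUN_TIME_MIN)
print(f"{rate:.3f} Mb/min")  # prints "1.496 Mb/min"
```

Under the same linear-scaling assumption, `estimated_minutes(size_mb, rate)` gives a first-order run-time guess for a genome of a different size; actual run time will depend on hardware and on the model configuration.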


2021
pp. gr.275750.121
Author(s):
Debasish Sarkar
Z. Iris Zhu
Elisabeth R. Knoll
Emily Paul
David Landsman
...  

The Mediator complex is central to transcription by RNA polymerase II (Pol II) in eukaryotes. In budding yeast (Saccharomyces cerevisiae), Mediator is recruited by activators and associates with core promoter regions, where it facilitates pre-initiation complex (PIC) assembly, only transiently prior to Pol II escape. Interruption of the transcription cycle by inactivation or depletion of Kin28 inhibits Pol II escape and stabilizes this association. However, Mediator occupancy and dynamics have not been examined on a genome-wide scale in yeast grown in nonstandard conditions. Here we investigate Mediator occupancy following heat shock or CdCl2 exposure, with and without depletion of Kin28. We find that Pol II occupancy exhibits similar dependence on Mediator under normal and heat shock conditions. However, while Mediator association increases at many genes upon Kin28 depletion under standard growth conditions, little or no increase is observed at most genes upon heat shock, indicating a more stable association of Mediator after heat shock. Mediator remains associated upstream of the core promoter at genes repressed by heat shock or CdCl2 exposure whether or not Kin28 is depleted, suggesting that Mediator is recruited by activators but is unable to engage PIC components at these repressed targets. This persistent association is strongest at promoters that bind the HMGB family member Hmo1, and is reduced but not eliminated in hmo1∆ yeast. Finally, we show a reduced dependence on PIC components for Mediator occupancy at promoters after heat shock, further supporting altered dynamics or stronger engagement with activators under these conditions.


2021
Vol 11
Author(s):
Matthew J. Rybin
Melina Ramic
Natalie R. Ricciardi
Philipp Kapranov
Claes Wahlestedt
...  

Genome instability is associated with myriad human diseases and is a well-known feature of both cancer and neurodegenerative disease. Until recently, the ability to assess DNA damage—the principal driver of genome instability—was limited to relatively imprecise methods or restricted to studying predefined genomic regions. Recently, new techniques for detecting DNA double strand breaks (DSBs) and single strand breaks (SSBs) with next-generation sequencing on a genome-wide scale with single nucleotide resolution have emerged. With these new tools, efforts are underway to define the “breakome” in normal aging and disease. Here, we compare the relative strengths and weaknesses of these technologies and their potential application to studying neurodegenerative diseases.


2020
Vol 36 (11)
pp. 3605-3606
Author(s):
Pumin Li
Qi Xu
Xu Hua
Zhongwei Xie
Jie Li
...  

Summary: The R/Bioconductor package primirTSS is a fast and convenient tool that implements an analytical method to identify transcription start sites of microRNAs by integrating ChIP-seq data for H3K4me3 and Pol II. It further improves precision by using the conservation score and sequence features. The tool performed well when given H3K4me3 or Pol II ChIP-seq data alone as input, which is convenient for applications where multiple datasets are hard to acquire. This flexible package is provided with both R programming interfaces and graphical web interfaces. Availability and implementation: primirTSS is available at http://bioconductor.org/packages/primirTSS. The documentation of the package, including an accompanying tutorial, is deposited at https://bioconductor.org/packages/release/bioc/vignettes/primirTSS/inst/doc/primirTSS.html. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2020
Vol 295 (12)
pp. 3990-4000
Author(s):
Sandeep Singh
Karol Szlachta
Arkadi Manukyan
Heather M. Raimer
Manikarna Dinda
...  

DNA double-stranded breaks (DSBs) are strongly associated with active transcription, and promoter-proximal pausing of RNA polymerase II (Pol II) is a critical step in transcriptional regulation. Mapping the distribution of DSBs along actively expressed genes and identifying the location of DSBs relative to pausing sites can provide mechanistic insights into transcriptional regulation. Using genome-wide DNA break mapping/sequencing techniques at single-nucleotide resolution in human cells, we found that DSBs are preferentially located around transcription start sites of highly transcribed and paused genes and that Pol II promoter-proximal pausing sites are enriched in DSBs. We observed that DSB frequency at pausing sites increases as the strength of pausing increases, regardless of whether the pausing sites are near or far from annotated transcription start sites. Inhibition of topoisomerase I and II by camptothecin and etoposide treatment, respectively, increased DSBs at the pausing sites as the concentrations of drugs increased, demonstrating the involvement of topoisomerases in DSB generation at the pausing sites. DNA breaks generated by topoisomerases are short-lived because of the religation activity of these enzymes, which these drugs inhibit; therefore, the observation of increased DSBs with increasing drug doses at pausing sites indicates active recruitment of topoisomerases to these sites. Furthermore, the enrichment and locations of DSBs at pausing sites were shared among different cell types, suggesting that Pol II promoter-proximal pausing is a common regulatory mechanism. Our findings support a model in which topoisomerases participate in Pol II promoter-proximal pausing and indicate that DSBs at pausing sites contribute to transcriptional activation.


2019
Author(s):
Jin Wang
Bing Liang Alvin Chew
Yong Lai
Hongping Dong
Luang Xu
...  

Chemical modification of transcripts with 5’ caps occurs in all organisms. Here we report a systems-level mass spectrometry-based technique, CapQuant, for quantitative analysis of the cap epitranscriptome in any organism. The method was piloted with 21 canonical caps (m7GpppN, m7GpppNm, GpppN, GpppNm, and m2,2,7GpppG) and 5 “metabolite” caps (NAD, FAD, UDP-Glc, UDP-GlcNAc, and dpCoA). Applying CapQuant to RNA from purified dengue virus, Escherichia coli, yeast, mice, and humans, we discovered four new cap structures in humans and mice (FAD, UDP-Glc, UDP-GlcNAc, and m7Gpppm6A), cell- and tissue-specific variations in cap methylation, and surprisingly high proportions of caps lacking 2’-O-methylation, such as m7Gpppm6A in mammals and m7GpppA in dengue virus. We did not detect cap m1A/m1Am in humans. CapQuant accurately captured the preference for purine nucleotides at eukaryotic transcription start sites and the correlation between metabolite levels and metabolite caps. The mystery around cap m1A/m1Am analysis remains unresolved.


2019
Vol 47 (20)
pp. e130-e130
Author(s):
Jin Wang
Bing Liang Alvin Chew
Yong Lai
Hongping Dong
Luang Xu
...  

Chemical modification of transcripts with 5′ caps occurs in all organisms. Here, we report a systems-level mass spectrometry-based technique, CapQuant, for quantitative analysis of an organism's cap epitranscriptome. The method was piloted with 21 canonical caps (m7GpppN, m7GpppNm, GpppN, GpppNm, and m2,2,7GpppG) and 5 ‘metabolite’ caps (NAD, FAD, UDP-Glc, UDP-GlcNAc, and dpCoA). Applying CapQuant to RNA from purified dengue virus, Escherichia coli, yeast, mouse tissues, and human cells, we discovered new cap structures in humans and mice (FAD, UDP-Glc, UDP-GlcNAc, and m7Gpppm6A), cell- and tissue-specific variations in cap methylation, and high proportions of caps lacking 2′-O-methylation (m7Gpppm6A in mammals, m7GpppA in dengue virus). While substantial Dimroth-induced loss of m1A and m1Am arose with specific RNA processing conditions, human lymphoblast cells showed no detectable m1A or m1Am in caps. CapQuant accurately captured the preference for purine nucleotides at eukaryotic transcription start sites and the correlation between metabolite levels and metabolite caps.

