A machine learning-based framework for modeling transcription elongation

2021, Vol 118 (6), pp. e2007450118
Author(s): Peiyuan Feng, An Xiao, Meng Fang, Fangping Wan, Shuya Li, ...

RNA polymerase II (Pol II) generally pauses at certain positions along gene bodies, thereby interrupting the transcription elongation process, which is often coupled with various important biological functions, such as precursor mRNA splicing and gene expression regulation. Characterizing the transcriptional elongation dynamics can thus help us understand many essential biological processes in eukaryotic cells. However, experimentally measuring Pol II elongation rates is generally time- and resource-consuming. We developed PEPMAN (polymerase II elongation pausing modeling through attention-based deep neural network), a deep learning-based model that accurately predicts Pol II pausing sites based on native elongating transcript sequencing (NET-seq) data. By taking full advantage of the attention mechanism, PEPMAN is able to decipher important sequence features underlying Pol II pausing. More importantly, we demonstrated that analyses of the PEPMAN-predicted results around various types of alternative splicing sites can provide useful clues for understanding cotranscriptional splicing events. In addition, associating the PEPMAN prediction results with different epigenetic features can help reveal important factors related to the transcription elongation process. All these results demonstrate that PEPMAN can provide a useful and effective tool for modeling transcription elongation and understanding the related biological factors from available high-throughput sequencing data.

2020, Vol 22 (Supplement_3), pp. iii287-iii287
Author(s): Hiroaki Katagi, Nozomu Takata, Yuki Aoi, Yongzhan Zhang, Emily J Rendleman, ...

Abstract Diffuse intrinsic pontine glioma (DIPG) is a highly aggressive brain stem tumor for which novel therapeutic agents are needed. The super elongation complex (SEC) is essential for transcription elongation through release of RNA polymerase II (Pol II). We found that AFF4, a scaffold protein of the SEC, is required for the growth of H3K27M-mutant DIPG cells. In addition, the small-molecule SEC inhibitor KL-1 increased promoter-proximal pausing of Pol II and reduced transcription elongation, resulting in down-regulation of cell cycle, transcription, and DNA repair genes. KL-1 treatment decreased cell growth and increased apoptosis in H3K27M-mutant DIPG cells, and prolonged animal survival in our human H3K27M-mutant DIPG xenograft model. Our results demonstrate that SEC disruption by KL-1 is a novel therapeutic strategy for H3K27M-mutant DIPG.


2005, Vol 4 (8), pp. 1446-1454
Author(s): Stephanie A. Morris, Yoichiro Shibata, Ken-ichi Noma, Yuko Tsukamoto, Erin Warren, ...

ABSTRACT Set2 methylation of histone H3 at lysine 36 (K36) has recently been shown to be associated with RNA polymerase II (Pol II) elongation in Saccharomyces cerevisiae. However, whether this modification is conserved and associated with transcription elongation in other organisms is not known. Here we report the identification and characterization of the Set2 ortholog responsible for K36 methylation in the fission yeast Schizosaccharomyces pombe. We find that, similar to the budding yeast enzyme, S. pombe Set2 is also a robust nucleosome-selective H3 methyltransferase that is specific for K36. Deletion of the S. pombe set2+ gene results in complete abolishment of K36 methylation as well as a slow-growth phenotype on plates containing synthetic medium. These results indicate that Set2 is the sole enzyme responsible for this modification in fission yeast and is important for cell growth under stressed conditions. Using the chromatin immunoprecipitation assay, we demonstrate that K36 methylation in S. pombe is associated with the transcribed regions of Pol II-regulated genes and is absent from regions that are not transcribed by Pol II. Consistent with a role for Set2 in transcription elongation, we find that S. pombe Set2 associates with the hyperphosphorylated form of Pol II and can fully rescue K36 methylation and Pol II interaction in budding yeast cells deleted for Set2. These results, along with our finding that K36 methylation is highly conserved among eukaryotes, imply a conserved role for this modification in the transcription elongation process.


2017, Vol 37 (13)
Author(s): Rwik Sen, Amala Kaja, Jannatul Ferdoush, Shweta Lahudkar, Priyanka Barman, ...

ABSTRACT We have recently demonstrated that an mRNA capping enzyme, Cet1, impairs promoter-proximal accumulation/pausing of RNA polymerase II (Pol II) independently of its capping activity in Saccharomyces cerevisiae to control transcription. However, it is still unknown how Pol II pausing is regulated by Cet1. Here, we show that Cet1's N-terminal domain (NTD) promotes the recruitment of FACT (facilitates chromatin transcription), which enhances the engagement of Pol II in transcriptional elongation, to the coding sequence of an active gene, ADH1, independently of mRNA-capping activity. Absence of Cet1's NTD decreases FACT targeting to ADH1 and consequently reduces the engagement of Pol II in transcriptional elongation, leading to promoter-proximal accumulation of Pol II. Similar results were also observed at other genes. Consistently, Cet1 interacts with FACT. Collectively, our results support the notion that Cet1's NTD promotes FACT targeting to the active gene independently of mRNA-capping activity in facilitating Pol II's engagement in transcriptional elongation, thus deciphering a novel regulatory pathway of gene expression.


2020, Vol 117 (41), pp. 25486-25493
Author(s): Jun Xu, Wei Wang, Liang Xu, Jia-Yu Chen, Jenny Chong, ...

While loss-of-function mutations in Cockayne syndrome group B protein (CSB) cause neurological diseases, this unique member of the SWI2/SNF2 family of chromatin remodelers has been broadly implicated in transcription elongation and transcription-coupled DNA damage repair, yet its mechanism remains largely elusive. Here, we use a reconstituted in vitro transcription system with purified polymerase II (Pol II) and Rad26, a yeast ortholog of CSB, to study the role of CSB in transcription elongation through nucleosome barriers. We show that CSB forms a stable complex with Pol II and acts as an ATP-dependent processivity factor that helps Pol II across a nucleosome barrier. This noncanonical mechanism is distinct from the canonical modes of chromatin remodelers that directly engage and remodel nucleosomes or transcription elongation factors that facilitate Pol II nucleosome bypass without hydrolyzing ATP. We propose a model where CSB facilitates gene expression by helping Pol II bypass chromatin obstacles while maintaining their structures.


2021, Vol 99 (2)
Author(s): Yuhua Fu, Pengyu Fan, Lu Wang, Ziqiang Shu, Shilin Zhu, ...

Abstract Despite the broad variety of available microRNA (miRNA) research tools and methods, their application to the identification, annotation, and target prediction of miRNAs in nonmodel organisms is still limited. In this study, we collected nearly all public sRNA-seq data to improve the annotation for known miRNAs and identify novel miRNAs that have not been annotated in pigs (Sus scrofa). We newly annotated 210 mature sequences in known miRNAs and found that 43 of the known miRNA precursors were problematic due to redundant/missing annotations or incorrect sequences. We also predicted 811 novel miRNAs with high confidence, which was twice the current number of known miRNAs for pigs in miRBase. In addition, we proposed a correlation-based strategy to predict target genes for miRNAs by using a large amount of sRNA-seq and RNA-seq data. We found that the correlation-based strategy provided additional evidence of expression compared with traditional target prediction methods. The correlation-based strategy also identified the regulatory pairs that were controlled by nonbinding sites with a particular pattern, which provided abundant complementarity for studying the mechanism of miRNAs that regulate gene expression. In summary, our study improved the annotation of known miRNAs, identified a large number of novel miRNAs, and predicted target genes for all pig miRNAs by using massive public data. This large data-based strategy is also applicable for other nonmodel organisms with incomplete annotation information.
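The correlation-based target prediction described in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state which correlation measure, sign convention, or cutoff was used, so the Pearson coefficient, the absolute-value threshold `min_abs_r`, and all miRNA and gene names below are assumptions chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_candidate_targets(mirna_expr, gene_expr, min_abs_r=0.8):
    """Return (miRNA, gene, r) tuples with |r| >= min_abs_r, strongest first.

    mirna_expr and gene_expr map a name to expression values measured
    across the same ordered set of samples (sRNA-seq and RNA-seq).
    The cutoff min_abs_r is an assumed, illustrative value.
    """
    pairs = []
    for mirna, m_vals in mirna_expr.items():
        for gene, g_vals in gene_expr.items():
            r = pearson(m_vals, g_vals)
            if abs(r) >= min_abs_r:
                pairs.append((mirna, gene, r))
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


# Toy data with hypothetical names; four matched samples per profile.
toy_mirna = {"miR-A": [1.0, 2.0, 3.0, 4.0]}
toy_genes = {
    "GENE1": [8.0, 6.0, 4.0, 2.0],  # falls as miR-A rises
    "GENE2": [5.0, 5.1, 4.9, 5.0],  # essentially flat
}
toy_result = rank_candidate_targets(toy_mirna, toy_genes)
print(toy_result)
```

In this toy input only the miR-A/GENE1 pair passes the cutoff. A real analysis would also need multiple-testing control and normalization across datasets, which the sketch leaves out.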


2020, Vol 49 (D1), pp. D877-D883
Author(s): Fangzhou Xie, Shurong Liu, Junhao Wang, Jiajia Xuan, Xiaoqin Zhang, ...

Abstract Eukaryotic genomes encode thousands of small and large non-coding RNAs (ncRNAs). However, the expression, functions and evolution of these ncRNAs are still largely unknown. In this study, we have updated deepBase to version 3.0 (deepBase v3.0, http://rna.sysu.edu.cn/deepbase3/index.html), an increasingly popular and openly licensed resource that facilitates integrative and interactive display and analysis of the expression, evolution, and functions of various ncRNAs by deeply mining thousands of high-throughput sequencing data from tissue, tumor and exosome samples. We updated deepBase v3.0 to provide the most comprehensive expression atlas of small RNAs and lncRNAs by integrating ∼67 620 data from 80 normal tissues and ∼50 cancer tissues. The extracellular patterns of various ncRNAs were profiled to explore their applications for discovery of noninvasive biomarkers. Moreover, we constructed survival maps of tRNA-derived RNA Fragments (tRFs), miRNAs, snoRNAs and lncRNAs by analyzing >45 000 cancer sample data and corresponding clinical information. We also developed interactive web tools to analyze the differential expression and biological functions of various ncRNAs in ∼50 types of cancers. This update is expected to provide a variety of new modules and graphic visualizations to facilitate analyses and explorations of the functions and mechanisms of various types of ncRNAs.


2007, Vol 27 (13), pp. 4641-4651
Author(s): Junjiang Fu, Ho-Geun Yoon, Jun Qin, Jiemin Wong

ABSTRACT P-TEFb, comprised of CDK9 and a cyclin T subunit, is a global transcriptional elongation factor important for most RNA polymerase II (pol II) transcription. P-TEFb facilitates transcription elongation in part by phosphorylating Ser2 of the heptapeptide repeat of the carboxy-terminal domain (CTD) of the largest subunit of pol II. Previous studies have shown that P-TEFb is subjected to negative regulation by forming an inactive complex with 7SK small RNA and HEXIM1. In an effort to investigate the molecular mechanism by which corepressor N-CoR mediates transcription repression, we identified HEXIM1 as an N-CoR-interacting protein. This finding led us to test whether the P-TEFb complex is regulated by acetylation. We demonstrate that CDK9 is an acetylated protein in cells and can be acetylated by p300 in vitro. Through both in vitro and in vivo assays, we identified lysine 44 of CDK9 as a major acetylation site. We present evidence that CDK9 is regulated by N-CoR and its associated HDAC3 and that acetylation of CDK9 affects its ability to phosphorylate the CTD of pol II. These results suggest that acetylation of CDK9 is an important posttranslational modification that is involved in regulating P-TEFb transcriptional elongation function.


2021, Vol 12 (1)
Author(s): Carlos G. Urzúa-Traslaviña, Vincent C. Leeuwenburgh, Arkajyoti Bhattacharya, Stefan Loipfinger, Marcel A. T. M. van Vugt, ...

Abstract The interpretation of high-throughput sequencing data is limited by our incomplete functional understanding of coding and non-coding transcripts. Reliably predicting the function of such transcripts can overcome this limitation. Here we report the use of a consensus independent component analysis and guilt-by-association approach to predict over 23,000 functional groups comprised of over 55,000 coding and non-coding transcripts using publicly available transcriptomic profiles. We show that, compared to using Principal Component Analysis, Independent Component Analysis-derived transcriptional components enable more confident functionality predictions, improve predictions when new members are added to the gene sets, and are less affected by gene multi-functionality. Predictions generated using human or mouse transcriptomic data are made available for exploration in a publicly available web portal.

