Characterizing RNA stability genome-wide through combined analysis of PRO-seq and RNA-seq data

Abstract

Background: The concentrations of distinct types of RNA in cells result from a dynamic equilibrium between RNA synthesis and decay. Despite the critical importance of RNA decay rates, current approaches for measuring them are generally labor-intensive, limited in sensitivity, and/or disruptive to normal cellular processes. Here, we introduce a simple method for estimating relative RNA half-lives that is based on two standard and widely available high-throughput assays: Precision Run-On sequencing (PRO-seq) and RNA sequencing (RNA-seq).

Results: Our method treats PRO-seq as a measure of transcription rate and RNA-seq as a measure of RNA concentration, and estimates the rate of RNA decay required for a steady-state equilibrium. We show that this approach can be used to assay relative RNA half-lives genome-wide, with good accuracy and sensitivity for both coding and noncoding transcription units. Using a structural equation model (SEM), we test several features of transcription units, nearby DNA sequences, and nearby epigenomic marks for associations with RNA stability after controlling for their effects on transcription. We find that RNA splicing-related features are positively correlated with RNA stability, whereas features related to miRNA binding and DNA methylation are negatively correlated with RNA stability. Furthermore, we find that a measure based on U1 binding and polyadenylation sites distinguishes between unstable noncoding and stable coding transcripts but is not predictive of relative stability within the mRNA or lincRNA classes. We also identify several histone modifications that are associated with RNA stability.

Conclusion: We introduce an approach for estimating the relative half-lives of individual RNAs. Together, our estimation method and systematic analysis shed light on the pervasive impacts of RNA stability on cellular RNA concentrations.
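
The steady-state reasoning in this abstract can be illustrated with a minimal sketch. If RNA concentration R changes as dR/dt = β - αR, with β the synthesis rate and α the decay rate, then at steady state α = β/R, so the half-life ln(2)/α is proportional to R/β. The snippet below assumes that PRO-seq signal is proportional to β and RNA-seq signal is proportional to R. The function name, the log2 scale, and the example values are illustrative assumptions, not the authors' implementation.

```python
import math


def log_relative_half_life(rna_seq: float, pro_seq: float) -> float:
    """Return log2(RNA-seq / PRO-seq) as a relative half-life score.

    Assumes steady state, where decay rate = synthesis rate / concentration,
    so half-life is proportional to concentration / synthesis rate.
    The score is only meaningful relative to other transcription units.
    """
    if rna_seq <= 0 or pro_seq <= 0:
        raise ValueError("both signals must be positive")
    return math.log2(rna_seq / pro_seq)


# Hypothetical normalized read densities: (RNA-seq, PRO-seq).
example_units = {
    "unit_a": (200.0, 50.0),  # high RNA level for its transcription rate
    "unit_b": (25.0, 50.0),   # low RNA level for the same transcription rate
}

scores = {
    name: log_relative_half_life(rna, pro)
    for name, (rna, pro) in example_units.items()
}

# Rank units from most to least stable under these assumptions.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because the proportionality constants of the two assays are unknown, such a ratio can rank transcription units against each other but does not yield absolute half-lives.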

Characterizing RNA stability genome-wide through combined analysis of PRO-seq and RNA-seq data

10.1101/690644 ◽

2019 ◽

Cited By ~ 5

Author(s):

Amit Blumberg ◽

Yixin Zhao ◽

Yi-Fei Huang ◽

Noah Dukler ◽

Edward J. Rice ◽

...

Keyword(s):

Dna Sequences ◽

Rna Splicing ◽

Structural Equation ◽

Rna Stability ◽

Estimation Method ◽

Equation Model ◽

Rna Seq ◽

Simple Method ◽

Systematic Analysis ◽

Genome Wide

Abstract: The rate at which RNA molecules decay is a key determinant of cellular RNA concentrations, yet current approaches for measuring RNA half-lives are generally labor-intensive, limited in sensitivity, and/or disruptive to normal cellular processes. Here we introduce a simple method for estimating relative RNA half-lives that is based on two standard and widely available high-throughput assays: Precision Run-On and sequencing (PRO-seq) and RNA sequencing (RNA-seq). Our method treats PRO-seq as a measure of transcription rate and RNA-seq as a measure of RNA concentration, and estimates the rate of RNA decay required for a steady-state equilibrium. We show that this approach can be used to assay relative RNA half-lives genome-wide, with good accuracy and sensitivity for both coding and noncoding transcription units. Using a structural equation model (SEM), we test several features of transcription units, nearby DNA sequences, and nearby epigenomic marks for associations with RNA stability after controlling for their effects on transcription. We find that RNA splicing-related features are positively correlated with RNA stability, whereas features related to miRNA binding, DNA methylation, and G+C-richness are negatively correlated with RNA stability. Furthermore, we find that a measure based on U1-binding and polyadenylation sites distinguishes between unstable noncoding and stable coding transcripts but is not predictive of relative stability within the mRNA or lincRNA classes. We also identify several histone modifications that are associated with RNA stability. Together, our estimation method and systematic analysis shed light on the pervasive impacts of RNA stability on cellular RNA concentrations.

Distinct regulation of hippocampal neuroplasticity and ciliary genes by corticosteroid receptors

Nature Communications ◽

10.1038/s41467-021-24967-z ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Karen R. Mifsud ◽

Clare L. M. Kennedy ◽

Silvia Salatino ◽

Eshita Sharma ◽

Emily M. Price ◽

...

Keyword(s):

Dna Sequences ◽

Glucocorticoid Receptors ◽

Acute Stress ◽

Circadian Variation ◽

Rna Seq ◽

Physiological Regulation ◽

Behavioural Adaptation ◽

Neuronal Progenitor ◽

Genome Wide ◽

Transcriptional Changes

Abstract: Glucocorticoid hormones (GCs), acting through hippocampal mineralocorticoid receptors (MRs) and glucocorticoid receptors (GRs), are critical to physiological regulation and behavioural adaptation. We conducted genome-wide MR and GR ChIP-seq and Ribo-Zero RNA-seq studies on rat hippocampus to elucidate MR- and GR-regulated genes under circadian variation or acute stress. In a subset of genes, these physiological conditions resulted in enhanced MR and/or GR binding to DNA sequences and associated transcriptional changes. Binding of MR at a substantial number of sites, however, remained unchanged. MR and GR binding occur at overlapping as well as distinct loci. Moreover, although the GC response element (GRE) was the predominant motif, the transcription factor recognition site composition within MR and GR binding peaks shows marked differences. Pathway analysis uncovered that MR and GR regulate a substantial number of genes involved in synaptic/neuro-plasticity, cell morphology and development, behavior, and neuropsychiatric disorders. We find that MR, not GR, is the predominant receptor binding to more than 50 ciliary genes, and that MR function is linked to neuronal differentiation and ciliogenesis in human fetal neuronal progenitor cells. These results show that hippocampal MRs and GRs constitutively and dynamically regulate genomic activities underpinning neuronal plasticity and behavioral adaptation to changing environments.

DeeReCT-TSS: A novel meta-learning-based method annotates TSS in multiple cell types based on DNA sequences and RNA-seq data

10.1101/2021.07.14.452328 ◽

2021 ◽

Author(s):

Juexiao Zhou ◽

Bin Zhang ◽

Haoyang Li ◽

Longxi Zhou ◽

Zhongxiao Li ◽

...

Keyword(s):

Dna Sequences ◽

Cell Types ◽

Rna Seq ◽

Sources Of Information ◽

Genome Wide ◽

A Genome ◽

Meta Learning ◽

Multiple Cell ◽

Different Cell Types ◽

Genome Scale

The accurate annotation of transcription start sites (TSSs) and their usage is critical for the mechanistic understanding of gene regulation under different biological contexts. To this end, specific high-throughput experimental technologies have been developed to capture TSSs in a genome-wide manner. Various computational tools have also been developed for in silico prediction of TSSs based solely on genomic sequences. Most of these tools produce large numbers of false positive predictions when applied at genome scale. Here, we present DeeReCT-TSS, a deep-learning-based method that is capable of TSS identification across the whole genome based on DNA sequences and conventional RNA-seq data. We show that by effectively incorporating these two sources of information, DeeReCT-TSS significantly outperforms other solely sequence-based methods on the precise annotation of TSSs used in different cell types. Furthermore, we develop a meta-learning-based extension for simultaneous TSS annotation on 10 cell types, which enables the identification of cell-type-specific TSSs. Finally, we demonstrate the high precision of DeeReCT-TSS on two independent datasets from the ENCODE project by correlating our predicted TSSs with experimentally defined TSS chromatin states.

Metabolic labeling of RNAs uncovers hidden features and dynamics of the Arabidopsis thaliana transcriptome

10.1101/588780 ◽

2019 ◽

Author(s):

Emese Xochitl Szabo ◽

Philipp Reichert ◽

Marie-Kristin Lehniger ◽

Marilena Ohmer ◽

Marcella de Francisco Amorim ◽

...

Keyword(s):

Arabidopsis Thaliana ◽

Transcriptome Analysis ◽

Rna Processing ◽

Rna Synthesis ◽

Rna Stability ◽

Metabolic Labeling ◽

Rna Seq ◽

Genome Wide ◽

Stability Measurement ◽

Chemical Inhibition

Abstract: Transcriptome analysis by RNA sequencing (RNA-seq) has become an indispensable core research tool in modern plant biology. Virtually all RNA-seq studies provide a snapshot of the steady-state transcriptome, which contains valuable information about RNA populations at a given time, but lacks information about the dynamics of RNA synthesis and degradation. Only a few specialized sequencing techniques, such as global run-on sequencing (GRO-seq), have been applied in plants and provide information about RNA synthesis rates. Here, we demonstrate that RNA labeling with a modified, non-toxic uridine analog, 5-ethynyl uridine (5-EU), in Arabidopsis thaliana seedlings provides insight into the dynamic nature of a plant transcriptome. Pulse-labeling with 5-EU allowed the detection and analysis of nascent and unstable RNAs, of RNA processing intermediates generated by splicing, and of chloroplast RNAs. We also conducted pulse-chase experiments with 5-EU, which allowed us to determine RNA stabilities without the need for chemical inhibition of transcription using compounds such as actinomycin and cordycepin. Genome-wide analysis of RNA stabilities by 5-EU pulse-chase experiments revealed that this inhibitor-free RNA stability measurement results in RNA half-lives much shorter than those reported after chemical inhibition of transcription. In summary, our results show that the Arabidopsis nascent transcriptome contains unstable RNAs and RNA processing intermediates, and suggest that half-lives of plant RNAs are largely overestimated. Our results lay the ground for an easy and affordable nascent transcriptome analysis and inhibitor-free analysis of RNA stabilities in plants.
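
As a rough illustration of how a half-life can be read from pulse-chase data, the sketch below assumes simple first-order decay, in which the labeled fraction remaining after chase time t is exp(-k t) and the half-life is ln(2)/k. The abstract does not state the decay model or fitting procedure the authors used, so the function name, the log-linear least-squares fit, and the example time course are all assumptions made for illustration.

```python
import math


def half_life_from_chase(times, labeled_signal):
    """Estimate a half-life from a chase time course.

    Fits ln(signal) = intercept - k * time by ordinary least squares and
    returns ln(2) / k. Assumes first-order decay and positive signals.
    """
    if len(times) != len(labeled_signal) or len(times) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(s) for s in labeled_signal]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    decay_rate = -sxy / sxx  # k, in units of 1 / time
    if decay_rate <= 0:
        raise ValueError("signal does not decay over the chase")
    return math.log(2) / decay_rate


# Hypothetical chase: labeled signal halves every hour.
chase_hours = [0.0, 1.0, 2.0, 4.0]
labeled = [1.0, 0.5, 0.25, 0.0625]

half_life_hours = half_life_from_chase(chase_hours, labeled)
```

With real data the points would scatter around the fitted line, and transcripts that do not follow single-exponential decay would need a different model.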

Predicting chromatin interactions between open chromatin regions from DNA sequences

10.1101/720748 ◽

2019 ◽

Cited By ~ 3

Author(s):

Fan Cao ◽

Ying Zhang ◽

Yan Ping Loh ◽

Yichao Cai ◽

Melissa J. Fullwood

Keyword(s):

Rna Polymerase Ii ◽

Dna Sequences ◽

State Of The Art ◽

Lymphocytic Leukemia ◽

Computational Method ◽

Chromatin Interaction ◽

Open Chromatin ◽

Rna Seq ◽

Chromatin Interactions ◽

Genome Wide

Abstract: Chromatin interactions play important roles in regulating gene expression. However, the availability of genome-wide chromatin interaction data is very limited. Various computational methods have been developed to predict chromatin interactions. Most of these methods rely on large collections of ChIP-Seq/RNA-Seq/DNase-Seq datasets and predict only enhancer-promoter interactions. Some of the ‘state-of-the-art’ methods have poor experimental designs, leading to exaggerated performance and misleading conclusions. Here we developed a computational method, Chromatin Interaction Neural Network (CHINN), to predict chromatin interactions between open chromatin regions by using only DNA sequences of the interacting open chromatin regions. CHINN is able to predict CTCF- and RNA polymerase II-associated chromatin interactions between open chromatin regions. CHINN also shows good cross-sample performance and captures various sequence features that are predictive of chromatin interactions. We applied CHINN to 84 chronic lymphocytic leukemia (CLL) samples and detected systematic differences in the chromatin interactome between IGVH-mutated and IGVH-unmutated CLL samples.

Global Analysis of the Human RNA Degradome Reveals Widespread Decapped and Endonucleolytic Cleaved Transcripts

International Journal of Molecular Sciences ◽

10.3390/ijms21186452 ◽

2020 ◽

Vol 21 (18) ◽

pp. 6452

Author(s):

Jung-Im Won ◽

JaeMoon Shin ◽

So Young Park ◽

JeeHee Yoon ◽

Dong-Hoon Jeong

Keyword(s):

Global Analysis ◽

Rna Decay ◽

Mrna Levels ◽

Rna Turnover ◽

Rna Seq ◽

Next Generation Sequencing Technology ◽

Endonucleolytic Cleavage ◽

Genome Wide ◽

Hela Cell Lines ◽

Wide Scale

RNA decay is an important regulatory mechanism for gene expression at the posttranscriptional level. Although the main pathways and major enzymes that facilitate this process are well defined, global analysis of RNA turnover remains under-investigated. Recent advances in the application of next-generation sequencing technology enable its use in order to examine various RNA decay patterns at the genome-wide scale. In this study, we investigated human RNA decay patterns using parallel analysis of RNA end-sequencing (PARE-seq) data from XRN1-knockdown HeLa cell lines, followed by a comparison of steady state and degraded mRNA levels from RNA-seq and PARE-seq data, respectively. The results revealed 1103 and 1347 transcripts classified as stable and unstable candidates, respectively. Of the unstable candidates, we found that a subset of the replication-dependent histone transcripts was polyadenylated and rapidly degraded. Additionally, we identified 380 endonucleolytically cleaved candidates by analyzing the most abundant PARE sequence on a transcript. Of these, 41.4% of genes were classified as unstable genes, which implied that their endonucleolytic cleavage might affect their mRNA stability. Furthermore, we identified 1877 decapped candidates, including HSP90B1 and SWI5, having the most abundant PARE sequences at the 5′-end positions of the transcripts. These results provide a useful resource for further analysis of RNA decay patterns in human cells.

Measurement Error Correction of Genome-Wide Polygenic Scores in Prediction Samples

10.1101/165472 ◽

2017 ◽

Cited By ~ 7

Author(s):

Elliot M. Tucker-Drob

Keyword(s):

Structural Equation Modeling ◽

Error Correction ◽

Instrumental Variable ◽

Latent Variable ◽

Structural Equation ◽

Estimation Method ◽

Equation Modeling ◽

Variable Approach ◽

Genome Wide ◽

Polygenic Scores

Abstract/Introduction: DiPrete, Burik, & Koellinger (2017; http://dx.doi.org/10.1101/134197) propose using an instrumental variable (IV) framework to correct genome-wide polygenic scores (GPSs) for error, thereby producing disattenuated estimates of SNP heritability in prediction samples. They demonstrate their approach by producing two independent GPSs for Educational Attainment (“multiple indicators”) in a prediction sample (Health and Retirement Study; HRS) from independent sets of SNP regression weights, each computed from a different half of the discovery sample (EA2; Okbay et al. 2016), i.e. “by randomly splitting the GWAS sample that was used for [the GPS] construction.” Here, I elucidate how a structural equation modeling (SEM) framework that specifies true score variance in GPSs as a latent variable can be used to derive an equivalent correction to the IV approach proposed by DiPrete et al. (2017). This approach, which is rooted in a psychometric modeling tradition, has a number of advantages: (1) it formalizes the assumed data-generating model, (2) it estimates all parameters of interest in a single step, (3) it can be flexibly incorporated into a larger multivariate analysis (such as the “Genetic Instrumental Variable” approach proposed by DiPrete et al., 2017), (4) it can easily be adapted to relax assumptions (e.g. that the GPS indicators equally represent the true genetic factor score), and (5) it can easily be extended to include more than two GPS indicators. After describing how the multiple indicator approach to GPS correction can be specified as a structural equation model, I demonstrate how a structural equation modeling approach can be used to correct GPSs for error using SNP heritability obtained using GREML or LD score regression to produce a correction that is equivalent to an approach recently proposed by Daniel Benjamin and colleagues. Finally, I briefly discuss what I view as some conceptual limitations surrounding the error correction approaches described, regardless of the estimation method implemented.
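
The idea of using a second, independently constructed score to undo attenuation can be sketched with a small simulation. The example below is not the SEM specification or the analysis from the paper; it is a textbook two-indicator setup in which both scores equal a shared true score plus independent noise, and every variable name, variance, and effect size is an assumption chosen for illustration. The ordinary regression slope on one noisy score is biased toward zero, while the IV ratio cov(outcome, score 2) / cov(score 1, score 2) recovers the effect of the true score.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_people = 20000
true_effect = 0.5  # assumed effect of the true score on the outcome

# Latent true score and two noisy indicators with independent errors.
true_score = [rng.gauss(0, 1) for _ in range(n_people)]
gps_1 = [t + rng.gauss(0, 1) for t in true_score]
gps_2 = [t + rng.gauss(0, 1) for t in true_score]
outcome = [true_effect * t + rng.gauss(0, 1) for t in true_score]

# Naive slope: attenuated by the reliability of gps_1 (about 0.5 here).
ols_slope = cov(outcome, gps_1) / cov(gps_1, gps_1)

# IV slope: gps_2 serves as the instrument for gps_1.
iv_slope = cov(outcome, gps_2) / cov(gps_1, gps_2)
```

In this setup the naive slope should land near 0.25 and the IV slope near 0.5, up to sampling noise. The correction relies on the two error terms being independent of each other and of the outcome, which is the kind of assumption the abstract says an SEM formulation makes explicit.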

H3K9me2 genome-wide distribution in the holocentric insect Spodoptera frugiperda (Lepidoptera: Noctuidae)

10.1101/2021.07.07.451438 ◽

2021 ◽

Author(s):

Sandra NHIM ◽

Sylvie GIMENEZ ◽

Rima NAIT SAIDI ◽

Dany Severac ◽

Kiwoong Nam ◽

...

Keyword(s):

Dna Sequences ◽

Spodoptera Frugiperda ◽

Transcriptional Repression ◽

Active Regions ◽

Wide Distribution ◽

Model Organisms ◽

Rna Seq ◽

Protein Coding ◽

Genome Wide ◽

Lepidoptera Noctuidae

Eukaryotic genomes are packaged by histone proteins in a structure called chromatin. There are different chromatin types. Euchromatin is typically associated with decondensed, transcriptionally active regions and heterochromatin with more condensed regions of the chromosomes. Methylation of lysine 9 of histone H3 (H3K9me) is a conserved biochemical marker of heterochromatin. In many organisms, heterochromatin is usually localized at telomeric as well as pericentromeric regions but can also be found at interstitial chromosomal loci. This distribution may vary in different species depending on their general chromosomal organization. Holocentric species such as Spodoptera frugiperda (Lepidoptera: Noctuidae) possess dispersed centromeres instead of a monocentric one and thus no observable pericentromeric compartment. To identify the localization of heterochromatin in such species, we performed ChIP-Seq experiments and analyzed the distribution of the heterochromatin marker H3K9me2 in the Sf9 cell line and whole 4th instar larvae (L4) in relation to RNA-Seq data. In both samples we measured an enrichment of H3K9me2 at the (sub)telomeres, rDNA loci, and satellite DNA sequences, which could represent dispersed centromeric regions. We also observed that the density of H3K9me2 is positively correlated with transposable elements and protein-coding genes. However, contrary to most model organisms, H3K9me2 density is not correlated with transcriptional repression. This is the first genome-wide ChIP-Seq analysis conducted in S. frugiperda for H3K9me2. Compared to model organisms, this mark is found in expected chromosomal compartments such as rDNA and telomeres. However, it is also localized at numerous dispersed regions, instead of the well-described large pericentromeric domains, indicating that H3K9me2 might not represent a classical heterochromatin marker in Lepidoptera.
