hyperTRIBER: a flexible R package for the analysis of differential RNA editing

2021 ◽  
Author(s):  
Sarah Rennie ◽  
Daniel Heidar Magnusson ◽  
Robin Andersson

RNA editing by ADAR (adenosine deaminase acting on RNA) is attracting increasing interest in the field of post-transcriptional regulation. When ADAR is fused to an RNA-binding protein (RBP) of interest, its catalytic activity produces A-to-I RNA edits, and identifying these edits reveals the RNA transcripts bound by the RBP. However, the computational tools available for identifying edits and for the statistical analysis of differential RNA editing are limited or too specialised for general-purpose use. Here we present hyperTRIBER, a flexible suite of tools, wrapped into a convenient R package, for the detection of differential RNA editing. hyperTRIBER is applicable to complex scenarios and experimental designs, and provides a robust statistical framework that allows control for read coverage at a given base, total expression level and other covariates. We demonstrate the capabilities of our approach on HyperTRIBE RNA-seq data for the detection of RNAs bound by the N6-methyladenosine (m6A) reader protein ECT2 in Arabidopsis roots. We show that hyperTRIBER finds edits with high statistical power, even where editing proportions and RNA transcript expression levels are low, demonstrating its usability and versatility for analysing differential RNA editing.

2019 ◽  
Vol 8 (4) ◽  
pp. 19 ◽  
Author(s):  
Tyler Weirick ◽  
Giuseppe Militello ◽  
Mohammed Rabiul Hosen ◽  
David John ◽  
Joseph B. Moore ◽  
...  

Studies in epitranscriptomics indicate that RNA is modified by a variety of enzymes. Among these RNA modifications, adenosine to inosine (A-to-I) RNA editing occurs frequently in the mammalian transcriptome. These RNA editing sites can be detected directly from RNA sequencing (RNA-seq) data by examining nucleotide changes from adenosine (A) to guanine (G), which substitutes for inosine (I). However, such nucleotide changes must be investigated carefully to distinguish sequencing errors and genomic mutations from genuine editing sites. Building upon our recent introduction of an easy-to-use bioinformatics tool, RNA Editor, to detect RNA editing events from RNA-seq data, we examined the extent to which RNA editing events affect the binding of RNA-binding proteins (RBPs). Using bioinformatic techniques, we found that RNA editing sites occur frequently in RBP-bound regions. Moreover, RNA editing sites were more frequent when RNA editing islands, which are regions in which RNA editing sites are present in clusters, were examined. When the binding of one RBP, human antigen R [HuR; encoded by ELAV-like protein 1 (ELAV1)], was quantified experimentally, its binding was reduced upon silencing of the RNA editing enzyme adenosine deaminase acting on RNA (ADAR) compared to the control, suggesting that the presence of RNA editing islands influences HuR binding to its target regions. These data indicate that RNA editing is an important mediator of RBP–RNA interactions, a mechanism that likely constitutes an additional mode of post-transcriptional gene regulation in biological systems.
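
The detection idea described above, reading A-to-G mismatches at reference adenosines as candidate edits, can be sketched in a few lines of Python. This is a minimal illustration and not the RNA Editor implementation: the input layout, the function name `editing_candidates`, and the thresholds `min_coverage` and `min_g_reads` are all assumptions made for the example. It handles plus-strand sites only and does not filter genomic variants, which a real pipeline would need to do.

```python
def editing_candidates(sites, min_coverage=10, min_g_reads=2):
    """Return (position, editing proportion) for candidate A-to-G sites.

    `sites` is a hypothetical list of dicts with keys:
      pos    - genomic position
      ref    - reference base at that position
      counts - dict of read-base counts, e.g. {"A": 15, "G": 5}
    The thresholds are illustrative assumptions, not published defaults.
    """
    candidates = []
    for site in sites:
        # Only reference adenosines can carry an A-to-I (read as A-to-G) edit.
        if site["ref"] != "A":
            continue
        a_reads = site["counts"].get("A", 0)
        g_reads = site["counts"].get("G", 0)
        coverage = a_reads + g_reads
        # Skip sites with too little evidence; low counts are more likely
        # to reflect sequencing errors than genuine editing.
        if coverage < min_coverage or g_reads < min_g_reads:
            continue
        candidates.append((site["pos"], g_reads / coverage))
    return candidates


example_sites = [
    {"pos": 101, "ref": "A", "counts": {"A": 15, "G": 5}},
    {"pos": 102, "ref": "C", "counts": {"C": 20}},
    {"pos": 103, "ref": "A", "counts": {"A": 3, "G": 2}},
]
print(editing_candidates(example_sites))
```

In this toy input only position 101 passes both filters. Position 102 is not a reference A, and position 103 has too few reads.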


Author(s):  
Medhat Mahmoud ◽  
Ngoc-Thuy Ha ◽  
Henner Simianer ◽  
Timothy Beissinger

Abstract Identifying selection on polygenic complex traits in crops and livestock is key to understanding evolution and helps prioritize important characteristics for breeding. However, the QTL that contribute to polygenic trait variation exhibit small or infinitesimal effects. This hinders the ability to detect QTL controlling polygenic traits because enormously high statistical power is needed for their detection. Recently, we circumvented this challenge by introducing a method to identify selection on complex traits by evaluating the relationship between genome-wide changes in allele frequency and estimates of effect size. The method involves calculating a composite statistic across all markers that captures this relationship, followed by a linkage disequilibrium-aware permutation test to evaluate whether the observed pattern differs from that expected due to drift during evolution and population stratification. In this manuscript, we describe “Ghat”, an R package developed to implement the method to test for selection on polygenic traits. We demonstrate the package by applying it to test for polygenic selection on 15 published European winter wheat traits including yield, biomass, quality, morphological characteristics, and disease resistance traits. The results highlight the power of Ghat to identify selection on complex traits. The Ghat package is accessible on CRAN, the Comprehensive R Archive Network, and on GitHub.
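
The two steps named in the abstract, a composite statistic followed by a permutation test, can be illustrated with a short Python sketch. It rests on stated assumptions and is not the Ghat package's code. The statistic is taken to be the sum over markers of effect size times allele-frequency change, which is one simple way to capture the relationship the abstract describes. The permutation step shuffles effect sizes freely, which ignores linkage disequilibrium and is therefore a simplification of the LD-aware test. The function names and example values are hypothetical.

```python
import random


def composite_statistic(effects, freq_changes):
    """Sum over markers of (estimated effect size x allele-frequency change)."""
    return sum(a * d for a, d in zip(effects, freq_changes))


def permutation_pvalue(effects, freq_changes, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the composite statistic.

    Simplifying assumption: effect sizes are shuffled independently across
    markers, so linkage disequilibrium between markers is NOT accounted for.
    """
    rng = random.Random(seed)
    observed = composite_statistic(effects, freq_changes)
    shuffled = list(effects)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(composite_statistic(shuffled, freq_changes)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


# Toy data: alleles with positive effects rose in frequency, negative ones fell.
effects = [0.5, 0.4, 0.3, 0.2, -0.2, -0.3, -0.4, -0.5]
freq_changes = [0.05, 0.04, 0.03, 0.02, -0.02, -0.03, -0.04, -0.05]
stat, p_value = permutation_pvalue(effects, freq_changes)
print(stat, p_value)
```

A positive statistic with a small p-value is the pattern one would expect if allele frequencies had shifted in the direction that increases the trait.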


2021 ◽  
Author(s):  
Jian-Rong Li ◽  
Mabel Tang ◽  
Yafang Li ◽  
Christopher I Amos ◽  
Chao Cheng

Abstract Expression quantitative trait loci (eQTL) analyses have been widely used to identify genetic variants associated with gene expression levels and to understand the molecular mechanisms that underlie genetic traits. The resultant eQTLs might affect the expression of associated genes through transcriptional or post-transcriptional regulation. In this study, we attempt to distinguish these two types of regulation by identifying genetic variants associated with mRNA stability of genes (stQTLs). Specifically, we computationally inferred mRNA stability of genes based on RNA-seq data and performed association analysis to identify stQTLs. Using the Genotype-Tissue Expression (GTEx) lung RNA-Seq data, we identified a total of 142,801 stQTLs for 3,942 genes and 186,132 eQTLs for 4,751 genes from 15,122,700 genetic variants for 13,476 genes, respectively. Interestingly, our results indicated that stQTLs were enriched in the CDS and 3’UTR regions, while eQTLs were enriched in the CDS, 3’UTR, 5’UTR, and upstream regions. We also found that stQTLs are more likely than eQTLs to overlap with RNA-binding protein (RBP) and microRNA (miRNA) binding sites. Our analyses demonstrate that simultaneous identification of stQTLs and eQTLs can provide more mechanistic insight into the association between genetic variants and gene expression levels.
Author Summary In the past decade, many studies have identified genetic variants associated with gene expression level (eQTLs) in different phenotypes, including tissues and diseases. Gene expression is the result of cooperation between transcriptional regulation, such as transcriptional activity, and post-transcriptional regulation, such as mRNA stability. Here, we present a computational framework that takes advantage of recently developed methods to estimate mRNA stability from RNA-Seq, which is widely used to estimate gene expression, and then to identify genetic variants associated with mRNA stability (stQTLs) in lung tissue. Compared to eQTLs, we found that genetic variants that affect mRNA stability are more significantly located in the CDS and 3’UTR regions, which are known to interact with RNA-binding proteins (RBPs) or microRNAs to regulate stability. In addition, stQTLs are significantly more likely to overlap the binding sites of RBPs. We show that the six RBPs that most significantly bind to stQTLs are all known to regulate mRNA stability. This pipeline of simultaneously identifying eQTLs and stQTLs using only RNA-Seq data can provide higher resolution than traditional eQTL studies for understanding how genetic variants regulate gene expression.


2021 ◽  
Vol 129 (Suppl_1) ◽  
Author(s):  
Chen Gao ◽  
Zhaojun Xiong ◽  
Jianfang Liu ◽  
Nancy Cao ◽  
Tomohiro Yokota ◽  
...  

Post-transcriptional regulation plays a key role in transcriptome reprogramming during cardiac pathogenesis. In previous studies, we identified that the cardiac-enriched RNA-binding protein RBFox1 plays a key role in cardiac hypertrophy through regulation of mRNA alternative splicing in nuclei. However, the RBFox1 gene also generates a cytosolic isoform (RBFox1c), suggesting additional functions in post-transcriptional regulation in the heart. In adult heart, RBFox1c mRNA constituted ~40% of the total RBFox1 level but was significantly repressed in pressure-overloaded failing mouse heart. Using CRISPR-Cas9 technology, we established an isoform-specific RBFox1c-cKO mouse. At baseline, inactivation of RBFox1c led to decreased cardiac function along with induction of cardiac fibrosis. RBFox1c-cKO mice also showed macrophage infiltration into the myocardium 7 days post-MI. In contrast, restoration of RBFox1c expression in adult intact hearts significantly reduced cardiac fibrosis post stress. RNA-seq analyses in RBFox1c-expressing cardiomyocytes showed that RBFox1c specifically suppressed the expression of pro-inflammatory genes. In addition, CLIP-Seq analysis and targeted RNA-IP showed that RBFox1c could directly interact with inflammatory pathway mRNAs. These results suggested that the inflammatory mRNAs are direct downstream targets regulated by RBFox1c. Using both in vitro cultured cardiomyocytes and intact mouse hearts, we demonstrated that expression of RBFox1c reduces pro-inflammatory mRNA expression at baseline and upon hypertrophy stimulation. Lastly, we characterized the interactome of RBFox1c through proteomic analysis and found that RBFox1c specifically interacted with Upf1, a component of the RNA NMD machinery. RBFox1c interaction with Upf1 in cardiomyocytes was diminished upon hypertrophic stress. Furthermore, by inactivating Upf1 via siRNA, we demonstrated that RBFox1c-mediated repression of pro-inflammatory genes was Upf1 dependent.
RBFox1 regulates cardiac transcriptome reprogramming in two post-transcriptional processes via distinct isoforms. While RBFox1n regulates RNA splicing, RBFox1c functions through targeted repression of pro-inflammatory mRNAs by recruiting Upf1-mediated RNA degradation.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Kip D. Zimmerman ◽  
Carl D. Langefeld

Abstract Background Study design is a critical aspect of any experiment, and sample size calculations for statistical power that are consistent with that study design are central to robust and reproducible results. However, the existing power calculators for tests of differential expression in single-cell RNA-seq data focus on the total number of cells and not the number of independent experimental units, the true unit of interest for power. Thus, current methods grossly overestimate the power. Results Hierarchicell is the first single-cell power calculator to explicitly simulate and account for the hierarchical correlation structure (i.e., within sample correlation) that exists in single-cell RNA-seq data. Hierarchicell, an R-package available on GitHub, estimates the within sample correlation structure from real data to simulate hierarchical single-cell RNA-seq data and estimate power for tests of differential expression. This multi-stage approach models gene dropout rates, intra-individual dispersion, inter-individual variation, variable or fixed number of cells per individual, and the correlation among cells within an individual. Without modeling the within sample correlation structure and without properly accounting for the correlation in downstream analysis, we demonstrate that estimates of power are falsely inflated. Hierarchicell can be used to estimate power for binary and continuous phenotypes based on user-specified number of independent experimental units (e.g., individuals) and cells within the experimental unit. Conclusions Hierarchicell is a user-friendly R-package that provides accurate estimates of power for testing hypotheses of differential expression in single-cell RNA-seq data. This R-package represents an important addition to single-cell RNA analytic tools and will help researchers design experiments with appropriate and accurate power, increasing discovery and improving robustness and reproducibility.
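
Why counting cells instead of individuals inflates power can be seen from the classical design-effect formula for clustered data. The sketch below is not Hierarchicell's method, which the abstract describes as simulation-based. It assumes equal numbers of cells per individual and a single intra-individual correlation `icc`, and the function name is hypothetical.

```python
def effective_sample_size(n_individuals, cells_per_individual, icc):
    """Approximate number of independent observations in clustered data.

    Uses the classical design effect 1 + (m - 1) * icc, where m is the
    (assumed equal) number of cells per individual and icc is the
    correlation among cells from the same individual.
    """
    design_effect = 1 + (cells_per_individual - 1) * icc
    return n_individuals * cells_per_individual / design_effect


# 10 individuals with 500 cells each: 5000 cells in total.
for icc in (0.0, 0.1, 1.0):
    print(icc, round(effective_sample_size(10, 500, icc), 1))
```

With no within-individual correlation, all 5000 cells count as independent. With an `icc` of 0.1 the effective sample size drops to roughly 98, and with perfect correlation it equals the 10 individuals. A power calculation based on the 5000 cells would therefore be far too optimistic whenever cells from the same individual are correlated.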


2021 ◽  
Author(s):  
Eun Seon Kim ◽  
Chang Geon Chung ◽  
Jeong Hyang Park ◽  
Byung Su Ko ◽  
Sung Soon Park ◽  
...  

Abstract RNA-binding proteins (RBPs) play essential roles in diverse cellular processes through post-transcriptional regulation of RNAs. The subcellular localization of RBPs is thus under tight control, the breakdown of which is associated with aberrant cytoplasmic accumulation of nuclear RBPs such as TDP-43 and FUS, well-known pathological markers for amyotrophic lateral sclerosis and frontotemporal dementia (ALS/FTD). Here, we report in a Drosophila model for ALS/FTD that nuclear accumulation of a cytoplasmic RBP, Staufen, may be a new pathological feature. We found that in Drosophila C4da neurons expressing PR36, one of the arginine-rich dipeptide repeat proteins (DPRs), Staufen accumulated in the nucleus in an Importin- and RNA-dependent manner. Notably, expressing Staufen with an exogenous NLS (but not with a mutated endogenous NLS) potentiated PR-induced dendritic defects, suggesting that nuclear-accumulated Staufen can enhance PR toxicity. PR36 expression increased Fibrillarin staining in the nucleolus, which was enhanced by heterozygous mutation of stau (stau+/−), the gene that codes for Staufen. Furthermore, knockdown of fib, which codes for Fibrillarin, exacerbated retinal degeneration mediated by PR toxicity, suggesting that the increased amount of Fibrillarin in stau+/− is protective. Stau+/− also reduced the amount of PR-induced nuclear-accumulated Staufen, mitigated retinal degeneration, and rescued the viability of flies expressing PR36. Taken together, our data show that nuclear accumulation of Staufen in neurons may be an important pathological feature contributing to the pathogenesis of ALS/FTD.


Author(s):  
Irzam Sarfraz ◽  
Muhammad Asif ◽  
Joshua D Campbell

Abstract Motivation R Experiment objects such as the SummarizedExperiment or SingleCellExperiment are data containers for storing one or more matrix-like assays along with associated row and column data. These objects have been used to facilitate the storage and analysis of high-throughput genomic data generated from technologies such as single-cell RNA sequencing. One common computational task in many genomics analysis workflows is to perform subsetting of the data matrix before applying downstream analytical methods. For example, one may need to subset the columns of the assay matrix to exclude poor-quality samples or subset the rows of the matrix to select the most variable features. Traditionally, a second object is created that contains the desired subset of the assay from the original object. However, this approach is inefficient as it requires the creation of an additional object containing a copy of the original assay and leads to challenges with data provenance. Results To overcome these challenges, we developed an R package called ExperimentSubset, which is a data container that implements classes for efficient storage and streamlined retrieval of assays that have been subsetted by rows and/or columns. These classes are able to inherently provide data provenance by maintaining the relationship between the subsetted and parent assays. We demonstrate the utility of this package on a single-cell RNA-seq dataset by storing and retrieving subsets at different stages of the analysis while maintaining a lower memory footprint. Overall, ExperimentSubset is a flexible container for the efficient management of subsets. Availability and implementation The ExperimentSubset package is available at Bioconductor: https://bioconductor.org/packages/ExperimentSubset/ and GitHub: https://github.com/campbio/ExperimentSubset. Supplementary information Supplementary data are available at Bioinformatics online.
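
The storage idea described above, keeping row and column indices plus a link to the parent instead of copying the assay, can be sketched in Python. This is a conceptual illustration only and does not reproduce the ExperimentSubset R API. The class `SubsetStore` and its methods are hypothetical names, and the assay is a plain list of lists instead of a Bioconductor object.

```python
class SubsetStore:
    """Toy container that records subsets as index lists, not data copies."""

    def __init__(self, assay):
        self.assay = assay   # parent matrix as a list of rows
        self.subsets = {}    # name -> (row indices, col indices, parent name)

    def create_subset(self, name, rows, cols, parent=None):
        """Register a subset; indices are relative to `parent` if given."""
        if parent is not None:
            parent_rows, parent_cols, _ = self.subsets[parent]
            # Translate indices into the coordinates of the root assay.
            rows = [parent_rows[i] for i in rows]
            cols = [parent_cols[j] for j in cols]
        self.subsets[name] = (list(rows), list(cols), parent)

    def get(self, name):
        """Materialise the subset from the root assay on demand."""
        rows, cols, _ = self.subsets[name]
        return [[self.assay[i][j] for j in cols] for i in rows]

    def lineage(self, name):
        """Return the chain of subset names from `name` back to its origin."""
        chain = []
        while name is not None:
            chain.append(name)
            name = self.subsets[name][2]
        return chain


store = SubsetStore([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
store.create_subset("filtered", rows=[0, 2], cols=[0, 1, 2])
store.create_subset("variable", rows=[1], cols=[0, 2], parent="filtered")
print(store.get("variable"), store.lineage("variable"))
```

Only the index lists grow as subsets are added, and the parent link recorded with each subset is what provides the provenance trail.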


Author(s):  
Jiaying Zhu ◽  
Changhao Li ◽  
Xu Peng ◽  
Xiuren Zhang

Abstract The majority of the genome is transcribed to RNA in living organisms. RNA transcripts can form astonishing arrays of secondary and tertiary structures via Watson-Crick, Hoogsteen or wobble base pairing. In vivo, RNA folding is not a simple thermodynamic event of minimizing free energy. Instead, the process is constrained by transcription, RNA-binding proteins (RBPs), steric factors and the micro-environment. RNA secondary structure (RSS) plays myriad roles in numerous biological processes, such as RNA processing, stability, transportation and translation in prokaryotes and eukaryotes. Emerging evidence has also implicated RSS in RNA trafficking, liquid-liquid phase separation and plant responses to environmental variations such as temperature and salinity. At the molecular level, RSS is correlated with the regulation of splicing, polyadenylation, protein synthesis, and miRNA biogenesis and function. In this review, we summarize newly reported methods for probing RSS in vivo and the functions and mechanisms of RSS in plant physiology.

