A general and powerful stage-wise testing procedure for differential expression and differential transcript usage


The Inheritance Procedure: Multiple Testing of Tree-structured Hypotheses

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/1544-6115.1554 ◽

2012 ◽

Vol 11 (1) ◽

Cited By ~ 25

Author(s):

Jelle J. Goeman ◽

Livio Finos

Keyword(s):

Multiple Testing ◽

Error Control ◽

Testing Procedure ◽

Tree Structure ◽

Graph Structure ◽

Multiple Testing Procedure ◽

Natural Way ◽

Chromosome Level

Hypothesis tests in bioinformatics can often be set in a tree structure in a very natural way, e.g. when tests are performed at probe, gene, and chromosome level. Exploiting this graph structure in a multiple testing procedure may result in a gain in power or increased interpretability of the results. We present the inheritance procedure, a method of familywise error control for hypotheses structured in a tree. The method starts testing at the top of the tree, following up on those branches in which it finds significant results, and following up on leaf nodes in the neighborhood of those leaves. The method is a uniform improvement over a recently proposed method by Meinshausen. The inheritance procedure has been implemented in the globaltest package which is available on www.bioconductor.org.
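
The top-down idea in this abstract (test the root first, then descend only into branches that were rejected) can be illustrated with a minimal Python sketch. The abstract does not state how the significance level is shared among nodes, so the leaf-proportional rule below is an assumption made for illustration; it is a simple hierarchical scheme, not the inheritance procedure itself, which redistributes the level in a more refined way. The function names and the example tree are hypothetical.

```python
def leaves_under(tree, node):
    """Count the leaf nodes in the subtree rooted at `node`."""
    children = tree.get(node, [])
    if not children:
        return 1
    return sum(leaves_under(tree, child) for child in children)


def top_down_test(tree, pvals, root, alpha=0.05):
    """Test a tree of hypotheses from the root downwards.

    tree:  {node: [child, ...]}; nodes without an entry are leaves.
    pvals: {node: p-value}.
    A node is tested only if its parent was rejected.
    ASSUMPTION: each node is tested at alpha times its share of all leaves.
    """
    total_leaves = leaves_under(tree, root)
    rejected = []
    stack = [root]
    while stack:
        node = stack.pop()
        level = alpha * leaves_under(tree, node) / total_leaves
        if pvals[node] <= level:
            rejected.append(node)
            # Only rejected nodes open up their children for testing.
            stack.extend(tree.get(node, []))
    return rejected


# Hypothetical chromosome -> gene -> probe hierarchy.
example_tree = {"chr": ["g1", "g2"], "g1": ["p1", "p2"], "g2": ["p3", "p4"]}
example_p = {"chr": 0.001, "g1": 0.01, "g2": 0.2,
             "p1": 0.004, "p2": 0.5, "p3": 0.0001, "p4": 0.9}
print(sorted(top_down_test(example_tree, example_p, "chr")))
```

In this example probe `p3` has a very small p-value but is never tested, because its parent gene `g2` was not rejected. That is the price of the hierarchical structure, and the kind of loss that more refined procedures aim to reduce.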

Is there sufficient scientific evidence to rule out the use of hydroxychloroquine for postexposure prophylaxis of COVID-19?

10.31219/osf.io/d9prq ◽

2020 ◽

Author(s):

Llorenç Quintó ◽

Jose Miguel Morales-Asencio ◽

Raquel González ◽

Clara Menéndez

Keyword(s):

Public Health ◽

Effect Size ◽

Statistical Power ◽

Scientific Evidence ◽

Post Exposure Prophylaxis ◽

Postexposure Prophylaxis ◽

Post Hoc Analysis ◽

Exposure Prophylaxis ◽

Post Hoc ◽

Rule Out

Since the beginning of the COVID-19 pandemic, the use of hydroxychloroquine (HCQ) has been surrounded by considerable controversy, both scientific and non-scientific. This has continued with the publication of two trials of HCQ for post-exposure prophylaxis of the infection, which concluded that HCQ is not efficacious in preventing SARS-CoV-2 infection, and whose results are influencing public health decisions. We have carried out a comprehensive post-hoc analysis of the statistical power of the two trials, which shows that their power to detect an effect of HCQ in preventing COVID-19 is low, not only for their observed effect size, but also for other clinically important levels of efficacy, and therefore both studies are inconclusive.
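
To make the notion of statistical power concrete, the following Python sketch computes the approximate power of a two-sided z-test comparing two proportions under individual randomization. It is a textbook normal approximation, not the authors' analysis; the abstract gives no event rates or sample sizes, so every number passed to the function is a hypothetical placeholder, and the function name is ours.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test for two proportions.

    Uses the normal approximation with equal arm sizes and ignores the
    (usually negligible) probability of rejecting in the wrong direction.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p_bar = (p_control + p_treated) / 2
    # Standard error under the null (pooled) and under the alternative.
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    se_alt = sqrt(p_control * (1 - p_control) / n_per_arm
                  + p_treated * (1 - p_treated) / n_per_arm)
    effect = abs(p_control - p_treated)
    return std_normal.cdf((effect - z_crit * se_null) / se_alt)


# HYPOTHETICAL inputs: 15% infections in controls, 10% under prophylaxis.
for n in (200, 400, 1600):
    print(n, round(power_two_proportions(0.15, 0.10, n), 3))
```

Evaluating the function over a range of plausible effect sizes, rather than only the observed one, is the kind of exercise the abstract describes. Designs with cluster randomization would need an additional adjustment that this sketch omits.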

Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification

F1000Research ◽

10.12688/f1000research.15398.2 ◽

2018 ◽

Vol 7 ◽

pp. 952 ◽

Cited By ~ 2

Author(s):

Michael I. Love ◽

Charlotte Soneson ◽

Rob Patro

Keyword(s):

Software Package ◽

Gene Expression Analysis ◽

Real Data ◽

Transcript Level ◽

Bioinformatic Analysis ◽

Rna Seq ◽

Statistical Framework ◽

Gene Level ◽

Show Evidence ◽

Differential Gene

Detection of differential transcript usage (DTU) from RNA-seq data is an important bioinformatic analysis that complements differential gene expression analysis. Here we present a simple workflow using a set of existing R/Bioconductor packages for analysis of DTU. We show how these packages can be used downstream of RNA-seq quantification using the Salmon software package. The entire pipeline is fast, benefiting from inference steps by Salmon to quantify expression at the transcript level. The workflow includes live, runnable code chunks for analysis using DRIMSeq and DEXSeq, as well as for performing two-stage testing of DTU using the stageR package, a statistical framework to screen at the gene level and then confirm which transcripts within the significant genes show evidence of DTU. We evaluate these packages and other related packages on a simulated dataset with parameters estimated from real data.
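
The screen-then-confirm logic described in this abstract can be sketched in a few lines of Python. This is a simplified illustration and not the stageR implementation: it assumes Benjamini-Hochberg control at the gene-level screening stage and a Holm correction within each selected gene at a level scaled by the fraction of genes that passed screening. The function names and example p-values are hypothetical.

```python
def bh_reject(pvals, alpha):
    """Benjamini-Hochberg: indices rejected at FDR level alpha."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    m = len(pvals)
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= alpha * rank / m:
            cutoff = rank  # keep the largest rank that passes
    return set(order[:cutoff])


def holm_reject(pvals, alpha):
    """Holm step-down: indices rejected at familywise level alpha."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    m = len(pvals)
    rejected = set()
    for step, idx in enumerate(order):
        if pvals[idx] > alpha / (m - step):
            break
        rejected.add(idx)
    return rejected


def two_stage(gene_p, tx_p, alpha=0.05):
    """gene_p: {gene: p}; tx_p: {gene: {transcript: p}}."""
    genes = list(gene_p)
    # Stage 1: screen genes.
    hits = bh_reject([gene_p[g] for g in genes], alpha)
    sig_genes = [genes[i] for i in sorted(hits)]
    # ASSUMPTION: confirmation level shrinks with the share of screened genes.
    alpha_confirm = alpha * len(sig_genes) / len(genes) if genes else 0.0
    # Stage 2: confirm transcripts only within the screened genes.
    confirmed = {}
    for g in sig_genes:
        txs = list(tx_p[g])
        rej = holm_reject([tx_p[g][t] for t in txs], alpha_confirm)
        confirmed[g] = [txs[i] for i in sorted(rej)]
    return sig_genes, confirmed


gene_p = {"geneA": 0.0001, "geneB": 0.4, "geneC": 0.8, "geneD": 0.03}
tx_p = {"geneA": {"A.1": 0.001, "A.2": 0.2, "A.3": 0.004},
        "geneB": {"B.1": 0.3, "B.2": 0.5},
        "geneC": {"C.1": 0.9, "C.2": 0.7},
        "geneD": {"D.1": 0.02, "D.2": 0.6}}
print(two_stage(gene_p, tx_p))
```

Restricting the transcript-level tests to genes that passed the screen keeps the second stage small, which is where the gain in power of a stage-wise design comes from.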

Differential transcript usage analysis of bulk and single-cell RNA-seq data with DTUrtle

Bioinformatics ◽

10.1093/bioinformatics/btab629 ◽

2021 ◽

Author(s):

Tobias Tekath ◽

Martin Dugas

Keyword(s):

Single Cell ◽

Transcript Level ◽

R Package ◽

Supplementary Information ◽

Data Sets ◽

Rna Seq ◽

Cell Type ◽

Gene Level ◽

Analysis Workflow ◽

Usage Analysis

Motivation: Each year, the number of published bulk and single-cell RNA-seq data sets is growing exponentially. Studies analyzing such data are commonly looking at gene-level differences, while the collected RNA-seq data inherently represents reads of transcript isoform sequences. Utilizing transcriptomic quantifiers, RNA-seq reads can be attributed to specific isoforms, allowing for analysis of transcript-level differences. A differential transcript usage (DTU) analysis is testing for proportional differences in a gene’s transcript composition, and has been of rising interest for many research questions, such as analysis of differential splicing or cell type identification.

Results: We present the R package DTUrtle, the first DTU analysis workflow for both bulk and single-cell RNA-seq data sets, and the first package to conduct a ‘classical’ DTU analysis in a single-cell context. DTUrtle extends established statistical frameworks, offers various result aggregation and visualization options and a novel detection probability score for tagged-end data. It has been successfully applied to bulk and single-cell RNA-seq data of human and mouse, confirming and extending key results. Additionally, we present novel potential DTU applications like the identification of cell type specific transcript isoforms as biomarkers.

Availability: The R package DTUrtle is available at https://github.com/TobiTekath/DTUrtle with extensive vignettes and documentation at https://tobitekath.github.io/DTUrtle/.

Supplementary information: Supplementary data are available at Bioinformatics online.
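
The phrase "proportional differences in a gene's transcript composition" can be made concrete with a small Python sketch. It only computes per-gene transcript proportions and their change between two conditions; it performs no statistical test and is unrelated to the DTUrtle code base. The function names and counts are hypothetical.

```python
def transcript_proportions(counts):
    """counts: {transcript: count} for one gene in one condition."""
    total = sum(counts.values())
    if total == 0:
        # No reads for this gene: proportions are undefined, report zeros.
        return {tx: 0.0 for tx in counts}
    return {tx: c / total for tx, c in counts.items()}


def usage_shift(counts_a, counts_b):
    """Change in each transcript's share of its gene, condition B minus A."""
    prop_a = transcript_proportions(counts_a)
    prop_b = transcript_proportions(counts_b)
    return {tx: prop_b[tx] - prop_a[tx] for tx in prop_a}


# Hypothetical gene with two isoforms.
condition_a = {"tx1": 80, "tx2": 20}
condition_b = {"tx1": 100, "tx2": 100}
print(usage_shift(condition_a, condition_b))
```

Here the gene's total count doubles, which a gene-level analysis would pick up, but the DTU signal is the separate fact that `tx1` drops from 80% to 50% of the gene's output.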

Faculty Opinions recommendation of Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726181976.793513319 ◽

2016 ◽

Author(s):

Wolfgang Huber

Keyword(s):

Transcript Level ◽

Rna Seq ◽

Gene Level

FuSe: a tool to move RNA-Seq analyses from chromosomal/gene loci to functional grouping of mRNA transcripts

Bioinformatics ◽

10.1093/bioinformatics/btaa735 ◽

2020 ◽

Author(s):

Rajinder Gupta ◽

Yannick Schrooders ◽

Marcha Verheijen ◽

Adrian Roth ◽

Jos Kleinjans ◽

...

Keyword(s):

Transcript Level ◽

Supplementary Information ◽

Rna Seq ◽

Functional Changes ◽

Gene Level ◽

Secondary Structure Of Proteins ◽

The Impact ◽

Structure Of Proteins ◽

Functional Grouping

Summary: Typical RNA sequencing (RNA-Seq) analyses are performed either at the gene level by summing all reads from the same locus, assuming that all transcripts from a gene make a protein, or at the transcript level, assuming that each transcript displays unique function. However, these assumptions are flawed, as a gene can code for different types of transcripts and different transcripts are capable of synthesizing similar, different or no protein. As a consequence, functional changes are not well illustrated by either gene or transcript analyses. We propose to improve RNA-Seq analyses by grouping the transcripts based on their similar functions. We developed FuSe to predict functional similarities using the primary and secondary structure of proteins. To estimate the likelihood of proteins with similar functions, FuSe computes two confidence scores: knowledge (KS) and discovery (DS) for protein pairs. Overlapping protein pairs exhibiting high confidence are grouped to form ‘similar function protein groups’ and expression is calculated for each functional group. The impact of using FuSe is demonstrated on in vitro cells exposed to paracetamol, highlighting genes responsible for cell adhesion and glycogen regulation that were earlier shown not to be differentially expressed with traditional analysis methods.

Availability and implementation: The source code is available at https://github.com/rajinder4489/FuSe. Data for APAP exposure are available in the BioStudies database (http://www.ebi.ac.uk/biostudies) under accession numbers S-HECA143, S-HECA(158) and S-HECA139.

Supplementary information: Supplementary data are available at Bioinformatics online.
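
The grouping step, in which overlapping high-confidence protein pairs are merged into groups, can be sketched with a union-find structure. This is an illustration under stated assumptions rather than the FuSe implementation: the abstract does not say how the KS and DS scores are combined or thresholded, so the single `score` per pair and the `threshold` value below are hypothetical.

```python
def group_proteins(pairs, threshold):
    """Merge protein pairs that share a member into groups.

    pairs: iterable of (protein_a, protein_b, score).
    Only pairs with score >= threshold are used (ASSUMED rule).
    Returns a sorted list of sorted protein-name lists.
    """
    parent = {}

    def find(x):
        # Find the group representative, compressing the path on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in pairs:
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for protein in parent:
        groups.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical protein pairs with a single combined confidence score.
example_pairs = [("P1", "P2", 0.9), ("P2", "P3", 0.8),
                 ("P4", "P5", 0.95), ("P5", "P6", 0.2)]
print(group_proteins(example_pairs, threshold=0.7))
```

Pairs `P1`-`P2` and `P2`-`P3` overlap in `P2` and so end up in one group, while the low-scoring `P5`-`P6` pair is ignored and `P6` joins no group.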

Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences

F1000Research ◽

10.12688/f1000research.7563.2 ◽

2016 ◽

Vol 4 ◽

pp. 1521 ◽

Cited By ~ 268

Author(s):

Charlotte Soneson ◽

Michael I. Love ◽

Mark D. Robinson

Keyword(s):

Statistical Inference ◽

High Throughput Sequencing ◽

Real Data ◽

Transcript Level ◽

R Package ◽

Data Sets ◽

Rna Seq ◽

Abundance Estimates ◽

Gene Level ◽

Genomic Regions

High-throughput sequencing of cDNA (RNA-seq) is used extensively to characterize the transcriptome of cells. Many transcriptomic studies aim at comparing either abundance levels or the transcriptome composition between given conditions, and as a first step, the sequencing reads must be used as the basis for abundance quantification of transcriptomic features of interest, such as genes or transcripts. Various quantification approaches have been proposed, ranging from simple counting of reads that overlap given genomic regions to more complex estimation of underlying transcript abundances. In this paper, we show that gene-level abundance estimates and statistical inference offer advantages over transcript-level analyses, in terms of performance and interpretability. We also illustrate that the presence of differential isoform usage can lead to inflated false discovery rates in differential gene expression analyses on simple count matrices but that this can be addressed by incorporating offsets derived from transcript-level abundance estimates. We also show that the problem is relatively minor in several real data sets. Finally, we provide an R package (tximport) to help users integrate transcript-level abundance estimates from common quantification pipelines into count-based statistical inference engines.
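
The idea of carrying transcript-level estimates up to the gene level can be illustrated with a short Python sketch. It sums estimated counts per gene and computes an abundance-weighted average transcript length, the kind of quantity from which a length offset could be derived. This is a simplified illustration and not the tximport code; the function name, argument layout, and example values are hypothetical.

```python
from collections import defaultdict


def summarize_to_gene(tx_counts, tx_abundance, tx_lengths, tx2gene):
    """Aggregate transcript-level estimates to gene level for one sample.

    Returns (gene_counts, gene_avg_length), where the average length is
    weighted by transcript abundance (e.g. TPM). Genes with zero total
    abundance get an average length of None.
    """
    gene_counts = defaultdict(float)
    gene_abundance = defaultdict(float)
    weighted_length = defaultdict(float)
    for tx, count in tx_counts.items():
        gene = tx2gene[tx]
        gene_counts[gene] += count
        gene_abundance[gene] += tx_abundance[tx]
        weighted_length[gene] += tx_abundance[tx] * tx_lengths[tx]
    gene_avg_length = {
        g: (weighted_length[g] / gene_abundance[g]) if gene_abundance[g] > 0 else None
        for g in gene_counts
    }
    return dict(gene_counts), gene_avg_length


# Hypothetical gene with a short, dominant isoform and a long, minor one.
tx_counts = {"tx1": 30.0, "tx2": 10.0}
tx_abundance = {"tx1": 3.0, "tx2": 1.0}
tx_lengths = {"tx1": 1000, "tx2": 2000}
tx2gene = {"tx1": "geneX", "tx2": "geneX"}
print(summarize_to_gene(tx_counts, tx_abundance, tx_lengths, tx2gene))
```

If isoform usage shifts between conditions, this average length changes even when the gene's overall expression does not. That is why a per-sample, per-gene length term can matter for count-based gene-level tests.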

Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification

F1000Research ◽

10.12688/f1000research.15398.3 ◽

2018 ◽

Vol 7 ◽

pp. 952 ◽

Cited By ~ 8

Author(s):

Michael I. Love ◽

Charlotte Soneson ◽

Rob Patro

Keyword(s):

Software Package ◽

Gene Expression Analysis ◽

Real Data ◽

Transcript Level ◽

Bioinformatic Analysis ◽

Rna Seq ◽

Statistical Framework ◽

Gene Level ◽

Show Evidence ◽

Differential Gene

Detection of differential transcript usage (DTU) from RNA-seq data is an important bioinformatic analysis that complements differential gene expression analysis. Here we present a simple workflow using a set of existing R/Bioconductor packages for analysis of DTU. We show how these packages can be used downstream of RNA-seq quantification using the Salmon software package. The entire pipeline is fast, benefiting from inference steps by Salmon to quantify expression at the transcript level. The workflow includes live, runnable code chunks for analysis using DRIMSeq and DEXSeq, as well as for performing two-stage testing of DTU using the stageR package, a statistical framework to screen at the gene level and then confirm which transcripts within the significant genes show evidence of DTU. We evaluate these packages and other related packages on a simulated dataset with parameters estimated from real data.
