BiSEK: a platform for a reliable differential expression analysis

ABSTRACT Differential Expression Analysis (DEA) of RNA-sequencing data is frequently performed to detect key genes affected across different conditions. Although DEA workflows are well established, prior reliability testing of the input material, which is crucial for consistent and strong results, is challenging and less straightforward. Here we present the Biological Sequence Expression Kit (BiSEK), a graphical user interface-based platform for DEA dedicated to reliable inquiry. BiSEK is based on a novel algorithm that tracks discrepancies between the data and the statistical model design. Moreover, BiSEK enables differential expression analysis of groups of genes to identify affected pathways without relying on the significance of the genes comprising them. Using BiSEK, we were able to improve a previously conducted analysis aimed at detecting genes affected by FUBP1 depletion in chronic myeloid leukemia cells from mouse bone marrow. We found affected genes that are related to the regulation of apoptosis, supporting in vivo experimental findings. We further tested the host response following SARS-CoV-2 infection. We identified a substantial interferon-I reaction and low expression levels of TLR3, an inducer of interferon-III (IFN-III) production, upon infection with SARS-CoV-2 compared to other respiratory viruses. This finding may explain the low IFN-III response upon SARS-CoV-2 infection. BiSEK is open source and available as a web interface.

Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data

Nucleic Acids Research ◽

10.1093/nar/gkx754 ◽

2017 ◽

Vol 45 (19) ◽

pp. 10978-10988 ◽

Cited By ~ 26

Author(s):

Cheng Jia ◽

Yu Hu ◽

Derek Kelly ◽

Junhyong Kim ◽

Mingyao Li ◽

...

Keyword(s):

Single Cell ◽

Differential Expression ◽

RNA Sequencing ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Sequencing Data ◽

Technical Noise ◽

Single Cell RNA Sequencing

The long and the short of it: unlocking nanopore long-read RNA sequencing data with short-read differential expression analysis tools

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab028 ◽

2021 ◽

Vol 3 (2) ◽

Author(s):

Xueyi Dong ◽

Luyi Tian ◽

Quentin Gouil ◽

Hasaru Kariyawasam ◽

Shian Su ◽

...

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Transcriptomic Analysis ◽

Statistical Testing ◽

RNA Seq ◽

Sequencing Data ◽

Short Read ◽

Sequencing Platform ◽

Long Read

Abstract Application of Oxford Nanopore Technologies’ long-read sequencing platform to transcriptomic analysis is increasing in popularity. However, such analysis can be challenging due to the high sequence error and small library sizes, which decrease quantification accuracy and reduce power for statistical testing. Here, we report the analysis of two nanopore RNA-seq datasets with the goal of obtaining gene- and isoform-level differential expression information. A dataset of synthetic, spliced, spike-in RNAs (‘sequins’) and a mouse neural stem cell dataset from samples with a null mutation of the epigenetic regulator Smchd1 were analysed using a mix of long-read specific tools for preprocessing together with established short-read RNA-seq methods for downstream analysis. We used limma-voom to perform differential gene expression analysis, and the novel FLAMES pipeline to perform isoform identification and quantification, followed by DRIMSeq and limma-diffSplice (with stageR) to perform differential transcript usage analysis. We compared results from the sequins dataset to the ground truth, and results of the mouse dataset to a previous short-read study on equivalent samples. Overall, our work shows that transcriptomic analysis of long-read nanopore data using long-read specific preprocessing methods together with short-read differential expression methods and software that are already in wide use can yield meaningful results.

Powerful differential expression analysis incorporating network topology for next-generation sequencing data

Bioinformatics ◽

10.1093/bioinformatics/btw833 ◽

2017 ◽

Vol 33 (10) ◽

pp. 1505-1513 ◽

Cited By ~ 10

Author(s):

Malathi S.I Dona ◽

Luke A Prendergast ◽

Suresh Mathivanan ◽

Shivakumar Keerthikumar ◽

Agus Salim

Keyword(s):

Next Generation Sequencing ◽

Differential Expression ◽

Expression Analysis ◽

Network Topology ◽

Differential Expression Analysis ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Generation Sequencing

Impulse model-based differential expression analysis of time course sequencing data

Nucleic Acids Research ◽

10.1093/nar/gky675 ◽

2018 ◽

Cited By ~ 6

Author(s):

David S Fischer ◽

Fabian J Theis ◽

Nir Yosef

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Time Course ◽

Differential Expression Analysis ◽

Sequencing Data ◽

Model Based ◽

Course Sequencing ◽

Impulse Model

Comprehensive processing of high-throughput small RNA sequencing data including quality checking, normalization, and differential expression analysis using the UEA sRNA Workbench

RNA ◽

10.1261/rna.059360.116 ◽

2017 ◽

Vol 23 (6) ◽

pp. 823-835 ◽

Cited By ~ 21

Author(s):

Matthew Beckers ◽

Irina Mohorianu ◽

Matthew Stocks ◽

Christopher Applegate ◽

Tamas Dalmay ◽

...

Keyword(s):

Differential Expression ◽

RNA Sequencing ◽

Expression Analysis ◽

High Throughput ◽

Small RNA ◽

Differential Expression Analysis ◽

Small RNA Sequencing ◽

Sequencing Data ◽

Comprehensive Processing

A Comparison of Methods: Normalizing High-Throughput RNA Sequencing Data

10.1101/026062 ◽

2015 ◽

Cited By ~ 2

Author(s):

Rahul Reddy

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

High Throughput ◽

High Throughput Sequencing ◽

Differential Expression Analysis ◽

Simulated Data ◽

Sequencing Data ◽

Technical Variability ◽

Expression Studies ◽

Normalization Methods

As RNA-Seq and other high-throughput sequencing methods grow in use and remain critical for gene expression studies, technical variability in count data impedes differential expression studies, comparison of data across samples and experiments, and reproduction of results. Studies like Dillies et al. (2013) compare several between-lane normalization methods involving scaling factors, while Hansen et al. (2012) and Risso et al. (2014) propose methods that correct for sample-specific bias or use sets of control genes to isolate and remove technical variability. This paper evaluates four normalization methods in terms of reducing intra-group, technical variability and facilitating differential expression analysis or other research where the biological, inter-group variability is of interest. To this end, the four methods were evaluated in differential expression analysis between data from Pickrell et al. (2010) and Montgomery et al. (2010) and between simulated data modeled on these two datasets. Though the between-lane scaling factor methods perform worse on real data sets, they are much stronger for simulated data. We cannot reject the recommendation of Dillies et al. to use TMM and DESeq normalization, but further study of power to detect effects of different size under each normalization method is merited.
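
To make the idea of a between-lane scaling factor concrete, here is a minimal Python sketch. It assumes the median-of-ratios scheme commonly associated with DESeq normalization; it is not taken from the paper above, and the function names and toy counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios scaling factors, one per sample.

    counts: list of gene rows, each row a list of per-sample read counts.
    Assumption: genes with a zero in any sample are left out of the estimate.
    """
    n_samples = len(counts[0])
    # Keep genes observed in every sample so that all logarithms are finite.
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    if not log_rows:
        raise ValueError("no gene has a nonzero count in every sample")
    # Subtract each gene's log geometric mean (the pseudo-reference sample).
    log_ratios = [[x - sum(row) / n_samples for x in row] for row in log_rows]
    # The factor for a sample is the median ratio over the retained genes.
    return [math.exp(median(r[j] for r in log_ratios))
            for j in range(n_samples)]


def normalize(counts):
    """Divide every count by the scaling factor of its sample."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Hypothetical toy data: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 7]]
print(size_factors(toy_counts))
print(normalize(toy_counts))
```

Using the median rather than the mean keeps a handful of strongly differentially expressed genes from dominating the factor. That robustness is the usual motivation for this family of methods.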

DEApp: an interactive web interface for differential expression analysis of next generation sequence data

Source Code for Biology and Medicine ◽

10.1186/s13029-017-0063-4 ◽

2017 ◽

Vol 12 (1) ◽

Cited By ~ 29

Author(s):

Yan Li ◽

Jorge Andrade

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Sequence Data ◽

Differential Expression Analysis ◽

Next Generation ◽

Web Interface

An effective differential expression analysis of deep-sequencing data based on the Poisson log-normal model

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720015500018 ◽

2015 ◽

Vol 13 (02) ◽

pp. 1550001 ◽

Cited By ~ 1

Author(s):

Jun Wu ◽

Xiaodong Zhao ◽

Zongli Lin ◽

Zhifeng Shao

Keyword(s):

Parameter Estimation ◽

Differential Expression ◽

Expression Analysis ◽

Deep Sequencing ◽

Differential Expression Analysis ◽

Biomedical Science ◽

Sequencing Data ◽

Discrimination Ability ◽

Deep Sequencing Data ◽

Log Normal

The tremendous amount of deep-sequencing data, in the form of digital sequence reads, has improved our understanding of biomedical science to an unprecedented degree. To mine useful information from such data, a proper distribution for modeling the full range of the count data and accurate parameter estimation are required. In this paper, we propose a method, called "DEPln," for differential expression analysis based on the Poisson log-normal (PLN) distribution with an accurate parameter estimation strategy, which aims to overcome the inconvenience in the mathematical analysis of the traditional PLN distribution. The performance of our proposed method is validated on both synthetic and real data. Experimental results indicate that our method outperforms the traditional methods in terms of discrimination ability and achieves a good tradeoff between recall and precision. Thus, our work provides a new approach for gene expression analysis and has strong potential in deep-sequencing based research.
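
For readers unfamiliar with the PLN distribution, the standard textbook definition is given below. It is not reproduced from the paper, and the symbols y, λ, μ and σ are chosen here for illustration. A count Y is Poisson with a rate λ that is itself log-normally distributed, so the probability mass function is an integral with no closed form. This is presumably the analytical inconvenience the abstract refers to.

```latex
P(Y = y \mid \mu, \sigma)
  = \int_{0}^{\infty}
      \frac{e^{-\lambda}\,\lambda^{y}}{y!}\;
      \frac{1}{\lambda\,\sigma\sqrt{2\pi}}
      \exp\!\left(-\frac{(\ln\lambda - \mu)^{2}}{2\sigma^{2}}\right)
    \,\mathrm{d}\lambda ,
\qquad y = 0, 1, 2, \dots

\mathbb{E}[Y] = e^{\mu + \sigma^{2}/2},
\qquad
\operatorname{Var}[Y] = \mathbb{E}[Y] + \left(e^{\sigma^{2}} - 1\right)\mathbb{E}[Y]^{2}
```

The variance exceeds the mean whenever σ > 0, which is why the PLN can accommodate the overdispersion typical of sequencing counts.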

Differential gene expression analysis tools exhibit substandard performance for long non-coding RNA–sequencing data

10.1101/220129 ◽

2017 ◽

Cited By ~ 2

Author(s):

Alemu Takele Assefa ◽

Katrijn De Paepe ◽

Celine Everaert ◽

Pieter Mestdagh ◽

Olivier Thas ◽

...

Keyword(s):

Gene Expression ◽

Differential Expression ◽

Expression Analysis ◽

Web Application ◽

Empirical Bayes ◽

Performance Metrics ◽

Differential Expression Analysis ◽

RNA Seq ◽

Sequencing Data ◽

Normalization Methods

ABSTRACT Background: Protein-coding RNAs (mRNA) have been the primary target of most transcriptome studies in the past, but in recent years, attention has expanded to include long non-coding RNAs (lncRNA). lncRNAs are typically expressed at low levels, and are inherently highly variable. This is a fundamental challenge for differential expression (DE) analysis. In this study, the performance of 14 popular tools for testing DE in RNA-seq data along with their normalization methods is comprehensively evaluated, with a particular focus on lncRNAs and low-abundance mRNAs.

Results: Thirteen performance metrics were used to evaluate DE tools and normalization methods using simulations and analyses of six diverse RNA-seq datasets. Non-parametric procedures are used to simulate gene expression data in such a way that realistic levels of expression and variability are preserved in the simulated data. Throughout the assessment, we kept track of the results for mRNA and lncRNA separately. All statistical models exhibited inferior performance for lncRNAs compared to mRNAs across all simulated scenarios and analysis of benchmark RNA-seq datasets. No single tool uniformly outperformed the others.

Conclusion: Overall, the linear modeling with empirical Bayes moderation (limma) and the nonparametric approach (SAMSeq) showed the best performance: good control of the false discovery rate (FDR) and reasonable sensitivity. However, for achieving a sensitivity of at least 50%, more than 80 samples are required when studying expression levels in realistic clinical settings such as cancer research. About half of the methods showed a severe excess of false discoveries, making these methods unreliable for differential expression analysis and jeopardizing reproducible science. The detailed results of our study can be consulted through a user-friendly web application, http://statapps.ugent.be/tools/AppDGE/
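
Two of the metrics named above, the false discovery rate and sensitivity, can be illustrated with a short Python sketch. It assumes a simulated dataset in which the truly differentially expressed genes are known; the function name and gene labels are hypothetical and do not come from the study. The value computed for one dataset is the false discovery proportion, and the FDR is its average over repeated simulations.

```python
def fdp_and_sensitivity(called, truly_de):
    """Score one simulated dataset.

    called:   genes a DE tool reported as significant.
    truly_de: genes simulated as differentially expressed (the ground truth).
    Returns (false discovery proportion, sensitivity).
    """
    called, truly_de = set(called), set(truly_de)
    true_pos = len(called & truly_de)
    false_pos = len(called - truly_de)
    # Convention assumed here: with no discoveries, the proportion is 0.
    fdp = false_pos / len(called) if called else 0.0
    # Sensitivity is undefined when no gene is truly DE.
    sensitivity = true_pos / len(truly_de) if truly_de else float("nan")
    return fdp, sensitivity


# Hypothetical example: 4 genes called, 3 of them among 6 truly DE genes.
fdp, sens = fdp_and_sensitivity(
    called=["g1", "g2", "g3", "g7"],
    truly_de=["g1", "g2", "g3", "g4", "g5", "g6"],
)
print(fdp, sens)
```

A tool controls the FDR at a nominal level such as 5% when the average of this proportion over many simulated datasets stays at or below that level.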

Differential expression analysis methods for ribonucleic acid-sequencing data

OA Bioinformatics ◽

10.13172/2054-1899-1-1-678 ◽

2013 ◽

Vol 1 (1) ◽

Cited By ~ 2

Author(s):

AM Eteleeb ◽

EC Rouchka

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Ribonucleic Acid ◽

Differential Expression Analysis ◽

Sequencing Data ◽

Analysis Methods
