A tail-based test to detect differential expression in RNA-sequencing data

2020 ◽  
pp. 096228022095190
Author(s):  
Jiong Chen ◽  
Xinlei Mi ◽  
Jing Ning ◽  
Xuming He ◽  
Jianhua Hu

RNA sequencing data have been abundantly generated in biomedical research for biomarker discovery and other studies. Such data at the exon level are usually heavily tailed and correlated. Conventional statistical tests based on the mean or median difference for differential expression likely suffer from low power when the between-group difference occurs mostly in the upper or lower tail of the distribution of gene expression. We propose a tail-based test to make comparisons between groups in terms of a specific distribution area rather than a single location. The proposed test, which is derived from quantile regression, adjusts for covariates and accounts for within-sample dependence among the exons through a specified correlation structure. Through Monte Carlo simulation studies, we show that the proposed test is generally more powerful and robust in detecting differential expression than commonly used tests based on the mean or a single quantile. An application to TCGA lung adenocarcinoma data demonstrates the promise of the proposed method in terms of biomarker discovery.
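To make the idea of comparing groups over a distribution area rather than a single location concrete, here is a minimal Python sketch. It is not the authors' quantile-regression test: it ignores covariates and the within-sample correlation among exons, and it swaps in a simple permutation test on the average quantile difference across several upper-tail levels. The function names, the quantile levels, and the permutation scheme are all assumptions made for illustration.

```python
import random
import statistics


def empirical_quantile(values, tau):
    """Empirical quantile with linear interpolation, for 0 <= tau <= 1."""
    xs = sorted(values)
    pos = tau * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def tail_statistic(group_a, group_b, taus):
    """Mean difference (B minus A) in quantiles over the chosen tail levels."""
    return statistics.fmean(
        empirical_quantile(group_b, t) - empirical_quantile(group_a, t)
        for t in taus
    )


def tail_permutation_test(group_a, group_b,
                          taus=(0.75, 0.80, 0.85, 0.90, 0.95),
                          n_perm=2000, seed=0):
    """Return (observed statistic, two-sided permutation p-value).

    Illustrative stand-in only: group labels are shuffled, so covariates
    and exon-level dependence are NOT accounted for.
    """
    rng = random.Random(seed)
    observed = tail_statistic(group_a, group_b, taus)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = tail_statistic(pooled[:n_a], pooled[n_a:], taus)
        if abs(stat) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical expression values: the groups differ only in the upper tail.
    control = [float(v) for v in range(40)]
    treated = [float(v) for v in range(30)] + [float(v) for v in range(130, 140)]
    print(tail_permutation_test(control, treated, n_perm=500))
```

Averaging over several quantile levels is one way to target a region of the distribution; a test at a single quantile can miss a shift that falls between the levels examined.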

2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Yang Wang ◽  
Chengjian Han ◽  
Rongsheng Zhou ◽  
Jinjin Zhu ◽  
Famin Zhang ◽  
...  

Background: The predominant genotype of Toxoplasma in China is the Chinese 1 (ToxoDB#9) lineage. TgCtwh3 and TgCtwh6 are two representative strains of Chinese 1, exhibiting high and low virulence to mice, respectively. Little is known regarding the virulence mechanism of this non-classical genotype. Our previous RNA sequencing data revealed differential mRNA levels of TgMIC1 in TgCtwh3 and TgCtwh6. We aim to further confirm the differential expression of TgMIC1 and its significance in this atypical genotype. Methods: Quantitative real-time PCR was used to verify the RNA sequencing data; then, polyclonal antibodies against TgMIC1 were prepared and identified. Moreover, the invasion and proliferation of the parasite in HFF cells were observed with or without TgMIC1 polyclonal antibody treatment. Results: The data showed that the protein level of TgMIC1 was significantly higher in the high-virulence strain TgCtwh3 than in the low-virulence strain TgCtwh6 and that the invasion and proliferation of TgCtwh3 were inhibited by TgMIC1 polyclonal antibody. Conclusion: Differential expression of TgMIC1 in TgCtwh3 and TgCtwh6 may explain, at least partly, the virulence mechanism of this atypical genotype.


2018 ◽  
Author(s):  
Fatemeh Gholizadeh ◽  
Zahra Salehi ◽  
Ali Mohammad Banaei-Moghaddam ◽  
Abbas Rahimi Foroushani ◽  
Kaveh Kavousi

With the advent of Next Generation Sequencing technologies, RNA-seq has become a preferred approach for gene expression profiling. In particular, time course RNA-seq differential expression analysis has been used in many studies to identify candidate genes. However, efficiently identifying differentially expressed genes (DEGs) in time course studies is statistically challenging because of inherent characteristics of such data, including correlation and dependencies over time. Here we compare EBSeq-HMM, a hidden Markov-based model, with multiDE, a log-linear-based model, on a real time course RNA sequencing dataset. For the comparison, the DEGs detected in common by edgeR, DESeq2 and Voom (referred to as Benchmark DEGs) were used as the reference. Each of the two models was evaluated under different normalization methods. The findings revealed that multiDE identified more Benchmark DEGs and showed higher agreement with them than EBSeq-HMM. Furthermore, multiDE and EBSeq-HMM performed best with the TMM and Upper-Quartile normalization methods, respectively.
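The benchmark comparison described in this abstract can be sketched in a few lines of Python. The sketch assumes each method's output has already been reduced to a set of gene identifiers; the gene names are made up, and the abstract does not say how agreement was measured, so the recall and Jaccard index used here are assumptions for illustration.

```python
def benchmark_degs(*deg_sets):
    """Genes called differentially expressed by every reference method."""
    return set.intersection(*(set(s) for s in deg_sets))


def overlap_summary(method_degs, benchmark):
    """Summarise how well one method's DEG calls match the benchmark set."""
    method_degs = set(method_degs)
    shared = method_degs & benchmark
    union = method_degs | benchmark
    return {
        "recovered": len(shared),  # benchmark genes the method found
        "recall": len(shared) / len(benchmark) if benchmark else 0.0,
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical DEG calls from the three reference tools.
    edger = {"g1", "g2", "g3", "g4"}
    deseq2 = {"g2", "g3", "g4", "g5"}
    voom = {"g2", "g3", "g4", "g6"}
    benchmark = benchmark_degs(edger, deseq2, voom)

    # Hypothetical calls from the two time-course methods being compared.
    for name, calls in {"method_a": {"g2", "g3", "g7"},
                        "method_b": {"g2", "g3", "g4"}}.items():
        print(name, overlap_summary(calls, benchmark))
```

Taking the intersection of the reference tools gives a conservative benchmark, so a method can look worse than it is if it finds true DEGs that one of the reference tools misses.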


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 1408 ◽  
Author(s):  
Charity W. Law ◽  
Monther Alhamdoosh ◽  
Shian Su ◽  
Gordon K. Smyth ◽  
Matthew E. Ritchie

The ability to easily and efficiently analyse RNA-sequencing data is a key strength of the Bioconductor project. Starting with counts summarised at the gene-level, a typical analysis involves pre-processing, exploratory data analysis, differential expression testing and pathway analysis with the results obtained informing future experiments and validation studies. In this workflow article, we analyse RNA-sequencing data from the mouse mammary gland, demonstrating use of the popular edgeR package to import, organise, filter and normalise the data, followed by the limma package with its voom method, linear modelling and empirical Bayes moderation to assess differential expression and perform gene set testing. This pipeline is further enhanced by the Glimma package which enables interactive exploration of the results so that individual samples and genes can be examined by the user. The complete analysis offered by these three packages highlights the ease with which researchers can turn the raw counts from an RNA-sequencing experiment into biological insights using Bioconductor.


F1000Research ◽  
2018 ◽  
Vol 5 ◽  
pp. 1408 ◽  
Author(s):  
Charity W. Law ◽  
Monther Alhamdoosh ◽  
Shian Su ◽  
Xueyi Dong ◽  
Luyi Tian ◽  
...  


2014 ◽  
Vol 42 (11) ◽  
pp. e91-e91 ◽  
Author(s):  
Xiaobei Zhou ◽  
Helen Lindsay ◽  
Mark D. Robinson

2021 ◽  
Author(s):  
Yang Wang ◽  
Chengjian Han ◽  
Rongsheng Zhou ◽  
Jinjin Zhu ◽  
Famin Zhang ◽  
...  

The predominant genotype of T. gondii in China is the Chinese 1 (ToxoDB#9) lineage. TgCtwh3 and TgCtwh6 are two representative strains of Chinese 1, exhibiting high and low virulence to mice, respectively. Little is known about the virulence mechanism of this non-classical genotype. Our previous RNA sequencing data revealed differential mRNA levels of TgMIC1 in TgCtwh3 and TgCtwh6. To further confirm the differential expression of TgMIC1 and its significance in this atypical genotype, quantitative real-time PCR was first used to verify the RNA sequencing data, and polyclonal antibodies against TgMIC1 were then prepared and identified. Moreover, the invasion and proliferation of the parasite in HFF cells were observed with or without TgMIC1 polyclonal antibody treatment. The data showed that the protein level of TgMIC1 was significantly higher in the high-virulence strain TgCtwh3 than in the low-virulence strain TgCtwh6, and that the invasion and proliferation of TgCtwh3 were inhibited by TgMIC1 polyclonal antibody. Differential expression of TgMIC1 in TgCtwh3 and TgCtwh6 may explain, at least partly, the virulence mechanism of this atypical genotype.


2021 ◽  
Author(s):  
Gerard A. Bouland ◽  
Ahmed Mahfouz ◽  
Marcel J.T. Reinders

Single-cell RNA sequencing data is characterized by a large number of zero counts, yet there is growing evidence that these zeros reflect biological rather than technical artifacts. We propose differential dropout analysis (DDA), as an alternative to differential expression analysis (DEA), to identify the effects of biological variation in single-cell RNA sequencing data. Using 16 publicly available datasets, we show that dropout patterns are biological in nature and can assess the relative abundance of transcripts more robustly than counts.
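One way to picture a dropout-based comparison is to binarize each gene's counts into zero versus non-zero and test whether the zero fraction differs between two groups of cells. The abstract does not state which test DDA uses, so the two-sided Fisher exact test below is an assumption, as are the function names and the toy counts.

```python
from math import comb


def dropout_table(counts_a, counts_b):
    """Return (zeros_a, nonzeros_a, zeros_b, nonzeros_b) for one gene."""
    zeros_a = sum(1 for c in counts_a if c == 0)
    zeros_b = sum(1 for c in counts_b if c == 0)
    return zeros_a, len(counts_a) - zeros_a, zeros_b, len(counts_b) - zeros_b


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x zeros in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts for one gene in two cell groups.
    group_a = [0, 0, 0, 5]
    group_b = [0, 2, 3, 4]
    table = dropout_table(group_a, group_b)
    print(table, fisher_exact_two_sided(*table))
```

Because only the zero/non-zero pattern is used, the magnitude of the non-zero counts plays no role in this comparison.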

