Union Exon Based Approach for RNA-Seq Gene Quantification: To Be or Not to Be?

PLoS ONE ◽  
2015 ◽  
Vol 10 (11) ◽  
pp. e0141910 ◽  
Author(s):  
Shanrong Zhao ◽  
Li Xi ◽  
Baohong Zhang
Keyword(s):  
RNA-Seq
2020 ◽  
Author(s):  
Eliah G. Overbey ◽  
Amanda M. Saravia-Butler ◽  
Zhe Zhang ◽  
Komal S. Rathi ◽  
Homer Fogle ◽  
...  

Summary: With the development of transcriptomic technologies, we are able to quantify precise changes in gene expression profiles from astronauts and other organisms exposed to spaceflight. Members of NASA GeneLab and GeneLab-associated analysis working groups (AWGs) have developed a consensus pipeline for analyzing short-read RNA-sequencing data from spaceflight-associated experiments. The pipeline includes quality control, read trimming, mapping, and gene quantification steps, culminating in the detection of differentially expressed genes. This data analysis pipeline and the results of its execution using data submitted to GeneLab are now all publicly available through the GeneLab database. We present here the full details and rationale for the construction of this pipeline in order to promote transparency, reproducibility and reusability of pipeline data, to provide a template for data processing of future spaceflight-relevant datasets, and to encourage cross-analysis of data from other databases with the data available in GeneLab.


GigaScience ◽  
2019 ◽  
Vol 8 (12) ◽  
Author(s):  
Hong Zheng ◽  
Kevin Brennan ◽  
Mikel Hernaez ◽  
Olivier Gevaert

Abstract. Background: Long non-coding RNAs (lncRNAs) are emerging as important regulators of various biological processes. While many studies have exploited public resources such as RNA sequencing (RNA-Seq) data in The Cancer Genome Atlas to study lncRNAs in cancer, it is crucial to choose the optimal method for accurate expression quantification. Results: In this study, we compared the performance of the pseudoalignment methods Kallisto and Salmon, the alignment-based transcript quantification method RSEM, and the alignment-based gene quantification methods HTSeq and featureCounts, in combination with the read aligners STAR, Subread, and HISAT2, in lncRNA quantification, by applying them to both un-stranded and stranded RNA-Seq datasets. Full transcriptome annotation, including protein-coding and non-coding RNAs, greatly improves the specificity of lncRNA expression quantification. Pseudoalignment methods and RSEM outperform HTSeq and featureCounts for lncRNA quantification in both sample- and gene-level comparisons, regardless of RNA-Seq protocol type, choice of aligner, and transcriptome annotation. Pseudoalignment methods and RSEM detect more lncRNAs and correlate highly with simulated ground truth. In contrast, HTSeq and featureCounts often underestimate lncRNA expression. Antisense lncRNAs are poorly quantified by alignment-based gene quantification methods, which can be improved by using stranded protocols and pseudoalignment methods. Conclusions: Considering consistency with ground truth and computational resources, the pseudoalignment methods Kallisto or Salmon, in combination with full transcriptome annotation, are our recommended strategy for RNA-Seq analysis of lncRNAs.
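The comparison against simulated ground truth described in this abstract can be illustrated with a minimal sketch. The abstract does not state which correlation measure or underestimation criterion the authors used, so the choice of Spearman rank correlation and the helper names `spearman` and `fraction_underestimated` are assumptions made only for illustration, and the expression values are made up.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(estimated, truth):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(estimated), average_ranks(truth))


def fraction_underestimated(estimated, truth):
    """Share of truly expressed genes whose estimate falls below the truth."""
    pairs = [(e, t) for e, t in zip(estimated, truth) if t > 0]
    return sum(1 for e, t in pairs if e < t) / len(pairs)


# Hypothetical per-gene expression values (not taken from the study).
truth = [10, 50, 200, 0, 5]
estimated = [8, 40, 210, 0, 1]
rho = spearman(estimated, truth)
under = fraction_underestimated(estimated, truth)
```

A rank-based measure is used here because expression values span several orders of magnitude; a correlation on log-transformed values would be an equally reasonable choice.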


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Shanrong Zhao ◽  
Ying Zhang ◽  
Ramya Gamini ◽  
Baohong Zhang ◽  
David von Schack

2018 ◽  
Author(s):  
Douglas C. Wu ◽  
Jun Yao ◽  
Kevin S. Ho ◽  
Alan M. Lambowitz ◽  
Claus O. Wilke

Abstract. Background: Alignment-free RNA quantification tools have significantly increased the speed of RNA-seq analysis. However, it is unclear whether these state-of-the-art RNA-seq analysis pipelines can quantify small RNAs as accurately as they do long RNAs in the context of total RNA quantification. Result: We comprehensively tested and compared four RNA-seq pipelines on the accuracy of gene quantification and fold-change estimation using a novel total RNA benchmarking dataset, in which small non-coding RNAs are highly represented along with other long RNAs. The four RNA-seq pipelines comprised two commonly used alignment-free pipelines and two variants of alignment-based pipelines. We found that all pipelines showed high accuracy in quantifying the expression of long and highly abundant genes. However, alignment-free pipelines showed systematically poorer performance in quantifying lowly abundant and small RNAs. Conclusion: We have shown that alignment-free and traditional alignment-based quantification methods performed similarly for common gene targets, such as protein-coding genes. However, we identified a potential pitfall in analyzing and quantifying lowly expressed genes and small RNAs with alignment-free pipelines, especially when these small RNAs contain mutations.
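The fold-change accuracy assessment mentioned in this abstract can be sketched as follows. The abstract does not give the authors' exact metric, so the pseudocount of 1, the root-mean-square error, and the function names `log2_fold_change` and `fold_change_rmse` are all assumptions for illustration, and the numbers are invented.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two abundance estimates.

    The pseudocount (an assumed value) keeps the ratio finite when an
    abundance is zero, which is common for lowly expressed genes.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def fold_change_rmse(abundance_pairs, expected_log2fc):
    """Root-mean-square error between estimated and expected log2 fold changes.

    abundance_pairs: list of (treated, control) estimates, one pair per gene.
    expected_log2fc: known log2 fold changes from the benchmark design.
    """
    errors = [
        log2_fold_change(t, c) - e
        for (t, c), e in zip(abundance_pairs, expected_log2fc)
    ]
    return math.sqrt(sum(err ** 2 for err in errors) / len(errors))


# Hypothetical estimates for two genes and their designed fold changes.
pairs = [(7, 1), (3, 3)]
expected = [2.0, 1.0]
rmse = fold_change_rmse(pairs, expected)
```

Computing the error separately for long and small RNAs, or for high and low abundance bins, is one way such a benchmark could expose the systematic differences the abstract reports.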


Author(s):  
Avi Srivastava ◽  
Mohsen Zakeri ◽  
Hirak Sarkar ◽  
Charlotte Soneson ◽  
Carl Kingsford ◽  
...  

Abstract: Transcript and gene quantification is the first step in many RNA-seq analyses. While many factors and properties of experimental RNA-seq data likely contribute to differences in accuracy between various approaches to quantification, it has been demonstrated (1) that quantification accuracy generally benefits from considering, during alignment, potential genomic origins for sequenced fragments that reside outside of the annotated transcriptome. Recently, Varabyou et al. (2) demonstrated that the presence of transcriptional noise leads to systematic errors in the ability of tools, particularly annotation-based ones, to accurately estimate transcript expression. Here, we confirm the findings of Varabyou et al. (2) using the simulation framework they have provided. Using the same data, we also examine the methodology of Srivastava et al. (1) as implemented in recent versions of salmon (3), and show that it substantially enhances the accuracy of annotation-based transcript quantification in these data.

