WemIQ: an accurate and robust isoform quantification method for RNA-seq data

2014 ◽  
Vol 31 (6) ◽  
pp. 878-885 ◽  
Author(s):  
Jing Zhang ◽  
C.-C. Jay Kuo ◽  
Liang Chen
2018 ◽  
Vol 19 (1) ◽  
Author(s):  
Jennifer Westoby ◽  
Marcela Sjöberg Herrera ◽  
Anne C. Ferguson-Smith ◽  
Martin Hemberg

2019 ◽  
Vol 36 (8) ◽  
pp. 2466-2473 ◽  
Author(s):  
Jiao Sun ◽  
Jae-Woong Chang ◽  
Teng Zhang ◽  
Jeongsik Yong ◽  
Rui Kuang ◽  
...  

Abstract Motivation: Accurate estimation of transcript isoform abundance is critical for downstream transcriptome analyses and can lead to precise molecular mechanisms for understanding complex human diseases, such as cancer. Simplex mRNA Sequencing (RNA-Seq) based isoform quantification approaches face the challenges of inherent sampling bias and unidentifiable read origins. A large-scale experiment shows that the consistency between RNA-Seq and other mRNA quantification platforms is relatively low at the isoform level compared to the gene level. In this project, we developed a platform-integrated model for transcript quantification (IntMTQ) to improve the performance of RNA-Seq on isoform expression estimation. IntMTQ, which benefits from the mRNA expressions reported by the other platforms, provides more precise RNA-Seq-based isoform quantification and leads to more accurate molecular signatures for disease phenotype prediction. Results: In the experiments to assess the quality of isoform expression estimated by IntMTQ, we designed three tasks for clustering and classification of 46 cancer cell lines with four different mRNA quantification platforms, including the newly developed NanoString nCounter technology. The results demonstrate that the isoform expressions learned by IntMTQ consistently provide more and better molecular features for downstream analyses compared with five baseline algorithms which consider RNA-Seq data only. An independent RT-qPCR experiment on seven genes in twelve cancer cell lines showed that IntMTQ improved overall transcript quantification. The platform-integrated algorithms could be applied to large-scale cancer studies, such as The Cancer Genome Atlas (TCGA), with both RNA-Seq and array-based platforms available. Availability and implementation: Source code is available at https://github.com/CompbioLabUcf/IntMTQ. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yu Hu ◽  
Li Fang ◽  
Xuelian Chen ◽  
Jiang F. Zhong ◽  
Mingyao Li ◽  
...  

Abstract Long-read RNA sequencing (RNA-seq) technologies can sequence full-length transcripts, facilitating the exploration of isoform-specific gene expression over short-read RNA-seq. We present LIQA to quantify isoform expression and detect differential alternative splicing (DAS) events using long-read direct mRNA sequencing or cDNA sequencing data. LIQA incorporates base pair quality score and isoform-specific read length information in a survival model to assign different weights across reads, and uses an expectation-maximization algorithm for parameter estimation. We apply LIQA to long-read RNA-seq data from the Universal Human Reference, acute myeloid leukemia, and esophageal squamous epithelial cells and demonstrate its high accuracy in profiling alternative splicing events.
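The expectation-maximization step mentioned in the abstract above can be illustrated with a minimal sketch. This is not LIQA's implementation: the function name `em_isoform_abundance` and the `weights` matrix are hypothetical, and the per-read weights are taken as given rather than derived from a survival model of quality scores and read lengths. The sketch only shows how weighted, ambiguously assigned reads can be apportioned among isoforms.

```python
def em_isoform_abundance(weights, n_iter=200, tol=1e-10):
    """Estimate relative isoform abundances with a plain EM loop.

    weights[i][j] >= 0 is an assumed per-read weight (likelihood of read i
    under isoform j); 0 means the read is incompatible with that isoform.
    """
    n_iso = len(weights[0])
    theta = [1.0 / n_iso] * n_iso  # start from uniform abundances
    for _ in range(n_iter):
        counts = [0.0] * n_iso
        # E-step: split each read across isoforms by posterior responsibility.
        for row in weights:
            denom = sum(t * w for t, w in zip(theta, row))
            if denom == 0.0:
                continue  # read compatible with no isoform; ignore it
            for j, w in enumerate(row):
                counts[j] += theta[j] * w / denom
        total = sum(counts)
        if total == 0.0:
            raise ValueError("no read is compatible with any isoform")
        # M-step: abundances are the normalized expected read counts.
        new_theta = [c / total for c in counts]
        delta = max(abs(a - b) for a, b in zip(theta, new_theta))
        theta = new_theta
        if delta < tol:
            break
    return theta


# Two reads unique to isoform 0, one unique to isoform 1, one ambiguous.
example = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
abundances = em_isoform_abundance(example)
```

In this toy input the ambiguous read is shared in proportion to the current abundance estimates, so the loop settles near two thirds for the first isoform and one third for the second.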


Author(s):  
Chi Zhang ◽  
Baohong Zhang ◽  
Michael S Vincent ◽  
Shanrong Zhao

2019 ◽  
Author(s):  
Wei Zhang ◽  
Raphael Petegrosso ◽  
Jae Woong Chang ◽  
Jiao Sun ◽  
Jeongsik Yong ◽  
...  

Abstract Background: Most eukaryotic genes produce different transcripts of multiple isoforms by inclusion or exclusion of particular exons. The isoforms of a gene often play diverse functional roles and thus it is necessary to accurately measure isoform expressions as well as gene expressions. While previous studies have demonstrated the strong agreement between mRNA-sequencing (RNA-seq) and array-based gene and/or isoform quantification platforms (Microarray gene expression and Exon-array), the more recently developed NanoString platform has not been systematically evaluated and compared, especially in large-scale studies across different cancer domains. Results: In this paper, we present a large-scale comparative study among RNA-seq, NanoString, array-based, and RT-qPCR platforms using 46 cancer cell lines across different cancer types. The goal is to understand and evaluate the calibers of the platforms for measuring gene and isoform expressions in cancer studies. We first performed NanoString experiments on 59 cancer cell lines with 403 custom-designed probes for measuring the expressions of 478 isoforms in 155 genes and additional RT-qPCR experiments for a subset of the measured isoforms in 13 cell lines. We then combined the data with the matched RNA-seq, Exon-array and Microarray data of 46 of the 59 cell lines for the comparative analysis. 
Conclusion: In the comparisons of the platforms for evaluating expressions at both isoform and gene levels, we found that (1) the degree of agreement across platforms on quantifying isoform expressions is lower than gene expressions; (2) NanoString and Exon-array are not consistent on isoform quantification even though both techniques are based on hybridization reactions; (3) RT-qPCR experiments are more consistent with RNA-seq and Exon-array quantification results on isoform-level compared to NanoString; (4) different RNA-seq isoform quantification algorithms showed inconsistent results, and two isoform quantification methods Net-RSTQ and eXpress are more consistent across the platforms in the comparison; and (5) RNA-seq has the best overall consistency with the other platforms on gene expression quantification.


2019 ◽  
Author(s):  
Wei Zhang ◽  
Raphael Petegrosso ◽  
Jae Woong Chang ◽  
Jeongsik Yong ◽  
Jeremy Chien ◽  
...  

Abstract Background: Most eukaryotic genes produce different transcripts of multiple isoforms by inclusion or exclusion of particular exons. The isoforms of a gene often play diverse functional roles and thus it is necessary to accurately measure isoform expressions as well as gene expressions. While previous studies have demonstrated the strong agreement between mRNA-sequencing (RNA-seq) and array-based gene and/or isoform quantification platforms (Microarray gene expression and Exon-array), the more recently developed NanoString platform has not been systematically evaluated and compared, especially in large-scale studies across different cancer domains. Results: In this paper, we present a large-scale comparative study among RNA-seq, NanoString, array-based and RT-qPCR platforms using 46 cancer cell lines across different cancer types to understand and evaluate the calibers of the platforms for measuring gene and isoform expressions in cancer studies. We first performed NanoString experiments on 59 cancer cell lines with 403 custom-designed probes for measuring the expressions of 405 isoforms in 155 genes and additional RT-qPCR experiments for a subset of the measured isoforms in 13 cell lines, and then combined the data with the matched RNA-seq, Exon-array and Microarray data of 46 of the 59 cell lines for the comparative analysis.
Conclusion: In the comparisons of the platforms for evaluating expressions at both isoform and gene levels, we found that (1) the degree of agreement across platforms on quantifying isoform expressions is lower than gene expressions; (2) NanoString and Exon-array are not consistent on isoform quantification even though both techniques are based on hybridization reactions; (3) RT-qPCR experiments are more consistent with RNA-seq quantification results on isoform-level compared to NanoString and Exon-array; (4) different RNA-seq isoform quantification algorithms showed inconsistent results, and two isoform quantification methods Net-RSTQ and eXpress are more consistent across the platforms in the comparison; and (5) RNA-seq has the best overall consistency with the other platforms on gene expression quantification.


2022 ◽  
Author(s):  
Amit M Fenn ◽  
Olga Tsoy ◽  
Tim Faro ◽  
Fanny Roessler ◽  
Alexander Dietrich ◽  
...  

Alternative splicing is a major contributor to transcriptome and proteome diversity in health and disease. A plethora of tools have been developed for studying alternative splicing in RNA-seq data. Previous benchmarks focused on isoform quantification and mapping. They neglected event detection tools, which arguably provide the most detailed insights into the alternative splicing process. DICAST offers a modular and extensible framework for the analysis of alternative splicing integrating 11 splice-aware mapping and eight event detection tools. We benchmark all tools extensively on simulated as well as whole blood RNA-seq data. STAR and HISAT2 demonstrated the best balance between performance and run time. The performance of event detection tools varies widely with no tool outperforming all others. DICAST allows researchers to employ a consensus approach to consider the most successful tools jointly for robust event detection. Furthermore, we propose the first reporting standard to unify existing formats and to guide future tool development.


2017 ◽  
Author(s):  
Yuanhua Huang ◽  
Guido Sanguinetti

Abstract Single cell RNA-seq (scRNA-seq) has revolutionised our understanding of transcriptome variability, with profound implications both fundamental and translational. While scRNA-seq provides a comprehensive measurement of stochasticity in transcription, the limitations of the technology have prevented its application to dissect variability in RNA processing events such as splicing. Here we present BRIE (Bayesian Regression for Isoform Estimation), a Bayesian hierarchical model which resolves these problems by learning an informative prior distribution from sequence features. We show that BRIE yields reproducible estimates of exon inclusion ratios in single cells and provides an effective tool for differential isoform quantification between scRNA-seq data sets. BRIE therefore expands the scope of scRNA-seq experiments to probe the stochasticity of RNA-processing.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Johan Gustafsson ◽  
Jonathan Robinson ◽  
Jens Nielsen ◽  
Lior Pachter

Abstract The incorporation of unique molecular identifiers (UMIs) in single-cell RNA-seq assays makes possible the identification of duplicated molecules, thereby facilitating the counting of distinct molecules from sequenced reads. However, we show that the naïve removal of duplicates can lead to a bias due to a “pooled amplification paradox,” and we propose an improved quantification method based on unseen species modeling. Our correction called BUTTERFLY uses a zero truncated negative binomial estimator implemented in the kallisto bustools workflow. We demonstrate its efficacy across cell types and genes and show that in some cases it can invert the relative abundance of genes.
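The idea of unseen species modeling in the abstract above can be sketched with a deliberately simpler estimator. The code below swaps in a zero-truncated Poisson model in place of the zero-truncated negative binomial that the abstract names, so it is not BUTTERFLY and not the kallisto bustools implementation. The function names `ztp_rate` and `estimate_total_molecules` and the input format (one copy count per observed UMI) are assumptions made for illustration.

```python
import math


def ztp_rate(mean_copies, n_steps=200):
    """Solve lam / (1 - exp(-lam)) = mean_copies for lam by bisection.

    The left-hand side is the mean of a zero-truncated Poisson with rate lam.
    """
    if mean_copies <= 1.0:
        raise ValueError("mean copies per observed molecule must exceed 1")
    lo, hi = 1e-12, mean_copies  # the truncated mean exceeds lam, so lam < mean
    for _ in range(n_steps):
        mid = 0.5 * (lo + hi)
        if mid / -math.expm1(-mid) < mean_copies:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_total_molecules(copy_counts):
    """Estimate observed plus unseen molecules from per-UMI copy counts."""
    if not copy_counts:
        raise ValueError("copy_counts must not be empty")
    n_observed = len(copy_counts)
    lam = ztp_rate(sum(copy_counts) / n_observed)
    p_seen = -math.expm1(-lam)  # probability a molecule is sequenced at least once
    return n_observed / p_seen


# Hypothetical copy counts for six observed UMIs of one gene.
total = estimate_total_molecules([1, 1, 2, 1, 3, 2])
```

The estimate always exceeds the number of observed molecules, and the gap widens as more UMIs are seen only once. A negative binomial variant would add a dispersion parameter to capture uneven amplification, which this sketch leaves out.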

