Comparative evaluation of RNA-Seq library preparation methods for strand-specificity and low input

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Dimitra Sarantopoulou ◽  
Soon Yew Tang ◽  
Emanuela Ricciotti ◽  
Nicholas F. Lahens ◽  
Damien Lekkas ◽  
...  

Abstract Library preparation is a key step in sequencing. For RNA sequencing there are advantages to both strand specificity and working with minute starting material, yet until recently there was no kit available enabling both. The Illumina TruSeq stranded mRNA Sample Preparation kit (TruSeq) requires abundant starting material while the Takara Bio SMART-Seq v4 Ultra Low Input RNA kit (V4) sacrifices strand specificity. The SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (Pico) by Takara Bio claims to overcome these limitations. Comparative evaluation of these kits is important for selecting the appropriate protocol. We compared the three kits in a realistic differential expression analysis. We prepared and sequenced samples from two experimental conditions of biological interest with each of the three kits. We report differences between the kits at the level of differential gene expression; for example, the Pico kit results in 55% fewer differentially expressed genes than TruSeq. Nevertheless, the agreement of the observed enriched pathways suggests that comparable functional results can be obtained. In summary we conclude that the Pico kit sufficiently reproduces the results of the other kits at the level of pathway analysis while providing a combination of options that is not available in the other kits.

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Mikhail Pomaznoy ◽  
Ashu Sethi ◽  
Jason Greenbaum ◽  
Bjoern Peters

Abstract RNA-seq methods are widely utilized for transcriptomic profiling of biological samples. However, there are known caveats of this technology which can skew the gene expression estimates. Specifically, if the library preparation protocol does not retain RNA strand information then some genes can be erroneously quantitated. Although strand-specific protocols have been established, a significant portion of RNA-seq data is generated in a non-strand-specific manner. We used a comprehensive stranded RNA-seq dataset of 15 blood cell types to identify genes for which expression would be erroneously estimated if strand information were not available. We found that about 10% of all genes and 2.5% of protein coding genes have a two-fold or higher difference in estimated expression when strand information of the reads was ignored. We used parameters of read alignments of these genes to construct a machine learning model that can identify which genes in an unstranded dataset might have incorrect expression estimates and which ones do not. We also show that differential expression analysis of genes with biased expression estimates in unstranded read data can be recovered by limiting the reads considered to those which span exonic boundaries. The resulting approach is implemented as a package available at https://github.com/mikpom/uslcount.
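
The two-fold criterion mentioned in this abstract can be illustrated with a minimal sketch. It is not the uslcount implementation and does not use its API: the function name, the pseudocount and the toy counts below are all hypothetical, chosen only to show how genes whose estimates shift by two-fold or more when strand information is ignored might be flagged.

```python
import math


def flag_strand_sensitive(stranded, unstranded, min_fold=2.0, pseudocount=1.0):
    """Return sorted gene IDs whose expression estimate changes by at least
    `min_fold` (in either direction) when strand information is ignored.

    `stranded` and `unstranded` map gene IDs to read counts (hypothetical
    inputs); the pseudocount avoids division by zero for unexpressed genes.
    """
    threshold = math.log2(min_fold)
    flagged = []
    for gene, s_count in stranded.items():
        u_count = unstranded.get(gene)
        if u_count is None:
            continue  # gene missing from the unstranded quantification
        log_ratio = math.log2((u_count + pseudocount) / (s_count + pseudocount))
        if abs(log_ratio) >= threshold:
            flagged.append(gene)
    return sorted(flagged)


# Toy counts, invented for illustration only.
stranded = {"geneA": 100, "geneB": 10, "geneC": 50}
unstranded = {"geneA": 105, "geneB": 45, "geneC": 52}
print(flag_strand_sensitive(stranded, unstranded))
```

In this toy example only geneB is flagged, because reads from the opposite strand inflate its unstranded count roughly four-fold.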


2019 ◽  
Author(s):  
Chen Yang ◽  
Chenkai Li ◽  
Ka Ming Nip ◽  
René L Warren ◽  
Inanc Birol

Abstract As a widespread RNA processing machinery, alternative polyadenylation plays a crucial role in gene regulation. To help decipher its underlying mechanism and understand its impact, it is desirable to comprehensively profile 3’-untranslated region cleavage and associated polyadenylation sites. State-of-the-art polyadenylation site detection tools are known to be influenced by library preparation artefacts or manually selected features. Moreover, recently published machine learning methods have only been tested on pre-constructed datasets, thus lacking validation on experimental data. Here we present Terminitor, the first deep neural network-based profiling pipeline to make predictions from RNA-seq data. We show how Terminitor outperforms competing tools in sensitivity and precision on experimental transcriptome sequencing data, and demonstrate its use with data from short- and long-read sequencing technologies. For species without a good reference transcriptome annotation, Terminitor is still able to pass on the information learnt from a related species and make reasonable predictions. We used Terminitor to showcase how single nucleotide variations can create or destroy polyadenylated cleavage sites in human RNA-seq samples.

Author Summary 3’ cleavage and polyadenylation of pre-mRNA is part of the RNA maturation process. One gene can be cleaved at different positions at its 3’ end, a process termed alternative polyadenylation, so identifying the correct polyadenylated cleavage site (poly(A) CS) is essential to unveil its role in gene regulation under different physiological and pathological conditions. The current poly(A) CS prediction tools are either heavily influenced by RNA-Seq library preparation artefacts or have only been designed and tested on ad hoc datasets, lacking association with real world applications. 
In this study, we present a deep learning model, Terminitor, that predicts the probability of a nucleotide sequence containing a poly(A) CS, and validated its performance on human and mouse data. Along with the model, we propose a poly(A) CS profiling pipeline for RNA-seq data. We benchmarked our pipeline against competing tools and achieved higher sensitivity and precision in experimental data. The usage of Terminitor is not limited to genome and transcriptome annotation and we expect it to facilitate the identification of novel isoforms, improve the accuracy of transcript quantification and differential expression analysis, and contribute to the repertoire of reference transcriptome annotation.


2002 ◽  
Vol 22 (11) ◽  
pp. 1377-1398 ◽  
Author(s):  
Yiyun Huang ◽  
Dah-Ren Hwang ◽  
Raj Narendran ◽  
Yasuhiko Sudo ◽  
Rano Chatterjee ◽  
...  

The recent introduction of a number of new radiotracers suitable for imaging the serotonin transporters (SERT) has radically changed the field of SERT imaging. Whereas, until recently, only one selective SERT radiotracer was available ([11C]McN 5652) for SERT imaging with positron emission tomography (PET), several new C-11-labeled radiotracers of the N,N-dimethyl-2-(arylthio)benzylamine class have been described as appropriate imaging agents for the SERT. The aim of this study was to conduct a comparative evaluation of four of the most promising agents in this class ([11C]ADAM, [11C]DASB, [11C]DAPA, and [11C]AFM) with the reference tracer [11C]McN 5652 under standardized experimental conditions. This evaluation included in vitro measurements of affinity and lipophilicity, and in vivo PET imaging experiments in baboons. In vitro, DASB displayed significantly lower affinity for SERT than the other four tracers. In the blood, [11C]DASB and [11C]AFM displayed faster clearance and higher free fractions. Brain uptake was analyzed with kinetic modeling using a one-tissue compartment model and the metabolite-corrected arterial input function. The kinetic uptake of [11C]DASB was significantly faster compared with the other compounds, and the scan duration required to derive time-independent estimates of regional distribution volumes was shorter. [11C]DAPA exhibited the slowest brain kinetics. Regional-specific-to-nonspecific equilibrium partition coefficient (V3″) was the highest for [11C]AFM, followed by [11C]DASB and [11C]DAPA, which in turn provided higher V3″ values than [11C]ADAM and [11C]McN 5652. 
From these experiments, two ligands emerged as superior radiotracers that provide a significant improvement over [11C]McN 5652 for PET imaging of SERT: [11C]DASB, because it enables the measurement of SERT availability in a shorter scanning time, and [11C]AFM, because its higher signal-to-noise ratios provide a more reliable measurement of SERT availability in brain regions with relatively low density of SERT, such as in the limbic system.


2020 ◽  
Author(s):  
Matteo Calgaro ◽  
Chiara Romualdi ◽  
Levi Waldron ◽  
Davide Risso ◽  
Nicola Vitulo

Abstract Background The correct identification of differentially abundant microbial taxa between experimental conditions is a methodological and computational challenge. Recent work has produced methods to deal with the high sparsity and compositionality characteristic of microbiome data, but independent benchmarks comparing these to alternatives developed for RNA-seq data analysis are lacking. Results Here, we compare methods developed for single cell, bulk RNA-seq, and microbiome data, in terms of suitability of distributional assumptions, ability to control false discoveries, concordance, and power. We benchmark these methods using 100 manually curated datasets from 16S and whole metagenome shotgun sequencing. Conclusions The multivariate and compositional methods developed specifically for microbiome analysis did not outperform univariate methods developed for differential expression analysis of RNA-seq data. We recommend a careful exploratory data analysis prior to application of any inferential model and we present a framework to help scientists make an informed choice of analysis methods in a dataset-specific manner.


2019 ◽  
Vol 20 (S16) ◽  
Author(s):  
Kefei Liu ◽  
Li Shen ◽  
Hui Jiang

Abstract Background A fundamental problem in RNA-seq data analysis is to identify genes or exons that are differentially expressed with varying experimental conditions based on the read counts. The relativeness of RNA-seq measurements makes the between-sample normalization of read counts an essential step in differential expression (DE) analysis. In most existing methods, the normalization step is performed prior to the DE analysis. Recently, Jiang and Zhan proposed a statistical method which introduces sample-specific normalization parameters into a joint model, which allows for simultaneous normalization and differential expression analysis from log-transformed RNA-seq data. Furthermore, an ℓ0 penalty is used to yield a sparse solution which selects a subset of DE genes. The experimental conditions are restricted to be categorical in their work. Results In this paper, we generalize Jiang and Zhan’s method to handle experimental conditions that are measured in continuous variables. As a result, genes with expression levels associated with a single or multiple covariates can be detected. Because the problem is high-dimensional, non-differentiable and non-convex, we develop an efficient algorithm for model fitting. Conclusions Experiments on synthetic data demonstrate that the proposed method outperforms existing methods in terms of detection accuracy when a large fraction of genes are differentially expressed in an asymmetric manner, and the performance gain becomes more substantial for larger sample sizes. We also apply our method to a real prostate cancer RNA-seq dataset to identify genes associated with pre-operative prostate-specific antigen (PSA) levels in patients.


2019 ◽  
Vol 139 (5) ◽  
pp. 543-553
Author(s):  
Hiromune Namie ◽  
Nobuaki Kubo ◽  
Osamu Suzuki ◽  
Chie Kojima ◽  
Yasuhiko Aiko

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Matthew Chung ◽  
Vincent M. Bruno ◽  
David A. Rasko ◽  
Christina A. Cuomo ◽  
José F. Muñoz ◽  
...  

Abstract Advances in transcriptome sequencing allow for simultaneous interrogation of differentially expressed genes from multiple species originating from a single RNA sample, termed dual or multi-species transcriptomics. Compared to single-species differential expression analysis, the design of multi-species differential expression experiments must account for the relative abundances of each organism of interest within the sample, often requiring enrichment methods and yielding differences in total read counts across samples. The analysis of multi-species transcriptomics datasets requires modifications to the alignment, quantification, and downstream analysis steps compared to the single-species analysis pipelines. We describe best practices for multi-species transcriptomics and differential gene expression.


Animals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1745
Author(s):  
Ben-Ben Miao ◽  
Su-Fang Niu ◽  
Ren-Xie Wu ◽  
Zhen-Bang Liang ◽  
Bao-Gui Tang ◽  
...  

Pearl gentian grouper (Epinephelus fuscoguttatus ♀ × Epinephelus lanceolatus ♂) is a fish of high commercial value in the aquaculture industry in Asia. However, this hybrid fish is not cold-tolerant, and its molecular regulation mechanism underlying cold stress remains largely elusive. This study thus investigated the liver transcriptomic responses of pearl gentian grouper by comparing the gene expression of cold stress groups (20, 15, 12, and 12 °C for 6 h) with that of control group (25 °C) using PacBio SMRT-Seq and Illumina RNA-Seq technologies. In SMRT-Seq analysis, a total of 11,033 full-length transcripts were generated and used as reference sequences for further RNA-Seq analysis. In RNA-Seq analysis, 3271 differentially expressed genes (DEGs), two low-temperature specific modules (tan and blue modules), and two significantly expressed gene sets (profiles 0 and 19) were screened by differential expression analysis, weighted gene co-expression networks analysis (WGCNA), and short time-series expression miner (STEM), respectively. The intersection of the above analyses further revealed some key genes, such as PCK, ALDOB, FBP, G6pC, CPT1A, PPARα, SOCS3, PPP1CC, CYP2J, HMGCR, CDKN1B, and GADD45Bc. These genes were significantly enriched in carbohydrate metabolism, lipid metabolism, signal transduction, and endocrine system pathways. All these pathways were linked to biological functions relevant to cold adaptation, such as energy metabolism, stress-induced cell membrane changes, and transduction of stress signals. Taken together, our study explores an overall and complex regulation network of the functional genes in the liver of pearl gentian grouper, which could benefit the species in preventing damage caused by cold stress.


2021 ◽  
Vol 11 (8) ◽  
pp. 3562
Author(s):  
Yong Jin Lee ◽  
Sang Yong Park ◽  
Dae Yeon Kim ◽  
Jae Yoon Kim

Preharvest sprouting (PHS) is a key global issue in production and end-use quality of cereals, particularly in regions where the rainfall season overlaps the harvest. To investigate transcriptomic changes in genes affected by PHS-induction and ABA-treatment, RNA-seq analysis was performed in two wheat cultivars that differ in PHS tolerance. A total of 123 unigenes related to hormone metabolism and signaling for abscisic acid (ABA), gibberellic acid (GA), indole-3-acetic acid (IAA), and cytokinin were identified, and 1862 differentially expressed genes (DEGs) were identified and divided into 8 groups by transcriptomic analysis. DEG analysis showed that the majority of genes were categorized in sugar-related processes, which interact with ABA signaling in the PHS-tolerant cultivar under PHS-induction. Thus, genes related to ABA are key regulators of dormancy and germination. Our results give insight into global changes in expression of plant hormone related genes in response to PHS.

