CAMDA 2014: Making sense of RNA-Seq data: From low-level processing to functional analysis

2014 ◽  
Vol 2 (2) ◽  
pp. 31-40 ◽  
Author(s):  
Oleg V Moskvin ◽  
Sean McIlwain ◽  
Irene M Ong

Numerous methods of RNA-Seq data analysis have been developed, and more are under active development. In this paper, we evaluate the impact of each processing stage, from pre-processing of sequencing reads through alignment and counting, count normalization, and differential expression testing to downstream functional analysis, on the inferred functional pattern of biological response. We assess the impact of 6,912 combinations of technical and biological factors on the resulting signature of transcriptomic functional response. Given the absence of ground truth, we use two complementary evaluation criteria: a) consistency of the functional patterns identified in two similar comparisons, namely the effects of a naturally toxic medium and a medium with artificially reconstituted toxicity, and b) consistency of results between the RNA-Seq and microarray versions of the same study. Our results show that despite high variability at the low-level processing stage (read pre-processing, alignment and counting) and the differential expression calling stage, the impact of these stages on the inferred pattern of biological response was surprisingly low; it was instead overshadowed by the choice of the functional enrichment method. The latter has an impact comparable in magnitude to that of the biological factors per se.


2019 ◽  
Author(s):  
Tim O. Nieuwenhuis ◽  
Stephanie Yang ◽  
Rohan X. Verma ◽  
Vamsee Pillalamarri ◽  
Dan E. Arking ◽  
...  

Abstract: One of the challenges of next generation sequencing (NGS) is read contamination. We used the Genotype-Tissue Expression (GTEx) project, a large, diverse, and robustly generated dataset, to understand the factors that contribute to contamination. We obtained GTEx datasets and technical metadata, along with validating RNA-Seq data from other studies. Of 48 analyzed tissues in GTEx, 26 had variant co-expression clusters of four known highly expressed and pancreas-enriched genes (PRSS1, PNLIP, CLPS, and/or CELA3A). Fourteen additional highly expressed genes from other tissues also indicated contamination. Sample contamination by non-native genes was associated with a sample being sequenced on the same day as a tissue that natively expressed those genes. This was highly significant for pancreas and esophagus genes (linear model, p=9.5e-237 and p=5e-260 respectively). Nine SNPs in four genes shown to contaminate non-native tissues demonstrated allelic differences between DNA-based genotypes and contaminated sample RNA-based genotypes, validating the contamination. Low-level contamination affected 4,497 (39.6%) samples (defined as ≥10 PRSS1 TPM). It also led to eQTL assignments in inappropriate tissues among these 18 genes. We note that this type of contamination occurs widely, impacting bulk and single-cell data set analysis. In conclusion, highly expressed, tissue-enriched genes basally contaminate GTEx and other datasets, impacting analyses. Awareness of this process is necessary to avoid assigning inaccurate importance to low-level gene expression in inappropriate tissues and cells.


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 2122 ◽  
Author(s):  
Aaron T.L. Lun ◽  
Davis J. McCarthy ◽  
John C. Marioni

Single-cell RNA sequencing (scRNA-seq) is widely used to profile the transcriptome of individual cells. This provides biological resolution that cannot be matched by bulk RNA sequencing, at the cost of increased technical noise and data complexity. The differences between scRNA-seq and bulk RNA-seq data mean that the analysis of the former cannot be performed by recycling bioinformatics pipelines for the latter. Rather, dedicated single-cell methods are required at various steps to exploit the cellular resolution while accounting for technical noise. This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project. It covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment, identification of highly variable and correlated genes, clustering into subpopulations and marker gene detection. Analyses were demonstrated on gene-level count data from several publicly available data sets involving haematopoietic stem cells, brain-derived cells, T-helper cells and mouse embryonic stem cells. This will provide a range of usage scenarios from which readers can construct their own analysis pipelines.


2021 ◽  
Author(s):  
Christian Siadjeu ◽  
Eike Mayland-Quellhorst ◽  
Sascha Laubinger ◽  
Dirk C. Albach

Abstract: The storage ability of D. dumetorum is restricted by severe post-harvest hardening (PHH), which starts 72 h after harvest and renders tubers inedible. Previous work has focused only on the biochemical changes associated with PHH in D. dumetorum. To the best of our knowledge, nobody has identified candidate genes responsible for hardness in D. dumetorum. Here, transcriptome analysis of D. dumetorum tubers was performed 4 months after emergence (4MAE), after harvest (AH), 3 days AH (3DAH) and 14 days AH (14DAH) on four accessions using RNA-Seq. In total, between AH and 3DAH, 165, 199, 128 and 61 differentially expressed genes (DEGs) were detected in Bayangam 2, Fonkouankem 1, Bangou 1 and Ibo sweet 3, respectively. Functional analysis of DEGs revealed that genes encoding cellulose synthase A, xylan O-acetyltransferase, chlorophyll a/b binding protein 1, 2, 3, 4 and transcription factor MYBP were predominantly and significantly up-regulated at 3DAH, implying that these genes are potentially involved in post-harvest hardening. A hypothetical mechanism of this phenomenon and its regulation has been proposed. These findings provide the first comprehensive insights into gene expression in yam tubers after harvest and valuable information for molecular breeding against post-harvest hardening.


2018 ◽  
Vol 19 (11) ◽  
pp. 3637 ◽  
Author(s):  
Xiaoshuang Li ◽  
Bei Gao ◽  
Daoyuan Zhang ◽  
Yuqing Liang ◽  
Xiaojie Liu ◽  
...  

Bryum argenteum is a desert moss that tolerates the desert environment and is emerging as a good plant material for the identification of stress-related genes. The AP2/ERF transcription factor family plays important roles in plant responses to biotic and abiotic stresses. AP2/ERF genes have been identified and extensively studied in many plants, but they have rarely been studied in moss. In the present study, we identified 83 AP2/ERF genes based on the comprehensive dehydration–rehydration transcriptomic atlas of B. argenteum. BaAP2/ERF genes can be classified into five families, including 11 AP2s, 43 DREBs, 26 ERFs, 1 RAV, and 2 Soloists. RNA-seq data showed that 83 BaAP2/ERFs exhibited elevated transcript abundances during the dehydration–rehydration process. We used RT-qPCR to validate the expression profiles of 12 representative BaAP2/ERFs and confirmed the expression trends observed in the RNA-seq data. Eight out of 12 BaAP2/ERFs demonstrated transactivation activities. Seven BaAP2/ERFs enhanced salt and osmotic stress tolerances of yeast. This is the first study to provide detailed information on the identification, classification, and functional analysis of the AP2/ERFs in B. argenteum. This study will lay the foundation for further functional analysis of these genes in plants, as well as provide greater insights into the molecular mechanisms of abiotic stress tolerance of B. argenteum.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Thuan Phu Nguyen-Vo ◽  
Seyoung Ko ◽  
Huichang Ryu ◽  
Jung Rae Kim ◽  
Donghyuk Kim ◽  
...  

Abstract: Previously, we reported that 3-hydroxypropionate (3-HP) tolerance in Escherichia coli W is improved by deletion of yieP, a less-studied transcription factor. Here, through systems analyses along with physiological and functional studies, we suggest that the yieP deletion improves 3-HP tolerance by upregulation of yohJK, encoding putative 3-HP transporter(s). The tolerance improvement by yieP deletion was highly specific to 3-HP among various C2–C4 organic acids. Mapping of YieP binding sites (ChIP-exo) coupled with transcriptomic profiling (RNA-seq) pointed to seven potential genes/operons for further functional analysis. Among them, the yohJK operon, encoding novel transmembrane proteins, was the most responsible for the improved 3-HP tolerance; deletion of yohJK reduced 3-HP tolerance regardless of yieP deletion, and its subsequent complementation fully restored the tolerance in both the wild-type and the yieP deletion mutant. When measured with a 3-HP-responsive biosensor, a drastic reduction of intracellular 3-HP was observed upon yieP deletion or yohJK overexpression, suggesting that yohJK encodes novel 3-HP exporter(s).


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Yongjun Shu ◽  
Jun Zhang ◽  
You Ao ◽  
Lili Song ◽  
Changhong Guo

The transcriptome of Thinopyrum elongatum under water deficit stress was analyzed using RNA-Seq technology. The results showed that genes involved in amplification of stress signaling, reduction of oxidative damage, creation of protectants, and root development were differentially expressed, and that these genes played an important role in the response to water deficit. The Th. elongatum transcriptome research highlights the activation of a large set of water deficit-related genes in this species and provides a valuable resource for future functional analysis of candidate genes in the water deficit stress response.

