Benchmarking Transcriptome Quantification Methods for Duplicated Genes in Xenopus laevis

2015 ◽  
Vol 145 (3-4) ◽  
pp. 253-264 ◽  
Author(s):  
Taejoon Kwon

Xenopus is an important model organism for the study of genome duplication in vertebrates. With the full genome sequence of diploid Xenopus tropicalis available, and that of allotetraploid X. laevis close to being finished, we will be able to expand our understanding of how duplicated genes have evolved. One of the key features in the study of the functional consequence of gene duplication is how their expression patterns vary across different conditions, and RNA-seq seems to have enough resolution to discriminate the expression of highly similar duplicated genes. However, most of the current RNA-seq analysis methods were not designed to study samples with duplicate genes such as in X. laevis. Here, various computational methods to quantify gene expression in RNA-seq data were evaluated, using 2 independent X. laevis egg RNA-seq datasets and 2 reference databases for duplicated genes. The fact that RNA-seq can measure expression levels of similar duplicated genes was confirmed, but long paired-end reads are more informative than short single-end reads to discriminate duplicated genes. Also, it was found that bowtie, one of the most popular mappers in RNA-seq analysis, reports significantly smaller numbers of unique hits according to a mapping quality score compared to other mappers tested (BWA, GSNAP, STAR). Calculated from unique hits based on a mapping quality score, both expression levels and the expression ratio of duplicated genes can be estimated consistently among biological replicates, demonstrating that this method can successfully discriminate the expression of each copy of a duplicated gene pair. This comprehensive evaluation will be a useful guideline for studying gene expression of organisms with genome duplication using RNA-seq in the future.
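The idea of counting only unique hits, selected by mapping quality, can be illustrated with a minimal sketch. It is not the author's pipeline: the threshold `min_mapq=10`, the function names, and the gene names `geneX.L` and `geneX.S` are all assumptions made for illustration, and the abstract does not state which cutoff was used. Mappers also assign mapping quality on different scales, so a single cutoff is not directly comparable across tools.

```python
from collections import Counter

def count_unique_hits(sam_lines, min_mapq=10):
    """Count primary alignments per reference with MAPQ >= min_mapq.

    min_mapq=10 is an illustrative assumption, not a value from the abstract.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        rname = fields[2]       # column 3: reference sequence name
        mapq = int(fields[4])   # column 5: mapping quality
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        if mapq >= min_mapq:
            counts[rname] += 1
    return counts

def homeolog_ratio(counts, copy_a, copy_b):
    """Fraction of unique hits assigned to copy_a within a duplicated pair."""
    total = counts[copy_a] + counts[copy_b]
    return counts[copy_a] / total if total else None

# Toy alignments against a hypothetical duplicated pair geneX.L / geneX.S.
sam_lines = [
    "@SQ\tSN:geneX.L\tLN:1000",
    "r1\t0\tgeneX.L\t100\t40\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tgeneX.L\t150\t40\t50M\t*\t0\t0\t*\t*",
    "r3\t16\tgeneX.L\t200\t40\t50M\t*\t0\t0\t*\t*",
    "r4\t0\tgeneX.S\t100\t40\t50M\t*\t0\t0\t*\t*",
    "r5\t0\tgeneX.L\t300\t0\t50M\t*\t0\t0\t*\t*",   # ambiguous hit, MAPQ 0
    "r6\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",             # unmapped read
]

counts = count_unique_hits(sam_lines)
ratio = homeolog_ratio(counts, "geneX.L", "geneX.S")
print(dict(counts), ratio)
```

In this toy input the ambiguous read (MAPQ 0) and the unmapped read are discarded, so the ratio reflects only reads that the mapper could place on one copy with confidence.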

2020 ◽  
Vol 48 (15) ◽  
pp. 8320-8331
Author(s):  
Xiangjun Ji ◽  
Peng Li ◽  
James C Fuscoe ◽  
Geng Chen ◽  
Wenzhong Xiao ◽  
...  

Abstract The rat is an important model organism in biomedical research for studying human disease mechanisms and treatments, but its annotated transcriptome is far from complete. We constructed a Rat Transcriptome Re-annotation named RTR using RNA-seq data from 320 samples in 11 different organs generated by the SEQC consortium. In total, RTR contains 52 807 genes and 114 152 transcripts. Transcribed regions and exons in RTR account for ∼42% and ∼6.5% of the genome, respectively. Of all 73 074 newly annotated transcripts in RTR, 34 213 were annotated as high-confidence coding transcripts and 24 728 as high-confidence long noncoding transcripts. Tissue type, rather than developmental stage, has a significant influence on the expression patterns of transcripts. We also found that 11 715 genes and 15 852 transcripts were expressed in all 11 tissues and that 849 house-keeping genes expressed different isoforms among tissues. This comprehensive transcriptome is freely available at http://www.unimd.org/rtr/. Our new rat transcriptome provides an essential reference for genetics and gene expression studies in rat disease and toxicity models.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Le Zhang ◽  
Jingtian Zhao ◽  
Hao Bi ◽  
Xiangyu Yang ◽  
Zhiyang Zhang ◽  
...  

Abstract The nonrandom three-dimensional organization of chromatin plays an important role in the regulation of gene expression. However, it remains unclear whether this organization is conserved and whether it is involved in regulating gene expression during speciation after whole-genome duplication (WGD) in plants. In this study, high-resolution interaction maps were generated using high-throughput chromatin conformation capture (Hi-C) techniques for two poplar species, Populus euphratica and Populus alba var. pyramidalis, which diverged ~14 Mya after a common WGD. We examined the similarities and differences in the hierarchical chromatin organization between the two species, including A/B compartment regions and topologically associating domains (TADs), as well as in their DNA methylation and gene expression patterns. We found that chromatin status was strongly associated with epigenetic modifications and gene transcriptional activity, yet the conservation of hierarchical chromatin organization across the two species was low. The divergence of gene expression between WGD-derived paralogs was associated with the strength of chromatin interactions, and colocalized paralogs exhibited strong similarities in epigenetic modifications and expression levels. Thus, the spatial localization of duplicated genes is highly correlated with biased expression during the diploidization process. This study provides new insights into the evolution of chromatin organization and transcriptional regulation during the speciation process of poplars after WGD.


2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Weitong Cui ◽  
Huaru Xue ◽  
Lei Wei ◽  
Jinghua Jin ◽  
Xuewen Tian ◽  
...  

Abstract Background RNA sequencing (RNA-Seq) has been widely applied in oncology for monitoring transcriptome changes. However, the emerging problem that high variation of gene expression levels caused by tumor heterogeneity may affect the reproducibility of differential expression (DE) results has rarely been studied. Here, we investigated the reproducibility of DE results for any given number of biological replicates between 3 and 24 and explored why a great many differentially expressed genes (DEGs) were not reproducible. Results Our findings demonstrate that poor reproducibility of DE results exists not only for small sample sizes, but also for relatively large sample sizes. Quite a few of the DEGs detected are specific to the samples in use, rather than genuinely differentially expressed under different conditions. Poor reproducibility of DE results is mainly caused by high variation of gene expression levels for the same gene in different samples. Even though biological variation may account for much of the high variation of gene expression levels, the effect of outlier count data also needs to be treated seriously, as outlier data severely interfere with DE analysis. Conclusions High heterogeneity exists not only in tumor tissue samples of each cancer type studied, but also in normal samples. High heterogeneity leads to poor reproducibility of DEGs, undermining generalization of differential expression results. Therefore, it is necessary to use large sample sizes (at least 10 if possible) in RNA-Seq experimental designs to reduce the impact of biological variability and DE results should be interpreted cautiously unless soundly validated.
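One simple way to put a number on the reproducibility of DEG lists is the overlap between lists obtained from different sample subsets. The sketch below uses the Jaccard index for that purpose. This metric is an assumption chosen for illustration, since the abstract does not say how reproducibility was measured, and the gene names and function names are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index of two gene collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def mean_pairwise_overlap(deg_lists):
    """Average Jaccard index over all pairs of DEG lists."""
    pairs = list(combinations(deg_lists, 2))
    if not pairs:
        raise ValueError("need at least two DEG lists")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical DEG lists from three random subsamples of the same cohort.
run1 = {"A", "B", "C"}
run2 = {"B", "C", "D"}
run3 = {"A", "B", "C"}

overlap = mean_pairwise_overlap([run1, run2, run3])
print(round(overlap, 3))
```

A value near 1 means the subsamples agree on which genes are differentially expressed. Low values correspond to the sample-specific DEGs the abstract describes.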


Foods ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 360
Author(s):  
Guodong Rao ◽  
Jianguo Zhang ◽  
Xiaoxia Liu ◽  
Xue Li ◽  
Chenhe Wang

Olive oil has been favored as a high-quality edible oil because it contains balanced fatty acids (FAs) and high levels of minor components. The contents of FAs and minor components vary among olive fruits of different colors at harvest time, which makes it difficult to determine the optimal harvest strategy for olive oil production. Here, we combined metabolome, PacBio Iso-seq, and Illumina RNA-seq transcriptome data to investigate the association between metabolites and gene expression in olive fruits at harvest time. A total of 34 FAs, 12 minor components, and 181 other metabolites (including organic acids, polyols, amino acids, and sugars) were identified in this study. Moreover, we proposed optimal olive harvesting strategy models based on different production purposes. In addition, we used the combined PacBio Iso-seq and Illumina RNA-seq gene expression data to identify genes related to the biosynthetic pathways of hydroxytyrosol and oleuropein. These data lay the foundation for future investigations of olive fruit metabolism and gene expression patterns, and provide a method to obtain olive harvesting strategies for different production purposes.


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Hitoshi Iuchi ◽  
Michiaki Hamada

Abstract Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. degree or degrees of freedom) for one dataset. This approach risks modeling linearly increasing genes with higher-order functions, or fitting cyclic gene expression with linear functions, thereby leading to false positives/negatives. Here, we present a Jonckheere–Terpstra–Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks, using simulation data, show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, application of JTK in the analysis of time-series RNA-seq data from seven tissue types, across developmental stages in mouse and rat, suggested that the wave pattern, not the difference in expression levels, contributes to TEG identification by JTK. This result suggests that JTK is a suitable algorithm when focusing on expression patterns over time rather than expression levels, such as comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.
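The observation that the wave pattern, not the expression level, drives detection follows from the rank-based nature of Kendall-type statistics. The sketch below is not the authors' algorithm. It computes a plain Kendall tau-a between a time series and a reference waveform, and the template and expression values are hypothetical.

```python
from itertools import combinations

def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    pairs = list(combinations(range(len(x)), 2))
    s = sum(sign(x[i] - x[j]) * sign(y[i] - y[j]) for i, j in pairs)
    return s / len(pairs)

# Hypothetical wave template sampled at eight time points.
reference = [0, 1, 2, 1, 0, -1, -2, -1]

# Two genes with the same temporal shape but a 100-fold difference in level.
low_expr = [5, 6, 7, 6, 5, 4, 3, 4]
high_expr = [v * 100 for v in low_expr]

tau_low = kendall_tau_a(low_expr, reference)
tau_high = kendall_tau_a(high_expr, reference)
print(tau_low, tau_high)
```

Both genes receive the same score because only the ordering of the time points enters the statistic, which is why a rank-based comparison can be applied across datasets or species with different absolute expression levels.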


2021 ◽  
Author(s):  
Jianyuan Li ◽  
Hui Shi ◽  
Xiaoyu Liu ◽  
Yanwei Wang ◽  
Haiyan Wang ◽  
...  

Abstract I. Background: Peroxiredoxin 6 (Prdx6) is widely expressed in mammalian tissues. Our previous study demonstrated that Prdx6 was expressed in human epididymis and spermatozoa, and the protective role of Prdx6 in human spermatozoa was also reported. In this study, we demonstrate the potential role and mechanism of Prdx6 in human epididymis epithelial cells (HEECs). II. Methods and Results: Western blotting was used to measure expression levels of key proteins in the JAK/STAT signaling pathway. Digital gene expression analysis (DGE) was used to identify gene expression patterns in control HEECs and in HEECs after Prdx6-RNA interference (P6-RNAi). The DGE analysis identified 589 up-regulated and 314 down-regulated genes (including Prdx6) in P6-RNAi HEECs. Thirteen significantly different pathways were identified between the two groups, with the majority of differentially expressed genes belonging to the CCL, CXCL, IL, and IFIT families. In particular, the expression levels of IL6, IL6ST, and eighteen IFN-related genes were significantly increased when Prdx6 expression was down-regulated. Compared to control HEECs, the expression levels of JAK1, STAT1, phosphorylated JAK1 and STAT1 were significantly increased, while the expression level of SOCS3 was significantly decreased in P6-RNAi HEECs. The malondialdehyde (MDA) level was significantly increased and the total antioxidant capacity significantly decreased in P6-RNAi HEECs compared with control. III. Conclusions: We speculate that knockdown of Prdx6 resulted in higher levels of ROS in HEECs, which, in turn, activated the JAK1/STAT1 signaling pathway induced by IL-6 receptor and IFN.


2020 ◽  
pp. 160-170
Author(s):  
John Vivian ◽  
Jordan M. Eizenga ◽  
Holly C. Beale ◽  
Olena M. Vaske ◽  
Benedict Paten

PURPOSE Many antineoplastics are designed to target upregulated genes, but quantifying upregulation in a single patient sample requires an appropriate set of samples for comparison. In cancer, the most natural comparison set is unaffected samples from the matching tissue, but there are often too few available unaffected samples to overcome high intersample variance. Moreover, some cancer samples have misidentified tissues of origin or even composite-tissue phenotypes. Even if an appropriate comparison set can be identified, most differential expression tools are not designed to accommodate comparisons to a single patient sample. METHODS We propose a Bayesian statistical framework for gene expression outlier detection in single samples. Our method uses all available data to produce a consensus background distribution for each gene of interest without requiring the researcher to manually select a comparison set. The consensus distribution can then be used to quantify over- and underexpression. RESULTS We demonstrate this method on both simulated and real gene expression data. We show that it can robustly quantify overexpression, even when the set of comparison samples lacks ideally matched tissue samples. Furthermore, our results show that the method can identify appropriate comparison sets from samples of mixed lineage and rediscover numerous known gene-cancer expression patterns. CONCLUSION This exploratory method is suitable for identifying expression outliers from comparative RNA sequencing (RNA-seq) analysis for individual samples, and Treehouse, a pediatric precision medicine group that leverages RNA-seq to identify potential therapeutic leads for patients, plans to explore this method for processing its pediatric cohort.


2020 ◽  
Vol 18 (1) ◽  
Author(s):  
Noritaka Saeki ◽  
Yuuki Imai

Abstract Background Macrophages adapt to microenvironments, and change metabolic status and functions to regulate inflammation and/or maintain homeostasis. In joint cavities, synovial macrophages (SM) and synovial fibroblasts (SF) maintain homeostasis. However, under inflammatory conditions such as rheumatoid arthritis (RA), crosstalk between SM and SF remains largely unclear. Methods Immunofluorescent staining was performed to identify localization of SM and SF in synovium of collagen antibody induced arthritis (CAIA) model mice and normal mice. Murine arthritis tissue-derived SM (ADSM), arthritis tissue-derived SF (ADSF) and normal tissue-derived SF (NDSF) were isolated and the purity of isolated cells was examined by RT-qPCR and flow cytometry analysis. RNA-seq was conducted to reveal gene expression profile in ADSM, NDSF and ADSF. Cellular metabolic status and expression levels of metabolic genes and inflammatory genes were analyzed in ADSM treated with ADSM-conditioned medium (ADSM-CM), NDSF-CM and ADSF-CM. Results SM and SF were dispersed in murine hyperplastic synovium. Isolations of ADSM, NDSF and ADSF to analyze the crosstalk were successful with high purity. From gene expression profiles by RNA-seq, we focused on secretory factors in ADSF-CM, which can affect metabolism and inflammatory activity of ADSM. ADSM exposed to ADSF-CM showed significantly upregulated glycolysis and mitochondrial respiration as well as glucose and glutamine uptake relative to ADSM exposed to ADSM-CM and NDSF-CM. Furthermore, mRNA expression levels of metabolic genes, such as Slc2a1, Slc1a5, CD36, Pfkfb1, Pfkfb3 and Irg1, were significantly upregulated in ADSM treated with ADSF-CM. Inflammation marker genes, including Nos2, Tnf, Il-1b and CD86, and the anti-inflammatory marker gene, Il-10, were also substantially upregulated by ADSF-CM. On the other hand, NDSF-CM did not affect metabolism and gene expression in ADSM. Conclusions These findings suggest that crosstalk between SM and SF under inflammatory conditions can induce metabolic reprogramming and extend SM viability that together can contribute to chronic inflammation in RA.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yohey Ogawa ◽  
Joseph C. Corbo

Abstract Vertebrate photoreceptors are categorized into two broad classes, rods and cones, responsible for dim- and bright-light vision, respectively. While many molecular features that distinguish rods and cones are known, gene expression differences among cone subtypes remain poorly understood. Teleost fishes are renowned for the diversity of their photoreceptor systems. Here, we used single-cell RNA-seq to profile adult photoreceptors in zebrafish, a teleost. We found that in addition to the four canonical zebrafish cone types, there exist subpopulations of green and red cones (previously shown to be located in the ventral retina) that express red-shifted opsin paralogs (opn1mw4 or opn1lw1) as well as a unique combination of cone phototransduction genes. Furthermore, the expression of many paralogous phototransduction genes is partitioned among cone subtypes, analogous to the partitioning of the phototransduction paralogs between rods and cones seen across vertebrates. The partitioned cone-gene pairs arose via the teleost-specific whole-genome duplication or later clade-specific gene duplications. We also discovered that cone subtypes express distinct transcriptional regulators, including many factors not previously implicated in photoreceptor development or differentiation. Overall, our work suggests that partitioning of paralogous gene expression via the action of differentially expressed transcriptional regulators enables diversification of cone subtypes in teleosts.

