Bayesian inference of differentially expressed transcripts and their abundance from multi-condition RNA-seq data

2019 ◽  
Author(s):  
Xi Chen

Abstract Deep sequencing of bulk RNA enables differential expression analysis at the transcript level. We develop a Bayesian approach to directly identify differentially expressed transcripts from RNA-seq data, which features a novel joint model of the sample variability and the differential state of individual transcripts. For each transcript, to minimize the inaccuracy in the differential state caused by transcript abundance estimation, we estimate its expression abundance together with the differential state iteratively, which enables the differential analysis of weakly expressed transcripts. Simulation analysis demonstrates that the proposed approach outperforms conventional methods (which estimate transcript expression first and then identify the differential state), particularly for lowly expressed transcripts. We further apply the proposed approach to breast cancer RNA-seq data from patients treated with tamoxifen and identify a set of differentially expressed transcripts, providing insights into key signaling pathways associated with breast cancer recurrence.

2017 ◽  
Author(s):  
Lynn Yi ◽  
Harold Pimentel ◽  
Nicolas L. Bray ◽  
Lior Pachter

Abstract Gene-level differential expression analysis based on RNA-Seq is more robust, powerful and biologically actionable than transcript-level differential analysis. However, aggregating transcript counts prior to analysis can mask transcript-level dynamics. We demonstrate that aggregating the results of transcript-level analysis allows for gene-level analysis with transcript-level resolution. We also show that p-value aggregation methods, typically used for meta-analyses, greatly increase the sensitivity of gene-level differential analyses. Furthermore, such aggregation can be applied directly to transcript compatibility counts obtained during pseudoalignment, thereby allowing for rapid and accurate model-free differential testing. The methods are general, allowing for testing not only of genes but also of any groups of transcripts, and we showcase an example where we apply them to perturbation analysis of gene ontologies.
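As an illustration of p-value aggregation, the sketch below combines transcript-level p-values into one gene-level p-value using Fisher's method. This is an assumption made for illustration: the abstract does not say which aggregation method the authors use, Fisher's method presumes independent tests, and the function names and the transcript-to-gene mapping are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the survival function has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!, so no SciPy is needed.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i  # (X/2)^i / i!
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def gene_level_p_values(transcript_p, transcript_to_gene):
    """Group transcript p-values by gene (hypothetical mapping) and combine them."""
    grouped = {}
    for transcript, p in transcript_p.items():
        grouped.setdefault(transcript_to_gene[transcript], []).append(p)
    return {gene: fisher_combine(ps) for gene, ps in grouped.items()}
```

Because the grouping is just a dictionary, the same function could aggregate over any set of transcripts, such as a gene ontology term, rather than a gene.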


2021 ◽  
Author(s):  
Chengang Guo ◽  
Zhimin Wei ◽  
Wei Lyu ◽  
Yanlou Geng

Abstract Quinoa saponins have complex, diverse and evident physiologic activities. However, the key regulatory genes for quinoa saponin metabolism are not yet well studied. The purpose of this study was to explore genes closely related to quinoa saponin metabolism. In this study, the significantly differentially expressed genes in yellow quinoa were first screened based on RNA-seq technology. Then, the key genes for saponin metabolism were selected by gene set enrichment analysis (GSEA) and principal component analysis (PCA). Finally, the specificity of the key genes was verified by hierarchical clustering. The differential analysis yielded 1654 differentially expressed genes after pseudogenes were removed. Of these, 142 were long non-coding genes and 1512 were protein-coding genes. Based on GSEA, 116 key candidate genes were found to be significantly correlated with quinoa saponin metabolism. Through PCA dimension reduction, 57 key genes were finally obtained. Hierarchical cluster analysis further demonstrated that these key genes can clearly separate the four groups of samples. The present results could provide references for the breeding of sweet quinoa and would be helpful for the rational utilization of quinoa saponins.


2020 ◽  
Author(s):  
Siew Woh Choo ◽  
Yu Zhong ◽  
Edward Sendler ◽  
Anton Scott Goustin ◽  
Juan Cai ◽  
...  

Abstract Background Estrogen is a hormone that is frequently essential in breast cancer to drive key transcriptional programs by interacting with the estrogen receptor alpha that upregulates proliferative and oncogenic genes and represses apoptotic and tumor suppressor genes. Protein-coding targets of estrogen regulation in breast cancer are well-defined. However, long non-coding RNA (lncRNA) genes account for the majority of human gene catalogs. The coding status of these genes – their accidental, or regulated, translation by ribosomes, under the influence of estrogen – remains a controversial topic. Methods Here, we performed comprehensive transcriptome analysis using RNA-Seq, as well as ribosome profiling using Ribo-Seq, on the same samples: biological replicates of human estrogen receptor alpha (ERa) positive MCF7 breast cancer cells before and after estrogen treatment. We correlated these two datasets, globally highlighting protein-coding and lncRNA differentially expressed genes and transcripts that were positively as well as negatively responsive to estrogen, separately at the transcriptional level and the translational (as approximated by ribosome binding) level. Results Our data showed that some transcripts were more robustly detected in RNA-Seq than in the ribosome-profiling data, and vice versa, suggesting distinct gene-specific estrogen responses at the transcriptional and the translational level, respectively. Certain differentially expressed transcripts may point to the regulation of alternative splicing by estrogen. Several pseudogenes were co- and anti-regulated with their cancer-functional parental genes. Gene ontology analysis highlighted cancer-relevant pathways enriched after estrogen treatment in cells. Conclusions Our study represents a significant advance in estrogen receptor biology, because we demonstrated global effects of estrogen on splicing and translation that are distinct from, and not always correlated with, its effects on transcription, and that differ globally for protein-coding and lncRNA genes. We have also highlighted for the first time the transcriptional and translational response of expressed pseudogenes to estrogen, pointing to new perspectives for biomarker and drug-target development for breast cancer in the future.


2020 ◽  
Vol 2020 ◽  
pp. 1-18
Author(s):  
Xinhong Liu ◽  
Feng Chen ◽  
Fang Tan ◽  
Fang Li ◽  
Ruokun Yi ◽  
...  

Background. Breast cancer is a malignant tumor that occurs in the epithelial tissue of the breast gland and has become the most common malignancy in women. The regulation of the expression of related genes by microRNA (miRNA) plays an important role in breast cancer. We constructed a comprehensive breast cancer-miRNA-gene interaction map. Methods. Three miRNA microarray datasets (GSE26659, GSE45666, and GSE58210) were obtained from the GEO database. Then, the R software “LIMMA” package was used to perform differential expression analysis. Potential transcription factors and target genes of screened differentially expressed miRNAs (DE-miRNAs) were predicted. The BRCA GE-mRNA datasets (GSE109169 and GSE139038) were downloaded from the GEO database for identifying differentially expressed genes (DE-genes). Next, GO annotation and KEGG pathway enrichment analysis were conducted. A PPI network was then established, and hub genes were identified via Cytoscape software. The expression and prognostic roles of hub genes were further evaluated. Results. We found 6 upregulated differentially expressed- (DE-) miRNAs and 18 downregulated DE-miRNAs by analyzing 3 Gene Expression Omnibus datasets, and we predicted the upstream transcription factors and downstream target genes for these DE-miRNAs. Then, we used the GEO database to perform differential analysis on breast cancer mRNA and obtained differentially expressed mRNA. We found 10 hub genes of upregulated DE-miRNAs and 10 hub genes of downregulated DE-miRNAs through interaction analysis. Conclusions. In this study, we have performed an integrated bioinformatics analysis to construct a more comprehensive BRCA-miRNA-gene network and provide new targets and research directions for the treatment and prognosis of BRCA.


2021 ◽  
Author(s):  
Biao Chen ◽  
Wenjie Fang ◽  
Yankai Li ◽  
Ting Xiong ◽  
Mingfang Zhou ◽  
...  

Ducks are an important source of meat and egg products for human beings. In China, duck breeding has gradually changed from the traditional floor-water combination system to multilayer cage breeding. Therefore, the present study collected the hypothalamus and pituitary of 113-day-old ducks after being caged for 3 days, in order to investigate the effect of cage-rearing on the birds. In addition, the same tissues (hypothalamus and pituitary) were collected from ducks raised in the floor-water combination system, for comparison. Thereafter, the transcriptomes were sequenced and the expression levels of genes were compared. The results of sequencing analysis showed that a total of 506 and 342 genes were differentially expressed in the hypothalamus and pituitary, respectively. Additionally, the differentially expressed genes were mainly enriched in signaling pathways involved in processing environmental information, including ECM-receptor interaction, neuroactive ligand-receptor interaction and focal adhesion. The findings also showed that there was a change in the alternative splicing of genes when ducks were transferred into the cage rearing system. However, for some genes there was no difference in overall expression even though the expression of their isoforms changed. The findings herein can therefore help in understanding the mechanisms underlying the effect of caging on waterfowl. The results also highlight the gene regulatory networks involved in animal responses to acute stress.


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7976 ◽  
Author(s):  
Yaozong Wang ◽  
Baorong Song ◽  
Leilei Zhu ◽  
Xia Zhang

Background Dysregulated long non-coding RNAs (lncRNAs) may serve as potential biomarkers of cancers including breast cancer (BRCA). This study aimed to identify lncRNAs with strong prognostic value for BRCA. Methods LncRNA expression profiles of 929 tissue samples were downloaded from the TANRIC database. We performed differential expression analysis between paired BRCA and adjacent normal tissues. Survival analysis was used to identify lncRNAs with prognostic value. Univariate and multivariate Cox regression analyses were performed to confirm the independent prognostic value of potential lncRNAs. Dysregulated signaling pathways associated with lncRNA expression were evaluated using gene set enrichment analysis. Results We found that a total of 398 lncRNAs were significantly differentially expressed between BRCA and adjacent normal tissues (adjusted P value <= 0.0001 and |logFC| >= 1). Additionally, 381 potential lncRNAs were correlated with overall survival (OS) (P value < 0.05). A total of 48 lncRNAs remained when differentially expressed lncRNAs were overlapped with lncRNAs that had prognostic value. Among the 48 lncRNAs, one lncRNA (LINC01614) had stronger prognostic value and was highly expressed in BRCA tissues. LINC01614 expression was validated as an independent prognostic factor using univariate and multivariate analyses. Higher LINC01614 expression was observed in several molecular subgroups, including the estrogen receptor+, progesterone receptor+ and human epidermal growth factor receptor 2 (HER2)+ subgroups. Also, BRCA carrying a mutation in one of four genes (AOAH, CIT, HER2 and ODZ1) had higher expression of LINC01614. Higher expression of LINC01614 was positively correlated with several gene sets including TGF-β1 response, CDH1 signals and cell adhesion pathways. Conclusions A novel lncRNA LINC01614 was identified as a potential biomarker for prognosis prediction of BRCA. This study emphasized the importance of LINC01614 and further research should be focused on it.
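The two-step selection described in the Results (a differential expression cutoff, then an overlap with lncRNAs that have prognostic value) can be sketched as below. Only the thresholds come from the abstract; the record layout, field names and function names are hypothetical, and the p-values are assumed to have been computed beforehand by whatever tools the authors used.

```python
def select_de_lncrnas(records, max_adj_p=1e-4, min_abs_logfc=1.0):
    """Return IDs passing the stated cutoffs: adjusted P <= 0.0001 and |logFC| >= 1.

    `records` is a hypothetical list of dicts with keys "id", "adj_p", "logFC".
    """
    return {
        r["id"]
        for r in records
        if r["adj_p"] <= max_adj_p and abs(r["logFC"]) >= min_abs_logfc
    }


def overlap_with_prognostic(de_ids, survival_p, max_p=0.05):
    """Keep DE lncRNAs whose survival-analysis P value is below 0.05.

    `survival_p` is a hypothetical dict mapping lncRNA ID to its P value.
    """
    prognostic = {lnc_id for lnc_id, p in survival_p.items() if p < max_p}
    return de_ids & prognostic
```

Note the asymmetry in the stated cutoffs: the differential expression filter is inclusive (<=, >=), while the survival filter is strict (<).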


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Melvyn Yap ◽  
Rebecca L. Johnston ◽  
Helena Foley ◽  
Samual MacDonald ◽  
Olga Kondrashova ◽  
...  

Abstract For complex machine learning (ML) algorithms to gain widespread acceptance in decision making, we must be able to identify the features driving the predictions. Explainability models allow transparency of ML algorithms; however, their reliability within high-dimensional data is unclear. To test the reliability of the explainability model SHapley Additive exPlanations (SHAP), we developed a convolutional neural network to predict tissue classification from Genotype-Tissue Expression (GTEx) RNA-seq data representing 16,651 samples from 47 tissues. Our classifier achieved an average F1 score of 96.1% on held-out GTEx samples. Using SHAP values, we identified the 2423 most discriminatory genes, of which 98.6% were also identified by differential expression analysis across all tissues. The SHAP genes reflected expected biological processes involved in tissue differentiation and function. Moreover, SHAP genes clustered tissue types with superior performance when compared to all genes, genes detected by differential expression analysis, or random genes. We demonstrate the utility and reliability of SHAP to explain a deep learning model and highlight the strengths of applying ML to transcriptome data.
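For readers unfamiliar with an averaged F1 score in a multi-class setting, a minimal sketch follows. It assumes macro-averaging (an unweighted mean of per-tissue F1 scores); the abstract does not state which averaging scheme was used, and the function name and toy labels are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro-averaging, an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy usage with hypothetical tissue labels
truth = ["liver", "liver", "lung", "lung"]
predicted = ["liver", "lung", "lung", "lung"]
score = macro_f1(truth, predicted)
```

With macro-averaging, rare tissues count as much as common ones, which matters when sample counts per tissue are unbalanced.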


2018 ◽  
Author(s):  
Giulio Spinozzi ◽  
Valentina Tini ◽  
Laura Mincarelli ◽  
Brunangelo Falini ◽  
Maria Paola Martelli

There are many methods available for each phase of RNA-Seq analysis, and each of them uses different algorithms. It is therefore useful to identify a pipeline that combines the best tools in terms of time and results. For this purpose, we compared five different pipelines, obtained by combining the most used tools in RNA-Seq analysis. Using RNA-Seq data on samples of different Acute Myeloid Leukemia (AML) cell lines, we compared five pipelines from the alignment to the differential expression analysis (DEA). For each one we evaluated peak RAM usage and run time and then compared the differentially expressed genes identified by each pipeline. It emerged that the pipeline with the shortest run time, the lowest RAM consumption and the most reliable results is the one that uses HISAT2 for alignment, featureCounts for quantification and edgeR for differential analysis. Finally, we developed an automated pipeline that defaults to the cited pipeline but also allows users to choose between different tools. In addition, the pipeline performs a final meta-analysis that includes a Gene Ontology and Pathway analysis. The results can be viewed in an interactive Shiny App and exported in a report (pdf, word or html formats).


2021 ◽  
Author(s):  
Zhou Chen ◽  
Hao Xu ◽  
Zhongtian Bai ◽  
Shi Dong ◽  
Jian Zhang ◽  
...  

Abstract Background Dysregulated expression of miRNAs in gastric cancer (GC) is associated with tumor progression. MiRNA markers are important for the prognosis and therapeutic targeting of GC patients. Methods To detect differentially expressed miRNAs in GC from the TCGA database and predict their target genes, we downloaded RNA sequencing (RNA-seq), miRNA-seq and clinical data of GC from TCGA. Differential expression analysis of RNA-seq and miRNA-seq data was performed by R 3.6.1. MiRNAs associated with prognosis were evaluated with the Cox model, and differentially expressed miRNAs were assessed by Kaplan–Meier curve analysis. Risk factors were identified in the Cox model. Target genes of differentially expressed miRNAs were searched in three databases. GO enrichment and KEGG pathway analyses were used to evaluate the biological functions of these target genes. Results Five miRNAs (hsa-miR-135b-3p, hsa-miR-143-5p, hsa-miR-196b-3p, hsa-miR-942-3p, hsa-miR-9-3p) were related to survival. Eight target genes (AKAP12, AR, DZIP1, PCDHA11, PCDHA12, PI15, SH3BGRL and TMEM108) were closely correlated with patient overall survival (OS). Conclusion Differentially expressed miRNAs and their target genes have an important influence on the diagnosis and prognosis of GC and may be used as tumor biomarkers in further studies and as potential therapeutic targets.

