Revealing Biological Pathways Implicated in Lung Cancer from TCGA Gene Expression Data Using Gene Set Enrichment Analysis

Cancer Informatics ◽

10.4137/cin.s13882 ◽

2014 ◽

Vol 13s1 ◽

pp. CIN.S13882 ◽

Cited By ~ 4

Author(s):

Binghuang Cai ◽

Xia Jiang

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Gene Expression Data ◽

Lung Squamous Cell Carcinoma ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Expression Data ◽

Gene Set Enrichment ◽

Gene Set ◽

Pathway Gene

Analyzing biological system abnormalities in cancer patients based on measures of biological entities, such as gene expression levels, is an important and challenging problem. This paper applies existing methods, Gene Set Enrichment Analysis and Signaling Pathway Impact Analysis, to pathway abnormality analysis in lung cancer using microarray gene expression data. Gene expression data from studies of Lung Squamous Cell Carcinoma (LUSC) in The Cancer Genome Atlas project, and pathway gene set data from the Kyoto Encyclopedia of Genes and Genomes were used to analyze the relationship between pathways and phenotypes. Results, in the form of pathway rankings, indicate that some pathways may behave abnormally in LUSC. For example, both the cell cycle and viral carcinogenesis pathways ranked very high in LUSC. Furthermore, some pathways that are known to be associated with cancer, such as the p53 and the PI3K-Akt signal transduction pathways, were found to rank high in LUSC. Other pathways, such as bladder cancer and thyroid cancer pathways, were also ranked high in LUSC.


Fast gene set enrichment analysis

10.1101/060012 ◽

2016 ◽

Cited By ~ 218

Author(s):

Gennady Korotkevich ◽

Vladimir Sukhov ◽

Alexey Sergushichev

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Polynomial Algorithm ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Biological Processes ◽

Expression Data ◽

Gene Set Enrichment ◽

P Values ◽

Gene Set

Preranked gene set enrichment analysis (GSEA) is a widely used method for interpreting gene expression data in terms of biological processes. Here we present FGSEA, a method that can estimate arbitrarily low GSEA P-values with higher accuracy and much faster than other implementations. We also present a polynomial algorithm to calculate GSEA P-values exactly, which we use to confirm the accuracy of the method in practice.
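To make the idea of a preranked GSEA P-value concrete, the following is a minimal sketch of the textbook approach: a weighted running-sum enrichment score plus a naive gene-set permutation test. It is not the FGSEA algorithm or its exact polynomial method. The function names, the toy gene identifiers, and the one-sided P-value definition are all assumptions made for illustration.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Weighted running-sum enrichment score over a preranked gene list.

    ranked_genes: gene identifiers, sorted by decreasing score.
    scores: ranking statistic for each gene, in the same order.
    gene_set: set of gene identifiers to test.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_weights = [abs(s) ** p if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    if total_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for weight, hit in zip(hit_weights, in_set):
        # Step up on a hit (weighted by the statistic), step down on a miss.
        running += weight / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, scores, gene_set, n_perm=1000, seed=0):
    """One-sided empirical P-value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, scores, gene_set)
    size = len(gene_set & set(ranked_genes))
    as_extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked_genes, size))
        null_es = enrichment_score(ranked_genes, scores, null_set)
        if (observed >= 0 and null_es >= observed) or (observed < 0 and null_es <= observed):
            as_extreme += 1
    # Add-one correction: the estimate can never be smaller than 1 / (n_perm + 1).
    return observed, (as_extreme + 1) / (n_perm + 1)


# Toy example (hypothetical genes and scores, not real data).
genes = [f"g{i}" for i in range(20)]
stats = [2.0 - 0.2 * i for i in range(20)]
es, p_value = permutation_p_value(genes, stats, {"g0", "g1", "g2"})
print(es, p_value)
```

With this naive scheme the smallest reportable P-value is bounded by the number of permutations, so resolving very small P-values would require very many permutations. That is the kind of limitation the abstract's claim about estimating arbitrarily low P-values addresses.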


Application of bi-clustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials

Data in Brief ◽

10.1016/j.dib.2017.10.060 ◽

2017 ◽

Vol 15 ◽

pp. 933-940 ◽

Cited By ~ 2

Author(s):

Andrew Williams ◽

Sabina Halappanavar

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Expression Data ◽

Gene Set Enrichment ◽

Gene Set ◽

Analysis Methods


A Comparative Study of Mouse Hepatic and Intestinal Gene Expression Profiles under PPARα Knockout by Gene Set Enrichment Analysis

PPAR Research ◽

10.1155/2011/629728 ◽

2011 ◽

Vol 2011 ◽

pp. 1-10 ◽

Cited By ~ 6

Author(s):

Kan He ◽

Qishan Wang ◽

Yumei Yang ◽

Minghui Wang ◽

Yuchun Pan

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Small Cell Lung Cancer ◽

Cell Lung Cancer ◽

Enrichment Analysis ◽

Small Cell ◽

Gene Set Enrichment Analysis ◽

Small Cell Lung ◽

Gene Set Enrichment ◽

Tissue Specific

Gene expression profiling of PPARα has been used in several studies, but fewer studies have gone further to identify, genome-wide, the tissue-specific pathways or genes involved in PPARα activation. Here, we applied gene set enrichment analysis to two PPARα-related microarray datasets, one from mouse liver and one from mouse intestine. Our results suggest that the regulatory mechanism of PPARα activation by WY14643 is more complicated in mouse small intestine than in liver, because more pathways are involved. Several pathways were cancer-related, such as pancreatic cancer and small cell lung cancer, which indicates that PPARα may have an important role in preventing cancer development. Twelve PPARα-dependent pathways and four PPARα-independent pathways were common to both liver and intestine in mice. Most of them were metabolism-related: fatty acid metabolism, tryptophan metabolism, and pyruvate metabolism were dependent on PPARα regulation, whereas gluconeogenesis and propanoate metabolism were independent of it. Keratan sulfate biosynthesis, the regulation of actin cytoskeleton pathway, and the pathways associated with prostate cancer and small cell lung cancer were not identified as PPARα-independent in the hepatic study but as WY14643-dependent in the intestinal study. We also provide some novel hepatic tissue-specific marker genes.


Genetic network and gene set enrichment analysis to identify biomarkers related to cigarette smoking and lung cancer

Cancer Treatment Reviews ◽

10.1016/j.ctrv.2012.06.001 ◽

2013 ◽

Vol 39 (1) ◽

pp. 77-88 ◽

Cited By ~ 21

Author(s):

Xiaocong Fang ◽

Michael Netzer ◽

Christian Baumgartner ◽

Chunxue Bai ◽

Xiangdong Wang

Keyword(s):

Lung Cancer ◽

Cigarette Smoking ◽

Enrichment Analysis ◽

Genetic Network ◽

Gene Set Enrichment Analysis ◽

Gene Set Enrichment ◽

Gene Set


Differential gene expression and gene-set enrichment analysis in Caco-2 monolayers during a 30-day timeline with Dexamethasone exposure

Tissue Barriers ◽

10.1080/21688370.2019.1651597 ◽

2019 ◽

Vol 7 (3) ◽

pp. e1651597 ◽

Cited By ~ 3

Author(s):

J.M. Robinson ◽

S. Turkington ◽

S.A. Abey ◽

N. Kenea ◽

W.A. Henderson

Keyword(s):

Gene Expression ◽

Differential Gene Expression ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Gene Set Enrichment ◽

Gene Set ◽

Differential Gene


Gene Expression Signature in Human Neuroblastoma with TERT Overexpression Can Be Identified by Gene Set Enrichment Analysis and Epigenetically Targeted in an Orthotopic Mouse Xenograft Model

Journal of the American College of Surgeons ◽

10.1016/j.jamcollsurg.2020.07.565 ◽

2020 ◽

Vol 231 (4) ◽

pp. S199

Author(s):

Min Huang ◽

Lauren Wood ◽

Jasmine C. Zeki ◽

Modupeola Diyaolu ◽

Miao Gong ◽

...

Keyword(s):

Gene Expression ◽

Xenograft Model ◽

Gene Expression Signature ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Gene Set Enrichment ◽

Human Neuroblastoma ◽

Mouse Xenograft ◽

Gene Set ◽

Mouse Xenograft Model


Tracking Difference in Gene Expression in a Time-Course Experiment Using Gene Set Enrichment Analysis

PLoS ONE ◽

10.1371/journal.pone.0107629 ◽

2014 ◽

Vol 9 (9) ◽

pp. e107629 ◽

Cited By ~ 2

Author(s):

Pui Shan Wong ◽

Michihiro Tanaka ◽

Yoshihiko Sunaga ◽

Masayoshi Tanaka ◽

Takeaki Taniguchi ◽

...

Keyword(s):

Gene Expression ◽

Time Course ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Gene Set Enrichment ◽

Gene Set ◽

Time Course Experiment


Towards a gold standard for benchmarking gene set enrichment analysis

10.1101/674267 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ludwig Geistlinger ◽

Gergely Csaba ◽

Mara Santarelli ◽

Marcel Ramos ◽

Lucas Schiffer ◽

...

Keyword(s):

Ad Hoc ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Data Sets ◽

Expression Data ◽

Rna Seq ◽

Gene Set Enrichment ◽

Gene Set ◽

Gene Sets ◽

Enrichment Methods

Background: Although gene set enrichment analysis has become an integral part of high-throughput gene expression data analysis, the assessment of enrichment methods remains rudimentary and ad hoc. In the absence of suitable gold standards, evaluations are commonly restricted to selected data sets and biological reasoning on the relevance of resulting enriched gene sets. However, this is typically incomplete and biased towards the goals of individual investigations. Results: We present a general framework for standardized and structured benchmarking of enrichment methods based on defined criteria for applicability, gene set prioritization, and detection of relevant processes. This framework incorporates a curated compendium of 75 expression data sets investigating 42 different human diseases. The compendium features microarray and RNA-seq measurements, and each dataset is associated with a precompiled GO/KEGG relevance ranking for the corresponding disease under investigation. We perform a comprehensive assessment of 10 major enrichment methods on the benchmark compendium, identifying significant differences in (i) runtime and applicability to RNA-seq data, (ii) fraction of enriched gene sets depending on the type of null hypothesis tested, and (iii) recovery of the a priori defined relevance rankings. Based on these findings, we make practical recommendations on (i) how methods originally developed for microarray data can efficiently be applied to RNA-seq data, (ii) how to interpret results depending on the type of gene set test conducted, and (iii) which methods are best suited to effectively prioritize gene sets with high relevance for the phenotype investigated. Conclusion: We carried out a systematic assessment of existing enrichment methods, and identified best performing methods, but also general shortcomings in how gene set analysis is currently conducted. We provide a directly executable benchmark system for straightforward assessment of additional enrichment methods. Availability: http://bioconductor.org/packages/GSEABenchmarkeR


Construction and Validation of a Reliable Six-Gene Prognostic Signature Based on the TP53 Alteration for Hepatocellular Carcinoma

Frontiers in Oncology ◽

10.3389/fonc.2021.618976 ◽

2021 ◽

Vol 11 ◽

Author(s):

Junyu Huo ◽

Liqun Wu ◽

Yunjin Zang

Keyword(s):

Gene Expression ◽

Hepatocellular Carcinoma ◽

Regression Analysis ◽

Cox Regression ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Prognostic Signature ◽

Gene Set Enrichment ◽

Cox Regression Analysis ◽

Gene Set

Background: The high mutation rate of TP53 in hepatocellular carcinoma (HCC) makes it an attractive potential therapeutic target. However, the mechanism by which TP53 mutation affects the prognosis of HCC is not fully understood. Material and Approach: This study downloaded gene expression profiles and clinical information from The Cancer Genome Atlas (TCGA) database and the International Cancer Genome Consortium (ICGC) database. We used Gene Set Enrichment Analysis (GSEA) to determine the difference in gene expression patterns between HCC samples with wild-type TP53 (n=258) and mutant TP53 (n=116) in the TCGA cohort. We screened prognosis-related genes by univariate Cox regression analysis and Kaplan–Meier (KM) survival analysis. We constructed a six-gene prognostic signature in the TCGA training group (n=184) by Lasso and multivariate Cox regression analysis. To assess the predictive capability and applicability of the signature in HCC, we conducted internal validation, external validation, integrated analysis and subgroup analysis. Results: A prognostic signature consisting of six genes (EIF2S1, SEC61A1, CDC42EP2, SRM, GRM8, and TBCD) showed good performance in predicting the prognosis of HCC. The area under the curve (AUC) values of the ROC curve of 1-, 2-, and 3-year survival of the model were all greater than 0.7 in each independent cohort (internal testing cohort, n = 181; TCGA cohort, n = 365; ICGC cohort, n = 229; whole cohort, n = 594; subgroup, n = 9). Importantly, by gene set variation analysis (GSVA) and the single sample gene set enrichment analysis (ssGSEA) method, we found three possible causes that may lead to poor prognosis of HCC: high proliferative activity, low metabolic activity and immunosuppression. Conclusion: Our study provides a reliable method for the prognostic risk assessment of HCC and has great potential for clinical transformation.
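Signatures built with multivariate Cox regression are commonly applied as a linear risk score: a weighted sum of the signature genes' expression values, with samples then split into high-risk and low-risk groups. The sketch below illustrates that general pattern only. The abstract does not report the coefficients or the cutoff rule, so the coefficients (all set to 1.0), the median cutoff, the sample identifiers, and the expression values are placeholders, not values from the study.

```python
from statistics import median

SIGNATURE_GENES = ["EIF2S1", "SEC61A1", "CDC42EP2", "SRM", "GRM8", "TBCD"]

# Placeholder coefficients: NOT from the paper. Real values would come from
# the fitted multivariate Cox model.
PLACEHOLDER_COEFS = {gene: 1.0 for gene in SIGNATURE_GENES}


def risk_score(expression, coefs):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify_by_median(samples, coefs):
    """Label each sample 'high' or 'low' risk relative to the median score.

    The median cutoff is an assumption made for this sketch.
    """
    scores = {sid: risk_score(expr, coefs) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    groups = {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
    return scores, groups


# Toy expression values (made up for illustration).
toy_samples = {
    f"s{i}": {gene: float(i) for gene in SIGNATURE_GENES} for i in range(1, 5)
}
scores, groups = stratify_by_median(toy_samples, PLACEHOLDER_COEFS)
print(scores)
print(groups)
```

The resulting groups would typically be compared with Kaplan–Meier curves and time-dependent ROC analysis, as the abstract describes for its cohorts.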


XGSEA: CROSS-species Gene Set Enrichment Analysis via domain adaptation

10.1101/2020.07.21.213645 ◽

2020 ◽

Author(s):

Menglan Cai ◽

Canh Hao Nguyen ◽

Hiroshi Mamitsuka ◽

Limin Li

Keyword(s):

Gene Expression ◽

Domain Adaptation ◽

Gene Knockout ◽

Enrichment Analysis ◽

Real Data ◽

Gene Set Enrichment Analysis ◽

Data Sets ◽

Gene Set Enrichment ◽

Gene Set ◽

Gene Sets

Abstract: Gene set enrichment analysis (GSEA) has been widely used to identify gene sets with statistically significant differences between cases and controls against a large gene set. GSEA needs both phenotype labels and gene expression. However, gene expression is assessed more often for model organisms than for minor species. More importantly, gene expression cannot be measured under specific conditions in humans, such as non-approved treatment or gene knockout, because of the high health risk of direct experiments, and is then often substituted by mouse data. Thus, predicting the enrichment significance (for a phenotype) of a given gene set of one species (the target, say human) by using gene expression measured under the same phenotype in another species (the source, say mouse) is a vital and challenging problem, which we call the CROSS-species Gene Set Enrichment Problem (XGSEP). For XGSEP, we propose XGSEA (Cross-species Gene Set Enrichment Analysis), which has three steps: 1) running GSEA for a source species to obtain enrichment scores and p-values of source gene sets; 2) representing the relation between source and target gene sets by domain adaptation; and 3) using regression to predict p-values of target gene sets, based on the representation in 2). We extensively validated XGSEA using four real data sets under various settings, showing that XGSEA significantly outperformed three baseline methods. A case study of identifying important human pathways for T cell dysfunction and reprogramming from mouse ATAC-Seq data further confirmed the reliability of XGSEA. Source code is available through https://github.com/LiminLi-xjtu/XGSEA

Author summary: Gene set enrichment analysis (GSEA) is a powerful tool for differential analysis of gene sets given a ranked gene list. GSEA requires complete data: gene expression with phenotype labels. However, gene expression cannot be measured under specific conditions in humans, such as non-approved treatment or gene knockout, because of the high risk of direct experiments, and is then often substituted by mouse data. Thus, the unavailability of gene expression leads to a more challenging problem, the CROSS-species Gene Set Enrichment Problem (XGSEP), in which the enrichment significance (for a phenotype) of a given gene set of one species (the target, say human) is predicted by using gene expression measured under the same phenotype in another species (the source, say mouse). In this work, we propose XGSEA (Cross-species Gene Set Enrichment Analysis) for XGSEP, with three steps: 1) GSEA; 2) domain adaptation; and 3) regression. Results on four real data sets and a case study indicate that XGSEA significantly outperformed three baseline methods and confirm its reliability.
