Enhancing gene set enrichment using networks

Differential gene expression (DGE) studies often suffer from poor interpretability of their primary results, i.e., thousands of differentially expressed genes. This has led to the introduction of gene set analysis (GSA) methods that aim at identifying interpretable global effects by grouping genes into sets of common context, such as molecular pathways, biological function, or tissue localization. In practice, GSA often results in hundreds of differentially regulated gene sets. Similar to the genes they contain, gene sets are often regulated in a correlative fashion because they share many of their genes or they describe related processes. Using this kind of neighborhood information to construct networks of gene sets makes it possible to identify highly connected sub-networks as well as poorly connected islands or singletons. We show here how topological information and other network features can be used to filter and prioritize gene sets in routine DGE studies. Community detection in combination with automatic labeling and the network representation of gene set clusters further constitute an appealing and intuitive visualization of GSA results. The RICHNET workflow described here does not require human intervention and can thus be conveniently incorporated in automated analysis pipelines.
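The idea of linking gene sets that share genes can be illustrated with a minimal sketch. The abstract does not state which similarity measure or community detection method RICHNET uses, so the Jaccard index, the threshold of 0.3, and the use of connected components as a crude stand-in for community detection are all assumptions made for illustration. The gene and set names are toy placeholders, not real annotations.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def gene_set_network(gene_sets, threshold=0.3):
    """Connect two gene sets when their Jaccard index reaches the threshold.

    The threshold is an assumed, illustrative value.
    """
    adjacency = {name: set() for name in gene_sets}
    for u, v in combinations(gene_sets, 2):
        if jaccard(gene_sets[u], gene_sets[v]) >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def connected_components(adjacency):
    """Group nodes into connected components with a depth-first search."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Toy gene sets with placeholder gene names (hypothetical data).
gene_sets = {
    "set_A": {"G1", "G2", "G3", "G4"},
    "set_B": {"G1", "G3", "G4", "G5"},
    "set_C": {"G6", "G7", "G8"},
}

network = gene_set_network(gene_sets)
clusters = connected_components(network)
singletons = [name for name, neighbours in network.items() if not neighbours]
print(clusters)
print(singletons)
```

In this toy input, set_A and set_B share three of five distinct genes and end up in one cluster, while set_C has no neighbours and is reported as a singleton. A real analysis would likely replace the connected-components step with a proper community detection algorithm.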

F1000Research ◽

10.12688/f1000research.17824.1 ◽

2019 ◽

Vol 8 ◽

pp. 129

Author(s):

Michael Prummer

Keyword(s):

Biological Function ◽

Automated Analysis ◽

Gene Set Analysis ◽

Molecular Pathways ◽

Human Intervention ◽

Gene Set Enrichment ◽

Topological Information ◽

Gene Set ◽

Gene Sets ◽

Differential Gene

Differential Gene Set Enrichment Analysis: a statistical approach to quantify the relative enrichment of two gene sets

Bioinformatics ◽

10.1093/bioinformatics/btaa658 ◽

2020 ◽

Author(s):

James H Joly ◽

William E Lowry ◽

Nicholas A Graham

Keyword(s):

Synthetic Data ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Supplementary Information ◽

Gene Set Enrichment ◽

Gene Set ◽

Transcriptomic Data ◽

Relative Enrichment ◽

Gene Sets ◽

Differential Gene

Abstract. Motivation: Gene Set Enrichment Analysis (GSEA) is an algorithm widely used to identify statistically enriched gene sets in transcriptomic data. However, GSEA cannot examine the enrichment of two gene sets or pathways relative to one another. Here we present Differential Gene Set Enrichment Analysis (DGSEA), an adaptation of GSEA that quantifies the relative enrichment of two gene sets. Results: After validating the method using synthetic data, we demonstrate that DGSEA accurately captures the hypoxia-induced coordinated upregulation of glycolysis and downregulation of oxidative phosphorylation. We also show that DGSEA is more predictive than GSEA of the metabolic state of cancer cell lines, including lactate secretion and intracellular concentrations of lactate and AMP. Finally, we demonstrate the application of DGSEA to generate hypotheses about differential metabolic pathway activity in cellular senescence. Together, these data demonstrate that DGSEA is a novel tool to examine the relative enrichment of gene sets in transcriptomic data. Availability and implementation: DGSEA software and tutorials are available at https://jamesjoly.github.io/DGSEA/. Supplementary information: Supplementary data are available at Bioinformatics online.
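A rough sketch of what "relative enrichment of two gene sets" could look like in code follows. The abstract does not give the DGSEA formula, so this example assumes the simplest reading: compute a GSEA-style weighted running-sum enrichment score for each set and take the difference. The function names, the weighting exponent `p`, and the toy ranked list are illustrative assumptions, and no significance testing is included.

```python
def enrichment_score(ranked, gene_set, p=1.0):
    """GSEA-style running-sum enrichment score.

    ranked: list of (gene, score) pairs sorted by score, descending.
    Hits step up in proportion to |score|**p, misses step down uniformly.
    Returns the signed running-sum value with the largest absolute deviation.
    """
    hit_weights = [abs(s) ** p if g in gene_set else 0.0 for g, s in ranked]
    hit_total = sum(hit_weights)
    n_miss = sum(1 for g, _ in ranked if g not in gene_set)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")
    running, best = 0.0, 0.0
    for (gene, _), weight in zip(ranked, hit_weights):
        running += weight / hit_total if gene in gene_set else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def differential_enrichment(ranked, set_a, set_b, p=1.0):
    """Assumed definition: ES(set_a) minus ES(set_b)."""
    return enrichment_score(ranked, set_a, p) - enrichment_score(ranked, set_b, p)


# Toy ranked list with placeholder gene names (hypothetical data).
ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0), ("F", -3.0)]
pathway_a = {"A", "B"}  # concentrated at the top of the ranking
pathway_b = {"E", "F"}  # concentrated at the bottom of the ranking

print(differential_enrichment(ranked, pathway_a, pathway_b))
```

Here one set sits at the top of the ranking and the other at the bottom, so the difference is large and positive. In practice a permutation-based null distribution would be needed to judge whether such a difference is significant.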

MAVTgsa: An R Package for Gene Set (Enrichment) Analysis

BioMed Research International ◽

10.1155/2014/346074 ◽

2014 ◽

Vol 2014 ◽

pp. 1-11 ◽

Cited By ~ 2

Author(s):

Chih-Yi Chien ◽

Ching-Wei Chang ◽

Chen-An Tsai ◽

James J. Chen

Keyword(s):

Enrichment Analysis ◽

R Package ◽

Ordinary Least Squares ◽

Gene Set Enrichment Analysis ◽

Gene Set Analysis ◽

Gene Set Enrichment ◽

Experimental Conditions ◽

Gene Set ◽

Gene Sets ◽

Significant Difference

Gene set analysis methods aim to determine whether an a priori defined set of genes shows a statistically significant difference in expression on either categorical or continuous outcomes. Although many methods for gene set analysis have been proposed, a systematic analysis tool for identification of different types of gene set significance modules has not been developed previously. This work presents an R package, called MAVTgsa, which includes three different methods for integrated gene set enrichment analysis. (1) The one-sided OLS (ordinary least squares) test detects coordinated changes of genes in a gene set in one direction, either up- or downregulation. (2) The two-sided MANOVA (multivariate analysis of variance) detects changes in both directions, up- and downregulation, for studying two or more experimental conditions. (3) A random forests-based procedure identifies gene sets that can accurately predict samples from different experimental conditions or are associated with continuous phenotypes. MAVTgsa computes the P values and FDR (false discovery rate) q-values for all gene sets in the study. Furthermore, MAVTgsa provides several visualization outputs to support and interpret the enrichment results. This package is available online.

Differential Gene Set Enrichment Analysis: A statistical approach to quantify the relative enrichment of two gene sets

10.1101/860460 ◽

2019 ◽

Author(s):

James H. Joly ◽

William E. Lowry ◽

Nicholas A. Graham

Keyword(s):

Cell Lines ◽

Cancer Cell ◽

Enrichment Analysis ◽

Cancer Cell Lines ◽

Gene Set Enrichment Analysis ◽

Gene Set Enrichment ◽

Gene Set ◽

Relative Enrichment ◽

Gene Sets ◽

Differential Gene

Abstract. Gene Set Enrichment Analysis (GSEA) is an algorithm widely used to identify statistically enriched gene sets in transcriptomic data. However, to our knowledge, there exists no method for examining the enrichment of two gene sets relative to one another. Here, we present Differential Gene Set Enrichment Analysis (DGSEA), an adaptation of GSEA that assesses the relative enrichment of two gene sets. Using the metabolic pathways glycolysis and oxidative phosphorylation as an example, we demonstrate that DGSEA accurately captures the hypoxia-induced shift towards glycolysis. We also show that DGSEA is more predictive than GSEA of the metabolic state of cancer cell lines, including lactate secretion and intracellular concentrations of lactate and AMP. Furthermore, we demonstrate that DGSEA identifies novel metabolic dependencies not found by GSEA in cancer cell lines. Together, these data demonstrate that DGSEA is a novel tool to examine the relative enrichment of two gene sets.

Higher Acid-Base Imbalance Associated with Respiratory Failure Could Decrease the Survival of Patients with Scrub Typhus during Intensive Care Unit Stay: A Gene Set Enrichment Analysis

Journal of Clinical Medicine ◽

10.3390/jcm8101580 ◽

2019 ◽

Vol 8 (10) ◽

pp. 1580 ◽

Cited By ~ 1

Author(s):

Kyoung Min Moon ◽

Kyueng-Whan Min ◽

Mi-Hye Kim ◽

Dong-Hoon Kim ◽

Byoung Kwan Son ◽

...

Keyword(s):

Intensive Care Unit ◽

Intensive Care ◽

Respiratory Failure ◽

Scrub Typhus ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Acid Base ◽

Gene Set Enrichment ◽

Gene Set ◽

Gene Sets

Ninety percent of patients with scrub typhus (SC) with vasculitis-like syndrome recover after mild symptoms; however, 10% can suffer serious complications, such as acute respiratory failure (ARF) and admission to the intensive care unit (ICU). Predictors for the progression of SC have not yet been established, and conventional scoring systems for ICU patients are insufficient to predict severity. We aimed to identify simple and robust indicators to predict aggressive behaviors of SC. We evaluated 91 patients with SC and 81 non-SC patients who were admitted to the ICU, and 32 cases from the public functional genomics data repository for gene expression analysis. We analyzed the relationships between several predictors and clinicopathological characteristics in patients with SC. We performed gene set enrichment analysis (GSEA) to identify SC-specific gene sets. The acid-base imbalance (ABI), measured 24 h before serious complications, was higher in patients with SC than in non-SC patients. A high ABI was associated with an increased incidence of ARF, leading to mechanical ventilation and worse survival. GSEA revealed that SC correlated with gene sets reflecting inflammation/apoptotic response and airway inflammation. ABI can be used to indicate ARF in patients with SC and assist with early detection.

Transcriptional survey of alveolar macrophages in a murine model of chronic granulomatous inflammation reveals common themes with human sarcoidosis

AJP Lung Cellular and Molecular Physiology ◽

10.1152/ajplung.00289.2017 ◽

2018 ◽

Vol 314 (4) ◽

pp. L617-L625 ◽

Cited By ~ 8

Author(s):

Arjun Mohan ◽

Anagha Malur ◽

Matthew McPeek ◽

Barbara P. Barna ◽

Lynn M. Schnapp ◽

...

Keyword(s):

Differentially Expressed Genes ◽

Murine Model ◽

Alveolar Macrophages ◽

Gene Set Enrichment Analysis ◽

Differentially Expressed ◽

Multiwall Carbon Nanotube ◽

Gene Set Enrichment ◽

Integrated Network ◽

Gene Set ◽

Gene Sets

To advance our understanding of the pathobiology of sarcoidosis, we developed a multiwall carbon nanotube (MWCNT)-based murine model that shows marked histological and inflammatory signal similarities to this disease. In this study, we compared the alveolar macrophage transcriptional signatures of our animal model with human sarcoidosis to identify overlapping molecular programs. Whole genome microarrays were used to assess gene expression of alveolar macrophages in six MWCNT-exposed and six control animals. The results were compared with the transcriptional profiles of alveolar immune cells in 15 sarcoidosis patients and 12 healthy humans. Rigorous statistical methods were used to identify differentially expressed genes. To better elucidate activated pathways, integrated network and gene set enrichment analysis (GSEA) was performed. We identified over 1,000 differentially expressed genes between control and MWCNT mice. Gene ontology functional analysis showed overrepresentation of processes primarily involved in immunity and inflammation in MWCNT mice. Applying GSEA to both mouse and human samples revealed upregulation of 92 gene sets in MWCNT mice and 142 gene sets in sarcoidosis patients. Commonly activated pathways in both MWCNT mice and sarcoidosis included adaptive immunity, T-cell signaling, IL-12/IL-17 signaling, and oxidative phosphorylation. Differences in gene set enrichment between MWCNT mice and sarcoidosis patients were also observed. We applied network analysis to differentially expressed genes common between the MWCNT model and sarcoidosis to identify key drivers of disease. In conclusion, an integrated network and transcriptomics approach revealed substantial functional similarities between a murine model and human sarcoidosis, particularly with respect to activation of immune-specific pathways.

Drug perturbation gene set enrichment analysis (dpGSEA): a new transcriptomic drug screening approach

BMC Bioinformatics ◽

10.1186/s12859-020-03929-0 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Mike Fang ◽

Brian Richardson ◽

Cheryl M. Cameron ◽

Jean-Eudes Dazard ◽

Mark J. Cameron

Keyword(s):

Drug Targets ◽

T Regulatory Cells ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Regulatory Cells ◽

Gene Set Enrichment ◽

Gene Set ◽

Gene Sets ◽

Gastroenteropancreatic Neuroendocrine Tumor ◽

Public Datasets

Abstract. Background: In this study, we demonstrate that our modified Gene Set Enrichment Analysis (GSEA) method, drug perturbation GSEA (dpGSEA), can detect phenotypically relevant drug targets through a unique transcriptomic enrichment that emphasizes biological directionality of drug-derived gene sets. Results: We detail our dpGSEA method and show its effectiveness in detecting specific perturbation of drugs in independent public datasets by confirming fluvastatin, paclitaxel, and rosiglitazone perturbation in gastroenteropancreatic neuroendocrine tumor cells. In drug discovery experiments, we found that dpGSEA was able to detect phenotypically relevant drug targets in previously published differentially expressed genes of CD4+ T regulatory cells from immune responders and non-responders to antiviral therapy in HIV-infected individuals, such as those involved with virion replication, cell cycle dysfunction, and mitochondrial dysfunction. dpGSEA is publicly available at https://github.com/sxf296/drug_targeting. Conclusions: dpGSEA is an approach that uniquely enriches on drug-defined gene sets while considering directionality of gene modulation. We recommend dpGSEA as an exploratory tool to screen for possible drug targeting molecules.

Measuring consistency among gene set analysis methods: A systematic study

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720019400109 ◽

2019 ◽

Vol 17 (05) ◽

pp. 1940010 ◽

Cited By ~ 1

Author(s):

Farhad Maleki ◽

Katie L. Ovens ◽

Daniel J. Hogan ◽

Elham Rezaei ◽

Alan M. Rosenberg ◽

...

Keyword(s):

Gene Set Analysis ◽

Rna Seq ◽

Systematic Analysis ◽

Gene Set ◽

Large Gene ◽

Analysis Methods ◽

Gene Sets ◽

Significant Gene ◽

Biological Insight ◽

Relevant Gene

Gene set analysis is a quantitative approach for generating biological insight from gene expression datasets. The abundance of gene set analysis methods speaks to their popularity, but raises the question of the extent to which results are affected by the choice of method. Our systematic analysis of 13 popular methods using 6 different datasets, from both DNA microarray and RNA-Seq origin, shows that this choice matters a great deal. We observed that the overall number of gene sets reported by each method differed by up to 2 orders of magnitude, and there was a bias toward reporting large gene sets with some methods. Furthermore, there was substantial disagreement between the 20 most statistically significant gene sets reported by the methods. This was also observed when expanding to the 100 most statistically significant reported gene sets. For different datasets of the same phenotype/condition, the top 20 and top 100 most significant results also showed little to no agreement even when using the same method. GAGE, PAGE, and ORA were the only methods able to achieve relatively high reproducibility when comparing the 20 and 100 most statistically significant gene sets. Biological validation on a juvenile idiopathic arthritis (JIA) dataset showed wide variation in terms of the relevance of the top 20 and top 100 most significant gene sets to known biology of the disease, where GAGE predicted the most relevant gene sets, followed by GSEA, ORA, and PAGE.
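One simple way to quantify agreement between the top-ranked gene sets of two methods is a Jaccard overlap of their top-k lists. The abstract does not specify the consistency measure used in the study, so the metric, the function name `top_k_overlap`, and the toy p-values below are assumptions for illustration only.

```python
def top_k_overlap(p_values_a, p_values_b, k=20):
    """Jaccard overlap of the k most significant gene sets of two methods.

    p_values_a, p_values_b: dicts mapping gene set name to p-value.
    Returns a value between 0.0 (no shared sets) and 1.0 (identical top-k).
    """
    top_a = set(sorted(p_values_a, key=p_values_a.get)[:k])
    top_b = set(sorted(p_values_b, key=p_values_b.get)[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 0.0


# Hypothetical p-values from two methods on the same dataset.
method_1 = {"set_A": 0.001, "set_B": 0.010, "set_C": 0.200, "set_D": 0.500}
method_2 = {"set_A": 0.300, "set_B": 0.002, "set_C": 0.004, "set_D": 0.900}

print(top_k_overlap(method_1, method_2, k=2))
```

With k=2, the first method reports set_A and set_B and the second reports set_B and set_C, so one of three distinct sets is shared.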

Weighted Kolmogorov Smirnov testing: an alternative for Gene Set Enrichment Analysis

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/sagmb-2014-0077 ◽

2015 ◽

Vol 14 (3) ◽

Cited By ~ 13

Author(s):

Konstantina Charmpi ◽

Bernard Ycart

Keyword(s):

Weight Function ◽

Null Hypothesis ◽

Computing Time ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Test Statistic ◽

Gene Set Enrichment ◽

Gene Set ◽

Gene Sets ◽

Kolmogorov Smirnov

Abstract. Gene Set Enrichment Analysis (GSEA) is a basic tool for genomic data treatment. Its test statistic is based on a cumulated weight function, and its distribution under the null hypothesis is evaluated by Monte-Carlo simulation. Here, it is proposed to subtract from the cumulated weight function its asymptotic expectation, then scale it. Under the null hypothesis, the convergence in distribution of the new test statistic is proved, using the theory of empirical processes. The limiting distribution needs to be computed only once, and can then be used for many different gene sets. This results in large savings in computing time. The test defined in this way has been called the Weighted Kolmogorov Smirnov (WKS) test. Using expression data from the GEO repository, tested against the MSig Database C2, a comparison between the classical GSEA test and the new procedure has been conducted. Our conclusion is that, beyond its mathematical and algorithmic advantages, the WKS test could be more informative in many cases than the classical GSEA test.

GSEA-InContext Explorer: An interactive visualization tool for putting gene set enrichment analysis results into biological context

10.1101/659847 ◽

2019 ◽

Author(s):

Rani K. Powers ◽

Anthony Sun ◽

James C. Costello

Keyword(s):

Statistical Significance ◽

Null Distribution ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Gene Set Enrichment ◽

Gene Set ◽

Link Type ◽

Interactive Interface ◽

Gene Sets ◽

Shiny App

Abstract. Summary: GSEA-InContext Explorer is a Shiny app that allows users to perform two methods of gene set enrichment analysis (GSEA). The first, GSEAPreranked, applies the GSEA algorithm in which statistical significance is estimated from a null distribution of enrichment scores generated for randomly permuted gene sets. The second, GSEA-InContext, incorporates a user-defined set of background experiments to define the null distribution and calculate statistical significance. GSEA-InContext Explorer allows the user to build custom background sets from a compendium of over 5,700 curated experiments, run both GSEAPreranked and GSEA-InContext on their own uploaded experiment, and explore the results using an interactive interface. This tool will allow researchers to visualize gene sets that are commonly enriched across experiments and identify gene sets that are uniquely significant in their experiment, thus complementing current methods for interpreting gene set enrichment results. Availability and implementation: The code for GSEA-InContext Explorer is available at: https://github.com/CostelloLab/GSEA-InContext_Explorer and the interactive tool is at: http://gsea-incontext_explorer.ngrok.io
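The notion of estimating significance from randomly permuted gene sets can be sketched as follows. This is not the GSEAPreranked implementation: to keep the example short it uses the mean gene score as a stand-in statistic rather than the actual enrichment score, and the function names, permutation count, and toy scores are assumptions.

```python
import random


def mean_score(scores, gene_set):
    """Stand-in statistic: mean ranking score of the genes in the set."""
    return sum(scores[g] for g in gene_set) / len(gene_set)


def permutation_p_value(scores, gene_set, n_perm=1000, seed=0):
    """One-sided empirical p-value against random gene sets of equal size.

    The null distribution is built by drawing random gene sets without
    replacement from all scored genes. Adding 1 to the numerator and the
    denominator keeps the estimate away from exactly zero.
    """
    rng = random.Random(seed)
    genes = sorted(scores)
    observed = mean_score(scores, gene_set)
    exceed = 0
    for _ in range(n_perm):
        random_set = rng.sample(genes, len(gene_set))
        if mean_score(scores, random_set) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Toy scores for 20 placeholder genes (hypothetical data): g0 lowest, g19 highest.
scores = {f"g{i}": float(i) for i in range(20)}
top_set = {"g17", "g18", "g19"}

print(permutation_p_value(scores, top_set))
```

A gene set made of the three highest-scoring genes should receive a small p-value, because very few random sets of the same size reach the same mean. GSEA-InContext, as described above, would instead derive the null from a set of background experiments, which this sketch does not attempt.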