GSKB: A gene set database for pathway analysis in mouse

2016 ◽  
Author(s):  
Liming Lai ◽  
Jason Hennessey ◽  
Valerie Bares ◽  
Eun Woo Son ◽  
Yuguang Ban ◽  
...  

Interpretation of high-throughput genomics data based on biological pathways constitutes a constant challenge, partly because of the lack of supporting pathway databases. In this study, we created a functional genomics knowledgebase in mouse, which includes 33,261 pathways and gene sets compiled from 40 sources such as Gene Ontology, KEGG, GeneSetDB, PANTHER, and microRNA and transcription factor target genes. In addition, we manually collected and curated 8,747 lists of differentially expressed genes from 2,526 published gene expression studies to enable the detection of similarity to previously reported gene expression signatures. These two types of data constitute a Gene Set Knowledgebase (GSKB), which can be readily used by various pathway analysis software such as gene set enrichment analysis (GSEA). Using our knowledgebase, we were able to detect the correct microRNA (miR-29) pathway that was suppressed using antisense oligonucleotides and confirmed its role in inhibiting fibrogenesis, which might involve upregulation of transcription factor SMAD3. The knowledgebase can be queried as a source of published gene lists for further meta-analysis. Through meta-analysis of 56 published gene lists related to retina cells, we revealed two fundamentally different types of gene expression changes. One is related to the stress and inflammatory response blamed for causing blindness in many diseases; the other is associated with visual perception by normal retina cells. GSKB is available online at http://ge-lab.org/gs/, and also as a Bioconductor package (gskb, https://bioconductor.org/packages/gskb/). This database enables in-depth interpretation of mouse genomics data both in terms of known pathways and the context of thousands of published expression signatures.
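One common way to score the similarity between a query gene list and a stored gene set is a hypergeometric over-representation test. The abstract above does not say which statistic GSKB or its companion tools use, so the sketch below is only an illustration under that assumption; the function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_tail(universe, set_size, query_size, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    universe   : total number of genes considered (assumed background)
    set_size   : number of genes in the stored gene set
    query_size : number of genes in the query list
    overlap    : observed number of genes shared by both lists
    """
    upper = min(set_size, query_size)
    total = comb(universe, query_size)
    favourable = sum(
        comb(set_size, k) * comb(universe - set_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 20,000 background genes, a 100-gene published list,
# a 200-gene query list, and 10 genes in common.
p_value = hypergeom_tail(20000, 100, 200, 10)
print(f"over-representation p-value: {p_value:.3g}")
```

With these made-up inputs the expected overlap by chance is about one gene, so an overlap of ten yields a very small p-value. In practice the p-values would need correction for the thousands of gene sets tested.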

2017 ◽  
Author(s):  
Mingze He ◽  
Peng Liu ◽  
Carolyn J. Lawrence-Dill

Genome-wide molecular gene expression studies generally compare expression values for each gene across multiple conditions followed by cluster and gene set enrichment analysis to determine whether differentially expressed genes are enriched in specific biochemical pathways, cellular components, biological processes, and/or molecular functions, etc. This approach to analyzing differences in gene expression enables discovery of gene function, but is not useful to determine whether pre-defined groups of genes share or diverge in their expression patterns in response to treatments nor to assess the correctness of pre-defined gene set groupings. Here we present a simple method that changes the dimension of comparison by treating genes as variable traits to directly assess significance of differences in expression levels among pre-defined gene groups. Because expression distributions are typically skewed (thus unfit for direct assessment using Gaussian statistical methods) our method involves transforming expression data to approximate a normal distribution followed by dividing the genes into groups, then applying Gaussian parametric methods to assess significance of observed differences. This method enables the assessment of differences in gene expression distributions within and across samples, enabling hypothesis-based comparison among groups of genes. We demonstrate this method by assessing the significance of specific gene groups’ differential response to heat stress conditions in maize.
Abbreviations: GO, gene ontology; HSP, heat shock protein; KEGG, Kyoto Encyclopedia of Genes and Genomes; HSF TF, heat shock factor transcription factor; HSBP, heat shock binding protein; RNA, ribonucleic acid; TE, transposable element; TF, transcription factor; TPM, transcripts per kilobase millions
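The transform-then-test idea in this abstract can be sketched as follows. The abstract does not name the transformation or the parametric test, so the log2 transform with a pseudocount and Welch's t-test used here are assumptions chosen for illustration, and the TPM values are invented.

```python
import math
from statistics import mean, variance


def log_transform(tpm_values, pseudocount=1.0):
    """Apply log2(x + pseudocount) to reduce the right skew of expression values."""
    return [math.log2(v + pseudocount) for v in tpm_values]


def welch_t(group_a, group_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va = variance(group_a) / len(group_a)
    vb = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical TPM values for two pre-defined gene groups in one sample.
hsp_tpm = [120.0, 340.0, 95.0, 800.0, 210.0]
other_tpm = [5.0, 12.0, 3.0, 40.0, 8.0]

t_stat, dof = welch_t(log_transform(hsp_tpm), log_transform(other_tpm))
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Here each gene is one observation and the groups are the units being compared, which is the change of dimension the abstract describes. A p-value would come from the t distribution with the returned degrees of freedom.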


2011 ◽  
Vol 10 (4) ◽  
pp. 3856-3887 ◽  
Author(s):  
Q.Y. Ning ◽  
J.Z. Wu ◽  
N. Zang ◽  
J. Liang ◽  
Y.L. Hu ◽  
...  

2014 ◽  
Vol 13s1 ◽  
pp. CIN.S13882 ◽  
Author(s):  
Binghuang Cai ◽  
Xia Jiang

Analyzing biological system abnormalities in cancer patients based on measures of biological entities, such as gene expression levels, is an important and challenging problem. This paper applies existing methods, Gene Set Enrichment Analysis and Signaling Pathway Impact Analysis, to pathway abnormality analysis in lung cancer using microarray gene expression data. Gene expression data from studies of Lung Squamous Cell Carcinoma (LUSC) in The Cancer Genome Atlas project, and pathway gene set data from the Kyoto Encyclopedia of Genes and Genomes were used to analyze the relationship between pathways and phenotypes. Results, in the form of pathway rankings, indicate that some pathways may behave abnormally in LUSC. For example, both the cell cycle and viral carcinogenesis pathways ranked very high in LUSC. Furthermore, some pathways that are known to be associated with cancer, such as the p53 and the PI3K-Akt signal transduction pathways, were found to rank high in LUSC. Other pathways, such as bladder cancer and thyroid cancer pathways, were also ranked high in LUSC.


2019 ◽  
Author(s):  
Rani K. Powers ◽  
Anthony Sun ◽  
James C. Costello

Summary: GSEA-InContext Explorer is a Shiny app that allows users to perform two methods of gene set enrichment analysis (GSEA). The first, GSEAPreranked, applies the GSEA algorithm in which statistical significance is estimated from a null distribution of enrichment scores generated for randomly permuted gene sets. The second, GSEA-InContext, incorporates a user-defined set of background experiments to define the null distribution and calculate statistical significance. GSEA-InContext Explorer allows the user to build custom background sets from a compendium of over 5,700 curated experiments, run both GSEAPreranked and GSEA-InContext on their own uploaded experiment, and explore the results using an interactive interface. This tool will allow researchers to visualize gene sets that are commonly enriched across experiments and identify gene sets that are uniquely significant in their experiment, thus complementing current methods for interpreting gene set enrichment results.
Availability and implementation: The code for GSEA-InContext Explorer is available at: https://github.com/CostelloLab/GSEA-InContext_Explorer and the interactive tool is at: http://gsea-incontext_explorer.ngrok.io
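The first of the two null models described above, random gene sets scored against a fixed ranked list, can be sketched in a few lines. This is a simplified, unweighted running-sum statistic written for illustration; it is not the GSEA implementation, which weights hits by their ranking metric, and it does not cover the GSEA-InContext background model. All names and data below are hypothetical.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running sum: step up on set members, down on others.

    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Estimate significance from scores of random gene sets of the same size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = sum(g in gene_set for g in ranked_genes)
    null_scores = [
        enrichment_score(ranked_genes, set(rng.sample(ranked_genes, size)))
        for _ in range(n_perm)
    ]
    extreme = sum(abs(s) >= abs(observed) for s in null_scores)
    # Add-one correction so the estimate is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical ranked list of 20 genes with the gene set at the very top.
ranked = [f"g{i}" for i in range(20)]
top_set = {"g0", "g1", "g2", "g3"}
es, p = permutation_p_value(ranked, top_set, n_perm=200)
print(f"ES = {es:.2f}, permutation p = {p:.4f}")
```

Replacing the random gene sets with scores of the same gene set in a collection of background experiments would move this sketch toward the second, in-context style of null distribution.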


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 2010 ◽  
Author(s):  
Monther Alhamdoosh ◽  
Charity W. Law ◽  
Luyi Tian ◽  
Julie M. Sheridan ◽  
Milica Ng ◽  
...  

Gene set enrichment analysis is a popular approach for prioritising the biological processes perturbed in genomic datasets. The Bioconductor project hosts over 80 software packages capable of gene set analysis. Most of these packages search for enriched signatures amongst differentially regulated genes to reveal higher level biological themes that may be missed when focusing only on evidence from individual genes. With so many different methods on offer, choosing the best algorithm and visualization approach can be challenging. The EGSEA package solves this problem by combining results from up to 12 prominent gene set testing algorithms to obtain a consensus ranking of biologically relevant results.
This workflow demonstrates how EGSEA can extend limma-based differential expression analyses for RNA-seq and microarray data using experiments that profile 3 distinct cell populations important for studying the origins of breast cancer. Following data normalization and set-up of an appropriate linear model for differential expression analysis, EGSEA builds gene signature specific indexes that link a wide range of mouse or human gene set collections obtained from MSigDB, GeneSetDB and KEGG to the gene expression data being investigated. EGSEA is then configured and the ensemble enrichment analysis run, returning an object that can be queried using several S4 methods for ranking gene sets and visualizing results via heatmaps, KEGG pathway views, GO graphs, scatter plots and bar plots. Finally, an HTML report that combines these displays can fast-track the sharing of results with collaborators, and thus expedite downstream biological validation. EGSEA is simple to use and can be easily integrated with existing gene expression analysis pipelines for both human and mouse data.
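The idea of a consensus ranking across several gene set tests can be illustrated with a simple average-rank scheme. This is a generic sketch and not the EGSEA API or necessarily its combining rule; the method names, gene set names, and p-values are all invented for the example.

```python
from statistics import mean


def ranks_from_pvalues(pvalues):
    """Rank gene sets by ascending p-value (1 = most significant)."""
    ordered = sorted(pvalues, key=lambda name: (pvalues[name], name))
    return {name: position + 1 for position, name in enumerate(ordered)}


def consensus_ranking(method_pvalues):
    """Order gene sets by their mean rank across all methods."""
    per_method = [ranks_from_pvalues(p) for p in method_pvalues.values()]
    gene_sets = per_method[0].keys()
    avg_rank = {s: mean(r[s] for r in per_method) for s in gene_sets}
    return sorted(avg_rank, key=lambda s: (avg_rank[s], s))


# Hypothetical p-values from three unnamed gene set testing methods.
results = {
    "method_a": {"cell_cycle": 0.001, "p53": 0.02, "glycolysis": 0.3},
    "method_b": {"cell_cycle": 0.01, "p53": 0.005, "glycolysis": 0.5},
    "method_c": {"cell_cycle": 0.002, "p53": 0.04, "glycolysis": 0.2},
}
print(consensus_ranking(results))
```

Averaging ranks rather than raw p-values keeps one method with unusually small p-values from dominating the combined ordering.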


2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Xinheng Liu ◽  
Yongxian Rong ◽  
Donglin Huang ◽  
Zhijie Liang ◽  
Xiaolin Yi ◽  
...  

Severe burns are acute wounds caused by local heat exposure, resulting in life-threatening systemic effects and poor survival. However, the specific molecular mechanisms remain unclear. First, we downloaded gene expression data related to severe burns from the GEO database (GSE19743, GSE37069, and GSE77791). Then, a gene expression analysis was performed to identify differentially expressed genes (DEGs) and construct a protein-protein interaction (PPI) network. The molecular mechanism was identified by enrichment analysis and Gene Set Enrichment Analysis. In addition, STEM software was used to screen for genes persistently expressed during the response to severe burns, and receiver operating characteristic (ROC) curve analysis was used to identify key DEGs. A total of 2631 upregulated and 3451 downregulated DEGs were identified. PPI network analysis clustered these DEGs into 13 modules. Importantly, module genes were mostly related to immune responses and metabolism. In addition, we identified genes persistently altered during the response to severe burns corresponding to survival and death status. Among the PPI network genes with a high area under the ROC curve, CCL5 and LCK were identified as key DEGs, which may affect the prognosis of burn patients. Gene set variation analysis showed that the immune response was inhibited and several types of immune cells were decreased, while the metabolic response was enhanced. The results showed that persistent gene expression changes occur in response to severe burns, which may underlie chronic alterations in physiological pathways. Identifying the key altered genes may reveal potential therapeutic targets for mitigating the effects of severe burns.


2020 ◽  
Author(s):  
Xiaomei Lei ◽  
Zhijun Feng ◽  
Xiaojun Wang ◽  
Xiaodong He

Background. Exploring alterations in the host transcriptome following SARS-CoV-2 infection is not only highly warranted to help us understand the molecular mechanisms of the disease, but also provides new perspectives for screening effective antiviral drugs, finding new therapeutic targets, and evaluating the risk of systemic inflammatory response syndrome (SIRS) early.
Methods. We downloaded three gene expression matrix files from the Gene Expression Omnibus (GEO) database, extracted the gene expression data for SARS-CoV-2-infected and non-infected human samples and cell line samples, and performed gene set enrichment analysis (GSEA) on each. We then integrated the GSEA results to obtain the co-enriched gene sets and co-core genes across the three microarray datasets. Finally, we constructed a protein-protein interaction (PPI) network and molecular modules for the co-core genes and performed Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis for the module genes to clarify their possible biological processes and underlying signaling pathways.
Results. A total of 11 co-enriched gene sets were identified across the three microarray datasets. Ten gene sets were activated and involved in immune response and inflammatory reaction; one gene set was suppressed and participated in the cell cycle. Analysis of the molecular modules showed that two modules might play a vital role in the pathogenic process of SARS-CoV-2 infection. KEGG enrichment analysis showed that genes from module one were enriched in signaling pathways related to inflammation, whereas genes from module two were enriched in cell cycle and DNA replication signaling. Notably, necroptosis signaling, a newly identified type of programmed cell death that differs from apoptosis, also appeared in our findings. Additionally, in patients with SARS-CoV-2 infection, genes from module one showed relatively high expression while genes from module two showed low expression.
Conclusions. We identified two molecular modules that may be used to assess severity and predict the prognosis of patients with SARS-CoV-2 infection. In addition, these results provide a unique opportunity to explore more molecular pathways as new potential therapeutic targets in COVID-19.


2019 ◽  
Vol 47 (W1) ◽  
pp. W206-W211 ◽  
Author(s):  
Shaojuan Li ◽  
Changxin Wan ◽  
Rongbin Zheng ◽  
Jingyu Fan ◽  
Xin Dong ◽  
...  

Characterizing the ontologies of genes directly regulated by a transcription factor (TF) can help to elucidate the TF’s biological role. Previously, we developed a widely used method, BETA, to integrate TF ChIP-seq peaks with differential gene expression (DGE) data to infer direct target genes. Here, we provide Cistrome-GO, a website implementation of this method with enhanced features to conduct ontology analyses of gene regulation by TFs in human and mouse. Cistrome-GO has two working modes: solo mode for ChIP-seq peak analysis; and ensemble mode, which integrates ChIP-seq peaks with DGE data. Cistrome-GO is freely available at http://go.cistrome.org/.

