A Gene Set Enrichment Analysis of multiomic celiac disease data
Celiac disease is a chronic inflammatory and autoimmune condition. The established treatment is a lifelong gluten-free diet, but it is not fully effective for a high percentage of patients. The aim of this work is to approach the complexity of celiac disease from a bioinformatics point of view. The idea is to survey the celiac disease studies available in the GEO online repository and to revisit them by integrating gene expression data with Gene Ontology (GO) terms. Gene Set Enrichment Analysis (GSEA) is a set of statistical methods that classify genes into groups sharing a common biological function, chromosomal location, or regulation. The work is developed in the R environment, with packages downloaded from the online repository Bioconductor. Because the studies are not standardized, the candidate gene subset is chosen as a trade-off among all the scores; the GO graph is therefore built without relying on Fisher's exact test, while retaining its biological value for defining process clusters. A brief overview of the biological processes involved in each celiac disease study is proposed: GSE11501, peptidyl-tyrosine phosphorylation, phosphatidylinositol 3-kinase signaling, and response to endoplasmic reticulum stress; GSE87629, mitosis regulation, microtubule cytoskeleton organisation, and protein destabilization; GSE72625, interferon-gamma signaling pathway and cellular response to interferon-gamma; GSE61849a, immune response and immune system development; GSE61849b, protein phosphorylation, apoptotic process, and regulation of cell adhesion; GSE76168, cytokine-mediated signaling pathways.
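For context, Fisher's exact test is commonly used to score the over-representation of a GO term in a candidate gene subset. The following is a minimal sketch of the one-sided version of that test, not the procedure used in this work: it is written in Python purely for illustration (the work itself is carried out in R), and the function name `enrichment_p_value` and all gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_annotated, n_selected, n_overlap):
    """One-sided Fisher's exact test for GO term over-representation.

    n_universe  : genes measured in the study (hypothetical count)
    n_annotated : genes in the universe annotated to the GO term
    n_selected  : genes in the candidate subset
    n_overlap   : candidate genes annotated to the GO term

    Returns the probability of observing at least n_overlap annotated
    genes when n_selected genes are drawn at random from the universe
    (upper tail of the hypergeometric distribution).
    """
    max_overlap = min(n_annotated, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Illustrative numbers only, not taken from any GEO study:
# 12000 measured genes, 150 annotated to the term,
# 200 candidate genes, 9 of which carry the annotation.
p = enrichment_p_value(12000, 150, 200, 9)
print(f"p-value for the hypothetical GO term: {p:.3g}")
```

A small p-value suggests that the term is over-represented among the candidates. The test assumes a well-defined candidate subset and gene universe, which is harder to guarantee when the underlying studies are not standardized.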