geneExpressionFromGEO: An R Package to Facilitate Data Reading from Gene Expression Omnibus (GEO)

Author(s):  
Davide Chicco
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Fanyan Meng ◽  
Ningna Du ◽  
Daoming Xu ◽  
Li Kuai ◽  
Lanying Liu ◽  
...  

Ankylosing spondylitis (AS) is an autoimmune disease that mainly affects the spinal joints, sacroiliac joints, and adjacent soft tissues. We conducted a bioinformatics analysis to explore the molecular mechanisms underlying AS pathogenesis and to uncover novel potential molecular targets for its treatment. The GSE25101 profiles, containing gene expression data extracted from the blood of 16 AS patients and 16 matched controls, were acquired from the Gene Expression Omnibus (GEO) database. Background correction and standardization were carried out using the transcripts per million (TPM) method. Comparing AS patients with the control group using the limma R package, we identified 199 upregulated and 121 downregulated differentially expressed genes (DEGs). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Ontology (GO) biological process enrichment analyses revealed that the upregulated DEGs were mainly associated with the spliceosome, the ribosome, RNA catabolic processes, and the electron transport chain, among others, whereas the downregulated DEGs primarily participated in T cell-associated pathways and processes. Analysis of the protein-protein interaction (PPI) network indicated that the hub genes, comprising MRPL13, MRPL22, LSM3, COX7A2, COX7C, EP300, PTPRC, and CD4, could be treatment targets in AS. Our data provide new clues to the features of AS and point to more promising treatment targets.
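The split of DEGs into upregulated and downregulated sets described in this abstract can be illustrated with a minimal Python sketch. The abstract does not state the cutoffs that were used, so the thresholds below (absolute log2 fold change of at least 1, adjusted p-value below 0.05) are assumptions, and the gene names and values are made-up placeholders rather than results from GSE25101.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    symbol: str
    log_fc: float  # log2 fold change, patients vs. controls
    adj_p: float   # multiple-testing-adjusted p-value


def split_degs(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return (upregulated, downregulated) gene symbols.

    The cutoffs are illustrative assumptions, not values from the abstract.
    """
    up, down = [], []
    for r in results:
        # Skip genes that fail either the significance or the effect-size filter.
        if r.adj_p >= p_cutoff or abs(r.log_fc) < lfc_cutoff:
            continue
        (up if r.log_fc > 0 else down).append(r.symbol)
    return up, down


# Placeholder rows, invented for illustration only.
toy_results = [
    GeneResult("GENE_A", 1.8, 0.001),
    GeneResult("GENE_B", -2.1, 0.010),
    GeneResult("GENE_C", 0.3, 0.001),   # significant but small effect
    GeneResult("GENE_D", -1.5, 0.200),  # large effect but not significant
]
up_genes, down_genes = split_degs(toy_results)
print(up_genes, down_genes)
```

In practice the fold changes and adjusted p-values would come from a differential expression tool such as limma; this sketch covers only the final partitioning step.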


2021 ◽  
Author(s):  
Mathias N Stokholm ◽  
Maria B Rabaglino ◽  
Haja N Kadarmideen

Transcriptomic data are often expensive and difficult to generate in large cohorts compared with genomic data. It is therefore often important to integrate multiple transcriptomic datasets, from both microarray and next-generation sequencing (NGS) platforms, across similar experiments or clinical trials to improve analytical power and the discovery of novel transcripts and genes. However, transcriptomic data integration presents several challenges, including re-annotation and batch effect removal. We developed the Gene Expression Data Integration (GEDI) R package to enable transcriptomic data integration by combining existing R packages. With just four functions, the GEDI R package makes constructing a transcriptomic data integration pipeline straightforward. Together, the functions overcome the complications of transcriptomic data integration by automatically re-annotating the data and removing the batch effect. The removal of the batch effect is verified with Principal Component Analysis, and the data integration is verified using a logistic regression model with forward stepwise feature selection. To demonstrate the functionalities of the GEDI package, we integrated five bovine endometrial transcriptomic datasets from the NCBI Gene Expression Omnibus. The datasets included Affymetrix, Agilent, and RNA-sequencing data. Furthermore, we compared the GEDI package with existing tools and found that GEDI is the only tool that provides a full transcriptomic data integration pipeline, including verification of both batch effect removal and data integration.
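To make the idea of batch effect removal concrete, the following Python sketch applies per-batch mean centering to the values of a single gene. This is a deliberately simple stand-in chosen for illustration; it is not the method implemented in GEDI, whose internals are not described in the abstract. The function name, batch labels, and numbers are hypothetical.

```python
from collections import defaultdict


def center_by_batch(values, batches):
    """Subtract each batch's mean from its own samples (one gene at a time).

    A crude illustration of batch correction, not GEDI's algorithm.
    """
    if len(values) != len(batches):
        raise ValueError("values and batches must have the same length")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, batch in zip(values, batches):
        sums[batch] += value
        counts[batch] += 1
    means = {batch: sums[batch] / counts[batch] for batch in sums}
    return [value - means[batch] for value, batch in zip(values, batches)]


# Made-up expression values for one gene measured on two platforms.
expression = [10.0, 12.0, 20.0, 22.0]
platform = ["array", "array", "rnaseq", "rnaseq"]
corrected = center_by_batch(expression, platform)
print(corrected)
```

After centering, the platform-level offset between the two batches is gone while the within-batch differences are preserved. Dedicated batch correction methods additionally model variance differences and protect biological covariates, which this sketch does not attempt.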


2020 ◽  
Vol 36 (15) ◽  
pp. 4301-4308
Author(s):  
Stephan Seifert ◽  
Sven Gundlach ◽  
Olaf Junge ◽  
Silke Szymczak

Motivation: High-throughput technologies allow comprehensive characterization of individuals on many molecular levels. However, training computational models to predict disease status based on omics data is challenging. A promising solution is the integration of external knowledge about structural and functional relationships into the modeling process. We compared four published random forest-based approaches using two simulation studies and nine experimental datasets. Results: The self-sufficient prediction error approach should be applied when large numbers of relevant pathways are expected. The competing methods hunting and learner of functional enrichment should be used when low numbers of relevant pathways are expected or the most strongly associated pathways are of interest. The hybrid approach synthetic features is not recommended because of its high false discovery rate. Availability and implementation: An R package providing functions for data analysis and simulation is available at GitHub (https://github.com/szymczak-lab/PathwayGuidedRF). An accompanying R data package (https://github.com/szymczak-lab/DataPathwayGuidedRF) stores the processed and quality controlled experimental datasets downloaded from Gene Expression Omnibus (GEO). Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Qiaowei Fan ◽  
Lin Guo ◽  
Jingming Guan ◽  
Jing Chen ◽  
Yujing Fan ◽  
...  

Purpose. Gegen Qinlian decoction (GQD) has been used to treat gastrointestinal diseases, such as diarrhea and ulcerative colitis (UC). A recent study demonstrated that GQD enhanced the effect of PD-1 blockade in colorectal cancer (CRC). This study used network pharmacology analysis to investigate the mechanisms of GQD as a potential therapeutic approach against CRC. Materials and Methods. Bioactive chemical ingredients (BCIs) of GQD were collected from the Traditional Chinese Medicine Systems Pharmacology (TCMSP) database. CRC-specific genes were obtained using the gene expression profile GSE110224 from the Gene Expression Omnibus (GEO) database. Target genes related to BCIs of GQD were then screened out. The GQD-CRC ingredient-target pharmacology network was constructed and visualized using Cytoscape software. A protein-protein interaction (PPI) network was subsequently constructed and analyzed with the BisoGenet and CytoNCA plug-ins in Cytoscape. Gene Ontology (GO) functional and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway enrichment analyses for target genes were then performed using the R package clusterProfiler. Results. One hundred and eighteen BCIs were determined to be effective on CRC, including quercetin, wogonin, and baicalein. Twenty corresponding target genes were screened out, including PTGS2, CCNB1, and SPP1. Among these genes, CCNB1 and SPP1 were identified as crucial to the PPI network. A total of 212 GO terms and 6 KEGG pathways were enriched for target genes. Functional analysis indicated that these targets were closely related to pathophysiological processes and pathways such as biosynthetic and metabolic processes of prostaglandins and prostanoids, cytokine and chemokine activities, and the IL-17, TNF, Toll-like receptor, and nuclear factor-kappa B (NF-κB) signaling pathways. Conclusion. The study elucidated the “multi-ingredient, multitarget, and multipathway” mechanisms of GQD against CRC from a systemic perspective, indicating GQD to be a candidate therapy for CRC treatment.


2021 ◽  
Vol 67 (3) ◽  
pp. 195-200
Author(s):  
Elham Kazemi ◽  
Javaad Zargooshi ◽  
Marzieh Kaboudi ◽  
Fereshteh Izadi ◽  
Hamid-Reza Mohammadi Motlagh ◽  
...  

Diabetes can give rise to further diseases and abnormalities, one of which may be erectile dysfunction (ED). ED is a sexual dysfunction characterized by the inability to achieve or maintain an erection during sexual activity and is a complication in men with chronic type 2 diabetes. These processes, disorders, and diseases are highly influenced by the genetics of individuals. In this study, the relationship between genes, diabetes, and ED was explored using a systems biology approach. For this purpose, samples from ten control and diabetic-ED rats were collected. After a search in Gene Expression Omnibus (GEO), the series with accession number GSE2457, comprising 5 normal and 5 diabetic-ED rats, was selected. Raw CEL files of these samples were normalized with the robust multi-array average (RMA) expression measure method using the linear models for microarray data (LIMMA) R package. The extracted probe IDs were transformed into 10451 unique and validated official gene symbols. Then, differentially expressed genes (DEGs) were identified between control and diabetic-ED rats by employing the LIMMA R package. DEGs were assigned to their underlying KEGG pathways using Enrichr. The expression values of DEGs were used to construct a gene regulatory network (GRN) with the GENIE3 R package. To analyze the topology of the constructed GRN, betweenness centrality was calculated. Genes with higher betweenness centrality scores were then identified through CytoNCA. We then took the genes common to the DEGs and the top-ranking genes from CytoNCA, examined through a predicted interaction network built with GeneMANIA, as the most likely important genes in erectile dysfunction. Among the 374 DEGs studied, 146 were up-regulated and 228 were down-regulated in diabetic-ED rats. According to the volcano plot, the dpp4, LOC102553868, Ndufa412, Oxct1, Atp2b3, and Zfp91 genes were down-regulated and the Lpl, Retsat, B4galt1, and Pdk4 genes were up-regulated in ED and diabetic rats. Furthermore, genes like dpp4 acted as hubs in the inferred GRN.
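The betweenness centrality score used above to rank genes in the network can be sketched in plain Python with Brandes' algorithm. This is a minimal illustration that assumes an undirected, unweighted graph; the study computed the scores with CytoNCA, and its inferred network may be directed or weighted. The node names and edges below are hypothetical.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    `adj` maps each node to a list of its neighbours (Brandes' algorithm).
    """
    scores = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in adj}
        n_paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {v: s / 2.0 for v, s in scores.items()}


# Hypothetical toy network: one hub with three neighbours and a short tail.
toy_network = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["c"],
}
centrality = betweenness_centrality(toy_network)
ranked = sorted(centrality, key=centrality.get, reverse=True)
print(ranked[0], centrality)
```

A node scores highly when many shortest paths between other nodes pass through it, which is why hub-like genes tend to rank at the top.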


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Guanyi Wang ◽  
Yibin Jia ◽  
Yuqin Ye ◽  
Enming Kang ◽  
Huijun Chen ◽  
...  

Background: Posterior fossa ependymoma (EPN-PF) can be classified into Group A posterior fossa ependymoma (EPN-PFA) and Group B posterior fossa ependymoma (EPN-PFB) according to DNA CpG island methylation profile status and gene expression. EPN-PFA usually occurs in children younger than 5 years and has a poor prognosis. Methods: Using epigenome and transcriptome microarray data, a multi-component weighted gene co-expression network analysis (WGCNA) was used to systematically identify the hub genes of EPN-PF. We downloaded two microarray datasets (GSE66354 and GSE114523) from the Gene Expression Omnibus (GEO) database. The limma R package was used to identify differentially expressed genes (DEGs), and the ChAMP R package was used to analyze the differentially methylated genes (DMGs) between EPN-PFA and EPN-PFB. GO and KEGG enrichment analyses were performed using the Metascape database. Results: GO and KEGG analyses showed that the genes were significantly enriched in extracellular matrix organization, adaptive immune response, membrane raft, focal adhesion, the NF-kappa B pathway, and axon guidance. Through WGCNA, we found that MEblue had a significant correlation with EPN-PF (R = 0.69, P = 1 × 10^-8) and selected the 180 hub genes in the blue module. By comparing the DEGs, DMGs, and hub genes in the co-expression network, we identified five hypermethylated, lower expressed genes in EPN-PFA (ATP4B, CCDC151, DMKN, SCN4B, and TUBA4B), and three of them were confirmed by IHC. Conclusion: ssGSEA and GSVA analysis indicated that these five hub genes could lead to poor prognosis by inducing hypoxia, PI3K-Akt-mTOR, and TNFα-NFKB pathways. Further study of these dysmethylated hub genes in EPN-PF and the pathways they participate in may provide new ideas for EPN-PF treatment.
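The module-trait correlation reported above (an R value between a module eigengene such as MEblue and the phenotype) can be illustrated as a Pearson correlation in plain Python. This sketch assumes a Pearson coefficient and a 0/1 encoding of the two groups; the abstract does not specify either, and the eigengene values below are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up module eigengene values for six samples and a binary trait
# (1 = one subgroup, 0 = the other); not data from the study.
eigengene = [0.4, 0.3, 0.5, -0.4, -0.3, -0.5]
trait = [1, 1, 1, 0, 0, 0]
r = pearson_r(eigengene, trait)
print(round(r, 3))
```

With a binary trait, this coefficient measures how well the eigengene separates the two groups; values near 1 or -1 indicate a module whose overall expression tracks group membership closely.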


2021 ◽  
Author(s):  
Guanyi Wang ◽  
Yibin Jia ◽  
Yuqing Ye ◽  
Enming Kang ◽  
Huijun Chen ◽  
...  

Background: Posterior fossa ependymoma (EPN-PF) can be classified into Group A posterior fossa ependymoma (EPN-PFA) and Group B posterior fossa ependymoma (EPN-PFB) according to DNA CpG island methylation profile status and gene expression. EPN-PFA usually occurs in children younger than 5 years and has a poor prognosis. Methods: Using epigenome and transcriptome microarray data, a multi-component weighted gene co-expression network analysis (WGCNA) was used to systematically identify the hub genes of EPN-PF. We downloaded two microarray datasets (GSE66354 and GSE114523) from the Gene Expression Omnibus (GEO) database. The limma R package was used to identify differentially expressed genes (DEGs), and the ChAMP R package was used to analyze the differentially methylated genes (DMGs) between EPN-PFA and EPN-PFB. GO and KEGG enrichment analyses were performed using the Metascape database. Results: GO and KEGG analyses showed that the genes were significantly enriched in extracellular matrix organization, adaptive immune response, membrane raft, focal adhesion, the NF-kappa B pathway, and axon guidance. Through WGCNA, we found that MEblue had a significant correlation with EPN-PF (R = 0.69, P = 1 × 10^-8) and selected the 180 hub genes in the blue module. By comparing the DEGs, DMGs, and hub genes in the co-expression network, we identified five hypermethylated, lower expressed genes in EPN-PFA (ATP4B, CCDC151, DMKN, SCN4B, and TUBA4B), and three of them were confirmed by IHC. Conclusion: ssGSEA and GSVA analysis indicated that these five hub genes could lead to poor prognosis by inducing hypoxia, PI3K-Akt-mTOR, and TNFα-NFKB pathways. Further study of these dysmethylated hub genes in EPN-PF and the pathways they participate in may provide new ideas for EPN-PF treatment.


2020 ◽  
Author(s):  
Jie Yang ◽  
Fei Wang ◽  
Baoan Chen

Background: Multiple myeloma (MM) is an incurable hematological tumor that is closely related to the hypoxic bone marrow microenvironment. However, the underlying mechanisms are still far from fully understood. We performed an integrated bioinformatics analysis of expression profile GSE110113, downloaded from the National Center for Biotechnology Information-Gene Expression Omnibus (NCBI-GEO) database, and screened out major histocompatibility complex, class II, DP alpha 1 (HLA-DPA1) as a hub gene related to hypoxia in MM. Methods: Differentially expressed genes (DEGs) were filtered with the R package “limma”. Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed using the “clusterProfiler” package in R. Then, a protein-protein interaction (PPI) network was established. Hub genes were screened out according to Maximal Clique Centrality (MCC). PrognoScan was used to evaluate all the significant hub genes in survival analysis. ScanGEO was used for visualization of gene expression in different clinical studies. P and Cox p values < 0.05 were considered statistically significant. Results: HLA-DPA1 was finally picked out as a hub gene in MM related to hypoxia. MM patients with down-regulated expression of HLA-DPA1 had statistically significantly shorter disease-specific survival (DSS) (Cox p = 0.005411). Based on the clinical data of the GSE47552 dataset, HLA-DPA1 expression was significantly lower in MM patients than in healthy donors (HDs) (p = 0.017). Conclusion: We identified HLA-DPA1 as a hub gene in MM related to hypoxia. Down-regulated HLA-DPA1 expression was associated with poor outcome in MM patients. Further functional and mechanistic studies are needed to investigate HLA-DPA1 as a potential therapeutic target.


BMC Cancer ◽  
2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Jie Yang ◽  
Fei Wang ◽  
Baoan Chen

Background: Multiple myeloma (MM) is an incurable hematological tumor that is closely related to the hypoxic bone marrow microenvironment. However, the underlying mechanisms are still far from fully understood. We performed an integrated bioinformatics analysis of expression profile GSE110113, downloaded from the National Center for Biotechnology Information-Gene Expression Omnibus (NCBI-GEO) database, and screened out major histocompatibility complex, class II, DP alpha 1 (HLA-DPA1) as a hub gene related to hypoxia in MM. Methods: Differentially expressed genes (DEGs) were filtered with the R package “limma”. Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed using the “clusterProfiler” package in R. Then, a protein-protein interaction (PPI) network was established. Hub genes were screened out according to Maximal Clique Centrality (MCC). PrognoScan was used to evaluate all the significant hub genes in survival analysis. ScanGEO was used for visualization of gene expression in different clinical studies. P and Cox p values < 0.05 were considered statistically significant. Results: HLA-DPA1 was finally picked out as a hub gene in MM related to hypoxia. MM patients with down-regulated expression of HLA-DPA1 had statistically significantly shorter disease-specific survival (DSS) (Cox p = 0.005411). Based on the clinical data of the GSE47552 dataset, HLA-DPA1 expression was significantly lower in MM patients than in healthy donors (HDs) (p = 0.017). Conclusion: We identified HLA-DPA1 as a hub gene in MM related to hypoxia. Down-regulated HLA-DPA1 expression was associated with poor outcome in MM patients. Further functional and mechanistic studies are needed to investigate HLA-DPA1 as a potential therapeutic target.


2020 ◽  
Author(s):  
Jie Yang ◽  
Fei Wang ◽  
Baoan Chen

Background: Multiple myeloma (MM) is an incurable hematological tumor that is closely related to the hypoxic bone marrow microenvironment. We performed an integrated bioinformatics analysis of expression profile GSE110113, downloaded from the National Center for Biotechnology Information-Gene Expression Omnibus (NCBI-GEO) database, and screened out major histocompatibility complex, class II, DP alpha 1 (HLA-DPA1) as a hub gene related to hypoxia in MM. Methods: Differentially expressed genes (DEGs) were filtered with the R package “limma”. Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed using the “clusterProfiler” package in R. Then, a protein-protein interaction (PPI) network was established. Hub genes were screened out according to Maximal Clique Centrality (MCC). PrognoScan was used to evaluate all the significant hub genes in survival analysis. ScanGEO was used for visualization of gene expression in different clinical studies. Results: HLA-DPA1 was finally picked out as a hub gene in MM related to hypoxia. MM patients with down-regulated expression of HLA-DPA1 had statistically significantly shorter disease-specific survival (DSS) (Cox p = 0.005411). Based on the clinical data of the GSE47552 dataset, HLA-DPA1 expression was significantly lower in MM patients than in healthy donors (HDs) (p = 0.017). Conclusion: We identified HLA-DPA1 as a hub gene in MM related to hypoxia. Down-regulated HLA-DPA1 expression was associated with poor outcome in MM patients. Further functional and mechanistic studies are needed to investigate HLA-DPA1 as a potential therapeutic target.

