Identification of differentially expressed gene sets using the Generalized Berk–Jones statistic

2019 ◽  
Vol 35 (22) ◽  
pp. 4568-4576 ◽  
Author(s):  
Sheila M Gaynor ◽  
Ryan Sun ◽  
Xihong Lin ◽  
John Quackenbush

Motivation: Cancer genomics studies frequently aim to identify genes that are differentially expressed between clinically distinct patient subgroups, generally by testing single genes one at a time. However, the results of any individual transcriptomic study are often not fully reproducible. A particular challenge impeding statistical analysis is the difficulty of distinguishing between differential expression comprising part of the genomic disease etiology and that induced by downstream effects. More robust analytical approaches that are well-powered to detect potentially causative genes, are less prone to discovering spurious associations, and can deliver reproducible findings across different studies are needed. Results: We propose a set-based procedure for testing of differential expression and show that this set-based approach can produce more robust results by aggregating information across multiple, correlated genomic markers. Specifically, we adapt the Generalized Berk–Jones statistic to test for the transcription factors that may contribute to the progression of estrogen receptor positive breast cancer. We demonstrate the ability of our method to produce reproducible findings by applying the same analysis to 21 publicly available datasets, producing a similar list of significant transcription factors across most studies. Our Generalized Berk–Jones approach produces results that show improved consistency over three set-based testing algorithms: Generalized Higher Criticism, Gene Set Analysis and Gene Set Enrichment Analysis. Availability and implementation: Data are in the MetaGxBreast R package. Code is available at github.com/ryanrsun/gaynor_sun_GBJ_breast_cancer. Supplementary information: Supplementary data are available at Bioinformatics online.

BMC Urology ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Hongjian Wu ◽  
Wubing Jiang ◽  
Guanghua Ji ◽  
Rong Xu ◽  
Gaobo Zhou ◽  
...  

Background: Bladder cancer (BC) is the second most frequent malignancy of the urinary system. The aim of this study was to identify key microRNAs (miRNAs) and hub genes associated with BC as well as analyse their targeted relationships. Methods: Using the microRNA dataset GSE112264 and the gene microarray dataset GSE52519, differentially expressed microRNAs (DEMs) and differentially expressed genes (DEGs) were obtained with the R limma software package. The FunRich software database was used to predict the miRNA-targeted genes. The overlapping common genes (OCGs) between miRNA-targeted genes and DEGs were screened to construct the PPI network. Then, gene ontology (GO) analysis was performed through the “clusterProfiler” and “org.Hs.eg.db” R packages. The differential expression analysis and hierarchical clustering of these hub genes were analysed through the GEPIA and UCSC Cancer Genomics Browser databases, respectively. KEGG pathway enrichment analyses of hub genes were performed through gene set enrichment analysis (GSEA). Results: A total of 12 DEMs and 10 hub genes were identified. Differential expression analysis of the hub genes using the GEPIA database was consistent with the results for the UCSC Cancer Genomics Browser database. The results indicated that these hub genes were oncogenes, with the exception of VCL, TPM2, and TPM1, which were tumour suppressor genes. The GSEA also showed that hub genes were most enriched in those pathways that were closely associated with tumour proliferation and apoptosis. Conclusions: In this study, we built a miRNA-mRNA regulatory targeted network, which furthers the understanding of the pathogenesis of cancer development and provides key evidence for novel targeted treatments for BC.


2008 ◽  
Vol 6 ◽  
pp. CIN.S867 ◽  
Author(s):  
Irina Dinu ◽  
Qi Liu ◽  
John D. Potter ◽  
Adeniyi J. Adewale ◽  
Gian S. Jhangri ◽  
...  

Gene-set analysis of microarray data evaluates biological pathways, or gene sets, for their differential expression by a phenotype of interest. In contrast to the analysis of individual genes, gene-set analysis utilizes existing biological knowledge of genes and their pathways in assessing differential expression. This paper evaluates the biological performance of five gene-set analysis methods testing “self-contained null hypotheses” via subject sampling, along with the most popular gene-set analysis method, Gene Set Enrichment Analysis (GSEA). We use three real microarray analyses in which differentially expressed gene sets are predictable biologically from the phenotype. Two types of gene sets are considered for this empirical evaluation: one type contains “truly positive” sets that should be identified as differentially expressed; and the other type contains “truly negative” sets that should not be identified as differentially expressed. Our evaluation suggests advantages of SAM-GS, Global, and ANCOVA Global methods over GSEA and the other two methods.
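A self-contained null hypothesis tested via subject sampling can be illustrated with a small permutation test. The sketch below is a generic illustration, not an implementation of SAM-GS, Global, ANCOVA Global, or GSEA: the set statistic (a sum of squared Welch t-statistics over the genes in the set) and the function name `gene_set_permutation_pvalue` are assumptions chosen for brevity.

```python
import numpy as np


def gene_set_permutation_pvalue(expr, labels, n_perm=1000, seed=0):
    """Permutation p-value for one gene set under a self-contained null.

    expr   : array of shape (n_genes_in_set, n_samples)
    labels : array of 0/1 phenotype labels, length n_samples
    The null hypothesis is that no gene in the set is associated with
    the phenotype; it is simulated by shuffling the subject labels.
    """
    rng = np.random.default_rng(seed)
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)

    def set_statistic(lab):
        a = expr[:, lab == 1]
        b = expr[:, lab == 0]
        diff = a.mean(axis=1) - b.mean(axis=1)
        se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                     + b.var(axis=1, ddof=1) / b.shape[1])
        t = diff / se  # per-gene Welch t-statistic
        return float(np.sum(t ** 2))  # illustrative set-level summary

    observed = set_statistic(labels)
    # Subject sampling: permute phenotype labels, keep gene columns intact,
    # so the correlation structure among genes in the set is preserved.
    null = np.array([set_statistic(rng.permutation(labels))
                     for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)
```

Because only the sample labels are shuffled, the test uses no genes outside the set, which is what distinguishes a self-contained test from a competitive one.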


Author(s):  
Sven Jacob ◽  
Vindi Jurinovic ◽  
Christopher Lampert ◽  
Elise Pretzsch ◽  
Jörg Kumbrink ◽  
...  

Background: Colorectal cancer (CRC) is the third most common malignancy worldwide, but the key driver of distant metastases is still unknown. This study aimed to elucidate the link between immunosurveillance and organotropism of metastases in CRC by evaluating different gene signatures and pathways. Material and methods: CRC patients undergoing surgery at the Department of General, Visceral and Transplantation Surgery at the Ludwig-Maximilian University Hospital Munich (Munich, Germany) were screened and categorized into M0 (no distant metastases), HEP (liver metastases) and PER (peritoneal carcinomatosis) after a 5-year follow-up. Six patients from each group were randomly selected for a NanoString analysis covering 770 genes. Subsequently, all genes were further analyzed by gene set enrichment analysis (GSEA) based on seven main cancer-associated databases. Results: Comparing HEP vs. M0, the gene set associated with the Toll-like receptor (TLR) cascade defined by the Reactome database was significantly overrepresented in HEP. HSP90B1, MAPKAPK3, PPP2CB, PPP2R1A were identified as the core enrichment genes. The immunologic signature pathway GSE6875_TCONV_VS_FOXP3_KO_TREG_DN with FOXP3 as downstream target was significantly overexpressed in M0. RB1, TMEM100, CFP, ZKSCAN5, DDX50 were the core enrichment genes. Comparing PER vs. M0, no significantly differentially expressed gene signatures were identified. Conclusion: Chronic inflammation might enhance local tumor growth. This is the first study identifying immune-related gene sets differentially expressed between patients with either liver or peritoneal metastases. The present findings suggest that the formation of liver metastases might be associated with TLR-associated pathways. In M0, a high expression of FOXP3+ tumor infiltrating lymphocytes (TILs) seemed to prevent metastases, at least in part. Thus, these correlative findings lay the cornerstone for further studies elucidating the underlying mechanisms of organotropism of metastases.


2020 ◽  
Vol 15 ◽  
Author(s):  
Chen-An Tsai ◽  
James J. Chen

Background: Gene set enrichment analyses (GSEA) provide a useful and powerful approach to identify differentially expressed gene sets with prior biological knowledge. Several GSEA algorithms have been proposed to perform enrichment analyses on groups of genes. However, many of these algorithms have focused on identification of differentially expressed gene sets in a given phenotype. Objective: In this paper, we propose a gene set analytic framework, Gene Set Correlation Analysis (GSCoA), that simultaneously measures within- and between-gene-set variation to identify sets of genes enriched for differential expression and highly co-related pathways. Methods: We apply co-inertia analysis to the comparisons of cross-gene sets in gene expression data to measure the co-structure of expression profiles in pairs of gene sets. Co-inertia analysis (CIA) is a multivariate method to identify trends or co-relationships in multiple datasets that contain the same samples. The objective of CIA is to seek ordinations (dimension reduction diagrams) of two gene sets such that the squared covariance between the projections of the gene sets on successive axes is maximized. Simulation studies illustrate that CIA offers superior performance in identifying co-relationships between gene sets in all simulation settings when compared to correlation-based gene set methods. Result and Conclusion: We also combine between-gene set CIA and GSEA to discover the relationships between gene sets significantly associated with phenotypes. In addition, we provide a graphical technique for visualizing and simultaneously exploring the associations between and within gene sets and their interaction and network. We then demonstrate integration of within- and between-gene-set variation using CIA and GSEA, applied to the p53 gene expression data using the c2 curated gene sets. Ultimately, the GSCoA approach provides an attractive tool for identification and visualization of novel associations between pairs of gene sets by integrating co-relationships between gene sets into gene set analysis.
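The CIA objective described above (axes that maximize the squared covariance between projections of two gene sets measured on the same samples) can be sketched in its simplest form as a singular value decomposition of the cross-covariance matrix. This is a minimal illustration under stated assumptions, not the GSCoA implementation: it uses uniform sample weights and identity metrics, whereas full co-inertia analysis allows weighted ordinations; the function names `co_inertia_axes` and `rv_coefficient` are hypothetical.

```python
import numpy as np


def co_inertia_axes(X, Y, n_axes=2):
    """Unweighted co-inertia axes for two gene sets.

    X : (n_samples, p) expression matrix for gene set A
    Y : (n_samples, q) expression matrix for gene set B (same samples)
    Returns sample scores on each set's axes and the squared
    covariance captured by each pair of axes.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # center each gene
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (n - 1)  # p x q cross-covariance matrix
    # Singular vectors give successive axis pairs; the k-th singular value
    # is the covariance between the k-th pair of projections.
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    scores_x = Xc @ U[:, :n_axes]
    scores_y = Yc @ Vt[:n_axes].T
    return scores_x, scores_y, s[:n_axes] ** 2


def rv_coefficient(X, Y):
    """RV coefficient: a 0-to-1 summary of overall co-structure."""
    Xc = np.asarray(X, dtype=float) - np.mean(X, axis=0)
    Yc = np.asarray(Y, dtype=float) - np.mean(Y, axis=0)
    Sxy = Xc.T @ Yc
    Sxx = Xc.T @ Xc
    Syy = Yc.T @ Yc
    return np.trace(Sxy @ Sxy.T) / np.sqrt(np.trace(Sxx @ Sxx) * np.trace(Syy @ Syy))
```

Plotting the columns of `scores_x` against those of `scores_y` gives a rough analogue of the ordination diagrams mentioned in the abstract; pairs of gene sets with a high RV coefficient share more co-structure.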


2021 ◽  
Vol 22 (12) ◽  
pp. 6644
Author(s):  
Xupeng Zang ◽  
Ting Gu ◽  
Wenjing Wang ◽  
Chen Zhou ◽  
Yue Ding ◽  
...  

The high rate of spontaneous abortion (SAB) in porcine pregnancy is a major concern for commercial pig farming worldwide. Although a perturbed immune response at the maternal–fetal interface is an important mechanism associated with spontaneous embryo loss in the early stages of implantation in pigs, data on the specific regulatory mechanism of SAB at the end stage of implantation remain scant. Therefore, we used high-throughput sequencing and bioinformatics tools to analyze the healthy and arresting endometrium on day 28 of pregnancy. We identified 639 differentially expressed lncRNAs (DELs) and 2357 differentially expressed genes (DEGs) at the end stage of implantation, and qRT-PCR was used to verify the sequencing data. Gene set variation analysis (GSVA), gene set enrichment analysis (GSEA), and immunohistochemistry analysis demonstrated weaker immune response activities in the arresting endometrium compared to the healthy one. Using lasso regression analysis, we screened the DELs and constructed an immunological competitive endogenous RNA (ceRNA) network related to SAB, including 4 lncRNAs, 11 miRNAs, and 13 genes. In addition, BLAST analysis showed the applicability of the constructed ceRNA network in different species, and subsequently identified HOXA-AS2 in pigs. Our study demonstrated, for the first time, that SAB events at the end stage of implantation are associated with the regulation of immunobiological processes, and a specific molecular regulatory network was obtained. These novel findings may provide new insight into the possibility of increasing the litter size of sows, improving pig breeding and thus the efficiency of animal husbandry production.


2018 ◽  
Vol 314 (4) ◽  
pp. L617-L625 ◽  
Author(s):  
Arjun Mohan ◽  
Anagha Malur ◽  
Matthew McPeek ◽  
Barbara P. Barna ◽  
Lynn M. Schnapp ◽  
...  

To advance our understanding of the pathobiology of sarcoidosis, we developed a multiwall carbon nanotube (MWCNT)-based murine model that shows marked histological and inflammatory signal similarities to this disease. In this study, we compared the alveolar macrophage transcriptional signatures of our animal model with human sarcoidosis to identify overlapping molecular programs. Whole genome microarrays were used to assess gene expression of alveolar macrophages in six MWCNT-exposed and six control animals. The results were compared with the transcriptional profiles of alveolar immune cells in 15 sarcoidosis patients and 12 healthy humans. Rigorous statistical methods were used to identify differentially expressed genes. To better elucidate activated pathways, integrated network and gene set enrichment analysis (GSEA) was performed. We identified over 1,000 genes differentially expressed between control and MWCNT mice. Gene ontology functional analysis showed overrepresentation of processes primarily involved in immunity and inflammation in MWCNT mice. Applying GSEA to both mouse and human samples revealed upregulation of 92 gene sets in MWCNT mice and 142 gene sets in sarcoidosis patients. Commonly activated pathways in both MWCNT mice and sarcoidosis included adaptive immunity, T-cell signaling, IL-12/IL-17 signaling, and oxidative phosphorylation. Differences in gene set enrichment between MWCNT mice and sarcoidosis patients were also observed. We applied network analysis to differentially expressed genes common between the MWCNT model and sarcoidosis to identify key drivers of disease. In conclusion, an integrated network and transcriptomics approach revealed substantial functional similarities between a murine model and human sarcoidosis, particularly with respect to activation of immune-specific pathways.


2021 ◽  
Author(s):  
Shahan Mamoor

Breast cancer affects women at relatively high frequency (1). We mined published microarray datasets (2, 3) to determine in an unbiased fashion and at the systems level genes most differentially expressed in the primary tumors of patients with breast cancer. We report here significant differential expression of the gene encoding LIM domain binding 2, LDB2, when comparing primary tumors of the breast to the tissue of origin, the normal breast. LDB2 mRNA was present at significantly lower quantities in tumors of the breast as compared to normal breast tissue. Analysis of human survival data revealed that expression of LDB2 in primary tumors of the breast was correlated with recurrence-free survival in patients with luminal A subtype cancers, demonstrating a relationship between primary tumor expression of a differentially expressed gene and patient survival outcomes influenced by molecular subtype. LDB2 may be of relevance to initiation, maintenance or progression of cancers of the female breast.


2021 ◽  
Author(s):  
Shahan Mamoor

Breast cancer affects women at relatively high frequency (1). We mined published microarray datasets (2, 3) to determine in an unbiased fashion and at the systems level genes most differentially expressed in the primary tumors of patients with breast cancer. We report here significant differential expression of the gene encoding Rho GTPase-activating protein 20, ARHGAP20, when comparing primary tumors of the breast to the tissue of origin, the normal breast. ARHGAP20 mRNA was present at significantly lower quantities in tumors of the breast as compared to normal breast tissue. Analysis of human survival data revealed that expression of ARHGAP20 in primary tumors of the breast was correlated with overall survival in patients with HER2+ subtype cancer, demonstrating a relationship between primary tumor expression of a differentially expressed gene and patient survival outcomes influenced by PAM50 molecular subtype. ARHGAP20 may be of relevance to initiation, maintenance or progression of cancers of the female breast.


2021 ◽  
Author(s):  
Shahan Mamoor

Breast cancer affects women at relatively high frequency (1). We mined published microarray datasets (2, 3) to determine in an unbiased fashion and at the systems level genes most differentially expressed in the primary tumors of patients with breast cancer. We report here significant differential expression of the gene encoding mab-21 like 1, MAB21L1, when comparing primary tumors of the breast to the tissue of origin, the normal breast. MAB21L1 was also differentially expressed in the tumor cells of patients with triple negative breast cancer. MAB21L1 mRNA was present at significantly lower quantities in tumors of the breast as compared to normal breast tissue. Analysis of human survival data revealed that expression of MAB21L1 in primary tumors of the breast was correlated with overall survival in patients with luminal A subtype cancer, demonstrating a relationship between primary tumor expression of a differentially expressed gene and patient survival outcomes influenced by molecular subtype. MAB21L1 may be of relevance to initiation, maintenance or progression of cancers of the female breast.


2021 ◽  
Author(s):  
Shahan Mamoor

Breast cancer affects women at relatively high frequency (1). We mined published microarray datasets (2, 3) to determine in an unbiased fashion and at the systems level genes most differentially expressed in the primary tumors of patients with breast cancer. We report here significant differential expression of the gene encoding betaine--homocysteine S-methyltransferase 2, BHMT2, when comparing primary tumors of the breast to the tissue of origin, the normal breast. BHMT2 mRNA was present at significantly lower quantities in tumors of the breast as compared to normal breast tissue. Analysis of human survival data revealed that expression of BHMT2 in primary tumors of the breast was correlated with distant metastasis-free survival in patients with luminal B subtype cancer, demonstrating a relationship between primary tumor expression of a differentially expressed gene and patient survival outcomes influenced by PAM50 molecular subtype. BHMT2 may be of relevance to initiation, maintenance or progression of cancers of the female breast.

