SNP-based pathway enrichment analysis for genome-wide association studies

Genome-wide association studies (GWAS) are a powerful tool for identifying the pathogenic genes of complex diseases and revealing their genetic structure. However, because of gene-to-gene interactions, they reveal only part of the hereditary factors. Meta-analysis based on GWAS can integrate gene expression data at multiple levels and reveal the complex relationships between genes. We therefore used meta-analysis to integrate GWAS data on sarcoma, build complex networks, and examine their significant genes. First, we built gene interaction networks from data on different sarcoma subtypes and analyzed the node centralities of genes. Second, we calculated a significance score for each gene with the Staged Significant Gene Network Algorithm (SSGNA). Third, we obtained the critical gene set HYC of sarcoma by ranking the scores, and screened it further by combining Gene Ontology enrichment analysis with protein network analysis. Finally, we obtained the critical core gene set Hcore, containing 47 genes, and validated it with GEPIA analysis. The method generalizes to some extent to the study of other complex diseases for which prior knowledge is available, and it is a useful supplement to genome-wide association studies.

GSEA-SNP: applying gene set enrichment analysis to SNP data from genome-wide association studies

Bioinformatics ◽

10.1093/bioinformatics/btn516 ◽

2008 ◽

Vol 24 (23) ◽

pp. 2784-2785 ◽

Cited By ~ 119

Author(s):

Marit Holden ◽

Shiwei Deng ◽

Leszek Wojnowski ◽

Bettina Kulle

Keyword(s):

Association Studies ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Gene Set Enrichment ◽

Gene Set ◽

Snp Data ◽

Genome Wide

INRICH: interval-based enrichment analysis for genome-wide association studies

Bioinformatics ◽

10.1093/bioinformatics/bts191 ◽

2012 ◽

Vol 28 (13) ◽

pp. 1797-1799 ◽

Cited By ~ 170

Author(s):

Phil H. Lee ◽

Colm O'Dushlaine ◽

Brett Thomas ◽

Shaun M. Purcell

Keyword(s):

Association Studies ◽

Enrichment Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide

Systematic Pathway Enrichment Analysis of a Genome-Wide Association Study on Breast Cancer Survival Reveals an Influence of Genes Involved in Cell Adhesion and Calcium Signaling on the Patients’ Clinical Outcome

PLoS ONE ◽

10.1371/journal.pone.0098229 ◽

2014 ◽

Vol 9 (6) ◽

pp. e98229 ◽

Cited By ~ 14

Author(s):

Andrea Woltmann ◽

Bowang Chen ◽

Jesús Lascorz ◽

Robert Johansson ◽

Jorunn E. Eyfjörd ◽

...

Keyword(s):

Breast Cancer ◽

Genome Wide Association Study ◽

Cancer Survival ◽

Enrichment Analysis ◽

Breast Cancer Survival ◽

Genome Wide Association ◽

Pathway Enrichment Analysis ◽

Pathway Enrichment ◽

Genome Wide

Importance of SNP Dependency Correction and Association Integration for Gene Set Analysis in Genome-Wide Association Studies

Frontiers in Genetics ◽

10.3389/fgene.2021.767358 ◽

2021 ◽

Vol 12 ◽

Author(s):

Michal Marczyk ◽

Agnieszka Macioszek ◽

Joanna Tobiasz ◽

Joanna Polanska ◽

Joanna Zyla

Keyword(s):

Association Studies ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Genome Wide Association ◽

Gene Set Analysis ◽

Genome Wide Association Studies ◽

Gene Set Enrichment ◽

Gene Set ◽

Genome Wide

A typical genome-wide association study (GWAS) analyzes millions of single-nucleotide polymorphisms (SNPs), several of which may lie in the region of the same gene. To conduct gene set analysis (GSA), information from SNPs needs to be unified at the gene level. A widely used practice is to use only the most relevant SNP per gene; however, other integration methods could be applied here. In addition, the nonrandom association of alleles at two or more loci, known as linkage disequilibrium (LD), is often neglected. Here, we tested how different integration methods and LD correction affect the performance of several GSA methods. Matched normal and breast cancer samples from The Cancer Genome Atlas database were used to evaluate the performance of six GSA algorithms: Coincident Extreme Ranks in Numerical Observations (CERNO), Gene Set Enrichment Analysis (GSEA), GSEA-SNP, improved GSEA for GWAS (i-GSEA4GWAS), Meta-Analysis Gene-set Enrichment of variaNT Associations (MAGENTA), and Over-Representation Analysis (ORA). Association of SNPs with phenotype was calculated using a modified McNemar’s test. Results for SNPs mapped to the same gene were integrated using the Fisher and Stouffer methods and compared with the minimum p-value method. Four common measures were used to quantify the performance of all combinations of methods. GSA results from GWAS were compared with those obtained from gene expression data. Comparing all evaluation metrics across different GSA algorithms, integrations, and LD correction, we highlighted CERNO, and MAGENTA with Stouffer, as the most efficient. Applying LD correction increased prioritization and specificity of enrichment outcomes for all tested algorithms. When Fisher or Stouffer integration was used with LD correction, sensitivity and reproducibility were also better. Using any integration method was beneficial in comparison with the minimum p-value method in specific combinations. The correlation between GSA results at the genomic and transcriptomic levels was highest when Stouffer integration was combined with LD correction. We thoroughly evaluated different approaches to GSA in GWAS in terms of performance to guide others in selecting the most effective combinations. We showed that LD correction and Stouffer integration can increase the performance of enrichment analysis, and we encourage the use of these techniques.
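The three gene-level integration rules named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the SNP p-values within a gene are independent (so it omits the LD correction the study evaluates), it requires every p-value to lie strictly between 0 and 1, and the gene names and p-values are hypothetical.

```python
import math
from statistics import NormalDist


def min_p(pvalues):
    """Minimum p-value rule: keep only the most significant SNP."""
    return min(pvalues)


def fisher(pvalues):
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k df.

    For an even number of degrees of freedom the chi-square survival
    function has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def stouffer(pvalues):
    """Stouffer's method (unweighted): average the z-scores, rescale by sqrt(k)."""
    nd = NormalDist()
    # By symmetry, the z-score of an upper-tail p-value is -inv_cdf(p).
    z = sum(-nd.inv_cdf(p) for p in pvalues) / math.sqrt(len(pvalues))
    return nd.cdf(-z)  # upper-tail probability of the combined z


# Hypothetical SNP p-values grouped by gene (illustration only).
snp_pvalues = {"GENE_A": [0.01, 0.20, 0.03], "GENE_B": [0.04]}

for gene, pvals in snp_pvalues.items():
    print(gene, min_p(pvals), fisher(pvals), stouffer(pvals))
```

For a gene with a single SNP all three rules return that SNP's p-value; they differ only when several SNPs map to the same gene, which is where the choice of integration matters.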

Identification of candidate genes affecting chronic subclinical mastitis in Norwegian Red cattle: combining genome‐wide association study, topologically associated domains and pathway enrichment analysis

Animal Genetics ◽

10.1111/age.12886 ◽

2019 ◽

Vol 51 (1) ◽

pp. 22-31

Author(s):

E. Kirsanova ◽

B. Heringstad ◽

A. Lewandowska‐Sabat ◽

I. Olsaker

Keyword(s):

Association Study ◽

Candidate Genes ◽

Genome Wide Association Study ◽

Enrichment Analysis ◽

Subclinical Mastitis ◽

Genome Wide Association ◽

Pathway Enrichment Analysis ◽

Pathway Enrichment ◽

Genome Wide ◽

Topologically Associated Domains

Strategies and issues in the detection of pathway enrichment in genome-wide association studies

Human Genetics ◽

10.1007/s00439-009-0676-z ◽

2009 ◽

Vol 126 (2) ◽

pp. 289-301 ◽

Cited By ~ 94

Author(s):

Mun-Gwan Hong ◽

Yudi Pawitan ◽

Patrik K. E. Magnusson ◽

Jonathan A. Prince

Keyword(s):

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Pathway Enrichment ◽

Genome Wide

Genome-Wide Association Study of Maize Aboveground Dry Matter Accumulation at Seedling Stage

Frontiers in Genetics ◽

10.3389/fgene.2020.571236 ◽

2021 ◽

Vol 11 ◽

Author(s):

Xianju Lu ◽

Jinglu Wang ◽

Yongjian Wang ◽

Weiliang Wen ◽

Ying Zhang ◽

...

Keyword(s):

Candidate Genes ◽

Dry Matter ◽

Association Studies ◽

Genome Wide Association ◽

Phenotypic Traits ◽

Pathway Enrichment Analysis ◽

Dry Matter Accumulation ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Maize Varieties

Dry matter accumulation and partitioning during the early phases of development could significantly affect crop growth and productivity. In this study, the aboveground dry matter (DM), the DM of different organs, and partition coefficients of a maize association mapping panel of 412 inbred lines were evaluated at the third and sixth leaf stages (V3 and V6). Further, the properties of these phenotypic traits were analyzed. Genome-wide association studies (GWAS) were conducted on the total aboveground biomass and the DM of different organs. Analysis of GWAS results identified a total of 1,103 unique candidate genes annotated by 678 significant SNPs (P value < 1.28e–6). A total of 224 genes annotated by SNPs at the top five of each GWAS method and detected by multiple GWAS methods were regarded as having high reliability. Pathway enrichment analysis was also performed to explore the biological significance and functions of these candidate genes. Several biological pathways related to the regulation of seed growth, gibberellin-mediated signaling pathway, and long-day photoperiodism were enriched. The results of our study could provide new perspectives on breeding high-yielding maize varieties.
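The abstract does not state which enrichment test was used. One common choice for testing a candidate gene list against a pathway is over-representation analysis with a one-sided hypergeometric test, sketched below as an assumption rather than as the authors' method. The function name and all counts in the usage line are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_candidates, n_overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    n_universe   -- number of annotated genes considered in total
    n_pathway    -- genes in the universe that belong to the pathway
    n_candidates -- candidate genes drawn from the universe
    n_overlap    -- candidate genes that belong to the pathway

    Returns P(X >= n_overlap) when n_candidates genes are drawn at
    random, without replacement, from the universe.
    """
    upper = min(n_pathway, n_candidates)
    total = comb(n_universe, n_candidates)
    tail = sum(
        comb(n_pathway, i) * comb(n_universe - n_pathway, n_candidates - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 annotated genes, a 50-gene pathway,
# 100 candidate genes, 12 of which fall in the pathway.
print(ora_p_value(1000, 50, 100, 12))
```

In practice the test is repeated for every pathway, so the resulting p-values need a multiple-testing correction before any pathway is reported as enriched.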
