Gene-Set Enrichment with Mathematical Biology

2019 ◽  
Author(s):  
Amy L Cochran ◽  
Kenneth Nieser ◽  
Daniel B Forger ◽  
Sebastian Zöllner ◽  
Melvin G McInnis

Abstract Gene-set analyses measure the association between a disease of interest and a set of genes related to a biological pathway. These analyses often incorporate gene network properties to account for the differential contributions of each gene. Extending this concept further, mathematical models of biology can be leveraged to define gene interactions based on biophysical principles by predicting the effects of genetic perturbations on a particular downstream function. We present a method that combines gene weights from model predictions and gene ranks from genome-wide association studies into a weighted gene-set test. Using publicly available summary data from the Psychiatric Genetics Consortium (n=41,653; ~9 million SNPs), we examine an a priori hypothesis that intracellular calcium ion concentrations contribute to bipolar disorder. In this case study, we are able to strengthen inferences from a P-value of 0.081 to 1.7×10⁻⁴ by moving from a general calcium signaling pathway to a specific model-predicted function.
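The abstract does not spell out the test statistic, so the following is only a minimal sketch of one way gene weights and gene ranks could be combined into a weighted gene-set test. The weighted sum of rank scores, the permutation null, and the function name `weighted_gene_set_test` are all assumptions made for illustration, not the authors' method.

```python
import random

def weighted_gene_set_test(ranks, weights, n_perm=10000, seed=0):
    """Permutation p-value for a weighted gene-set statistic (illustrative).

    ranks:   dict gene -> GWAS rank over all genes (1 = strongest association)
    weights: dict gene -> non-negative weight for each gene in the set
    """
    n = len(ranks)
    # Map ranks to scores in (0, 1]; a higher score means stronger association.
    score = {g: 1.0 - (r - 1) / n for g, r in ranks.items()}
    set_genes = [g for g in weights if g in score]
    w = [weights[g] for g in set_genes]
    observed = sum(wi * score[g] for wi, g in zip(w, set_genes))

    # Null distribution: give the same weights to randomly drawn genes.
    rng = random.Random(seed)
    all_scores = list(score.values())
    hits = 0
    for _ in range(n_perm):
        sampled = rng.sample(all_scores, len(set_genes))
        if sum(wi * s for wi, s in zip(w, sampled)) >= observed:
            hits += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)

# Hypothetical input: 200 genes ranked 1..200 and a five-gene weighted set.
ranks = {f"g{i}": i for i in range(1, 201)}
gene_set = {"g1": 3.0, "g2": 2.0, "g3": 1.0, "g4": 1.0, "g5": 1.0}
p_value = weighted_gene_set_test(ranks, gene_set, n_perm=2000)
```

Setting every weight to 1 reduces this to an unweighted rank-based set test, which makes it easy to see how much the weights change the result.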

2009 ◽  
Vol 3 (Suppl 7) ◽  
pp. S95 ◽  
Author(s):  
Melanie Sohns ◽  
Albert Rosenberger ◽  
Heike Bickeböller

2020 ◽  
Vol 10 (7) ◽  
pp. 1776-1784
Author(s):  
Shudong Wang ◽  
Jixiao Wang ◽  
Xinzeng Wang ◽  
Yuanyuan Zhang ◽  
Tao Yi

Genome-wide association studies (GWAS) are powerful tools for identifying pathogenic genes of complex diseases and revealing the genetic structure of diseases. However, due to gene-to-gene interactions, only part of the hereditary factors can be revealed. Meta-analysis based on GWAS can integrate gene expression data at multiple levels and reveal the complex relationships between genes. Therefore, we used meta-analysis to integrate GWAS data of sarcoma to establish complex networks and discuss their significant genes. Firstly, we established gene interaction networks based on the data of different subtypes of sarcoma to analyze the node centralities of genes. Secondly, we calculated the significance score of each gene according to the Staged Significant Gene Network Algorithm (SSGNA). Then, we obtained the critical gene set HYC of sarcoma by ranking the scores, and combined Gene Ontology enrichment analysis and protein network analysis to further screen it. Finally, the critical core gene set Hcore containing 47 genes was obtained and validated by GEPIA analysis. Our method shows some ability to generalize to the study of complex diseases with prior knowledge, and it is a useful supplement to genome-wide association studies.
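The abstract does not define SSGNA or say which centralities were used. As a rough illustration of what "node centrality" means for a gene interaction network, the sketch below computes normalized degree centrality from an undirected edge list. The function name and the gene labels are hypothetical, and this is not the SSGNA score.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Normalized degree centrality for an undirected gene network.

    edges: iterable of (gene_a, gene_b) pairs; self-loops are ignored.
    Returns dict gene -> degree / (number_of_genes - 1).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {g: 0.0 for g in neighbors}
    return {g: len(nb) / (n - 1) for g, nb in neighbors.items()}

# Hypothetical star-shaped network: gene "A" interacts with every other gene.
centrality = degree_centrality([("A", "B"), ("A", "C"), ("A", "D")])
ranked = sorted(centrality, key=centrality.get, reverse=True)
```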


2021 ◽  
Vol 12 ◽  
Author(s):  
Michal Marczyk ◽  
Agnieszka Macioszek ◽  
Joanna Tobiasz ◽  
Joanna Polanska ◽  
Joanna Zyla

A typical genome-wide association study (GWAS) analyzes millions of single-nucleotide polymorphisms (SNPs), several of which are in a region of the same gene. To conduct gene set analysis (GSA), information from SNPs needs to be unified at the gene level. A widely used practice is to use only the most relevant SNP per gene; however, there are other methods of integration that could be applied here. Also, the problem of nonrandom association of alleles at two or more loci is often neglected. Here, we tested the impact of incorporation of different integrations and linkage disequilibrium (LD) correction on the performance of several GSA methods. Matched normal and breast cancer samples from The Cancer Genome Atlas database were used to evaluate the performance of six GSA algorithms: Coincident Extreme Ranks in Numerical Observations (CERNO), Gene Set Enrichment Analysis (GSEA), GSEA-SNP, improved GSEA for GWAS (i-GSEA4GWAS), Meta-Analysis Gene-set Enrichment of variaNT Associations (MAGENTA), and Over-Representation Analysis (ORA). The association of SNPs with phenotype was calculated using a modified McNemar’s test. Results for SNPs mapped to the same gene were integrated using Fisher and Stouffer methods and compared with the minimum p-value method. Four common measures were used to quantify the performance of all combinations of methods. GSA results from GWAS were compared with those from gene expression data. Comparing all evaluation metrics across different GSA algorithms, integrations, and LD correction, we identified CERNO, and MAGENTA with Stouffer integration, as the most efficient. Applying LD correction increased prioritization and specificity of enrichment outcomes for all tested algorithms. When Fisher or Stouffer were used with LD, sensitivity and reproducibility were also better. Using any integration method was beneficial in comparison with a minimum p-value method in specific combinations.
The correlation between GSA results at the genomic and transcriptomic levels was the highest when Stouffer integration was combined with LD correction. We thoroughly evaluated different approaches to GSA in GWAS in terms of performance to guide others to select the most effective combinations. We showed that LD correction and Stouffer integration could increase the performance of enrichment analysis and encourage the usage of these techniques.
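To make the three gene-level integration rules named in this abstract concrete, here is a minimal sketch of Fisher, Stouffer (unweighted), and minimum p-value combination using only the Python standard library. The function names and example p-values are hypothetical. Both Fisher and Stouffer assume independent p-values, so the sketch deliberately omits the LD correction that the abstract describes.

```python
import math
from statistics import NormalDist

def fisher_combine(pvals):
    """Fisher: X = -2 * sum(ln p) is chi-square with 2k df under the null."""
    k = len(pvals)
    half_x = -sum(math.log(p) for p in pvals)
    # Closed-form chi-square survival function for even degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total

def stouffer_combine(pvals):
    """Stouffer: average the per-SNP z-scores, then convert back to a p-value."""
    nd = NormalDist()
    z = sum(-nd.inv_cdf(p) for p in pvals) / math.sqrt(len(pvals))
    return nd.cdf(-z)

def min_p(pvals):
    """Minimum p-value rule: keep only the most significant SNP."""
    return min(pvals)

# Hypothetical SNP-level p-values mapped to one gene (each strictly in (0, 1)).
snp_pvals = [0.04, 0.20, 0.03, 0.50]
gene_level = {
    "fisher": fisher_combine(snp_pvals),
    "stouffer": stouffer_combine(snp_pvals),
    "min_p": min_p(snp_pvals),
}
```

Unlike the two combination rules, the raw minimum p-value is not adjusted for the number of SNPs in the gene, so genes with many SNPs tend to look more significant under that rule.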


2012 ◽  
Vol 16 (2) ◽  
pp. 271-278 ◽  
Author(s):  
Joanna M. Biernacka ◽  
Jennifer Geske ◽  
Gregory D. Jenkins ◽  
Colin Colby ◽  
David N. Rider ◽  
...  

Abstract It is believed that multiple genetic variants with small individual effects contribute to the risk of alcohol dependence. Such polygenic effects are difficult to detect in genome-wide association studies that test for association of the phenotype with each single nucleotide polymorphism (SNP) individually. To overcome this challenge, gene-set analysis (GSA) methods that jointly test for the effects of pre-defined groups of genes have been proposed. Rather than testing for association between the phenotype and individual SNPs, these analyses evaluate the global evidence of association with a set of related genes enabling the identification of cellular or molecular pathways or biological processes that play a role in development of the disease. It is hoped that by aggregating the evidence of association for all available SNPs in a group of related genes, these approaches will have enhanced power to detect genetic associations with complex traits. We performed GSA using data from a genome-wide study of 1165 alcohol-dependent cases and 1379 controls from the Study of Addiction: Genetics and Environment (SAGE), for all 200 pathways listed in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Results demonstrated a potential role of the ‘synthesis and degradation of ketone bodies’ pathway. Our results also support the potential involvement of the ‘neuroactive ligand–receptor interaction’ pathway, which has previously been implicated in addictive disorders. These findings demonstrate the utility of GSA in the study of complex disease, and suggest specific directions for further research into the genetic architecture of alcohol dependence.


Genes ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 1030
Author(s):  
Omobola O. Oluwafemi ◽  
Fadi I. Musfee ◽  
Laura E. Mitchell ◽  
Elizabeth Goldmuntz ◽  
Hongbo M. Xie ◽  
...  

Conotruncal defects with normally related great vessels (CTD-NRGVs) occur in both patients with and without 22q11.2 deletion syndrome (22q11.2DS), but it is unclear to what extent the genetically complex etiologies of these heart defects may overlap across these two groups, potentially involving variation within and/or outside of the 22q11.2 region. To explore this potential overlap, we conducted genome-wide SNP-level, gene-level, and gene set analyses using common variants, separately in each of five cohorts, including two with 22q11.2DS (N = 1472 total cases) and three without 22q11.2DS (N = 935 total cases). Results from the SNP-level analyses were combined in meta-analyses, and summary statistics from these analyses were also used in gene and gene set analyses. Across all these analyses, no association was significant after correction for multiple comparisons. However, several SNPs, genes, and gene sets with suggestive evidence of association were identified. For common inherited variants, we did not identify strong evidence for shared genomic mechanisms for CTD-NRGVs across individuals with and without 22q11.2 deletions. Nevertheless, several of our top gene-level and gene set results have been linked to cardiogenesis and may represent candidates for future work.

