Testing Gene-Gene Interactions Based on a Neighborhood Perspective in Genome-wide Association Studies

Unexplained genetic variation that causes complex diseases is often induced by gene-gene interactions (GGIs). Among the current statistical methodologies for discovering GGIs in case-control genome-wide association studies, gene-based methods are both statistically powerful and biologically interpretable. However, most approaches make assumptions about the form of GGIs, which results in poor statistical performance. We therefore propose a gene-based test built on the maximal neighborhood coefficient (MNC), called gene-based gene-gene interaction through a maximal neighborhood coefficient (GBMNC). MNC is a metric for capturing a wide range of relationships between two random vectors of arbitrary, not necessarily equal, dimensions. We established a statistic that uses the difference between the MNC in case samples and the MNC in control samples as an indication of the existence of GGIs, based on the assumption that the joint distribution of two genes should not differ substantially between cases and controls if there is no interaction between them. We then used a permutation-based test to evaluate this statistic and compute a p-value representing the significance of the interaction. Experimental results on both simulated and real data showed that our approach outperformed earlier methods for detecting GGIs.
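
The case-versus-control permutation test described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the GBMNC implementation: the MNC itself is not defined in this abstract, so the sketch swaps in a simple stand-in dependence score (the largest absolute SNP-to-SNP Pearson correlation between the two genes). The function names, the toy genotypes, and the choice to permute case/control labels are all assumptions made for the example.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def dependence(gene_a, gene_b):
    """Stand-in for MNC (assumption): largest absolute SNP-SNP correlation.

    gene_a and gene_b are lists of samples; each sample is a list of genotype codes.
    """
    cols_a = list(zip(*gene_a))
    cols_b = list(zip(*gene_b))
    return max(abs(pearson(ca, cb)) for ca in cols_a for cb in cols_b)


def interaction_stat(gene_a, gene_b, labels):
    """Absolute difference between the dependence score in cases and in controls."""
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    d_case = dependence([gene_a[i] for i in cases], [gene_b[i] for i in cases])
    d_ctrl = dependence([gene_a[i] for i in controls], [gene_b[i] for i in controls])
    return abs(d_case - d_ctrl)


def permutation_pvalue(gene_a, gene_b, labels, n_perm=200, seed=0):
    """Permute case/control labels and compare the statistic with its observed value."""
    rng = random.Random(seed)
    observed = interaction_stat(gene_a, gene_b, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if interaction_stat(gene_a, gene_b, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): in cases only, one SNP of gene B tracks one SNP of gene A.
data_rng = random.Random(1)
labels = [1] * 100 + [0] * 100
gene_a = [[data_rng.randint(0, 2), data_rng.randint(0, 2)] for _ in labels]
gene_b = [
    [a[0] if y == 1 else data_rng.randint(0, 2), data_rng.randint(0, 2)]
    for a, y in zip(gene_a, labels)
]
observed, p_value = permutation_pvalue(gene_a, gene_b, labels)
```

Any dependence measure between two random vectors could be substituted for `dependence` without changing the permutation logic.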

Gene-Based Testing of Interactions Using XGBoost in Genome-Wide Association Studies

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.801113 ◽

2021 ◽

Vol 9 ◽

Author(s):

Yingjie Guo ◽

Chenxi Wu ◽

Zhian Yuan ◽

Yansu Wang ◽

Zhen Liang ◽

...

Keyword(s):

Association Studies ◽

Real Data ◽

Gene Interaction ◽

Genome Wide Association ◽

Superior Performance ◽

Gene Interactions ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Genome Wide ◽

The Difference

Among the many statistical methods for identifying gene–gene interactions in qualitative genome-wide association studies, gene-based methods are both statistically powerful and biologically interpretable. However, their detection power is limited by the assumptions they make about the association between traits and single nucleotide polymorphisms. We therefore propose a gene-based method derived from XGBoost (GGInt-XGBoost). Assuming that the log odds ratio of the disease trait is additive in the two genes when they do not interact, the difference in error between XGBoost models fitted with and without an additive constraint can indicate a gene–gene interaction; we then used a permutation-based test to assess this difference and to provide a p-value representing the significance of the interaction. Experimental results on both simulated and real data showed that our approach outperformed previous methods in detecting gene–gene interactions.
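
One way to write down the additive null hypothesis and the resulting statistic is given below. The notation is assumed for illustration and is not taken from the article: $Y$ is disease status, $G_1$ and $G_2$ are the genotype vectors of the two genes, $f_1$ and $f_2$ are arbitrary gene-specific functions, $\mathcal{L}$ is a prediction error, and $B$ is the number of permutations.

```latex
% Null hypothesis: no interaction, so the log odds are additive in the two genes
H_0:\quad \log\frac{P(Y = 1 \mid G_1, G_2)}{P(Y = 0 \mid G_1, G_2)} = f_1(G_1) + f_2(G_2)

% Statistic: error of the additively constrained model minus error of the unconstrained model
\Delta = \mathcal{L}_{\text{additive}} - \mathcal{L}_{\text{unconstrained}}

% Permutation p-value from B permuted statistics \Delta^{(1)}, \dots, \Delta^{(B)}
p = \frac{1 + \#\{\, b : \Delta^{(b)} \ge \Delta_{\text{obs}} \,\}}{1 + B}
```

Under $H_0$ the unconstrained model has nothing extra to learn, so $\Delta$ should be close to zero; a large positive $\Delta$ suggests that the additive form is too restrictive.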

Novel Methods for Epistasis Detection in Genome-Wide Association Studies

10.1101/442749 ◽

2018 ◽

Cited By ~ 2

Author(s):

Lotfi Slim ◽

Clément Chatelain ◽

Chloé-Agathe Azencott ◽

Jean-Philippe Vert

Keyword(s):

Randomized Clinical Trials ◽

Association Studies ◽

Real Data ◽

Gene Interaction ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

New Approach ◽

Pairwise Interactions ◽

Genome Wide ◽

Or Gene

More and more genome-wide association studies are being designed to uncover the full genetic basis of common diseases. Nonetheless, the resulting loci are often insufficient to fully recover the observed heritability. Epistasis, or gene-gene interaction, is one of many hypotheses put forward to explain this missing heritability. In the present work, we propose epiGWAS, a new approach for epistasis detection that identifies interactions between a target SNP and the rest of the genome. This contrasts with the classical strategy of epistasis detection through exhaustive pairwise SNP testing. We draw inspiration from causal inference in randomized clinical trials, which allows us to take into account linkage disequilibrium. EpiGWAS encompasses several methods, which we compare to state-of-the-art techniques for epistasis detection on simulated and real data. The promising results empirically demonstrate the benefits of epiGWAS for identifying pairwise interactions.
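
To give a sense of scale for the contrast drawn above, the short calculation below compares the number of candidate SNP pairs in an exhaustive pairwise scan with the number of pairs that involve a single fixed target SNP. The panel size of 500,000 SNPs is a hypothetical figure chosen for illustration, and the function names are not from the paper.

```python
from math import comb


def exhaustive_pairs(n_snps):
    """Number of unordered SNP pairs considered by an exhaustive pairwise scan."""
    return comb(n_snps, 2)


def target_pairs(n_snps):
    """Number of pairs that include one fixed target SNP."""
    return n_snps - 1


n_snps = 500_000  # hypothetical panel size
pairs_all = exhaustive_pairs(n_snps)
pairs_target = target_pairs(n_snps)
ratio = pairs_all / pairs_target  # equals n_snps / 2
```

The ratio grows linearly with the number of SNPs, which is one reason a target-centred search is far cheaper than an exhaustive one.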

Novel methods for epistasis detection in genome-wide association studies

PLoS ONE ◽

10.1371/journal.pone.0242927 ◽

2020 ◽

Vol 15 (11) ◽

pp. e0242927

Author(s):

Lotfi Slim ◽

Clément Chatelain ◽

Chloé-Agathe Azencott ◽

Jean-Philippe Vert

Keyword(s):

Randomized Clinical Trials ◽

Association Studies ◽

Real Data ◽

Gene Interaction ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

New Approach ◽

Pairwise Interactions ◽

Genome Wide ◽

Or Gene

The search for gene-gene interactions in genome-wide association studies: challenges in abundance of methods, practical considerations, and biological interpretation

Annals of Translational Medicine ◽

10.21037/atm.2018.04.05 ◽

2018 ◽

Vol 6 (8) ◽

pp. 157-157 ◽

Cited By ~ 19

Author(s):

Marylyn D. Ritchie ◽

Kristel Van Steen

Keyword(s):

Association Studies ◽

Genome Wide Association ◽

Gene Interactions ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Biological Interpretation

Use of the Multivariate Discriminant Analysis for Genome-Wide Association Studies in Cattle

Animals ◽

10.3390/ani10081300 ◽

2020 ◽

Vol 10 (8) ◽

pp. 1300 ◽

Cited By ~ 1

Author(s):

Elisabetta Manca ◽

Alberto Cesarani ◽

Giustino Gaspa ◽

Silvia Sorbolini ◽

Nicolò P.P. Macciotta ◽

...

Keyword(s):

Discriminant Analysis ◽

Association Studies ◽

Real Data ◽

Genome Wide Association ◽

Stepwise Discriminant Analysis ◽

Genome Wide Association Studies ◽

Multivariate Method ◽

Genome Wide ◽

Single Marker ◽

Multivariate Gwas

Genome-wide association studies (GWAS) are traditionally carried out using the single marker regression model, which often leads to very few associations when only a small number of individuals is involved. Bayesian methods, such as BayesR, have obtained encouraging results when applied to GWAS. However, these approaches require that an a priori posterior inclusion probability threshold be fixed, thus arbitrarily affecting the obtained associations. To partially overcome these problems, a multivariate statistical algorithm was proposed. The basic idea was that animals with different phenotypic values of a specific trait carry different allelic combinations for the genes involved in its determinism. Three multivariate techniques were used to highlight the differences between the individuals assembled in high and low phenotype groups: canonical discriminant analysis, discriminant analysis, and stepwise discriminant analysis. The multivariate method was tested on both simulated and real data. The results of the simulation study showed that the multivariate GWAS detected a greater number of truly associated single nucleotide polymorphisms (SNPs) and quantitative trait loci (QTLs) than the single marker model and the Bayesian approach. For example, with 3000 animals, the traditional GWAS highlighted only 29 significantly associated markers and 13 QTLs, whereas the multivariate method found 127 associated SNPs and 65 QTLs. The gap between the two approaches slowly decreased as the number of animals increased. The Bayesian method gave worse results than the other two. On average, with the real data, the multivariate GWAS found 108 associated markers for each trait under study, and around 63% of these SNPs were also found by the single marker approach. Among the top 118 associated markers, 76 SNPs harbored putative candidate genes.
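
The idea of separating high and low phenotype groups by their marker genotypes can be illustrated with a two-group Fisher linear discriminant on two markers. This is a minimal sketch and not the article's pipeline: the genotype values are invented, the function name is hypothetical, and the three techniques named above are not reproduced. With two groups, the discriminant direction is the inverse pooled within-group scatter matrix applied to the difference of the group means.

```python
def fisher_direction(high, low):
    """Fisher discriminant direction for two markers: w = Sw^-1 (mean_high - mean_low)."""

    def mean_vec(rows):
        n = len(rows)
        return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

    def scatter(rows, m):
        # Entries (s00, s01, s11) of the symmetric 2x2 within-group scatter matrix.
        s00 = sum((r[0] - m[0]) ** 2 for r in rows)
        s01 = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
        s11 = sum((r[1] - m[1]) ** 2 for r in rows)
        return s00, s01, s11

    m_high, m_low = mean_vec(high), mean_vec(low)
    s00, s01, s11 = (a + b for a, b in zip(scatter(high, m_high), scatter(low, m_low)))
    det = s00 * s11 - s01 ** 2
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    d0, d1 = m_high[0] - m_low[0], m_high[1] - m_low[1]
    # Closed-form inverse of the 2x2 pooled scatter matrix applied to the mean difference.
    return (s11 * d0 - s01 * d1) / det, (-s01 * d0 + s00 * d1) / det


# Invented genotypes (0/1/2 allele counts) for two markers in each phenotype group.
high_group = [[2, 1], [2, 0], [1, 1], [2, 2], [1, 0], [2, 1]]
low_group = [[0, 1], [1, 0], [0, 2], [0, 1], [1, 1], [0, 0]]
w_marker1, w_marker2 = fisher_direction(high_group, low_group)
```

In this toy example only the first marker differs between groups, so its weight dominates. How markers were actually declared associated in the study is not specified in this abstract.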

Rapid testing of gene-gene interactions in genome-wide association studies of binary and quantitative phenotypes

Genetic Epidemiology ◽

10.1002/gepi.20629 ◽

2011 ◽

Vol 35 (8) ◽

pp. 800-808 ◽

Cited By ~ 7

Author(s):

Kanishka Bhattacharya ◽

Mark I. McCarthy ◽

Andrew P. Morris

Keyword(s):

Association Studies ◽

Genome Wide Association ◽

Gene Interactions ◽

Genome Wide Association Studies ◽

Rapid Testing ◽

Genome Wide

A fast algorithm for detecting gene–gene interactions in genome-wide association studies

The Annals of Applied Statistics ◽

10.1214/14-aoas771 ◽

2014 ◽

Vol 8 (4) ◽

pp. 2292-2318 ◽

Cited By ~ 16

Author(s):

Jiahan Li ◽

Wei Zhong ◽

Runze Li ◽

Rongling Wu

Keyword(s):

Fast Algorithm ◽

Association Studies ◽

Genome Wide Association ◽

Gene Interactions ◽

Genome Wide Association Studies ◽

Genome Wide

An Exploration of Gene-Gene Interactions and Their Effects on Hypertension

International Journal of Genomics ◽

10.1155/2017/7208318 ◽

2017 ◽

Vol 2017 ◽

pp. 1-9 ◽

Cited By ~ 7

Author(s):

Ying Meng ◽

Susan Groth ◽

Jill R. Quinn ◽

John Bisognano ◽

Tong Tong Wu

Keyword(s):

Interaction Analysis ◽

Association Studies ◽

Independent Set ◽

Gene Interaction ◽

Single Locus ◽

Gene Interactions ◽

Genome Wide Association Studies ◽

Original Cohort ◽

Genome Wide ◽

Heart Study

Hypertension tends to run in families, and its heritability is estimated to be around 20–60%. So far, most of this heritability has not been explained by single-locus genome-wide association studies. Therefore, the current study explored gene-gene interactions that have the potential to partially fill in the missing heritability. A two-stage discovery-confirmatory analysis was carried out in the Framingham Heart Study cohorts. The first stage was an exhaustive pairwise search performed in 2320 early-onset hypertensive cases with matched normotensive controls from the offspring cohort. Then, the identified gene-gene interactions were assessed in an independent set of 694 subjects from the original cohort. Four unique gene-gene interactions were found to be related to hypertension. Three detected genes had been recognized by previous studies, and the other 5 loci/genes (MAN1A1, LMO3, NPAP1/SNRPN, DNAL4, and RNA5SP455/KRT8P5) were novel findings, which had no strong main effect on hypertension and could not be easily identified by single-locus genome-wide studies. Including the identified gene-gene interactions also explained more of the variance in hypertension. Overall, our study provides evidence that genome-wide gene-gene interaction analysis has the potential to identify new susceptibility genes, which can provide more insight into the genetic background of blood pressure regulation.

Identification of Critical Core Genes of Sarcoma Based on Centrality Analysis of Networks Nodes

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2020.3080 ◽

2020 ◽

Vol 10 (7) ◽

pp. 1776-1784

Author(s):

Shudong Wang ◽

Jixiao Wang ◽

Xinzeng Wang ◽

Yuanyuan Zhang ◽

Tao Yi

Keyword(s):

Association Studies ◽

Meta Analysis ◽

Complex Diseases ◽

Enrichment Analysis ◽

Gene Interaction ◽

Core Gene ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Gene Set ◽

Genome Wide

Genome-wide association studies (GWAS) are powerful tools for identifying pathogenic genes of complex diseases and revealing the genetic structure of diseases. However, due to gene-to-gene interactions, only a part of the hereditary factors can be revealed. Meta-analysis based on GWAS can integrate gene expression data at multiple levels and reveal the complex relationships between genes. Therefore, we used meta-analysis to integrate GWAS data of sarcoma, establish complex networks, and examine their significant genes. First, we established gene interaction networks based on the data of different subtypes of sarcoma to analyze the node centralities of genes. Second, we calculated the significance score of each gene according to the Staged Significant Gene Network Algorithm (SSGNA). We then obtained the critical gene set HYC of sarcoma by ranking the scores and combined Gene Ontology enrichment analysis and protein network analysis to screen it further. Finally, the critical core gene set Hcore containing 47 genes was obtained and validated by GEPIA analysis. Our method generalizes to some degree to the study of complex diseases with prior knowledge, and it is a useful supplement to genome-wide association studies.
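
As a rough illustration of the node-centrality step, the sketch below computes degree centrality for a small gene interaction network and ranks the genes by it. It does not implement SSGNA, whose scoring is not described in this abstract. Degree centrality is used here only as the simplest centrality measure, and the gene labels and edges are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality: number of distinct neighbours divided by (n_nodes - 1)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {gene: 0.0 for gene in neighbors}
    return {gene: len(nb) / (n - 1) for gene, nb in neighbors.items()}


# Hypothetical undirected gene interaction network.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]
scores = degree_centrality(edges)

# Rank genes from most to least central, breaking ties alphabetically.
ranked = sorted(scores, key=lambda gene: (-scores[gene], gene))
```

Other centrality measures (betweenness, closeness, eigenvector) could be substituted for the scoring function while keeping the same ranking step.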
