SEAGLE: A Scalable Exact Algorithm for Large-Scale Set-Based Gene-Environment Interaction Tests in Biobank Data

2021 ◽  
Vol 12 ◽  
Author(s):  
Jocelyn T. Chi ◽  
Ilse C. F. Ipsen ◽  
Tzu-Hung Hsiao ◽  
Ching-Heng Lin ◽  
Li-San Wang ◽  
...  

The explosion of biobank data offers unprecedented opportunities for gene-environment interaction (G×E) studies of complex diseases because of the large sample sizes and the rich collection of genetic and non-genetic information. However, the extremely large sample size also introduces new computational challenges in G×E assessment, especially for set-based G×E variance component (VC) tests, which are a widely used strategy to boost overall G×E signals and to evaluate the joint G×E effect of multiple variants from a biologically meaningful unit (e.g., gene). In this work, we focus on continuous traits and present SEAGLE, a Scalable Exact AlGorithm for Large-scale set-based G×E tests, to permit G×E VC tests for biobank-scale data. SEAGLE employs modern matrix computations to calculate the test statistic and p-value of the G×E VC test in a computationally efficient fashion, without imposing additional assumptions or relying on approximations. SEAGLE can easily accommodate sample sizes on the order of 10^5, is implementable on standard laptops, and does not require specialized computing equipment. We demonstrate the performance of SEAGLE using extensive simulations. We illustrate its utility by conducting genome-wide gene-based G×E analysis on the Taiwan Biobank data to explore the interaction of gene and physical activity status on body mass index.
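
For orientation, a set-based G×E VC test for a continuous trait is often written as a linear mixed model of the following form. This is only a sketch of one common formulation and is not quoted from the abstract; all symbols are assumed notation: y is the n-vector of trait values, X holds the covariates (including an intercept and the environmental factor E), G is the n × L genotype matrix for the variant set, and diag(E)G holds the L interaction terms.

```latex
\begin{align*}
y &= X\alpha + G\beta + \operatorname{diag}(E)\,G\,c + \varepsilon, \\
\beta &\sim N(0,\ \tau I_L), \qquad c \sim N(0,\ \nu I_L), \qquad \varepsilon \sim N(0,\ \sigma^2 I_n), \\
H_0 &: \nu = 0 .
\end{align*}
```

Under this formulation, testing ν = 0 asks whether the L interaction effects are jointly zero, which is why the cost of the test grows with the sample size n.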

2020 ◽  
Author(s):  
Benjamin W. Domingue ◽  
Klint Kanopka ◽  
Travis T. Mallard ◽  
Sam Trejo ◽  
Elliot M. Tucker-Drob

Genotype-by-environment interaction (GxE) occurs when the size of a genetic effect varies systematically across levels of the environment and when the size of an environmental effect varies systematically across levels of the genotype. However, total variance in the phenotype may shift as a function of the moderator irrespective of its etiology such that the proportional effect of the predictor is constant. We expand the traditional GxE regression model to directly account for heteroscedasticity associated with both the genotype and the measured environment. We then derive a test statistic, ξ, for inferring whether GxE can be attributed to an effect of the moderator on the dispersion of the phenotype. We apply this method to identify genotype-by-birth year interactions for Body Mass Index (BMI) that are distinguishable from general secular increases in the variance of BMI or associations of the genetic predictors (both PGS and individual loci) with BMI variance. We provide software for analyzing such models.
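
As a point of reference, the traditional GxE regression model that the abstract above says it expands can be sketched as an ordinary least squares fit with a product term. This is a minimal illustration under stated assumptions, not the authors' software: the function names (`fit_gxe`, `simulate`, `solve_linear_system`), the simulated coefficients, and the noise-free data are all hypothetical, and the sketch does not implement the heteroscedasticity extension or the ξ statistic.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_gxe(y, g, e):
    """OLS fit of y = b0 + bG*g + bE*e + bGxE*(g*e).

    Returns [b0, bG, bE, bGxE]; bGxE is the interaction coefficient.
    """
    rows = [[1.0, gi, ei, gi * ei] for gi, ei in zip(g, e)]
    p = 4
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def simulate(n=500, seed=0):
    """Hypothetical noise-free data with known coefficients (1.0, 0.5, 0.3, 0.2)."""
    rng = random.Random(seed)
    g = [float(rng.choice([0, 1, 2])) for _ in range(n)]  # genotype dosage
    e = [rng.gauss(0.0, 1.0) for _ in range(n)]           # measured environment
    y = [1.0 + 0.5 * gi + 0.3 * ei + 0.2 * gi * ei for gi, ei in zip(g, e)]
    return y, g, e


y, g, e = simulate()
coefs = fit_gxe(y, g, e)
print(coefs)
```

Because this model assumes a constant residual variance, a change in phenotypic variance across the moderator can show up in the product-term coefficient, which is the confound the abstract's expanded model is meant to separate out.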


2020 ◽  
Vol 9 (10) ◽  
pp. 3109
Author(s):  
Carine Salliot ◽  
Yann Nguyen ◽  
Marie-Christine Boutron-Ruault ◽  
Raphaèle Seror

Background: Rheumatoid arthritis (RA) is a complex disease in which environmental agents are thought to interact with genetic factors, leading to the triggering of autoimmunity. Methods: We reviewed environmental, hormonal, and dietary factors that have been suggested to be associated with the risk of RA. Results: Smoking is the most robust factor associated with the risk of RA, with a clear gene–environment interaction. Among other inhalants, silica may increase the risk of RA in men. There is less evidence for pesticides, pollution, and other occupational inhalants. Regarding female hormonal exposures, there is some epidemiological evidence, although not consistent in the literature, to suggest a link between hormonal factors and the risk of RA. Regarding dietary factors, available evidence is conflicting. A high consumption of coffee seems to be associated with an increased risk of RA, whereas a moderate consumption of alcohol is inversely associated with the risk of RA, and there is less evidence regarding other food groups. Dietary pattern analyses (Mediterranean diet, the inflammatory potential of the diet, or diet quality) suggested a potential benefit of dietary modifications for individuals at high risk of RA. Conclusion: To date, smoking and silica exposure have been reproducibly demonstrated to trigger the emergence of RA. However, many other environmental factors have been studied, mostly with a case-control design. Results were conflicting and studies rarely considered potential gene–environment interactions. There is a need for large-scale prospective studies and studies in predisposed individuals to better understand and prevent the disease and its course.


Author(s):  
Kenneth E. Westerman ◽  
Duy T. Pham ◽  
Liang Hong ◽  
Ye Chen ◽  
Magdalena Sevilla-González ◽  
...  

Motivation: Gene-environment interaction (GEI) studies are a general framework that can be used to identify genetic variants that modify the effects of environmental, physiological, lifestyle, or treatment exposures on complex traits. Moreover, accounting for GEIs can enhance our understanding of the genetic architecture of complex diseases. However, commonly used statistical software programs for GEI studies are either not applicable to testing certain types of GEI hypotheses or have not been optimized for use in large samples. Results: Here, we develop a new software program, GEM (Gene-Environment interaction analysis in Millions of samples), which supports the inclusion of multiple GEI terms, adjustment for GEI covariates, and robust inference, while allowing multi-threading to reduce computation time. GEM can conduct GEI tests as well as joint tests of genetic effects for both continuous and binary phenotypes. Through simulations, we demonstrate that GEM scales to millions of samples while addressing limitations of existing software programs. We additionally conduct a gene-sex interaction analysis on waist-hip ratio in 352,768 unrelated individuals from the UK Biobank, identifying 39 novel loci in the joint test that have not previously been reported in combined or sex-specific analyses. Our results demonstrate that GEM can facilitate the next generation of large-scale GEI studies and help advance our understanding of genomic contributions to complex traits. Availability: GEM is freely available as an open source project at https://github.com/large-scale-gxe-methods/. Contact: [email protected], [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Xinyu Wang ◽  
Elise Lim ◽  
Ching-Ti Liu ◽  
Yun Ju Sung ◽  
Dabeeru C. Rao ◽  
...  

Complex human diseases are affected by genetic and environmental risk factors and their interactions. Gene-environment interaction (GEI) tests for aggregate genetic variant sets have been developed in recent years. However, existing statistical methods become rate-limiting for large biobank-scale sequencing studies with correlated samples. We propose efficient Mixed-model Association tests for GEne-Environment interactions (MAGEE), for testing GEI between an aggregate variant set and environmental exposures on quantitative and binary traits in large-scale sequencing studies with related individuals. Joint tests for the aggregate genetic main effects and GEI effects are also developed. A null generalized linear mixed model adjusting for covariates but without any genetic effects is fit only once in a whole-genome GEI analysis, thereby vastly reducing the overall computational burden. Score tests for variant sets are performed as a combination of genetic burden and variance component tests by accounting for the genetic main effects using matrix projections. The computational complexity is dramatically reduced in a whole-genome GEI analysis, which makes MAGEE scalable to hundreds of thousands of individuals. We applied MAGEE to the exome sequencing data of 41,144 related individuals from the UK Biobank, and the analysis of 18,970 protein-coding genes finished within 10.4 CPU hours.


1997 ◽  
Vol 78 (01) ◽  
pp. 457-461 ◽  
Author(s):  
S E Humphries ◽  
A Panahloo ◽  
H E Montgomery ◽  
F Green ◽  
J Yudkin
