Efficient gene-environment interaction tests for large biobank-scale sequencing studies

2020 ◽  
Author(s):  
Xinyu Wang ◽  
Elise Lim ◽  
Ching-Ti Liu ◽  
Yun Ju Sung ◽  
Dabeeru C. Rao ◽  
...  

ABSTRACT Complex human diseases are affected by genetic and environmental risk factors and their interactions. Gene-environment interaction (GEI) tests for aggregate genetic variant sets have been developed in recent years. However, existing statistical methods become rate limiting for large biobank-scale sequencing studies with correlated samples. We propose efficient Mixed-model Association tests for GEne-Environment interactions (MAGEE), for testing GEI between an aggregate variant set and environmental exposures on quantitative and binary traits in large-scale sequencing studies with related individuals. Joint tests for the aggregate genetic main effects and GEI effects are also developed. A null generalized linear mixed model adjusting for covariates but without any genetic effects is fit only once in a whole genome GEI analysis, thereby vastly reducing the overall computational burden. Score tests for variant sets are performed as a combination of genetic burden and variance component tests by accounting for the genetic main effects using matrix projections. The computational complexity is dramatically reduced in a whole genome GEI analysis, which makes MAGEE scalable to hundreds of thousands of individuals. We applied MAGEE to the exome sequencing data of 41,144 related individuals from the UK Biobank, and the analysis of 18,970 protein coding genes finished within 10.4 CPU hours.
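The "fit the null model once, then adjust for genetic main effects by projection" idea in the abstract above can be illustrated with a small sketch. This is not the MAGEE implementation: it assumes unrelated samples and an ordinary linear model instead of a generalized linear mixed model, it shows only a burden-type interaction test, and every name in it (`residualize`, `burden_gei_score_test`, the simulated data) is hypothetical.

```python
import math
import numpy as np

def residualize(A, Z):
    """Project the column space of Z out of A via least squares."""
    coef, *_ = np.linalg.lstsq(Z, A, rcond=None)
    return A - Z @ coef

def burden_gei_score_test(y, X, G, E, weights=None):
    """Burden-type GxE score test for one variant set (illustrative only).

    y: trait (n,), X: covariates incl. intercept and exposure (n, q),
    G: genotypes (n, p), E: exposure (n,).
    """
    n, p = G.shape
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)
    # Null model with covariates only; in a genome-wide scan this step
    # would be done once and reused for every variant set.
    r = residualize(y, X)
    sigma2 = r @ r / (n - X.shape[1])  # simplification: null-model variance
    # Interaction columns, with covariates and genetic main effects projected out.
    K = G * E[:, None]
    K_adj = residualize(K, np.column_stack([X, G]))
    U = K_adj.T @ r                    # per-variant interaction scores
    V = sigma2 * (K_adj.T @ K_adj)     # their covariance under the null
    stat = (w @ U) ** 2 / (w @ V @ w)  # 1-df burden statistic
    pval = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return U, stat, pval

# Simulated example data (assumed values, no real study data).
rng = np.random.default_rng(0)
n, p = 500, 5
E = rng.normal(size=n)
X = np.column_stack([np.ones(n), E, rng.normal(size=n)])
G = rng.binomial(2, 0.2, size=(n, p)).astype(float)
y = X @ np.array([1.0, 0.5, -0.3]) + G @ rng.normal(scale=0.1, size=p) + rng.normal(size=n)

U, stat, pval = burden_gei_score_test(y, X, G, E)

# Same scores obtained by refitting the model with G main effects included.
U_refit = (G * E[:, None]).T @ residualize(y, np.column_stack([X, G]))
```

In this linear-model setting the scores computed from the covariate-only residuals match those from a model refitted with the genetic main effects, because the covariate columns lie inside the space that is projected out. That is why no per-gene refit is needed in this simplified case.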

Author(s):  
Shuo Jiao

This chapter presents set-based approaches that focus on identifying G×E interactions rather than set-based approaches that are based primarily on detecting G main effects (e.g., via marginal effects). The author reviews both his own research and the development of his Set Based Gene EnviRonment InterAction test (SBERIA), as well as another set-based G×E approach referred to as GESAT. GESAT extends the variance component test of the SNP-set Kernel Association Test (SKAT) to evaluate G×E effects while incorporating the main SNP effects as covariates. While both of these approaches (SBERIA and GESAT) have outperformed other benchmark methods (e.g., likelihood ratio test) and have been demonstrated to retain the appropriate Type 1 error rate, in this chapter the author conducts simulation studies to compare the SBERIA and GESAT approaches and identifies the strengths and limitations of each method.


Author(s):  
Andrey Ziyatdinov ◽  
Jihye Kim ◽  
Dmitry Prokopenko ◽  
Florian Privé ◽  
Fabien Laporte ◽  
...  

Abstract The effective sample size (ESS) is a metric used to summarize in a single term the amount of correlation in a sample. It is of particular interest when predicting the statistical power of genome-wide association studies (GWAS) based on linear mixed models. Here, we introduce an analytical form of the ESS for mixed-model GWAS of quantitative traits and relate it to empirical estimators recently proposed. Using our framework, we derived approximations of the ESS for analyses of related and unrelated samples and for both marginal genetic and gene-environment interaction tests. We conducted simulations to validate our approximations and to provide a quantitative perspective on the statistical power of various scenarios, including power loss due to family relatedness and power gains due to conditioning on the polygenic signal. Our analyses also demonstrate that the power of gene-environment interaction GWAS in related individuals strongly depends on the family structure and exposure distribution. Finally, we performed a series of mixed-model GWAS on data from the UK Biobank and confirmed the simulation results. We notably found that the expected power drop due to family relatedness in the UK Biobank is negligible.


2021 ◽  
Vol 12 ◽  
Author(s):  
Jocelyn T. Chi ◽  
Ilse C. F. Ipsen ◽  
Tzu-Hung Hsiao ◽  
Ching-Heng Lin ◽  
Li-San Wang ◽  
...  

The explosion of biobank data offers unprecedented opportunities for gene-environment interaction (G×E) studies of complex diseases because of the large sample sizes and the rich collection of genetic and non-genetic information. However, the extremely large sample size also introduces new computational challenges in G×E assessment, especially for set-based G×E variance component (VC) tests, which are a widely used strategy to boost overall G×E signals and to evaluate the joint G×E effect of multiple variants from a biologically meaningful unit (e.g., gene). In this work, we focus on continuous traits and present SEAGLE, a Scalable Exact AlGorithm for Large-scale set-based G×E tests, to permit G×E VC tests for biobank-scale data. SEAGLE employs modern matrix computations to calculate the test statistic and p-value of the G×E VC test in a computationally efficient fashion, without imposing additional assumptions or relying on approximations. SEAGLE can easily accommodate sample sizes in the order of 10^5, is implementable on standard laptops, and does not require specialized computing equipment. We demonstrate the performance of SEAGLE using extensive simulations. We illustrate its utility by conducting genome-wide gene-based G×E analysis on the Taiwan Biobank data to explore the interaction of gene and physical activity status on body mass index.


2020 ◽  
Vol 9 (10) ◽  
pp. 3109
Author(s):  
Carine Salliot ◽  
Yann Nguyen ◽  
Marie-Christine Boutron-Ruault ◽  
Raphaèle Seror

Background: Rheumatoid arthritis (RA) is a complex disease in which environmental agents are thought to interact with genetic factors that lead to triggering of autoimmunity. Methods: We reviewed environmental, hormonal, and dietary factors that have been suggested to be associated with the risk of RA. Results: Smoking is the most robust factor associated with the risk of RA, with a clear gene–environment interaction. Among other inhalants, silica may increase the risk of RA in men. There is less evidence for pesticides, pollution, and other occupational inhalants. Regarding female hormonal exposures, there is some epidemiological evidence, although not consistent in the literature, to suggest a link between hormonal factors and the risk of RA. Regarding dietary factors, available evidence is conflicting. A high consumption of coffee seems to be associated with an increased risk of RA, whereas a moderate consumption of alcohol is inversely associated with the risk of RA, and there is less evidence regarding other food groups. Dietary pattern analyses (Mediterranean diet, the inflammatory potential of the diet, or diet quality) suggested a potential benefit of dietary modifications for individuals at high risk of RA. Conclusion: To date, smoking and silica exposure have been reproducibly demonstrated to trigger the emergence of RA. However, many other environmental factors have been studied, mostly with a case-control design. Results were conflicting and studies rarely considered potential gene–environment interactions. There is a need for large scale prospective studies and studies in predisposed individuals to better understand and prevent the disease and its course.


2019 ◽  
Vol 104 (2) ◽  
pp. 260-274 ◽  
Author(s):  
Han Chen ◽  
Jennifer E. Huffman ◽  
Jennifer A. Brody ◽  
Chaolong Wang ◽  
Seunggeun Lee ◽  
...  

2014 ◽  
Vol 27 (3) ◽  
pp. 725-746 ◽  
Author(s):  
Jay Belsky ◽  
Daniel A. Newman ◽  
Keith F. Widaman ◽  
Phil Rodkin ◽  
Michael Pluess ◽  
...  

Abstract Here we tested whether there was genetic moderation of effects of early maternal sensitivity on social–emotional and cognitive–linguistic development from early childhood onward and whether any detected Gene × Environment interaction effects proved consistent with differential-susceptibility or diathesis–stress models of Person × Environment interaction (N = 695). Two new approaches for evaluating models were employed with 12 candidate genes. Whereas maternal sensitivity proved to be a consistent predictor of child functioning across the primary-school years, candidate genes did not show many main effects, nor did they tend to interact with maternal sensitivity/insensitivity. These findings suggest that the developmental benefits of early sensitive mothering and the costs of insensitive mothering look more similar than different across genetically different children in the current sample. Although acknowledgement of this result is important, it is equally important that the generally null Gene × Environment results reported here not be overgeneralized to other samples, other predictors, other outcomes, and other candidate genes.


2020 ◽  
Vol 44 (8) ◽  
pp. 908-923
Author(s):  
Xinyu Wang ◽  
Elise Lim ◽  
Ching‐Ti Liu ◽  
Yun Ju Sung ◽  
Dabeeru C. Rao ◽  
...  

2018 ◽  
Author(s):  
Andy Dahl ◽  
Na Cai ◽  
Jonathan Flint ◽  
Noah Zaitlen

Abstract Gene-environment interaction (GxE) is a well-known source of non-additive inheritance. GxE can be important in applications ranging from basic functional genomics to precision medical treatment. Further, GxE effects elude inherently linear LMMs and may explain missing heritability. We propose a simple, unifying mixed model for polygenic interactions (GxEMM) to capture the aggregate effect of small GxE effects spread across the genome. GxEMM extends existing LMMs for GxE in two important ways. First, it extends to arbitrary environmental variables, not just categorical groups. Second, GxEMM can estimate and test for environment-specific heritability. In simulations where the assumptions of existing methods do not hold, we show that GxEMM improves estimates of ordinary and GxE heritability and increases power to test for polygenic GxE. We then use GxEMM to prove that the heritability of major depression (MD) is reduced by stress, which we previously conjectured but could not prove with prior methods, and that a tail of polygenic GxE effects remains unexplained by MD GWAS.

