Testing Gene-Environment Interactions in the Presence of Confounders and Mismeasured Environmental Exposures

Author(s):  
Chao Cheng ◽  
Donna Spiegelman ◽  
Zuoheng Wang ◽  
Molin Wang

Abstract Interest in investigating gene-environment (GxE) interactions has increased rapidly over the last decade. Although GxE interactions have been extensively investigated in large studies, few such effects have been identified and replicated, highlighting the need to develop GxE tests with greater statistical power. The reverse test has been proposed for testing the interaction between a continuous exposure and genetic variants in relation to a binary disease outcome. It leverages the idea of linear discriminant analysis and significantly increases statistical power compared with the standard logistic regression approach. However, the reverse test did not adjust for confounders. Since GxE interaction studies are inherently non-experimental, adjusting for potential confounding effects is critical for valid evaluation of GxE interactions. In this paper, we extend the reverse test to allow for confounders. The proposed reverse test also allows for exposure measurement error, which typically occurs. Extensive simulation experiments demonstrated that the proposed method not only provides greater statistical power under most simulation scenarios but also offers substantial computational efficiency, with a computation time more than sevenfold lower than that of the standard logistic regression test. In an illustrative example, we applied the proposed approach to the Veterans Aging Cohort Study (VACS) to search for genetic susceptibility loci modifying the smoking–HIV status association.
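
The link between the reverse test and linear discriminant analysis can be illustrated with a short derivation. This is only a sketch under simplifying assumptions that the abstract does not spell out: the exposure $E$, given disease status $D=d$ and a binary-coded genotype $G=g$, is taken to be normal with mean $\mu_{dg}$ and common variance $\sigma^2$, and confounders and measurement error are ignored. The symbols $\gamma$, $\alpha_g$ and $\beta_{GE}$ are introduced here for illustration.

```latex
% Reverse (linear) model for the exposure:
%   E = \gamma_0 + \gamma_G G + \gamma_D D + \gamma_{GD}\, G D + \varepsilon,
%   \varepsilon \sim N(0, \sigma^2),
% so that \mu_{1g} - \mu_{0g} = \gamma_D + \gamma_{GD}\, g.
\begin{align*}
\operatorname{logit} P(D=1 \mid E=e, G=g)
  &= \log\frac{P(D=1 \mid G=g)}{P(D=0 \mid G=g)}
   + \log\frac{f(e \mid D=1, G=g)}{f(e \mid D=0, G=g)} \\
  &= \alpha_g + \frac{\mu_{1g} - \mu_{0g}}{\sigma^2}\, e
   = \alpha_g + \frac{\gamma_D + \gamma_{GD}\, g}{\sigma^2}\, e, \\
\alpha_g &= \log\frac{P(D=1 \mid G=g)}{P(D=0 \mid G=g)}
   - \frac{\mu_{1g}^2 - \mu_{0g}^2}{2\sigma^2}.
\end{align*}
% The G x E coefficient of the implied logistic model is
%   \beta_{GE} = \gamma_{GD} / \sigma^2,
% so \beta_{GE} = 0 if and only if \gamma_{GD} = 0.
```

Under these assumptions, testing the GxE interaction in the logistic model is equivalent to testing the $G \times D$ coefficient in a linear regression of the exposure, which is one way to see why a reverse regression can be both cheaper to fit and more powerful.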

2018 ◽  
Author(s):  
Alfred Pozarickij ◽  
Cathy Williams ◽  
Pirro Hysi ◽  
Jeremy A. Guggenheim ◽  

Abstract Refractive error is a complex ocular trait controlled by genetic and environmental factors. Genome-wide association studies (GWAS) have identified approximately 150 genetic variants associated with refractive error. Among the known environmental factors, education, near-work and time spent outdoors have been demonstrated to have the strongest associations. Currently, the extent of gene-environment or gene-gene interactions in myopia is unknown. Here we show that the majority of genetic variants associated with refractive error show evidence of effect size heterogeneity, which is a hallmark feature of genetic interactions. Using conditional quantile regression, we observed that 88% of genetic variants associated with refractive error have at least nominally significant non-uniform, non-linear profiles across the refractive error distribution. SNP effects tend to be strongest at the phenotype extremes and have weaker effects in emmetropes. A parsimonious explanation for these findings is that gene-environment or gene-gene interactions in refractive error are pervasive.

Author summary The prevalence of myopia (nearsightedness) in the United States and East Asia has almost doubled in the past 30 years. Such a rapid rise in prevalence cannot be explained by genetics, which implies that environmental (lifestyle) risk factors play a major role. Nevertheless, diverse approaches have suggested that genetics is also important, and indeed approximately 150 distinct genetic risk loci for myopia have been discovered to date. One attractive explanation for the evidence implicating both genes and environment in myopia is gene-environment (GxE) interaction (a difference in genetic effect in individuals exposed to a high vs. low level of an environmental risk factor). Past studies aiming to discover GxE interactions in myopia have met with limited success, perhaps because information on lifestyle exposures during childhood has rarely been available. Here we used an agnostic approach that does not require information about specific lifestyle exposures in order to detect ‘signatures’ of GxE interaction. We found compelling evidence for widespread genetic interactions in myopia, with 88% of 150 known myopia genetic susceptibility loci showing an interaction signature. These findings suggest that GxE interactions in myopia are pervasive.


Author(s):  
Andrew R. Marderstein ◽  
Emily Davenport ◽  
Scott Kulm ◽  
Cristopher V. Van Hout ◽  
Olivier Elemento ◽  
...  

Abstract While thousands of loci have been associated with human phenotypes, the role of gene-environment (GxE) interactions in determining individual risk of human diseases remains unclear. This is partly due to the severe erosion of statistical power resulting from the massive number of statistical tests required to detect such interactions. Here, we focus on improving the power of GxE tests by developing a statistical framework for assessing quantitative trait loci (QTLs) associated with the trait means and/or trait variances. When applying this framework to body mass index (BMI), we find that GxE discovery and replication rates are significantly higher when prioritizing genetic variants associated with the variance of the phenotype (vQTLs) compared to assessing all genetic variants. Moreover, we find that vQTLs are enriched for associations with other non-BMI phenotypes having strong environmental influences, such as diabetes or ulcerative colitis. We show that GxE effects first identified in quantitative traits such as BMI can be used for GxE discovery in disease phenotypes such as diabetes. A clear conclusion is that strong GxE interactions mediate the genetic contribution to body weight and diabetes risk.


Author(s):  
Andrey Ziyatdinov ◽  
Jihye Kim ◽  
Dmitry Prokopenko ◽  
Florian Privé ◽  
Fabien Laporte ◽  
...  

Abstract The effective sample size (ESS) is a metric used to summarize in a single term the amount of correlation in a sample. It is of particular interest when predicting the statistical power of genome-wide association studies (GWAS) based on linear mixed models. Here, we introduce an analytical form of the ESS for mixed-model GWAS of quantitative traits and relate it to empirical estimators recently proposed. Using our framework, we derived approximations of the ESS for analyses of related and unrelated samples and for both marginal genetic and gene-environment interaction tests. We conducted simulations to validate our approximations and to provide a quantitative perspective on the statistical power of various scenarios, including power loss due to family relatedness and power gains due to conditioning on the polygenic signal. Our analyses also demonstrate that the power of gene-environment interaction GWAS in related individuals strongly depends on the family structure and exposure distribution. Finally, we performed a series of mixed-model GWAS on data from the UK Biobank and confirmed the simulation results. We notably found that the expected power drop due to family relatedness in the UK Biobank is negligible.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
I. E. Ceyisakar ◽  
N. van Leeuwen ◽  
Diederik W. J. Dippel ◽  
Ewout W. Steyerberg ◽  
H. F. Lingsma

Abstract Background There is growing interest in assessing the quality of hospital care based on outcome measures. Many quality of care comparisons rely on binary outcomes, for example mortality rates. Due to low numbers, the observed differences in outcome are partly subject to chance. We aimed to quantify the gain in efficiency from ordinal instead of binary outcome analyses for hospital comparisons. We analyzed patients with traumatic brain injury (TBI) and stroke as examples. Methods We sampled patients from two trials. We simulated ordinal and dichotomous outcomes based on the modified Rankin Scale (stroke) and Glasgow Outcome Scale (TBI) in scenarios with and without true differences between hospitals in outcome. The potential efficiency gain of ordinal outcomes, analyzed with ordinal logistic regression, compared to dichotomous outcomes, analyzed with binary logistic regression, was expressed as the possible reduction in sample size while keeping the same statistical power to detect outliers. Results In the IMPACT study (9578 patients in 265 hospitals, mean number of patients per hospital = 36), analysis of the ordinal scale rather than the dichotomized scale (‘unfavorable outcome’) allowed for up to 32% fewer patients in the analysis without a loss of power. In the PRACTISE trial (1657 patients in 12 hospitals, mean number of patients per hospital = 138), ordinal analysis allowed for 13% fewer patients. Compared to mortality, ordinal outcome analyses allowed for up to 37% to 63% fewer patients. Conclusions Ordinal analyses provide the statistical power of substantially larger studies that have been analyzed with dichotomized endpoints. We advise exploiting ordinal outcome measures for hospital comparisons in order to increase efficiency in quality of care measurements. Trial registration We do not report the results of a health care intervention.


BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yang Wang ◽  
Xiaojuan Men ◽  
Yongxue Gu ◽  
Huidong Wang ◽  
Zhicai Xu

Abstract Background To date, limited research has focused on the association between single nucleotide polymorphisms (SNPs) of the transcription factor 7-like 2 (TF7L2) gene and breast cancer (BC) risk. The aim of this study was to evaluate the associations between TF7L2 and BC risk in a Chinese Han population. Methods A logistic regression model was used to test the association between polymorphisms and BC risk. Strength of association was evaluated by odds ratio (OR) and 95% confidence interval (CI). Generalized multifactor dimensionality reduction (GMDR) was applied to analyze SNP-SNP and gene-environment interactions. Results Logistic regression analysis indicated that BC risk was significantly higher in carriers of the rs1225404 C allele than in TT genotype carriers (TC or CC versus TT), adjusted OR (95% CI) = 1.40 (1.09–1.72). Additionally, carriers of the rs7903146 T allele had a significantly higher risk of BC than those with the CC genotype (CT or TT versus CC), adjusted OR (95% CI) = 1.44 (1.09–1.82). The GMDR model was used to examine the effect of interactions among 4 SNPs and environmental factors on BC risk. We identified a significant two-locus model (p = 0.0100) including rs1225404 and abdominal obesity, suggesting a potential gene–environment interaction between rs1225404 and abdominal obesity. The cross-validation consistency of the two-locus model was 10 of 10, and the testing accuracy was 0.632. Compared with subjects with normal waist circumference (WC) and the rs1225404 TT genotype, abdominally obese subjects with the rs1225404 TC or CC genotype had the highest BC risk; after covariate adjustment, the OR (95% CI) was 2.23 (1.62–2.89). Haplotype analysis indicated that the haplotype containing the rs1225404-T and rs7903146-C alleles was associated with higher BC risk. Conclusions The C allele of rs1225404, the T allele of rs7903146, the interaction between rs1225404 and abdominal obesity, and the rs1225404-T and rs7903146-C haplotype were all related to increased BC risk.
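
To make the reported quantities concrete, the sketch below computes an odds ratio and a 95% confidence interval from a 2x2 table using the standard Wald (Woolf) formula on the log scale. It is only an illustration: the study reports covariate-adjusted ORs from logistic regression, whereas this version is unadjusted, and the function name `odds_ratio_ci` and all cell counts are hypothetical rather than taken from the paper.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    a: carriers among cases       b: carriers among controls
    c: non-carriers among cases   d: non-carriers among controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not data from the study).
odds_ratio, lower, upper = odds_ratio_ci(a=120, b=90, c=80, d=110)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1, as in the adjusted estimates quoted above, is what supports describing an association as statistically significant at the 5% level.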


2004 ◽  
Vol 03 (02) ◽  
pp. 265-279 ◽  
Author(s):  
STAN LIPOVETSKY ◽  
MICHAEL CONKLIN

Comparative contribution of predictors in multivariate statistical models is widely used for decision making on the importance of the variables for the aims of analysis and prediction. However, the analysis can be made difficult by multicollinearity among the predictors, which distorts the coefficient estimates in the linear aggregate. To solve the problem of robustly evaluating the predictors' contributions, we apply Shapley Value regression analysis, which provides consistent results in the presence of multicollinearity for both regression and discriminant functions. We also show how the linear discriminant function can be constructed as a multiple regression, and how logistic regression can be approximated by linear regression, which helps to obtain the variables' contributions in the linear aggregate.
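
The idea of a Shapley Value decomposition can be sketched as follows. This is a minimal illustration, assuming that the "value" of a set of predictors is the R² of the linear regression on that set, which is a common formulation but is not confirmed by the abstract. The helper names (`solve`, `r_squared`, `shapley_r2`) and the correlation values are hypothetical. Each predictor's share is its average marginal gain in R² over all subsets of the other predictors.

```python
from itertools import combinations
from math import factorial


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def r_squared(corr_xx, corr_xy, subset):
    """R^2 of regressing y on the predictors in `subset`, from correlations."""
    if not subset:
        return 0.0
    A = [[corr_xx[i][j] for j in subset] for i in subset]
    b = [corr_xy[i] for i in subset]
    beta = solve(A, b)  # standardized regression coefficients
    return sum(bi * ri for bi, ri in zip(beta, b))


def shapley_r2(corr_xx, corr_xy):
    """Shapley share of the full-model R^2 for each predictor."""
    p = len(corr_xy)
    shares = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        total = 0.0
        for size in range(p):
            # Shapley weight for a coalition of `size` other predictors.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for s in combinations(others, size):
                gain = (r_squared(corr_xx, corr_xy, s + (j,))
                        - r_squared(corr_xx, corr_xy, s))
                total += weight * gain
        shares.append(total)
    return shares


# Hypothetical correlations: two highly collinear predictors (r = 0.9).
corr_xx = [[1.0, 0.9], [0.9, 1.0]]
corr_xy = [0.6, 0.5]
shares = shapley_r2(corr_xx, corr_xy)
full_r2 = r_squared(corr_xx, corr_xy, (0, 1))
print(shares, full_r2)
```

In this toy case the second predictor receives a negative standardized coefficient in the joint regression because of collinearity, yet both Shapley shares are positive and sum to the full-model R², which is the kind of stability the abstract refers to.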


2011 ◽  
Vol 2011 ◽  
pp. 1-9 ◽  
Author(s):  
Serena Bucossi ◽  
Stefania Mariani ◽  
Mariacarla Ventriglia ◽  
Renato Polimanti ◽  
Massimo Gennarelli ◽  
...  

Nonceruloplasmin-bound (“free”) copper is reported to be elevated in Alzheimer's disease (AD). In Wilson's disease (WD), Cu-ATPase 7B protein tightly controls free copper body levels. To explore whether the ATP7B gene harbours susceptibility loci for AD, we screened 180 AD chromosomes for sequence changes in exons 2, 5, 8, 10, 14, and 16, where most of the Mediterranean WD-causing mutations lie. No WD mutation was found, but sequence changes corresponding to the c.1216 T>G single-nucleotide polymorphism (SNP) and the c.2495 A>G SNP were. Thereafter, we genotyped 190 AD patients and 164 controls to estimate the frequencies of these SNPs. Logistic regression analyses revealed a trend for the c.1216 SNP (P=.074) and a higher frequency of the GG genotype of the c.2495 SNP in patients, increasing the probability of AD by 74% (P=.028). Presence of the GG genotype in ATP7B c.2495 could account for copper dysfunction in AD, which has been shown to raise the probability of the disease.


2020 ◽  
Author(s):  
Brody Holohan ◽  
Raphael Laderman

Abstract Gene-environment interactions are at the heart of why many complex traits are not fully heritable, and why prediction of disease incidence and individual response to environmental changes based on genetics has been underwhelming in utility. Understanding these interactions is the primary limiting factor for the application of personalized medicine, but current methods are not well suited for dealing with complex traits that pose both a dimensionality and sparse data problem to unsupervised analysis methods. Genteract has developed a proprietary analytical technique that allows for detection and interpretation of GxEs regarding specific pairs of a single phenotype with a single environmental factor; these methods allow us to develop a platform that can be used to predict how individuals will respond to changes in their environment based on their genetics. To validate the methods we performed two types of testing: cross-validation against a dataset of clinical study results, and application of the methods in a simulated dataset. These tests enable a greater understanding of the methods’ utility, statistical power and predictive capabilities.


Author(s):  
Alexandre Todorov

The aim of the RELIEF algorithm is to select the features (e.g., genes, environmental factors) that are relevant to a trait of interest, starting from a set that may include thousands of irrelevant features. Though widely used in many fields, its application to gene-environment interaction studies has been limited thus far. We provide here an overview of this machine learning algorithm and some of its variants. Using simulated data, we then compare the performance of RELIEF to that of logistic regression in screening for gene-environment interactions in SNP data. Even though performance degrades in larger sets of markers, RELIEF remains a competitive alternative to logistic regression, and shows clear promise as a tool for the study of gene-environment interactions. Areas for further improvement of the algorithm are then suggested.
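
The core weight update of the basic Relief algorithm can be sketched in a few lines. This is a simplified illustration rather than the variant evaluated in the chapter. It assumes a binary trait and numeric features, and it visits every instance once instead of drawing a random sample, so that the result is deterministic. The names `relief`, `X` and `y` and the toy data are hypothetical. A feature gains weight when it differs from the nearest instance of the other class (the "miss") and loses weight when it differs from the nearest instance of the same class (the "hit").

```python
def relief(X, y):
    """Basic Relief feature weights for a binary trait.

    X: list of feature vectors (numeric), y: list of class labels.
    Every instance is used once as the reference point.
    """
    n, p = len(X), len(X[0])
    # Range of each feature, used to scale differences into [0, 1].
    spans = []
    for f in range(p):
        column = [row[f] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(f, a, b) for f in range(p))

    weights = [0.0] * p
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: distance(X[i], X[j]))
        miss = min(misses, key=lambda j: distance(X[i], X[j]))
        for f in range(p):
            # Reward separation from the miss, penalize separation from the hit.
            weights[f] += (diff(f, X[i], X[miss]) - diff(f, X[i], X[hit])) / n
    return weights


# Toy data: feature 0 tracks the trait exactly, feature 1 is unrelated to it.
X = [[0, 0], [0, 1], [0, 2], [2, 0], [2, 1], [2, 2]]
y = [0, 0, 0, 1, 1, 1]
weights = relief(X, y)
print(weights)
```

Features would then be ranked by weight and the top-ranked ones carried forward, which is how the algorithm acts as a screening filter ahead of a more expensive interaction analysis.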

