Robust Mendelian randomization in the presence of residual population stratification, batch effects and horizontal pleiotropy

2020 ◽  
Author(s):  
Carlos Cinelli ◽  
Nathan LaPierre ◽  
Brian L. Hill ◽  
Sriram Sankararaman ◽  
Eleazar Eskin

ABSTRACT Mendelian Randomization (MR) exploits genetic variants as instrumental variables to estimate the causal effect of an “exposure” trait on an “outcome” trait from observational data. However, the validity of such studies is threatened by population stratification, batch effects, and horizontal pleiotropy. Although a variety of methods have been proposed to partially mitigate those problems, residual biases may still remain, leading to highly statistically significant false positives in large genetic databases. Here, we describe a suite of sensitivity analysis tools for MR that enables investigators to properly quantify the robustness of their findings against these (and other) unobserved validity threats. Specifically, we propose the routine reporting of sensitivity statistics that can be used to readily quantify the robustness of an MR result: (i) the partial R² of the genetic instrument with the exposure and the outcome traits; and (ii) the robustness value of both genetic associations. These statistics quantify the minimal strength of violations of the MR assumptions that would be necessary to explain away the MR causal effect estimate. We also provide intuitive displays to visualize the sensitivity of the MR estimate to any degree of violation, and formal methods to bound the worst-case bias caused by violations in terms of multiples of the observed strength of principal components, batch effects, as well as putative pleiotropic pathways. We demonstrate how these tools can aid researchers in distinguishing robust from fragile findings, by showing that the MR estimate of the causal effect of body mass index (BMI) on diastolic blood pressure is relatively robust, whereas the MR estimate of the causal effect of BMI on Townsend deprivation index is relatively fragile.
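The abstract names the two sensitivity statistics but does not give formulas. As a minimal sketch, assuming they follow the standard omitted-variable-bias definitions for a linear regression coefficient (partial R² from the t-statistic, and the robustness value RV_q computed from the partial Cohen's f), they could be obtained as below. The function names, t-statistics and degrees of freedom are hypothetical and are not taken from the paper.

```python
import math


def partial_r2(t_stat: float, dof: int) -> float:
    """Partial R^2 of one regressor, from its t-statistic and residual degrees of freedom."""
    return t_stat ** 2 / (t_stat ** 2 + dof)


def robustness_value(t_stat: float, dof: int, q: float = 1.0) -> float:
    """Robustness value RV_q for reducing the estimate by a fraction q (q=1: down to zero).

    Assumed formula: RV_q = (sqrt(f_q^4 + 4 f_q^2) - f_q^2) / 2,
    with f_q = q * |t| / sqrt(dof).
    """
    f_q = q * abs(t_stat) / math.sqrt(dof)
    return 0.5 * (math.sqrt(f_q ** 4 + 4 * f_q ** 2) - f_q ** 2)


if __name__ == "__main__":
    # Hypothetical t-statistics for the instrument-exposure and
    # instrument-outcome regressions; not values from the paper.
    dof = 300_000  # hypothetical residual degrees of freedom
    for label, t_stat in [("exposure", 25.0), ("outcome", 4.0)]:
        print(label, partial_r2(t_stat, dof), robustness_value(t_stat, dof))
```

Under these assumptions, a larger robustness value means that unobserved stratification, batch effects or pleiotropy would need to explain a larger share of residual variance to explain away the association.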

Author(s):  
Fernando Pires Hartwig ◽  
Kate Tilling ◽  
George Davey Smith ◽  
Deborah A Lawlor ◽  
Maria Carolina Borges

Abstract Background Two-sample Mendelian randomization (MR) allows the use of freely accessible summary association results from genome-wide association studies (GWAS) to estimate causal effects of modifiable exposures on outcomes. Some GWAS adjust for heritable covariables in an attempt to estimate direct effects of genetic variants on the trait of interest. One, both or neither of the exposure GWAS and outcome GWAS may have been adjusted for covariables. Methods We performed a simulation study comprising different scenarios that could motivate covariable adjustment in a GWAS and analysed real data to assess the influence of using covariable-adjusted summary association results in two-sample MR. Results In the absence of residual confounding between exposure and covariable, between exposure and outcome, and between covariable and outcome, using covariable-adjusted summary associations for two-sample MR eliminated bias due to horizontal pleiotropy. However, covariable adjustment led to bias in the presence of residual confounding (especially between the covariable and the outcome), even in the absence of horizontal pleiotropy (when the genetic variants would be valid instruments without covariable adjustment). In an analysis using real data from the Genetic Investigation of ANthropometric Traits (GIANT) consortium and UK Biobank, the causal effect estimate of waist circumference on blood pressure changed direction upon adjustment of waist circumference for body mass index. Conclusions Our findings indicate that using covariable-adjusted summary associations in MR should generally be avoided. When that is not possible, careful consideration of the causal relationships underlying the data (including potentially unmeasured confounders) is required to direct sensitivity analyses and interpret results with appropriate caution.


2019 ◽  
Author(s):  
Christopher N Foley ◽  
Paul D W Kirk ◽  
Stephen Burgess

Abstract Motivation Mendelian randomization is an epidemiological technique that uses genetic variants as instrumental variables to estimate the causal effect of a risk factor on an outcome. We consider a scenario in which causal estimates based on each variant in turn differ more strongly than expected by chance alone, but the variants can be divided into distinct clusters, such that all variants in the cluster have similar causal estimates. This scenario is likely to occur when there are several distinct causal mechanisms by which a risk factor influences an outcome with different magnitudes of causal effect. We have developed an algorithm MR-Clust that finds such clusters of variants, and so can identify variants that reflect distinct causal mechanisms. Two features of our clustering algorithm are that it accounts for uncertainty in the causal estimates, and it includes ‘null’ and ‘junk’ clusters, to provide protection against the detection of spurious clusters. Results Our algorithm correctly detected the number of clusters in a simulation analysis, outperforming the popular Mclust method. In an applied example considering the effect of blood pressure on coronary artery disease risk, the method detected four clusters of genetic variants. A hypothesis-free search suggested that variants in the cluster with a negative effect of blood pressure on coronary artery disease risk were more strongly related to trunk fat percentage and other adiposity measures than variants not in this cluster. Availability and Implementation MR-Clust can be downloaded from https://github.com/cnfoley/mrclust. Supplementary Information Supplementary Material is included in the submission.


2016 ◽  
Author(s):  
Hans van Kippersluis ◽  
Cornelius A Rietveld

Abstract Background The potential of Mendelian Randomization studies is rapidly expanding due to (i) the growing power of GWAS meta-analyses to detect genetic variants associated with several exposures, and (ii) the increasing availability of these genetic variants in large-scale surveys. However, without a proper biological understanding of the pleiotropic working of genetic variants, a fundamental assumption of Mendelian Randomization (the exclusion restriction) can always be contested. Methods We build upon and synthesize recent advances in the econometric literature on instrumental variables (IV) estimation that test and relax the exclusion restriction. Our Pleiotropy-robust Mendelian Randomization (PRMR) method first estimates the degree of pleiotropy, and in turn corrects for it. If a sample exists for which the genetic variants do not affect the exposure, and pleiotropic effects are homogenous, PRMR obtains unbiased estimates of causal effects in case of pleiotropy. Results Simulations show that existing MR methods produce biased estimators for realistic forms of pleiotropy. Under the aforementioned assumptions, PRMR produces unbiased estimators. We illustrate the practical use of PRMR by estimating the causal effect of (i) cigarettes smoked per day on Body Mass Index (BMI); (ii) prostate cancer on self-reported health, and (iii) educational attainment on BMI in the UK Biobank data. Conclusions PRMR allows for instrumental variables that violate the exclusion restriction due to pleiotropy, and corrects for pleiotropy in the estimation of the causal effect. If the degree of pleiotropy is unknown, PRMR can still be used as a sensitivity analysis.
Key messages If genetic variants have pleiotropic effects, causal estimates of Mendelian Randomization studies will be biased. Pleiotropy-robust Mendelian Randomization (PRMR) produces unbiased causal estimates in case (i) a subsample can be identified for which the genetic variants do not affect the exposure, and (ii) pleiotropic effects are homogenous. If such a subsample does not exist, PRMR can still routinely be reported as a sensitivity analysis in any MR analysis. If pleiotropic effects are not homogenous, PRMR can be used as an informal test to gauge the exclusion restriction.
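The two-step logic described in this abstract (estimate the pleiotropic effect, then correct for it) can be illustrated with a single-instrument ratio estimator. This is only a sketch under an assumed linear model in which the instrument-outcome coefficient equals the causal effect times the instrument-exposure coefficient plus a homogeneous direct (pleiotropic) effect; it is not the authors' implementation, and all function names and numbers are hypothetical.

```python
def wald_ratio(reduced_form: float, first_stage: float) -> float:
    """Standard ratio estimate: instrument-outcome over instrument-exposure coefficient."""
    return reduced_form / first_stage


def pleiotropy_corrected_ratio(reduced_form: float, first_stage: float,
                               pleiotropy: float) -> float:
    """Ratio estimate after subtracting an assumed direct effect of the instrument.

    `pleiotropy` would be the instrument-outcome coefficient estimated in a
    subsample where the instrument does not affect the exposure.
    """
    return (reduced_form - pleiotropy) / first_stage


def sensitivity_grid(reduced_form: float, first_stage: float, pleiotropy_values):
    """Corrected estimates over a range of assumed pleiotropic effects."""
    return [(g, pleiotropy_corrected_ratio(reduced_form, first_stage, g))
            for g in pleiotropy_values]


if __name__ == "__main__":
    # Hypothetical numbers: true effect 0.5, first stage 0.2, direct effect 0.1,
    # so the reduced-form coefficient is 0.5 * 0.2 + 0.1 = 0.2.
    print(wald_ratio(0.2, 0.2))                       # uncorrected estimate
    print(pleiotropy_corrected_ratio(0.2, 0.2, 0.1))  # corrected estimate
    print(sensitivity_grid(0.2, 0.2, [0.0, 0.05, 0.1]))
```

When no suitable subsample exists, the grid function mirrors the sensitivity-analysis use mentioned above: the reader supplies a range of plausible direct effects instead of an estimate.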


Author(s):  
Christopher N Foley ◽  
Amy M Mason ◽  
Paul D W Kirk ◽  
Stephen Burgess

Abstract Motivation Mendelian randomization is an epidemiological technique that uses genetic variants as instrumental variables to estimate the causal effect of a risk factor on an outcome. We consider a scenario in which causal estimates based on each variant in turn differ more strongly than expected by chance alone, but the variants can be divided into distinct clusters, such that all variants in the cluster have similar causal estimates. This scenario is likely to occur when there are several distinct causal mechanisms by which a risk factor influences an outcome with different magnitudes of causal effect. We have developed an algorithm MR-Clust that finds such clusters of variants, and so can identify variants that reflect distinct causal mechanisms. Two features of our clustering algorithm are that it accounts for differential uncertainty in the causal estimates, and it includes ‘null’ and ‘junk’ clusters, to provide protection against the detection of spurious clusters. Results Our algorithm correctly detected the number of clusters in a simulation analysis, outperforming methods that either do not account for uncertainty or do not include null and junk clusters. In an applied example considering the effect of blood pressure on coronary artery disease risk, the method detected four clusters of genetic variants. A post hoc hypothesis-generating search suggested that variants in the cluster with a negative effect of blood pressure on coronary artery disease risk were more strongly related to trunk fat percentage and other adiposity measures than variants not in this cluster. Availability and implementation MR-Clust can be downloaded from https://github.com/cnfoley/mrclust. Supplementary information Supplementary data are available at Bioinformatics online.
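The starting point described above, variant-specific causal estimates that differ more than chance would allow, can be checked with a heterogeneity statistic before any clustering. The sketch below computes per-variant ratio estimates and Cochran's Q; it is not the MR-Clust algorithm, it assumes first-order standard errors that ignore uncertainty in the variant-exposure associations, and the input numbers are hypothetical.

```python
def ratio_estimates(beta_exposure, beta_outcome, se_outcome):
    """Per-variant ratio estimates with first-order standard errors."""
    return [(by / bx, se / abs(bx))
            for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)]


def cochran_q(estimates):
    """Return (Q, pooled estimate) for a list of (estimate, standard error) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (est - pooled) ** 2 for w, (est, _) in zip(weights, estimates))
    return q, pooled


if __name__ == "__main__":
    # Two hypothetical variants pointing to different causal estimates.
    per_variant = ratio_estimates([0.1, 0.1], [0.05, 0.15], [0.01, 0.01])
    q_stat, pooled = cochran_q(per_variant)
    print(per_variant, q_stat, pooled)
```

Under homogeneity, Q is approximately chi-squared with (number of variants minus one) degrees of freedom, so a large Q is the kind of signal that would motivate looking for clusters.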


2021 ◽  
Author(s):  
Claudia Coscia ◽  
Dipender Gill ◽  
Raquel Benítez ◽  
Teresa Pérez ◽  
Núria Malats ◽  
...  

Abstract Background Mendelian randomization (MR) uses genetic variants as instrumental variables to investigate the causal effect of a risk factor on an outcome. A collider is a variable influenced by two or more other variables. Naive calculation of MR estimates in strata of the population defined by a variable affected by the risk factor can result in collider bias. Methods We propose an approach that allows MR estimation in strata of the population while avoiding collider bias. This approach constructs a new variable, the residual collider, as the residual from regression of the collider on the genetic instrument, and then calculates causal estimates in strata defined by quantiles of the residual collider. Estimates stratified on the residual collider will typically have an equivalent interpretation to estimates stratified on the collider, but they are not subject to collider bias. We apply the approach in several simulation scenarios considering different characteristics of the collider variable and strengths of the instrument. We then apply the proposed approach to investigate the causal effect of smoking on bladder cancer in strata of the population defined by bodyweight. Results The new approach generated unbiased estimates in all the simulation settings. In the applied example, we observed a trend in the stratum-specific MR estimates at different bodyweight levels that suggested stronger effects of smoking on bladder cancer among individuals with lower bodyweight. Conclusions The proposed approach can be used to perform MR studying heterogeneity among subgroups of the population while avoiding collider bias.
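The procedure in the Methods section (regress the collider on the instrument, keep the residual, stratify on its quantiles, estimate within each stratum) can be sketched as follows. This is an illustrative sketch rather than the authors' code: it assumes a single instrument, simple linear regressions, a ratio estimator within each stratum, and a hypothetical simulated data set whose coefficients (including the true causal effect of 0.3) are chosen only for the demonstration.

```python
import random
import statistics


def ols_fit(x, y):
    """Return (slope, intercept) of the simple least-squares regression of y on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_collider(collider, instrument):
    """Residual from regressing the collider on the genetic instrument."""
    slope, intercept = ols_fit(instrument, collider)
    return [c - (intercept + slope * g) for c, g in zip(collider, instrument)]


def stratified_ratio_estimates(instrument, exposure, outcome, collider, n_strata=3):
    """Ratio estimates within quantile strata of the residual collider."""
    resid = residual_collider(collider, instrument)
    order = sorted(range(len(resid)), key=resid.__getitem__)
    estimates = []
    for k in range(n_strata):
        idx = order[k * len(order) // n_strata:(k + 1) * len(order) // n_strata]
        g = [instrument[i] for i in idx]
        slope_x, _ = ols_fit(g, [exposure[i] for i in idx])
        slope_y, _ = ols_fit(g, [outcome[i] for i in idx])
        estimates.append(slope_y / slope_x)
    return estimates


def simulate(n=9000, seed=1):
    """Hypothetical data: the collider depends on the exposure; true effect is 0.3."""
    rng = random.Random(seed)
    g, x, y, c = [], [], [], []
    for _ in range(n):
        gi = (rng.random() < 0.3) + (rng.random() < 0.3)  # allele count 0, 1 or 2
        u = rng.gauss(0, 1)                                # unmeasured confounder
        xi = 0.5 * gi + u + rng.gauss(0, 1)
        g.append(gi)
        x.append(xi)
        y.append(0.3 * xi + u + rng.gauss(0, 1))
        c.append(0.8 * xi + rng.gauss(0, 1))
    return g, x, y, c


if __name__ == "__main__":
    g, x, y, c = simulate()
    print(stratified_ratio_estimates(g, x, y, c))
```

Because the residual is uncorrelated with the instrument by construction, stratifying on it does not open a path between the instrument and the confounder, which is the property the abstract relies on.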


2017 ◽  
Author(s):  
James R Staley ◽  
Stephen Burgess

Abstract Mendelian randomization, the use of genetic variants as instrumental variables (IV), can test for and estimate the causal effect of an exposure on an outcome. Most IV methods assume that the function relating the exposure to the expected value of the outcome (the exposure-outcome relationship) is linear. However, in practice this assumption may not hold. Indeed, often the primary question of interest is to assess the shape of this relationship. We present two novel IV methods for investigating the shape of the exposure-outcome relationship: a fractional polynomial method and a piecewise linear method. We divide the population into strata using the exposure distribution, and estimate a causal effect, referred to as a localized average causal effect (LACE), in each stratum of the population. The fractional polynomial method performs meta-regression on these LACE estimates. The piecewise linear method estimates a continuous piecewise linear function, the gradient of which is the LACE estimate in each stratum. Both methods were demonstrated in a simulation study to estimate the true exposure-outcome relationship well, particularly when the relationship was a fractional polynomial (for the fractional polynomial method) or was piecewise linear (for the piecewise linear method). The methods were used to investigate the shape of the relationship of body mass index with systolic blood pressure and diastolic blood pressure. Availability and implementation: https://github.com/jrs95/nlmr


2017 ◽  
Author(s):  
Lai Jiang ◽  
Karim Oualkacha ◽  
Vanessa Didelez ◽  
Antonio Ciampi ◽  
Pedro Rosa ◽  
...  

Abstract In Mendelian randomization (MR), genetic variants are used to construct instrumental variables, which enable inference about the causal relationship between a phenotype of interest and a response or disease outcome. However, standard MR inference requires several assumptions, including the assumption that the genetic variants only influence the response through the phenotype of interest. Pleiotropy occurs when a genetic variant has an effect on more than one phenotype; therefore, a pleiotropic genetic variant may be an invalid instrumental variable. Hence, a naive method for constructing instrumental variables may lead to biased estimation of the causality between the phenotype and the response. Here, we present a set of intuitive methods (Constrained Instrumental Variable methods [CIV]) to construct valid instrumental variables and perform adjusted causal effect estimation when pleiotropy exists, focusing particularly on the situation where pleiotropic phenotypes have been measured. Our approach includes an automatic and valid selection of genetic variants when building the instrumental variables. We also provide details of the features of many existing methods, together with a comparison of their performance in a large series of simulations. CIV methods performed consistently better than many comparators across four different pleiotropic violations of the MR assumptions. We analyzed data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI; Mueller et al., 2005) to disentangle causal relationships of several biomarkers with AD progression. The results showed that CIV methods can provide causal effect estimates, as well as selection of valid instruments while accounting for pleiotropy.


2019 ◽  
Vol 29 (4) ◽  
pp. 1081-1111 ◽  
Author(s):  
Ioan Gabriel Bucur ◽  
Tom Claassen ◽  
Tom Heskes

The use of genetic variants as instrumental variables – an approach known as Mendelian randomization – is a popular epidemiological method for estimating the causal effect of an exposure (phenotype, biomarker, risk factor) on a disease or health-related outcome from observational data. Instrumental variables must satisfy strong, often untestable assumptions, which means that finding good genetic instruments among a large list of potential candidates is challenging. This difficulty is compounded by the fact that many genetic variants influence more than one phenotype through different causal pathways, a phenomenon called horizontal pleiotropy. This leads to errors not only in estimating the magnitude of the causal effect but also in inferring the direction of the putative causal link. In this paper, we propose a Bayesian approach called BayesMR that is a generalization of the Mendelian randomization technique in which we allow for pleiotropic effects and, crucially, for the possibility of reverse causation. The output of the method is a posterior distribution over the target causal effect, which provides an immediate and easily interpretable measure of the uncertainty in the estimation. More importantly, we use Bayesian model averaging to determine how much more likely the inferred direction is relative to the reverse direction.


2021 ◽  
Vol 12 ◽  
Author(s):  
Yuquan Wang ◽  
Tingting Li ◽  
Liwan Fu ◽  
Siqian Yang ◽  
Yue-Qing Hu

Mendelian randomization makes use of genetic variants as instrumental variables to eliminate the influence of unknown confounders on causal estimation in epidemiological studies. However, with the soaring number of genetic variants identified in genome-wide association studies, pleiotropy and linkage disequilibrium among genetic variants are unavoidable and may produce severe bias in causal inference. In this study, by modeling the pleiotropic effect as a normally distributed random effect, we propose a novel mixed-effects regression model-based method, PLDMR (pleiotropy and linkage disequilibrium adaptive Mendelian randomization), which takes linkage disequilibrium into account and also corrects for the pleiotropic effect in causal effect estimation and statistical inference. We conduct extensive simulation studies to evaluate the performance of the proposed and existing methods. Simulation results illustrate the validity and advantage of the novel method, especially in the case of linkage disequilibrium and directional pleiotropic effects, compared with other methods. In addition, by applying this novel method to data from the Atherosclerosis Risk in Communities Study, we conclude that body mass index has a significant causal effect on, and thus might be a potential risk factor for, systolic blood pressure. The novel method is implemented in R and the corresponding R code is provided for free download.


Author(s):  
Io Ieong Chan ◽  
Man Ki Kwok ◽  
C Mary Schooling

Abstract Introduction Observational studies suggest earlier puberty is associated with higher adulthood blood pressure (BP), but these findings have not been replicated using Mendelian randomization (MR). We examined this question sex-specifically using larger genome-wide association studies (GWAS) with more extensive measures of pubertal timing. Methods We obtained genetic instruments proxying pubertal maturation (age at menarche (AAM) or voice breaking (AVB)) from the largest published GWAS. We applied them to summary sex-specific genetic associations with systolic and diastolic BP z-scores, and self-reported hypertension in women (n=194174) and men (n=167020) from the UK Biobank, using inverse-variance weighted meta-analysis. We conducted sensitivity analyses using other MR methods, including multivariable MR adjusted for childhood obesity proxied by body mass index (BMI). We used late pubertal growth as a validation outcome. Results AAM (beta per one year later = -0.030 [95% confidence interval (CI) -0.055, -0.005]) and AVB (beta -0.058 [95% CI -0.100, -0.015]) were inversely associated with systolic BP independent of childhood BMI, as they were with diastolic BP (-0.035 [95% CI -0.060, -0.009] for AAM and -0.046 [95% CI -0.089, -0.004] for AVB) and self-reported hypertension (odds ratios 0.89 [95% CI 0.84, 0.95] for AAM and 0.87 [95% CI 0.79, 0.96] for AVB). AAM and AVB were positively associated with late pubertal growth, as expected. The results were robust to sensitivity analysis using other MR methods. Conclusion Timing of pubertal maturation was associated with adulthood BP independent of childhood BMI, highlighting the role of pubertal maturation timing in midlife BP.
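The inverse-variance weighted estimate used in the Methods can be written in a few lines. The sketch below assumes the common fixed-effect form, in which each variant's ratio estimate is weighted by the squared variant-exposure association divided by the variance of the variant-outcome association; the abstract does not state which variant of the estimator was used, and the summary statistics here are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate and its standard error.

    Equivalent to a weighted regression of variant-outcome on variant-exposure
    associations through the origin, with weights 1 / se_outcome^2.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Hypothetical summary statistics for three variants.
    estimate, std_error = ivw_estimate(
        beta_exposure=[0.1, 0.2, 0.3],
        beta_outcome=[0.05, 0.10, 0.15],
        se_outcome=[0.01, 0.02, 0.01],
    )
    lower, upper = estimate - 1.96 * std_error, estimate + 1.96 * std_error
    print(estimate, std_error, (lower, upper))
```

A 95% confidence interval of the kind reported in the Results would then follow from the normal approximation shown in the usage lines.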

