A Comparison Of Robust Mendelian Randomization Methods Using Summary Data

2019 ◽  
Author(s):  
Eric A.W. Slob ◽  
Stephen Burgess

Abstract The number of Mendelian randomization analyses including large numbers of genetic variants is rapidly increasing. This is due to the proliferation of genome-wide association studies, and the desire to obtain more precise estimates of causal effects. Since it is unlikely that all genetic variants will be valid instrumental variables, several robust methods have been proposed. We compare nine robust methods for Mendelian randomization based on summary data that can be implemented using standard statistical software. Methods were compared in three ways: by reviewing their theoretical properties, in an extensive simulation study, and in an empirical example to investigate the effect of body mass index on coronary artery disease risk. In the simulation study, the overall best methods, judged by mean squared error, were the contamination mixture method and the mode-based estimation method. These methods generally had well-controlled Type 1 error rates with up to 50% invalid instruments across a range of scenarios. Outlier-robust methods such as MR-Lasso, MR-Robust, and MR-PRESSO had the narrowest confidence intervals in the empirical example. They performed well when most variants were valid instruments with a few outliers, but less well with several invalid instruments. With isolated exceptions, all methods performed badly when over 50% of the variants were invalid instruments. Our recommendation for investigators is to perform a variety of robust methods that operate in different ways and rely on different assumptions for valid inferences to assess the reliability of Mendelian randomization analyses.


2020 ◽  
Vol 49 (4) ◽  
pp. 1246-1256
Author(s):  
Inge Verkouter ◽  
Renée de Mutsert ◽  
Roelof A J Smit ◽  
Stella Trompet ◽  
Frits R Rosendaal ◽  
...  

Abstract Background Body mass index (BMI)-associated loci are used to explore the effects of obesity using Mendelian randomization (MR), but the contribution of individual tissues to risks remains unknown. We aimed to identify tissue-grouped pathways of BMI-associated loci and relate these to cardiometabolic disease using MR analyses. Methods Using Genotype-Tissue Expression (GTEx) data, we performed overrepresentation tests to identify tissue-grouped gene sets based on mRNA-expression profiles from 634 previously published BMI-associated loci. We conducted two-sample MR with inverse-variance-weighted methods, to examine associations between tissue-grouped BMI-associated genetic instruments and type 2 diabetes mellitus (T2DM) and coronary artery disease (CAD), with use of summary-level data from published genome-wide association studies (T2DM: 74 124 cases, 824 006 controls; CAD: 60 801 cases, 123 504 controls). Additionally, we performed MR analyses on T2DM and CAD using randomly sampled sets of 100 or 200 BMI-associated genetic variants. Results We identified 17 partly overlapping tissue-grouped gene sets, of which 12 were brain areas, where BMI-associated genes were differentially expressed. In tissue-grouped MR analyses, all gene sets were similarly associated with increased risks of T2DM and CAD. MR analyses with randomly sampled genetic variants on T2DM and CAD resulted in a distribution of effect estimates similar to tissue-grouped gene sets. Conclusions Overrepresentation tests revealed differential expression of BMI-associated genes in 17 different tissues. However, with our biology-based approach using tissue-grouped MR analyses, we did not identify different risks of T2DM or CAD for the BMI-associated gene sets, which was reflected by similar effect estimates obtained by randomly sampled gene sets.
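The inverse-variance-weighted (IVW) estimate used in two-sample MR can be sketched from summary statistics alone. The block below is a minimal illustration, not the authors' code: it assumes uncorrelated variants, a fixed-effect model, and first-order weights that ignore uncertainty in the variant-exposure associations. The function name `ivw_estimate` and the example numbers are hypothetical.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    beta_x: variant-exposure associations
    beta_y: variant-outcome associations
    se_y:   standard errors of the variant-outcome associations
    Assumes uncorrelated variants and first-order weights (hypothetical helper).
    """
    if not (len(beta_x) == len(beta_y) == len(se_y)) or not beta_x:
        raise ValueError("inputs must be non-empty and of equal length")
    # Equivalent to a weighted regression of beta_y on beta_x through the origin,
    # with weights 1 / se_y^2.
    numerator = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    denominator = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Made-up summary statistics for three variants (illustration only).
example_beta_x = [0.10, 0.20, 0.30]
example_beta_y = [0.05, 0.10, 0.15]
example_se_y = [0.01, 0.01, 0.01]
est, se = ivw_estimate(example_beta_x, example_beta_y, example_se_y)
print(f"IVW estimate = {est:.3f}, SE = {se:.4f}")
```

In practice a random-effects version that allows for heterogeneity between variants is often preferred; this sketch omits it for brevity.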



Author(s):  
Fernando Pires Hartwig ◽  
Kate Tilling ◽  
George Davey Smith ◽  
Deborah A Lawlor ◽  
Maria Carolina Borges

Abstract Background Two-sample Mendelian randomization (MR) allows the use of freely accessible summary association results from genome-wide association studies (GWAS) to estimate causal effects of modifiable exposures on outcomes. Some GWAS adjust for heritable covariables in an attempt to estimate direct effects of genetic variants on the trait of interest. One, both or neither of the exposure GWAS and outcome GWAS may have been adjusted for covariables. Methods We performed a simulation study comprising different scenarios that could motivate covariable adjustment in a GWAS and analysed real data to assess the influence of using covariable-adjusted summary association results in two-sample MR. Results In the absence of residual confounding between exposure and covariable, between exposure and outcome, and between covariable and outcome, using covariable-adjusted summary associations for two-sample MR eliminated bias due to horizontal pleiotropy. However, covariable adjustment led to bias in the presence of residual confounding (especially between the covariable and the outcome), even in the absence of horizontal pleiotropy (when the genetic variants would be valid instruments without covariable adjustment). In an analysis using real data from the Genetic Investigation of ANthropometric Traits (GIANT) consortium and UK Biobank, the causal effect estimate of waist circumference on blood pressure changed direction upon adjustment of waist circumference for body mass index. Conclusions Our findings indicate that using covariable-adjusted summary associations in MR should generally be avoided. When that is not possible, careful consideration of the causal relationships underlying the data (including potentially unmeasured confounders) is required to direct sensitivity analyses and interpret results with appropriate caution.



2021 ◽  
Vol 12 ◽  
Author(s):  
Yuquan Wang ◽  
Tingting Li ◽  
Liwan Fu ◽  
Siqian Yang ◽  
Yue-Qing Hu

Mendelian randomization makes use of genetic variants as instrumental variables to eliminate the influence of unknown confounders on causal estimation in epidemiological studies. However, with the soaring number of genetic variants identified in genome-wide association studies, pleiotropy and linkage disequilibrium among genetic variants are unavoidable and may produce severe bias in causal inference. In this study, by modeling the pleiotropic effect as a normally distributed random effect, we propose a novel mixed-effects regression model-based method, PLDMR (pleiotropy and linkage disequilibrium adaptive Mendelian randomization), which takes linkage disequilibrium into account and also corrects for the pleiotropic effect in causal effect estimation and statistical inference. We conduct extensive simulation studies to evaluate the performance of the proposed and existing methods. Simulation results illustrate the validity and advantage of the novel method, especially in the case of linkage disequilibrium and directional pleiotropic effects, compared with other methods. In addition, by applying this novel method to data from the Atherosclerosis Risk in Communities Study, we conclude that body mass index has a significant causal effect on, and thus might be a potential risk factor for, systolic blood pressure. The novel method is implemented in R and the corresponding R code is provided for free download.



2017 ◽  
Author(s):  
Jorien L. Treur ◽  
Mark Gibson ◽  
Amy E Taylor ◽  
Peter J Rogers ◽  
Marcus R Munafò

Abstract Study Objectives: Higher caffeine consumption has been linked to poorer sleep and insomnia complaints. We investigated whether these observational associations are the result of genetic risk factors influencing both caffeine consumption and poorer sleep, and/or whether they reflect (possibly bidirectional) causal effects. Methods: Summary-level data were available from genome-wide association studies (GWAS) on caffeine consumption (n=91,462), sleep duration, and chronotype (i.e., being a ‘morning’ versus an ‘evening’ person) (both n=128,266), and insomnia complaints (n=113,006). Linkage disequilibrium (LD) score regression was used to calculate genetic correlations, reflecting the extent to which genetic variants influencing caffeine consumption and sleep behaviours overlap. Causal effects were tested with bidirectional, two-sample Mendelian randomization (MR), an instrumental variable approach that utilizes genetic variants robustly associated with an exposure variable as an instrument to test causal effects. Estimates from individual genetic variants were combined using inverse-variance weighted meta-analysis, weighted median regression and MR Egger regression methods. Results: There was no clear evidence for genetic correlation between caffeine consumption and sleep duration (rg=0.000, p=0.998), chronotype (rg=0.086, p=0.192) or insomnia (rg=-0.034, p=0.700). Two-sample Mendelian randomization analyses did not support causal effects from caffeine consumption to sleep behaviours, or the other way around. Conclusions: We found no evidence in support of genetic correlation or causal effects between caffeine consumption and sleep. While caffeine may have acute effects on sleep when taken shortly before habitual bedtime, our findings suggest that a more sustained pattern of high caffeine consumption is likely associated with poorer sleep through shared environmental factors.
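Two of the estimators named in the methods above, the weighted median and MR-Egger regression, can be sketched from summary statistics as follows. This is an illustrative sketch under stated assumptions, not the code used in the study: variants are taken to be uncorrelated, all variant-exposure associations are non-zero, weights are first-order inverse variances, and standard errors (usually obtained by bootstrap for the median) are omitted. The function names are hypothetical.

```python
def weighted_median(beta_x, beta_y, se_y):
    """Weighted median of per-variant ratio estimates (needs >= 2 variants)."""
    if len(beta_x) < 2:
        raise ValueError("at least two variants are required")
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    weights = [(bx / s) ** 2 for bx, s in zip(beta_x, se_y)]
    pairs = sorted(zip(ratios, weights))  # order variants by ratio estimate
    total = sum(weights)
    running, positions = 0.0, []
    for _, w in pairs:
        running += w
        # Place each ratio at the midpoint of its weight interval.
        positions.append((running - 0.5 * w) / total)
    below = max(i for i, p in enumerate(positions) if p < 0.5)
    r_lo, r_hi = pairs[below][0], pairs[below + 1][0]
    p_lo, p_hi = positions[below], positions[below + 1]
    # Linear interpolation to the 50th weighted percentile.
    return r_lo + (r_hi - r_lo) * (0.5 - p_lo) / (p_hi - p_lo)


def mr_egger(beta_x, beta_y, se_y):
    """Weighted regression of beta_y on beta_x with an intercept.

    Returns (intercept, slope); the intercept estimates average directional
    pleiotropy and the slope is the causal estimate. Needs >= 3 variants
    whose exposure associations are not all equal.
    """
    # Orient every variant so that its exposure association is positive.
    signs = [1.0 if bx >= 0 else -1.0 for bx in beta_x]
    x = [s * bx for s, bx in zip(signs, beta_x)]
    y = [s * by for s, by in zip(signs, beta_y)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope
```

The median tolerates a minority (by weight) of invalid variants, whereas MR-Egger allows all variants to be pleiotropic provided the pleiotropic effects are independent of the variant-exposure associations; comparing the two with the IVW estimate is a common sensitivity check.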



2019 ◽  
Author(s):  
Tom G Richardson ◽  
Gibran Hemani ◽  
Tom R Gaunt ◽  
Caroline L Relton ◽  
George Davey Smith

Abstract Background Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. By applying the principles of Mendelian randomization, we have undertaken a systematic analysis to evaluate transcriptome-wide associations between gene expression across 48 different tissue types and 395 complex traits. Results Overall, we identified 100,025 gene-trait associations based on conventional genome-wide corrections (P < 5 × 10⁻⁸) that also provided evidence of genetic colocalization. These results indicated that genetic variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. We identified many examples of tissue-specific effects, such as genetically-predicted TPO, NR3C2 and SPATA13 expression only associating with thyroid disease in thyroid tissue. Additionally, FBN2 expression was associated with both cardiovascular and lung function traits, but only when analysed in heart and lung tissue respectively. We also demonstrate that conducting phenome-wide evaluations of our results can help flag adverse on-target side effects for therapeutic intervention, as well as propose drug repositioning opportunities. Moreover, we find that exploring the tissue-dependency of associations identified by genome-wide association studies (GWAS) can help elucidate the causal genes and tissues responsible for effects, as well as uncover putative novel associations. Conclusions The atlas of tissue-dependent associations we have constructed should prove extremely valuable to future studies investigating the genetic determinants of complex disease. The follow-up analyses we have performed in this study are merely a guide for future research. Conducting similar evaluations can be undertaken systematically at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.



2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Qing Cheng ◽  
Yi Yang ◽  
Xingjie Shi ◽  
Kar-Fu Yeung ◽  
Can Yang ◽  
...  

Abstract The proliferation of genome-wide association studies (GWAS) has prompted the use of two-sample Mendelian randomization (MR) with genetic variants as instrumental variables (IVs) for drawing reliable causal relationships between health risk factors and disease outcomes. However, the unique features of GWAS demand that MR methods account for both linkage disequilibrium (LD) and ubiquitously existing horizontal pleiotropy among complex traits, which is the phenomenon wherein a variant affects the outcome through mechanisms other than exclusively through the exposure. Statistical methods that fail to consider LD and horizontal pleiotropy can therefore lead to biased estimates and false-positive causal relationships. To overcome these limitations, we propose a probabilistic model for MR analysis (MR-LDP) that identifies causal effects between risk factors and disease outcomes using GWAS summary statistics in the presence of LD while properly accounting for horizontal pleiotropy among genetic variants, and we develop a computationally efficient algorithm to make the causal inference. We then conducted comprehensive simulation studies to demonstrate the advantages of MR-LDP over the existing methods. Moreover, we used two real exposure–outcome pairs to validate the results from MR-LDP compared with alternative methods, showing that our method is more efficient in using all-instrumental variants in LD. By further applying MR-LDP to lipid traits and body mass index (BMI) as risk factors for complex diseases, we identified multiple pairs of significant causal relationships, including a protective effect of high-density lipoprotein cholesterol on peripheral vascular disease and a positive causal effect of BMI on hemorrhoids.



2018 ◽  
Vol 48 (3) ◽  
pp. 684-690 ◽  
Author(s):  
Wes Spiller ◽  
Neil M Davies ◽  
Tom M Palmer

Abstract Motivation In recent years, Mendelian randomization analysis using summary data from genome-wide association studies has become a popular approach for investigating causal relationships in epidemiology. The mrrobust Stata package implements several of the recently developed methods. Implementation mrrobust is freely available as a Stata package. General features The package includes inverse variance weighted estimation, as well as a range of median, modal and MR-Egger estimation methods. Using mrrobust, plots can be constructed visualizing each estimate either individually or simultaneously. The package also provides statistics such as I²_GX, which are useful in assessing attenuation bias in causal estimates. Availability The software is freely available from GitHub [https://raw.github.com/remlapmot/mrrobust/master/].
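As a rough illustration of the attenuation-bias statistic mentioned above, the sketch below computes an unweighted I²_GX from the variant-exposure associations and their standard errors, following the usual heterogeneity form (Q − (k − 1)) / Q. This is an assumption about the general definition, written in Python for illustration; it is not the mrrobust implementation (which is a Stata package), and the function name `i_squared_gx` is hypothetical.

```python
def i_squared_gx(beta_x, se_x):
    """Unweighted I^2_GX from variant-exposure associations (illustrative).

    Computes Cochran's Q for the exposure associations using inverse-variance
    weights, then I^2 = (Q - (k - 1)) / Q, truncated below at zero.
    """
    k = len(beta_x)
    if k < 2 or k != len(se_x):
        raise ValueError("need at least two variants and equal-length inputs")
    weights = [1.0 / s ** 2 for s in se_x]
    mean = sum(w * b for w, b in zip(weights, beta_x)) / sum(weights)
    q = sum(w * (b - mean) ** 2 for w, b in zip(weights, beta_x))
    if q <= 0.0:
        return 0.0  # no observed spread in the exposure associations
    return max(0.0, (q - (k - 1)) / q)


# Made-up exposure associations for three variants (illustration only).
print(round(i_squared_gx([0.10, 0.20, 0.30], [0.01, 0.01, 0.01]), 3))
```

Values close to 1 suggest that measurement error in the variant-exposure associations should cause little attenuation of the MR-Egger estimate, while lower values indicate that the estimate may be biased towards the null.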



BMC Medicine ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Maxime M. Bos ◽  
Neil J. Goulding ◽  
Matthew A. Lee ◽  
Amy Hofman ◽  
Mariska Bot ◽  
...  

Abstract Background Sleep traits are associated with cardiometabolic disease risk, with evidence from Mendelian randomization (MR) suggesting that insomnia symptoms and shorter sleep duration increase coronary artery disease risk. We combined adjusted multivariable regression (AMV) and MR analyses of phenotypes of unfavourable sleep on 113 metabolomic traits to investigate possible biochemical mechanisms linking sleep to cardiovascular disease. Methods We used AMV (N = 17,368) combined with two-sample MR (N = 38,618) to examine effects of self-reported insomnia symptoms, total habitual sleep duration, and chronotype on 113 metabolomic traits. The AMV analyses were conducted on data from 10 cohorts of mostly Europeans, adjusted for age, sex, and body mass index. For the MR analyses, we used summary results from published European-ancestry genome-wide association studies of self-reported sleep traits and of nuclear magnetic resonance (NMR) serum metabolites. We used the inverse-variance weighted (IVW) method and complemented this with sensitivity analyses to assess MR assumptions. Results We found consistent evidence from AMV and MR analyses for associations of usual vs. sometimes/rare/never insomnia symptoms with lower citrate (−0.08 standard deviation (SD) [95% confidence interval (CI) −0.12, −0.03] in AMV and −0.03SD [−0.07, −0.003] in MR), higher glycoprotein acetyls (0.08SD [95% CI 0.03, 0.12] in AMV and 0.06SD [0.03, 0.10] in MR), lower total very large HDL particles (−0.04SD [−0.08, 0.00] in AMV and −0.05SD [−0.09, −0.02] in MR), and lower phospholipids in very large HDL particles (−0.04SD [−0.08, 0.002] in AMV and −0.05SD [−0.08, −0.02] in MR). Longer total sleep duration was associated with higher creatinine concentrations using both methods (0.02SD per 1 h [0.01, 0.03] in AMV and 0.15SD [0.02, 0.29] in MR) and with isoleucine in MR analyses (0.22SD [0.08, 0.35]). No consistent evidence was observed for effects of chronotype on metabolomic measures. Conclusions Whilst our results suggested that unfavourable sleep traits may not cause widespread metabolic disruption, some notable effects were observed. The evidence for possible effects of insomnia symptoms on glycoprotein acetyls and citrate, and of longer total sleep duration on creatinine and isoleucine, might explain some of the effects found in MR analyses of these sleep traits on coronary heart disease; these findings warrant further investigation.



2021 ◽  
Author(s):  
Jin Jin ◽  
Guanghao Qi ◽  
Zhi Yu ◽  
Nilanjan Chatterjee

Abstract Mendelian Randomization (MR) analysis is increasingly popular for testing the causal effect of exposures on disease outcomes using data from genome-wide association studies. In some settings, the underlying exposure, such as systemic inflammation, may not be directly observable, but measurements can be available on multiple biomarkers, or other types of traits, that are co-regulated by the exposure. We propose a method, MRLE, which tests the significance and the direction of the effect of a latent exposure by leveraging information from multiple related traits. The method is developed by constructing a set of estimating functions based on the second-order moments of summary association statistics, under a structural equation model where genetic variants are assumed to have indirect effects through the latent exposure and potentially direct effects on the traits. Simulation studies showed that MRLE has well-controlled type I error rates and increased power compared to single-trait MR tests under various types of pleiotropy. Applications of MRLE using genetic association statistics across five inflammatory biomarkers (CRP, IL-6, IL-8, TNF-α and MCP-1) provided evidence for potential causal effects of inflammation on increased risk of coronary artery disease, colorectal cancer and rheumatoid arthritis, while standard MR analysis for individual biomarkers often failed to detect consistent evidence for such effects.



2021 ◽  
Author(s):  
Karthik A. Jagadeesh ◽  
Kushal K Dey ◽  
Daniel T. Montoro ◽  
Steven Gazal ◽  
Jesse M Engreitz ◽  
...  

Cellular dysfunction is a hallmark of disease. Genome-wide association studies (GWAS) have provided a powerful means to identify loci and genes contributing to disease risk, but in many cases the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important both for our understanding of disease, and for developing therapeutic interventions. Here, we introduce a framework for integrating single-cell RNA-seq (scRNA-seq), epigenomic maps and GWAS summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. We analyzed 1.6 million scRNA-seq profiles from 209 individuals spanning 11 tissue types and 6 disease conditions, and constructed gene programs capturing cell types, disease progression in cell types, and cellular processes both within and across cell types. We evaluated these gene programs for disease enrichment by transforming them to SNP annotations with tissue-specific epigenomic maps and computing enrichment scores across 60 diseases and complex traits (average N=297K). The inferred disease enrichments recapitulated known biology and highlighted novel relationships for different conditions, including GABAergic neurons in major depressive disorder (MDD), disease progression programs in M cells in ulcerative colitis, and a disease-specific complement cascade process in multiple sclerosis. Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.


