Tuning Random Forests for Causal Inference Under Cluster-Level Unmeasured Confounding

2020 ◽  
Author(s):  
Youmi Suk ◽  
Hyunseung Kang

Recently, there has been growing interest in using machine learning (ML) methods for causal inference due to their automatic and flexible abilities to model the propensity score and the outcome model. However, almost all the ML methods for causal inference have been studied under the assumption of no unmeasured confounding and there is little work on handling omitted/unmeasured variable bias. This paper focuses on an ML method based on random forests known as Causal Forests and presents five simple modifications for tuning Causal Forests so that they are robust to cluster-level unmeasured confounding. Our simulation study finds that adjusting the algorithm with the propensity score from fixed effects logistic regression and using demeaned variables make the estimates more robust to cluster-level unmeasured confounding. In particular, using demeaned variables is useful when we are not sure of the functional form of the propensity scores. We conclude by demonstrating our proposals in a real data study concerning the effect of taking an eighth-grade algebra course on math achievement scores from the Early Childhood Longitudinal Study.
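The abstract does not spell out how the demeaned variables are built. A minimal sketch, assuming "demeaning" means subtracting each cluster's mean from its members' values (within-cluster centering), is shown below. The function name and toy data are hypothetical. Centering removes any component that is constant within a cluster, which is why it can absorb an additive cluster-level confounder.

```python
from collections import defaultdict

def demean_by_cluster(values, cluster_ids):
    """Subtract each cluster's mean from the values of its members."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, c in zip(values, cluster_ids):
        sums[c] += v
        counts[c] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    return [v - means[c] for v, c in zip(values, cluster_ids)]

# Hypothetical toy data: one covariate for students nested in two schools.
school = ["A", "A", "A", "B", "B"]
x = [1.0, 2.0, 3.0, 10.0, 14.0]

x_demeaned = demean_by_cluster(x, school)
print(x_demeaned)  # each school's demeaned values now average to zero
```

In practice the same centering would be applied to the treatment, outcome, and covariates before they are passed to the forest; that step is an assumption here, not a detail taken from the abstract.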

2019 ◽  
Vol 27 (4) ◽  
pp. 435-454 ◽  
Author(s):  
Gary King ◽  
Richard Nielsen

We show that propensity score matching (PSM), an enormously popular method of preprocessing data for causal inference, often accomplishes the opposite of its intended goal—thus increasing imbalance, inefficiency, model dependence, and bias. The weakness of PSM comes from its attempts to approximate a completely randomized experiment, rather than, as with other matching methods, a more efficient fully blocked randomized experiment. PSM is thus uniquely blind to the often large portion of imbalance that can be eliminated by approximating full blocking with other matching methods. Moreover, in data balanced enough to approximate complete randomization, either to begin with or after pruning some observations, PSM approximates random matching which, we show, increases imbalance even relative to the original data. Although these results suggest researchers replace PSM with one of the other available matching methods, propensity scores have other productive uses.


2019 ◽  
Vol 29 (4) ◽  
pp. 1067-1080
Author(s):  
Shan-Yu Liu ◽  
Chunyan Liu ◽  
Eddie Nehus ◽  
Maurizio Macaluso ◽  
Bo Lu ◽  
...  

As individuals may respond differently to treatment, estimating subgroup effects is important to understand the characteristics of individuals who may benefit. Factors that define subgroups may be correlated, complicating evaluation of subgroup effects, especially in observational studies requiring control of confounding variables. We address this problem when propensity score methods are used for confounding control. A common practice is to evaluate candidate subgroup identifiers one at a time without adjusting for other candidate identifiers. We show that this practice can be misleading if the treatment effect modification attributed to a candidate identifier is in truth due to the effect of other correlated true effect modifiers. Whereas jointly analyzing multiple identifiers provides estimates of the desired subgroup effects adjusted for the effects of the other identifiers, it requires the propensity scores to adequately reflect the underlying treatment selection processes and balance the covariates within each subgroup of interest. Satisfying the requirement in practice is hard since the number of strata may increase quickly, while the per stratum sample size may decrease dramatically. A practically helpful approach is utilizing the whole cohort for the propensity score estimation with modeling of interaction terms to reflect the potentially different treatment selection processes across strata. We empirically examine the performance of the whole cohort approach by itself and with subjecting the interaction terms to variable selection. Our results using both simulations and real data analysis suggest that the whole cohort approach should explore inclusion of high-order interactions in the propensity score model to ensure adequate covariate balance across strata, and that variable selection is of limited utility.


2017 ◽  
Vol 28 (1) ◽  
pp. 84-101 ◽  
Author(s):  
Yuying Xie ◽  
Yeying Zhu ◽  
Cecilia A Cotton ◽  
Pan Wu

Many approaches, including traditional parametric modeling and machine learning techniques, have been proposed to estimate propensity scores. This paper describes a new model averaging approach to propensity score estimation in which parametric and nonparametric estimates are combined to achieve covariate balance. Simulation studies are conducted across different scenarios varying in the degree of interactions and nonlinearities in the treatment model. The results show that, based on inverse probability weighting, the proposed propensity score estimator produces less bias and smaller standard errors than existing approaches. They also show that a model averaging approach with the objective of minimizing the average Kolmogorov–Smirnov statistic leads to the best performing IPW estimator. The proposed approach is also applied to a real data set in evaluating the causal effect of formula or mixed feeding versus exclusive breastfeeding on a child’s body mass index Z-score at age 4. The data analysis shows that formula or mixed feeding is more likely to lead to obesity at age 4, compared to exclusive breastfeeding.
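To make the inverse probability weighting step concrete, here is a minimal sketch of a normalized (Hajek-type) IPW estimate of the average treatment effect. The abstract does not say which IPW variant the authors use, so the normalization, the function name, and the toy data are assumptions. The propensity scores are taken as given; in the paper they would come from the model-averaged estimator.

```python
def ipw_ate(outcomes, treatments, propensities):
    """Normalized inverse-probability-weighted estimate of the ATE."""
    # Treated units are weighted by 1/e(x); controls by 1/(1 - e(x)).
    w1 = [t / e for t, e in zip(treatments, propensities)]
    w0 = [(1 - t) / (1 - e) for t, e in zip(treatments, propensities)]
    mean_treated = sum(w * y for w, y in zip(w1, outcomes)) / sum(w1)
    mean_control = sum(w * y for w, y in zip(w0, outcomes)) / sum(w0)
    return mean_treated - mean_control

# Hypothetical toy data with a constant propensity score of 0.5.
y = [3.0, 5.0, 2.0, 4.0]
t = [1, 1, 0, 0]
e = [0.5, 0.5, 0.5, 0.5]

ate = ipw_ate(y, t, e)
print(ate)
```

With a constant propensity score, the estimate reduces to the plain difference in group means, which makes it a convenient sanity check.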


2018 ◽  
Vol 6 (2) ◽  
Author(s):  
Shu Yang

Propensity score weighting is a tool for causal inference to adjust for measured confounders in observational studies. In practice, data often present complex structures, such as clustering, which make propensity score modeling and estimation challenging. In addition, for clustered data, there may be unmeasured cluster-level covariates that are related to both the treatment assignment and outcome. When such unmeasured cluster-specific confounders exist and are omitted in the propensity score model, the subsequent propensity score adjustment may be biased. In this article, we propose a calibration technique for propensity score estimation under the latent ignorable treatment assignment mechanism, i.e., the treatment-outcome relationship is unconfounded given the observed covariates and the latent cluster-specific confounders. We impose novel balance constraints which imply exact balance of the observed confounders and the unobserved cluster-level confounders between the treatment groups. We show that the proposed calibrated propensity score weighting estimator is doubly robust in that it is consistent for the average treatment effect if either the propensity score model is correctly specified or the outcome follows a linear mixed effects model. Moreover, the proposed weighting method can be combined with sampling weights for an integrated solution to handle confounding and sampling designs for causal inference with clustered survey data. In simulation studies, we show that the proposed estimator is superior to other competitors. We estimate the effect of School Body Mass Index Screening on prevalence of overweight and obesity for elementary schools in Pennsylvania.
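The latent ignorability condition described in words above can be written compactly. The notation is an assumption introduced here for illustration and is not taken from the article: $A_{ij}$ is the treatment, $Y_{ij}(a)$ the potential outcome, and $X_{ij}$ the observed covariates for unit $j$ in cluster $i$, while $U_i$ is the latent cluster-specific confounder.

```latex
\{ Y_{ij}(0),\, Y_{ij}(1) \} \;\perp\!\!\!\perp\; A_{ij} \;\mid\; X_{ij},\, U_i
```

Standard ignorability would condition on $X_{ij}$ alone. The extra conditioning on the unobserved $U_i$ is what motivates balance constraints that hold within clusters.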


2016 ◽  
Vol 24 (3) ◽  
pp. 307-323 ◽  
Author(s):  
Joel A. Middleton ◽  
Marc A. Scott ◽  
Ronli Diakow ◽  
Jennifer L. Hill

In the analysis of causal effects in non-experimental studies, conditioning on observable covariates is one way to try to reduce unobserved confounder bias. However, a developing literature has shown that conditioning on certain covariates may increase bias, and the mechanisms underlying this phenomenon have not been fully explored. We add to the literature on bias-increasing covariates by first introducing a way to decompose omitted variable bias into three constituent parts: bias due to an unobserved confounder, bias due to excluding observed covariates, and bias due to amplification. This leads to two important findings. Although instruments have been the primary focus of the bias amplification literature to date, we identify the fact that the popular approach of adding group fixed effects can lead to bias amplification as well. This is an important finding because many practitioners think that fixed effects are a convenient way to account for any and all group-level confounding and are at worst harmless. The second finding introduces the concept of bias unmasking and shows how it can be even more insidious than bias amplification in some cases. After introducing these new results analytically, we use constructed observational placebo studies to illustrate bias amplification and bias unmasking with real data. Finally, we propose a way to add bias decomposition information to graphical displays for sensitivity analysis to help practitioners think through the potential for bias amplification and bias unmasking in actual applications.


2019 ◽  
Author(s):  
Youmi Suk ◽  
Hyunseung Kang ◽  
Jee-Seon Kim

There is a growing interest in using machine learning (ML) methods for causal inference due to their (nearly) automatic and flexible ability to model key quantities such as the propensity score or the outcome model. Unfortunately, most ML methods for causal inference have been studied under single-level settings where all individuals are independent of each other and there is little work in using these methods with clustered or nested data, a common setting in education studies. This paper investigates using one particular ML method based on random forests known as Causal Forests to estimate treatment effects in multilevel observational data. We conduct simulation studies under different types of multilevel data, including two-level, three-level, and cross-classified data. Our simulation study shows that when the ML method is supplemented with estimated propensity scores from multilevel models that account for clustered/hierarchical structure, the modified ML method outperforms pre-existing methods in a wide variety of settings. We conclude by estimating the effect of private math lessons in the Trends in International Mathematics and Science Study data, a large-scale educational assessment where students are nested within schools.


2008 ◽  
Vol 24 (3) ◽  
pp. 165-173 ◽  
Author(s):  
Niko Kohls ◽  
Harald Walach

Validation studies of standard scales in the particular sample that one is studying are essential for accurate conclusions. We investigated the differences in answering patterns of the Brief-Symptom-Inventory (BSI), Transpersonal Trust Scale (TPV), Sense of Coherence Questionnaire (SOC), and a Social Support Scale (F-SoZu) for a matched sample of spiritually practicing (SP) and nonpracticing (NSP) individuals at two measurement points (t1, t2). Applying a sample matching procedure based on propensity scores, we selected two sociodemographically balanced subsamples of N = 120 out of a total sample of N = 431. Employing repeated measures ANOVAs, we found an intersample difference in means only for TPV and an intrasample difference for F-SoZu. Additionally, a group × time interaction effect was found for TPV. While Cronbach’s α was acceptable and comparable for both samples, a significantly lower test-retest reliability for the BSI was found in the SP sample (rSP = .62; rNSP = .78). Thus, when researching the effects of spiritual practice, one should not only look at differences in means but also consider time stability. We recommend propensity score matching as an alternative to randomization in variables that defy experimental manipulation such as spirituality.


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 408-408
Author(s):  
Si Young Song ◽  
Hey Jung Jun ◽  
Sun Ah Lee

The purpose of this study is to explore the effect of employment on depression and life satisfaction among older adults. Using the 12th (2017) and 13th (2018) waves of the Korean Welfare Panel Study (KoWePS), three stages of analysis were conducted. First, using propensity score matching (PSM), respondents with similar propensity scores were matched between the group that did not work in the 12th wave but worked in the 13th wave (experimental group, N=180) and the group that did not work in either wave (comparison group, N=180). Second, the matched sample was used in a multiple regression analysis with the group dummy variable (experimental group vs. comparison group) as the independent variable and depression and life satisfaction as the dependent variables. Third, a combined model of PSM and the double difference (DD) method was estimated to better isolate the net effect of employment. The multiple regression results after propensity matching showed that employment reduced depression (B = -1.70, p < .01) and increased life satisfaction (B = .12, p < .01) among older adults. Furthermore, in the combined PSM and DD model, life satisfaction improved among the employed relative to the non-employed (B = .15, p < .05). These results are meaningful in that the role of employment in old age is more clearly identified by addressing selection bias and endogeneity problems. This study may also provide a reference for establishing employment-related welfare policies for older adults.
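The double difference (DD) step can be illustrated with simple arithmetic: the change over time in the comparison group is subtracted from the change over time in the experimental group. The sketch below assumes a plain difference of group means on an already matched sample. The study itself likely used a regression formulation, and the function name and numbers are hypothetical, not values from KoWePS.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences of group means across two waves."""
    def mean(xs):
        return sum(xs) / len(xs)
    change_treated = mean(treated_post) - mean(treated_pre)
    change_control = mean(control_post) - mean(control_pre)
    return change_treated - change_control

# Hypothetical life-satisfaction scores for a matched sample.
treated_pre, treated_post = [3.0, 4.0], [4.0, 5.0]   # began working in wave 2
control_pre, control_post = [3.0, 4.0], [3.5, 4.5]   # not working in either wave

dd = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(dd)
```

Subtracting the comparison group's change removes time trends shared by both groups, which is the motivation for combining DD with matching.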


Author(s):  
Alice R. Carter ◽  
Eleanor Sanderson ◽  
Gemma Hammerton ◽  
Rebecca C. Richmond ◽  
George Davey Smith ◽  
...  

Mediation analysis seeks to explain the pathway(s) through which an exposure affects an outcome. Traditional, non-instrumental variable methods for mediation analysis experience a number of methodological difficulties, including bias due to confounding between an exposure, mediator and outcome and measurement error. Mendelian randomisation (MR) can be used to improve causal inference for mediation analysis. We describe two approaches that can be used for estimating mediation analysis with MR: multivariable MR (MVMR) and two-step MR. We outline the approaches and provide code to demonstrate how they can be used in mediation analysis. We review issues that can affect analyses, including confounding, measurement error, weak instrument bias, interactions between exposures and mediators and analysis of multiple mediators. Description of the methods is supplemented by simulated and real data examples. Although MR relies on large sample sizes and strong assumptions, such as having strong instruments and no horizontally pleiotropic pathways, our simulations demonstrate that these methods are unaffected by confounders of the exposure or mediator and the outcome and non-differential measurement error of the exposure or mediator. Both MVMR and two-step MR can be implemented in both individual-level MR and summary data MR. MR mediation methods require different assumptions to be made, compared with non-instrumental variable mediation methods. Where these assumptions are more plausible, MR can be used to improve causal inference in mediation analysis.
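The abstract gives no formulas, so the following is only a sketch of how the two approaches are commonly summarised, using symbols assumed here for illustration. Let $\beta_T$ be the total effect of the exposure on the outcome, $\beta_D$ the direct effect estimated with the mediator included (as in MVMR), $\beta_A$ the effect of the exposure on the mediator, and $\beta_B$ the effect of the mediator on the outcome.

```latex
\text{Indirect effect (difference method, MVMR):} \quad \beta_T - \beta_D
\qquad
\text{Indirect effect (product of coefficients, two-step MR):} \quad \beta_A \times \beta_B
```

In linear models without exposure-mediator interaction, the two expressions target the same quantity, which is one reason the methods are often reported side by side.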

