Tuning Random Forests for Causal Inference Under Cluster-Level Unmeasured Confounding

Mapping Intimacies ◽

10.31234/osf.io/36w72 ◽

2020 ◽

Author(s):

Youmi Suk ◽

Hyunseung Kang

Keyword(s):

Propensity Score ◽

Causal Inference ◽

Random Forests ◽

Fixed Effects ◽

Propensity Scores ◽

Real Data ◽

Unmeasured Confounding ◽

Variable Bias ◽

Almost All ◽

Cluster Level

Recently, there has been growing interest in using machine learning (ML) methods for causal inference due to their automatic and flexible abilities to model the propensity score and the outcome model. However, almost all the ML methods for causal inference have been studied under the assumption of no unmeasured confounding and there is little work on handling omitted/unmeasured variable bias. This paper focuses on an ML method based on random forests known as Causal Forests and presents five simple modifications for tuning Causal Forests so that they are robust to cluster-level unmeasured confounding. Our simulation study finds that adjusting the algorithm with the propensity score from fixed effects logistic regression and using demeaned variables make the estimates more robust to cluster-level unmeasured confounding. In particular, using demeaned variables is useful when we are not sure of the functional form of the propensity scores. We conclude by demonstrating our proposals in a real data study concerning the effect of taking an eighth-grade algebra course on math achievement scores from the Early Childhood Longitudinal Study.
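The demeaning step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so the function name `demean_by_cluster` and the toy data below are hypothetical. The sketch only shows the generic within-cluster transformation: subtracting each cluster's mean removes any additive component that is constant within a cluster.

```python
from collections import defaultdict


def demean_by_cluster(values, clusters):
    """Subtract each cluster's mean from its members (within transformation)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, c in zip(values, clusters):
        sums[c] += v
        counts[c] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    return [v - means[c] for v, c in zip(values, clusters)]


# Hypothetical toy data: two schools whose covariate levels differ by a
# constant shift, standing in for a cluster-level (school-level) effect.
school = ["a", "a", "a", "b", "b", "b"]
x = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]

x_demeaned = demean_by_cluster(x, school)
print(x_demeaned)
```

After the transformation the two schools share the same centered values, so the school-level shift no longer appears in the covariate. In a real analysis the same transformation would presumably be applied to the treatment, outcome, and covariates before fitting the forest, but that is an assumption rather than something the abstract states.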

Why Propensity Scores Should Not Be Used for Matching

Political Analysis ◽

10.1017/pan.2019.11 ◽

2019 ◽

Vol 27 (4) ◽

pp. 435-454 ◽

Cited By ~ 149

Author(s):

Gary King ◽

Richard Nielsen

Keyword(s):

Propensity Score ◽

Causal Inference ◽

Propensity Score Matching ◽

Propensity Scores ◽

Original Data ◽

The Other ◽

Randomized Experiment ◽

Random Matching ◽

Popular Method ◽

Matching Methods

We show that propensity score matching (PSM), an enormously popular method of preprocessing data for causal inference, often accomplishes the opposite of its intended goal—thus increasing imbalance, inefficiency, model dependence, and bias. The weakness of PSM comes from its attempts to approximate a completely randomized experiment, rather than, as with other matching methods, a more efficient fully blocked randomized experiment. PSM is thus uniquely blind to the often large portion of imbalance that can be eliminated by approximating full blocking with other matching methods. Moreover, in data balanced enough to approximate complete randomization, either to begin with or after pruning some observations, PSM approximates random matching which, we show, increases imbalance even relative to the original data. Although these results suggest researchers replace PSM with one of the other available matching methods, propensity scores have other productive uses.

Propensity score analysis for correlated subgroup effects

Statistical Methods in Medical Research ◽

10.1177/0962280219850595 ◽

2019 ◽

Vol 29 (4) ◽

pp. 1067-1080

Author(s):

Shan-Yu Liu ◽

Chunyan Liu ◽

Eddie Nehus ◽

Maurizio Macaluso ◽

Bo Lu ◽

...

Keyword(s):

Propensity Score ◽

Variable Selection ◽

Propensity Scores ◽

Propensity Score Analysis ◽

Real Data ◽

Treatment Selection ◽

Propensity Score Model ◽

Interaction Terms ◽

Selection Processes ◽

Subgroup Effects

As individuals may respond differently to treatment, estimating subgroup effects is important to understand the characteristics of individuals who may benefit. Factors that define subgroups may be correlated, complicating evaluation of subgroup effects, especially in observational studies requiring control of confounding variables. We address this problem when propensity score methods are used for confounding control. A common practice is to evaluate candidate subgroup identifiers one at a time without adjusting for other candidate identifiers. We show that this practice can be misleading if the treatment effect modification attributed to a candidate identifier is in truth due to the effect of other correlated true effect modifiers. Whereas jointly analyzing multiple identifiers provides estimates of the desired subgroup effects adjusted for the effects of the other identifiers, it requires the propensity scores to adequately reflect the underlying treatment selection processes and balance the covariates within each subgroup of interest. Satisfying the requirement in practice is hard since the number of strata may increase quickly, while the per stratum sample size may decrease dramatically. A practically helpful approach is utilizing the whole cohort for the propensity score estimation with modeling of interaction terms to reflect the potentially different treatment selection processes across strata. We empirically examine the performance of the whole cohort approach by itself and with subjecting the interaction terms to variable selection. Our results using both simulations and real data analysis suggest that the whole cohort approach should explore inclusion of high-order interactions in the propensity score model to ensure adequate covariate balance across strata, and that variable selection is of limited utility.

A model averaging approach for estimating propensity scores by optimizing balance

Statistical Methods in Medical Research ◽

10.1177/0962280217715487 ◽

2017 ◽

Vol 28 (1) ◽

pp. 84-101 ◽

Cited By ~ 2

Author(s):

Yuying Xie ◽

Yeying Zhu ◽

Cecilia A Cotton ◽

Pan Wu

Keyword(s):

Propensity Score ◽

Exclusive Breastfeeding ◽

Propensity Scores ◽

Causal Effect ◽

Model Averaging ◽

Real Data ◽

Parametric Modeling ◽

Machine Learning Techniques ◽

Mixed Feeding ◽

Data Set

Many approaches, including traditional parametric modeling and machine learning techniques, have been proposed to estimate propensity scores. This paper describes a new model averaging approach to propensity score estimation in which parametric and nonparametric estimates are combined to achieve covariate balance. Simulation studies are conducted across different scenarios varying in the degree of interactions and nonlinearities in the treatment model. The results show that, based on inverse probability weighting, the proposed propensity score estimator produces less bias and smaller standard errors than existing approaches. They also show that a model averaging approach with the objective of minimizing the average Kolmogorov–Smirnov statistic leads to the best performing IPW estimator. The proposed approach is also applied to a real data set in evaluating the causal effect of formula or mixed feeding versus exclusive breastfeeding on a child’s body mass index Z-score at age 4. The data analysis shows that formula or mixed feeding is more likely to lead to obesity at age 4, compared to exclusive breastfeeding.
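The two quantities this abstract relies on, an inverse probability weighting (IPW) estimate and a Kolmogorov-Smirnov (KS) balance statistic, can be sketched as follows. This is not the paper's model averaging procedure; it only illustrates how a given set of propensity scores could be scored. The normalized (Hajek-style) form of the IPW estimator, the function names, and the toy numbers are all assumptions made for illustration.

```python
def ipw_weights(treated, pscore):
    """Weight 1/e for treated units and 1/(1 - e) for control units."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p)
            for t, p in zip(treated, pscore)]


def ipw_ate(outcome, treated, pscore):
    """Normalized IPW estimate of the average treatment effect."""
    w = ipw_weights(treated, pscore)
    num1 = sum(wi * y for wi, y, t in zip(w, outcome, treated) if t == 1)
    den1 = sum(wi for wi, t in zip(w, treated) if t == 1)
    num0 = sum(wi * y for wi, y, t in zip(w, outcome, treated) if t == 0)
    den0 = sum(wi for wi, t in zip(w, treated) if t == 0)
    return num1 / den1 - num0 / den0


def weighted_ks(x, treated, weights):
    """Largest gap between the weighted empirical CDFs of x in the two groups."""
    tot1 = sum(w for w, t in zip(weights, treated) if t == 1)
    tot0 = sum(w for w, t in zip(weights, treated) if t == 0)
    gap = 0.0
    for v in sorted(set(x)):
        cdf1 = sum(w for xi, t, w in zip(x, treated, weights)
                   if t == 1 and xi <= v) / tot1
        cdf0 = sum(w for xi, t, w in zip(x, treated, weights)
                   if t == 0 and xi <= v) / tot0
        gap = max(gap, abs(cdf1 - cdf0))
    return gap


# Hypothetical toy data: four units, one covariate, assumed propensity scores.
treated = [1, 1, 0, 0]
pscore = [0.8, 0.5, 0.5, 0.2]
outcome = [3.0, 5.0, 1.0, 2.0]
covariate = [1.0, 2.0, 3.0, 4.0]

ate = ipw_ate(outcome, treated, pscore)
ks = weighted_ks(covariate, treated, ipw_weights(treated, pscore))
print(ate, ks)
```

A balance-driven approach would compare candidate propensity score estimates by a statistic such as `ks`, averaged over covariates, and prefer the candidate with the smaller value.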

Propensity Score Weighting for Causal Inference with Clustered Data

Journal of Causal Inference ◽

10.1515/jci-2017-0027 ◽

2018 ◽

Vol 6 (2) ◽

Author(s):

Shu Yang

Keyword(s):

Propensity Score ◽

Causal Inference ◽

Clustered Data ◽

Overweight And Obesity ◽

Average Treatment Effect ◽

Treatment Assignment ◽

Propensity Score Model ◽

Propensity Score Weighting ◽

Score Model ◽

Cluster Level

Propensity score weighting is a tool for causal inference to adjust for measured confounders in observational studies. In practice, data often present complex structures, such as clustering, which make propensity score modeling and estimation challenging. In addition, for clustered data, there may be unmeasured cluster-level covariates that are related to both the treatment assignment and outcome. When such unmeasured cluster-specific confounders exist and are omitted in the propensity score model, the subsequent propensity score adjustment may be biased. In this article, we propose a calibration technique for propensity score estimation under the latent ignorable treatment assignment mechanism, i.e., the treatment-outcome relationship is unconfounded given the observed covariates and the latent cluster-specific confounders. We impose novel balance constraints which imply exact balance of the observed confounders and the unobserved cluster-level confounders between the treatment groups. We show that the proposed calibrated propensity score weighting estimator is doubly robust in that it is consistent for the average treatment effect if either the propensity score model is correctly specified or the outcome follows a linear mixed effects model. Moreover, the proposed weighting method can be combined with sampling weights for an integrated solution to handle confounding and sampling designs for causal inference with clustered survey data. In simulation studies, we show that the proposed estimator is superior to other competitors. We estimate the effect of School Body Mass Index Screening on prevalence of overweight and obesity for elementary schools in Pennsylvania.

Bias Amplification and Bias Unmasking

Political Analysis ◽

10.1093/pan/mpw015 ◽

2016 ◽

Vol 24 (3) ◽

pp. 307-323 ◽

Cited By ~ 16

Author(s):

Joel A. Middleton ◽

Marc A. Scott ◽

Ronli Diakow ◽

Jennifer L. Hill

Keyword(s):

Fixed Effects ◽

Experimental Studies ◽

Real Data ◽

Causal Effects ◽

Group Level ◽

Popular Approach ◽

Graphical Displays ◽

Omitted Variable Bias ◽

Variable Bias ◽

Primary Focus

In the analysis of causal effects in non-experimental studies, conditioning on observable covariates is one way to try to reduce unobserved confounder bias. However, a developing literature has shown that conditioning on certain covariates may increase bias, and the mechanisms underlying this phenomenon have not been fully explored. We add to the literature on bias-increasing covariates by first introducing a way to decompose omitted variable bias into three constituent parts: bias due to an unobserved confounder, bias due to excluding observed covariates, and bias due to amplification. This leads to two important findings. Although instruments have been the primary focus of the bias amplification literature to date, we identify the fact that the popular approach of adding group fixed effects can lead to bias amplification as well. This is an important finding because many practitioners think that fixed effects are a convenient way to account for any and all group-level confounding and are at worst harmless. The second finding introduces the concept of bias unmasking and shows how it can be even more insidious than bias amplification in some cases. After introducing these new results analytically, we use constructed observational placebo studies to illustrate bias amplification and bias unmasking with real data. Finally, we propose a way to add bias decomposition information to graphical displays for sensitivity analysis to help practitioners think through the potential for bias amplification and bias unmasking in actual applications.

4. Bayesian Propensity Score Estimators: Incorporating Uncertainties in Propensity Scores into Causal Inference

Sociological Methodology ◽

10.1111/j.1467-9531.2010.01226.x ◽

2010 ◽

Vol 40 (1) ◽

pp. 151-189 ◽

Cited By ~ 33

Author(s):

Weihua An

Keyword(s):

Propensity Score ◽

Causal Inference ◽

Propensity Scores

Random Forests Approach for Causal Inference with Clustered Observational Data

10.31234/osf.io/xgq2k ◽

2019 ◽

Author(s):

Youmi Suk ◽

Hyunseung Kang ◽

Jee-Seon Kim

Keyword(s):

Causal Inference ◽

Observational Data ◽

Random Forests ◽

Large Scale ◽

Propensity Scores ◽

Multilevel Models ◽

Study Data ◽

Simulation Studies ◽

Different Types ◽

Mathematics And Science

There is a growing interest in using machine learning (ML) methods for causal inference due to their (nearly) automatic and flexible ability to model key quantities such as the propensity score or the outcome model. Unfortunately, most ML methods for causal inference have been studied under single-level settings where all individuals are independent of each other and there is little work in using these methods with clustered or nested data, a common setting in education studies. This paper investigates using one particular ML method based on random forests known as Causal Forests to estimate treatment effects in multilevel observational data. We conduct simulation studies under different types of multilevel data, including two-level, three-level, and cross-classified data. Our simulation study shows that when the ML method is supplemented with estimated propensity scores from multilevel models that account for clustered/hierarchical structure, the modified ML method outperforms pre-existing methods in a wide variety of settings. We conclude by estimating the effect of private math lessons in the Trends in International Mathematics and Science Study data, a large-scale educational assessment where students are nested within schools.

Validating Four Standard Scales in Spiritually Practicing and Nonpracticing Samples Using Propensity Score Matching

European Journal of Psychological Assessment ◽

10.1027/1015-5759.24.3.165 ◽

2008 ◽

Vol 24 (3) ◽

pp. 165-173 ◽

Cited By ~ 8

Author(s):

Niko Kohls ◽

Harald Walach

Keyword(s):

Propensity Score ◽

Propensity Score Matching ◽

Repeated Measures ◽

Propensity Scores ◽

Total Sample ◽

Experimental Manipulation ◽

Brief Symptom Inventory ◽

Time Interaction ◽

Social Support Scale ◽

Lower Test

Validation studies of standard scales in the particular sample that one is studying are essential for accurate conclusions. We investigated the differences in answering patterns of the Brief-Symptom-Inventory (BSI), Transpersonal Trust Scale (TPV), Sense of Coherence Questionnaire (SOC), and a Social Support Scale (F-SoZu) for a matched sample of spiritually practicing (SP) and nonpracticing (NSP) individuals at two measurement points (t1, t2). Applying a sample matching procedure based on propensity scores, we selected two sociodemographically balanced subsamples of N = 120 out of a total sample of N = 431. Employing repeated measures ANOVAs, we found an intersample difference in means only for TPV and an intrasample difference for F-SoZu. Additionally, a group × time interaction effect was found for TPV. While Cronbach’s α was acceptable and comparable for both samples, a significantly lower test-retest reliability for the BSI was found in the SP sample (rSP = .62; rNSP = .78). Thus, when researching the effects of spiritual practice, one should not only look at differences in means but also consider time stability. We recommend propensity score matching as an alternative for randomization in variables that defy experimental manipulation such as spirituality.

The Effects of Employment on Depression and Life Satisfaction Among Old-Aged Using the DD Method Combined With PSM

Innovation in Aging ◽

10.1093/geroni/igaa057.1315 ◽

2020 ◽

Vol 4 (Supplement_1) ◽

pp. 408-408

Author(s):

Si Young Song ◽

Hey Jung Jun ◽

Sun Ah Lee

Keyword(s):

Life Satisfaction ◽

Propensity Score ◽

Propensity Score Matching ◽

Multiple Regression ◽

Propensity Scores ◽

Panel Study ◽

Double Difference ◽

Combined Model ◽

Comparative Group ◽

Experimental Group

The purpose of this study is to explore the effect of employment on depression and life satisfaction among the old-aged. Using the 12th (2017) and 13th (2018) waves of the Korean Welfare Panel Study (KoWePS), three stages of analysis were conducted. First, using propensity score matching (PSM), respondents with similar propensity scores were matched between the group that did not work in the 12th wave but worked in the 13th wave (experimental group, N=180) and the group that did not work in either wave (comparative group, N=180). Second, the matched sample was used to conduct multiple regression analysis with the group dummy variable (experimental group vs. comparative group) as the independent variable and depression and life satisfaction as the dependent variables. Third, a combined model of PSM and the double difference (DD) method was estimated to derive the net effect of employment more appropriately. The results of multiple regression after propensity score matching showed that employment reduced depression (B = -1.70, p < .01) and increased life satisfaction (B = .12, p < .01) among the old-aged. Furthermore, in the combined PSM and DD model, life satisfaction improved among the employed compared with the non-employed (B = .15, p < .05). The results are meaningful in that the meaning of employment in old age is derived more clearly by addressing selection bias and endogeneity problems. This study may also provide a reference for establishing welfare policies related to employment among the old-aged.
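The double difference (DD) step in this abstract can be sketched in its simplest form, as a comparison of mean changes between two already matched groups. The study itself reports regression coefficients, so this is only the textbook difference-in-differences arithmetic. The function names and the scores below are hypothetical and are not taken from KoWePS.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def double_difference(pre_treated, post_treated, pre_control, post_control):
    """Change in the treated group minus change in the comparison group."""
    change_treated = mean(post_treated) - mean(pre_treated)
    change_control = mean(post_control) - mean(pre_control)
    return change_treated - change_control


# Hypothetical life satisfaction scores for matched groups in two waves.
pre_employed = [3.0, 3.5]    # wave 12, later employed
post_employed = [3.5, 4.0]   # wave 13, employed
pre_not_employed = [3.0, 3.5]    # wave 12, never employed
post_not_employed = [3.0, 3.5]   # wave 13, never employed

dd = double_difference(pre_employed, post_employed,
                       pre_not_employed, post_not_employed)
print(dd)
```

Matching first and differencing second serves two purposes: matching makes the groups comparable on observed characteristics, and differencing removes time-invariant differences that remain between them.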

Mendelian randomisation for mediation analysis: current methods and challenges for implementation

European Journal of Epidemiology ◽

10.1007/s10654-021-00757-1 ◽

2021 ◽

Author(s):

Alice R. Carter ◽

Eleanor Sanderson ◽

Gemma Hammerton ◽

Rebecca C. Richmond ◽

George Davey Smith ◽

...

Keyword(s):

Measurement Error ◽

Causal Inference ◽

Mediation Analysis ◽

Instrumental Variable ◽

Real Data ◽

Mendelian Randomisation ◽

Differential Measurement ◽

Individual Level ◽

Differential Measurement Error ◽

Summary Data

Mediation analysis seeks to explain the pathway(s) through which an exposure affects an outcome. Traditional, non-instrumental variable methods for mediation analysis experience a number of methodological difficulties, including bias due to confounding between an exposure, mediator and outcome and measurement error. Mendelian randomisation (MR) can be used to improve causal inference for mediation analysis. We describe two approaches that can be used for estimating mediation analysis with MR: multivariable MR (MVMR) and two-step MR. We outline the approaches and provide code to demonstrate how they can be used in mediation analysis. We review issues that can affect analyses, including confounding, measurement error, weak instrument bias, interactions between exposures and mediators and analysis of multiple mediators. Description of the methods is supplemented by simulated and real data examples. Although MR relies on large sample sizes and strong assumptions, such as having strong instruments and no horizontally pleiotropic pathways, our simulations demonstrate that these methods are unaffected by confounders of the exposure or mediator and the outcome and non-differential measurement error of the exposure or mediator. Both MVMR and two-step MR can be implemented in both individual-level MR and summary data MR. MR mediation methods require different assumptions to be made, compared with non-instrumental variable mediation methods. Where these assumptions are more plausible, MR can be used to improve causal inference in mediation analysis.
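The arithmetic that usually accompanies these two approaches can be sketched as follows, assuming linear effects with no exposure-mediator interaction. In two-step MR the indirect effect is commonly taken as the product of the exposure-to-mediator and mediator-to-outcome estimates. With MVMR it is commonly taken as the total effect minus the direct effect. The abstract does not spell out these formulas, so the function names and the effect sizes below are hypothetical, and the MR estimates themselves are assumed to have been obtained elsewhere.

```python
def indirect_product(exposure_to_mediator, mediator_to_outcome):
    """Indirect effect as a product of coefficients (two-step MR style)."""
    return exposure_to_mediator * mediator_to_outcome


def indirect_difference(total_effect, direct_effect):
    """Indirect effect as total minus direct effect (MVMR style)."""
    return total_effect - direct_effect


def proportion_mediated(indirect_effect, total_effect):
    """Share of the total effect that runs through the mediator."""
    return indirect_effect / total_effect


# Hypothetical estimates on a common linear scale.
total = 0.5       # exposure -> outcome, total effect
a_path = 0.4      # exposure -> mediator
b_path = 0.5      # mediator -> outcome, adjusted for the exposure
direct = 0.3      # exposure -> outcome, adjusted for the mediator

via_product = indirect_product(a_path, b_path)
via_difference = indirect_difference(total, direct)
share = proportion_mediated(via_product, total)
print(via_product, via_difference, share)
```

Under the linearity assumption the two routes should agree up to estimation error, which makes comparing them a simple consistency check.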
