IDENTIFICATION OF REGRESSION MODELS WITH A MISCLASSIFIED AND ENDOGENOUS BINARY REGRESSOR

2021
pp. 1-23
Author(s):
Hiroyuki Kasahara
Katsumi Shimotsu

We study identification in nonparametric regression models with a misclassified and endogenous binary regressor when an instrument is correlated with misclassification error. We show that the regression function is nonparametrically identified if one binary instrument variable and one binary covariate satisfy the following conditions. The instrumental variable corrects endogeneity; the instrumental variable must be correlated with the unobserved true underlying binary variable, must be uncorrelated with the error term in the outcome equation, but is allowed to be correlated with the misclassification error. The covariate corrects misclassification; this variable can be one of the regressors in the outcome equation, must be correlated with the unobserved true underlying binary variable, and must be uncorrelated with the misclassification error. We also propose a mixture-based framework for modeling unobserved heterogeneous treatment effects with a misclassified and endogenous binary regressor and show that treatment effects can be identified if the true treatment effect is related to an observed regressor and another observable variable.
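The problem this abstract addresses can be illustrated with a small simulation. The sketch below is not the authors' identification strategy; it only shows, under assumptions chosen for illustration, why a misclassified binary regressor is a problem even when a valid instrument is available. It assumes a homogeneous effect `beta`, a binary instrument `z` independent of the outcome error, and a misclassification process that flips the true regressor with a constant probability `flip_prob` independently of everything else (the simpler case, unlike the abstract's setting where the instrument may be correlated with misclassification error). All function and variable names are hypothetical.

```python
import numpy as np


def wald_estimate(y, d, z):
    """Wald (binary-instrument IV) estimate: ratio of mean differences across z."""
    reduced_form = y[z == 1].mean() - y[z == 0].mean()
    first_stage = d[z == 1].mean() - d[z == 0].mean()
    return reduced_form / first_stage


def simulate(n=200_000, beta=2.0, flip_prob=0.1, seed=0):
    """Simulate an endogenous binary regressor observed with misclassification.

    Assumed (illustrative) design: u enters both the regressor and the outcome,
    so the regressor is endogenous; z shifts the regressor but not the outcome.
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)               # binary instrument
    u = rng.normal(size=n)                       # shared unobservable
    d_true = (z + u > 0.5).astype(int)           # unobserved true regressor
    y = beta * d_true + u + rng.normal(size=n)   # outcome equation
    flips = rng.uniform(size=n) < flip_prob      # misclassification, independent of z
    d_obs = np.where(flips, 1 - d_true, d_true)  # observed, misclassified regressor
    return y, d_true, d_obs, z


if __name__ == "__main__":
    y, d_true, d_obs, z = simulate()
    print("Wald with true regressor:    ", wald_estimate(y, d_true, z))
    print("Wald with observed regressor:", wald_estimate(y, d_obs, z))
```

In this design the first stage computed from the observed regressor shrinks by the factor `1 - 2 * flip_prob`, so the Wald ratio is inflated by its inverse rather than attenuated. An instrument alone therefore does not recover `beta`, which is the gap the additional covariate condition in the abstract is meant to close.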

2019
pp. 004912411988244
Author(s):
Deirdre Bloome
Daniel Schrage

Causal analyses typically focus on average treatment effects. Yet for substantive research on topics like inequality, interest extends to treatments’ distributional consequences. When individuals differ in their responses to treatment, three types of inequality may result. Treatment may shape inequalities between subgroups defined by pretreatment covariates, it may induce more inequality in one subgroup than another, or it may polarize people across multiple dimensions of well-being. We introduce a model, called a covariance regression, that captures all three types of inequality via the means, variances, and correlations between multiple outcomes. The model can test for heterogeneous treatment effects, quantify the heterogeneity, and explain its structure using covariates. Finding that a treatment creates inequalities could drive theoretical refinement and inform policy decisions (targeting groups where payoffs will be most predictable). We illustrate the utility of covariance regressions by analyzing the effects of sharing information about income inequality on redistributive preferences.


2021
Author(s):
Guihua Wang
Jun Li
Wallace J. Hopp

This study addresses the ubiquitous challenge of using big observational data to identify heterogeneous treatment effects. This problem arises in precision medicine, targeted marketing, personalized education, and many other environments. Identifying heterogeneous treatment effects presents several analytical challenges including high dimensionality and endogeneity issues. We develop a new instrumental variable tree (IVT) approach that incorporates the instrumental variable method into a causal tree (CT) to correct for potential endogeneity biases that may exist in observational data. Our IVT approach partitions subjects into subgroups with similar treatment effects within subgroups and different treatment effects across subgroups. The estimated treatment effects are asymptotically consistent under a set of mild assumptions. Using simulated data, we show our approach has a better coverage rate and smaller mean-squared error than the conventional CT approach. We also demonstrate that an instrumental variable forest (IVF) constructed using IVTs has better accuracy and stratification than a generalized random forest. Finally, by applying the IVF approach to an empirical assessment of laparoscopic colectomy, we demonstrate the importance of accounting for endogeneity to make accurate comparisons of the heterogeneous effects of the treatment (teaching hospitals) and control (nonteaching hospitals) on different types of patients. This paper was accepted by J. George Shanthikumar, big data analytics.


2017
Vol 25 (4)
pp. 413-434
Author(s):
Justin Grimmer
Solomon Messing
Sean J. Westwood

Randomized experiments are increasingly used to study political phenomena because they can credibly estimate the average effect of a treatment on a population of interest. But political scientists are often interested in how effects vary across subpopulations (heterogeneous treatment effects) and how differences in the content of the treatment affect responses (the response to heterogeneous treatments). Several new methods have been introduced to estimate heterogeneous effects, but it is difficult to know if a method will perform well for a particular data set. Rather than using only one method, we show how an ensemble of methods (weighted averages of estimates from individual models, increasingly used in machine learning) accurately measures heterogeneous effects. Building on a large literature on ensemble methods, we show how the weighting of methods can contribute to accurate estimation of heterogeneous treatment effects and demonstrate how pooling models leads to performance superior to that of individual methods across diverse problems. We apply the ensemble method to two experiments, illuminating how the ensemble method for heterogeneous treatment effects facilitates exploratory analysis of treatment effects.


2005
Vol 5 (1)
Author(s):
Charles H Mullin

Empirical researchers commonly invoke instrumental variable (IV) assumptions to identify treatment effects. This paper considers what can be learned under two specific violations of those assumptions: contaminated and corrupted data. Either of these violations prevents point identification, but sharp bounds on the treatment effect remain feasible. In an applied example, random miscarriages are an IV for women’s age at first birth. However, the inability to separate random miscarriages from behaviorally induced miscarriages (those caused by smoking and drinking) results in a contaminated sample. Furthermore, censored child outcomes produce a corrupted sample. Despite these limitations, the bounds demonstrate that delaying the age at first birth for the current population of non-black teenage mothers reduces their first-born child’s well-being.


2019
Vol 116 (10)
pp. 4156-4165
Author(s):
Sören R. Künzel
Jasjeet S. Sekhon
Peter J. Bickel
Bin Yu

There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies. We describe a number of metaalgorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Metaalgorithms build on base algorithms—such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks—to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a metaalgorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the metalearners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms. A software package is provided that implements our methods.
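The X-learner mentioned in this abstract can be sketched in a few lines. The version below is a minimal illustration, not the authors' software package: it assumes ordinary least squares as the base learner (the abstract names RF and BART instead) and weights the two second-stage estimates by a constant propensity equal to the observed treated share, which is an assumption suited to a randomized design. The names `fit_linear` and `x_learner_cate` are hypothetical.

```python
import numpy as np


def fit_linear(x, y):
    """Least-squares fit of y on [1, x]; returns a prediction function."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda x_new: np.column_stack([np.ones(len(x_new)), x_new]) @ coef


def x_learner_cate(x, w, y, x_new):
    """Estimate the CATE at x_new from covariates x, treatment w (0/1), outcome y."""
    treated, control = w == 1, w == 0

    # Stage 1: fit separate response functions for control and treated units.
    mu0 = fit_linear(x[control], y[control])
    mu1 = fit_linear(x[treated], y[treated])

    # Stage 2: impute unit-level effects using the other group's response model.
    d1 = y[treated] - mu0(x[treated])   # treated: observed minus predicted control
    d0 = mu1(x[control]) - y[control]   # control: predicted treated minus observed

    # Stage 3: regress the imputed effects on covariates within each group.
    tau1 = fit_linear(x[treated], d1)
    tau0 = fit_linear(x[control], d0)

    # Combine with a weight g; here a constant propensity (assumption).
    g = treated.mean()
    return g * tau0(x_new) + (1 - g) * tau1(x_new)
```

With few treated units, `g` is small and the combined estimate leans on `tau1`, whose imputed effects rely on the control response model fitted on the larger group. This is one way to read the abstract's claim about unbalanced treatment groups.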

