Hybridizing Machine Learning Methods and Finite Mixture Models for Estimating Heterogeneous Treatment Effects in Latent Classes

2019
Author(s):
Youmi Suk
Jee-Seon Kim
Hyunseung Kang

There has been increasing interest in exploring heterogeneous treatment effects using machine learning (ML) methods such as Causal Forests, Bayesian Additive Regression Trees (BART), and Targeted Maximum Likelihood Estimation (TMLE). However, there is little work on applying these methods to estimate treatment effects in latent classes defined by well-established finite mixture/latent class models. This paper proposes a hybrid method that combines finite mixture modeling with ML methods from causal inference to discover effect heterogeneity in latent classes. Our simulation study reveals that hybrid ML methods produce more precise and accurate estimates of treatment effects in latent classes. We also use hybrid ML methods to estimate the differential effects of private lessons across latent classes using data from the Trends in International Mathematics and Science Study (TIMSS).

2020
pp. 107699862095198
Author(s):
Youmi Suk
Jee-Seon Kim
Hyunseung Kang

There has been increasing interest in exploring heterogeneous treatment effects using machine learning (ML) methods such as causal forests, Bayesian additive regression trees, and targeted maximum likelihood estimation. However, there is little work on applying these methods to estimate treatment effects in latent classes defined by well-established finite mixture/latent class models. This article proposes a hybrid method that combines finite mixture modeling with ML methods from causal inference to discover effect heterogeneity in latent classes. Our simulation study reveals that hybrid ML methods produce more precise and accurate estimates of treatment effects in latent classes. We also use hybrid ML methods to estimate the differential effects of private lessons across latent classes using Trends in International Mathematics and Science Study data.
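
The two-step idea in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the posterior class probabilities are assumed to come from an already fitted finite mixture model, each unit is hard-assigned to its most probable class, and a plain difference in means stands in for the ML estimators named in the abstract. The function name `class_specific_effects` is hypothetical.

```python
from collections import defaultdict

def class_specific_effects(posteriors, treated, outcome):
    """Treated-minus-control mean outcome within each modal latent class.

    posteriors: per-unit lists of class membership probabilities
                (assumed to come from a fitted finite mixture model).
    treated:    per-unit treatment indicators (0 or 1).
    outcome:    per-unit observed outcomes.
    """
    groups = defaultdict(lambda: {0: [], 1: []})
    for post, w, y in zip(posteriors, treated, outcome):
        # Step 1: assign the unit to its most probable latent class.
        label = max(range(len(post)), key=post.__getitem__)
        groups[label][w].append(y)

    effects = {}
    for label, arms in groups.items():
        # Step 2: estimate the effect inside the class. A difference in
        # means is a stand-in here for a causal ML estimator.
        if arms[0] and arms[1]:
            mean_treated = sum(arms[1]) / len(arms[1])
            mean_control = sum(arms[0]) / len(arms[0])
            effects[label] = mean_treated - mean_control
    return effects

# Tiny made-up example: two units per class, one treated and one control.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
treated = [1, 0, 1, 0]
outcome = [3.0, 1.0, 10.0, 5.0]
print(class_specific_effects(posteriors, treated, outcome))
```

A difference in means is only appropriate when treatment is randomized within classes. With observational data such as TIMSS, the within-class step would need an estimator that adjusts for confounders, which is the role the ML methods play in the abstract.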


2008
Vol 17 (1)
pp. 33-51
Author(s):
Jeroen K Vermunt

An extension of latent class (LC) and finite mixture models is described for the analysis of hierarchical data sets. As is typical in multilevel analysis, the dependence between lower-level units within higher-level units is dealt with by assuming that certain model parameters differ randomly across higher-level observations. One of the special cases is an LC model in which group-level differences in the logit of belonging to a particular LC are captured with continuous random effects. Other variants are obtained by including random effects in the model for the response variables rather than for the LCs. The variant that receives the most attention in this article is an LC model with discrete random effects: higher-level units are clustered based on the likelihood of their members belonging to the various LCs. This yields a model with mixture distributions at two levels, namely at the group and the subject level. This model is illustrated with three rather different empirical examples. The appendix describes an adapted version of the expectation-maximization algorithm that can be used for maximum likelihood estimation, and provides setups for estimating the multilevel LC model with generally available software.
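
For orientation, the following is a minimal sketch of the standard single-level expectation-maximization algorithm for an LC model with binary items. It is not the adapted multilevel version described in the article's appendix; the function names, starting values, and simulated data are all assumptions made for illustration.

```python
import random

def lc_em(data, n_classes=2, n_iter=100, seed=0):
    """Fit a single-level latent class model to 0/1 item responses by EM.

    Returns class sizes, class-by-item response probabilities, and the
    posterior class membership probabilities of each respondent.
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    sizes = [1.0 / n_classes] * n_classes
    # Random starting values break the symmetry between classes.
    probs = [[rng.uniform(0.3, 0.7) for _ in range(n_items)]
             for _ in range(n_classes)]
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each respondent.
        post = []
        for row in data:
            joint = []
            for c in range(n_classes):
                lik = sizes[c]
                for j, y in enumerate(row):
                    lik *= probs[c][j] if y == 1 else 1.0 - probs[c][j]
                joint.append(lik)
            total = sum(joint)
            post.append([v / total for v in joint])
        # M-step: update class sizes and item probabilities.
        for c in range(n_classes):
            weight = max(sum(r[c] for r in post), 1e-12)
            sizes[c] = weight / len(data)
            for j in range(n_items):
                num = sum(r[c] * row[j] for r, row in zip(post, data))
                # Clip to keep likelihoods strictly positive.
                probs[c][j] = min(max(num / weight, 1e-6), 1.0 - 1e-6)
    return sizes, probs, post

def simulate_items(n=500, n_items=5, seed=1):
    """Toy data (an assumption): two equal classes answering 1 with
    probability 0.9 or 0.1 on every item."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        p = 0.9 if rng.random() < 0.5 else 0.1
        data.append([1 if rng.random() < p else 0 for _ in range(n_items)])
    return data

sizes, probs, post = lc_em(simulate_items())
print([round(s, 2) for s in sizes])
```

The multilevel variants in the abstract add a second mixture (or continuous random effects) at the group level, so their E-step must also sum over group-level classes; that extension is not attempted here.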


2019
Vol 116 (10)
pp. 4156-4165
Author(s):
Sören R. Künzel
Jasjeet S. Sekhon
Peter J. Bickel
Bin Yu

There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies. We describe a number of metaalgorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Metaalgorithms build on base algorithms—such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks—to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a metaalgorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the metalearners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms. A software package is provided that implements our methods.
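
The abstract does not spell out the steps of the X-learner, so the sketch below follows the commonly described three-stage form: fit response functions per arm, impute individual effects, then combine two CATE fits with a weight. It is a rough illustration, not the authors' software package. Least-squares lines on a single covariate replace the RF and BART base learners, the weight is taken to be the constant share of treated units, and all names and the simulated data are assumptions.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns a prediction function."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def x_learner(x, w, y):
    """Return a CATE predictor from covariate x, treatment w (0/1), outcome y."""
    x0 = [xi for xi, wi in zip(x, w) if wi == 0]
    y0 = [yi for yi, wi in zip(y, w) if wi == 0]
    x1 = [xi for xi, wi in zip(x, w) if wi == 1]
    y1 = [yi for yi, wi in zip(y, w) if wi == 1]
    # Stage 1: response functions for the control and treatment arms.
    mu0 = fit_line(x0, y0)
    mu1 = fit_line(x1, y1)
    # Stage 2: imputed individual effects, then a CATE fit in each arm.
    d1 = [yi - mu0(xi) for xi, yi in zip(x1, y1)]
    d0 = [mu1(xi) - yi for xi, yi in zip(x0, y0)]
    tau1 = fit_line(x1, d1)
    tau0 = fit_line(x0, d0)
    # Stage 3: weighted combination; g is the share treated (an assumption,
    # an estimated propensity score is a common alternative).
    g = len(x1) / len(x)
    return lambda q: g * tau0(q) + (1.0 - g) * tau1(q)

def simulate(n=2000, seed=0):
    """Toy data (an assumption): about 10% treated, true CATE is 0.5 + x."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    w = [1 if rng.random() < 0.1 else 0 for _ in range(n)]
    y = [1 + 2 * xi + wi * (0.5 + xi) + rng.gauss(0, 0.1)
         for xi, wi in zip(x, w)]
    return x, w, y

x, w, y = simulate()
tau_hat = x_learner(x, w, y)
print(round(tau_hat(0.5), 2))  # the true CATE at x = 0.5 in this toy setup is 1.0
```

With few treated units, the weight g is small, so the final estimate leans on the fit that uses the response function learned from the large control arm. This matches the unbalanced-groups setting the abstract highlights.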


2020
Vol 29 (11)
pp. 3381-3395
Author(s):
Wonmo Koo
Heeyoung Kim

Latent class models have been widely used in longitudinal studies to uncover unobserved heterogeneity in a population and, at the same time, to characterize the latent classes through class allocation probabilities that depend on predictors. However, previous latent class models for longitudinal data suffer from uncertainty in the choice of the number of latent classes. In this study, we propose a Bayesian nonparametric latent class model for longitudinal data, which allows the number of latent classes to be inferred from the data. The proposed model is an infinite mixture model with predictor-dependent class allocation probabilities; an individual longitudinal trajectory is described by a class-specific linear mixed effects model. The model parameters are estimated using Markov chain Monte Carlo methods. The proposed model is validated using a simulated example and a real-data example that characterizes latent classes of estradiol trajectories over the menopausal transition using data from the Study of Women’s Health Across the Nation.


2020
Vol 8 (3)
pp. 30
Author(s):
Alexander Robitzsch

The last series of Raven’s standard progressive matrices (SPM-LS) test was studied with respect to its psychometric properties in a series of recent papers. In this paper, the SPM-LS dataset is analyzed with regularized latent class models (RLCMs). For dichotomous item response data, an alternative estimation approach based on fused regularization for RLCMs is proposed. For polytomous item responses, different alternative fused regularization penalties are presented. The usefulness of the proposed methods is demonstrated in a simulated data illustration and for the SPM-LS dataset. For the SPM-LS dataset, the regularized latent class model yielded five partially ordered latent classes. Three of the five latent classes are ordered for all items. For the remaining two classes, violations were found for two and three items, respectively, which can be interpreted as a kind of latent differential item functioning.

