A two-stage estimation of structural equation models with continuous and polytomous variables

Author(s):  
Sik-Yum Lee ◽  
Wai-Yin Poon ◽  
P. M. Bentler
Biostatistics ◽  
2019 ◽  
Vol 21 (4) ◽  
pp. 676-691
Author(s):  
Klaus Kähler Holst ◽  
Esben Budtz-Jørgensen

Summary Applications of structural equation models (SEMs) are often restricted to linear associations between variables. Maximum likelihood (ML) estimation in non-linear models may be complex and require numerical integration. Furthermore, ML inference is sensitive to distributional assumptions. In this article, we introduce a simple two-stage estimation technique for estimation of non-linear associations between latent variables. Here both steps are based on fitting linear SEMs: first a linear model is fitted to data on the latent predictor and terms describing the non-linear effect are predicted by their conditional means. In the second step, the predictions are included in a linear model for the latent outcome variable. We show that this procedure is consistent and identifies its asymptotic distribution. We also illustrate how this framework easily allows the association between latent variables to be modeled using restricted cubic splines, and we develop a modified estimator which is robust to non-normality of the latent predictor. In a simulation study, we compare the proposed method to MLE and alternative two-stage estimation techniques.


2020 ◽  
Vol 52 (6) ◽  
pp. 2306-2323 ◽  
Author(s):  
Lihan Chen ◽  
Victoria Savalei ◽  
Mijke Rhemtulla

AbstractPsychologists use scales comprised of multiple items to measure underlying constructs. Missing data on such scales often occur at the item level, whereas the model of interest to the researcher is at the composite (scale score) level. Existing analytic approaches cannot easily accommodate item-level missing data when models involve composites. A very common practice in psychology is to average all available items to produce scale scores. This approach, referred to as available-case maximum likelihood (ACML), may produce biased parameter estimates. Another approach researchers use to deal with item-level missing data is scale-level full information maximum likelihood (SL-FIML), which treats the whole scale as missing if any item is missing. SL-FIML is inefficient and it may also exhibit bias. Multiple imputation (MI) produces the correct results using a simulation-based approach. We study a new analytic alternative for item-level missingness, called two-stage maximum likelihood (TSML; Savalei & Rhemtulla, Journal of Educational and Behavioral Statistics, 42(4), 405–431. 2017). The original work showed the method outperforming ACML and SL-FIML in structural equation models with parcels. The current simulation study examined the performance of ACML, SL-FIML, MI, and TSML in the context of univariate regression. We demonstrated performance issues encountered by ACML and SL-FIML when estimating regression coefficients, under both MCAR and MAR conditions. Aside from convergence issues with small sample sizes and high missingness, TSML performed similarly to MI in all conditions, showing negligible bias, high efficiency, and good coverage. This fast analytic approach is therefore recommended whenever it achieves convergence. R code and a Shiny app to perform TSML are provided.


2017 ◽  
Vol 42 (4) ◽  
pp. 405-431 ◽  
Author(s):  
Victoria Savalei ◽  
Mijke Rhemtulla

In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately handle missing data at the item level. Item-level multiple imputation (MI), however, can handle such missing data straightforwardly. In this article, we develop an analytic approach for dealing with item-level missing data—that is, one that obtains a unique set of parameter estimates directly from the incomplete data set and does not require imputations. The proposed approach is a variant of the two-stage maximum likelihood (TSML) methodology, and it is the analytic equivalent of item-level MI. We compare the new TSML approach to three existing alternatives for handling item-level missing data: scale-level full information maximum likelihood, available-case maximum likelihood, and item-level MI. We find that the TSML approach is the best analytic approach, and its performance is similar to item-level MI. We recommend its implementation in popular software and its further study.


Sign in / Sign up

Export Citation Format

Share Document