Correcting the Bias of the Root Mean Squared Error of Approximation under Missing Data
Missing data are ubiquitous in both small and large datasets. They may arise from coding or computer errors or participant absences, or they may be intentional, as in planned missing designs. We discuss missing data as they relate to goodness-of-fit indices in Structural Equation Modeling (SEM), specifically the effect of missing data on the Root Mean Squared Error of Approximation (RMSEA). We use simulations to show that naive implementations of the RMSEA are biased downward in the presence of missing data and thus overestimate model goodness-of-fit. Unfortunately, many state-of-the-art software packages report the biased form of the RMSEA. As a consequence, the community may have been accepting a much larger fraction of models with unacceptable fit than warranted. We propose a bias correction for the RMSEA based on information-theoretic considerations that take into account the expected misfit of a person with fully observed data. The resulting RMSEA is asymptotically independent of the proportion of missing data for misspecified models. Importantly, the corrected RMSEA is identical to the naive RMSEA when there are no missing data.
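
For orientation, the conventional RMSEA point estimate, which the abstract calls the naive form, can be sketched as follows. This is a minimal illustration of the textbook formula only. The function name `rmsea_naive` and its arguments are hypothetical, and the (N - 1) scaling is one common convention (some software uses N instead). The bias correction proposed here is not shown, since the abstract does not state its formula.

```python
import math


def rmsea_naive(chi2: float, df: int, n: int) -> float:
    """Conventional RMSEA point estimate (hypothetical helper, for illustration).

    chi2: model chi-square test statistic
    df:   model degrees of freedom (must be positive)
    n:    sample size (must exceed 1)
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    # Estimated noncentrality, truncated at zero so the index is never negative.
    noncentrality = max(chi2 - df, 0.0)
    # Misfit per degree of freedom and per observation (N - 1 convention assumed).
    return math.sqrt(noncentrality / (df * (n - 1)))


if __name__ == "__main__":
    # Made-up inputs, not results from the paper.
    print(f"{rmsea_naive(chi2=85.0, df=40, n=201):.3f}")  # prints 0.075
```

Because the statistic is scaled by the total sample size regardless of how much information each person actually contributes, this form does not distinguish a fully observed sample from one with missing values, which is the setting the abstract is concerned with.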