Multiple Imputation for Missing Data: Fully Conditional Specification Versus Multivariate Normal Imputation

Missing data is a common occurrence in confirmatory factor analysis (CFA). Much work has evaluated the performance of different techniques when all observed variables were either continuous or ordinal. However, few studies have investigated these techniques when the observed variables are a mix of continuous and ordinal variables. This study investigated the performance of four approaches to handling missing data in these models: a joint ordinal-continuous full information maximum likelihood (JOC-FIML) approach and three multiple imputation approaches (fully conditional specification, fully conditional specification with latent variable formulation, and expectation-maximization with bootstrapping) combined with the weighted least squares with mean and variance adjustment (WLSMV) estimator. In a Monte Carlo simulation, the JOC-FIML approach produced unbiased estimates of factor loadings and standard errors in almost all conditions. Fully conditional specification combined with WLSMV was second best, producing accurate estimates if the sample size was large. We recommend JOC-FIML across most conditions, except when certain ordinal categories have extremely low frequencies, where it was less likely to converge. If the sample is large, fully conditional specification combined with WLSMV is recommended when the FIML approach is not feasible (e.g., non-convergence, or variables that predict missingness are not of interest to the analysis).

Multiple Imputation for Multivariate Missing Data: The Fully Conditional Specification Approach

10.1201/9780429156397-7 ◽

2021 ◽

pp. 181-208

Author(s):

Yulei He ◽

Guangyu Zhang ◽

Chiu-Hsieh Hsu

Keyword(s):

Missing Data ◽

Multiple Imputation ◽

Fully Conditional Specification ◽

Conditional Specification

Dealing with missing information on covariates for excess mortality hazard regression models – Making the imputation model compatible with the substantive model

Statistical Methods in Medical Research ◽

10.1177/09622802211031615 ◽

2021 ◽

Vol 30 (10) ◽

pp. 2256-2268

Author(s):

Luís Antunes ◽

Denisa Mendonça ◽

Maria José Bento ◽

Edmund Njeru Njagi ◽

Aurélien Belot ◽

...

Keyword(s):

Missing Data ◽

Multiple Imputation ◽

Survival Data ◽

Regression Models ◽

Cancer Survival ◽

Population Based ◽

Hazard Regression ◽

The North ◽

Fully Conditional Specification ◽

Conditional Specification

Missing data is a common issue in epidemiological databases. Among the different ways of dealing with missing data, multiple imputation has become more available in common statistical software packages. However, incompatibility between the imputation model and the substantive model, which can arise when the associations between variables in the substantive model are not taken into account in the imputation models or when the substantive model is itself nonlinear, can lead to invalid inference. Aiming at analysing population-based cancer survival data, we extended the substantive model compatible fully conditional specification (SMC-FCS) multiple imputation approach, proposed by Bartlett et al. in 2015, to accommodate excess hazard regression models. The proposed approach was compared with the standard fully conditional specification multiple imputation procedure and with complete-case analysis in a simulation study. The SMC-FCS approach produced unbiased estimates in both scenarios tested, while standard fully conditional specification produced biased estimates and poor empirical coverage probabilities. The SMC-FCS algorithm was then used to handle missing data in the evaluation of socioeconomic inequalities in survival of colorectal cancer patients diagnosed in the North Region of Portugal. The analysis using SMC-FCS showed a clearer trend towards higher excess hazards for patients from more deprived areas. The proposed algorithm was implemented in R and is presented as Supplementary Material.

Application of Multiple Imputation using the Two-Fold Fully Conditional Specification Algorithm in Longitudinal Clinical Data

The Stata Journal: Promoting communications on statistics and Stata ◽

10.1177/1536867x1401400213 ◽

2014 ◽

Vol 14 (2) ◽

pp. 418-431 ◽

Cited By ~ 25

Author(s):

Catherine Welch ◽

Jonathan Bartlett ◽

Irene Petersen

Keyword(s):

Multiple Imputation ◽

Clinical Data ◽

Fully Conditional Specification ◽

Conditional Specification

Multiple Imputation of Covariates by Substantive-model Compatible Fully Conditional Specification

The Stata Journal: Promoting communications on statistics and Stata ◽

10.1177/1536867x1501500206 ◽

2015 ◽

Vol 15 (2) ◽

pp. 437-456 ◽

Cited By ~ 17

Author(s):

Jonathan W. Bartlett ◽

Tim P. Morris

Keyword(s):

Multiple Imputation ◽

Fully Conditional Specification ◽

Conditional Specification

Relative efficiency of joint-model and full-conditional-specification multiple imputation when conditional models are compatible: The general location model

Statistical Methods in Medical Research ◽

10.1177/0962280216665872 ◽

2016 ◽

Vol 27 (6) ◽

pp. 1603-1614 ◽

Cited By ~ 5

Author(s):

Shaun R Seaman ◽

Rachael A Hughes

Keyword(s):

Missing Data ◽

Multiple Imputation ◽

Joint Model ◽

Location Model ◽

Outcome Variable ◽

Efficiency Gain ◽

Asymptotic Equivalence ◽

Conditional Specification ◽

Conditional Models ◽

Asymptotically Efficient

Estimating the parameters of a regression model of interest is complicated by missing data on the variables in that model. Multiple imputation is commonly used to handle these missing data. Joint model multiple imputation and full-conditional specification multiple imputation are known to yield imputed data with the same asymptotic distribution when the conditional models of full-conditional specification are compatible with that joint model. We show that this asymptotic equivalence of imputation distributions does not imply that joint model multiple imputation and full-conditional specification multiple imputation will also yield asymptotically equally efficient inference about the parameters of the model of interest, nor that they will be equally robust to misspecification of the joint model. When the conditional models used by full-conditional specification multiple imputation are linear, logistic and multinomial regressions, these are compatible with a restricted general location joint model. We show that multiple imputation using the restricted general location joint model can be substantially more asymptotically efficient than full-conditional specification multiple imputation, but this typically requires very strong associations between variables. When associations are weaker, the efficiency gain is small. Moreover, when there is substantial missingness in the outcome variable, full-conditional specification multiple imputation is shown to be potentially much more robust to misspecification of the restricted general location model than joint model multiple imputation using that model.
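To make the idea of imputing from per-variable conditional models concrete, the following is a minimal sketch of fully conditional specification for continuous variables only. It is not the procedure evaluated in the paper above: the function name `fcs_impute`, its parameters, and the choice of a Bayesian linear regression for every incomplete column are assumptions made purely for illustration (categorical variables would need logistic or multinomial conditional models instead).

```python
import numpy as np


def fcs_impute(data, n_iter=10, seed=None):
    """Return one imputed copy of `data` (2-D array, NaN marks missing).

    Hypothetical helper: every incomplete column is imputed from a linear
    regression on all other columns, cycling through columns n_iter times.
    """
    rng = np.random.default_rng(seed)
    X = np.array(data, dtype=float)
    miss = np.isnan(X)
    # Crude starting fill: each missing cell gets its column's observed mean.
    X[miss] = np.take(np.nanmean(X, axis=0), np.where(miss)[1])
    n, p = X.shape
    for _ in range(n_iter):
        for j in range(p):
            mis_j = miss[:, j]
            if not mis_j.any():
                continue
            obs_j = ~mis_j
            # Design matrix: intercept plus the current values of all other columns.
            Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
            Zo, yo = Z[obs_j], X[obs_j, j]
            beta_hat, *_ = np.linalg.lstsq(Zo, yo, rcond=None)
            resid = yo - Zo @ beta_hat
            dof = max(int(obs_j.sum()) - Z.shape[1], 1)
            # Draw the residual variance, then the coefficients, so that
            # parameter uncertainty is carried into the imputed values.
            sigma2 = resid @ resid / rng.chisquare(dof)
            cov = sigma2 * np.linalg.pinv(Zo.T @ Zo)
            cov = (cov + cov.T) / 2  # enforce symmetry against round-off
            beta = rng.multivariate_normal(beta_hat, cov)
            noise = rng.normal(0.0, np.sqrt(sigma2), int(mis_j.sum()))
            X[mis_j, j] = Z[mis_j] @ beta + noise
    return X
```

Calling `fcs_impute` several times with different seeds gives the multiple completed data sets; the model of interest is then fitted to each one and the results are pooled with Rubin's rules. Observed values are never altered, only the cells that were missing.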

A Fully Conditional Specification Approach to Multilevel Multiple Imputation with Latent Cluster Means

Multivariate Behavioral Research ◽

10.1080/00273171.2018.1556085 ◽

2019 ◽

Vol 54 (1) ◽

pp. 149-150

Author(s):

Brian T. Keller ◽

Han Du

Keyword(s):

Multiple Imputation ◽

Fully Conditional Specification ◽

Conditional Specification ◽

Multilevel Multiple Imputation

Evaluation of two-fold fully conditional specification multiple imputation for longitudinal electronic health record data

Statistics in Medicine ◽

10.1002/sim.6184 ◽

2014 ◽

Vol 33 (21) ◽

pp. 3725-3737 ◽

Cited By ~ 25

Author(s):

Catherine A. Welch ◽

Irene Petersen ◽

Jonathan W. Bartlett ◽

Ian R. White ◽

Louise Marston ◽

...

Keyword(s):

Electronic Health Record ◽

Multiple Imputation ◽

Health Record ◽

Electronic Health Record Data ◽

Fully Conditional Specification ◽

Conditional Specification ◽

Record Data ◽

Electronic Health
