Item Imputation Without Specifying Scale Structure

Imputation of incomplete questionnaire items should preserve the structure among items and the correlations between scales. This paper explores the use of fully conditional specification (FCS) to impute missing data in questionnaire items. FCS is particularly attractive for items because it does not require (1) a specification of the number of factors or classes, (2) a specification of which item belongs to which scale, and (3) assumptions about conditional independence among items. Imputation models can be specified using standard features of the R package MICE 1.16. A limited simulation shows that MICE outperforms two-way imputation with respect to Cronbach’s α and the correlations between scales. We conclude that FCS is a promising alternative for imputing incomplete questionnaire items.
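
As a rough illustration of the idea, the sketch below cycles through the items and regresses each incomplete item on all the others, with no factor or scale structure supplied. It is written in Python with NumPy and is not the MICE implementation. The function name `fcs_impute`, the linear normal imputation model, and the fixed number of iterations are assumptions made for illustration. The sketch also holds the regression parameters at their least-squares estimates instead of drawing them from a posterior, so it understates imputation uncertainty compared with proper multiple imputation.

```python
import numpy as np


def fcs_impute(data, n_iter=10, seed=0):
    """Return one completed copy of `data` (2-D array, NaN marks missing).

    Hypothetical helper: assumes every column has at least one observed
    value and that a linear model with normal errors is adequate.
    """
    rng = np.random.default_rng(seed)
    X = np.array(data, dtype=float)
    missing = np.isnan(X)
    n, p = X.shape

    # Start by filling each hole with a random observed value of its column.
    for j in range(p):
        observed = X[~missing[:, j], j]
        X[missing[:, j], j] = rng.choice(observed, size=missing[:, j].sum())

    for _ in range(n_iter):
        for j in range(p):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            # Regress item j on all other items, using rows where j is observed.
            others = np.delete(X, j, axis=1)
            design = np.column_stack([np.ones(n), others])
            beta, *_ = np.linalg.lstsq(design[~miss_j], X[~miss_j, j], rcond=None)
            resid = X[~miss_j, j] - design[~miss_j] @ beta
            dof = max((~miss_j).sum() - design.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)
            # Impute as prediction plus residual noise (stochastic regression).
            X[miss_j, j] = design[miss_j] @ beta + rng.normal(0.0, sigma, miss_j.sum())
    return X
```

Calling `fcs_impute` several times with different `seed` values would give several completed data sets, which could then be analysed separately and pooled. Because every item serves as a predictor for every other item, the user never states which item belongs to which scale.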

Multiple Imputation by Fully Conditional Specification for Dealing with Missing Data in a Large Epidemiologic Study

International Journal of Statistics in Medical Research ◽

10.6000/1929-6029.2015.04.03.7 ◽

2015 ◽

Vol 4 (3) ◽

pp. 287-295 ◽

Cited By ~ 105

Author(s):

Yang Liu ◽

Anindya De

Keyword(s):

Missing Data ◽

Multiple Imputation ◽

Epidemiologic Study ◽

Fully Conditional Specification ◽

Conditional Specification

Evaluating FIML And Multiple Imputation In Joint Ordinal-Continuous Measurements Models With Missing Data

10.31234/osf.io/j3b2t ◽

2021 ◽

Author(s):

Aaron Lim ◽

Mike W.-L. Cheung

Keyword(s):

Missing Data ◽

Least Squares ◽

Multiple Imputation ◽

Latent Variable ◽

Weighted Least Squares ◽

Low Frequencies ◽

Full Information Maximum Likelihood ◽

Fully Conditional Specification ◽

Conditional Specification ◽

Almost All

Missing data is a common occurrence in confirmatory factor analysis (CFA). Much work has evaluated the performance of different techniques when all observed variables were either continuous or ordinal. However, few studies have investigated these techniques when the observed variables are a mix of continuous and ordinal variables. This study investigated the performance of four approaches to handling missing data in these models: a joint ordinal-continuous full information maximum likelihood (JOC-FIML) approach and three multiple imputation approaches (fully conditional specification, fully conditional specification with latent variable formulation, and expectation-maximization with bootstrapping) combined with the weighted least squares with mean and variance adjustment (WLSMV) estimator. In a Monte Carlo simulation, the JOC-FIML approach produced unbiased estimates of factor loadings and standard errors in almost all conditions. Fully conditional specification combined with WLSMV was second best, producing accurate estimates if the sample size was large. We recommend JOC-FIML across most conditions, except when certain ordinal categories have extremely low frequencies, as it was then less likely to converge. If the sample is large, fully conditional specification combined with weighted least squares is recommended when the FIML approach is not feasible (e.g., non-convergence, or variables that predict missingness are not of interest to the analysis).

Multiple Imputation for Multivariate Missing Data: The Fully Conditional Specification Approach

10.1201/9780429156397-7 ◽

2021 ◽

pp. 181-208

Author(s):

Yulei He ◽

Guangyu Zhang ◽

Chiu-Hsieh Hsu

Keyword(s):

Missing Data ◽

Multiple Imputation ◽

Fully Conditional Specification ◽

Conditional Specification

Multiple Imputation for Missing Data: Fully Conditional Specification Versus Multivariate Normal Imputation

American Journal of Epidemiology ◽

10.1093/aje/kwp425 ◽

2010 ◽

Vol 171 (5) ◽

pp. 624-632 ◽

Cited By ~ 346

Author(s):

K. J. Lee ◽

J. B. Carlin

Keyword(s):

Missing Data ◽

Multiple Imputation ◽

Multivariate Normal ◽

Fully Conditional Specification ◽

Conditional Specification ◽

Multivariate Normal Imputation

Dealing with missing information on covariates for excess mortality hazard regression models – Making the imputation model compatible with the substantive model

Statistical Methods in Medical Research ◽

10.1177/09622802211031615 ◽

2021 ◽

Vol 30 (10) ◽

pp. 2256-2268

Author(s):

Luís Antunes ◽

Denisa Mendonça ◽

Maria José Bento ◽

Edmund Njeru Njagi ◽

Aurélien Belot ◽

...

Keyword(s):

Missing Data ◽

Multiple Imputation ◽

Survival Data ◽

Regression Models ◽

Cancer Survival ◽

Population Based ◽

Hazard Regression ◽

The North ◽

Fully Conditional Specification ◽

Conditional Specification

Missing data is a common issue in epidemiological databases. Among the different ways of dealing with missing data, multiple imputation has become more available in common statistical software packages. However, incompatibility between the imputation model and the substantive model, which can arise when the associations between variables in the substantive model are not taken into account in the imputation models or when the substantive model is itself nonlinear, can lead to invalid inference. Aiming at analysing population-based cancer survival data, we extended the substantive model compatible fully conditional specification (SMC-FCS) multiple imputation approach, proposed by Bartlett et al. in 2015, to accommodate excess hazard regression models. The proposed approach was compared with the standard fully conditional specification multiple imputation procedure and with complete-case analysis in a simulation study. The SMC-FCS approach produced unbiased estimates in both scenarios tested, while the standard fully conditional specification produced biased estimates and poor empirical coverage probabilities. The SMC-FCS algorithm was then used to handle missing data in the evaluation of socioeconomic inequalities in survival of colorectal cancer patients diagnosed in the North Region of Portugal. The analysis using SMC-FCS showed a clearer trend toward higher excess hazards for patients from more deprived areas. The proposed algorithm was implemented in R and is presented as Supplementary Material.

A fully conditional specification approach to multilevel imputation of categorical and continuous variables.

Psychological Methods ◽

10.1037/met0000148 ◽

2018 ◽

Vol 23 (2) ◽

pp. 298-317 ◽

Cited By ~ 39

Author(s):

Craig K. Enders ◽

Brian T. Keller ◽

Roy Levy

Keyword(s):

Continuous Variables ◽

Fully Conditional Specification ◽

Conditional Specification

Missing Data: A Unified Taxonomy Guided by Conditional Independence

International Statistical Review ◽

10.1111/insr.12242 ◽

2017 ◽

Vol 86 (2) ◽

pp. 189-204 ◽

Cited By ~ 1

Author(s):

Marco Doretti ◽

Sara Geneletti ◽

Elena Stanghellini

Keyword(s):

Missing Data ◽

Conditional Independence

MGMM: An R Package for fitting Gaussian Mixture Models on Incomplete Data

10.1101/2019.12.20.884551 ◽

2019 ◽

Cited By ~ 1

Author(s):

Zachary R. McCaw ◽

Hanna Julienne ◽

Hugues Aschard

Keyword(s):

Missing Data ◽

Mixture Models ◽

Gaussian Mixture Models ◽

Model Fitting ◽

Simulated Data ◽

R Package ◽

Gaussian Mixture ◽

Parameter Estimates ◽

Cluster Assignment ◽

Underlying Distribution

Although missing data are prevalent in applications, existing implementations of Gaussian mixture models (GMMs) require complete data. Standard practice is to perform complete-case analysis or imputation prior to model fitting. Both approaches have serious drawbacks, potentially resulting in biased and unstable parameter estimates. Here we present MGMM, an R package for fitting GMMs in the presence of missing data. Using three case studies on real and simulated data sets, we demonstrate that, when the underlying distribution is close to a GMM, MGMM is more effective at recovering the true cluster assignments than state-of-the-art imputation followed by a standard GMM. Moreover, MGMM provides an accurate assessment of cluster assignment uncertainty even when the generative distribution is not a GMM. This assessment may be used to identify unassignable observations. MGMM is available as an R package on CRAN: https://CRAN.R-project.org/package=MGMM.