Abstract
Background: Among recent multiple imputation methods, Multiple Imputation by Chained Equations (MICE) is popular because of its flexibility. This study compares the performance of parametric imputation models based on predictive mean matching (PMM) with that of recursive partitioning methods within MICE when the data contain an interaction.

Methods: We compared the performance of parametric and tree-based imputation methods via simulation using two data generation models. For each combination of data generation model and imputation method, the following steps were performed: data generation, removal of observations, imputation, logistic regression analysis, and calculation of bias, coverage probability (CP), and confidence interval (CI) width for each coefficient. In addition, the model-based and empirical standard errors (SE) and the estimated proportion of the variance attributable to the missing data (λ) were calculated.

Results: Our simulations show that, when imputing a binary response in data involving an interaction, manually entering the interaction term into the PMM imputation model improves the performance of PMM relative to the recursive partitioning methods in MICE. The parametric method in which the interaction term was entered into the imputation model (MICE-Interaction) yielded smaller bias and slightly higher coverage probability for the interaction effect, but slightly wider confidence intervals, than tree-based imputation (especially classification and regression trees).

Conclusions: MICE-Interaction performed better than recursive partitioning methods in MICE. However, when the user is interested in estimating the interaction but does not know enough about the structure of the data, recursive partitioning methods can be suggested for imputing the missing values.
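
As an illustration of the performance measures named in the Methods, the following minimal Python sketch shows how they could be computed for a single coefficient. It is not the code used in this study. The function names `pool_rubin` and `summarize_simulation` are hypothetical, the pooling follows the standard Rubin's rules, and the 95% intervals are assumed to be normal-based (estimate ± 1.96 × SE), which may differ from the t-based intervals usually reported by MICE software.

```python
from math import sqrt
from statistics import mean, stdev, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    estimates: the m point estimates, one per imputed data set
    variances: the m squared standard errors, one per imputed data set
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)            # pooled point estimate
    u_bar = mean(variances)            # within-imputation variance
    b = variance(estimates)            # between-imputation variance (m - 1 denominator)
    t = u_bar + (1 + 1 / m) * b        # total variance
    lam = (1 + 1 / m) * b / t          # proportion of variance attributable to missing data
    return {"estimate": q_bar, "se": sqrt(t), "lambda": lam}


def summarize_simulation(estimates, ses, true_value, z=1.96):
    """Summarize pooled results over simulation replicates for one coefficient.

    Assumption: normal-based intervals, estimate +/- z * SE.
    """
    if len(estimates) < 2 or len(estimates) != len(ses):
        raise ValueError("need at least two replicates and matching lengths")
    covered = [
        est - z * se <= true_value <= est + z * se
        for est, se in zip(estimates, ses)
    ]
    return {
        "bias": mean(estimates) - true_value,
        "coverage": sum(covered) / len(covered),
        "ci_width": mean(2 * z * se for se in ses),
        "model_se": mean(ses),          # average of the estimated SEs
        "empirical_se": stdev(estimates),  # SD of the estimates across replicates
    }
```

In a simulation of the kind described, `pool_rubin` would be applied once per replicate to the logistic regression estimates from the imputed data sets, and `summarize_simulation` would then be applied to the pooled estimates and standard errors across replicates.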