Choosing Imputation Models

2021 ◽  
pp. 1-9
Author(s):  
Moritz Marbach

Abstract Imputing missing values is an important preprocessing step in data analysis, but the literature offers little guidance on how to choose between imputation models. This letter suggests adopting the imputation model that generates a density of imputed values most similar to that of the observed values for an incomplete variable after balancing all other covariates. We recommend stable balancing weights as a practical approach to balance covariates whose distribution is expected to differ if the values are not missing completely at random. After balancing, discrepancy statistics can be used to compare the densities of imputed and observed values. We illustrate the application of the suggested approach using simulated data and real-world survey data from the American National Election Study, comparing popular imputation approaches including random forests, hot-deck, predictive mean matching, and multivariate normal imputation. An R package implementing the suggested approach accompanies this letter.
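The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the accompanying R package and makes several assumptions: the balancing weights are taken as given (they are not computed here), they are applied to the observed values, and a weighted Kolmogorov-Smirnov distance stands in for the unspecified "discrepancy statistics". The function names `weighted_ecdf`, `weighted_ks`, and `choose_model` are hypothetical.

```python
def weighted_ecdf(values, weights):
    """Return a function x -> weighted share of `values` that are <= x."""
    total = float(sum(weights))
    pairs = list(zip(values, weights))

    def cdf(x):
        return sum(w for v, w in pairs if v <= x) / total

    return cdf


def weighted_ks(observed, imputed, obs_weights=None, imp_weights=None):
    """Largest gap between the weighted ECDFs of observed and imputed values.

    Assumption: the weights come from a separate balancing step;
    missing weights default to 1 for every value.
    """
    if obs_weights is None:
        obs_weights = [1.0] * len(observed)
    if imp_weights is None:
        imp_weights = [1.0] * len(imputed)
    f_obs = weighted_ecdf(observed, obs_weights)
    f_imp = weighted_ecdf(imputed, imp_weights)
    # Both ECDFs are step functions, so the maximum gap occurs at a data point.
    grid = sorted(set(observed) | set(imputed))
    return max(abs(f_obs(x) - f_imp(x)) for x in grid)


def choose_model(observed, obs_weights, imputations):
    """Pick the imputation model whose imputed values are closest to the
    weighted observed values. `imputations` maps a model name to its values."""
    scores = {
        name: weighted_ks(observed, values, obs_weights)
        for name, values in imputations.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

Under these assumptions, `choose_model(observed, weights, {"pmm": [...], "hot_deck": [...]})` returns the name of the candidate with the smallest discrepancy together with all scores, so the ranking of candidates can be inspected rather than only the winner.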

2020 ◽  
Author(s):  
Q. Giai Gianetto ◽  
S. Wieczorek ◽  
Y. Couté ◽  
T. Burger

Abstract Motivation: Quantitative mass spectrometry-based proteomics data are characterized by high rates of missing values, which may be of two kinds: missing completely-at-random (MCAR) and missing not-at-random (MNAR). Despite the numerous imputation methods available in the literature, none accounts for this duality, because doing so would require diagnosing the missingness mechanism behind each missing value. Results: A multiple imputation strategy is proposed that combines MCAR-devoted and MNAR-devoted imputation algorithms. First, we propose an estimator for the proportion of MCAR values and show that it is asymptotically unbiased under assumptions adapted to label-free proteomics data. This allows us to estimate the number of MCAR values in each sample and to take into account the nature of missing values through an original multiple imputation method. We evaluate this approach on simulated data and show that it outperforms traditionally used imputation algorithms. Availability: The proposed methods are implemented in the R package imp4p (available on CRAN; Giai Gianetto (2020)), which is itself accessible through Prostar [email protected]; [email protected]


2021 ◽  
pp. 1-27 ◽  
Author(s):  
Julie VanDusky-Allen ◽  
Stephen M. Utych

Abstract In this paper, we analyze how variations in partisan representation across different levels of government influence Americans’ satisfaction with democracy in the United States. We conduct two survey experiments and analyze data from the 2016 American National Election Study postelection survey. We find that Americans are most satisfied with democracy when their most preferred party controls both the federal government and their respective state government. However, we also find that even if an individual’s least preferred party controls only one level of government, they are still more satisfied with democracy than if their most preferred party controls no level of government. These findings suggest that electoral competition across both national and state governments, where winning and losing alternate between the two parties, may have positive outcomes for attitudes toward democracy.

