response propensity
Recently Published Documents


TOTAL DOCUMENTS: 30 (FIVE YEARS: 8)
H-INDEX: 7 (FIVE YEARS: 1)

2021 ◽  
Vol 11 (4) ◽  
pp. 1653-1687
Author(s):  
Alexander Robitzsch

Missing item responses are prevalent in educational large-scale assessment studies such as the Programme for International Student Assessment (PISA). The current operational practice scores missing item responses as wrong, but several psychometricians have advocated a model-based treatment based on the latent ignorability assumption. In this approach, item responses and response indicators are jointly modeled conditional on a latent ability and a latent response propensity variable. Alternatively, imputation-based approaches can be used. The latent ignorability assumption is weakened in the Mislevy-Wu model, which characterizes a nonignorable missingness mechanism and allows the missingness of an item to depend on the item response itself. The scoring of missing item responses as wrong and the latent ignorable model are both submodels of the Mislevy-Wu model. An illustrative simulation study shows that the Mislevy-Wu model provides unbiased parameter estimates. Moreover, the simulation replicates the finding of various simulation studies in the literature that scoring missing item responses as wrong yields biased estimates if the latent ignorability assumption holds in the data-generating model. However, if missing item responses can arise only from incorrect item responses, applying an item response model that relies on latent ignorability also results in biased estimates. The Mislevy-Wu model guarantees unbiased parameter estimates whenever the more general Mislevy-Wu model is the data-generating model. In addition, this article uses the PISA 2018 mathematics dataset as a case study to investigate the consequences of different missing data treatments on country means and country standard deviations. The obtained country means and standard deviations can differ substantially across the scaling models. In contrast to previous statements in the literature, scoring missing item responses as incorrect provided a better model fit than a latent ignorable model for most countries. Furthermore, the dependence of the missingness of an item on the item response itself, after conditioning on the latent response propensity, was much more pronounced for constructed-response items than for multiple-choice items. As a consequence, scaling models that presuppose latent ignorability should be rejected from two perspectives. First, the Mislevy-Wu model is preferred over the latent ignorable model for reasons of model fit. Second, in the discussion section, we argue that model fit should play only a minor role in choosing psychometric models in large-scale assessment studies because validity aspects are most relevant. Missing data treatments that countries (and, hence, their students) can simply manipulate result in unfair country comparisons.
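To make the relationship between these scaling models concrete, the following schematic writes the two-part structure described above in a common notation; the 2PL measurement model, the logistic missingness model, and the parameters a_i, b_i, gamma_i, delta_i are illustrative assumptions and may differ from the article's exact parameterization.

```latex
% Schematic two-part formulation (illustrative notation; the article's exact
% parameterization may differ). \Psi denotes the logistic function.
% Measurement model (2PL) for person p, item i, latent ability \theta_p:
P(Y_{pi} = 1 \mid \theta_p) = \Psi\bigl( a_i (\theta_p - b_i) \bigr)
% Missingness model with response indicator R_{pi} (1 = observed),
% latent response propensity \xi_p, and item-specific dependence \delta_i:
P(R_{pi} = 1 \mid Y_{pi}, \xi_p) = \Psi\bigl( \gamma_i + \xi_p + \delta_i Y_{pi} \bigr)
% \delta_i = 0:        latent ignorability (missingness ignores Y_{pi})
% \delta_i \to \infty: correct responses are never missing, so a missing
%                      response can only hide an incorrect one, i.e., the
%                      "score missings as wrong" rule as a limiting case
```

Under this reading, the latent ignorable model and the scoring-as-wrong rule are nested within the Mislevy-Wu model through restrictions on delta_i, which is what makes the model-fit comparisons reported above possible.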


Author(s):  
Alexander Robitzsch

Missing item responses are prevalent in educational large-scale assessment studies such as the Programme for International Student Assessment (PISA). The current operational practice scores missing item responses as wrong, but several psychometricians have advocated a model-based treatment based on the latent ignorability assumption. In this approach, item responses and response indicators are jointly modeled conditional on a latent ability and a latent response propensity variable. Alternatively, imputation-based approaches can be used. The latent ignorability assumption is weakened in the Mislevy-Wu model, which characterizes a nonignorable missingness mechanism and allows the missingness of an item to depend on the item response itself. The scoring of missing item responses as wrong and the latent ignorable model are both submodels of the Mislevy-Wu model. This article uses the PISA 2018 mathematics dataset to investigate the consequences of different missing data treatments on country means. The obtained country means can differ substantially across the scaling models. In contrast to previous statements in the literature, scoring missing item responses as incorrect provided a better model fit than a latent ignorable model for most countries. Furthermore, the dependence of the missingness of an item on the item response itself, after conditioning on the latent response propensity, was much more pronounced for constructed-response items than for multiple-choice items. As a consequence, scaling models that presuppose latent ignorability should be rejected from two perspectives. First, the Mislevy-Wu model is preferred over the latent ignorable model for reasons of model fit. Second, we argue that model fit should play only a minor role in choosing psychometric models in large-scale assessment studies because validity aspects are most relevant. Missing data treatments that countries (and, hence, their students) can simply manipulate result in unfair country comparisons.


Author(s):  
Andy Peytchev ◽  
Daniel Pratt ◽  
Michael Duprey

Reduction in nonresponse bias has been a key focus of responsive and adaptive survey designs, which work through multiple phases of data collection, each defined by a different protocol, and target interventions to a subset of sample elements. Key in this approach is the identification of nonrespondents who, if interviewed, can reduce nonresponse bias in survey estimates. From a design perspective, we need to identify an appropriate model to select targeted cases, in addition to an effective intervention (a change in protocol). From an evaluation perspective, we need to compare estimates to a control condition, which is often omitted from study designs, and we need benchmark estimates for key survey measures to quantify nonresponse bias. We introduced a bias propensity approach for the selection of sample members to reduce nonresponse bias. Unlike a response propensity approach, in which the objective is to maximize the prediction of nonresponse, this new approach deliberately excludes strong predictors of nonresponse that are uncorrelated with survey measures and uses covariates that are of substantive interest to the study. We also devised an analytic approach to simulate which sample members would have responded under a control condition. This study also provided a rare opportunity to estimate nonresponse bias, using rich sampling frame information, prior-round survey data, and data from extensive nonresponse follow-up. The bias propensity model yielded reasonable fit despite the exclusion of the strongest predictors of nonresponse. The intervention was effective in increasing participation among identified sample members. On average, the responsive and adaptive survey design reduced nonresponse bias by more than one-quarter (almost one percentage point), regardless of the choice of benchmark estimates. Effort under the control condition did not reduce nonresponse bias. While the results are strongly encouraging, we argue for replication with varied populations and methods.
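As a rough sketch of the selection step described above, the following code fits a response propensity model restricted to covariates of substantive interest and flags the lowest-propensity nonrespondents for targeting; the function, column names, and targeting share are hypothetical illustrations, not the study's actual model.

```python
# Hypothetical sketch of a "bias propensity" selection step: fit a response
# propensity model that deliberately omits strong nonresponse predictors
# uncorrelated with the survey measures, keeps substantively relevant
# covariates, and flags the lowest-propensity nonrespondents for a targeted
# intervention. Variable names and the targeting share are illustrative.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression


def flag_targeted_cases(frame: pd.DataFrame,
                        substantive_covariates: list[str],
                        responded_col: str = "responded",
                        target_share: float = 0.3) -> pd.Series:
    """Return a boolean flag marking nonrespondents to target."""
    X = frame[substantive_covariates].to_numpy(dtype=float)
    y = frame[responded_col].to_numpy(dtype=int)

    # Propensity model restricted to covariates of substantive interest
    model = LogisticRegression(max_iter=1000).fit(X, y)
    propensity = model.predict_proba(X)[:, 1]

    # Among current nonrespondents, target the lowest-propensity share
    cutoff = np.quantile(propensity[y == 0], target_share)
    return pd.Series((y == 0) & (propensity <= cutoff),
                     index=frame.index, name="targeted")
```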


2020 ◽  
Author(s):  
Boris Forthmann ◽  
Dorota Maria Jankowska ◽  
Maciej Karwowski

Creativity, like any other object of scientific inquiry, requires sound measurement that adheres to quality criteria. For decades, creativity science has been criticized for falling short in developing valid and reliable measures of creative potential, activity, and achievement. Recent years have witnessed a growth of theoretical and empirical work focused on improving creativity assessment. Here, we apply one such recently developed approach, based on item response theory, to examine idea-level and person-level score reliability in a divergent thinking task. A large sample (N = 621) of children and adolescents solved the Circles task from the Torrance Tests of Creative Thinking-Figural and two other figural tests measuring creative thinking (Test of Creative Thinking-Drawing Production) and creative imagination (Test of Creative Imagery Abilities). Employing response propensity models, we observed that the reliability of individual ideas tended to fall below recommended thresholds (even liberal ones, e.g., .60) unless both the sample size and the number of generated ideas (fluency) were large. Importantly, reliability at the idea level affected reliability at the person level much less than might be assumed based on recent findings. We propose a systematic perspective on divergent thinking assessment that considers responses as nested in tasks and tasks as nested in tests. Finally, we note that adding more tasks to divergent thinking tests might increase reliability at the task level.
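One way to see why idea-level reliability can matter less at the person level than expected is a Spearman-Brown-type aggregation argument; the sketch below assumes parallel ideas with a common idea-level reliability, which is a simplification and not the article's psychometric model.

```latex
% Spearman-Brown-type illustration under the simplifying assumption of n
% parallel ideas per person with a common idea-level reliability \rho_{idea}
% (not the article's exact model):
\rho_{\text{person}} \approx \frac{n\,\rho_{\text{idea}}}{1 + (n - 1)\,\rho_{\text{idea}}}
% Example: with \rho_{idea} = .20 and a fluent person producing n = 15 ideas,
% \rho_{person} \approx .79, so person-level scores can remain usable even
% when single ideas are scored unreliably.
```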


Author(s):  
Mary H Mulry ◽  
Nancy Bates ◽  
Matthew Virgile

As the 2020 US Census approaches, the preparations include tests of new methodologies for enumeration that have the potential to reduce cost and improve quality. The 2015 Census Test in Savannah, GA, included tests of Internet and mail response modes and of online delivery of social marketing communications focused on persuading the public to respond by Internet and mail. Merging data from the 2015 Census Test with external third-party lifestyle segments and the Census Bureau's new Low Response Score (LRS) produces a dataset suitable for studying relationships between census response, LRSs, and lifestyle segments. This paper uses the merged dataset to examine whether lifestyle segments can provide insight into hard-to-survey populations, their response behavior, and their interactions with social marketing communications. The article also includes analyses with nationwide data that support the broader use of segmentation variables in self-response propensity models, as well as a discussion of potential applications of lifestyle segment information in tailored and targeted survey designs for hard-to-survey populations.
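The following sketch illustrates how segmentation variables could enter a self-response propensity model alongside the LRS, assuming a merged household-level file; the column names, the dummy coding, and the in-sample AUC comparison are illustrative assumptions rather than the article's analysis.

```python
# Illustrative check: does adding lifestyle-segment indicators to a
# self-response propensity model improve prediction over the Low Response
# Score alone? Column names ("responded", "low_response_score",
# "lifestyle_segment") are hypothetical placeholders; in-sample AUC is used
# only for brevity.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def compare_propensity_models(df: pd.DataFrame) -> dict[str, float]:
    y = df["responded"].astype(int)

    # Baseline: tract-level Low Response Score only
    X_base = df[["low_response_score"]].astype(float)
    p_base = LogisticRegression(max_iter=1000).fit(X_base, y).predict_proba(X_base)[:, 1]

    # Augmented: LRS plus dummy-coded lifestyle segments
    X_seg = pd.get_dummies(df[["low_response_score", "lifestyle_segment"]],
                           columns=["lifestyle_segment"], drop_first=True)
    p_seg = LogisticRegression(max_iter=1000).fit(X_seg, y).predict_proba(X_seg)[:, 1]

    return {"auc_lrs_only": roc_auc_score(y, p_base),
            "auc_lrs_plus_segments": roc_auc_score(y, p_seg)}
```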


2019 ◽  
Vol 90 (3) ◽  
pp. 683-699 ◽  
Author(s):  
Boris Forthmann ◽  
Sue Hyeon Paek ◽  
Denis Dumas ◽  
Baptiste Barbot ◽  
Heinz Holling

2019 ◽  
Vol 8 (3) ◽  
pp. 566-588
Author(s):  
Ruben L Bach ◽  
Stephanie Eckman ◽  
Jessica Daikeler

Many surveys aim to achieve high response rates to keep bias due to nonresponse low. However, research has shown that the relationship between the nonresponse rate and nonresponse bias is weak. In fact, high response rates may come at the cost of measurement error if respondents with low response propensities provide low-quality survey responses. In this paper, we explore the relationship between response propensity and measurement error, specifically motivated misreporting: the tendency to give inaccurate answers in order to speed through an interview. Using data from four surveys conducted in several countries and modes, we analyze whether motivated misreporting is worse among those respondents who were the least likely to respond to the survey. Contrary to the prediction of our theoretical model, we find only limited evidence that reluctant respondents are more likely to misreport.
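A schematic version of the kind of two-step analysis described here might look as follows, assuming a file with frame covariates, a response indicator, and a misreporting flag; all column names and the simple two-step logit are illustrative assumptions, not the paper's exact models.

```python
# Illustrative two-step check: (1) estimate response propensities from frame
# covariates for the full sample, (2) among respondents, regress a
# misreporting indicator on the estimated propensity. Column names
# ("responded", "misreported") and covariates are hypothetical placeholders.
import pandas as pd
import statsmodels.api as sm


def misreporting_by_propensity(sample: pd.DataFrame, frame_covariates: list[str]):
    # Step 1: response propensity model on the full sample
    X = sm.add_constant(sample[frame_covariates].astype(float))
    propensity_fit = sm.Logit(sample["responded"].astype(int), X).fit(disp=False)
    sample = sample.assign(propensity=propensity_fit.predict(X))

    # Step 2: among respondents, does a lower propensity predict misreporting?
    respondents = sample[sample["responded"] == 1]
    X2 = sm.add_constant(respondents[["propensity"]])
    return sm.Logit(respondents["misreported"].astype(int), X2).fit(disp=False)
```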


2019 ◽  
Vol 8 (2) ◽  
pp. 385-411
Author(s):  
Michael T Jackson ◽  
Cameron B McPhee ◽  
Paul J Lavrakas

Monetary incentives are frequently used to improve survey response rates. While it is common to use a single incentive amount for an entire sample, allowing the incentive to vary inversely with the expected probability of response may help to mitigate nonresponse and/or nonresponse bias. Using data from the 2016 National Household Education Survey (NHES:2016), an address-based sample (ABS) of US households, this article evaluates an experiment in which the noncontingent incentive amount was determined by a household's predicted response propensity (RP). Households with the lowest RP received $10, those with the highest received $2 or $0, and those in between received the standard NHES incentive of $5. Relative to a uniform $5 protocol, this "tailored" incentive protocol slightly reduced the response rate and had no impact on observable nonresponse bias. These results serve as an important caution to researchers considering the targeting of incentives or other interventions based on predicted RP. While preferable in theory to "one-size-fits-all" approaches, such differential designs may not improve recruitment outcomes without a dramatic increase in the resources devoted to low-RP cases. If budget and/or ethical concerns limit the resources that can be devoted to such cases, RP-based targeting could have little practical benefit.
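A minimal sketch of the tiering logic described above, assuming predicted RP scores are already available; the quantile cutoffs and the choice between $2 and $0 for the highest-RP group are illustrative assumptions.

```python
# Illustrative assignment of noncontingent incentive amounts by predicted
# response propensity (RP): lowest-RP households get $10, highest-RP
# households get $2 (or $0), and the rest get the standard $5. The quantile
# cutoffs are assumptions made for illustration only.
import numpy as np
import pandas as pd


def assign_incentive(predicted_rp: pd.Series,
                     low_cut: float = 0.2,
                     high_cut: float = 0.8,
                     high_rp_amount: int = 2) -> pd.Series:
    low_q, high_q = predicted_rp.quantile([low_cut, high_cut])
    amount = np.where(predicted_rp <= low_q, 10,
                      np.where(predicted_rp >= high_q, high_rp_amount, 5))
    return pd.Series(amount, index=predicted_rp.index, name="incentive_usd")
```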


2018 ◽  
Vol 7 (2) ◽  
pp. 250-274 ◽  
Author(s):  
Micha Fischer ◽  
Brady T West ◽  
Michael R Elliott ◽  
Frauke Kreuter

This article examines the influence of interviewers on the estimation of regression coefficients from survey data. First, we present theoretical considerations with a focus on measurement errors and nonresponse errors due to interviewers. Then, we show via simulation which of several nonresponse and measurement error scenarios has the biggest impact on the estimate of a slope parameter from a simple linear regression model. When response propensity depends on the dependent variable of a linear regression model, bias is introduced into the estimated slope parameter. We find no evidence that interviewer effects on the response propensity have a large impact on the estimated regression parameters. We do find, however, that interviewer effects on the predictor variable of interest explain a large portion of the bias in the estimated regression parameter. Simulation studies suggest that standard measurement error adjustments using the reliability ratio (i.e., the ratio of the measurement-error-free variance to the observed variance with measurement error) can correct most of the bias introduced by these interviewer effects in a variety of complex settings, indicating that more routine adjustment for such effects should be considered in regression analyses using survey data.
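For the simple case of a single error-prone predictor, the classical attenuation result shows how the reliability-ratio adjustment mentioned above operates (a textbook sketch, not the article's full simulation setup):

```latex
% Classical attenuation sketch for one predictor measured with error
% (textbook result; the article's simulations cover more complex settings).
% Observed predictor x^{*} = x + e, with e uncorrelated with x and y:
\lambda = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_e^2} \qquad \text{(reliability ratio)}
% The naive OLS slope is attenuated toward zero,
E\bigl[\hat{\beta}_{\text{naive}}\bigr] \approx \lambda \, \beta ,
% so dividing by an estimate of \lambda recovers the slope:
\hat{\beta}_{\text{adj}} = \hat{\beta}_{\text{naive}} / \hat{\lambda} .
```

In the article's setting, the extra variance in the predictor is induced by interviewers rather than a generic error term, but the correction logic is the same.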

