Efficient Standard Errors in Item Response Theory Models for Short Tests

2019, Vol. 80(3), pp. 461-475
Author(s): Lianne Ippel, David Magis

In the dichotomous item response theory (IRT) framework, the asymptotic standard error (ASE) is the most common statistic for evaluating the precision of various ability estimators. Easy-to-use ASE formulas are readily available; however, the accuracy of some of these formulas was recently questioned, and new ASE formulas were derived from a general asymptotic theory framework. Furthermore, exact standard errors were suggested to better evaluate the precision of ability estimators, especially with short tests, for which the asymptotic framework is invalid. Unfortunately, the accuracy of exact standard errors has so far been assessed only in a very limited setting. The purpose of this article is to perform a global comparison of exact versus (classical and new formulations of) asymptotic standard errors for a wide range of common IRT ability estimators and IRT models with short tests. Results indicate that exact standard errors globally outperform the ASE versions in terms of reduced bias and root mean square error, while the new ASE formulas are also globally less biased than their classical counterparts. Further discussion of the usefulness and practical computation of exact standard errors is provided.
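For context, the classical ASE referred to above is typically computed as the inverse square root of the test information function evaluated at the ability estimate; a standard statement for the maximum likelihood estimator in a dichotomous IRT model is sketched below (the article's new and exact formulas are not reproduced here).

```latex
% Classical ASE of the maximum likelihood ability estimate \hat{\theta}
% for a J-item dichotomous test (standard textbook form; the new and exact
% standard errors compared in the article are not shown here).
I(\theta) = \sum_{j=1}^{J}
  \frac{\bigl[P_j'(\theta)\bigr]^{2}}{P_j(\theta)\,\bigl(1 - P_j(\theta)\bigr)},
\qquad
\operatorname{ASE}(\hat{\theta}) = \frac{1}{\sqrt{I(\hat{\theta})}},
```

where P_j(θ) is the model-implied probability of a correct response to item j; under the two-parameter logistic model, P_j'(θ) = D a_j P_j(θ){1 − P_j(θ)}.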

2018, Vol. 29(1), pp. 35-44
Author(s): Nell Sedransk

This article is about FMCSA data and its analysis. The article responds to the two-part question: How does an Item Response Theory (IRT) model work differently . . . or better than any other model? The response to the first part is a careful, completely non-technical exposition of the fundamentals of IRT models. It differentiates IRT models from other models by providing the rationale underlying IRT modeling and by using graphs to illustrate two key properties of data items. The response to the second part of the question, about the superiority of an IRT model, is "it depends." For FMCSA data, serious challenges arise from the complexity of the data and from the heterogeneity of the carrier industry. Questions are posed that will need to be addressed to determine the success of the actual model developed and of the scoring system.
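The abstract does not name the two key item properties; in non-technical introductions to IRT they are commonly item difficulty and item discrimination, which appear graphically as the location and slope of the item characteristic curve, for example under a two-parameter logistic model (an illustrative assumption, not taken from the article):

```latex
% Two-parameter logistic item characteristic curve (illustrative only):
% b_j is the item difficulty (location), a_j the discrimination (slope).
P_j(\theta) = \frac{1}{1 + \exp\bigl\{-a_j(\theta - b_j)\bigr\}}
```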


1998, Vol. 23(3), pp. 236-243
Author(s): Eric T. Bradlow, Neal Thomas

Examinations that permit students to choose a subset of the items are popular despite the potential that students may take examinations of varying difficulty as a result of their choices. We provide a set of conditions for the validity of inference for Item Response Theory (IRT) models applied to data collected from choice-based examinations. Valid likelihood and Bayesian inference using standard estimation methods require (except in extraordinary circumstances) that there is no dependence, after conditioning on the observed item responses, between the examinees' choices and either their (potential but unobserved) responses to omitted items or their latent abilities. These independence assumptions are typical of those required in much more general settings. Common low-dimensional IRT models estimated by standard methods, though potentially useful tools for educational data, do not resolve the difficult problems posed by choice-based data.
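One way to write the independence condition described above, with notation assumed here for illustration: let C denote an examinee's item choices, Y_obs the responses to the chosen items, Y_mis the potential responses to omitted items, and θ the latent ability. Standard likelihood or Bayesian estimation then requires, roughly, that

```latex
% Choice ignorability condition (notation assumed for illustration):
% given the observed responses, an examinee's choices carry no further
% information about the omitted responses or the latent ability.
P\bigl(C \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \theta\bigr)
  = P\bigl(C \mid Y_{\mathrm{obs}}\bigr).
```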


2021, Vol. 117, pp. 106849
Author(s): Danilo Carrozzino, Kaj Sparle Christensen, Giovanni Mansueto, Fiammetta Cosci

2014, Vol. 22(2), pp. 323-341
Author(s): Dheeraj Raju, Xiaogang Su, Patricia A. Patrician

Background and Purpose: The purpose of this article is to introduce different types of item response theory models and to demonstrate their usefulness by evaluating the Practice Environment Scale. Methods: Item response theory models such as the constrained and unconstrained graded response models, the partial credit model, the Rasch model, and the one-parameter logistic model are demonstrated. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) are used as model selection criteria. Results: The unconstrained graded response and partial credit models indicated the best fit to the data. Almost all items in the instrument performed well. Conclusions: Although most of the items strongly measure the construct, a few items could be eliminated without substantially altering the instrument. The analysis revealed that the instrument may function differently when administered to different unit types.
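A minimal sketch of the kind of model comparison described above, assuming `pes` is a hypothetical data frame of polytomous Practice Environment Scale item responses; the ltm package is used here for illustration, since the article does not specify its software:

```r
# Minimal sketch, assuming 'pes' is a data frame of ordinal (Likert-type)
# item responses; the article's exact software and settings are not known.
library(ltm)

fit_grm_c <- grm(pes, constrained = TRUE)     # constrained GRM (equal discriminations)
fit_grm_u <- grm(pes, constrained = FALSE)    # unconstrained GRM
fit_pcm   <- gpcm(pes, constraint = "rasch")  # partial credit model (slopes fixed at 1)

# Each summary reports the log-likelihood with AIC and BIC; lower AIC/BIC
# values indicate the preferred model. (Rasch and 1PL fits for dichotomously
# coded items would follow the same pattern with rasch() and ltm().)
summary(fit_grm_c)
summary(fit_grm_u)
summary(fit_pcm)
```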


2021, Vol. 8(3), pp. 672-695
Author(s): Thomas DeVaney

This article presents a discussion and illustration of Mokken scale analysis (MSA), a nonparametric form of item response theory (IRT), in relation to common IRT models such as Rasch and Guttman scaling. The procedure can be used for the dichotomous and ordinal polytomous data commonly collected with questionnaires. The assumptions of MSA are discussed, as are the characteristics that differentiate a Mokken scale from a Guttman scale. MSA is illustrated using the mokken package in RStudio and a data set that included over 3,340 responses to a modified version of the Statistical Anxiety Rating Scale. Issues addressed in the illustration include monotonicity, scalability, and invariant ordering. The R script for the illustration is included.
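A brief sketch of the kind of mokken calls the illustration describes (the function names below are from the mokken package; `sars` is a hypothetical data frame of responses to the modified Statistical Anxiety Rating Scale, and the lower bound is an arbitrary example value, not the article's setting):

```r
# Minimal sketch with the mokken package; 'sars' is a hypothetical data frame
# of polytomous item responses to the modified scale described in the article.
library(mokken)

coefH(sars)                       # scalability: item-pair, item, and total-scale H coefficients
aisp(sars, lowerbound = 0.3)      # automated item selection into Mokken scales

mono <- check.monotonicity(sars)  # monotonicity of item (step) response functions
summary(mono)

iio <- check.iio(sars)            # invariant item ordering
summary(iio)
```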

