Methods of Identifying Individual Guessers From Item Response Data

2007 ◽  
Vol 67 (5) ◽  
pp. 745-764 ◽  
Author(s):  
Xiangdong Yang
2020 ◽  
Vol 44 (5) ◽  
pp. 362-375
Author(s):  
Tyler Strachan ◽  
Edward Ip ◽  
Yanyan Fu ◽  
Terry Ackerman ◽  
Shyh-Huei Chen ◽  
...  

As a method to derive a "purified" measure along a dimension of interest from response data that are potentially multidimensional in nature, the projective item response theory (PIRT) approach requires first fitting a multidimensional item response theory (MIRT) model to the data before projecting onto a dimension of interest. This study explores how accurate the PIRT results are when the estimated MIRT model is misspecified. Specifically, we focus on using a (potentially misspecified) two-dimensional (2D) MIRT for projection because of its advantages, including interpretability, identifiability, and computational stability, over higher dimensional models. Two large simulation studies (I and II) were conducted. Both examined whether fitting a 2D-MIRT is sufficient to recover the PIRT parameters when multiple nuisance dimensions exist in the test items; the data were generated under compensatory MIRT models (study I) and bifactor models (study II). Various factors were manipulated, including sample size, test length, latent factor correlation, and number of nuisance dimensions. The results from simulation studies I and II showed that the PIRT was overall robust to a misspecified 2D-MIRT. Smaller third and fourth simulation studies were conducted to evaluate recovery of the PIRT model parameters when the correctly specified higher dimensional MIRT or bifactor model was fitted to the response data. In addition, a real data set was used to illustrate the robustness of PIRT.
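The compensatory MIRT model used as the generating model in study I is conventionally written as a logistic function of a weighted sum of latent traits. A minimal sketch of that standard form (the function name and all parameter values below are illustrative, not taken from the study):

```python
import math

def mirt_prob(theta, a, d):
    """Compensatory multidimensional 2PL: P(correct) = logistic(a . theta + d).

    theta : list of latent trait values (target and nuisance dimensions)
    a     : discrimination (slope) parameters, one per dimension
    d     : scalar intercept (easiness) parameter
    """
    z = sum(ai * ti for ai, ti in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))

# A 2D item loading mostly on the target dimension (a[0]) and
# weakly on a nuisance dimension (a[1]); values are made up.
p = mirt_prob(theta=[1.0, 0.5], a=[1.2, 0.4], d=-0.3)
```

The "compensatory" label reflects that a high value on one dimension can offset a low value on another, since only the weighted sum enters the logistic.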


2021 ◽  
Author(s):  
Benjamin Domingue ◽  
Dimiter Dimitrov

A recently developed framework of measurement, referred to as the Delta-scoring (or D-scoring) method (DSM; e.g., Dimitrov 2016, 2018, 2020), is gaining attention in the field of educational measurement and is widely used in large-scale assessments at the National Center for Assessment in Saudi Arabia. The D-scores obtained under the DSM range from 0 to 1 and indicate how much (what proportion) of the ability measured by a test of binary items is demonstrated by the examinee. This study examines whether the D-scale is an interval scale and how D-scores compare to IRT ability scores (thetas) in terms of intervalness, via testing the axioms of additive conjoint measurement (ACM). The testing approach is ConjointChecks (Domingue, 2014), which implements a Bayesian method for evaluating whether the axioms are violated in a given empirical item response data set. The results indicate that D-scores, computed under the DSM, produce fewer violations of the ordering axioms of ACM than do the IRT "theta" scores. The conclusion is that the DSM produces a dependable D-scale in terms of the essential property of intervalness.
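ConjointChecks evaluates the ACM axioms with a Bayesian procedure that is not reproduced here; the deterministic core of an ordering (single-cancellation) check on a table of observed proportions correct can be sketched as follows (the function name and toy data are illustrative assumptions):

```python
def violates_row_ordering(table):
    """Flag violations of the ACM ordering axiom on a matrix of
    proportions correct (rows: items, columns: score groups).

    If row i exceeds row j in one column, it should do so in every
    column; a sign reversal across columns is an axiom violation.
    """
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_rows):
            signs = set()
            for c in range(n_cols):
                diff = table[i][c] - table[j][c]
                if diff > 0:
                    signs.add(1)
                elif diff < 0:
                    signs.add(-1)
            if signs == {1, -1}:  # rank reversal between rows i and j
                return True
    return False

# Toy data: two items, three score groups (all values invented).
consistent = [[0.2, 0.5, 0.8],
              [0.3, 0.6, 0.9]]  # row ordering preserved in every column
reversed_ = [[0.2, 0.7, 0.8],
             [0.3, 0.6, 0.9]]   # middle column reverses the ordering
```

A score scale closer to interval level should yield fewer such reversals in empirical tables, which is the sense in which the study compares D-scores with IRT thetas.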

