Performance of the S-χ² Statistic for the Multidimensional Graded Response Model

2020, pp. 001316442095806
Author(s): Shiyang Su, Chun Wang, David J. Weiss

S-χ² is a popular item fit index that is available in commercial software packages such as flexMIRT. However, no research has systematically examined the performance of S-χ² for detecting item misfit within the context of the multidimensional graded response model (MGRM). The primary goal of this study was to evaluate the performance of S-χ² under two practical misfit scenarios: first, all items are misfitting due to model misspecification, and second, a small subset of items violates the underlying assumptions of the MGRM. Simulation studies showed that caution should be exercised when reporting item fit results for polytomous items using S-χ² within the context of the MGRM, because of its inflated false positive rates (FPRs), especially with a small sample size and a long test. S-χ² performed well when detecting overall model misfit, as well as item misfit for a small subset of items, when the ordinality assumption was violated. However, under a number of conditions of model misspecification, or of items violating the homogeneous discrimination assumption, even though the true positive rates (TPRs) of S-χ² were high when a small sample size was coupled with a long test, the inflated FPRs were generally directly related to the increasing TPRs. Results also suggested that the performance of S-χ² was affected by the magnitude of misfit within an item. There was no evidence that FPRs for fitting items were exacerbated by the presence of a small percentage of misfitting items among them.
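For context, S-χ² is a Pearson-type statistic that compares observed and model-implied response proportions within groups of examinees formed by the summed rest score (the total score excluding the studied item). A sketch of its polytomous form, in the Orlando-Thissen tradition that flexMIRT follows (the indexing and degrees-of-freedom conventions here are standard assumptions, not taken from the article):

\[
S\text{-}\chi^2_i \;=\; \sum_{k=1}^{K} \sum_{j=0}^{m_i - 1} N_k \, \frac{\bigl(O_{ikj} - E_{ikj}\bigr)^2}{E_{ikj}},
\]

where \(N_k\) is the number of examinees in rest-score group \(k\), \(O_{ikj}\) and \(E_{ikj}\) are the observed and expected proportions responding in category \(j\) of item \(i\) within group \(k\), and the statistic is referred to a chi-square distribution with degrees of freedom determined by the numbers of groups and categories (after collapsing sparse cells).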

Author(s): Rabeeah M. Alsaqri, Mohsen N. Al Salmi

The study aimed to calibrate Omani PIRLS data using the graded response model, to examine the test's psychometric properties, and to identify its fitting and misfitting items. The PIRLS 2011 test booklets were used, consisting of 146 test items (74 dichotomous and 72 polytomous) divided into 13 booklets, each with two blocks (one literary and one informational). The booklets were administered to 13 groups of fourth-grade students in the Sultanate of Oman, with a total sample of 10,394 students. The IRT assumptions of unidimensionality and local independence were examined and supported, and item fit was examined under Samejima's graded response model. The data were analyzed with the MULTILOG 7.03 program to estimate both item and ability parameters. IRT analysis revealed that 8 items showed misfit, representing only about 5% of the test items, so the test can be said to have good psychometric properties under IRT.
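For reference, Samejima's graded response model expresses the probability of responding in category \(k\) or higher through a cumulative boundary function, from which category probabilities are obtained by differencing; a standard unidimensional statement (notation assumed, not taken from the abstract) is

\[
P^{*}_{ik}(\theta) \;=\; \frac{\exp\{a_i(\theta - b_{ik})\}}{1 + \exp\{a_i(\theta - b_{ik})\}},
\qquad
P_{ik}(\theta) \;=\; P^{*}_{ik}(\theta) - P^{*}_{i,k+1}(\theta),
\]

with \(P^{*}_{i0}(\theta) = 1\) and \(P^{*}_{i,m_i}(\theta) = 0\), where \(a_i\) is the discrimination and \(b_{ik}\) the category threshold parameters that MULTILOG estimates for each item.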


Author(s): Amal K. Al-zaabi, Abdulhameed Hassan, Rashid S. Al-mehrzi



2020, Vol 10
Author(s): Jianhua Xiong, Shuliang Ding, Fen Luo, Zhaosheng Luo

2018, Vol 79 (3), pp. 545-557
Author(s): Dimiter M. Dimitrov, Yong Luo

An approach to scoring tests with binary items, referred to as D-scoring method, was previously developed as a classical analog to basic models in item response theory (IRT) for binary items. As some tests include polytomous items, this study offers an approach to D-scoring of such items and parallels the results with those obtained under the graded response model (GRM) for ordered polytomous items in the framework of IRT. The proposed design of using D-scoring with “virtual” binary items generated from polytomous items provides (a) ability scores that are consistent with their GRM counterparts and (b) item category response functions analogous to those obtained under the GRM. This approach provides a unified framework for D-scoring and psychometric analysis of tests with binary and/or polytomous items that can be efficient in different scenarios of educational and psychological assessment.
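The "virtual" binary items can be illustrated concretely. A natural way to dichotomize an \(m\)-category item is into \(m-1\) cumulative indicators (response \(\geq k\)), which parallels the GRM's boundary response functions; the Python sketch below assumes this scheme, since the abstract does not spell out the exact construction used by the authors.

```python
import numpy as np

def to_virtual_binary(responses, n_categories):
    """Dichotomize polytomous responses coded 0..m-1 into m-1 cumulative
    'virtual' binary items: indicator k is 1 when the response is >= k.
    The cumulative scheme mirrors the GRM's boundary functions; whether it
    matches the authors' exact construction is an assumption."""
    responses = np.asarray(responses)
    return np.column_stack([(responses >= k).astype(int)
                            for k in range(1, n_categories)])

# A 4-category item yields three virtual binary items per examinee.
poly = np.array([0, 1, 3, 2, 3])
print(to_virtual_binary(poly, 4))
# [[0 0 0]
#  [1 0 0]
#  [1 1 1]
#  [1 1 0]
#  [1 1 1]]
```

Each column of the result can then be scored as an ordinary binary item, which is what allows a single D-scoring framework to handle mixed-format tests.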

