A comparison of IRT item fit statistics for dichotomous responses

2010 ◽  
Author(s):  
Ying Guo ◽  
Louis Tay ◽  
Fritz Drasgow


2017 ◽  
Vol 44 (3-4) ◽  
pp. 196-202 ◽  
Author(s):  
Roger Watson ◽  
Annamaria Bagnasco ◽  
Gianluca Catania ◽  
Giuseppe Aleo ◽  
Milko Zanini ◽  
...  

Aims/Background: The Edinburgh Feeding Evaluation in Dementia (EdFED) scale has been shown to have good psychometric properties using a range of methods, including Mokken scaling. We aimed to study the Italian version of the EdFED (EdFED-I) using Mokken scaling. Methods: Data were gathered at 7 time points from 401 nursing home residents with dementia over the course of a 6-month intervention study and were analyzed using analysis of variance, Mokken scaling, and person-item fit statistics. Results: The properties of the EdFED-I scale were stable over the course of the study, with 4 items showing invariant item ordering at all time points. Some items behaved differently at different levels of difficulty in the scale and also depending on the mean level of feeding difficulty. The test information function showed a dip in the mid-range of difficulty scores.
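Mokken scaling evaluates whether items form a cumulative scale via Loevinger's H coefficient. The abstract does not include the authors' code, so the following is a minimal Python sketch of the scale-level H for dichotomous items; NumPy is assumed, and the simulated data and all names are illustrative, not from the EdFED-I study. Values above roughly 0.30 are conventionally taken as scalable.

```python
import numpy as np

def loevinger_h(X):
    """Scale-level Loevinger's H for dichotomous item scores.

    X: (n_persons, n_items) array of 0/1 responses. A Guttman error is
    a pass on the harder (less popular) item paired with a fail on the
    easier one; H compares observed errors to those expected under
    marginal independence.
    """
    n, k = X.shape
    p = X.mean(axis=0)                # item popularities
    order = np.argsort(-p)            # sort so the easiest item comes first
    X = X[:, order]
    p = p[order]
    obs_err = 0.0
    exp_err = 0.0
    for i in range(k - 1):
        for j in range(i + 1, k):     # item j is harder than item i
            obs_err += np.sum((X[:, j] == 1) & (X[:, i] == 0))
            exp_err += n * p[j] * (1 - p[i])   # expected under independence
    return 1.0 - obs_err / exp_err

# Illustrative usage with simulated Rasch-like data (not EdFED-I data)
rng = np.random.default_rng(0)
theta = rng.normal(size=(200, 1))
difficulty = np.linspace(-1.5, 1.5, 5)
X = (rng.random((200, 5)) < 1 / (1 + np.exp(-(theta - difficulty)))).astype(int)
print(f"Loevinger's H = {loevinger_h(X):.2f}")  # > .30 is conventionally scalable
```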


2016 ◽  
Vol 35 (6) ◽  
pp. 615-627 ◽  
Author(s):  
Shih-Ying Yao ◽  
David Muñez ◽  
Rebecca Bull ◽  
Kerry Lee ◽  
Kiat Hui Khng ◽  
...  

The Test of Early Mathematics Ability–Third Edition (TEMA-3) is a commonly used measure of early mathematics knowledge for children aged 3 years to 8 years 11 months. In spite of its wide use, research on the psychometric properties of TEMA-3 remains limited. This study applied the Rasch model to investigate the psychometric properties of TEMA-3 from three aspects: technical qualities, internal structure, and convergent evidence. Data were collected from 971 K1 children in Singapore. Item fit statistics suggested a reasonable model-data fit. The TEMA-3 items were found to demonstrate generally good technical qualities, interpretable internal structure, and reasonable convergent evidence. Implications for test development, test use, and future research are further discussed.
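For readers unfamiliar with the Rasch item fit statistics reported here, the following is a minimal sketch of the standard infit and outfit mean squares, assuming dichotomous data and already-estimated person and item parameters (the function and variable names are ours, not from the TEMA-3 study). Values near 1.0 indicate model-data fit.

```python
import numpy as np

def rasch_item_fit(X, theta, b):
    """Infit and outfit mean squares for dichotomous Rasch data.

    X: (n_persons, n_items) 0/1 responses; theta: (n_persons,) person
    estimates; b: (n_items,) item difficulties. Values near 1.0 indicate
    fit; roughly 0.5-1.5 is a common "productive for measurement" band.
    """
    P = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))  # model probabilities
    W = P * (1 - P)                                       # binomial variance
    sq_res = (X - P) ** 2                                 # squared residuals
    outfit = (sq_res / W).mean(axis=0)          # unweighted: sensitive to outliers
    infit = sq_res.sum(axis=0) / W.sum(axis=0)  # information-weighted: on-target misfit
    return infit, outfit
```

Outfit is an unweighted mean of squared standardized residuals and so is dominated by off-target outliers; infit weights by binomial information and emphasizes responses near the item's difficulty.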


2020 ◽  
pp. 107699862095376
Author(s):  
Scott Monroe

This research proposes a new statistic for testing latent variable distribution fit for unidimensional item response theory (IRT) models. If the typical assumption of normality is violated, then item parameter estimates will be biased, and dependent quantities such as IRT score estimates will be adversely affected. The proposed statistic compares the specified latent variable distribution to the sample average of latent variable posterior distributions commonly used in IRT scoring. Formally, the statistic is an instantiation of a generalized residual and is thus asymptotically distributed as standard normal. Also, the statistic naturally complements residual-based item-fit statistics, as both are conditional on the latent trait, and can be presented with graphical plots. In addition, a corresponding unconditional statistic, which controls for multiple comparisons, is proposed. The statistics are evaluated using a simulation study, and empirical analyses are provided.
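Monroe's statistic formalizes this comparison as a generalized residual with a known asymptotic distribution. The sketch below is a hypothetical 2PL setup, not the paper's code; it computes only the two ingredients being compared: the specified N(0,1) quadrature weights and the sample average of the posterior weights used in EAP scoring. Pointwise gaps between the two are the kind of misfit the proposed statistic standardizes and tests.

```python
import numpy as np
from scipy.stats import norm

def avg_posterior_vs_prior(X, a, b, nodes=None):
    """Compare the specified N(0,1) latent density with the sample average
    of the per-person posteriors used in EAP scoring (2PL model assumed).

    X: (n_persons, n_items) 0/1 responses; a, b: item slopes/difficulties.
    Returns (prior_weights, mean_posterior_weights) over the quadrature grid.
    """
    if nodes is None:
        nodes = np.linspace(-4, 4, 41)
    prior = norm.pdf(nodes)
    prior /= prior.sum()                               # normalized quadrature weights
    P = 1 / (1 + np.exp(-a * (nodes[:, None] - b)))    # (n_nodes, n_items)
    ll = X @ np.log(P).T + (1 - X) @ np.log(1 - P).T   # pattern log-likelihoods
    post = np.exp(ll) * prior                          # unnormalized posteriors
    post /= post.sum(axis=1, keepdims=True)            # normalize per person
    return prior, post.mean(axis=0)                    # pointwise gaps signal misfit
```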


2013 ◽  
pp. 83-104 ◽  
Author(s):  
Karl Bang Christensen ◽  
Svend Kreiner

2017 ◽  
Vol 41 (5) ◽  
pp. 388-400 ◽  
Author(s):  
Carmen Köhler ◽  
Johannes Hartig

Testing item fit is an important step when calibrating and analyzing item response theory (IRT)-based tests, as model fit is a necessary prerequisite for drawing valid inferences from estimated parameters. Numerous item fit statistics exist in the literature, sometimes leading to contradictory conclusions about which items should be excluded from a test. Recently, researchers have argued for shifting the focus from statistical item fit analyses to evaluating the practical consequences of item misfit. This article introduces a method to quantify the potential bias in relationship estimates (e.g., correlation coefficients) caused by misfitting items. The potential deviation indicates whether item misfit is practically significant for the outcomes of substantive analyses. The method is demonstrated using data from an educational test.
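As a rough illustration of the underlying idea only (the article's actual method quantifies the potential bias analytically rather than by re-scoring), one can compare a relationship estimate computed from all items against one computed with the flagged items removed. The simulated data and flagged-item indices below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 1000, 10
theta = rng.normal(size=n)
criterion = 0.5 * theta + rng.normal(scale=0.9, size=n)  # external outcome
# simulate dichotomous responses under a Rasch-like model
b = np.linspace(-2, 2, k)
X = (rng.random((n, k)) < 1 / (1 + np.exp(-(theta[:, None] - b)))).astype(int)

flagged = [2, 7]                                  # hypothetical misfitting items
keep = np.setdiff1d(np.arange(k), flagged)

r_all = np.corrcoef(X.sum(axis=1), criterion)[0, 1]               # all items
r_reduced = np.corrcoef(X[:, keep].sum(axis=1), criterion)[0, 1]  # misfit removed
print(f"potential bias in r: {r_all - r_reduced:+.3f}")  # near zero -> ignorable
```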


2021 ◽  
Vol 5 (2) ◽  
Author(s):  
Igogbe Regina Onyilo ◽  
Mahyuddin Arsat ◽  
Nor Fadila Amin

This article aims to determine the validity of the developed constructs and to check the reliability of a newly developed instrument, the Questionnaire on Green Competencies for Automobile Engineering Technology (QGCAET), for the Automobile Technology Programme in Nigerian universities. The instrument consists of 170 items measuring four constructs, namely Technical Green Competencies, Managerial Green Competencies, Personal Green Competencies, and Social Green Competencies, and was administered to 299 respondents made up of lecturers, technologists, and final-year students of the Automobile Engineering and Technology programme in Nigerian universities. The Rasch model was used to examine the validity and reliability of the items. Item polarity analysis indicates that the point-measure correlations (PTMEA CORR) of the 170 green competencies items range from 0.00 to 0.55. Summary statistics show that item reliability and item separation for the green competencies instrument are 0.98 and 6.46, respectively. Similarly, the item reliability of each construct ranges from 0.96 to 0.99, and person reliability values range from 0.79 to 1.97. In terms of item fit statistics, a total of 157 items fit the model and thus serve the objectives of the study. The results also indicate that fit values for the four identified green competencies constructs range from 0.61 to 1.49, signifying that the items within each construct work together in measuring it and are therefore suitable for achieving the objectives of the research.
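For context, the quantities reported above can be reproduced from basic ingredients. The sketch below assumes person measures have already been estimated (the function names are ours, not from the study); it computes point-measure correlations and the standard Winsteps-style conversion between reliability and separation. For example, an item reliability of 0.98 implies a separation of about 7, the same order of magnitude as the 6.46 reported.

```python
import numpy as np

def ptmea_corr(X, measures):
    """Point-measure correlations: each item's 0/1 scores correlated with
    the person measures. Positive values confirm the item works in the
    intended direction (item polarity check)."""
    return np.array([np.corrcoef(X[:, i], measures)[0, 1]
                     for i in range(X.shape[1])])

def separation(reliability):
    """Convert a reliability R into a separation index G = sqrt(R / (1 - R)).
    E.g., R = 0.98 gives G = 7.0."""
    return np.sqrt(reliability / (1 - reliability))

print(separation(0.98))  # 7.0
```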


2003 ◽  
Vol 27 (2) ◽  
pp. 87-106 ◽  
Author(s):  
Cees A.W. Glas ◽  
Juan Carlos Suárez Falcón
