Using Item Response Theory to Evaluate Measurement Invariance in Health-Related Measures

2021 ◽  
Vol 9 ◽  
Author(s):  
Ron D. Hays ◽  
David Hubble ◽  
Frank Jenkins ◽  
Alexa Fraser ◽  
Beryl Carew

The National Children's Study (NCS) statistics and item response theory group was tasked with promoting the quality of study measures and analysis. This paper provides an overview of six measurement and statistical considerations for the NCS: (1) Conceptual and Measurement Model; (2) Reliability; (3) Validity; (4) Measurement Invariance; (5) Interpretability of Scores; and (6) Burden of Administration. The guidance was based primarily on recommendations of the International Society for Quality of Life Research.


2020 ◽  
Author(s):  
E. Damiano D'Urso ◽  
Kim De Roover ◽  
Jeroen K. Vermunt ◽  
Jesper Tijmstra

In the social sciences, the study of group differences concerning latent constructs is ubiquitous. These constructs are generally measured by means of scales composed of ordinal items. In order to compare these constructs across groups, one crucial requirement is that they are measured equivalently or, in technical jargon, that measurement invariance (MI) holds across the groups. This study compared the performance of multiple group categorical confirmatory factor analysis (MG-CCFA) and multiple group item response theory (MG-IRT) in testing measurement invariance with ordinal data. A simulation study was conducted to compare the true positive rate (TPR) and false positive rate (FPR) both at the scale and at the item level for these two approaches under an invariance and a non-invariance scenario. The results of the simulation studies showed that the performance, in terms of the TPR, of MG-CCFA- and MG-IRT-based approaches mostly depends on the scale length. In fact, for long scales, the likelihood ratio test (LRT) approach for MG-IRT outperformed the other approaches, while for short scales MG-CCFA seemed to be generally preferable. In addition, the performance of MG-CCFA's fit measures, such as RMSEA and CFI, seemed to depend largely on the length of the scale, especially when MI was tested at the item level. General caution is recommended when using these measures, especially when MI is tested for each item individually. A decision flowchart, based on the results of the simulation studies, is provided to summarize the results and to indicate which approach performed best in which setting.
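As an illustration of the two criteria named in this abstract, the sketch below computes a true positive rate and a false positive rate for item-level invariance testing. It is a minimal example under stated assumptions, not the authors' simulation code: the function name `detection_rates` and the example flags are hypothetical, and "flagged" stands for whatever decision rule (for instance an LRT or a fit-measure cutoff) a given approach applies.

```python
from typing import Sequence, Tuple


def detection_rates(
    truly_noninvariant: Sequence[bool],
    flagged: Sequence[bool],
) -> Tuple[float, float]:
    """Return (TPR, FPR) for a set of item-level invariance decisions.

    truly_noninvariant[i] is True when item i was generated as non-invariant;
    flagged[i] is True when the testing approach flagged item i.
    Both sequences are hypothetical inputs used only for illustration.
    """
    if len(truly_noninvariant) != len(flagged):
        raise ValueError("Both sequences must have the same length.")

    pairs = list(zip(truly_noninvariant, flagged))
    true_positives = sum(1 for truth, flag in pairs if truth and flag)
    false_positives = sum(1 for truth, flag in pairs if not truth and flag)
    positives = sum(1 for truth in truly_noninvariant if truth)
    negatives = len(pairs) - positives

    # A rate is undefined (NaN) when its denominator is zero.
    tpr = true_positives / positives if positives else float("nan")
    fpr = false_positives / negatives if negatives else float("nan")
    return tpr, fpr


# Hypothetical five-item scale: items 1 and 2 are truly non-invariant.
truth = [True, True, False, False, False]
flags = [True, False, True, False, False]
tpr, fpr = detection_rates(truth, flags)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

In a simulation study, rates like these would typically be averaged over many replications per condition.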


2017 ◽  
Vol 19 (1) ◽  
pp. 91-102 ◽  
Author(s):  
Jacob Kean ◽  
Erica F. Bisson ◽  
Darrel S. Brodke ◽  
Joshua Biber ◽  
Paul H. Gross

Item response theory (IRT) has its origins in educational measurement and is now commonly applied in health-related measurement of latent traits, such as function and symptoms. This application is due, in large part, to gains in the precision of measurement attributable to item response theory and corresponding decreases in response burden, study costs, and study duration. The purpose of this paper is twofold: to introduce basic concepts of item response theory and to demonstrate this analytic approach in a worked example, a Rasch model (1PL) analysis of the Eating Assessment Tool (EAT-10), a commonly used measure for oropharyngeal dysphagia. The results of the analysis were largely concordant with previous studies of the EAT-10 and illustrate for brain impairment clinicians and researchers how IRT analysis can yield greater precision of measurement.
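For readers unfamiliar with the Rasch (1PL) model mentioned above, a minimal sketch of its dichotomous form follows. It assumes the standard logistic parameterization, in which the probability of endorsing an item depends only on the difference between the person's latent trait level and the item's difficulty. The function names and example values are hypothetical and are not taken from the EAT-10 analysis.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of endorsing a dichotomous item under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_information(theta: float, difficulty: float) -> float:
    """Fisher information of a Rasch item at trait level theta: P * (1 - P)."""
    p = rasch_probability(theta, difficulty)
    return p * (1.0 - p)


# Illustrative values only: a person at theta = 0.0 and an item of difficulty 0.0.
print(rasch_probability(0.0, 0.0))  # endorsement probability when theta equals difficulty
print(item_information(0.0, 0.0))   # information peaks where theta equals difficulty
```

Item information is largest where the trait level matches the item difficulty, which is one way to see how IRT lets precision be targeted to a range of the trait.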


2021 ◽  
Author(s):  
Joshua Marmara ◽  
Daniel Zarate ◽  
Jeremy Vassallo ◽  
Rhiannon Patten ◽  
Vasileios Stavropoulos

Background: The Warwick Edinburgh Mental Well-Being Scale (WEMWBS) is a measure of subjective well-being (SWB) and assesses eudemonic and hedonic aspects of well-being. However, differential scoring of the WEMWBS across gender and its precision of measurement have not been examined. The present study assesses the psychometric properties of the WEMWBS using Measurement Invariance (MI) between males and females and Item Response Theory (IRT) analyses. Method: A community sample of 386 adults from the United States of America (USA), United Kingdom, Ireland, Australia, New Zealand, and Canada was assessed online (N = 394, 54.8% men, 43.1% women, Mage = 27.48, SD = 5.57). Results: MI analyses observed invariance across males and females at the configural level and metric level but non-invariance at the scalar level. The graded response model conducted to observe item properties indicated that all items demonstrated sufficient, although variable, discrimination capacity. Conclusions: Gender comparisons based on WEMWBS scores should be interpreted cautiously for the specific items that showed scalar non-invariance, for which similar scores may indicate different severity. The items showed increased reliability for latent levels within ± 2 SD of the mean level of SWB. The WEMWBS may also not perform well for clinically low and high levels of SWB. Including assessments for clinical cases may optimise the use of the WEMWBS.
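To make the graded response model named in this abstract concrete, the sketch below computes category probabilities for one ordinal item. It assumes Samejima's standard formulation, in which each cumulative probability of responding in category k or higher follows a two-parameter logistic curve with a shared discrimination and ordered thresholds. The function name and all parameter values are hypothetical illustrations, not estimates from the WEMWBS study.

```python
import math
from typing import List, Sequence


def grm_category_probabilities(
    theta: float,
    discrimination: float,
    thresholds: Sequence[float],
) -> List[float]:
    """Category probabilities for one ordinal item under the graded response model.

    thresholds must be in increasing order; K - 1 thresholds give K categories.
    """
    # Cumulative probabilities P(X >= k), padded with 1.0 (lowest) and 0.0 (beyond highest).
    cumulative = [1.0]
    for threshold in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - threshold))))
    cumulative.append(0.0)

    # Each category probability is the difference of adjacent cumulative probabilities.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical five-category item with illustrative parameters.
probabilities = grm_category_probabilities(
    theta=0.0, discrimination=1.5, thresholds=[-2.0, -1.0, 1.0, 2.0]
)
print([round(p, 3) for p in probabilities])
```

A larger discrimination value makes the category curves steeper, so the item separates respondents with nearby trait levels more sharply.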


2009 ◽  
Vol 30 (1) ◽  
pp. 5-15 ◽  
Author(s):  
Dennis G. Fryback ◽  
Mari Palta ◽  
Dasha Cherepanov ◽  
Daniel Bolt ◽  
Jee-Seon Kim
