Solving the measurement invariance anchor item problem in item response theory.

2012 ◽  
Vol 97 (5) ◽  
pp. 1016-1031 ◽  
Author(s):  
Adam W. Meade ◽  
Natalie A. Wright


2021 ◽
Vol 9 ◽  
Author(s):  
Ron D. Hays ◽  
David Hubble ◽  
Frank Jenkins ◽  
Alexa Fraser ◽  
Beryl Carew

The National Children's Study (NCS) statistics and item response theory group was tasked with promoting the quality of the study's measures and analyses. This paper provides an overview of six measurement and statistical considerations for the NCS: (1) Conceptual and Measurement Model; (2) Reliability; (3) Validity; (4) Measurement Invariance; (5) Interpretability of Scores; and (6) Burden of Administration. The guidance was based primarily on recommendations of the International Society for Quality of Life Research.
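Of the six considerations, reliability lends itself most directly to a worked example. The sketch below is hypothetical and not drawn from the NCS paper; it computes Cronbach's alpha, a common internal-consistency estimate, on an invented respondents-by-items score matrix.

```python
# Hypothetical sketch (not from the NCS paper): Cronbach's alpha as a
# minimal illustration of the "Reliability" consideration listed above.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents x items score matrix."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Invented data: 100 respondents answering 4 five-point Likert items.
# (Independent random responses, so alpha will be near zero here.)
rng = np.random.default_rng(0)
scores = rng.integers(1, 6, size=(100, 4)).astype(float)
print(f"alpha = {cronbach_alpha(scores):.3f}")
```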


2020 ◽  
Author(s):  
E. Damiano D'Urso ◽  
Kim De Roover ◽  
Jeroen K. Vermunt ◽  
Jesper Tijmstra

In the social sciences, the study of group differences in latent constructs is ubiquitous. These constructs are generally measured by means of scales composed of ordinal items. Comparing such constructs across groups requires that they are measured equivalently or, in technical jargon, that measurement invariance (MI) holds across the groups. This study compared the performance of multiple group categorical confirmatory factor analysis (MG-CCFA) and multiple group item response theory (MG-IRT) in testing MI with ordinal data. A simulation study compared the true positive rate (TPR) and false positive rate (FPR), both at the scale and at the item level, for the two approaches under an invariance and a non-invariance scenario. The results showed that the TPR of the MG-CCFA- and MG-IRT-based approaches depends mostly on scale length: for long scales, the likelihood ratio test (LRT) approach for MG-IRT outperformed the alternatives, while for short scales MG-CCFA was generally preferable. The performance of MG-CCFA's fit measures, such as RMSEA and CFI, also depended largely on scale length, so caution is recommended when using these measures, especially when MI is tested for each item individually. A decision flowchart based on the simulation results is provided to summarize the findings and indicate which approach performed best in each setting.
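As a minimal sketch of the chi-square machinery behind the LRT approach mentioned above: the log-likelihoods and parameter counts below are invented, and the two multiple-group models are assumed to have been fit elsewhere.

```python
# Minimal sketch of the likelihood ratio test (LRT) logic for invariance
# testing, assuming the models have already been estimated elsewhere.
from scipy.stats import chi2

def invariance_lrt(loglik_free: float, loglik_constrained: float,
                   n_params_free: int, n_params_constrained: int):
    """Compare a freely estimated multiple-group model to one with
    invariance constraints. A significant p-value suggests the
    constraints (i.e. measurement invariance) do not hold."""
    stat = 2.0 * (loglik_free - loglik_constrained)  # LRT statistic
    ddf = n_params_free - n_params_constrained       # df = extra free parameters
    return stat, chi2.sf(stat, ddf)

# Invented log-likelihoods and parameter counts:
stat, p = invariance_lrt(-4021.3, -4035.8,
                         n_params_free=48, n_params_constrained=40)
print(f"LRT = {stat:.2f}, p = {p:.4f}")
```

The same comparison underlies both MG-CCFA- and MG-IRT-based tests: the constrained model equates item parameters across groups, and a significant statistic flags non-invariance.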


2021 ◽  
Author(s):  
Joshua Marmara ◽  
Daniel Zarate ◽  
Jeremy Vassallo ◽  
Rhiannon Patten ◽  
Vasileios Stavropoulos

Background: The Warwick-Edinburgh Mental Well-Being Scale (WEMWBS) is a measure of subjective well-being (SWB) that assesses its eudemonic and hedonic aspects. However, differential scoring of the WEMWBS across gender and its precision of measurement have not been examined. The present study assesses the psychometric properties of the WEMWBS using measurement invariance (MI) analyses between males and females and item response theory (IRT) analyses. Method: A community sample of 386 adults from the United States of America (USA), United Kingdom, Ireland, Australia, New Zealand, and Canada was assessed online (N = 394, 54.8% men, 43.1% women, Mage = 27.48, SD = 5.57). Results: MI analyses supported invariance across males and females at the configural and metric levels but indicated non-invariance at the scalar level. A graded response model fitted to examine item properties indicated that all items demonstrated sufficient, although variable, discrimination capacity. Conclusions: Gender comparisons based on WEMWBS scores should be interpreted cautiously for the items showing scalar non-invariance, since similar scores may indicate different severity of well-being. The items showed increased reliability for latent levels within ± 2 SD of the mean level of SWB. The WEMWBS may therefore not perform well for clinically low and high levels of SWB; including assessments designed for clinical cases may optimise its use.
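As a hypothetical illustration of the graded response model referenced in the Results (the discrimination and threshold values below are invented, not the WEMWBS estimates), a minimal sketch of category probabilities and item information for one polytomous item:

```python
# Hypothetical graded response model (GRM) sketch: category probabilities
# and Fisher information for one five-category item. Parameters are
# illustrative only, not the WEMWBS estimates.
import numpy as np

def grm_item(theta: np.ndarray, a: float, b: np.ndarray):
    """Return category probabilities and item information under the GRM."""
    # Cumulative probabilities P*(X >= k), padded with 1 and 0 at the ends.
    p_star = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    p_star = np.hstack([np.ones((len(theta), 1)), p_star,
                        np.zeros((len(theta), 1))])
    probs = p_star[:, :-1] - p_star[:, 1:]   # per-category probabilities
    d_star = a * p_star * (1.0 - p_star)     # derivative of each cumulative P*
    info = ((d_star[:, :-1] - d_star[:, 1:]) ** 2 / probs).sum(axis=1)
    return probs, info

theta = np.linspace(-4, 4, 9)                # latent well-being levels
probs, info = grm_item(theta, a=1.8, b=np.array([-1.5, -0.5, 0.5, 1.5]))
print(info.round(3))  # information peaks between the thresholds
```

Plotting the information against theta shows where measurement is most precise; the pattern reported above corresponds to information that is high within ± 2 SD of the mean level of SWB but drops at the clinical extremes.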


2021 ◽  
Author(s):  
Emily Lasko ◽  
David Chester

The Taylor Aggression Paradigm (TAP) is a widely used laboratory aggression task, yet item response theory (IRT) analyses of this task are nonexistent. To estimate the TAP's item-level psychometric properties, we combined data from nine laboratory studies that employed the 25-trial version of the task (combined N = 1,856). One-factor and four-factor solutions for the TAP data exhibited evidence of measurement invariance across gender (men versus women) and experimental provocation (negative versus positive social feedback), as well as negligible instances of differential item functioning. As such, the psychometric properties of the TAP were invariant across binary representations of gender and experimental provocation. Further, trials following low and high provocation were the least informative, and those following moderate provocation were the most informative. Scoring approaches to the TAP may therefore benefit from giving greater weight to trials following moderate provocation. Overall, we find great utility in applying IRT approaches to behavioral laboratory tasks.
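A hypothetical sketch of such information-weighted scoring, using a two-parameter logistic (2PL) model with invented trial parameters (not the estimates from the combined dataset):

```python
# Hypothetical sketch: weight TAP trials by their Fisher information under
# a 2PL model. All parameter values below are invented for illustration.
import numpy as np

def info_2pl(theta: float, a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Fisher information of each 2PL item (trial) at ability level theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))  # probability of aggressive response
    return a ** 2 * p * (1.0 - p)

# Invented discrimination (a) and difficulty (b) parameters for 25 trials.
rng = np.random.default_rng(1)
a = rng.uniform(0.5, 2.5, size=25)
b = rng.normal(0.0, 1.0, size=25)

weights = info_2pl(theta=0.0, a=a, b=b)
weights /= weights.sum()                     # normalize weights to sum to 1
responses = rng.integers(0, 2, size=25)      # hypothetical 0/1 trial responses
weighted_score = float(weights @ responses)  # information-weighted TAP score
print(f"weighted score = {weighted_score:.3f}")
```

Trials carrying more information at the latent level of interest contribute more to the score, which is the weighting logic the abstract suggests for moderate-provocation trials.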

