Taking an Item-Level Approach to Measuring Change With the Force and Motion Conceptual Evaluation: An Application of Item Response Theory

2013 ◽  
Vol 113 (7) ◽  
pp. 356-365 ◽  
Author(s):  
Robert M. Talbot


2020 ◽  
Vol 35 (7) ◽  
pp. 1094-1108
Author(s):  
Morgan E Nitta ◽  
Brooke E Magnus ◽  
Paul S Marshall ◽  
James B Hoelzle

Abstract There are many challenges associated with the assessment and diagnosis of ADHD in adulthood. Utilizing the graded response model (GRM) from item response theory (IRT), a comprehensive item-level analysis of adult ADHD rating scales in a clinical population was conducted with Barkley's Adult ADHD Rating Scale-IV, Self-Report of Current Symptoms (CSS), a self-report diagnostic checklist, and a similar self-report measure quantifying retrospective report of childhood symptoms, Barkley's Adult ADHD Rating Scale-IV, Self-Report of Childhood Symptoms (BAARS-C). Differences in item functioning were also considered after identifying and excluding individuals with suspect effort. Items associated with symptoms of inattention (IA) and hyperactivity/impulsivity (H/I) are endorsed differently across the lifespan, and these data suggest that they vary in their relationship to the theoretical constructs of IA and H/I. Screening for sufficient effort did not meaningfully change item-level functioning. The application of IRT to direct item-to-symptom measures allows for a unique psychometric assessment of how the current DSM-5 symptoms represent the latent traits of IA and H/I. Meeting a symptom threshold of five or more symptoms may be misleading. Closer attention to specific symptoms, in the context of the clinical interview and reported difficulties across domains, may lead to more informed diagnosis.
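As background on the model named in this abstract: under Samejima's graded response model, the probability of endorsing a given response category is the difference between adjacent cumulative "category k or higher" curves. A minimal sketch follows; the item parameters are illustrative placeholders, not values from the study:

```python
import numpy as np

def grm_probs(theta, a, b):
    """Category response probabilities under Samejima's graded response model.

    theta : latent trait value (scalar)
    a     : item discrimination
    b     : ordered category thresholds (length m, giving m + 1 categories)
    """
    b = np.asarray(b, dtype=float)
    # Cumulative curves P*(X >= k) for k = 1..m
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Boundary conventions: P*(X >= 0) = 1 and P*(X >= m + 1) = 0
    upper = np.concatenate(([1.0], p_star))
    lower = np.concatenate((p_star, [0.0]))
    return upper - lower  # P(X = k) for k = 0..m

# Example: a hypothetical 4-category symptom item ("never" .. "very often")
probs = grm_probs(theta=0.5, a=1.8, b=[-1.0, 0.2, 1.4])
print(probs, probs.sum())  # four category probabilities summing to 1
```

The difference-of-cumulatives construction guarantees the category probabilities are nonnegative and sum to one whenever the thresholds are ordered.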


2020 ◽  
Author(s):  
E. Damiano D'Urso ◽  
Kim De Roover ◽  
Jeroen K. Vermunt ◽  
Jesper Tijmstra

In social sciences, the study of group differences concerning latent constructs is ubiquitous. These constructs are generally measured by means of scales composed of ordinal items. In order to compare these constructs across groups, one crucial requirement is that they are measured equivalently or, in technical jargon, that measurement invariance (MI) holds across the groups. This study compared the performance of multiple group categorical confirmatory factor analysis (MG-CCFA) and multiple group item response theory (MG-IRT) in testing measurement invariance with ordinal data. A simulation study was conducted to compare the true positive rate (TPR) and false positive rate (FPR), both at the scale and at the item level, for these two approaches under an invariance and a non-invariance scenario. The results of the simulation studies showed that the performance, in terms of the TPR, of MG-CCFA- and MG-IRT-based approaches mostly depends on the scale length. In fact, for long scales, the likelihood ratio test (LRT) approach for MG-IRT outperformed the other approaches, while, for short scales, MG-CCFA seemed to be generally preferable. In addition, the performance of MG-CCFA's fit measures, such as RMSEA and CFI, seemed to depend largely on the length of the scale, especially when MI was tested at the item level. General caution is recommended when using these measures, especially when MI is tested for each item individually. A decision flowchart, based on the results of the simulation studies, is provided to summarize the results and indicate which approach performed best in which setting.
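The likelihood ratio test referenced above is, in essence, a chi-square difference test between a constrained (invariant) model and a freely estimated one. A minimal sketch, using made-up log-likelihood values rather than anything from the simulation study:

```python
from scipy.stats import chi2

def likelihood_ratio_test(loglik_constrained, loglik_free, df_diff):
    """Chi-square difference (likelihood ratio) test comparing a constrained
    (invariance) model against a freely estimated model.

    LR = -2 * (logL_constrained - logL_free), referred to a chi-square
    distribution with df equal to the number of constrained parameters.
    """
    lr = -2.0 * (loglik_constrained - loglik_free)
    p_value = chi2.sf(lr, df_diff)
    return lr, p_value

# Hypothetical values: constraining three loadings to be equal across groups
# barely worsens fit, so invariance would not be rejected here.
lr, p = likelihood_ratio_test(-1052.4, -1050.1, df_diff=3)
print(round(lr, 2), round(p, 3))
```

In practice the two log-likelihoods would come from fitting the nested MG-IRT (or MG-CCFA) models with and without the equality constraints.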


2020 ◽  
Vol 18 (2) ◽  
pp. 2-43
Author(s):  
William R. Dardick ◽  
Brandi A. Weiss

New variants of entropy as measures of item fit in item response theory are investigated. Monte Carlo simulations examine aberrant conditions of item-level misfit to evaluate both relative performance (comparing EMRj, X2, G2, S-X2, and PV-Q1) and absolute performance (Type I error and empirical power). EMRj has utility in discovering misfit.


2018 ◽  
Vol 42 (8) ◽  
pp. 644-659
Author(s):  
Xue Zhang ◽  
Chun Wang ◽  
Jian Tao

Testing item-level fit is important in scale development to guide item revision/deletion. Many item-level fit indices have been proposed in the literature, yet none of them were directly applicable to an important family of models, namely, the higher order item response theory (HO-IRT) models. In this study, chi-square-based fit indices (i.e., Yen's Q1, McKinley and Mill's G2, Orlando and Thissen's S-X2, and S-G2) were extended to HO-IRT models. Their performances were evaluated via simulation studies in terms of false positive rates and correct detection rates. The manipulated factors include test structure (i.e., test length and number of dimensions), sample size, level of correlations among dimensions, and the proportion of misfitting items. For misfitting items, the sources of misfit, including misfitting item response functions and misspecified factor structures, were also manipulated. The results from the simulation studies demonstrate that S-G2 is promising for higher order items.
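As a sketch of the family of indices being extended: Yen's Q1 bins examinees by ability and compares observed and model-expected proportions correct within each bin. Below is a simplified dichotomous-item version with a hypothetical 2PL item (not the HO-IRT extension studied here):

```python
import numpy as np

def q1_item_fit(theta, responses, icc, n_groups=10):
    """Yen's Q1-style chi-square item fit for a dichotomous item.

    Examinees are sorted into ability groups; within each group, the observed
    proportion correct is compared with the model-expected proportion under
    the fitted item characteristic curve (icc).
    """
    order = np.argsort(theta)
    theta = np.asarray(theta)[order]
    responses = np.asarray(responses)[order]
    groups = np.array_split(np.arange(len(theta)), n_groups)
    q1 = 0.0
    for g in groups:
        n_g = len(g)
        obs = responses[g].mean()       # observed proportion correct
        exp = icc(theta[g]).mean()      # expected proportion under the model
        q1 += n_g * (obs - exp) ** 2 / (exp * (1 - exp))
    return q1  # compared with chi-square, df = n_groups - (item parameters)

# Hypothetical 2PL item; data are generated from the model itself,
# so Q1 should stay near its null chi-square expectation.
rng = np.random.default_rng(0)
theta = rng.normal(size=2000)
icc = lambda t: 1.0 / (1.0 + np.exp(-1.2 * (t - 0.3)))
responses = (rng.random(2000) < icc(theta)).astype(int)
print(q1_item_fit(theta, responses, icc))
```

S-X2 and S-G2 differ mainly in grouping examinees by observed summed score rather than by estimated ability, which is what makes them attractive when ability estimates are unstable.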


Rheumatology ◽  
2019 ◽  
Vol 59 (6) ◽  
pp. 1398-1406 ◽  
Author(s):  
Marie Corneloup ◽  
François Maurier ◽  
Denis Wahl ◽  
Geraldine Muller ◽  
Olivier Aumaitre ◽  
...  

Abstract Objective To explore, at an item level, the effect of disease activity (DA) on specific health-related quality of life (HRQoL) in SLE patients using an item response theory longitudinal model. Methods This prospective longitudinal multicentre French cohort, EQUAL, followed SLE patients over 2 years. Specific HRQoL according to LupusQoL and SLEQOL was collected every 3 months. DA according to the SELENA-SLEDAI flare index (SFI) and the revised SELENA-SLEDAI flare index (SFI-R) was evaluated every 6 months. For DA according to the SFI and each SFI-R type of flare, the specific HRQoL of remitting patients was compared with that of non-flaring patients by fitting a linear logistic model with relaxed assumptions for each domain of the questionnaires. Results Between December 2011 and July 2015, 336 patients were included (89.9% female). LupusQoL and SLEQOL items related to physical HRQoL (physical health, physical functioning, pain) were most affected by musculoskeletal and cutaneous flares. Cutaneous flares had a significant influence on self-image. Neurological or psychiatric flares had a more severe impact on specific HRQoL. Patient HRQoL was impacted up to 18 months after a flare. Conclusion Item response theory analysis is able to pinpoint the items that are influenced, for a given patient group, in terms of a latent trait change. Item-level analysis provides a new way of interpreting HRQoL variation in SLE patients, permitting a better understanding of the impact of DA on HRQoL. This kind of analysis could be easily implemented for the comparison of groups in a clinical trial. Trial registration ClinicalTrials.gov, http://clinicaltrials.gov, NCT01904812.


Author(s):  
E. Damiano D’Urso ◽  
Kim De Roover ◽  
Jeroen K. Vermunt ◽  
Jesper Tijmstra

Abstract In social sciences, the study of group differences concerning latent constructs is ubiquitous. These constructs are generally measured by means of scales composed of ordinal items. In order to compare these constructs across groups, one crucial requirement is that they are measured equivalently or, in technical jargon, that measurement invariance (MI) holds across the groups. This study compared the performance of scale- and item-level approaches based on multiple group categorical confirmatory factor analysis (MG-CCFA) and multiple group item response theory (MG-IRT) in testing MI with ordinal data. In general, the results of the simulation studies showed that MG-CCFA-based approaches outperformed MG-IRT-based approaches when testing MI at the scale level, whereas, at the item level, the best performing approach depends on the tested parameter (i.e., loadings or thresholds). That is, when testing loadings equivalence, the likelihood ratio test provided the best trade-off between true-positive rate and false-positive rate, whereas, when testing thresholds equivalence, the χ2 test outperformed the other testing strategies. In addition, the performance of MG-CCFA's fit measures, such as RMSEA and CFI, seemed to depend largely on the length of the scale, especially when MI was tested at the item level. General caution is recommended when using these measures, especially when MI is tested for each item individually.

