Efficient and precise Ultra-QuickDASH scale measuring lymphedema impact developed using computerized adaptive testing

Author(s):  
Cai Xu ◽  
Mark V. Schaverien ◽  
Joani M. Christensen ◽  
Chris J. Sidey-Gibbons

Abstract

Purpose: This study aimed to evaluate and improve the accuracy and efficiency of the QuickDASH for assessing limb function in patients with upper extremity lymphedema, using modern psychometric techniques.

Method: We conducted confirmatory factor analysis (CFA) and Mokken analysis to examine the unidimensionality assumption of the IRT model on data from 285 patients who completed the QuickDASH. We then fit Samejima’s graded response model (GRM), assessed the assumption of local independence among items, and calibrated the item responses for a computerized adaptive testing (CAT) simulation.

Results: Initial CFA and Mokken analyses demonstrated good scalability of items and unidimensionality. However, the local independence assumption was violated between items 9 (severity of pain) and 11 (sleeping difficulty due to pain) (Yen’s Q3 = 0.46), and disordered thresholds were evident for item 5 (cutting food). After addressing these breaches of assumptions, the re-analyzed GRM with the remaining 10 items achieved an improved fit. A simulated CAT administration demonstrated a high correlation between CAT scores and full QuickDASH scores (r = 0.98). Items 2 (doing heavy chores) and 8 (limiting work or daily activities) were the most frequently administered. The correlation between factor scores derived from the 11-item QuickDASH and the Ultra-QuickDASH comprising only items 2 and 8 was as high as 0.91.

Conclusion: By administering just these two best-performing QuickDASH items, we can obtain estimates very similar to those from the full-length QuickDASH without the need for CAT technology.
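The two techniques named in the abstract can be sketched concretely. The Python snippet below is not from the article; it is a minimal illustration with hypothetical item parameters, showing how Samejima's GRM assigns probabilities to ordered response categories and how a CAT simulation typically selects the next item by maximum Fisher information at the current trait estimate:

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Samejima's graded response model: probabilities of each ordered
    response category given latent trait theta, discrimination a, and
    ordered thresholds b_1 < ... < b_m."""
    # Cumulative ("star") curves P(response >= k), bounded by 1 and 0.
    star = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - b)))
                    for b in thresholds] + [0.0]
    # Each category probability is the difference of adjacent cumulative curves.
    return [star[k] - star[k + 1] for k in range(len(thresholds) + 1)]

def grm_item_information(theta, a, thresholds):
    """Fisher information contributed by one GRM item at theta."""
    star = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - b)))
                    for b in thresholds] + [0.0]
    info = 0.0
    for k in range(len(thresholds) + 1):
        p = star[k] - star[k + 1]
        # Derivative of the category probability with respect to theta.
        dp = a * (star[k] * (1 - star[k]) - star[k + 1] * (1 - star[k + 1]))
        if p > 1e-12:
            info += dp * dp / p
    return info

def select_next_item(theta, bank, administered):
    """Maximum-information CAT selection: among items not yet administered,
    pick the one most informative at the current theta estimate."""
    candidates = [name for name in bank if name not in administered]
    return max(candidates,
               key=lambda name: grm_item_information(theta, *bank[name]))

# Hypothetical item bank: name -> (discrimination, thresholds).
bank = {"item2": (2.2, [-0.5, 0.5, 1.5]),
        "item8": (2.0, [-0.4, 0.6, 1.6]),
        "item5": (0.8, [-0.5, 0.5, 1.5])}
probs = grm_category_probs(0.0, 2.2, [-0.5, 0.5, 1.5])  # sums to 1
first = select_next_item(0.0, bank, administered=set())
```

With these made-up parameters the highly discriminating "item2" is chosen first, mirroring the article's finding that the most discriminating items (2 and 8) dominated the simulated CAT administrations.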

2019 ◽  
Vol 12 (1) ◽  
pp. 49-56
Author(s):  
Sofía E. Auné ◽  
Facundo J. P. Abal ◽  
Horacio F. Attorresi

We present the results of applying the Graded Response Model (GRM) to the items of the Scale of Empathic Behavior developed by Auné, Abal, and Attorresi (2017). The sample was a convenience sample of Argentine university students (80% women). None of the items showed uniform or non-uniform gender differential item functioning. The GRM assumptions of local independence and unidimensionality were successfully corroborated. Threshold parameters tended to be located at low levels of the trait scale, whereas discrimination parameters were high. Analysis of the information function showed acceptable precision at low and middle levels of empathic behavior.


2017 ◽  
Vol 78 (3) ◽  
pp. 384-408 ◽  
Author(s):  
Yong Luo ◽  
Hong Jiao

Stan is a new Bayesian statistical software program that implements the powerful and efficient Hamiltonian Monte Carlo (HMC) algorithm. To date, no source has systematically provided Stan code for the various item response theory (IRT) models. This article provides Stan code for three representative IRT models: the three-parameter logistic model, the graded response model, and the nominal response model. We demonstrate how IRT model comparison can be conducted with Stan and how the provided Stan code for simple IRT models can be easily extended to their multidimensional and multilevel cases.
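For reference, the item response function of the first model the article covers, the three-parameter logistic (3PL), is compact enough to state directly. The following Python sketch is ours, not the article's Stan code, and the parameter values are made up for illustration:

```python
import math

def p_3pl(theta, a, b, c):
    """Three-parameter logistic (3PL) IRT model: probability of a correct
    response given ability theta, discrimination a, difficulty b, and
    pseudo-guessing lower asymptote c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta == b the curve passes through the midpoint between c and 1.
p_mid = p_3pl(0.0, 1.2, 0.0, 0.2)   # 0.2 + 0.8 * 0.5 = 0.6
# Far below the difficulty, the probability approaches the guessing floor c.
p_low = p_3pl(-10.0, 1.2, 0.0, 0.2)
```

In a Bayesian treatment such as Stan's, this same function appears in the likelihood, with priors placed on a, b, and c for each item.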


Psychometrika ◽  
2021 ◽  
Author(s):  
Steven P. Reise ◽  
Han Du ◽  
Emily F. Wong ◽  
Anne S. Hubbard ◽  
Mark G. Haviland

Abstract

Item response theory (IRT) model applications extend well beyond cognitive ability testing, and various patient-reported outcome (PRO) measures are among the more prominent examples. PRO (and similar) constructs differ from cognitive ability constructs in many ways, and these differences have model-fitting implications. With a few notable exceptions, however, most IRT applications to PRO constructs rely on traditional IRT models, such as the graded response model. We review some notable differences between cognitive and PRO constructs and how these differences can present challenges for traditional IRT model applications. We then apply two models (the traditional graded response model and an alternative log-logistic model) to depression measure data drawn from the Patient-Reported Outcomes Measurement Information System project. We do not claim that one model is “a better fit” or more “valid” than the other; rather, we show that the log-logistic model may be more consistent with the construct of depression as a unipolar phenomenon. Clearly, the graded response and log-logistic models can lead to different conclusions about the psychometrics of an instrument and the scaling of individual differences. We underscore, too, that explorations of which model is more appropriate cannot, in general, be decided by fit index comparisons alone; these decisions may require the integration of psychometrics with theory and research findings on the construct of interest.


2009 ◽  
Vol 40 (11) ◽  
pp. 1212-1220 ◽  
Author(s):  
Zhao-Sheng LUO ◽  
Xue-Lian OUYANG ◽  
Shu-Qing QI ◽  
Hai-Qi DAI ◽  
Shu-Liang DING
