Power and Sample Size Calculations in Clinical Trials with Patient-Reported Outcomes under Equal and Unequal Group Sizes Based on Graded Response Model: A Simulation Study

2016 ◽  
Vol 19 (5) ◽  
pp. 639-647 ◽  
Author(s):  
Marziyeh Doostfatemeh ◽  
Seyyed Mohammad Taghi Ayatollahi ◽  
Peyman Jafari
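The title describes a simulation-based power analysis under the graded response model (GRM) with equal and unequal group sizes. As a rough illustration of what such a simulation involves (a minimal sketch only; the item parameters, sample sizes, effect size, and the summed-score t-test criterion are assumptions, not details from the study):

```python
# Hypothetical sketch of a GRM-based power simulation for a two-group comparison
# with unequal group sizes.  Item parameters, sample sizes, effect size, and the
# summed-score t-test criterion are illustrative assumptions, not study details.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2016)

def simulate_grm(theta, a, b):
    """Draw polytomous responses under the graded response model.
    theta: (n,) latent traits; a: (m,) discriminations; b: (m, K-1) ordered thresholds."""
    cum = 1.0 / (1.0 + np.exp(-a[None, :, None] * (theta[:, None, None] - b[None, :, :])))
    ones = np.ones(cum.shape[:2] + (1,))
    probs = np.concatenate([ones, cum], axis=2) - np.concatenate([cum, np.zeros_like(ones)], axis=2)
    u = rng.random(probs.shape[:2] + (1,))
    return (u > probs.cumsum(axis=2)).sum(axis=2)   # responses coded 0 .. K-1

def power(n1, n2, delta, a, b, n_rep=500, alpha=0.05):
    """Proportion of replications in which a t-test on summed scores detects
    a latent mean difference of `delta` between the two groups."""
    hits = 0
    for _ in range(n_rep):
        x1 = simulate_grm(rng.normal(0.0, 1.0, n1), a, b).sum(axis=1)
        x2 = simulate_grm(rng.normal(delta, 1.0, n2), a, b).sum(axis=1)
        hits += stats.ttest_ind(x1, x2).pvalue < alpha
    return hits / n_rep

a = rng.uniform(1.0, 2.5, 10)                        # 10 five-category items
b = np.sort(rng.normal(0.0, 1.0, (10, 4)), axis=1)   # ordered thresholds per item
print(power(n1=80, n2=40, delta=0.5, a=a, b=b))      # unequal group sizes
```

Sweeping the group sizes and the latent mean difference over a grid and repeating this loop is the basic recipe behind simulation-based sample size tables.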
Psychometrika ◽  
2021 ◽  
Author(s):  
Steven P. Reise ◽  
Han Du ◽  
Emily F. Wong ◽  
Anne S. Hubbard ◽  
Mark G. Haviland

Item response theory (IRT) model applications extend well beyond cognitive ability testing, and various patient-reported outcomes (PRO) measures are among the more prominent examples. PRO (and like) constructs differ from cognitive ability constructs in many ways, and these differences have model fitting implications. With a few notable exceptions, however, most IRT applications to PRO constructs rely on traditional IRT models, such as the graded response model. We review some notable differences between cognitive and PRO constructs and how these differences can present challenges for traditional IRT model applications. We then apply two models (the traditional graded response model and an alternative log-logistic model) to depression measure data drawn from the Patient-Reported Outcomes Measurement Information System project. We do not claim that one model is “a better fit” or more “valid” than the other; rather, we show that the log-logistic model may be more consistent with the construct of depression as a unipolar phenomenon. Clearly, the graded response and log-logistic models can lead to different conclusions about the psychometrics of an instrument and the scaling of individual differences. We underscore, too, that, in general, explorations of which model may be more appropriate cannot be decided only by fit index comparisons; these decisions may require the integration of psychometrics with theory and research findings on the construct of interest.
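For orientation, the graded response model places the latent trait on the whole real line, whereas unipolar models such as the log-logistic restrict it to nonnegative values, which is what makes the latter a natural candidate for a construct conceived as present versus absent. The expressions below are a sketch: the GRM formula is standard, while the log-logistic formula is written in a generic unipolar form and may not match the exact parameterization used in the article.

```latex
% Graded response model (GRM): theta ranges over the whole real line.
% P(X_i >= k | theta) is the cumulative probability of category k or higher,
% with discrimination a_i and ordered thresholds b_{i1} < ... < b_{i,K-1}.
\[
P(X_i \ge k \mid \theta) = \frac{1}{1 + \exp\{-a_i(\theta - b_{ik})\}},
\qquad
P(X_i = k \mid \theta) = P(X_i \ge k \mid \theta) - P(X_i \ge k+1 \mid \theta).
\]

% Generic unipolar log-logistic form for a dichotomous item (assumed
% parameterization, for illustration only): theta is restricted to [0, infinity),
% so the low end of the scale represents absence of the trait rather than one
% pole of a bipolar continuum.
\[
P(X_i = 1 \mid \theta) = \frac{\lambda_i\,\theta^{\eta_i}}{1 + \lambda_i\,\theta^{\eta_i}},
\qquad \theta \ge 0.
\]
```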


2020 ◽  
pp. 001316442095806
Author(s):  
Shiyang Su ◽  
Chun Wang ◽  
David J. Weiss

[Formula: see text] is a popular item fit index that is available in commercial software packages such as flexMIRT. However, no research has systematically examined the performance of [Formula: see text] for detecting item misfit within the context of the multidimensional graded response model (MGRM). The primary goal of this study was to evaluate the performance of [Formula: see text] under two practical misfit scenarios: first, all items are misfitting due to model misspecification, and second, a small subset of items violate the underlying assumptions of the MGRM. Simulation studies showed that caution should be exercised when reporting item fit results of polytomous items using [Formula: see text] within the context of the MGRM, because of its inflated false positive rates (FPRs), especially with a small sample size and a long test. [Formula: see text] performed well when detecting overall model misfit as well as item misfit for a small subset of items when the ordinality assumption was violated. However, under a number of conditions of model misspecification or items violating the homogeneous discrimination assumption, even though true positive rates (TPRs) of [Formula: see text] were high when a small sample size was coupled with a long test, the inflated FPRs were generally directly related to increasing TPRs. There was also a suggestion that performance of [Formula: see text] was affected by the magnitude of misfit within an item. There was no evidence that FPRs for fitting items were exacerbated by the presence of a small percentage of misfitting items among them.
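For context, the two rates reported in such simulations are simple flagging proportions: the FPR is the share of truly fitting items rejected by the fit statistic, and the TPR is the share of truly misfitting items rejected. A generic tallying sketch (not the authors' code; the rejection pattern below is made up):

```python
# Generic sketch of how false/true positive rates are tallied in an item-fit
# simulation: `flags` marks items the fit statistic rejected at the chosen alpha,
# `misfit` marks the items that were generated to violate the model.
import numpy as np

def fit_rates(flags: np.ndarray, misfit: np.ndarray):
    """flags, misfit: boolean arrays of shape (n_replications, n_items)."""
    fpr = flags[:, ~misfit[0]].mean()   # fitting items wrongly rejected
    tpr = flags[:, misfit[0]].mean()    # misfitting items correctly rejected
    return fpr, tpr

# Toy example: 100 replications, 20 items, the last 2 items truly misfit.
rng = np.random.default_rng(0)
misfit = np.tile(np.r_[np.zeros(18, bool), np.ones(2, bool)], (100, 1))
flags = rng.random((100, 20)) < np.where(misfit, 0.7, 0.05)  # hypothetical rejection pattern
print(fit_rates(flags, misfit))
```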


2013 ◽  
Vol 22 (2) ◽  
pp. 252-262 ◽  
Author(s):  
Michelene Chenault ◽  
Martijn Berger ◽  
Bernd Kremer ◽  
Lucien Anteunis

Purpose: To improve the effectiveness of adult hearing screening and intervention, assessment methods are needed that address the individual's experienced hearing. Item response theory, which provides a methodology for assessing patient-reported outcomes, is examined here to demonstrate its usefulness in hearing screening and intervention. Method: The graded response model is applied to a scale of 11 items assessing perceived hearing functioning and 10 items assessing experienced social limitations, completed by a sample of 212 persons aged 55+ years. Fixed- and variable-slope models are compared, discrimination and threshold parameters are estimated, and information functions are evaluated. Results: Variable-slope models provided the best fit for both scales. The estimated discrimination parameters were good if not excellent (1.5–3.4) for all items except one in each scale. Threshold values varied, demonstrating the complementary and supplementary value of items within a scale. The information provided by each item varies across trait values, so each scale as a whole provides information over a wider range of trait values. Conclusion: Item response theory facilitates the comparison of items in terms of their discriminative ability and the information they provide, and thus offers a basis for selecting items for use in a screening setting.
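The information functions referred to in the Results follow directly from the estimated discrimination and threshold parameters. A minimal sketch of item information under Samejima's graded response model (the parameter values are illustrative, not those estimated in the study):

```python
# Minimal sketch: item information under Samejima's graded response model,
# computed from a discrimination a and ordered thresholds b (illustrative values).
import numpy as np

def grm_item_information(theta, a, b):
    """Fisher information of one GRM item at trait values `theta`.
    a: scalar discrimination; b: ordered thresholds (length n_cat - 1)."""
    theta = np.asarray(theta, dtype=float)
    # Cumulative probabilities P*(X >= k), padded with 1 (k = 0) and 0 (k = n_cat)
    cum = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - np.asarray(b)[None, :])))
    cum = np.column_stack([np.ones_like(theta), cum, np.zeros_like(theta)])
    probs = cum[:, :-1] - cum[:, 1:]                      # category probabilities
    dP = a * (cum[:, :-1] * (1 - cum[:, :-1]) - cum[:, 1:] * (1 - cum[:, 1:]))
    return (dP ** 2 / np.clip(probs, 1e-12, None)).sum(axis=1)

theta = np.linspace(-3, 3, 7)
print(grm_item_information(theta, a=2.0, b=[-1.5, -0.5, 0.5, 1.5]))
```

Summing such curves over the items in a scale gives the test information function, whose inverse square root approximates the standard error of the trait estimate at each trait level.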


2009 ◽  
Vol 40 (11) ◽  
pp. 1212-1220 ◽  
Author(s):  
Zhao-Sheng LUO ◽  
Xue-Lian OUYANG ◽  
Shu-Qing QI ◽  
Hai-Qi DAI ◽  
Shu-Liang DING

2013 ◽  
Vol 11 (1) ◽  
pp. 83 ◽  
Author(s):  
Ari Gnanasakthy ◽  
Sandra Lewis ◽  
Marci Clark ◽  
Margaret Mordin ◽  
Carla DeMuro
