graded response model
Recently Published Documents

TOTAL DOCUMENTS: 109 (five years: 32)
H-INDEX: 16 (five years: 3)

Author(s):  
Cai Xu ◽  
Mark V. Schaverien ◽  
Joani M. Christensen ◽  
Chris J. Sidey-Gibbons

Abstract Purpose This study aimed to evaluate and improve the accuracy and efficiency of the QuickDASH for assessing limb function in patients with upper extremity lymphedema using modern psychometric techniques. Method We conducted confirmatory factor analysis (CFA) and Mokken analysis on data from 285 patients who completed the QuickDASH to examine the unidimensionality assumption required by item response theory (IRT) models. We then fit Samejima’s graded response model (GRM), assessed the assumption of local independence of items, and calibrated the item responses for a computerized adaptive testing (CAT) simulation. Results Initial CFA and Mokken analyses demonstrated good scalability of items and unidimensionality. However, the local independence assumption was violated between item 9 (severity of pain) and item 11 (sleeping difficulty due to pain) (Yen’s Q3 = 0.46), and disordered thresholds were evident for item 5 (cutting food). After addressing these breaches of assumptions, the re-analyzed GRM with the remaining 10 items achieved an improved fit. Simulated CAT administration demonstrated a high correlation between CAT scores and full QuickDASH scores (r = 0.98). Items 2 (doing heavy chores) and 8 (limiting work or daily activities) were the most frequently administered. The correlation between factor scores derived from the 11-item QuickDASH and the Ultra-QuickDASH comprising only items 2 and 8 was as high as 0.91. Conclusion By administering just these two best-performing QuickDASH items, we can obtain estimates very similar to those obtained from the full-length QuickDASH without the need for CAT technology.
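Samejima’s graded response model, which this study calibrates, expresses the probability of each ordered response category as a difference of adjacent cumulative logistic boundary curves. A minimal sketch in Python, with illustrative item parameters (not the study’s estimates):

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Samejima graded response model: probabilities of the K ordered
    response categories for one item.
    theta : latent trait value
    a     : item discrimination
    b     : ordered category thresholds, length K-1
    """
    # Boundary curves: P*(k) = probability of responding in category k or above
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b, dtype=float))))
    # Pad with the trivial boundaries P*(>= lowest) = 1 and P*(> highest) = 0
    bounds = np.concatenate(([1.0], p_star, [0.0]))
    # Category probability = difference of adjacent boundary curves
    return bounds[:-1] - bounds[1:]

# Example: a 4-category item evaluated at theta = 0.5 (hypothetical parameters)
probs = grm_category_probs(theta=0.5, a=1.8, b=[-1.0, 0.0, 1.2])
```

Each boundary curve 1/(1 + e^(−a(θ − b_k))) gives the probability of responding in category k or above; subtracting adjacent boundaries yields the category probabilities that GRM calibration and the CAT simulation operate on.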


Psychometrika ◽  
2021 ◽  
Author(s):  
Steven P. Reise ◽  
Han Du ◽  
Emily F. Wong ◽  
Anne S. Hubbard ◽  
Mark G. Haviland

Abstract Item response theory (IRT) model applications extend well beyond cognitive ability testing, and various patient-reported outcome (PRO) measures are among the more prominent examples. PRO (and similar) constructs differ from cognitive ability constructs in many ways, and these differences have model-fitting implications. With a few notable exceptions, however, most IRT applications to PRO constructs rely on traditional IRT models, such as the graded response model. We review some notable differences between cognitive and PRO constructs and how these differences can present challenges for traditional IRT model applications. We then apply two models (the traditional graded response model and an alternative log-logistic model) to depression measure data drawn from the Patient-Reported Outcomes Measurement Information System project. We do not claim that one model is “a better fit” or more “valid” than the other; rather, we show that the log-logistic model may be more consistent with the construct of depression as a unipolar phenomenon. Clearly, the graded response and log-logistic models can lead to different conclusions about the psychometrics of an instrument and the scaling of individual differences. We underscore, too, that explorations of which model is more appropriate cannot, in general, be decided by fit index comparisons alone; these decisions may require integrating psychometrics with theory and research findings on the construct of interest.
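The contrast the authors draw can be illustrated by the support of the two response functions: the logistic boundary curve underlying the graded response model is defined over the whole real line, while a log-logistic CDF is defined only for positive trait values, which fits a unipolar construct such as depression severity. A sketch using the generic log-logistic parameterization (the paper’s exact model specification may differ):

```python
import math

def logistic_irf(theta, a, b):
    """GRM/2PL boundary curve: defined for theta on the whole real line,
    implying a bipolar trait with meaningful low and high poles."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_logistic_irf(theta, alpha, beta):
    """Generic log-logistic CDF: support restricted to theta > 0, matching
    a unipolar construct where 'absence' is a floor rather than a pole.
    alpha > 0 is the scale (median); beta > 0 is the shape."""
    if theta <= 0:
        return 0.0
    return 1.0 / (1.0 + (theta / alpha) ** (-beta))
```

The practical difference: the logistic curve assigns a well-defined response probability at any negative theta, whereas the log-logistic curve pins the probability to zero there, so individual differences are scaled only over the positive half-line.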


2021 ◽  
Vol 2 (1) ◽  
pp. 72-84
Author(s):  
Rahmadhani Mulvia ◽  
Taufik Ramlan Ramalis ◽  
Ridwan Efendi

This study aimed to assess the consistency of test items using the information function from item response theory; consistency of information in measurement is known as reliability. The research method was development research following the ADDIE design (Analyze, Design, Development, Implement, and Evaluation). Participants comprised 5 experts and 280 senior high school students selected by random sampling. The instruments were a validation sheet and a questionnaire on habits of scientific thinking. The validation sheet, used in the development stage, covered 6 aspects spanning construct, content, and language. The questionnaire, used in the implement stage, consisted of 22 test items and was judged by the experts to be fit for trialing. Data from the validation sheet were analyzed using the Many-Facet Rasch Model with the Minifac program from Winstep, and the instrument was declared fit for trial. Data from the implement stage were analyzed using the graded response model with the eirt program. The results show that most of the instrument is suitable for measuring habits of scientific thinking and can measure ability from low to high levels. Therefore, most of the test items can be applied in measuring students’ habits of scientific thinking.
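The item information function invoked here is what links the GRM to reliability: where information is high, measurement is precise. A minimal sketch computing I(θ) = Σ_k P′_k(θ)² / P_k(θ) with a numerical derivative, using hypothetical item parameters:

```python
import numpy as np

def grm_probs(theta, a, b):
    """GRM category probabilities (difference of logistic boundary curves)."""
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b, dtype=float))))
    bounds = np.concatenate(([1.0], p_star, [0.0]))
    return bounds[:-1] - bounds[1:]

def item_information(theta, a, b, eps=1e-5):
    """I(theta) = sum_k P_k'(theta)^2 / P_k(theta); the derivative is taken
    numerically (central difference) to keep the formula transparent."""
    p = grm_probs(theta, a, b)
    dp = (grm_probs(theta + eps, a, b) - grm_probs(theta - eps, a, b)) / (2 * eps)
    return float(np.sum(dp ** 2 / p))

# Information near the center of the trait range vs far out in the tail
# (a and b values are illustrative only)
info_center = item_information(0.0, 1.5, [-1.0, 0.0, 1.0])
info_tail = item_information(5.0, 1.5, [-1.0, 0.0, 1.0])
```

An item with thresholds spread across the trait range, like the one above, measures precisely near those thresholds and poorly far beyond them, which is how the study judges whether items "measure ability from low to high levels."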


2021 ◽  
Vol 9 ◽  
pp. 205031212110122
Author(s):  
Samuel W Terman ◽  
James F Burke

Objectives: Accurately measuring disability is critical for policy development, economic analyses, and determining individual-level effects of health interventions. Nationally representative population surveys such as the National Health and Nutrition Examination Survey (NHANES) provide key opportunities to measure disability constructs such as activity limitations. However, only very limited work has evaluated the item response properties of the questions pertaining to limitations in NHANES. Methods: This was a cross-sectional study. We included participants ⩾20 years old from the 2013–2018 NHANES cycles. Activity limitations, and a smaller number of body function impairments or participation restrictions, were determined from interview questions. We fit item response theory models (a two-parameter logistic model and a graded response model) to characterize discriminating information along the latent continuum of activity limitation. Results: We included 17,057 participants. Although each particular limitation was somewhat rare (at most 13%), 7214 (38%) reported having at least one limitation. We found a high amount of discriminating information at 1–2 standard deviations above average limitation, but essentially zero discrimination below that range. Items overlapped substantially in the range over which they provided information distinguishing individuals. The ordinal graded response model including 20 limitations provided greater information than the dichotomous two-parameter logistic model, and further omitting items from the graded response model led to a loss of information.
Conclusion: NHANES disability-related questions, most specifically activity limitations, provided a high degree of information distinguishing individuals with higher-than-average limitations on the latent continuum, but essentially zero resolution to distinguish individuals with low or average limitations. Future work may focus on developing items that better distinguish individuals at the “lower” end of the limitation spectrum.
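For the dichotomous two-parameter logistic model, item information has the closed form a²P(1 − P), which peaks at θ = b. The pattern the study reports — items whose difficulties cluster 1–2 SD above the mean yield almost no information below that range — can be reproduced with hypothetical parameters:

```python
import math

def p_2pl(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_info_2pl(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P), peaking at theta = b."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Items with difficulties clustered 1-2 SD above the mean (hypothetical values,
# not the NHANES estimates)
items = [(2.0, 1.2), (1.8, 1.5), (2.2, 1.8)]  # (discrimination, difficulty)

# Test information = sum of item informations, evaluated above and below the mean
info_above = sum(item_info_2pl(1.5, a, b) for a, b in items)
info_below = sum(item_info_2pl(-1.0, a, b) for a, b in items)
```

Because every difficulty sits well above the mean, the test information surface collapses for respondents at or below average limitation, which is exactly the "zero resolution" problem the conclusion describes.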


2020 ◽  
Vol 9 (11) ◽  
pp. 3754
Author(s):  
Yoshiaki Nomura ◽  
Toshiya Morozumi ◽  
Mitsuo Fukuda ◽  
Nobuhiro Hanada ◽  
Erika Kakuta ◽  
...  

Periodontal examination data have a complex structure. For epidemiological studies, mass screenings, and public health use, a simple index that represents the periodontal condition is necessary. Periodontal indices based on partial examination of selected teeth have been developed. However, the selected teeth vary between indices, and a justification for the selection of examination teeth has not been presented. We applied a graded response model based on item response theory to select optimal examination teeth and sites that represent periodontal conditions. Data were obtained from 254 patients who participated in a multicenter follow-up study. Baseline data were obtained from the initial follow-up. Optimal examination sites were selected using item information calculated by graded response modeling. Twelve sites—maxillary 2nd premolar (palatal-medial), 1st premolar (palatal-distal), canine (palatal-medial), lateral incisor (palatal-central), central incisor (palatal-distal), and mandibular 1st premolar (lingual-medial)—were selected. Mean values for clinical attachment level, probing pocket depth, and bleeding on probing from full-mouth examinations were used as objective variables. Measuring the clinical parameters of these sites can predict the results of a full-mouth examination. For calculating a periodontal index by partial oral examination, a justification for the selection of examination sites is essential. This study presents an evidence-based partial examination methodology and its modeling.
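Site selection of this kind reduces to ranking candidate sites by the item information they contribute at the trait levels of interest and keeping the top few. A toy sketch with hypothetical information values (the numbers below are illustrative, not the study’s estimates):

```python
# Hypothetical item-information values for candidate examination sites at a
# target trait level (illustrative numbers only, not the study's estimates)
site_info = {
    "maxillary 2nd premolar (palatal-medial)": 1.9,
    "maxillary 1st premolar (palatal-distal)": 1.6,
    "maxillary canine (palatal-medial)": 1.4,
    "maxillary lateral incisor (palatal-central)": 0.7,
    "maxillary central incisor (palatal-distal)": 0.5,
    "mandibular 1st premolar (lingual-medial)": 0.3,
}

# Keep the sites carrying the most information about periodontal condition
top_sites = sorted(site_info, key=site_info.get, reverse=True)[:3]
```

The appeal of this criterion is that it makes the choice of partial-examination sites auditable: each retained site is justified by how much it narrows uncertainty about the latent periodontal condition, rather than by convention.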


2020 ◽  
Vol 18 (1) ◽  
Author(s):  
Fengling Yang ◽  
Fei Zhao ◽  
Yun Zheng ◽  
Gang Li

Abstract Background Early prelingual auditory development (EPLAD) is a fundamental and important process in the speech and language development of infants and toddlers. The Infant–Toddler Meaningful Auditory Integration Scale (ITMAIS) is a widely used measurement tool for EPLAD; however, it has not yet undergone a comprehensive psychometric analysis. The aim of this research was to modify and verify the psychometric properties of the ITMAIS using a combination of item response theory (IRT) and classical test theory (CTT). Methods Stage 1: 1730 children were retrospectively recruited to enable the application of an IRT model, specifically the graded response model, to modify the ITMAIS. Stage 2: another 450 infants and toddlers with normal hearing or permanent hearing loss before auditory intervention were recruited to verify the psychometric properties of the modified ITMAIS (ITMAIS-m) using the CTT method. Results Using the metric of the graded response model, after removing item 2 from the ITMAIS, the ITMAIS-m demonstrated discrimination parameters ranging from 3.947 to 5.431, difficulty parameters from −1.146 to 1.150, item information distributed between 4.798 and 9.259, and a test information score of 48.061. None of the items showed differential item functioning. The ITMAIS-m was further verified in Stage 2, showing a Cronbach’s α of 0.919 and item-total correlations ranging from 0.693 to 0.851. There was good convergent validity of the ITMAIS-m with another auditory outcome measure (r = 0.932) and pure tone average thresholds (r ranging from −0.670 to −0.909), as well as a high ability to discriminate between different hearing grades (Cohen’s d ranging from 0.41 to 5.83). Conclusions The ITMAIS-m is a reliable and valid tool for evaluating EPLAD in infants and toddlers, which can be efficiently and precisely applied in clinical practice. The combined use of IRT and CTT provides a powerful means of producing psychometrically robust scales for childhood auditory outcome measurement.
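On the CTT side of this study, the reported Cronbach’s α has a simple closed form in terms of item and total-score variances: α = (k / (k − 1)) · (1 − Σ var(item_i) / var(total)). A minimal self-contained sketch:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a persons-by-items score matrix.
    alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total score)
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                       # number of items
    item_vars = scores.var(axis=0, ddof=1)    # per-item sample variances
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of the total score
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Degenerate example: two identical items -> perfect internal consistency
alpha = cronbach_alpha([[0, 0], [1, 1], [2, 2], [3, 3]])
```

Used alongside the IRT calibration in Stage 1, this CTT statistic answers a complementary question: not where on the trait continuum the scale is precise, but how consistently its items rank the same respondents.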

