Item Response Theory in Measurement, Assessment, and Evaluation for Higher Education

Author(s):  
Steven J. Osterlind ◽  
Ze Wang
2020 ◽  
Vol 35 (6) ◽  
pp. 781-781

Author(s):  
W Goette ◽  
A Carlew ◽  
J Schaffert ◽  
H Rossetti ◽  
L Lacritz

Abstract

Objective: To characterize three functional living scales under item response theory and to examine these scales for evidence of differential item functioning (DIF) by participant and/or informant ethnicity and education.

Method: Baseline data from 3155 participants [mean age = 70.59 years (SD = 9.55); mean education = 13.3 years (SD = 4.26); 61.72% female] enrolled in the Texas Alzheimer’s Research and Care Consortium with data from the Clinical Dementia Rating Scale (CDR; functional items), Physical Self-Maintenance Scale (PSMS), and Instrumental Activities of Daily Living Scale (IADL) were used. The sample was predominantly white (93.94%), and 35.97% identified as Hispanic. Graded response models fit all three tests best. DIF was examined by iteratively dropping item-by-item constraints and then testing model fit.

Results: The CDR demonstrated overall good item functioning, with clear separation between all of the rating categories for each item; the PSMS and IADL did not, suggesting their item ratings should be reconsidered. DIF was observed by ethnicity (Hispanic vs. non-Hispanic) and education (grouped into low, average, and high) for every item on all three scales (all ps ≤ .01 after adjustment for multiple comparisons). Participants of Hispanic ethnicity and those with higher education were more likely to be rated as more impaired.

Conclusions: Results suggest these three commonly used functional scales show DIF depending on the ethnicity and education of the patient. This finding has implications for understanding functional change in certain populations, particularly the potential for mischaracterizing impairment in minority samples. The finding that individuals with higher education tended to be rated as more functionally impaired warrants further investigation.
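Since the abstract reports that graded response models fit all three scales best, a minimal sketch of how Samejima's graded response model (GRM) assigns category probabilities may help orient readers. The function name and item parameters below are illustrative placeholders, not estimates from the TARCC data.

```python
# A minimal sketch of Samejima's graded response model (GRM).
# Item parameters here are hypothetical, not study estimates.
import numpy as np

def grm_category_probs(theta, a, b):
    """Category probabilities for one polytomous item under the GRM.

    theta : latent trait value(s), shape (n,)
    a     : item discrimination (scalar)
    b     : ordered category thresholds, shape (K-1,) for K categories
    """
    theta = np.atleast_1d(theta)[:, None]            # (n, 1)
    b = np.asarray(b)[None, :]                       # (1, K-1)
    # P*(X >= k | theta): probability of rating in category k or higher
    cum = 1.0 / (1.0 + np.exp(-a * (theta - b)))     # (n, K-1)
    ones = np.ones((theta.shape[0], 1))
    zeros = np.zeros((theta.shape[0], 1))
    upper = np.hstack([ones, cum])                   # P*(X >= 0) = 1
    lower = np.hstack([cum, zeros])                  # P*(X >= K) = 0
    return upper - lower                             # (n, K) category probs

# Example: a 4-category item (e.g., a CDR-style functional rating)
theta = np.linspace(-3, 3, 7)
probs = grm_category_probs(theta, a=1.5, b=[-1.0, 0.2, 1.4])
print(probs.round(3))    # each row sums to 1 across the 4 categories
```

The "clear separation between rating categories" the abstract describes for the CDR corresponds, in this parameterization, to well-spaced thresholds b so that each category is the most probable response somewhere along the trait continuum.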


2020 ◽  
Vol 78 (4) ◽  
pp. 576-594
Author(s):  
Bing Jia ◽  
Dan He ◽  
Zhemin Zhu

The quality of multiple-choice questions (MCQs), as well as students’ response behavior on MCQs, is an educational concern. MCQs can cover a wide range of educational content and can be scored immediately and accurately. However, many studies have found flawed items in this exam format, which can yield misleading insights into students’ performance and affect important decisions. This research sought to determine the characteristics of MCQs and the factors that may affect their quality by using item response theory (IRT) to evaluate data. Four samples of different sizes, drawn from the US and China across secondary and higher education, were analyzed. Item difficulty and discrimination were estimated using IRT statistical item-analysis models. Results were as follows. First, guessing played only a minor role in the MCQ exams, because all datasets fit the two-parameter logistic (2PL) model better than the three-parameter logistic (3PL) model. Second, the quality of MCQs depended more on the degree of training of the examiners than on whether the items were used in secondary or higher education. Lastly, MCQs must be evaluated to ensure that only high-quality items are used as bases of inference in secondary and higher education.

Keywords: higher education, item evaluation, item response theory, multiple-choice test, secondary education
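To make the 2PL-versus-3PL comparison concrete, here is a minimal sketch under stated assumptions: responses to a single item are simulated from a 2PL (i.e., negligible guessing), both models are fit by maximum likelihood, and AIC is compared. The data are simulated, not the US/China samples from the paper, and the fitting routine is a deliberately simple one-item illustration rather than a full IRT calibration.

```python
# A minimal sketch of the 2PL/3PL comparison described in the abstract.
# Data are simulated; parameters and sample size are hypothetical.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def p_2pl(theta, a, b):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def p_3pl(theta, a, b, c):
    return c + (1.0 - c) * p_2pl(theta, a, b)

# Simulate one item with negligible guessing (the true model is 2PL)
n = 2000
theta = rng.normal(size=n)
y = rng.random(n) < p_2pl(theta, a=1.2, b=0.3)

def neg_ll(params, model):
    p = model(theta, *params)
    p = np.clip(p, 1e-9, 1 - 1e-9)   # numerical safety
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

fit2 = minimize(neg_ll, x0=[1.0, 0.0], args=(p_2pl,), method="Nelder-Mead")
fit3 = minimize(neg_ll, x0=[1.0, 0.0, 0.1], args=(p_3pl,), method="Nelder-Mead")

aic2 = 2 * 2 + 2 * fit2.fun    # AIC = 2k + 2 * negative log-likelihood
aic3 = 2 * 3 + 2 * fit3.fun
print(f"2PL AIC = {aic2:.1f}, 3PL AIC = {aic3:.1f}")
# When guessing is negligible, the extra c parameter buys little likelihood,
# so the 2PL wins on AIC -- the pattern the study reports across samples.
```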


2001 ◽  
Vol 46 (6) ◽  
pp. 629-632
Author(s):  
Robert J. Mislevy
