The Impact of Omitted Responses on the Accuracy of Ability Estimation in Item Response Theory

2001 ◽  
Vol 38 (3) ◽  
pp. 213-234 ◽  
Author(s):  
R. J. de Ayala ◽
Barbara S. Plake ◽  
James C. Impara


2009 ◽
Vol 15 (5) ◽  
pp. 758-768 ◽  
Author(s):  
Otto Pedraza ◽
Neill R. Graff-Radford ◽
Glenn E. Smith ◽
Robert J. Ivnik ◽
Floyd B. Willis ◽
...  

Scores on the Boston Naming Test (BNT) are frequently lower for African American adults than for Caucasian adults. Although demographically based norms can mitigate the impact of this discrepancy on the likelihood of erroneous diagnostic impressions, a growing consensus suggests that group norms do not sufficiently address or advance our understanding of the underlying psychometric and sociocultural factors that lead to between-group score discrepancies. Using item response theory and methods to detect differential item functioning (DIF), the current investigation moves beyond comparisons of the summed total score to examine whether the conditional probability of responding correctly to individual BNT items differs between African American and Caucasian adults. Participants included 670 adults age 52 and older who took part in Mayo’s Older Americans and Older African Americans Normative Studies. Under a two-parameter logistic item response theory framework, and after correction for the false discovery rate, 12 items were shown to demonstrate DIF. Of these 12 items, 6 (“dominoes,” “escalator,” “muzzle,” “latch,” “tripod,” and “palette”) were also identified in additional analyses using hierarchical logistic regression models and represent the strongest evidence for race/ethnicity-based DIF. These findings afford a finer characterization of the psychometric properties of the BNT and expand our understanding of between-group performance. (JINS, 2009, 15, 758–768.)
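To make the machinery concrete, the following is a minimal sketch, in Python, of the two-parameter logistic (2PL) model and a logistic-regression check for uniform DIF of the kind the hierarchical models above formalize. It is not the study's actual pipeline: the simulated data, the 0.5-logit difficulty shift, and all parameter values are invented for illustration.

import numpy as np
import statsmodels.api as sm

def p_correct_2pl(theta, a, b):
    # 2PL item characteristic curve: P(X = 1 | theta) with
    # discrimination a and difficulty b.
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

rng = np.random.default_rng(0)
n = 670                             # sample size borrowed from the abstract
theta = rng.normal(size=n)          # latent naming ability
group = rng.integers(0, 2, size=n)  # hypothetical 0/1 group indicator
# Build uniform DIF into one item: a 0.5-logit harder difficulty for group 1.
item = rng.binomial(1, p_correct_2pl(theta, a=1.2, b=0.3 + 0.5 * group))

# Uniform DIF check: does group predict the item response after conditioning
# on ability? (Applied work conditions on an observed rest or total score
# rather than the latent theta used here for simplicity.)
X = sm.add_constant(np.column_stack([theta, group]))
fit = sm.Logit(item, X).fit(disp=False)
print(fit.params)       # a sizeable group coefficient flags uniform DIF
print(fit.pvalues)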


2020 ◽  
Vol 64 (3) ◽  
pp. 219-237
Author(s):  
Brandon LeBeau ◽  
Susan G. Assouline ◽  
Duhita Mahatmya ◽  
Ann Lupkowski-Shoplik

This study investigated the application of item response theory (IRT) to expand the range of ability estimates for gifted (hereinafter referred to as high-achieving) students’ performance on an above-level test. Using a sample of fourth- to sixth-grade high-achieving students (N = 1,893), we conducted a study to compare estimates from two measurement theories, classical test theory (CTT) and IRT. CTT and IRT make different assumptions about the analysis, and these assumptions affect the reliability and validity of the scores obtained from the test. IRT can also differentiate students across grades, or within a grade, by using the student’s unique pattern of correct and incorrect answers on the test. This differentiation may have implications for identifying or classifying students who are ready for advanced coursework. The differentiation on the Math, Reading, and Science tests, and the impact the two measurement frameworks can have on the classification of students, are explored. Implications for academic talent identification with the talent search model and for the development of academic talent are discussed.
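As a concrete illustration of the scoring difference the authors exploit, here is a minimal sketch, assuming hypothetical 2PL item parameters, of how two students with the same CTT sum score can receive different IRT ability estimates because their response patterns differ. Nothing here reflects the actual test or its calibrated parameters.

import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical 2PL item parameters for a four-item test.
a = np.array([0.8, 1.0, 1.5, 2.0])    # discriminations
b = np.array([-1.0, 0.0, 0.5, 1.5])   # difficulties

def neg_log_lik(theta, x):
    # Negative 2PL log-likelihood of response pattern x at ability theta.
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

def irt_theta(x):
    # Maximum-likelihood ability estimate on a bounded interval.
    return minimize_scalar(neg_log_lik, args=(x,), bounds=(-4, 4),
                           method="bounded").x

x1 = np.array([1, 1, 0, 0])     # correct on the two easiest items
x2 = np.array([0, 0, 1, 1])     # correct on the two hardest items
print(x1.sum(), irt_theta(x1))  # identical CTT sum scores (2)...
print(x2.sum(), irt_theta(x2))  # ...but different IRT ability estimates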


2014 ◽  
Vol 27 (7) ◽  
pp. 1099-1111 ◽  
Author(s):  
Anne E. Mueller ◽  
Daniel L. Segal ◽  
Brandon Gavett ◽  
Meghan A. Marty ◽  
Brian Yochim ◽  
...  

Background: The Geriatric Anxiety Scale (GAS; Segal, D. L., June, A., Payne, M., Coolidge, F. L., & Yochim, B. (2010). Journal of Anxiety Disorders, 24, 709–714. doi:10.1016/j.janxdis.2010.05.002) is a self-report measure of anxiety that was designed to address unique issues associated with anxiety assessment in older adults. This study is the first to use item response theory (IRT) to examine the psychometric properties of a measure of anxiety in older adults. Method: A large sample of older adults (n = 581; mean age = 72.32 years, SD = 7.64 years, range = 60 to 96 years; 64% women; 88% European American) completed the GAS. IRT properties were examined. The presence of differential item functioning (DIF), or measurement bias, by age and sex was assessed, and a ten-item short form of the GAS (called the GAS-10) was created. Results: All GAS items had discrimination parameters of 1.07 or greater. Items from the somatic subscale tended to have lower discrimination parameters than items on the cognitive or affective subscales. Two items were flagged for DIF, but the impact of the DIF was negligible. Women scored significantly higher than men on the GAS and its subscales. Participants in the young-old group (60 to 79 years old) scored significantly higher on the cognitive subscale than participants in the old-old group (80 years old and older). Conclusions: Results from the IRT analyses indicated that the GAS and GAS-10 have strong psychometric properties among older adults. We conclude by discussing implications and future research directions.
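One plausible way a short form like the GAS-10 could be derived from 2PL parameters is by ranking items on their discrimination or Fisher information. The sketch below illustrates that idea with invented parameter values (the abstract reports only that all GAS discriminations were at least 1.07); it is not the authors' actual item-selection procedure.

import numpy as np

rng = np.random.default_rng(1)
# 25 hypothetical item parameters; values are invented for illustration.
a = rng.uniform(1.07, 2.5, size=25)   # discriminations
b = rng.normal(0.5, 1.0, size=25)     # difficulties

def item_information(theta, a, b):
    # Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p).
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

# Rank items by information near the ability range of interest
# (theta = 1.0 is an arbitrary choice) and keep the ten most informative.
info = item_information(1.0, a, b)
short_form = np.argsort(info)[::-1][:10]
print(sorted(short_form.tolist()))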


2022 ◽  
Vol 12 ◽  
Author(s):  
Feifei Huang ◽  
Zhe Li ◽  
Ying Liu ◽  
Jingan Su ◽  
Li Yin ◽  
...  

Educational assessment tests are often constructed using testlets because of their flexibility to test various aspects of cognitive activity and their broad content sampling. However, the violation of the local item independence assumption is inevitable when tests are built from testlet items. In this study, simulations are conducted to evaluate the performance of item response theory models and testlet response theory models for both dichotomous and polytomous items in the context of equating tests composed of testlets. We also examine the impact of the testlet effect, the length of testlet items, and sample size on the estimation of item and person parameters. The results show that testlet response theory models consistently outperformed item response theory models in accuracy across the studies, which supports the benefits of using testlet response theory models when equating tests composed of testlets. The results further indicate that when the sample size is large, item response theory models perform similarly to testlet response theory models across all studies.
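For readers unfamiliar with the model family, the following is a minimal sketch of the dichotomous testlet response theory kernel, which augments the 2PL logit with a person-by-testlet random effect to absorb local dependence among items sharing a stimulus. The simulation design, sample sizes, and parameter values are invented and do not reproduce the study's conditions.

import numpy as np

rng = np.random.default_rng(2)
n_persons, n_testlets, items_per = 1000, 5, 4
n_items = n_testlets * items_per
theta = rng.normal(size=n_persons)        # latent abilities
a = rng.uniform(0.8, 2.0, size=n_items)   # discriminations
b = rng.normal(size=n_items)              # difficulties
testlet_sd = 0.8                          # magnitude of the testlet effect
gamma = rng.normal(0.0, testlet_sd, size=(n_persons, n_testlets))

testlet_of = np.repeat(np.arange(n_testlets), items_per)  # item -> testlet
# TRT kernel: logit P(X = 1) = a_i * (theta_j - b_i - gamma_{j, d(i)})
logit = a * (theta[:, None] - b - gamma[:, testlet_of])
responses = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
print(responses.shape)  # (1000, 20); testlet_sd = 0 recovers the plain 2PL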


Author(s):  
Murtuza Shergadwala ◽  
Karthik N. Kannan ◽  
Jitesh H. Panchal

Research on expertise in design has focused primarily on understanding expert-novice differences. Although it is well established that experts perform better than novices, there is a lack of formal methods to quantify the potential impact of expertise on the quality of design outcomes. The research question addressed in this paper is: how can the impact of expertise on the quality of design solutions be quantified? Quantifying such impacts can be of particular importance in product development, recruitment processes, and design competitions. We utilize an approach based on Item Response Theory (IRT) and Concept Inventories (CI) to quantify expertise. We then investigate and validate the impact of expertise on solution quality through a behavioral experiment involving a track design problem. The results highlight the usefulness of the proposed approach and provide a functional relationship between expertise and solution quality. We also observe behavioral differences between participants with varying scores on a test taken in the behavioral experiment. The proposed approach could be used in future work to quantify learning.
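A minimal sketch of the scoring step such an approach might involve: estimating an expertise level theta from concept-inventory responses under a 2PL model, here via expected a posteriori (EAP) estimation over a quadrature grid. The item parameters and response vector are hypothetical; the paper's instrument and estimation details are not reproduced.

import numpy as np

# Hypothetical 2PL parameters for a five-item concept inventory.
a = np.array([1.1, 0.9, 1.6, 1.3, 0.7])    # discriminations
b = np.array([-0.5, 0.2, 0.8, 1.4, -1.2])  # difficulties

def eap_expertise(x, grid=np.linspace(-4, 4, 161)):
    # Expected a posteriori (EAP) estimate of expertise theta for a 0/1
    # response vector x, with a standard-normal prior over the grid.
    p = 1.0 / (1.0 + np.exp(-a * (grid[:, None] - b)))  # (nodes, items)
    lik = np.prod(p**x * (1.0 - p)**(1.0 - x), axis=1)  # likelihood per node
    post = lik * np.exp(-grid**2 / 2.0)                 # unnormalized posterior
    return np.sum(grid * post) / np.sum(post)

x = np.array([1, 1, 0, 1, 1])  # one participant's inventory responses
print(eap_expertise(x))        # scalar expertise estimate on the theta scale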

