Estimating Item Difficulty

Author(s):  
David Andrich ◽  
Ida Marais
2020 ◽  
Vol 36 (4) ◽  
pp. 554-562
Author(s):  
Alica Thissen ◽  
Frank M. Spinath ◽  
Nicolas Becker

Abstract. The cube construction task represents a novel format in the assessment of spatial ability through mental cube rotation tasks. Instead of selecting the correct answer from several response options, respondents construct their own response in a computerized test environment, leading to a higher demand for spatial ability. In the present study with a sample of 146 German high-school students, we tested an approach to manipulate the item difficulties in order to create items with a greater difficulty range. Furthermore, we compared the cube task in a distractor-free and a distractor-based version while the item stems were held identical. The average item difficulty of the distractor-free format was significantly higher than in the distractor-based format (M = 0.27 vs. M = 0.46) and the distractor-free format showed a broader range of item difficulties (.02 ≤ p_i ≤ .95 vs. .37 ≤ p_i ≤ .63). The analyses of the test results also showed that the distractor-free format had a significantly higher correlation with a broad intelligence test (r = .57 vs. r = .17). Reasons for the higher convergent validity of the distractor-free format (prevention of response elimination strategies and the broader range of item difficulties) and further research possibilities are discussed.
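
The item difficulties reported in this abstract appear to be classical difficulty indices, that is, the proportion of respondents who solve item i, so a lower value indicates a harder item. That reading is an assumption, since the abstract does not define the index. Under it, a minimal Python sketch of the computation might look as follows; the function name and the response data are hypothetical and are not taken from the study.

```python
def item_difficulties(responses):
    """Return the proportion of correct answers per item (classical index p_i)."""
    n_persons = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_persons for i in range(n_items)]


# Hypothetical scored responses: rows are respondents, columns are items (1 = correct).
scores = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]

p = item_difficulties(scores)
mean_p = sum(p) / len(p)      # average item difficulty across the test
p_range = (min(p), max(p))    # spread of item difficulties
```

Comparing `mean_p` and `p_range` between two test formats corresponds to the kind of comparison the abstract reports (M and the bounds on p_i).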


2012 ◽  
Author(s):  
Victoria Blanshteyn ◽  
Charles A. Scherbaum

2010 ◽  
Vol 65 (4) ◽  
pp. 257-282
Author(s):  
전유아 ◽  
신택수

2008 ◽  
Vol 30 (1) ◽  
pp. 105
Author(s):  
Christopher Weaver ◽  
Yoko Sato

This empirical study introduces population targeting and cut-off point targeting as a systematic approach to evaluating the performance of items in the English section of university entrance examinations. Using Rasch measurement theory, we found that the item difficulty and the types of items in a series of national university entrance examinations varied considerably over a 4-year period. However, there was progress towards improved test performance in terms of an increased number of items assessing different language skills and content areas as well as an increased number targeting test takers’ knowledge of English. This study also found that productive items rather than receptive items better targeted test takers’ overall knowledge of English. Moreover, productive items were more consistently located around the probable cut point for university admissions. The paper concludes with a detailed account of a number of probable factors that could influence item performance, such as the use of rating scales.

Japanese abstract (translated): This paper empirically examines changes in the English questions of the entrance examinations at a national university. As a systematic approach to examining test item results, we introduced population targeting and cut-off point targeting. Analysis using Rasch modeling showed that over the past four years item difficulty and item types changed considerably, in that the items measured a variety of skills, covered diverse content, and increasingly tested knowledge of English. Furthermore, items measuring productive ability tended to cluster around the cut-off point for admission decisions more than items measuring receptive ability. We examined in detail a variety of factors that may influence performance on individual items.

