Quantitatively ranking incorrect responses to multiple-choice questions using item response theory

Author(s): Trevor I. Smith, Kyle J. Louis, Bartholomew J. Ricci, Nasrine Bendjilali
Author(s): Adam J. Berinsky, Michele F. Margolis, Michael W. Sances, Christopher Warshaw

Abstract: Inattentive respondents introduce noise into data sets, weakening correlations between items and increasing the likelihood of null findings. “Screeners” have been proposed as a way to identify inattentive respondents, but questions remain regarding their implementation. First, what is the optimal number of Screeners for identifying inattentive respondents? Second, what types of Screener questions best capture inattention? In this paper, we address both of these questions. Using item response theory to aggregate individual Screeners, we find that four Screeners are sufficient to identify inattentive respondents. Moreover, two grid and two multiple-choice questions work well. Our findings are relevant to applied survey research in political science and other disciplines. Most importantly, our recommendations enable the standardization of Screeners on future surveys.
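As an illustration of the aggregation step described above, the sketch below fits respondent-level ability (here read as "attentiveness") from binary Screener passes under a two-parameter logistic (2PL) IRT model and flags low scorers. This is a minimal sketch, not the authors' estimation code: the item parameters, the response matrix, and the flagging cutoff of −1.0 are hypothetical, and the per-respondent maximum-likelihood step is a simplification of the marginal estimation typically used in practice.

```python
# Minimal sketch: estimate respondent "attentiveness" from binary Screener
# passes with a 2PL IRT model, then flag low scorers as likely inattentive.
# Item parameters (a, b) and the cutoff are illustrative, not from the paper.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import expit  # logistic function

# Hypothetical calibrated item parameters for four Screeners:
a = np.array([1.2, 0.9, 1.5, 1.1])    # discrimination
b = np.array([-0.5, 0.0, 0.3, -0.2])  # difficulty

def neg_log_lik(theta, y):
    """Negative 2PL log-likelihood for one respondent's pass/fail vector y."""
    p = expit(a * (theta - b))          # P(pass item j | theta)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def estimate_theta(y):
    """Maximum-likelihood attentiveness estimate on a bounded interval."""
    return minimize_scalar(neg_log_lik, args=(y,), bounds=(-4, 4),
                           method="bounded").x

responses = np.array([[1, 1, 1, 1],   # passes all Screeners
                      [0, 1, 0, 0],   # likely inattentive
                      [1, 0, 1, 1]])
thetas = np.array([estimate_theta(y) for y in responses])
flagged = thetas < -1.0               # illustrative cutoff for inattention
print(np.round(thetas, 2), flagged)
```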


2020, Vol 78 (4), pp. 576-594
Author(s): Bing Jia, Dan He, Zhemin Zhu

The quality of multiple-choice questions (MCQs), as well as students' response behavior on MCQs, is an educational concern. MCQs cover a wide range of educational content and can be scored immediately and accurately. However, many studies have found flawed items in this exam type, which can yield misleading insights into students' performance and affect important decisions. This research sought to determine the characteristics of MCQs and the factors that may affect their quality by using item response theory (IRT) to evaluate exam data. Four samples of different sizes, drawn from secondary and higher education in the US and China, were chosen. Item difficulty and discrimination were estimated using IRT item analysis models. Results were as follows. First, guessing played only a minor role in the MCQ exams, because all data sets fit the two-parameter logistic model better than the three-parameter logistic model. Second, the quality of MCQs depended more on the degree of training of the examiners than on whether the level was secondary or higher education. Lastly, MCQs must be evaluated to ensure that only high-quality items are used as bases of inference in secondary and higher education.

Keywords: higher education, item evaluation, item response theory, multiple-choice test, secondary education
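For reference, the model comparison above rests on the standard two- and three-parameter logistic item response functions; the 3PL adds a pseudo-guessing parameter $c_j$, whose contribution the better fit of the 2PL suggests is negligible for these data:

$$P_{\text{2PL}}(X_{ij}=1 \mid \theta_i) = \frac{1}{1 + e^{-a_j(\theta_i - b_j)}}, \qquad P_{\text{3PL}}(X_{ij}=1 \mid \theta_i) = c_j + (1 - c_j)\,\frac{1}{1 + e^{-a_j(\theta_i - b_j)}},$$

where $\theta_i$ is the examinee's ability, $a_j$ the item's discrimination, $b_j$ its difficulty, and $c_j$ its lower asymptote (pseudo-guessing) parameter.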


2021, pp. 097226292110019
Author(s): Isha Bajaj, Mandeep Kaur

Knowledge about financial products and services is essential for making rational financial decisions. Financial knowledge is a broad construct and is therefore difficult to measure, and previous studies have used various methods and instruments to do so; a comprehensive, validated instrument is still needed. In this study, an attempt has been made to measure financial knowledge using a scale of multiple-choice questions covering basic and specific financial knowledge related to banking products and services. Each correct answer is scored ‘1’ and each wrong answer ‘0’. This dichotomous scale has been validated using item response theory, which assesses the appropriateness of the questions (items) included in the scale with respect to difficulty and discrimination. The results reveal that the overall instrument fulfils both criteria: the twenty-two-item test is reliable and valid, with discrimination indices that are all positive, ranging from 0.23 to 1.96, and difficulty indices ranging from –5.66 to 0.90. The purpose of this article is to encourage the use of validated scales for measuring financial knowledge and to reduce the confusion between financial knowledge and financial literacy.
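The sketch below illustrates the two mechanical steps described above, dichotomous scoring against an answer key and screening fitted 2PL item parameters against simple acceptance rules (positive discrimination, difficulty within a plausible range). It is an assumption-laden illustration: the answer key, responses, parameter values, and screening thresholds are hypothetical, not the study's data or estimates.

```python
# Minimal sketch: score MCQ responses dichotomously, then screen items by
# their (hypothetical) 2PL difficulty and discrimination estimates.
import numpy as np

answer_key = np.array([2, 0, 3, 1])               # hypothetical correct options
raw_choices = np.array([[2, 0, 3, 1],
                        [2, 1, 3, 0],
                        [0, 0, 2, 1]])
scores = (raw_choices == answer_key).astype(int)  # 1 = correct, 0 = wrong
print("item proportions correct:", scores.mean(axis=0))

# Hypothetical 2PL estimates (discrimination a, difficulty b) per item,
# chosen to echo the ranges reported in the abstract:
a = np.array([0.23, 1.10, 1.96, 0.85])
b = np.array([-5.66, -1.20, 0.10, 0.90])

acceptable = (a > 0) & (b > -6.0) & (b < 3.0)     # illustrative screening rules
for j, ok in enumerate(acceptable):
    print(f"item {j}: a={a[j]:.2f}, b={b[j]:.2f}, keep={bool(ok)}")
```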


Author(s): Andre F. De Champlain, Andre-Philippe Boulais, Andrew Dallas

Purpose: The aim of this research was to compare different methods of calibrating the multiple-choice question (MCQ) and clinical decision-making (CDM) components of the Medical Council of Canada’s Qualifying Examination Part I (MCCQEI) based on item response theory (IRT). Methods: Our data consisted of test results from 8,213 first-time applicants to the MCCQEI in the spring and fall 2010 and 2011 test administrations. The data set contained several thousand multiple-choice items and several hundred CDM cases. Four dichotomous calibrations were run using BILOG-MG 3.0, and all three mixed-item-format calibrations (dichotomous MCQ responses and polytomous CDM case scores) were conducted using PARSCALE 4. Results: The 2-PL model had identical numbers of items with chi-square values at or below a Type I error rate of 0.01 (83/3,499, or 0.02). In all three polytomous models, whether the MCQs were anchored or concurrently run with the CDM cases, the results suggest very poor fit. All IRT abilities estimated from the dichotomous calibration designs correlated very highly with each other. IRT-based pass-fail rates were extremely similar, not only across calibration designs and methods but also with regard to the decisions actually reported to candidates. The largest difference in pass rates was 4.78%, which occurred between the mixed-format concurrent 2-PL graded response model (pass rate = 80.43%) and the dichotomous anchored 1-PL calibration (pass rate = 85.21%). Conclusion: Simpler calibration designs with dichotomized items should be implemented, as the dichotomous calibrations provided a better fit to the item response matrix than the more complex polytomous calibrations.
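The comparison of pass rates across calibration designs can be illustrated with the short sketch below: given ability estimates from two designs and a common cut score, it computes each design's pass rate and the rate of agreement on individual pass/fail decisions. This is a sketch under stated assumptions, with simulated ability estimates and an arbitrary cut score rather than MCCQEI data.

```python
# Minimal sketch: compare pass rates and decision agreement between two
# sets of IRT ability estimates under a common cut score. Simulated data.
import numpy as np

rng = np.random.default_rng(0)
theta_dichotomous = rng.normal(0.0, 1.0, size=8213)              # design A estimates
theta_mixed = theta_dichotomous + rng.normal(0.0, 0.2, size=8213)  # design B estimates
cut_score = -0.9                                                  # illustrative cut

pass_a = np.mean(theta_dichotomous >= cut_score)
pass_b = np.mean(theta_mixed >= cut_score)
agreement = np.mean((theta_dichotomous >= cut_score) == (theta_mixed >= cut_score))
print(f"pass rate A: {pass_a:.2%}, pass rate B: {pass_b:.2%}, "
      f"decision agreement: {agreement:.2%}")
```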

