The effect of response order on candidate viewing behaviour and item difficulty in a multiple-choice listening test

2020
pp. 026553222091731
Author(s):  
Franz Holzknecht ◽  
Gareth McCray ◽  
Kathrin Eberharter ◽  
Benjamin Kremmel ◽  
Matthias Zehentner ◽  
...  

Studies from various disciplines have reported that the spatial location of options, in relation to processing order, affects which option is ultimately chosen. A large number of studies have found a primacy effect, that is, the tendency to prefer the first option. In this paper we report evidence that the position of the key in four-option multiple-choice (MC) listening test items may affect item difficulty and thereby potentially introduce construct-irrelevant variance. Two sets of analyses were undertaken. In Study 1 we explored 30 test takers’ processing of listening items from the Aptis Test via eye-tracking. An unexpected finding concerned the amount of processing devoted to the different response options of the MC questions, given their order. Based on this, in Study 2 we examined the direct effect of key position on item difficulty in a sample of 200 live Aptis items with around 6000 test takers per item. The results suggest that the spatial location of the key in MC listening tests affects both the amount of processing it receives and the item’s difficulty. Given the widespread use of MC tasks in language assessments, these findings seem crucial, particularly for tests that randomize response order. Candidates who by chance have many keys in the last position might be significantly disadvantaged.
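A minimal sketch of the kind of item-level analysis Study 2 describes: regressing classical item difficulty on the position of the key. The file name, column names, and model specification below are assumptions for illustration, not the authors' actual analysis.

```python
# Illustrative sketch (not the authors' analysis): regress classical item
# difficulty on key position across a pool of four-option MC listening items.
import pandas as pd
import statsmodels.formula.api as smf

items = pd.read_csv("item_stats.csv")   # hypothetical file, one row per item
# assumed columns:
#   p_correct    - proportion of the ~6000 test takers answering correctly
#   key_position - 1 (key displayed first) to 4 (key displayed last)

model = smf.ols("p_correct ~ C(key_position)", data=items).fit()
print(model.summary())
# A negative coefficient for key position 4 would mirror the reported
# disadvantage for candidates whose keys often appear last.
```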

Author(s):  
Bettina Hagenmüller

Abstract. The multiple-choice item format is widely used in test construction and large-scale assessment. So far, there has been little research on the impact of the position of the solution among the response options, and the few existing results are inconsistent. Since altering the order of the response options would be an easy way to create parallel items for group settings, the influence of the response options’ position on item difficulty should be examined. The Linear Logistic Test Model (Fischer, 1972) was used to analyze the data of 829 students aged 8–20 years, who worked on general knowledge items. It was found that the position of the solution among the response options has an influence on item difficulty. Items are easiest when the solution is in first place and more difficult when the solution is placed in a middle position or at the end of the set of response options.
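The Linear Logistic Test Model decomposes each Rasch item difficulty into a weighted sum of basic parameters, one of which can encode the position of the solution. A standard statement of the model (generic notation, not copied from the paper):

```latex
P(X_{vi}=1 \mid \theta_v) \;=\; \frac{\exp(\theta_v - \beta_i)}{1 + \exp(\theta_v - \beta_i)},
\qquad
\beta_i \;=\; \sum_{j=1}^{m} q_{ij}\,\eta_j + c
```

Here θ_v is the ability of person v, β_i the difficulty of item i, η_j the basic parameters (for example, an effect of the solution appearing in a middle or final position), q_ij the known weights of those parameters for item i, and c a normalization constant.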


Author(s):  
Tesalonika Br Karo ◽  
Viator Lumbanraja ◽  
Novalina Sembiring

The purpose of this research is to describe the ability of the eleventh-grade students of SMA Deli Murni Bandar Baru to use countable and uncountable nouns. The population of this research was the eleventh-grade students, with 58 students taken as the sample. The instrument for collecting data was a test on countable and uncountable nouns. A tryout test was administered to establish the validity, reliability, and item difficulty of the test items. The results showed that 5 students (15%) belong to the high category, 24 students (73%) to the moderate category, and 4 students (12%) to the low category. The mean score was 61.39, and only 24% of the students did well on the test, with 12 students scoring above 75; this indicates that the eleventh-grade students of SMA Deli Murni Bandar Baru are not yet able to use countable and uncountable nouns. The total number of incorrect answers made by the students was 502. Of the students’ mistakes, 33% occurred in the uncountable multiple-choice items (including indefinite and quantifier uncountable nouns), 34% in the countable multiple-choice items (including singular, regular, and irregular countable nouns), and 33% in the countable essay items (including regular and irregular countable nouns). Based on the findings and conclusions, suggestions are offered to English teachers, English students, and other researchers. In particular, English teachers are advised to work on improving students' ability to use countable and uncountable nouns.
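The high/moderate/low split reported here is commonly made with mean ± 1 standard deviation bands; the sketch below assumes that convention (the abstract does not state the cut-offs) and uses a hypothetical file name.

```python
# Illustrative sketch of a mean +/- 1 SD ability categorization; the banding
# convention and file name are assumptions, not taken from the study.
import numpy as np

scores = np.loadtxt("student_scores.csv")     # hypothetical: one test score per student
mean, sd = scores.mean(), scores.std(ddof=1)
high = int((scores > mean + sd).sum())
low = int((scores < mean - sd).sum())
moderate = scores.size - high - low
print(f"high: {high}, moderate: {moderate}, low: {low}, mean score: {mean:.2f}")
```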


2020
Vol 2 (4)
pp. p59
Author(s):  
Michael Joseph Wise

The effectiveness of multiple-choice (MC) items depends on the quality of the response options—particularly how well the incorrect options (“distractors”) attract students who have incomplete knowledge. It is often contended that test-writers are unable to devise more than two plausible distractors for most MC items, and that the effort needed to do so is not worthwhile in terms of the items’ psychometric qualities. To test these contentions, I analyzed students’ performance on 545 MC items across six science courses that I have taught over the past decade. Each MC item contained four distractors, and the dataset included more than 19,000 individual responses. All four distractors were deemed plausible in one-third of the items, and three distractors were plausible in another third. Each additional plausible distractor led to an average 13% increase in item difficulty. Moreover, an increase in plausible distractors led to a significant increase in the discriminability of the items, with a leveling off by the fourth distractor. These results suggest that—at least for teachers writing tests to assess mastery of course content—it may be worthwhile to eschew recent skepticism and continue attempting to write MC items with three or four distractors.
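One way to operationalize a “plausible” distractor is a minimum selection rate (for instance, at least 5% of examinees choosing it). The threshold, file layout, and column names below are assumptions for illustration and are not taken from the paper.

```python
# Illustrative sketch: count "plausible" distractors per item (selection rate
# above an assumed 5% threshold) and summarize item difficulty by that count.
import pandas as pd

choices = pd.read_csv("responses_long.csv")   # hypothetical: one row per response,
                                              # columns: item_id, option_chosen, key

def plausible_distractors(group, threshold=0.05):
    shares = group["option_chosen"].value_counts(normalize=True)
    key = group["key"].iloc[0]
    return int((shares.drop(key, errors="ignore") >= threshold).sum())

n_plausible = choices.groupby("item_id").apply(plausible_distractors)
p_incorrect = choices.groupby("item_id").apply(lambda g: (g["option_chosen"] != g["key"]).mean())

summary = pd.concat([n_plausible.rename("n_plausible"),
                     p_incorrect.rename("p_incorrect")], axis=1)
print(summary.groupby("n_plausible")["p_incorrect"].mean())   # difficulty by distractor count
```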


2019
Vol 44 (1)
pp. 33-48
Author(s):  
Daniel M. Bolt ◽  
Nana Kim ◽  
James Wollack ◽  
Yiqin Pan ◽  
Carol Eckerly ◽  
...  

Discrete-option multiple-choice (DOMC) items differ from traditional multiple-choice (MC) items in the sequential administration of response options (up to display of the correct option). DOMC can be appealing in computer-based test administrations due to its protection of item security and its potential to reduce testwiseness effects. A psychometric model for DOMC items that attends to the random positioning of key location across different administrations of the same item is proposed, a feature that has been shown to affect DOMC item difficulty. Using two empirical data sets having items administered in both DOMC and MC formats, the variability in key location effects across both items and persons is considered. The proposed model exploits the capacity of the DOMC format to isolate both (a) distinct sources of item difficulty (i.e., related to the identification of keyed responses versus the ruling out of distractor options) and (b) distinct person proficiencies related to the same two components. Practical implications in terms of the randomized process applied to schedule item key location in DOMC test administrations are considered.
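A simplified illustration of how a DOMC response model can separate the two components the abstract mentions; this is a generic sketch with assumed notation, not the model proposed in the paper. With options shown one at a time, an examinee answers correctly by rejecting each distractor presented before the key and then accepting the key:

```latex
P(\text{accept key}) = \sigma\!\left(\theta^{(k)}_v - b^{(k)}_i - \delta_{\text{pos}}\right),
\qquad
P(\text{reject distractor } j) = \sigma\!\left(\theta^{(d)}_v - b^{(d)}_{ij}\right)
```

Here σ is the logistic function, θ^(k)_v and θ^(d)_v are the person's proficiencies for identifying keyed responses and ruling out distractors, b^(k)_i and b^(d)_ij the corresponding item difficulty components, and δ_pos a key-position effect reflecting where in the randomized sequence the key happens to be displayed.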


2020
Vol 3 (1)
pp. 102-113
Author(s):  
Sutami

This research aims to produce a valid and reliable Indonesian language assessment instrument in the form of HOTS test items and to describe the quality of those items for measuring the HOTS skills of tenth-grade SMA and SMK students. The study was a research and development study adapted from Borg & Gall’s development model, comprising the following steps: research and information collection, planning, early product development, limited tryout, revision of the early product, field tryout, and revision of the final product. The results show that the HOTS assessment instrument consists of 40 multiple-choice items and 5 essay items. Based on expert judgment of the materials, construction, and language, the instrument was valid and appropriate for use. The reliability coefficients were 0.88 for the multiple-choice items and 0.79 for the essays. The multiple-choice items had an average difficulty of 0.57 (categorized as average), an average item discrimination of 0.44 (good), and well-functioning distractors. The essay items had an average difficulty of 0.60 (average) and an average item discrimination of 0.45 (good).
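A minimal sketch of the classical indices reported above (KR-20 reliability, mean difficulty, mean point-biserial discrimination) for the dichotomously scored multiple-choice items; the data layout and file name are assumptions.

```python
# Illustrative sketch: KR-20 reliability, item difficulty, and point-biserial
# discrimination for dichotomously scored MC items. Data layout is assumed.
import numpy as np

X = np.loadtxt("mc_responses.csv", delimiter=",")   # hypothetical 0/1 matrix:
                                                    # rows = students, columns = 40 items
k = X.shape[1]
p = X.mean(axis=0)                                  # item difficulty indices
total = X.sum(axis=1)
kr20 = (k / (k - 1)) * (1 - (p * (1 - p)).sum() / total.var(ddof=1))
disc = np.array([np.corrcoef(X[:, i], total)[0, 1] for i in range(k)])
print(f"KR-20 = {kr20:.2f}, mean difficulty = {p.mean():.2f}, "
      f"mean discrimination = {disc.mean():.2f}")
```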


Author(s):  
Ahmad S. Audeh

The original Guilford formula for estimating multiple-choice item difficulty was based on a penalty for guessing. This penalty assumed completely random (blind) guessing, which means it rests purely on mathematical estimation and on assumptions that are significantly violated in practice. Authentic and fair estimation, by contrast, should be based on a mixed scoring formula that adds a second correction factor, integrating measurement theory with decision theory to account for partial knowledge and risk-taking behaviour. A new formula with two correction factors, addressing guessing, partial knowledge, and risk-taking, is presented in this paper. Further studies are suggested to review the validity of the main assumptions of item theory models.
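The correction for guessing on which such formula scoring rests is the classic rights-minus-wrongs adjustment; the statement below uses generic notation and does not reproduce the two-factor formula proposed in the paper:

```latex
S \;=\; R - \frac{W}{k - 1},
\qquad
p_{i}^{\text{corr}} \;=\; \frac{1}{N}\left(R_i - \frac{W_i}{k - 1}\right)
```

Here R and W are a candidate's numbers of right and wrong answers (omissions excluded) and k the number of response options; at the item level, R_i and W_i are the counts of the N examinees answering item i correctly and incorrectly. Under purely blind guessing the expected gain from guessing cancels out; partial knowledge and differential risk-taking violate that assumption, which is what motivates the second correction factor introduced in the paper.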

