The Advantages of Five-Option Multiple-Choice Items in Classroom Tests of Student Mastery

2020 · Vol 2 (4) · pp. p59
Author(s): Michael Joseph Wise

The effectiveness of multiple-choice (MC) items depends on the quality of the response options—particularly how well the incorrect options (“distractors”) attract students who have incomplete knowledge. It is often contended that test-writers are unable to devise more than two plausible distractors for most MC items, and that the effort needed to do so is not worthwhile in terms of the items’ psychometric qualities. To test these contentions, I analyzed students’ performance on 545 MC items across six science courses that I have taught over the past decade. Each MC item contained four distractors, and the dataset included more than 19,000 individual responses. All four distractors were deemed plausible in one-third of the items, and three distractors were plausible in another third. Each additional plausible distractor increased item difficulty by an average of 13%. Moreover, additional plausible distractors significantly increased the items’ discriminability, with a leveling off by the fourth distractor. These results suggest that—at least for teachers writing tests to assess mastery of course content—it may be worthwhile to eschew recent skepticism and continue to attempt to write MC items with three or four distractors.
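As a rough illustration of the tallying such a study involves, the sketch below computes one item's difficulty and counts its plausible distractors from raw responses. The `analyze_item` helper, the 5% plausibility threshold, and the difficulty-as-proportion-incorrect convention are assumptions for illustration, not the article's actual criteria.

```python
# Illustrative sketch only: tally one item's difficulty and plausible
# distractors. Assumes a distractor is "plausible" when at least 5% of
# examinees pick it; the article's own criterion may differ.
from collections import Counter

def analyze_item(responses, key, plausibility_threshold=0.05):
    """responses: chosen option labels, one per examinee; key: correct option."""
    n = len(responses)
    counts = Counter(responses)
    difficulty = 1 - counts[key] / n  # proportion answering incorrectly
    plausible = [opt for opt, c in counts.items()
                 if opt != key and c / n >= plausibility_threshold]
    return difficulty, len(plausible)

# Toy usage: 10 examinees on one five-option item whose key is "B"
difficulty, n_plausible = analyze_item(list("BBADBCEBBD"), key="B")
print(f"difficulty={difficulty:.2f}, plausible distractors={n_plausible}")
```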

2019 · Vol 44 (1) · pp. 33-48
Author(s): Daniel M. Bolt, Nana Kim, James Wollack, Yiqin Pan, Carol Eckerly, ...

Discrete-option multiple-choice (DOMC) items differ from traditional multiple-choice (MC) items in that response options are administered sequentially (up to the display of the correct option). DOMC can be appealing in computer-based test administrations because it protects item security and can reduce testwiseness effects. A psychometric model is proposed for DOMC items that attends to the random positioning of the key across different administrations of the same item, a feature that has been shown to affect DOMC item difficulty. Using two empirical data sets with items administered in both DOMC and MC formats, the variability in key-location effects across both items and persons is considered. The proposed model exploits the capacity of the DOMC format to isolate both (a) distinct sources of item difficulty (i.e., related to identifying keyed responses versus ruling out distractor options) and (b) distinct person proficiencies related to the same two components. Practical implications for the randomized process used to schedule key location in DOMC test administrations are discussed.
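A minimal sketch of the DOMC administration flow described above, assuming a yes/no endorsement of each displayed option. The `administer_domc` helper and its stopping rule are illustrative simplifications, not the authors' implementation or their psychometric model.

```python
import random

def administer_domc(options, key, judge):
    """Present options one at a time in random order, as in the DOMC format.
    `judge(option)` is the examinee's yes/no endorsement of an option.
    Simplified stopping rule: the item ends when a distractor is endorsed
    (scored wrong) or when the key is displayed (scored by the endorsement).
    Returns (correct, position at which the item ended)."""
    order = random.sample(list(options), len(options))
    for position, opt in enumerate(order, start=1):
        endorsed = judge(opt)
        if opt == key:
            # Early vs. late key position is the randomized feature whose
            # difficulty effect the proposed model addresses.
            return endorsed, position
        if endorsed:
            return False, position
    raise ValueError("key must be among the options")

# Toy usage: an examinee who endorses exactly the keyed option
print(administer_domc("ABCD", key="B", judge=lambda opt: opt == "B"))
```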


2022
Author(s): Achmad Shabir

The aim of this study was to describe the quality of the English testing instrument used in the Try Out National Exam conducted by 40 junior high schools in Makassar, Sulawesi Selatan, using Item Response Theory (IRT) based on the one- (1PL), two- (2PL), and three-parameter (3PL) logistic models. The data consist of 1,267 students' answer sheets; the test comprises 50 multiple-choice items. Results showed that the test performs reasonably well on both item difficulty and item discrimination under 1PL and 2PL estimation. Under 3PL estimation, however, the test was unable to discriminate students' ability, and 38% of the items were easy to guess.
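For reference, the three nested logistic IRT models differ only in which item parameters are free; in standard textbook notation (a sketch of the general formulas, not the study's estimation code):

```latex
% Probability that an examinee of ability \theta answers item i correctly.
% 3PL: discrimination a_i, difficulty b_i, pseudo-guessing c_i
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\left[-a_i(\theta - b_i)\right]}
% 2PL: fix c_i = 0.  1PL (Rasch-type): additionally constrain all a_i to be equal.
```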


2020 · Vol 3 (1) · pp. 102-113
Author(s): Sutami

This research aims to produce a valid and reliable Indonesian-language assessment instrument in the form of higher-order thinking skills (HOTS) test items, and to describe the quality of those items for measuring the HOTS of tenth-grade SMA and SMK students. The study was a research and development study adapting Borg & Gall's development model, with the following steps: research and information collection, planning, early product development, limited tryout, revision of the early product, field tryout, and revision of the final product. The results show that the HOTS assessment instrument consists of 40 multiple-choice items and 5 essay items. Based on expert judgment of material, construction, and language, the instrument was valid and appropriate for use. The reliability coefficients were 0.88 for the multiple-choice items and 0.79 for the essays. The multiple-choice items had a mean difficulty of 0.57 (moderate) and a mean item discrimination of 0.44 (good), and the distractors functioned well. The essay items had a mean difficulty of 0.60 (moderate) and a mean item discrimination of 0.45 (good).
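The abstract does not name the reliability estimator; Cronbach's alpha (equivalent to KR-20 for dichotomous items) is the usual choice, and a minimal sketch is below. The `cronbach_alpha` helper and the toy data are illustrative only.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha; scores is a 2-D array (rows = examinees, cols = items)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Toy usage: 4 examinees x 3 dichotomous items
print(round(cronbach_alpha([[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]), 2))
```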


2018 · Vol 13 (4) · pp. 633-648
Author(s): Kobi Peled

A striking feature of Palestinian oral history projects is the extensive use that interviewees, especially those born before the 1948 Arab–Israeli war, make of direct speech to communicate their memories. They do so irrespective of whether they participated in or actually heard the dialogues they wish to convey. This article seeks to characterize and explain this phenomenon. In the interviews conducted by the author—an Arabic-speaking Jew—as well as in other projects, this mode of speech is marked by ease of transition from character to character and between different points in time. It clearly gives pleasure to those engaged in the act of remembering, and it grades readily into a theatrical performance in which tone of speech and the quality of the acting become the main thing. This form of discourse sprang up from the soil of a rural oral culture and still flourishes as a prop for supporting memory, a vessel for collecting and disseminating stories, and a technique for expressing identification with significant figures from the past.


2021 · pp. 9-10
Author(s): Bhoomika R. Chauhan, Jayesh Vaza, Girish R. Chauhan, Pradip R. Chauhan

Multiple-choice questions are nowadays used in competitive examinations and formative assessment to assess students' eligibility and certification. Item analysis is the process of collecting, summarizing, and using information from students' responses to assess the quality of test items. The goal of the study was to identify the relationship between the item difficulty index and the item discriminating index in medical students' assessment. 400 final-year medical students from various medical colleges responded to 200 items constructed for the study. The responses were assessed and analysed for item difficulty index and item discriminating power, and the two were analysed statistically to identify their correlation. The discriminating power of items with a difficulty index of 40%-50% was the highest. Summary and Conclusion: Items with a difficulty index in the range of 30%-70% are good discriminators.
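A hedged sketch of the classical indices involved: difficulty as the proportion correct and discrimination via the common upper/lower 27% group method. The `item_indices` helper and the 27% tail are conventional assumptions; the study's exact procedure is not given in the abstract.

```python
import numpy as np

def item_indices(item_scores, total_scores, tail=0.27):
    """Classical indices for one dichotomous item (1 = correct, 0 = wrong).
    Difficulty index = proportion correct; discrimination = difference in
    proportion correct between the top and bottom `tail` of total scorers."""
    item_scores = np.asarray(item_scores, dtype=float)
    order = np.argsort(total_scores)          # examinees sorted by total score
    g = max(1, int(round(tail * len(order))))
    lower, upper = order[:g], order[-g:]
    difficulty = item_scores.mean()
    discrimination = item_scores[upper].mean() - item_scores[lower].mean()
    return difficulty, discrimination

# Toy usage: 8 examinees; total scores drive the group split
d, disc = item_indices([1, 0, 1, 1, 0, 0, 1, 1], [90, 35, 70, 80, 40, 30, 85, 60])
print(f"difficulty={d:.2f}, discrimination={disc:.2f}")
```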


1979 · Vol 1 (2) · pp. 24-33
Author(s): James R. McMillan

Most educators agree that classroom evaluation practices need improvement. One way to improve testing is to use high-quality objective multiple-choice exams. Almost any understanding or ability which can be tested by another test form can also be tested by means of multiple-choice items. Based on a survey of 173 respondents, it appears that marketing teachers are disenchanted with multiple-choice questions and use them sparingly. Further, their limited use is largely in the introductory marketing course even though there are emerging pressures for universities to take a closer look at the quality of classroom evaluation at all levels.


Author(s): David DiBattista, Laura Kurzawa

Because multiple-choice testing is so widespread in higher education, we assessed the quality of items used on classroom tests by carrying out a statistical item analysis. We examined undergraduates’ responses to 1198 multiple-choice items on sixteen classroom tests in various disciplines. The mean item discrimination coefficient was +0.25, with more than 30% of items having unsatisfactory coefficients less than +0.20. Of the 3819 distractors, 45% were flawed either because less than 5% of examinees selected them or because their selection was positively rather than negatively correlated with test scores. In three tests, more than 40% of the items had an unsatisfactory discrimination coefficient, and in six tests, more than half of the distractors were flawed. Discriminatory power suffered dramatically when the selection of one or more distractors was positively correlated with test scores, but it was only minimally affected by the presence of distractors that were selected by less than 5% of examinees. Our findings indicate that there is considerable room for improvement in the quality of many multiple-choice tests. We suggest that instructors consider improving the quality of their multiple-choice tests by conducting an item analysis and by modifying distractors that impair the discriminatory power of items.
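The two flaw criteria the authors apply (selection rate below 5%, or selection positively correlated with total score) translate directly into code. The sketch below is an illustrative reconstruction, not the authors' analysis script; `flag_distractors` is a hypothetical helper.

```python
import numpy as np

def flag_distractors(choices, key, total_scores, min_rate=0.05):
    """Flag flawed distractors per the two criteria in the abstract: chosen
    by fewer than `min_rate` of examinees, or selection positively correlated
    with total test score (point-biserial via np.corrcoef on a 0/1 indicator)."""
    choices = np.asarray(choices)
    total_scores = np.asarray(total_scores, dtype=float)
    flawed = {}
    for opt in sorted(set(choices) - {key}):
        picked = (choices == opt).astype(float)
        rate = picked.mean()
        r = float(np.corrcoef(picked, total_scores)[0, 1]) if 0 < rate < 1 else 0.0
        if rate < min_rate or r > 0:
            flawed[opt] = {"rate": round(rate, 3), "r": round(r, 3)}
    return flawed

# Toy usage: "B" works well (picked by low scorers), "C" is under-selected,
# and "D" attracts high scorers, so "C" and "D" are flagged.
choices = ["B"] * 18 + ["C"] + ["A"] * 11 + ["D"] * 10
print(flag_distractors(choices, key="A", total_scores=np.arange(40)))
```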


1996 · Vol 83 (3_suppl) · pp. 1127-1138
Author(s): Horacio J. A. Rimoldi, Elsa Inés Bei de Libonatti

The performance of 30 subjects when solving problems built around logical connectives (Conjunction, Inclusive Disjunction, Exclusive Disjunction, Conditional, and Biconditional) was compared with the results obtained when the same logical connectives were presented using a multiple-choice approach. The processes followed by the subjects in solving 20 problems were evaluated in terms of information theory, making it possible to investigate (a) the processes followed by the subjects and (b) the quality of the final answer. Analysis indicated that the problem-solving processes do not necessarily provide the same information as that obtained from the final answers. The knowledge obtained by examining the questions subjects ask is different from the knowledge obtained when examining the answers to multiple-choice items.
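The abstract does not state which information-theoretic measures were used; as a generic illustration only, Shannon entropy quantifies the uncertainty that a subject's questions can reduce. The `shannon_entropy` helper and the toy data are assumptions for illustration.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a categorical distribution; a basic
    information measure, not necessarily the study's exact metric."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

# Toy usage: uncertainty over five equally likely logical connectives
print(shannon_entropy([1, 1, 1, 1, 1]))  # log2(5), about 2.32 bits
```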

