Item Order Affects Performance on Multiple-Choice Exams

1989 ◽  
Vol 16 (2) ◽  
pp. 75-77 ◽  
Author(s):  
William R. Balch

A total of 404 General Psychology students were assigned to one of three different item orders (sequential, chapter contiguity, and random) of the same 75-question multiple-choice final exam. In the sequential (S) order exam, items appeared in the same sequence in which their supporting material was presented in the textbook and lectures. In the chapter contiguity (CC) order exam, items based on the same chapter appeared together but were not sequentially arranged within or between chapters. The order of the third exam was random (R). Scores on the sequential-order exam were higher than on the other two. There were no significant differences in completion times among the three exams.

1968 ◽  
Vol 23 (3_suppl) ◽  
pp. 1305-1306
Author(s):  
Jin Ong

A T-score procedure was suggested in place of the rights-only score or the score corrected for guessing. Multiple-choice tests were administered to 179 general psychology students, and T scores based on both the rights-only and the corrected scores were calculated. Results showed that the empirical data fit the defined values of the T score quite well and that the T scores for individual students from the two procedures were comparable.
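For context, the scoring quantities named above have standard psychometric definitions; the abstract does not give Ong's exact computations, so the following is only a sketch of the usual formulas: the rights-only score, the score corrected for guessing, and the T-score transformation.

\[
S_{\text{rights}} = R,
\qquad
S_{\text{corrected}} = R - \frac{W}{k-1},
\qquad
T = 50 + 10\,\frac{x - \bar{x}}{s_x}
\]

where \(R\) is the number of items answered correctly, \(W\) the number answered incorrectly, \(k\) the number of alternatives per item, and \(x\) a raw score standardized against the group mean \(\bar{x}\) and standard deviation \(s_x\).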


2011 ◽  
Vol 39 (1) ◽  
pp. 34-37 ◽  
Author(s):  
William R. Balch

On their first class day, introductory psychology students took a 14-question multiple-choice pretest on several principles of memory, including primacy, recency, storage, retrieval, counterbalancing, and the free-recall method. I randomly preassigned students to attend the second class, 2 days later, at one of two different times, when they either participated in a free-recall demonstration/debriefing or heard a lecture on comparable material. In the third class, 5 days later, they took a posttest identical to the pretest. On the posttest, but not the pretest, students who participated in the demonstration/debriefing significantly outperformed those who heard only the lecture.


1983 ◽  
Vol 52 (1) ◽  
pp. 203-208 ◽  
Author(s):  
John P. Houston

To estimate the self-evidence of basic principles of psychology, 50 UCLA introductory psychology students answered 21 multiple-choice questions, each embodying one learning or memory phenomenon. Seventy-one percent of the items were answered correctly more often than chance. The probability of an item being answered correctly was unrelated to the subjects' familiarity with the names of the phenomena and unrelated to professional psychologists' ratings of the importance of the phenomena. The possibility that we may spend an inordinate amount of time dealing with self-evident principles, because we do not seek outside evaluation of our work, is discussed.


1998 ◽  
Vol 25 (3) ◽  
pp. 181-185 ◽  
Author(s):  
William R. Balch

Two randomly assigned groups of introductory psychology students received different but comparable presentations of the same sample multiple-choice final exam. The practice-exam group took a test on the questions and immediately afterward scored their tests according to the key (i.e., questions and answers), whereas the review-exam group saw only the key and performed a control task concurrently. On a final exam given 1 week later, the practice-exam students scored significantly higher than the review-exam group. In addition, they rated their task as more helpful in preparing them for the final. These effects did not interact with students' class standing. Apparently, students at all levels of academic ability benefit from an objective assessment of their preparation for a final exam.


2011 ◽  
Vol 39 (1) ◽  
pp. 38-41 ◽  
Author(s):  
Alisha L. Francis ◽  
Jerrold Barnett

We investigated Montepare’s (2005, 2007) self-correcting procedure for multiple-choice exams. Findings related to memory suggest this procedure should lead to improved retention by encouraging students to distribute the time spent reviewing the material. Results from a general psychology class (n = 98) indicate that the benefits are not as definitive as expected. Students’ initial performance moderated the benefits of the procedure such that comprehensive final exam scores were significantly higher for the self-correcting condition when controlling for initial quiz performance, with a marginally significant interaction (p = .06) between initial quiz scores and condition. The findings underscore the importance of using the scientist-educator model to evaluate pedagogical decisions while considering practical implications and return on investment.


2019 ◽  
Vol 8 (1) ◽  
pp. 57-68
Author(s):  
Anetha L. F. Tilaar ◽  
Hasriyanti Hasriyanti

An initial interview revealed that the questions used in each final school exam had not been analyzed as a whole, so their quality was unknown. This study therefore aims to analyze the odd-semester final exam items in mathematics at SMP Negeri 4 Pineleng in the 2018/2019 school year in terms of reliability, validity, discriminating power, level of difficulty, and distractor effectiveness (for the multiple-choice questions specifically). The research is a descriptive analysis with a quantitative approach. The subjects were 97 students, comprising all grade VIII students of SMP Negeri 4 Pineleng in the 2018/2019 school year. Data were collected through documentation and processed manually with the help of Excel, with the Anatest Version 4 program used specifically to analyze distractor effectiveness. From the results of the analysis it can be concluded that, for the odd-semester final mathematics exam at SMP Negeri 4 Pineleng in the 2018/2019 school year: (1) of the multiple-choice questions, 5 questions (16.67%) were of very good quality and can be stored in the question bank for reuse, 15 questions (50.00%) still need revision, and 10 questions (33.33%) were of very poor quality and should not be stored in the question bank; (2) of the essay questions, 2 questions (40.00%) were of good quality, 2 questions (40.00%) need revision, and 1 question (20.00%) was of poor quality.
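As an illustration of the classical item indices mentioned above (difficulty and discriminating power), here is a minimal Python sketch using hypothetical 0/1 response data. It shows the standard proportion-correct and upper-lower-group formulas, not the actual Excel or Anatest Version 4 computations used in the study.

# Hypothetical item-analysis sketch: difficulty (proportion correct) and
# discrimination (upper vs. lower 27% groups) for dichotomously scored items.
# Standard classical-test-theory indices; not the study's actual procedure.
import numpy as np

def item_analysis(responses: np.ndarray):
    """responses: students x items matrix of 0/1 scores."""
    totals = responses.sum(axis=1)          # total score per student
    order = np.argsort(totals)              # students sorted low to high
    n = responses.shape[0]
    k = max(1, int(round(0.27 * n)))        # size of upper and lower groups
    lower, upper = order[:k], order[-k:]

    difficulty = responses.mean(axis=0)     # proportion correct per item
    discrimination = (responses[upper].mean(axis=0)
                      - responses[lower].mean(axis=0))
    return difficulty, discrimination

# Example with random hypothetical data for 97 students and 30 items
rng = np.random.default_rng(0)
scores = (rng.random((97, 30)) < 0.6).astype(int)
p, d = item_analysis(scores)
print(p[:5])
print(d[:5])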


2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Andreas Melzer ◽  
Ulrich Gergs ◽  
Josef Lukas ◽  
Joachim Neumann

Multiple-choice questions are widely used in clinical education. Usually, students have to mark the one and only correct answer from a set of five alternatives. Here, in a voluntary test administered at the end of an obligatory pharmacology exam, we tested a format in which more than one alternative could be correct (N = 544 students from three year groups). Moreover, the students were asked to rate each item. The students were unaware of how many correct answers each question contained. Finally, students filled out a questionnaire on the difficulty of the new tests compared to the one-out-of-five tests. In the obligatory final exam, all groups performed similarly. From the results, we conclude that the new rating scales were a better challenge and could be adapted to assess student knowledge and confidence in more depth than previous multiple-choice questions.


1994 ◽  
Vol 21 (1) ◽  
pp. 44-45 ◽  
Author(s):  
Darlene L. Neely ◽  
Frederick J. Springston ◽  
Stewart J. H. McCann

Balch (1989) reported that students scored higher on multiple-choice exams when item order was sequenced (S) than when it was random (R). To provide cross-validation and to determine whether test anxiety interacts with order formats, three experiments were conducted with a total of 253 introductory psychology students. In two experiments, students were randomly assigned S- or R-format tests. A third experiment used a within-subjects design and a two-part testing procedure with counterbalanced S and R formats. A significant Anxiety × Item Order interaction was detected in one experiment, but no evidence was found for superior performance on S-format tests in any of the three experiments. Concern with practical ramifications of Balch's findings may be unwarranted.


1986 ◽  
Vol 13 (4) ◽  
pp. 196-199 ◽  
Author(s):  
Anthony F. Nield ◽  
Maxine Gallander Wintre

Introductory Psychology students were graded on four tests using multiple-choice questions with an explicit option to explain their answers (E-option), and were later asked to compare this format with short-answer, essay, true/false, fill-in-the-blank, and regular multiple-choice formats. Students rated the E-option and short-answer formats as most preferred and as less frustrating and anxiety producing than the other formats (p < .05). Of 416 students, 173 used the E-option, averaging less than one explanation per test over the four tests. During the course, only 30 points were gained and 5 points were lost due to E-option use. The E-option seems to be an efficient and humane technique for testing large classes.


1992 ◽  
Vol 19 (3) ◽  
pp. 136-141 ◽  
Author(s):  
William R. Balch

Ninety undergraduate introductory psychology students predicted their numerical scores on a multiple-choice final exam directly before the exam was passed out (pretest prediction) and just after completing the exam (posttest prediction). Based on their all-but-final-exam point totals, students were ranked with respect to class standing and categorized as above average (top third), average (middle third), or below average (bottom third). Below average students significantly overestimated their final exam scores on both pretest (9.47%) and posttest (7.73%) predictions. Average students significantly overestimated their scores on pretest (5.33%) but not posttest (2.13%) predictions. Above average students, however, were fairly accurate for both types of prediction, slightly but not significantly underestimating (about 2%) their exam scores.

