Fairness and Ethics in Multiple Choice (MC) Scoring: An Empirical Study

2019 ◽  
Vol 5 (1) ◽  
pp. 85-97
Author(s):  
George S. Ypsilandis ◽  
Anna Mouti

One of the main concerns of language testers in designing and implementing tests is selecting the scoring method for the instrument used to perform the evaluation. This choice indirectly reveals the tester’s ethical beliefs and personal stance on testing pedagogy. This study, like others before it, challenges the typical 1-0 method of scoring in Multiple Choice Tests (MCT) and implements, for experimental purposes, a simple polychotomous partial-credit scoring system on official tests administered for the National Foreign Language Exam System in Greece (NFLES-Gr). The study supports earlier findings on the subject by the same authors in analogous smaller-scale studies. The MCT items chosen were completed by a total of 1,922 subjects at different levels of the NFLES-Gr test for Italian as an L2 in Greece. Results clearly indicate that the tested scoring procedure provides refined insights into students’ interlanguage levels, enhances sensitivity in scoring procedures, and may produce significant differences for testees close to the pass/non-pass borderline without jeopardizing test reliability.
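
The contrast between 1-0 scoring and polychotomous partial-credit scoring described in the abstract above can be illustrated with a minimal sketch. The abstract does not state the credit weights the authors used, so the weights here (1.0 for the keyed answer, 0.5 for a hypothetical "near-miss" distractor, 0 otherwise), the function names, and the sample responses are all assumptions made for illustration only.

```python
from typing import Dict, List


def score_dichotomous(responses: List[str], key: List[str]) -> float:
    """Typical 1-0 scoring: one point per item answered exactly as keyed."""
    return sum(1.0 for r, k in zip(responses, key) if r == k)


def score_partial_credit(
    responses: List[str], credit_tables: List[Dict[str, float]]
) -> float:
    """Polychotomous scoring: each item maps options to a credit value.

    Options missing from an item's table earn no credit.
    The weights are hypothetical, not taken from the study.
    """
    return sum(table.get(r, 0.0) for r, table in zip(responses, credit_tables))


# Hypothetical three-item test.
key = ["A", "B", "B"]
credit_tables = [
    {"A": 1.0, "B": 0.5},  # item 1: "B" is an assumed near-miss distractor
    {"B": 1.0, "C": 0.5},  # item 2: "C" is an assumed near-miss distractor
    {"B": 1.0, "A": 0.5},  # item 3: "A" is an assumed near-miss distractor
]
responses = ["A", "C", "B"]

dichotomous_total = score_dichotomous(responses, key)
partial_total = score_partial_credit(responses, credit_tables)
print(dichotomous_total, partial_total)
```

Under these assumed weights the same response pattern earns a higher total with partial credit than with 1-0 scoring, which is the kind of difference that could matter for a testee near a pass/non-pass cut-off.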

Author(s):  
Robert Lado

Discussions of the testing of proficiency to write a foreign language are usually limited to techniques; and without a rationale or set criteria of what is to be tested, the result is confusion. Partly as a consequence of the lack of a rationale, we are faced with a dearth of techniques in use. Essentially we find only two: objective short answer tests, which are distrusted, and composition tests, which are frustrating because of problems of scoring and the time involved. Superficial clichés are freely applied to these two techniques. Judgments are made on outward appearances (face validity) without reference to linguistic content or to empirically tested validity. On the basis of appearance, objective tests are criticized because presumably (1) they do not force the student to think, (2) they do not require that the student organize and present information, (3) they are only recognition, multiple choice tests, (4) they are considered elementary in comparison with the business of writing a free composition in the foreign language.


1966 ◽  
Vol 19 (3_suppl) ◽  
pp. 1239-1243 ◽  
Author(s):  
Donald W. Zimmerman ◽  
Richard H. Williams ◽  
Graham J. Burkheimer

An equation is derived which expresses test reliability as a function of number of item alternatives for the case in which only error due to guessing is present. This result is compared with the modified Spearman-Brown equation given by H. H. Remmers and his associates. Reliability coefficients predicted by these equations are compared with coefficients generated by a computer simulation method.
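
The simulation idea in the abstract above can be sketched roughly as follows. This is not the authors' derived equation or their simulation method, and it is not the modified Spearman-Brown equation of Remmers; it is an illustrative model under stated assumptions: each simulated examinee knows a fixed number of items, answers the remaining items by blind guessing among the alternatives, and reliability is estimated as the correlation between two parallel forms. All names and parameter values are hypothetical.

```python
import random
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def observed_score(rng: random.Random, known: int, n_items: int, n_alt: int) -> int:
    """Known items are always correct; each other item is guessed with p = 1/n_alt."""
    guessed = sum(1 for _ in range(n_items - known) if rng.random() < 1.0 / n_alt)
    return known + guessed


def simulate_reliability(
    n_alt: int, n_items: int = 40, n_examinees: int = 500, seed: int = 0
) -> float:
    """Parallel-forms reliability when guessing is the only source of error."""
    rng = random.Random(seed)
    form_a, form_b = [], []
    for _ in range(n_examinees):
        known = rng.randint(8, 32)  # assumed spread of true knowledge
        form_a.append(observed_score(rng, known, n_items, n_alt))
        form_b.append(observed_score(rng, known, n_items, n_alt))
    return pearson(form_a, form_b)


rel_two = simulate_reliability(n_alt=2)
rel_five = simulate_reliability(n_alt=5)
print(f"2 alternatives: {rel_two:.3f}, 5 alternatives: {rel_five:.3f}")
```

In this toy model, adding alternatives lowers the chance of a lucky guess, so the estimated reliability rises with the number of alternatives; the exact values depend on the assumed test length and knowledge distribution.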


Author(s):  
Muhammad Yani

The aim of this study is to determine the effectiveness of distractors on multiple choice tests in detecting students’ errors and the types of errors students made in solving math problems. This research is a descriptive qualitative study whose subjects were the 36 students of class IX at MTsN Model Banda Aceh; three students were then selected to be interviewed about their mastery and the mistakes they made when completing math problems. Data were collected through tests and interviews, and the credibility of the data was checked by means of triangulation. Data analysis consisted of the stages of data reduction, data presentation, and conclusion drawing. The results showed that: 1) Well-constructed distractors on multiple choice tests, completed like description (essay-type) questions for each answer option chosen by the students, were very effective for detecting students’ errors in solving mathematics problems: 17.1% of students incorrectly applied mathematical concepts, 3.7% of students made mistakes due to a lack of accuracy, 1.2% of students misunderstood the problem, and 4.7% of students did not work out a solution but only guessed, across the ten questions given. 2) Conceptual errors made by the three subjects (MAH, BSR, and PES) were errors in analyzing the combined area of space and errors in algebraic factorization. Errors in understanding and applying the difference formula were made only by subject MAH, errors in determining the square root of a number were made only by subject BSR, and errors in squaring a number were made only by subject PES. Procedural errors due to a lack of thoroughness or an error in substituting a value were made by subjects MAH and PES, errors due to the inability to manipulate steps to solve a problem were made only by subject MAH, and errors in writing the final results were made only by subject BSR.


Seminar.net ◽  
2010 ◽  
Vol 6 (3) ◽  
Author(s):  
Bjørn Klefstad ◽  
Geir Maribu ◽  
Svend Andreas Horgen ◽  
Thorleif Hjeltnes

The use of digital multiple-choice tests in formative and summative assessment has many advantages. Such tests are effective, objective, and flexible. However, it is still challenging to create tests that are valid and reliable. Bloom’s taxonomy is used as a framework for assessment in higher education and therefore has a great deal of influence on how the learning outcomes are formulated. Using digital tools to create tests has been common for some time, yet the tests are still mostly answered on paper. Our hypothesis has two parts: first, it is possible to create summative tests that match different levels and learning outcomes within a chosen subject; second, a test tool of some kind is necessary to enable teachers and examiners to take a more proactive attitude towards different levels and learning outcomes in a subject and so ensure the quality of digital test design. Based on an analysis of several digital tests, we examine to what degree learning outcomes and levels are reflected in the different test questions. We also suggest functionality for a future test tool to support an improved design process.


2010 ◽  
Vol 1 (4) ◽  
pp. 32-41 ◽  
Author(s):  
E. Serradell-Lopez ◽  
P. Lara ◽  
D. Castillo ◽  
I. González

The purpose of this paper is to determine the effectiveness of using multiple choice tests in subjects related to business administration and management. To this end the authors used a multiple-choice test with specific questions to verify the extent of knowledge gained and the students’ confidence and trust in their answers. The analysis, based on tests given to a group of 200 students, was carried out in one subject related to investment analysis and measured the level of knowledge gained and the degree of trust and security in the responses at two different points in the business administration and management course. The measurements took into account the different levels of difficulty of the questions asked and the time spent by students to complete the test. Results confirm that students are generally able to gain more knowledge over the course and increase their degree of trust and confidence. It is estimated that improvement in the skills learned is viewed favourably by businesses and is important for job placement. Finally, the authors analyze a multiple-choice test using a combination of knowledge and confidence levels.


Author(s):  
Septiwi Hadi Lubis ◽  
Elia Masa Gintings

This study deals with improving students’ reading comprehension by using the Index Card Match strategy. The study was conducted as classroom action research. The subject of the research was class VIIa of MTs. Lab. IKIP Al-Washliyah Medan, which consisted of 36 students. The research was conducted in two cycles, consisting of four meetings and two meetings. The instruments for collecting data were multiple choice tests for the quantitative data and observation sheets, questionnaire sheets and diary notes for the qualitative data. Based on the multiple choice test scores, students’ scores kept improving in every test. In test I, the mean score was 63.05; in test II, the mean score was 67.63; and in test III, the mean score was 72.77. The findings of the study indicated that the implementation of the Index Card Match strategy was successful, since the criteria of success were achieved. The results of the observation sheets and questionnaire sheets showed that students were actively involved in the classroom. Based on these findings, it is concluded that the Index Card Match strategy could improve students’ reading comprehension.


1968 ◽  
Author(s):  
J. Brown Grier ◽  
Raymond Ditrichs
