Determining the Length of a Criterion-Referenced Test

1980 ◽ Vol 4 (4) ◽ pp. 425-446
Author(s): Rand R. Wilcox

1984 ◽ Vol 9 (3) ◽ pp. 237-251
Author(s): Karl Josef Klauer

This article presents two criterion-referenced grading models developed in Germany and shows how they can be applied in the classroom. The models are a binomial grading model (Lindner, 1980) and an arcsine grading model (Klauer, 1982). In both cases, grades can be assigned on the basis of a criterion-referenced test with systematic control of the possible misclassification errors.
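As a rough illustration of the binomial approach (a sketch of the general idea, not Lindner's published procedure; the item count, cutoff, and true proportion-correct below are hypothetical), the misclassification risk of a pass/fail rule can be read directly off the binomial CDF:

```python
from scipy.stats import binom

def misclassification_risk(n_items: int, cutoff: int, true_proportion: float) -> float:
    """P(observed score < cutoff) for an examinee whose true
    proportion-correct is `true_proportion`, under the binomial model.
    For a true master (true_proportion at or above the mastery standard)
    this is the false-negative risk of the grading rule."""
    return binom.cdf(cutoff - 1, n_items, true_proportion)

# Hypothetical example: 40-item test, pass if >= 30 items correct,
# examinee whose true proportion-correct is exactly 0.80.
risk = misclassification_risk(n_items=40, cutoff=30, true_proportion=0.80)
print(f"False-negative risk: {risk:.3f}")
```

Klauer's arcsine model pursues the same goal of error control, but works with an arcsine variance-stabilizing transformation of the proportion-correct score.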


1977 ◽ Vol 21 (4) ◽ pp. 353-357
Author(s): Gerald J. Laabs ◽ Robert C. Panell

A criterion-referenced test keyed to an individualized, self-paced instruction program was developed as part of a diagnostic testing/shipboard training system. The job-based test described hypothetical situations grounded in known job requirements. For each situation, questions required the demonstration of skills and knowledge that supported the job and were covered in the various modules of the instruction program. High face and content validity were ensured by using cards, charts, diagrams, and illustrations to present each job situation, and by having job experts write the test items. The items selected were those that best discriminated between graduates and nongraduates of a shore-based program using the same instruction. The cutoff score for the set of items pertaining to each module was determined from the performance of graduates of the shore-based program and applied to a cross-validation sample. Test scores from two administrations were used to estimate the reliability of the diagnostic decision making.
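One way to picture a reliability estimate for diagnostic decisions is the agreement between pass/fail classifications across two administrations. The sketch below uses hypothetical data and the standard proportion-agreement and Cohen's kappa formulas; the abstract does not specify which statistic the authors used:

```python
import numpy as np

def decision_consistency(pass1: np.ndarray, pass2: np.ndarray) -> tuple[float, float]:
    """Proportion agreement p0 and Cohen's kappa for pass/fail
    decisions from two test administrations (boolean arrays)."""
    p0 = float(np.mean(pass1 == pass2))
    p_pass1, p_pass2 = pass1.mean(), pass2.mean()
    # Chance agreement: both pass, or both fail, by independent chance.
    pc = p_pass1 * p_pass2 + (1 - p_pass1) * (1 - p_pass2)
    kappa = (p0 - pc) / (1 - pc)
    return p0, float(kappa)

# Hypothetical decisions for 8 examinees on two administrations.
a = np.array([1, 1, 0, 1, 0, 1, 1, 0], dtype=bool)
b = np.array([1, 1, 0, 0, 0, 1, 1, 0], dtype=bool)
print(decision_consistency(a, b))  # (0.875, 0.75)
```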


1981 ◽ Vol 51 (3) ◽ pp. 379-402
Author(s): Wim J. van der Linden

Since Cox and Vargas (1966) introduced their pretest-posttest validity index for criterion-referenced test items, a great number of additions and modifications have followed. All are based on the idea of gain scoring; that is, they are computed from the differences between proportions of pretest and posttest item responses. Although the method is simple and generally considered the prototype of criterion-referenced item analysis, it has many serious disadvantages. Some of these stem from the fact that it yields indices based on a dual test administration and on population-dependent item p values. Others have to do with the only global information about discriminating power that these indices provide, the implicit weighting they presuppose, and the meaningless maximization of posttest scores they lead to. Analyzing the pretest-posttest method from a latent trait point of view, it is proposed to replace indices like Cox and Vargas' Dpp with an evaluation of the item information function at the mastery score. An empirical study was conducted to compare the differences in item selection between the two methods.
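Van der Linden's alternative, ranking items by the value of their information function at the mastery score, can be sketched under a 2PL model; the parameterization and the item parameters below are illustrative assumptions, not values from the article:

```python
import math

def two_pl_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta:
    I(theta) = a^2 * P(theta) * (1 - P(theta))."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# Hypothetical item pool: (discrimination a, difficulty b).
pool = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.3), (1.0, 1.0)]
theta_mastery = 0.25  # hypothetical mastery score on the latent scale

# Select items by information at the mastery score, highest first.
ranked = sorted(pool, key=lambda ab: -two_pl_information(theta_mastery, *ab))
for a, b in ranked:
    info = two_pl_information(theta_mastery, a, b)
    print(f"a={a:.1f}, b={b:.1f}: I(theta_m)={info:.3f}")
```

Unlike pretest-posttest difference indices, this criterion is computed from population-invariant item parameters and targets discrimination exactly where the mastery decision is made.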


Author(s): Numan S. Al-Musawi

The purpose of the study was to develop a criterion-referenced test to measure students' achievement in educational evaluation using item response theory. To achieve this goal, the author constructed a 3-option multiple-choice achievement test of 48 items, which was administered to 348 students enrolled at the University of Bahrain. The findings of the study revealed that the students' responses to 31 items fit the Rasch model assumptions, while 17 items did not fit the model. All items of the final version of the test, however, fell within the acceptable range of the model's infit and outfit indicators. The reliability estimates for persons and items were .87 and .93, respectively, indicating high test reliability, and the maximum information from the three-option test was obtained at average ability levels. Based on these results, the author recommends using the developed test as a reliable measure of university students' achievement in the subject of educational evaluation.
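The infit and outfit indicators referred to above are conventionally computed as mean squares of standardized Rasch residuals. The sketch below shows these standard formulas on hypothetical data; it is not the analysis from the study:

```python
import numpy as np

def rasch_fit_statistics(responses, theta, beta):
    """Infit (information-weighted) and outfit (unweighted) mean squares
    per item for a Rasch model.  responses: persons x items 0/1 matrix;
    theta: person abilities; beta: item difficulties."""
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - beta[None, :])))  # expected scores
    w = p * (1.0 - p)                      # model variance of each response
    z2 = (responses - p) ** 2 / w          # squared standardized residuals
    outfit = z2.mean(axis=0)               # plain mean of z^2 over persons
    infit = ((responses - p) ** 2).sum(axis=0) / w.sum(axis=0)  # variance-weighted
    return infit, outfit

# Hypothetical data: 5 persons, 4 items, responses simulated from the model.
rng = np.random.default_rng(0)
theta = rng.normal(size=5)
beta = np.array([-1.0, -0.3, 0.4, 1.1])
p_true = 1.0 / (1.0 + np.exp(-(theta[:, None] - beta[None, :])))
x = (rng.random((5, 4)) < p_true).astype(float)
print(rasch_fit_statistics(x, theta, beta))
```

Values near 1.0 indicate that an item's residual variation matches the Rasch model's expectation, which is the criterion the abstract invokes.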

