Determining the Length of a Criterion-Referenced Test

1980 ◽ Vol 4 (4) ◽ pp. 425-446
Author(s): Rand R. Wilcox

1984 ◽ Vol 9 (3) ◽ pp. 237-251
Author(s): Karl Josef Klauer

This article presents two criterion-referenced grading models developed in Germany and shows how they can be applied in the classroom. The models are a binomial grading model (Lindner, 1980) and an arcsine grading model (Klauer, 1982). In both cases, grades can be assigned on the basis of a criterion-referenced test with systematic control of the possible misclassification errors.
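As a rough illustration of the binomial approach (a sketch of the general idea, not Lindner's published procedure; the item count, cutoff, and true proportion-correct below are hypothetical), the misclassification risk of a pass/fail rule can be read directly off the binomial CDF:

```python
from scipy.stats import binom

def misclassification_risk(n_items: int, cutoff: int, true_proportion: float) -> float:
    """P(observed score < cutoff) for an examinee whose true
    proportion-correct is `true_proportion`, under the binomial model.
    For a true master (true_proportion at or above the mastery standard)
    this is the false-negative risk of the grading rule."""
    return binom.cdf(cutoff - 1, n_items, true_proportion)

# Hypothetical example: 40-item test, pass if >= 30 items correct,
# examinee whose true proportion-correct is exactly 0.80.
risk = misclassification_risk(n_items=40, cutoff=30, true_proportion=0.80)
print(f"False-negative risk: {risk:.3f}")
```

Klauer's arcsine model pursues the same goal of error control, but works with an arcsine variance-stabilizing transformation of the proportion-correct score.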


1977 ◽ Vol 21 (4) ◽ pp. 353-357
Author(s): Gerald J. Laabs ◽ Robert C. Panell

A criterion-referenced test keyed to an individualized, self-paced instruction program was developed as part of a diagnostic testing/shipboard training system. The job-based test described hypothetical situations grounded in known job requirements. For each situation, questions required the demonstration of skills and knowledge that supported the job and were covered in the various modules of the instruction program. High face and content validity were ensured by using cards, charts, diagrams, and illustrations to present each job situation, and by having job experts write the test items. The items selected were those that best discriminated between graduates and nongraduates of a shore-based program using the same instruction. The cutoff score for the set of items pertaining to each module was determined from the performance of graduates of the shore-based program and applied to a cross-validation sample. Test scores from two administrations were used to estimate the reliability of the diagnostic decision making.
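One way to picture a reliability estimate for diagnostic decisions is the agreement between pass/fail classifications across two administrations. The sketch below uses hypothetical data and the standard proportion-agreement and Cohen's kappa formulas; the abstract does not specify which statistic the authors used:

```python
import numpy as np

def decision_consistency(pass1: np.ndarray, pass2: np.ndarray) -> tuple[float, float]:
    """Proportion agreement p0 and Cohen's kappa for pass/fail
    decisions from two test administrations (boolean arrays)."""
    p0 = float(np.mean(pass1 == pass2))
    p_pass1, p_pass2 = pass1.mean(), pass2.mean()
    # Chance agreement: both pass, or both fail, by independent chance.
    pc = p_pass1 * p_pass2 + (1 - p_pass1) * (1 - p_pass2)
    kappa = (p0 - pc) / (1 - pc)
    return p0, float(kappa)

# Hypothetical decisions for 8 examinees on two administrations.
a = np.array([1, 1, 0, 1, 0, 1, 1, 0], dtype=bool)
b = np.array([1, 1, 0, 0, 0, 1, 1, 0], dtype=bool)
print(decision_consistency(a, b))  # (0.875, 0.75)
```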


1981 ◽ Vol 51 (3) ◽ pp. 379-402
Author(s): Wim J. van der Linden

Since Cox and Vargas (1966) introduced their pretest-posttest validity index for criterion-referenced test items, a great number of additions and modifications have followed. All are based on the idea of gain scoring; that is, they are computed from the differences between proportions of pretest and posttest item responses. Although the method is simple and generally considered the prototype of criterion-referenced item analysis, it has many serious disadvantages. Some of these stem from the fact that it yields indices based on a dual test administration and on population-dependent item p values. Others have to do with the only global information about discriminating power that these indices provide, the implicit weighting they presuppose, and the meaningless maximization of posttest scores they lead to. Analyzing the pretest-posttest method from a latent trait point of view, it is proposed to replace indices like Cox and Vargas' Dpp with an evaluation of the item information function at the mastery score. An empirical study was conducted to compare the differences in item selection between the two methods.
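Van der Linden's alternative, ranking items by the value of their information function at the mastery score, can be sketched under a 2PL model; the parameterization and the item parameters below are illustrative assumptions, not values from the article:

```python
import math

def two_pl_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta:
    I(theta) = a^2 * P(theta) * (1 - P(theta))."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# Hypothetical item pool: (discrimination a, difficulty b).
pool = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.3), (1.0, 1.0)]
theta_mastery = 0.25  # hypothetical mastery score on the latent scale

# Select items by information at the mastery score, highest first.
ranked = sorted(pool, key=lambda ab: -two_pl_information(theta_mastery, *ab))
for a, b in ranked:
    info = two_pl_information(theta_mastery, a, b)
    print(f"a={a:.1f}, b={b:.1f}: I(theta_m)={info:.3f}")
```

Unlike pretest-posttest difference indices, this criterion is computed from population-invariant item parameters and targets discrimination exactly where the mastery decision is made.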


Author(s): Numan S. Al-Musawi

The purpose of the study was to develop a criterion-referenced test to measure students' achievement in educational evaluation using item response theory. To achieve this goal, the author constructed a 3-option multiple-choice achievement test of 48 items, which was administered to 348 students enrolled at the University of Bahrain. The findings of the study revealed that the students' responses to 31 items fit the Rasch model assumptions, while 17 items did not fit the model. All items of the final version of the test, however, fell within the acceptable range of the model's infit and outfit indicators. The reliability estimates for persons and items were .87 and .93, respectively, indicating high test reliability, and the maximum information from the three-option test was obtained at average ability levels. Based on these results, the author recommends using the developed test as a reliable measure of university students' achievement in the subject of educational evaluation.
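The infit and outfit indicators referred to above are conventionally computed as mean squares of standardized Rasch residuals. The sketch below shows these standard formulas on hypothetical data; it is not the analysis from the study:

```python
import numpy as np

def rasch_fit_statistics(responses, theta, beta):
    """Infit (information-weighted) and outfit (unweighted) mean squares
    per item for a Rasch model.  responses: persons x items 0/1 matrix;
    theta: person abilities; beta: item difficulties."""
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - beta[None, :])))  # expected scores
    w = p * (1.0 - p)                      # model variance of each response
    z2 = (responses - p) ** 2 / w          # squared standardized residuals
    outfit = z2.mean(axis=0)               # plain mean of z^2 over persons
    infit = ((responses - p) ** 2).sum(axis=0) / w.sum(axis=0)  # variance-weighted
    return infit, outfit

# Hypothetical data: 5 persons, 4 items, responses simulated from the model.
rng = np.random.default_rng(0)
theta = rng.normal(size=5)
beta = np.array([-1.0, -0.3, 0.4, 1.1])
p_true = 1.0 / (1.0 + np.exp(-(theta[:, None] - beta[None, :])))
x = (rng.random((5, 4)) < p_true).astype(float)
print(rasch_fit_statistics(x, theta, beta))
```

Values near 1.0 indicate that an item's residual variation matches the Rasch model's expectation, which is the criterion the abstract invokes.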

