The Effect of Item Pools of Different Strengths on the Test Results of Computerized-Adaptive Testing

Ninth Grade ◽

Adaptive Testing ◽

Test Results ◽

Basic Principles ◽

Education Process

Abstract. Testing is important for diagnostics and evaluation. It serves as feedback for both teachers and those being tested. The more a teacher learns from the test results, the better their chance to correct, clarify, or modify the test itself, i.e., to change their instruction or the education process. The more a student learns from the test, the better their chance to learn and master the material thoroughly and to clarify problematic parts of a particular curriculum. Moreover, their motivation to study further grows as they deal with more demanding tasks. Adaptive testing carried out in a suitable LMS offers these possibilities. This paper introduces the basic principles and rules of computerized adaptive testing. It also reports on the process and results of computerized adaptive testing carried out experimentally on a sample of 53 ninth-grade pupils at the Porubská elementary school in Ostrava.

Optimal Stratification of Item Pools in α-Stratified Computerized Adaptive Testing

Applied Psychological Measurement ◽

10.1177/0146621603027004002 ◽

2003 ◽

Vol 27 (4) ◽

pp. 262-274 ◽

Cited By ~ 18

Author(s):

Hua-Hua Chang ◽

Wim J. van der Linden

Keyword(s):

Adaptive Testing ◽

Assembling a Computerized Adaptive Testing Item Pool as a Set of Linear Tests

Journal of Educational and Behavioral Statistics ◽

10.3102/10769986031001081 ◽

2006 ◽

Vol 31 (1) ◽

pp. 81-99 ◽

Cited By ~ 20

Author(s):

Wim J. van der Linden ◽

Adelaide Ariel ◽

Bernard P. Veldkamp

Keyword(s):

Mean Squared Error ◽

Programming Model ◽

Adaptive Testing ◽

Law School ◽

Mixed Integer ◽

Item Exposure ◽

Item Writing ◽

Optimal Information ◽

Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items that have less than optimal information, violate the content constraints, and/or have unfavorable exposure rates. Although at first sight somewhat counterintuitive, it is shown that if the CAT pool is assembled as a set of linear test forms, undesirable correlations can be broken down effectively. It is proposed to assemble such pools using a mixed integer programming model with constraints that guarantee that each test meets all content specifications and an objective function that requires them to have maximal information at a well-chosen set of ability values. An empirical example with a previous master pool from the Law School Admission Test (LSAT) yielded a CAT with nearly uniform bias and mean-squared error functions for the ability estimator and item-exposure rates that satisfied the target for all items in the pool.
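To make the assembly idea concrete, here is a minimal sketch of picking one linear test form from a toy pool so that it meets content constraints and has high information at chosen ability values. It is not the authors' model: exhaustive search stands in for a mixed integer programming solver, the maximin objective is an assumption, and the pool, the 2PL information function, and the names `POOL`, `item_information`, and `assemble_form` are all hypothetical.

```python
import math
from itertools import combinations

# Hypothetical toy pool: (item id, content area, discrimination a, difficulty b).
POOL = [
    ("i1", "logic", 1.2, -0.5),
    ("i2", "logic", 0.8, 0.0),
    ("i3", "logic", 1.5, 0.4),
    ("i4", "reading", 1.0, -1.0),
    ("i5", "reading", 1.4, 0.2),
    ("i6", "reading", 0.6, 1.0),
]


def item_information(a, b, theta):
    """Fisher information of a 2PL item at ability theta (assumed model)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def assemble_form(pool, length, per_area, thetas):
    """Brute-force stand-in for a MIP solver: best form meeting the content counts."""
    best_form, best_value = None, -math.inf
    for form in combinations(pool, length):
        # Content constraint: exact number of items per content area.
        counts = {}
        for _, area, _, _ in form:
            counts[area] = counts.get(area, 0) + 1
        if counts != per_area:
            continue
        # Assumed maximin objective: the form's lowest information over the thetas.
        value = min(
            sum(item_information(a, b, t) for _, _, a, b in form) for t in thetas
        )
        if value > best_value:
            best_form, best_value = form, value
    return best_form, best_value


form, value = assemble_form(POOL, 4, {"logic": 2, "reading": 2}, [-1.0, 0.0, 1.0])
print([item_id for item_id, _, _, _ in form], round(value, 3))
```

Exhaustive search is only feasible for tiny pools. A realistic pool would need an actual integer programming solver, as the abstract proposes.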

Designing Item Pools for Computerized Adaptive Testing

Computerized Adaptive Testing: Theory and Practice ◽

10.1007/0-306-47531-6_8 ◽

2000 ◽

pp. 149-162 ◽

Cited By ~ 13

Author(s):

Bernard P. Veldkamp ◽

Wim J. van der Linden

Keyword(s):

Adaptive Testing ◽

Item-presentation controls for multidimensional item pools in computerized adaptive testing

Behavior Research Methods, Instruments, & Computers ◽

10.3758/bf03203154 ◽

1990 ◽

Vol 22 (2) ◽

pp. 247-252

Author(s):

Thomas J. Thomas

Keyword(s):

Adaptive Testing ◽

Item Presentation ◽

Benefits from Computerized Adaptive Testing as Seen in Simulation Studies

European Journal of Psychological Assessment ◽

10.1027//1015-5759.15.2.91 ◽

1999 ◽

Vol 15 (2) ◽

pp. 91-98 ◽

Cited By ~ 10

Author(s):

Lutz F. Hornke

Keyword(s):

Measurement Error ◽

Test Procedure ◽

Adaptive Testing ◽

Parameter Estimates ◽

Simulation Studies ◽

Computerized Adaptive Test ◽

Item Banks ◽

Item Parameters ◽

General Reliability

Summary: Item parameters for several hundred items were estimated based on empirical data from several thousand subjects. The logistic one-parameter (1PL) and two-parameter (2PL) model estimates were evaluated. However, model fit showed that only a subset of items complied sufficiently, so the items that remained were assembled into well-fitting item banks. In several simulation studies, 5000 simulated responses were generated in accordance with a computerized adaptive test procedure along with person parameters. A general reliability of .80 or a standard error of measurement of .44 was used as a stopping rule to end CAT testing. We also recorded how often each item was used by all simulees. Person-parameter estimates based on CAT correlated higher than .90 with the simulated true values. For all 1PL-fitting item banks, most simulees used more than 20 but fewer than 30 items to reach the pre-set level of measurement error. However, testing based on item banks that conformed to the 2PL revealed that, on average, only 10 items were sufficient to end testing at the same measurement error level. Both results clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday use will show whether these trends hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.
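The stopping rule described in this summary can be illustrated with a small simulation. The sketch below is not the study's procedure: it assumes a 1PL (Rasch) model, a standard normal prior, grid-based EAP estimation, and selection of the unused item whose difficulty is closest to the current estimate. The names `rasch_probability`, `eap_estimate`, and `simulate_cat`, and the evenly spaced 150-item pool, are hypothetical. Only the .44 threshold comes from the summary; on a unit-variance ability scale, a reliability of .80 corresponds to a standard error of about sqrt(1 - .80) ≈ .447.

```python
import math
import random

# Ability grid from -4 to 4 in steps of 0.1, used for numerical integration.
GRID = [g / 10.0 for g in range(-40, 41)]


def rasch_probability(theta, b):
    """Probability of a correct response under the 1PL (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def eap_estimate(responses):
    """Posterior mean and SD of ability on GRID under a standard normal prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) prior
        for b, correct in responses:
            p = rasch_probability(t, b)
            w *= p if correct else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def simulate_cat(true_theta, difficulties, se_target, rng):
    """Administer items until the posterior SD drops to se_target or the pool runs out."""
    remaining = list(difficulties)
    responses = []
    estimate, se = 0.0, 1.0  # start from the prior mean and prior SD
    while remaining and se > se_target:
        # Under the 1PL, the most informative item has difficulty nearest the estimate.
        b = min(remaining, key=lambda d: abs(d - estimate))
        remaining.remove(b)
        correct = rng.random() < rasch_probability(true_theta, b)
        responses.append((b, correct))
        estimate, se = eap_estimate(responses)
    return estimate, se, len(responses)


pool = [-3.0 + 6.0 * i / 149 for i in range(150)]  # hypothetical 150-item bank
estimate, se, n_items = simulate_cat(0.5, pool, 0.44, random.Random(0))
print(round(estimate, 2), round(se, 2), n_items)
```

Because a Rasch item contributes at most 0.25 units of information, a test under these assumptions cannot reach the .44 threshold with only a handful of items, which fits the summary's report of longer tests for the 1PL banks.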

Methods for Restricting Maximum Exposure Rate in Computerized Adaptative Testing

Methodology ◽

10.1027/1614-2241.3.1.14 ◽

2007 ◽

Vol 3 (1) ◽

pp. 14-23 ◽

Cited By ~ 9

Author(s):

Juan Ramon Barrada ◽

Julio Olea ◽

Vicente Ponsoda

Keyword(s):

Measurement Accuracy ◽

Computation Time ◽

Adaptive Testing ◽

Exposure Rate ◽

Control Parameters ◽

The Impact ◽

Two Alternatives ◽

Selection Of ◽

Maximum Exposure

Abstract. The Sympson-Hetter (1985) method provides a means of controlling the maximum exposure rate of items in Computerized Adaptive Testing. Through a series of simulations, control parameters are set that give the probability that an item is administered once it has been selected. This method presents two main problems: it requires a long computation time for calculating the parameters, and the maximum exposure rate is slightly above the fixed limit. Van der Linden (2003) presented two alternatives which appear to solve both of the problems. The impact of these methods on measurement accuracy has not been tested yet. We show how these methods over-restrict the exposure of some highly discriminating items and, thus, decrease accuracy. It is also shown that, when the desired maximum exposure rate is near the minimum possible value, these methods yield an empirical maximum exposure rate clearly above the goal. A new method, based on the initial estimation of the probability of administration and the probability of selection of the items with the restricted method (Revuelta & Ponsoda, 1998), is presented in this paper. It can be used with the Sympson-Hetter method and with van der Linden's two methods. This option, when used with Sympson-Hetter, speeds the convergence of the control parameters without decreasing the accuracy.
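The role of the control parameters can be sketched in a few lines. The code below is not taken from the paper. It assumes the commonly described Sympson-Hetter adjustment, in which an item whose simulated selection rate exceeds the target `r_max` gets the control parameter `r_max / rate` and every other item gets 1. The function names and the fallback when every candidate is rejected are hypothetical.

```python
import random


def update_control_parameters(selection_rates, r_max):
    """One adjustment step: cap items that were selected more often than r_max."""
    return {
        item: (r_max / rate if rate > r_max else 1.0)
        for item, rate in selection_rates.items()
    }


def administer_first_passing(ranked_items, control, rng):
    """Walk down the ranked candidates; a selected item is administered
    with probability control[item], otherwise the next candidate is tried."""
    for item in ranked_items:
        if rng.random() < control[item]:
            return item
    # Assumed fallback so that the examinee always receives an item.
    return ranked_items[-1]


# Hypothetical selection rates from one simulation round, target rate 0.25.
control = update_control_parameters({"a": 0.5, "b": 0.1}, r_max=0.25)
print(control)  # {'a': 0.5, 'b': 1.0}
print(administer_first_passing(["a", "b"], control, random.Random(0)))
```

In the full procedure this update is repeated over many simulation rounds until the parameters stabilize, which is the source of the long computation time the abstract mentions.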