CATBOOK Computerized Adaptive Testing: From Inquiry to Operation

Author(s):  
W. A. Sands ◽  
Brian K. Waters ◽  
James R. McBride
1999 ◽  
Vol 15 (2) ◽  
pp. 91-98 ◽  
Author(s):  
Lutz F. Hornke

Summary: Item parameters for several hundreds of items were estimated based on empirical data from several thousands of subjects. The logistic one-parameter (1PL) and two-parameter (2PL) model estimates were evaluated. However, model fit showed that only a subset of items complied sufficiently, so that the remaining ones were assembled in well-fitting item banks. In several simulation studies 5000 simulated responses were generated in accordance with a computerized adaptive test procedure along with person parameters. A general reliability of .80 or a standard error of measurement of .44 was used as a stopping rule to end CAT testing. We also recorded how often each item was used by all simulees. Person-parameter estimates based on CAT correlated higher than .90 with true values simulated. For all 1PL fitting item banks most simulees used more than 20 items but less than 30 items to reach the pre-set level of measurement error. However, testing based on item banks that complied to the 2PL revealed that, on average, only 10 items were sufficient to end testing at the same measurement error level. Both clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday uses will show whether these trends will hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.


Methodology ◽  
2007 ◽  
Vol 3 (1) ◽  
pp. 14-23 ◽  
Author(s):  
Juan Ramon Barrada ◽  
Julio Olea ◽  
Vicente Ponsoda

Abstract. The Sympson-Hetter (1985) method provides a means of controlling maximum exposure rate of items in Computerized Adaptive Testing. Through a series of simulations, control parameters are set that mark the probability of administration of an item on being selected. This method presents two main problems: it requires a long computation time for calculating the parameters and the maximum exposure rate is slightly above the fixed limit. Van der Linden (2003) presented two alternatives which appear to solve both of the problems. The impact of these methods in the measurement accuracy has not been tested yet. We show how these methods over-restrict the exposure of some highly discriminating items and, thus, the accuracy is decreased. It also shown that, when the desired maximum exposure rate is near the minimum possible value, these methods offer an empirical maximum exposure rate clearly above the goal. A new method, based on the initial estimation of the probability of administration and the probability of selection of the items with the restricted method ( Revuelta & Ponsoda, 1998 ), is presented in this paper. It can be used with the Sympson-Hetter method and with the two van der Linden's methods. This option, when used with Sympson-Hetter, speeds the convergence of the control parameters without decreasing the accuracy.


2021 ◽  
Author(s):  
Bryant A Seamon ◽  
Steven A Kautz ◽  
Craig A Velozo

Abstract Objective Administrative burden often prevents clinical assessment of balance confidence in people with stroke. A computerized adaptive test (CAT) version of the Activities-specific Balance Confidence Scale (ABC CAT) can dramatically reduce this burden. The objective of this study was to test balance confidence measurement precision and efficiency in people with stroke with an ABC CAT. Methods We conducted a retrospective cross-sectional simulation study with data from 406 adults approximately 2-months post-stroke in the Locomotor-Experience Applied Post-Stroke (LEAPS) trial. Item parameters for CAT calibration were estimated with the Rasch model using a random sample of participants (n = 203). Computer simulation was used with response data from remaining 203 participants to evaluate the ABC CAT algorithm under varying stopping criteria. We compared estimated levels of balance confidence from each simulation to actual levels predicted from the Rasch model (Pearson correlations and mean standard error (SE)). Results Results from simulations with number of items as a stopping criterion strongly correlated with actual ABC scores (full item, r = 1, 12-item, r = 0.994; 8-item, r = 0.98; 4-item, r = 0.929). Mean SE increased with decreasing number of items administered (full item, SE = 0.31; 12-item, SE = 0.33; 8-item, SE = 0.38; 4-item, SE = 0.49). A precision-based stopping rule (mean SE = 0.5) also strongly correlated with actual ABC scores (r = .941) and optimized the relationship between number of items administrated with precision (mean number of items 4.37, range [4–9]). Conclusions An ABC CAT can determine accurate and precise measures of balance confidence in people with stroke with as few as 4 items. Individuals with lower balance confidence may require a greater number of items (up to 9) and attributed to the LEAPS trial excluding more functionally impaired persons. Impact Statement Computerized adaptive testing can drastically reduce the ABC’s test administration time while maintaining accuracy and precision. This should greatly enhance clinical utility, facilitating adoption of clinical practice guidelines in stroke rehabilitation. Lay Summary If you have had a stroke, your physical therapist will likely test your balance confidence. A computerized adaptive test version of the ABC scale can accurately identify balance with as few as 4 questions, which takes much less time.


Sign in / Sign up

Export Citation Format

Share Document