Methods for Restricting Maximum Exposure Rate in Computerized Adaptative Testing

Methodology ◽  
2007 ◽  
Vol 3 (1) ◽  
pp. 14-23 ◽  
Author(s):  
Juan Ramon Barrada ◽  
Julio Olea ◽  
Vicente Ponsoda

Abstract. The Sympson-Hetter (1985) method provides a means of controlling the maximum exposure rate of items in Computerized Adaptive Testing. Through a series of simulations, control parameters are set that give the probability that an item is administered once it has been selected. This method presents two main problems: it requires a long computation time for calculating the parameters, and the maximum exposure rate is slightly above the fixed limit. Van der Linden (2003) presented two alternatives which appear to solve both problems. The impact of these methods on measurement accuracy has not yet been tested. We show how these methods over-restrict the exposure of some highly discriminating items and, thus, decrease accuracy. It is also shown that, when the desired maximum exposure rate is near the minimum possible value, these methods yield an empirical maximum exposure rate clearly above the goal. A new method, based on the initial estimation of the probability of administration and the probability of selection of the items with the restricted method (Revuelta & Ponsoda, 1998), is presented in this paper. It can be used with the Sympson-Hetter method and with van der Linden's two methods. This option, when used with Sympson-Hetter, speeds the convergence of the control parameters without decreasing accuracy.
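The iterative setting of control parameters described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not in the abstract: a 2PL item bank, items ranked by Fisher information at each simulee's true ability (a real CAT would use an interim estimate), and the common Sympson-Hetter update k_i = r_max / P(S_i) whenever the selection rate P(S_i) exceeds r_max. The names `simulate_once` and `sympson_hetter` are hypothetical.

```python
import numpy as np

def item_information(theta, a, b):
    """Fisher information of 2PL items at ability theta (assumed model)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

def simulate_once(thetas, a, b, k, test_length, rng):
    """One simulation pass; returns per-item selection and administration rates."""
    selected = np.zeros(a.size)
    administered = np.zeros(a.size)
    for theta in thetas:
        # Simplification: rank items by information at the true theta.
        order = np.argsort(-item_information(theta, a, b))
        given = 0
        for i in order:
            selected[i] += 1
            # Probability experiment: a selected item is administered with prob. k[i];
            # a rejected item is not offered again to this simulee.
            if rng.random() <= k[i]:
                administered[i] += 1
                given += 1
                if given == test_length:
                    break
    return selected / thetas.size, administered / thetas.size

def sympson_hetter(a, b, r_max, test_length, n_simulees=2000, n_iter=20, seed=0):
    """Iteratively adjust control parameters k so exposure approaches r_max."""
    rng = np.random.default_rng(seed)
    k = np.ones(a.size)
    for _ in range(n_iter):
        thetas = rng.standard_normal(n_simulees)
        p_sel, _ = simulate_once(thetas, a, b, k, test_length, rng)
        # Restrict only items selected more often than the target rate.
        k = np.where(p_sel > r_max, r_max / np.maximum(p_sel, 1e-12), 1.0)
        # Keep at least test_length items unrestricted so every test can finish.
        k[np.argsort(-k)[:test_length]] = 1.0
    # Final pass to measure exposure rates under the returned parameters.
    thetas = rng.standard_normal(n_simulees)
    _, p_adm = simulate_once(thetas, a, b, k, test_length, rng)
    return k, p_adm
```

Calling `sympson_hetter(a, b, r_max=0.3, test_length=10)` on arrays of discrimination and difficulty parameters returns the control parameters and the exposure rates observed in a final pass. Each iteration is a full simulation, which is consistent with the long computation time the abstract mentions.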

2008 ◽  
Vol 11 (2) ◽  
pp. 618-625 ◽  
Author(s):  
Juan Ramón Barrada ◽  
Julio Olea ◽  
Francisco José Abad

If examinees were to know, beforehand, part of the content of a computerized adaptive test, their estimated trait levels would then have a marked positive bias. One of the strategies to avoid this consists of dividing a large item bank into several sub-banks and rotating the sub-bank employed (Ariel, Veldkamp & van der Linden, 2004). This strategy permits substantial improvements in exposure control at little cost to measurement accuracy. However, we do not know whether this option provides better results than using the master bank with greater restriction in the maximum exposure rates (Sympson & Hetter, 1985). In order to investigate this issue, we worked with several simulated banks of 2100 items, comparing them, for RMSE and overlap rate, with the same banks divided in two, three… up to seven sub-banks. By means of extensive manipulation of the maximum exposure rate in each bank, we found that the option of rotating banks slightly outperformed the option of restricting maximum exposure rate of the master bank by means of the Sympson-Hetter method.
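The two comparison criteria named in this abstract, RMSE and overlap rate, can be computed as in the following sketch. It assumes fixed-length tests and a 0/1 administration matrix with one row per examinee and one column per item. The function names are hypothetical and the definitions are the conventional ones, not necessarily the exact formulas used in the study.

```python
import numpy as np

def rmse(theta_hat, theta_true):
    """Root mean squared error between estimated and true trait levels."""
    diff = np.asarray(theta_hat, dtype=float) - np.asarray(theta_true, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))

def overlap_rate(administered):
    """Mean proportion of items shared by two examinees (fixed-length tests).

    administered: 0/1 array of shape (n_examinees, n_items).
    """
    administered = np.asarray(administered, dtype=bool)
    n_examinees = administered.shape[0]
    lengths = administered.sum(axis=1)
    if not np.all(lengths == lengths[0]):
        raise ValueError("all examinees must receive the same number of items")
    # An item seen by m examinees is shared by m * (m - 1) / 2 examinee pairs.
    m = administered.sum(axis=0)
    shared = (m * (m - 1) / 2).sum()
    pairs = n_examinees * (n_examinees - 1) / 2
    return float(shared / (pairs * lengths[0]))
```

For three examinees who receive items {0, 1}, {0, 2} and {0, 1}, the pairwise shared proportions are 0.5, 1.0 and 0.5, so the overlap rate is 2/3.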


2020 ◽  
Vol 11 ◽  
Author(s):  
Xiaojian Sun ◽  
Yanlou Liu ◽  
Tao Xin ◽  
Naiqing Song

Calibration errors are inevitable and should not be ignored during the estimation of item parameters. Items with calibration error can affect the measurement results of tests. One purpose of the current study is to investigate the impact of calibration errors in item parameter estimation on the measurement accuracy, average test length, and test efficiency of variable-length cognitive diagnostic computerized adaptive testing (CD-CAT). The other purpose is to examine methods for reducing the adverse effects of calibration errors. Simulation results show that (1) calibration error has a negative effect on measurement accuracy for the deterministic input, noisy “and” gate (DINA) model and the reduced reparameterized unified model; (2) the average test length is shorter, and the test efficiency is overestimated, for items with calibration errors; (3) the compensatory reparameterized unified model (CRUM) is less affected by calibration errors, and the classification accuracy, average test length, and test efficiency are slightly stable in the CRUM framework; (4) methods such as improving the quality of items, using a large calibration sample to calibrate the item parameters, and using a cross-validation method can reduce the adverse effects of calibration errors on CD-CAT.


2020 ◽  
pp. 107110072097266
Author(s):  
Joseph T. O’Neil ◽  
Otho R. Plummer ◽  
Steven M. Raikin

Background: Patient-reported outcome measures are an increasingly important tool for assessing the impact of treatments orthopedic surgeons render. Despite their importance, they can present a burden. We examined the validity and utility of a computerized adaptive testing (CAT) method to reduce the number of questions on the Foot and Ankle Ability Measure (FAAM), a validated anatomy-specific outcome measure. Methods: A previously developed FAAM CAT system was applied to the responses of patients undergoing foot and ankle evaluation and treatment over a 3-year period (2017-2019). A total of 15 902 responses for the Activities of Daily Living (ADL) subscale and a total of 14 344 responses for the Sports subscale were analyzed. The accuracy of the CAT to replicate the full-form score was assessed. Results: The CAT system required 11 questions to be answered for the ADL subscale in 85.1% of cases (range, 11-12). The number of questions answered on the Sports subscale was 6 (range, 5-6) in 66.4% of cases. The mean difference between the full FAAM ADL subscale and CAT was 0.63 of a point. The mean difference between the FAAM Sports subscale and CAT was 0.65 of a point. Conclusion: The FAAM CAT was able to reduce the number of responses a patient would need to answer by nearly 50%, while still providing a valid outcome score. This measure can therefore be directly correlated with previously obtained full FAAM scores in addition to providing a foot/ankle-specific measure, which previously reported CAT systems are not able to do. Level of Evidence: Level IV, case series.


2020 ◽  
pp. 001316442091989
Author(s):  
Michael L. Thomas ◽  
Gregory G. Brown ◽  
Virginie M. Patt ◽  
John R. Duffy

The adaptation of experimental cognitive tasks into measures that can be used to quantify neurocognitive outcomes in translational studies and clinical trials has become a key component of the strategy to address psychiatric and neurological disorders. Unfortunately, while most experimental cognitive tests have strong theoretical bases, they can have poor psychometric properties, leaving them vulnerable to measurement challenges that undermine their use in applied settings. Item response theory–based computerized adaptive testing has been proposed as a solution but has been limited in experimental and translational research due to its large sample requirements. We present a generalized latent variable model that, when combined with strong parametric assumptions based on mathematical cognitive models, permits the use of adaptive testing without large samples or the need to precalibrate item parameters. The approach is demonstrated using data from a common measure of working memory—the N-back task—collected across a diverse sample of participants. After evaluating dimensionality and model fit, we conducted a simulation study to compare adaptive versus nonadaptive testing. Computerized adaptive testing either made the task 36% more efficient or score estimates 23% more precise, when compared to nonadaptive testing. This proof-of-concept study demonstrates that latent variable modeling and adaptive testing can be used in experimental cognitive testing even with relatively small samples. Adaptive testing has the potential to improve the impact and replicability of findings from translational studies and clinical trials that use experimental cognitive tasks as outcome measures.


2020 ◽  
Vol 5 (4) ◽  
pp. 2473011420S0006
Author(s):  
Joseph T. O’Neil ◽  
Otho R. Plummer ◽  
Steven M. Raikin

Category: Other. Introduction/Purpose: Patient-reported outcome measures (PROMs) are an increasingly important tool for assessing the impact of treatments orthopaedic surgeons render to patients. They provide information directly reported by the patient pertaining to the perception of their own outcome, functional status, and quality of life. Despite their importance, they can present a burden for patients as well as for a busy outpatient clinic. The Foot and Ankle Ability Measure (FAAM) is a freely available validated anatomy-specific outcome measure consisting of 32 questions, and has been found to be reliable for patients with a wide spectrum of foot and ankle conditions. We examined the validity and utility of a computerized adaptive testing (CAT) method to reduce the number of questions on the Foot and Ankle Ability Measure. Methods: A previously developed FAAM CAT system was applied to the responses of patients undergoing foot and ankle evaluation and treatment at a busy tertiary referral orthopaedic practice over a 3-year period (2017-2019). A total of 15,902 responses for the Activities of Daily Living (ADL) subscale and a total of 14,344 responses for the Sports subscale were analyzed. The accuracy of the CAT to replicate the full-form score was assessed using the mean and standard deviation of scores for both groups (FAAM versus CAT), frequency distributions of the scores and score differences for both groups, Pearson and intraclass correlation coefficients, and Bland-Altman assessments of patterns in score differences. Results: The CAT system required 11 questions to be answered for the ADL subscale in 85.1% of cases (compared to 22 questions for the FAAM) and 12 in 14.9% of cases. The number of questions answered on the Sports subscale was 6 in 66.4% of cases (compared to 10 for the FAAM) and 5 in 33.6% of cases. The mean difference between the full FAAM ADL subscale (out of 100 points) and CAT was 0.6266 of a point and scores were within 7.5 points in greater than 95% of cases. The mean difference between the FAAM Sports subscale (out of 100 points) and CAT was 0.5967 of a point and scores were within the minimal clinically important difference of 9 in greater than 95% of cases. Conclusion: The FAAM CAT was able to reduce the number of responses a patient would need to answer by nearly 50%, while still providing a valid outcome score. This measure can therefore be directly correlated with previously obtained full FAAM scores in addition to providing a foot/ankle-specific measure, which previously reported CAT systems are not able to do.
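The agreement statistics reported in this abstract (mean score difference, Bland-Altman assessment, and share of cases within a tolerance) can be sketched as follows. This assumes the conventional Bland-Altman limits of agreement, bias ± 1.96 standard deviations of the paired differences, which may differ from the authors' exact procedure. The function names are hypothetical, and any scores passed in are illustrative, not study data.

```python
import numpy as np

def bland_altman(full_scores, cat_scores):
    """Return (bias, lower limit, upper limit) for paired full-form and CAT scores."""
    diff = np.asarray(full_scores, dtype=float) - np.asarray(cat_scores, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return float(bias), float(bias - 1.96 * sd), float(bias + 1.96 * sd)

def share_within(full_scores, cat_scores, tolerance):
    """Proportion of cases whose absolute score difference is at most `tolerance`."""
    diff = np.asarray(full_scores, dtype=float) - np.asarray(cat_scores, dtype=float)
    return float(np.mean(np.abs(diff) <= tolerance))
```

With a tolerance of 7.5 points (ADL) or 9 points (Sports), `share_within` corresponds to the "within ... in greater than 95% of cases" figures quoted above.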


2021 ◽  
Vol 12 ◽  
Author(s):  
Xiaojian Sun ◽  
Yizhu Gao ◽  
Tao Xin ◽  
Naiqing Song

Although classification accuracy is a critical issue in cognitive diagnostic computerized adaptive testing, attention has increasingly shifted to item exposure control to ensure test security. In this study, we developed the binary restrictive threshold (BRT) method to balance measurement accuracy and item exposure. In addition, a simulation study was conducted to evaluate its performance. The results indicated that the BRT method performed better than the restrictive progressive (RP) and stratified dynamic binary searching (SDBS) approaches but worse than the restrictive threshold (RT) method in terms of classification accuracy. With respect to item exposure control, the BRT method exhibited noticeably stronger performance compared with the RT method, even though its performance was not as high as that of the RP and SDBS methods.

