Computerized Adaptive Testing in Early Education: Exploring the Impact of Item Position Effects on Ability Estimation

2019 ◽  
Vol 56 (2) ◽  
pp. 437-451 ◽  
Author(s):  
Anthony D. Albano ◽  
Liuhan Cai ◽  
Erin M. Lease ◽  
Scott R. McConnell


Methodology ◽  
2007 ◽  
Vol 3 (1) ◽  
pp. 14-23 ◽  
Author(s):  
Juan Ramon Barrada ◽  
Julio Olea ◽  
Vicente Ponsoda

Abstract. The Sympson-Hetter (1985) method provides a means of controlling the maximum exposure rate of items in Computerized Adaptive Testing. Through a series of simulations, control parameters are set that give the probability that an item is administered once it has been selected. This method presents two main problems: it requires a long computation time for calculating the parameters, and the maximum exposure rate is slightly above the fixed limit. Van der Linden (2003) presented two alternatives which appear to solve both of the problems. The impact of these methods on measurement accuracy has not been tested yet. We show how these methods over-restrict the exposure of some highly discriminating items and, thus, decrease accuracy. It is also shown that, when the desired maximum exposure rate is near the minimum possible value, these methods offer an empirical maximum exposure rate clearly above the goal. A new method, based on the initial estimation of the probability of administration and the probability of selection of the items with the restricted method (Revuelta & Ponsoda, 1998), is presented in this paper. It can be used with the Sympson-Hetter method and with van der Linden's two methods. This option, when used with Sympson-Hetter, speeds the convergence of the control parameters without decreasing the accuracy.
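
As an illustration of the control-parameter idea described in this abstract, the following is a minimal sketch of one Sympson-Hetter style update step and of the probabilistic administration check. It is not the authors' code: the function names, the example selection rates, and the fallback rule when every candidate is rejected are assumptions made for the example. In a full procedure the update would be repeated over simulated test administrations until the parameters stabilize.

```python
import random

def update_control_parameters(selection_rates, r_max):
    """One update step: k_i = r_max / P(S_i) if P(S_i) > r_max, otherwise 1."""
    return [r_max / p if p > r_max else 1.0 for p in selection_rates]

def administer_first_passing(ranked_items, k, rng):
    """Walk the candidates in order of preference and administer item i
    with probability k[i] (the probability of administration given selection)."""
    for item in ranked_items:
        if rng.random() < k[item]:
            return item
    # Fallback (an assumption of this sketch): give the last candidate
    # if every candidate was rejected.
    return ranked_items[-1]

# Hypothetical selection rates from one round of simulations.
selection_rates = [0.50, 0.10, 0.25]
k = update_control_parameters(selection_rates, r_max=0.20)

rng = random.Random(0)
chosen = administer_first_passing([0, 2, 1], k, rng)
print(k, chosen)
```

Items selected more often than the target rate receive a control parameter below 1, so they are sometimes skipped in favor of the next-best candidate; items under the target are always administered when selected.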


2020 ◽  
pp. 107699862097280
Author(s):  
Shiyu Wang ◽  
Houping Xiao ◽  
Allan Cohen

An adaptive weight estimation approach is proposed to provide robust latent ability estimation in computerized adaptive testing (CAT) with response revision. This approach assigns different weights to each distinct response to the same item when response revision is allowed in CAT. Two types of weight estimation procedures, nonfunctional and functional weight, are proposed to determine the weight adaptively based on the compatibility of each revised response with the assumed statistical model in relation to remaining observations. The application of this estimation approach to a data set collected from a large-scale multistage adaptive test demonstrates the capability of this method to reveal more information regarding the test taker’s latent ability by using the valid response path compared with only using the very last response. Limited simulation studies were conducted to evaluate the proposed ability estimation method and to compare it with several other estimation procedures in the literature. Results indicate that the proposed ability estimation approach is able to provide robust estimation results in two test-taking scenarios.
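
To make the idea of weighting each response to the same item concrete, here is a small sketch of a weighted log-likelihood ability estimate. It is not the procedure proposed in the paper: it assumes a two-parameter logistic model, takes the weights as given rather than estimating them adaptively, and finds the estimate by a simple grid search. All names and numbers are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """Two-parameter logistic probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def weighted_theta(responses, lo=-4.0, hi=4.0, step=0.01):
    """Grid-search maximizer of a weighted log-likelihood.
    responses: list of (a, b, u, w) with discrimination a, difficulty b,
    scored response u in {0, 1}, and a hand-supplied weight w."""
    best_theta, best_ll = lo, -math.inf
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        theta = lo + i * step
        ll = 0.0
        for a, b, u, w in responses:
            p = p_correct(theta, a, b)
            ll += w * (u * math.log(p) + (1 - u) * math.log(1.0 - p))
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta

# One item answered incorrectly at first and then revised to correct.
equal_weights = [(1.0, 0.0, 0, 1.0), (1.0, 0.0, 1, 1.0)]
downweighted_first = [(1.0, 0.0, 0, 0.2), (1.0, 0.0, 1, 1.0)]
print(weighted_theta(equal_weights), weighted_theta(downweighted_first))
```

With equal weights the two conflicting responses cancel and the estimate sits at the item difficulty; shrinking the weight of the initial response pulls the estimate toward what the revised response implies.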


2020 ◽  
pp. 107110072097266
Author(s):  
Joseph T. O’Neil ◽  
Otho R. Plummer ◽  
Steven M. Raikin

Background: Patient-reported outcome measures are an increasingly important tool for assessing the impact of treatments orthopedic surgeons render. Despite their importance, they can present a burden. We examined the validity and utility of a computerized adaptive testing (CAT) method to reduce the number of questions on the Foot and Ankle Ability Measure (FAAM), a validated anatomy-specific outcome measure. Methods: A previously developed FAAM CAT system was applied to the responses of patients undergoing foot and ankle evaluation and treatment over a 3-year period (2017-2019). A total of 15 902 responses for the Activities of Daily Living (ADL) subscale and a total of 14 344 responses for the Sports subscale were analyzed. The accuracy of the CAT to replicate the full-form score was assessed. Results: The CAT system required 11 questions to be answered for the ADL subscale in 85.1% of cases (range, 11-12). The number of questions answered on the Sports subscale was 6 (range, 5-6) in 66.4% of cases. The mean difference between the full FAAM ADL subscale and CAT was 0.63 of a point. The mean difference between the FAAM Sports subscale and CAT was 0.65 of a point. Conclusion: The FAAM CAT was able to reduce the number of responses a patient would need to answer by nearly 50%, while still providing a valid outcome score. This measure can therefore be directly correlated with previously obtained full FAAM scores in addition to providing a foot/ankle-specific measure, which previously reported CAT systems are not able to do. Level of Evidence: Level IV, case series.


2020 ◽  
pp. 001316442091989
Author(s):  
Michael L. Thomas ◽  
Gregory G. Brown ◽  
Virginie M. Patt ◽  
John R. Duffy

The adaptation of experimental cognitive tasks into measures that can be used to quantify neurocognitive outcomes in translational studies and clinical trials has become a key component of the strategy to address psychiatric and neurological disorders. Unfortunately, while most experimental cognitive tests have strong theoretical bases, they can have poor psychometric properties, leaving them vulnerable to measurement challenges that undermine their use in applied settings. Item response theory–based computerized adaptive testing has been proposed as a solution but has been limited in experimental and translational research due to its large sample requirements. We present a generalized latent variable model that, when combined with strong parametric assumptions based on mathematical cognitive models, permits the use of adaptive testing without large samples or the need to precalibrate item parameters. The approach is demonstrated using data from a common measure of working memory—the N-back task—collected across a diverse sample of participants. After evaluating dimensionality and model fit, we conducted a simulation study to compare adaptive versus nonadaptive testing. Computerized adaptive testing either made the task 36% more efficient or score estimates 23% more precise, when compared to nonadaptive testing. This proof-of-concept study demonstrates that latent variable modeling and adaptive testing can be used in experimental cognitive testing even with relatively small samples. Adaptive testing has the potential to improve the impact and replicability of findings from translational studies and clinical trials that use experimental cognitive tasks as outcome measures.


2020 ◽  
Vol 5 (4) ◽  
pp. 2473011420S0006
Author(s):  
Joseph T. O’Neil ◽  
Otho R. Plummer ◽  
Steven M. Raikin

Category: Other. Introduction/Purpose: Patient-reported outcome measures (PROMs) are an increasingly important tool for assessing the impact of treatments orthopaedic surgeons render to patients. They provide information directly reported by the patient pertaining to the perception of their own outcome, functional status, and quality of life. Despite their importance, they can present a burden for patients as well as for a busy outpatient clinic. The Foot and Ankle Ability Measure (FAAM) is a freely available validated anatomy-specific outcome measure consisting of 32 questions, and has been found to be reliable for patients with a wide spectrum of foot and ankle conditions. We examined the validity and utility of a computerized adaptive testing (CAT) method to reduce the number of questions on the Foot and Ankle Ability Measure. Methods: A previously developed FAAM CAT system was applied to the responses of patients undergoing foot and ankle evaluation and treatment at a busy tertiary referral orthopaedic practice over a 3-year period (2017-2019). A total of 15,902 responses for the Activities of Daily Living (ADL) subscale and a total of 14,344 responses for the Sports subscale were analyzed. The accuracy of the CAT to replicate the full-form score was assessed using the mean and standard deviation of scores for both groups (FAAM versus CAT), frequency distributions of the scores and score differences for both groups, Pearson and intraclass correlation coefficients, and Bland-Altman assessments of patterns in score differences. Results: The CAT system required 11 questions to be answered for the ADL subscale in 85.1% of cases (compared to 22 questions for the FAAM) and 12 in 14.9% of cases. The number of questions answered on the Sports subscale was 6 in 66.4% of cases (compared to 10 for the FAAM) and 5 in 33.6% of cases. The mean difference between the full FAAM ADL subscale (out of 100 points) and CAT was 0.6266 of a point and scores were within 7.5 points in greater than 95% of cases. The mean difference between the FAAM Sports subscale (out of 100 points) and CAT was 0.5967 of a point and scores were within the minimal clinically important difference of 9 in greater than 95% of cases. Conclusion: The FAAM CAT was able to reduce the number of responses a patient would need to answer by nearly 50%, while still providing a valid outcome score. This measure can therefore be directly correlated with previously obtained full FAAM scores in addition to providing a foot/ankle-specific measure, which previously reported CAT systems are not able to do.
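
The agreement statistics named in this abstract (mean score difference, Bland-Altman style limits, share of cases within a tolerance) can be sketched as follows. This is only an illustration under stated assumptions: the scores are made-up numbers, not study data, the function name is hypothetical, and the limits use the conventional mean ± 1.96 standard deviations of the differences.

```python
import statistics

def agreement_summary(full_scores, cat_scores, tolerance):
    """Summarize agreement between paired full-form and CAT scores."""
    diffs = [f - c for f, c in zip(full_scores, cat_scores)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    limits = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)
    within = sum(abs(d) <= tolerance for d in diffs) / len(diffs)
    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "limits_of_agreement": limits,
        "share_within_tolerance": within,
    }

# Made-up paired scores on a 0-100 scale, for illustration only.
full = [80, 90, 70, 60]
cat = [81, 89, 72, 60]
summary = agreement_summary(full, cat, tolerance=7.5)
print(summary)
```

The tolerance argument plays the role of the 7.5-point and 9-point thresholds quoted in the results; any clinically motivated cutoff can be passed in its place.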

