Computerized Adaptive Testing in Early Education: Exploring the Impact of Item Position Effects on Ability Estimation

Abstract. The Sympson-Hetter (1985) method provides a means of controlling the maximum exposure rate of items in computerized adaptive testing. Through a series of simulations, control parameters are set that give the probability that an item is administered once it has been selected. This method presents two main problems: it requires a long computation time for calculating the parameters, and the maximum exposure rate is slightly above the fixed limit. Van der Linden (2003) presented two alternatives which appear to solve both problems. The impact of these methods on measurement accuracy has not yet been tested. We show how these methods over-restrict the exposure of some highly discriminating items and thus decrease accuracy. It is also shown that, when the desired maximum exposure rate is near the minimum possible value, these methods yield an empirical maximum exposure rate clearly above the goal. A new method, based on the initial estimation of the probability of administration and the probability of selection of the items with the restricted method (Revuelta & Ponsoda, 1998), is presented in this paper. It can be used with the Sympson-Hetter method and with van der Linden's two methods. This option, when used with Sympson-Hetter, speeds the convergence of the control parameters without decreasing the accuracy.
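The exposure-control idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the adjustment rule is the commonly described Sympson-Hetter step (set the control parameter to r_max / P(selected) when an item is selected more often than the target rate, otherwise to 1), and the fallback used when every candidate fails its lottery is an assumption made only to keep the example total.

```python
import random


def administer_with_exposure_control(ranked_items, control, rng):
    """Pick the item to administer from candidates ranked best-first.

    ranked_items: item ids ordered from most to least informative.
    control: dict mapping item id -> P(administer | selected).
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    for item in ranked_items:
        # A selected item is administered only with probability control[item];
        # otherwise it is skipped and the next candidate is tried.
        if rng.random() < control[item]:
            return item
    # Assumption for this sketch: if every candidate is rejected,
    # fall back to the last candidate instead of returning nothing.
    return ranked_items[-1]


def update_control_parameters(selection_rate, r_max):
    """One adjustment step after a simulation run.

    selection_rate: dict mapping item id -> observed P(selected).
    r_max: target maximum exposure rate.
    """
    return {
        item: (r_max / p_selected if p_selected > r_max else 1.0)
        for item, p_selected in selection_rate.items()
    }


if __name__ == "__main__":
    control = update_control_parameters({"a": 0.5, "b": 0.1}, r_max=0.25)
    print(control)
    print(administer_with_exposure_control(["a", "b"], control, random.Random(0)))
```

In practice the adjustment step is repeated over many simulated test administrations until the control parameters stabilize, which is the long computation the abstract refers to.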


Adaptive Weight Estimation of Latent Ability: Application to Computerized Adaptive Testing With Response Revision

Journal of Educational and Behavioral Statistics ◽ 10.3102/1076998620972800 ◽ 2020 ◽ pp. 107699862097280

Author(s): Shiyu Wang ◽ Houping Xiao ◽ Allan Cohen

Keyword(s): Large Scale ◽ Computerized Adaptive Testing ◽ Estimation Method ◽ Adaptive Testing ◽ Ability Estimation ◽ Weight Estimation ◽ Data Set ◽ Adaptive Weight ◽ Latent Ability ◽ Estimation Procedures

An adaptive weight estimation approach is proposed to provide robust latent ability estimation in computerized adaptive testing (CAT) with response revision. This approach assigns different weights to each distinct response to the same item when response revision is allowed in CAT. Two types of weight estimation procedures, nonfunctional and functional weight, are proposed to determine the weight adaptively based on the compatibility of each revised response with the assumed statistical model in relation to remaining observations. The application of this estimation approach to a data set collected from a large-scale multistage adaptive test demonstrates the capability of this method to reveal more information regarding the test taker’s latent ability by using the valid response path compared with only using the very last response. Limited simulation studies were conducted to evaluate the proposed ability estimation method and to compare it with several other estimation procedures in the literature. Results indicate that the proposed ability estimation approach is able to provide robust estimation results in two test-taking scenarios.
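To make the idea of weighting each distinct response to the same item concrete, here is a rough Python sketch of a weighted maximum-likelihood ability estimate. It is not the estimator proposed in the paper: the two-parameter logistic model, the grid search, the function names, and the hand-picked weights are all assumptions for illustration, whereas the paper determines the weights adaptively.

```python
import math


def p_correct(theta, a, b):
    """Two-parameter logistic probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def weighted_ability_estimate(responses, grid=None):
    """Maximize a weighted log-likelihood over a grid of ability values.

    responses: list of (a, b, u, w) tuples, one per recorded response, where
        a, b are the item's discrimination and difficulty,
        u is 1 for a correct and 0 for an incorrect response,
        w is the weight given to that response (supplied by the caller here).
    """
    if grid is None:
        grid = [i / 100 for i in range(-400, 401)]  # theta from -4.00 to 4.00

    def weighted_loglik(theta):
        total = 0.0
        for a, b, u, w in responses:
            p = p_correct(theta, a, b)
            total += w * (u * math.log(p) + (1 - u) * math.log(1.0 - p))
        return total

    return max(grid, key=weighted_loglik)


if __name__ == "__main__":
    # One item answered twice (initial answer wrong, revised answer right).
    # Hypothetical weights: the revised response counts four times as much.
    revised_path = [(1.0, 0.0, 0, 0.2), (1.0, 0.0, 1, 0.8)]
    print(weighted_ability_estimate(revised_path))
```

With equal weights on the two conflicting responses the estimate sits at the item difficulty; shifting weight toward the revised correct response moves the estimate upward, which is the qualitative behavior a weighting scheme of this kind is meant to control.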


Application of Computerized Adaptive Testing to the Foot and Ankle Ability Measure

Foot & Ankle International ◽ 10.1177/1071100720972663 ◽ 2020 ◽ pp. 107110072097266

Author(s): Joseph T. O’Neil ◽ Otho R. Plummer ◽ Steven M. Raikin

Keyword(s): Computerized Adaptive Testing ◽ Case Series ◽ Adaptive Testing ◽ Mean Difference ◽ Foot And Ankle ◽ Level Of Evidence ◽ Patient Reported ◽ The Mean ◽ Ability Measure ◽ The Impact

Background: Patient-reported outcome measures are an increasingly important tool for assessing the impact of treatments orthopedic surgeons render. Despite their importance, they can present a burden. We examined the validity and utility of a computerized adaptive testing (CAT) method to reduce the number of questions on the Foot and Ankle Ability Measure (FAAM), a validated anatomy-specific outcome measure. Methods: A previously developed FAAM CAT system was applied to the responses of patients undergoing foot and ankle evaluation and treatment over a 3-year period (2017-2019). A total of 15 902 responses for the Activities of Daily Living (ADL) subscale and a total of 14 344 responses for the Sports subscale were analyzed. The accuracy of the CAT to replicate the full-form score was assessed. Results: The CAT system required 11 questions to be answered for the ADL subscale in 85.1% of cases (range, 11-12). The number of questions answered on the Sports subscale was 6 (range, 5-6) in 66.4% of cases. The mean difference between the full FAAM ADL subscale and CAT was 0.63 of a point. The mean difference between the FAAM Sports subscale and CAT was 0.65 of a point. Conclusion: The FAAM CAT was able to reduce the number of responses a patient would need to answer by nearly 50%, while still providing a valid outcome score. This measure can therefore be directly correlated with previously obtained full FAAM scores in addition to providing a foot/ankle-specific measure, which previously reported CAT systems are not able to do. Level of Evidence: Level IV, case series.


Effects of Calibration Sample Size and Item Bank Size on Ability Estimation in Computerized Adaptive Testing

Educational Sciences Theory & Practice ◽ 10.12738/estp.2015.6.0102 ◽ 2015 ◽ Cited By ~ 1

Keyword(s): Sample Size ◽ Computerized Adaptive Testing ◽ Adaptive Testing ◽ Item Bank ◽ Ability Estimation ◽ Calibration Sample


A computerized adaptive testing approach to measuring the impact of mobility devices on activities and participation

Gerontechnology ◽ 10.4017/gt.2010.09.02.055.00 ◽ 2010 ◽ Vol 9 (2)

Author(s): J. Jutai ◽ M. Fuhrer ◽ R. Bode ◽ A. Heinemann ◽ L. Demers ◽ ...

Keyword(s): Computerized Adaptive Testing ◽ Adaptive Testing ◽ Mobility Devices ◽ Testing Approach ◽ The Impact


Effects of Scale Transformation and Test-Termination Rule on the Precision of Ability Estimation in Computerized Adaptive Testing

Journal of Educational Measurement ◽ 10.1111/j.1745-3984.2001.tb01127.x ◽ 2001 ◽ Vol 38 (3) ◽ pp. 267-292 ◽ Cited By ~ 7

Author(s): Qing Yi ◽ Tianyou Wang ◽ Jae-Chun Ban

Keyword(s): Computerized Adaptive Testing ◽ Adaptive Testing ◽ Scale Transformation ◽ Ability Estimation


About the ability estimation algorithm optimization of IRT-based Computerized Adaptive Testing system

2008 7th World Congress on Intelligent Control and Automation ◽ 10.1109/wcica.2008.4593865 ◽ 2008 ◽ Cited By ~ 1

Author(s): Yan Cheng ◽ Weisheng Xu ◽ Youling Yu

Keyword(s): Computerized Adaptive Testing ◽ Estimation Algorithm ◽ Adaptive Testing ◽ Testing System ◽ Ability Estimation ◽ Algorithm Optimization


Latent Variable Modeling and Adaptive Testing for Experimental Cognitive Psychopathology Research

Educational and Psychological Measurement ◽ 10.1177/0013164420919898 ◽ 2020 ◽ pp. 001316442091989

Author(s): Michael L. Thomas ◽ Gregory G. Brown ◽ Virginie M. Patt ◽ John R. Duffy

Keyword(s): Clinical Trials ◽ Latent Variable ◽ Computerized Adaptive Testing ◽ Adaptive Testing ◽ Cognitive Testing ◽ Cognitive Tasks ◽ Latent Variable Modeling ◽ Variable Model ◽ Translational Studies ◽ The Impact

The adaptation of experimental cognitive tasks into measures that can be used to quantify neurocognitive outcomes in translational studies and clinical trials has become a key component of the strategy to address psychiatric and neurological disorders. Unfortunately, while most experimental cognitive tests have strong theoretical bases, they can have poor psychometric properties, leaving them vulnerable to measurement challenges that undermine their use in applied settings. Item response theory–based computerized adaptive testing has been proposed as a solution but has been limited in experimental and translational research due to its large sample requirements. We present a generalized latent variable model that, when combined with strong parametric assumptions based on mathematical cognitive models, permits the use of adaptive testing without large samples or the need to precalibrate item parameters. The approach is demonstrated using data from a common measure of working memory—the N-back task—collected across a diverse sample of participants. After evaluating dimensionality and model fit, we conducted a simulation study to compare adaptive versus nonadaptive testing. Computerized adaptive testing either made the task 36% more efficient or score estimates 23% more precise, when compared to nonadaptive testing. This proof-of-concept study demonstrates that latent variable modeling and adaptive testing can be used in experimental cognitive testing even with relatively small samples. Adaptive testing has the potential to improve the impact and replicability of findings from translational studies and clinical trials that use experimental cognitive tasks as outcome measures.


Properties of Ability Estimation Methods in Computerized Adaptive Testing

Journal of Educational Measurement ◽ 10.1111/j.1745-3984.1998.tb00530.x ◽ 1998 ◽ Vol 35 (2) ◽ pp. 109-135 ◽ Cited By ~ 43

Author(s): Tianyou Wang ◽ Walter P. Vispoel

Keyword(s): Computerized Adaptive Testing ◽ Adaptive Testing ◽ Estimation Methods ◽ Ability Estimation


2020 Roger A. Mann Award Winner: Application of Computerized Adaptive Testing to the Foot and Ankle Ability Measure

Foot & Ankle Orthopaedics ◽ 10.1177/2473011420s00067 ◽ 2020 ◽ Vol 5 (4) ◽ pp. 2473011420S0006

Author(s): Joseph T. O’Neil ◽ Otho R. Plummer ◽ Steven M. Raikin

Keyword(s): Computerized Adaptive Testing ◽ Wide Spectrum ◽ Intraclass Correlation ◽ Adaptive Testing ◽ Mean Difference ◽ Foot And Ankle ◽ Patient Reported ◽ The Mean ◽ Ability Measure ◽ The Impact

Category: Other. Introduction/Purpose: Patient-reported outcome measures (PROMs) are an increasingly important tool for assessing the impact of treatments orthopaedic surgeons render to patients. They provide information directly reported by the patient pertaining to the perception of their own outcome, functional status, and quality of life. Despite their importance, they can present a burden for patients as well as for a busy outpatient clinic. The Foot and Ankle Ability Measure (FAAM) is a freely available validated anatomy-specific outcome measure consisting of 32 questions, and has been found to be reliable for patients with a wide spectrum of foot and ankle conditions. We examined the validity and utility of a computerized adaptive testing (CAT) method to reduce the number of questions on the Foot and Ankle Ability Measure. Methods: A previously developed FAAM CAT system was applied to the responses of patients undergoing foot and ankle evaluation and treatment at a busy tertiary referral orthopaedic practice over a 3-year period (2017-2019). A total of 15,902 responses for the Activities of Daily Living (ADL) subscale and a total of 14,344 responses for the Sports subscale were analyzed. The accuracy of the CAT to replicate the full-form score was assessed using the mean and standard deviation of scores for both groups (FAAM versus CAT), frequency distributions of the scores and score differences for both groups, Pearson and intraclass correlation coefficients, and Bland-Altman assessments of patterns in score differences. Results: The CAT system required 11 questions to be answered for the ADL subscale in 85.1% of cases (compared to 22 questions for the FAAM) and 12 in 14.9% of cases. The number of questions answered on the Sports subscale was 6 in 66.4% of cases (compared to 10 for the FAAM) and 5 in 33.6% of cases. The mean difference between the full FAAM ADL subscale (out of 100 points) and CAT was 0.6266 of a point and scores were within 7.5 points in greater than 95% of cases. The mean difference between the FAAM Sports subscale (out of 100 points) and CAT was 0.5967 of a point and scores were within the minimal clinically important difference of 9 in greater than 95% of cases. Conclusion: The FAAM CAT was able to reduce the number of responses a patient would need to answer by nearly 50%, while still providing a valid outcome score. This measure can therefore be directly correlated with previously obtained full FAAM scores in addition to providing a foot/ankle-specific measure, which previously reported CAT systems are not able to do.
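The agreement statistics named in this abstract (mean difference, share of cases within a tolerance, Bland-Altman assessment) can be computed along the following lines. This is a sketch under stated assumptions: the function name and the example scores are invented, the limits of agreement use the conventional mean ± 1.96 standard deviations, and the difference is taken as signed (full form minus CAT), which the abstract does not specify.

```python
import statistics


def agreement_summary(full_scores, cat_scores, tolerance):
    """Summarize agreement between full-form and CAT scores.

    Returns (mean_difference, (lower_limit, upper_limit), share_within_tolerance).
    """
    diffs = [full - cat for full, cat in zip(full_scores, cat_scores)]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    # Conventional Bland-Altman 95% limits of agreement.
    limits = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)
    # Share of paired scores that differ by no more than the tolerance.
    within = sum(abs(d) <= tolerance for d in diffs) / len(diffs)
    return mean_diff, limits, within


if __name__ == "__main__":
    # Made-up scores on a 0-100 scale, for illustration only.
    full = [80, 90, 70, 60]
    cat = [79, 91, 70, 62]
    print(agreement_summary(full, cat, tolerance=1.5))
```

For a subscale like those described above, the tolerance would be set to the threshold of interest, such as the 7.5-point or 9-point bounds the abstract reports.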
