To Implement Computerized Adaptive Testing by Automatically Adjusting Item Difficulty Index on Adaptive English Learning Platform

2021 ◽  
Vol 22 (7) ◽  
pp. 1599-1607
Author(s):  
Shu-Chen Cheng (鄭淑臻) ◽  
Yu-Ping Cheng ◽  
Yueh-Min Huang (黃悅民)

2019 ◽  
Vol 44 (3) ◽  
pp. 182-196
Author(s):  
Jyun-Hong Chen ◽  
Hsiu-Yi Chao ◽  
Shu-Ying Chen

When computerized adaptive testing (CAT) is under stringent item exposure control, the precision of trait estimation decreases substantially. A new item selection method, the dynamic Stratification method based on Dominance Curves (SDC), is proposed to mitigate this problem by improving trait estimation. The objective function of the SDC in item selection is to maximize the sum of test information across all examinees, rather than maximizing item information for an individual examinee at each single-item administration, as in conventional CAT. To achieve this objective, the SDC uses dominance curves to stratify an item pool into a number of strata equal to the test length, increasing the quality of the administered items as the test progresses and reducing the likelihood that a high-discrimination item is administered to an examinee whose ability is not close to the item difficulty. Furthermore, the SDC incorporates a dynamic process for on-the-fly item–stratum adjustment to optimize the use of quality items. Simulation studies were conducted to investigate the performance of the SDC in CAT under item exposure control at different levels of severity. According to the results, the SDC efficiently improves trait estimation in CAT, yielding more precise and more accurate trait estimates than other methods (e.g., the maximum Fisher information method) in most conditions.
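For reference, the baseline the SDC is compared against can be sketched in a few lines. This is a minimal illustration of maximum Fisher information item selection under the 2PL model, not code from the paper; the item pool, parameter values, and function names are assumptions for illustration.

```python
import math

def fisher_info_2pl(theta, a, b):
    """Fisher information of a 2PL item (discrimination a, difficulty b)
    at ability theta: I(theta) = a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_item_mfi(theta, pool, administered):
    """Maximum Fisher information rule: among unadministered items,
    pick the one most informative at the current ability estimate."""
    best_idx, best_info = None, -1.0
    for idx, (a, b) in enumerate(pool):
        if idx in administered:
            continue
        info = fisher_info_2pl(theta, a, b)
        if info > best_info:
            best_idx, best_info = idx, info
    return best_idx

# A high-discrimination item whose difficulty matches theta wins:
pool = [(1.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
chosen = select_item_mfi(0.0, pool, set())  # item 1: a=2.0, b=0.0
```

Because MFI greedily favors whatever is most informative right now, high-discrimination items get selected early and often, which is exactly the exposure pressure the SDC's stratification is designed to spread out over the test.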


2011 ◽  
Vol 27 (3) ◽  
pp. 157-163 ◽  
Author(s):  
Tuulia M. Ortner ◽  
Juliane Caspers

We investigated the effects of test anxiety on test performance in computerized adaptive testing (CAT) versus conventional fixed item testing (FIT). We hypothesized that tests containing mainly items with medium probabilities of being solved would have negative effects on the test performance of test-takers high in test anxiety. A total of 110 students (aged 16 to 20) from a German secondary modern school filled out a short form of the Test Anxiety Inventory (TAI-G; Wacker, Jaunzeme, & Jaksztat, 2008) and were then presented with items from the Adaptive Matrices Test (AMT; Hornke, Etzel, & Rettig, 1999) on the computer, either in CAT form or in a fixed item test form with items arranged in order of increasing difficulty. Additionally, half of the students were given a short summary of how item selection works in adaptive testing before working on the CAT. A moderated regression analysis revealed a significant interaction of test anxiety and test mode: the effect of test mode on the AMT score was stronger for students with higher test anxiety than for students with lower test anxiety. Furthermore, receiving information about CAT led to significantly better results than receiving standard test instructions. Results are discussed with reference to test fairness.


2015 ◽  
Vol 23 (88) ◽  
pp. 593-610
Author(s):  
Patrícia Costa ◽  
Maria Eugénia Ferrão

This study aims to provide statistical evidence of the complementarity between classical test theory and item response models for certain educational assessment purposes. Such complementarity might support, at a reduced cost, future development of innovative procedures for item calibration in adaptive testing. Classical test theory and the generalized partial credit model are applied to tests comprising multiple choice, short answer, completion, and open response items scored partially. Datasets are derived from the tests administered to the Portuguese population of students enrolled in the 4th and 6th grades. The results show a very strong association between the estimates of difficulty obtained from classical test theory and item response models, corroborating the statistical theory of mental testing.
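The reported association between classical and IRT difficulty estimates can be illustrated with a small sketch (not the study's code or data): classical difficulty is simply the proportion of correct responses per item, which can then be correlated with IRT difficulty estimates. The toy response matrix, the IRT difficulty values, and the function names below are hypothetical.

```python
import math

def classical_difficulty(responses):
    """Classical test theory item difficulty: proportion of examinees
    (rows) answering each item (column) correctly."""
    n = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n for j in range(n_items)]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

# Hypothetical 4 examinees x 3 items (1 = correct, 0 = incorrect):
p_values = classical_difficulty([[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]])
# Hypothetical IRT b-parameters for the same three items:
r = pearson(p_values, [-1.0, 0.0, 1.0])
```

Note the sign: classical p-values are *easiness* (higher means easier), while IRT b-parameters are *difficulty*, so a strong association between the two appears as a strongly negative correlation.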


1999 ◽  
Vol 12 (2) ◽  
pp. 185-198 ◽  
Author(s):  
Steven L. Wise ◽  
Sara J. Finney ◽  
Craig K. Enders ◽  
Sharon A. Freeman ◽  
Donald D. Severance

2020 ◽  
Author(s):  
John Harmon Wolfe ◽  
Gerald E. Larson

The feasibility of generating items in real time for computerized adaptive testing is explored, using forward digit span as an exemplar. A sample of 531 recruits at the Naval Training Center in San Diego was administered 36 computer-generated forward digit span items of varying lengths. Calibrations showed that item difficulty was a simple linear function of the number of digits in the item, making the difficulty of newly generated items predictable. Simulations of computerized adaptive testing with this approach were conducted, with favorable results.
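The reported linear relation between digit count and difficulty suggests a simple calibration recipe, sketched below under assumed numbers (the span lengths and difficulty values are invented for illustration, not taken from the study): fit a line to calibrated difficulties, then predict the difficulty of a newly generated item from its length alone.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y ≈ slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Hypothetical calibrated difficulties (IRT b) for spans of 4-8 digits:
lengths = [4, 5, 6, 7, 8]
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
slope, intercept = fit_line(lengths, difficulties)

# A freshly generated 9-digit item never needs separate calibration;
# its difficulty is predicted directly from its length:
predicted_b_9 = slope * 9 + intercept
```

This is what makes on-the-fly generation viable: once the length-to-difficulty mapping is calibrated, any new item's difficulty is known in advance, so it can be slotted into adaptive selection immediately.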


1999 ◽  
Vol 15 (2) ◽  
pp. 91-98 ◽  
Author(s):  
Lutz F. Hornke

Summary: Item parameters for several hundred items were estimated from empirical data on several thousand subjects. Estimates under the logistic one-parameter (1PL) and two-parameter (2PL) models were evaluated. However, model fit showed that only a subset of items complied sufficiently; these items were assembled into well-fitting item banks. In several simulation studies, 5,000 simulated response patterns and person parameters were generated in accordance with a computerized adaptive testing procedure. A reliability of .80, equivalently a standard error of measurement of .44, was used as the stopping rule to end CAT testing. We also recorded how often each item was used across simulees. Person-parameter estimates based on CAT correlated higher than .90 with the simulated true values. For the 1PL-fitting item banks, most simulees needed more than 20 but fewer than 30 items to reach the preset level of measurement error. However, testing based on item banks that conformed to the 2PL required, on average, only 10 items to end testing at the same measurement error level. Both results clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday use will show whether these trends hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.
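The stopping rule can be made concrete with a small sketch (my assumptions, not the authors' code): in IRT, the standard error of measurement at the current ability estimate is 1/sqrt(test information), so testing ends once accumulated item information pushes the SEM to .44 or below. With unit trait variance, an SEM of .44 corresponds to a reliability of about 1 − .44² ≈ .80, which is why the abstract treats the two criteria as interchangeable.

```python
import math

def should_stop(item_infos, max_sem=0.44):
    """Variable-length CAT stopping rule: stop once the standard error
    of measurement, SEM = 1 / sqrt(sum of item information evaluated at
    the current theta estimate), reaches max_sem or below."""
    total_info = sum(item_infos)
    if total_info <= 0:        # nothing administered yet: keep testing
        return False
    return 1.0 / math.sqrt(total_info) <= max_sem

# Five items each contributing information 1.0 give SEM = 1/sqrt(5) ≈ .447
# (keep testing); a sixth brings SEM to 1/sqrt(6) ≈ .408 (stop).
```

The 1PL-vs-2PL contrast in the abstract falls out of this rule directly: 2PL banks let the algorithm pick high-discrimination items that contribute more information per administration, so the SEM threshold is reached in fewer items.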


Methodology ◽  
2007 ◽  
Vol 3 (1) ◽  
pp. 14-23 ◽  
Author(s):  
Juan Ramon Barrada ◽  
Julio Olea ◽  
Vicente Ponsoda

Abstract. The Sympson-Hetter (1985) method provides a means of controlling the maximum exposure rate of items in computerized adaptive testing. Through a series of simulations, control parameters are set that determine the probability that an item, once selected, is actually administered. This method presents two main problems: it requires a long computation time to calculate the parameters, and the empirical maximum exposure rate ends up slightly above the fixed limit. Van der Linden (2003) presented two alternatives that appear to solve both problems. The impact of these methods on measurement accuracy had not yet been tested. We show that these methods over-restrict the exposure of some highly discriminating items and thus decrease accuracy. It is also shown that, when the desired maximum exposure rate is near the minimum possible value, these methods yield an empirical maximum exposure rate clearly above the goal. A new method, based on an initial estimation of the probability of administration and the probability of selection of the items under the restricted method (Revuelta & Ponsoda, 1998), is presented in this paper. It can be used with the Sympson-Hetter method and with the two van der Linden methods. When used with Sympson-Hetter, this option speeds the convergence of the control parameters without decreasing accuracy.
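The core Sympson-Hetter step described above can be sketched as follows (function name, data shapes, and the fallback behavior are my assumptions, not the authors' implementation): each selected item i is administered only with probability k_i; if the probability experiment rejects it, selection falls through to the next-best candidate.

```python
import random

def administer_with_sh(ranked_items, k, rng=None):
    """Sympson-Hetter exposure control (sketch): walk candidate items in
    order of selection preference and administer item i only with
    probability k[i]; a rejected item is skipped for this examinee."""
    rng = rng or random.Random()
    for item in ranked_items:
        if rng.random() < k[item]:
            return item
    return ranked_items[-1]  # all candidates rejected: fall back to last

# An item with k = 0.0 is never administered, one with k = 1.0 always is:
given = administer_with_sh([0, 1], {0: 0.0, 1: 1.0})  # item 1
```

The expensive part the abstract refers to is not this step but the simulations needed beforehand to tune each k_i so that no item's empirical exposure rate exceeds the target.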

