Alternative item selection strategies for improving test security in the computerized adaptive testing algorithm

2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Iwan Suhardi

One ability estimation method widely applied in the Computerized Adaptive Testing (CAT) algorithm is Maximum Likelihood Estimation (MLE). MLE has the drawback that it cannot produce an ability estimate while an examinee's response pattern is still unmixed, that is, while the examinee has answered every administered item correctly or every item incorrectly. When a test taker has a score of zero or a perfect score, the ability estimate is therefore typically updated with a step-size model. However, the step-size model leads to item exposure, the phenomenon in which certain items are administered far more often than others. This makes the test insecure, because frequently administered items are easier to recognize. This study proposes an alternative item selection strategy that modifies the step-size model and then randomizes the selection among the items with the highest computed information function values. The results show that this alternative strategy produces a more varied pattern of administered items and thus improves test security in CAT.

Keywords: item exposure, step-size, adaptive testing
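A minimal sketch of the ideas described above is given below, assuming a 2PL item response model and hypothetical parameter arrays a and b: a step-size update is used while the response pattern is still all-correct or all-incorrect, and item selection randomizes among the k most informative remaining items (a "randomesque" rule). The exact step-size modification and randomization scheme used in the study are not reproduced here.

```python
import numpy as np

def p_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of each 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def update_theta(theta, responses, a_used, b_used, step_size=0.7):
    """Step-size update for unmixed patterns; one Newton step toward the MLE otherwise."""
    responses = np.asarray(responses)
    if responses.all():                      # perfect score so far
        return theta + step_size
    if not responses.any():                  # zero score so far
        return theta - step_size
    p = p_2pl(theta, a_used, b_used)
    grad = np.sum(a_used * (responses - p))  # derivative of the log-likelihood
    info = np.sum(a_used ** 2 * p * (1.0 - p))
    return theta + grad / info

def select_item_randomesque(theta, a, b, administered, k=5, rng=None):
    """Pick one item at random from the k most informative unused items."""
    rng = rng or np.random.default_rng()
    info = item_information(theta, a, b)
    info[list(administered)] = -np.inf       # exclude already administered items
    top_k = np.argsort(info)[-k:]
    return int(rng.choice(top_k))
```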

2018 ◽  
Vol 42 (8) ◽  
pp. 677-694 ◽  
Author(s):  
Dongbo Tu ◽  
Yuting Han ◽  
Yan Cai ◽  
Xuliang Gao

Multidimensional computerized adaptive testing (MCAT) has been developed over the past decades, but most MCAT procedures can handle only dichotomously scored items. Polytomously scored items, however, are widely used in a variety of tests because they provide more information and can measure complex abilities and skills. The purpose of this study is to examine item selection algorithms for MCAT with polytomously scored items (PMCAT). Several promising item selection algorithms used in MCAT are extended to PMCAT, and two new item selection methods are proposed to improve on the existing selection strategies. Two simulation studies demonstrate the feasibility of the extended and proposed methods. The results show that most of the extended item selection methods are feasible for PMCAT and that the newly proposed methods perform well. When pool security is also considered, in the two-dimensional case (Study 1) the proposed modified continuous entropy method (MCEM) performs best overall, yielding the lowest item exposure rate while maintaining relatively high accuracy. For higher dimensions (Study 2), the mutual information (MUI) and MCEM methods retain relatively high estimation accuracy, and item exposure rates decrease as the correlation between dimensions increases.
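To make the polytomous setting concrete, the following sketch shows Fisher information and maximum-information item selection for polytomously scored items under a unidimensional generalized partial credit model (GPCM). This is a simplification: the multidimensional and entropy-based criteria examined in the study (e.g., MCEM, MUI) are not reproduced, and the item bank layout is hypothetical.

```python
import numpy as np

def gpcm_probs(theta, a, thresholds):
    """Category probabilities of one GPCM item (categories 0..K)."""
    steps = np.concatenate(([0.0], a * (theta - np.asarray(thresholds))))
    numerators = np.exp(np.cumsum(steps))
    return numerators / numerators.sum()

def gpcm_information(theta, a, thresholds):
    """Fisher information of a GPCM item: a^2 times the variance of the category score."""
    probs = gpcm_probs(theta, a, thresholds)
    scores = np.arange(len(probs))
    mean_score = np.sum(scores * probs)
    return a ** 2 * np.sum((scores - mean_score) ** 2 * probs)

def select_max_info(theta, item_bank, administered):
    """Maximum-information selection over the remaining polytomous items."""
    best_item, best_info = None, -np.inf
    for idx, (a, thresholds) in enumerate(item_bank):
        if idx in administered:
            continue
        info = gpcm_information(theta, a, thresholds)
        if info > best_info:
            best_item, best_info = idx, info
    return best_item

# Example: a bank of three 4-category items, each stored as (a, [b1, b2, b3])
bank = [(1.2, [-1.0, 0.0, 1.0]), (0.8, [-0.5, 0.3, 1.5]), (1.5, [0.2, 0.8, 1.4])]
print(select_max_info(theta=0.0, item_bank=bank, administered=set()))
```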


2019 ◽  
Vol 44 (3) ◽  
pp. 182-196
Author(s):  
Jyun-Hong Chen ◽  
Hsiu-Yi Chao ◽  
Shu-Ying Chen

When computerized adaptive testing (CAT) is under stringent item exposure control, the precision of trait estimation decreases substantially. A new item selection method, the dynamic Stratification method based on Dominance Curves (SDC), is proposed to mitigate this problem by improving trait estimation. The objective of the SDC in item selection is to maximize the sum of test information over all examinees, rather than maximizing item information for an individual examinee at a single item administration, as in conventional CAT. To achieve this, the SDC uses dominance curves to stratify the item pool into as many strata as there are items in the test, so that the quality of the administered items increases as the test progresses; this reduces the likelihood that a highly discriminating item is administered to an examinee whose ability is far from the item's difficulty. Furthermore, the SDC incorporates a dynamic, on-the-fly adjustment of items across strata to make optimal use of high-quality items. Simulation studies investigated the performance of the SDC under item exposure control of varying severity. The results show that, in most conditions, the SDC improves trait estimation in CAT, yielding more precise and more accurate estimates than other methods (e.g., the maximum Fisher information method).
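The construction of the dominance curves is not detailed in the abstract, so the sketch below illustrates only the simpler, related idea it builds on: stratifying an item pool into as many strata as there are items in the test (ordered by discrimination) and, within the stratum active at the current test position, choosing the unused item whose difficulty best matches the current ability estimate. The 2PL parameterization and all names here are assumptions for illustration.

```python
import numpy as np

def stratify_by_discrimination(a_params, test_length):
    """Split item indices into test_length strata of increasing discrimination,
    so that later test positions draw from more discriminating items."""
    order = np.argsort(a_params)             # least discriminating items first
    return np.array_split(order, test_length)

def select_within_stratum(theta, stratum, b_params, administered):
    """Within the active stratum, pick the unused item whose difficulty is
    closest to the current ability estimate (b-matching)."""
    candidates = [int(i) for i in stratum if int(i) not in administered]
    if not candidates:
        return None
    return min(candidates, key=lambda i: abs(b_params[i] - theta))

# Example: a 300-item pool and a 20-item test; test position t draws from stratum t.
rng = np.random.default_rng(0)
a_params, b_params = rng.uniform(0.5, 2.0, 300), rng.normal(0.0, 1.0, 300)
strata = stratify_by_discrimination(a_params, test_length=20)
first_item = select_within_stratum(theta=0.0, stratum=strata[0],
                                   b_params=b_params, administered=set())
```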


2021 ◽  
pp. 073428292110277
Author(s):  
Ioannis Tsaousis ◽  
Georgios D. Sideridis ◽  
Hannan M. AlGhamdi

This study evaluated the psychometric quality of a computerized adaptive testing (CAT) version of the general cognitive ability test (GCAT), using the simulation study protocol put forth by Han (2018a). Three different sets of items were generated, providing an item pool of 165 items. Before evaluating the efficiency of the GCAT, all items in the final pool were linked (equated) following a sequential approach. Data were generated for 10,000 virtual examinees from a standard normal ability distribution (M = 0, SD = 1), and the ability value (θ) of each participant was estimated from the 165-item bank. Maximum Fisher information (MFI) and maximum likelihood estimation with fences (MLEF) were used as the item selection and score estimation methods, respectively, and the fade away method (FAM) was used for item exposure control. The termination criterion was a standard error of SE ≤ 0.33. The study revealed that the average number of items administered across the 10,000 examinees was 15. Moreover, the precision of the ability estimates was very high, as demonstrated by the CBIAS, CMAE, and CRMSE statistics. It is concluded that the CAT version of the test is a promising alternative to administering the corresponding full-length measure, since it reduces the number of administered items, prevents high rates of item exposure, and provides accurate scores with minimal measurement error.
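A minimal sketch of such a CAT simulation loop is shown below, assuming a 2PL model, MFI item selection, a grid-search maximum likelihood estimate in place of MLEF (no fences are applied), and termination at SE ≤ 0.33. The GCAT item pool, the linking step, and the FAM exposure control are not reproduced; all names and parameter values here are illustrative.

```python
import numpy as np

def run_cat(true_theta, a, b, se_target=0.33, max_items=30, rng=None):
    """Simulate one adaptive test: MFI item selection, grid-search MLE scoring,
    and termination once the standard error reaches se_target."""
    rng = rng or np.random.default_rng()
    grid = np.linspace(-4.0, 4.0, 161)
    theta, administered, responses = 0.0, [], []
    for _ in range(max_items):
        # Fisher information of every item at the current ability estimate
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        info = a ** 2 * p * (1.0 - p)
        info[administered] = -np.inf          # never re-administer an item
        item = int(np.argmax(info))           # maximum Fisher information (MFI)
        administered.append(item)
        # Simulate the response from the examinee's true ability
        p_true = 1.0 / (1.0 + np.exp(-a[item] * (true_theta - b[item])))
        responses.append(int(rng.random() < p_true))
        # Grid-search MLE of theta (a stand-in for MLEF; no fences applied)
        log_lik = np.zeros_like(grid)
        for i, r in zip(administered, responses):
            pg = 1.0 / (1.0 + np.exp(-a[i] * (grid - b[i])))
            log_lik += r * np.log(pg) + (1 - r) * np.log(1.0 - pg)
        theta = grid[int(np.argmax(log_lik))]
        # Stop once SE = 1 / sqrt(test information) meets the target
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        test_info = float(np.sum((a ** 2 * p * (1.0 - p))[administered]))
        if 1.0 / np.sqrt(test_info) <= se_target:
            break
    return theta, len(administered)

# Example: a synthetic 165-item pool (size chosen to mirror the study; values are made up)
rng = np.random.default_rng(1)
a, b = rng.uniform(0.8, 2.0, 165), rng.normal(0.0, 1.0, 165)
theta_hat, n_items = run_cat(true_theta=0.5, a=a, b=b, rng=rng)
```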

