A Note on the Relationship of the Shannon Entropy Procedure and the Jensen–Shannon Divergence in Cognitive Diagnostic Computerized Adaptive Testing

SAGE Open
2020, Vol. 10(1), pp. 215824401989904
Author(s): Wenyi Wang, Lihong Song, Teng Wang, Peng Gao, Jian Xiong

The purpose of this study is to investigate the relationship between the Shannon entropy procedure and the Jensen–Shannon divergence (JSD), both of which are used as item selection criteria in cognitive diagnostic computerized adaptive testing (CD-CAT). Because the JSD is itself defined in terms of the Shannon entropy, we apply the well-known relationship between the JSD and the Shannon entropy to establish a relationship between the item selection criteria based on these two measures. To better clarify the relationship between the two criteria, an alternative approach is also provided. Theoretical derivations and empirical examples show that the Shannon entropy procedure and the JSD in CD-CAT have a linear relation under cognitive diagnostic models. Consistent with our theoretical conclusions, simulation results show that the two item selection criteria behaved quite similarly in terms of attribute-level and pattern recovery rates under all conditions, and they selected the same set of items for each examinee from an item bank with item parameters drawn from a uniform distribution U(0.1, 0.3) under post hoc simulations. We provide some suggestions for future studies and a discussion of the relationship between the modified posterior-weighted Kullback–Leibler index and the G-DINA (generalized deterministic inputs, noisy "and" gate) discrimination index.
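The linear relation reported here follows from the standard identity defining the weighted JSD through Shannon entropies, assuming the JSD index is the posterior-weighted JSD among an item's conditional response distributions: that quantity equals the mutual information between the response and the attribute pattern, and hence equals the entropy of the current posterior minus the expected posterior entropy that the Shannon entropy procedure minimizes. The sketch below is a minimal numerical check of this identity for a single dichotomous item, using a made-up posterior and made-up response probabilities rather than anything from the article.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (base 2) of a discrete distribution; 0*log 0 treated as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log2(p[nz]))

def she_index(posterior, p_correct):
    """Shannon entropy procedure: expected entropy of the updated posterior over
    attribute patterns after observing the response to a candidate item.

    posterior : current posterior over the C attribute patterns, shape (C,)
    p_correct : P(X = 1 | pattern c) for the candidate item, shape (C,)
    """
    px1 = float(np.sum(posterior * p_correct))      # marginal P(X = 1)
    px0 = 1.0 - px1
    post_x1 = posterior * p_correct / px1           # Bayes update given X = 1
    post_x0 = posterior * (1.0 - p_correct) / px0   # Bayes update given X = 0
    return px0 * shannon_entropy(post_x0) + px1 * shannon_entropy(post_x1)

def jsd_index(posterior, p_correct):
    """Posterior-weighted Jensen-Shannon divergence of the C conditional
    response distributions {P(X | pattern c)} for the candidate item."""
    px1 = float(np.sum(posterior * p_correct))
    mixture = np.array([1.0 - px1, px1])
    within = np.array([shannon_entropy([1.0 - p, p]) for p in p_correct])
    return shannon_entropy(mixture) - float(np.sum(posterior * within))

# Linear relation: JSD_j = H(current posterior) - SHE_j for every candidate item,
# so maximizing the JSD and minimizing the expected entropy pick the same item.
rng = np.random.default_rng(0)
posterior = rng.dirichlet(np.ones(8))          # 2^3 hypothetical attribute patterns
p_correct = rng.uniform(0.1, 0.9, size=8)      # hypothetical response probabilities
lhs = jsd_index(posterior, p_correct)
rhs = shannon_entropy(posterior) - she_index(posterior, p_correct)
assert np.isclose(lhs, rhs)
```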

2012, Vol. 43(2), pp. 203-212
Author(s): Xiao-Yang CHENG, Shu-Liang DING, Shen-Hai YAN, Long-Yin ZHU

2014, Vol. 30(1), pp. 48-56
Author(s): Tuulia M. Ortner, Eva Weißkopf, Tobias Koch

We investigated the effects of computerized adaptive testing (CAT) versus computerized fixed item testing (FIT) of reasoning ability on current motivation, in terms of situational fear of failure and subjective probability of success, as well as flow. A group of 174 students (aged 15–21) from two German secondary schools was presented with either a CAT or a FIT version of a matrices test; motivational variables were assessed during a short break in testing. Participants reported more situational fear of failure and lower subjective probability of success with CAT than with FIT. Self-reported flow did not differ between test mode conditions. When we addressed the hypothesis that adaptive testing is equally motivating for high and low performers, test performance appeared to moderate the relationship of test mode and subjective probability of success: only during FIT was subjective probability of success higher with lower test performance. This moderation effect was also found for the relationship of test mode and flow. However, because average reported motivation was lower during CAT, the results contradict assumptions of enhanced motivation during CAT. Results are discussed in relation to the self-concept relevance of the testing domains and with reference to test fairness.


2018, Vol. 79(2), pp. 335-357
Author(s): Chuan-Ju Lin, Hua-Hua Chang

For item selection in cognitive diagnostic computerized adaptive testing (CD-CAT), ideally, a single item selection index should be created to simultaneously regulate precision, exposure status, and attribute balancing. For this purpose, in this study, we first proposed an attribute-balanced item selection criterion, namely, the standardized weighted deviation global discrimination index (SWDGDI), and subsequently formulated the constrained progressive index (CP_SWDGDI) by casting the SWDGDI in a progressive algorithm. A simulation study revealed that the SWDGDI method was effective in balancing attribute coverage and the CP_SWDGDI method was able to simultaneously balance attribute coverage and item pool usage while maintaining acceptable estimation precision. This research also demonstrates the advantage of a relatively low number of attributes in CD-CAT applications.
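The abstract does not spell out the CP_SWDGDI formula, so the sketch below only illustrates the generic progressive weighting idea it builds on: a random component dominates early in the test (spreading item exposure), and the discrimination component, for instance an attribute-weighted index such as the SWDGDI, dominates later (protecting precision). The function name and values are hypothetical; this is not the authors' exact criterion.

```python
import numpy as np

def progressive_score(info, n_administered, test_length, rng):
    """Generic progressive item selection score: a random component dominates
    early in the test and the discrimination component dominates late.

    info : discrimination value of each remaining candidate item, e.g. an
           attribute-weighted index, shape (J,)
    """
    w = n_administered / test_length                         # weight grows from 0 to 1
    noise = rng.uniform(0.0, float(np.max(info)), size=len(info))
    return (1.0 - w) * noise + w * np.asarray(info, dtype=float)

# Hypothetical use: choose the next item from five candidates at position 3 of 20.
rng = np.random.default_rng(1)
candidate_info = [0.40, 0.55, 0.35, 0.62, 0.48]
next_item = int(np.argmax(progressive_score(candidate_info, 3, 20, rng)))
```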


Author(s): Dong Gi Seo, Jeongwook Choi

Purpose: Computerized adaptive testing (CAT) has been adopted in licensing examinations because, as many studies have shown, it improves the efficiency and accuracy of testing. This simulation study investigated CAT scoring and item selection methods for the Korean Medical Licensing Examination (KMLE). Methods: The study used a post-hoc (real-data) simulation design. The item bank included all items from the January 2017 KMLE. All CAT algorithms were implemented using the 'catR' package in R. Results: In terms of accuracy, the Rasch and two-parameter logistic (2PL) models performed better than the three-parameter logistic (3PL) model. The modal a posteriori (MAP) and expected a posteriori (EAP) methods provided more accurate estimates than maximum likelihood estimation or weighted likelihood estimation. Furthermore, maximum posterior-weighted information and minimum expected posterior variance performed better than the other item selection methods. In terms of efficiency, the Rasch model is recommended for reducing test length. Conclusion: Before implementing a live CAT, a simulation study should be performed under varied test conditions, and specific scoring and item selection methods should be determined in advance based on its results.
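As one concrete illustration of the scoring methods compared above, the following sketch computes an expected a posteriori (EAP) ability estimate under the 2PL with a standard-normal prior via quadrature on a theta grid. The item parameters and responses are made up, and the code is a stand-alone sketch rather than the 'catR' implementation used in the study.

```python
import numpy as np

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def eap_estimate(responses, a, b, grid=None):
    """Expected a posteriori (EAP) ability estimate under the 2PL with a
    standard-normal prior, computed by numerical quadrature on a theta grid."""
    if grid is None:
        grid = np.linspace(-4.0, 4.0, 81)
    prior = np.exp(-0.5 * grid ** 2)              # unnormalized N(0, 1) density
    likelihood = np.ones_like(grid)
    for x, ai, bi in zip(responses, a, b):
        p = p_2pl(grid, ai, bi)
        likelihood *= p ** x * (1.0 - p) ** (1 - x)
    posterior = prior * likelihood
    posterior /= posterior.sum()
    return float(np.sum(grid * posterior))

# Hypothetical example: three administered items with made-up 2PL parameters.
theta_hat = eap_estimate([1, 0, 1], a=[1.2, 0.8, 1.5], b=[-0.5, 0.3, 0.8])
```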


1999, Vol. 15(2), pp. 91-98
Author(s): Lutz F. Hornke

Summary: Item parameters for several hundred items were estimated from empirical data on several thousand subjects. Estimates under the one-parameter (1PL) and two-parameter (2PL) logistic models were evaluated. Model fit showed, however, that only a subset of items complied sufficiently with each model; these items were assembled into well-fitting item banks. In several simulation studies, 5,000 simulated response records, together with person parameters, were generated according to a computerized adaptive testing procedure. A general reliability of .80 or a standard error of measurement of .44 was used as the stopping rule to end CAT testing. We also recorded how often each item was used across all simulees. Person-parameter estimates based on CAT correlated higher than .90 with the simulated true values. For all item banks fitting the 1PL, most simulees needed more than 20 but fewer than 30 items to reach the preset level of measurement error. For item banks complying with the 2PL, however, an average of only 10 items was sufficient to end testing at the same measurement error level. Both results clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday use will show whether these trends hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.
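The paired .80/.44 stopping rule is consistent with the classical link between reliability and the standard error of measurement on a standard-normal ability metric, SEM = SD·√(1 − ρ) with SD = 1, so a reliability of .80 corresponds to SEM = √0.20 ≈ 0.447. A minimal sketch of such a stopping check follows; the should_stop helper is hypothetical, not code from the study.

```python
import math

# On a standard-normal ability metric, reliability rho and the standard error of
# measurement (SEM) are linked by rho = 1 - SEM**2, so a target reliability of
# .80 corresponds to SEM = sqrt(0.20) ~ 0.447 -- the ".44" stopping rule.
target_reliability = 0.80
sem_threshold = math.sqrt(1.0 - target_reliability)

def should_stop(test_information):
    """Hypothetical stopping check: end the CAT once the current SEM
    (1 / sqrt of the accumulated test information) falls below the threshold."""
    return 1.0 / math.sqrt(test_information) <= sem_threshold
```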

