Measuring language ability of students with compensatory multidimensional CAT: A post-hoc simulation study

Author(s):  
Burhanettin Ozdemir ◽  
Selahattin Gelbal

Computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. Multidimensional CAT (MCAT) designs differ in the item selection, ability estimation, and termination methods being used. This study aims to investigate the performance of MCAT designs used to measure the language ability of students and to compare their results with the outcomes of the corresponding paper–pencil tests. For this purpose, items from the English Proficiency Tests (EPT) were used to create a multidimensional item pool consisting of 599 items. The performance of the MCAT designs was evaluated and compared based on reliability coefficients, root mean square error (RMSE), test length, and root mean squared difference (RMSD) statistics. In total, 36 different conditions were investigated. The results of the post-hoc simulation designs indicate that MCAT designs with the A-optimality item selection method outperformed MCAT designs with other item selection methods by decreasing the test length and RMSD values without any sacrifice in test reliability. Additionally, for the MCAT algorithm combining A-optimality item selection with the Bayesian MAP estimation method, the best error-variance stopping rule could be considered to be 0.25, yielding an average test length of 27.9 items, and the best fixed test-length stopping rule to be 30 items. Overall, MCAT designs tend to decrease the test length by 60 to 65 percent and provide ability estimates with higher precision compared with traditional paper–pencil tests of 65 to 75 items. Therefore, it is suggested that the A-optimality method be used for item selection and the Bayesian MAP method for ability estimation in MCAT designs, since the MCAT algorithm with these specifications performs better than the others.
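The abstract above pairs A-optimality item selection with Bayesian MAP scoring and an error-variance stopping rule. The sketch below is a minimal post-hoc-style simulation of that combination for a compensatory multidimensional 2PL model; the pool size, item parameter distributions, and two-dimensional structure are illustrative assumptions, not the EPT item pool used in the study.

```python
# Hypothetical MCAT sketch: compensatory M2PL, A-optimality selection,
# Bayesian MAP scoring, error-variance stopping rule. All numbers are assumed.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
D, POOL = 2, 300                       # assumed: 2 dimensions, 300-item pool
a = rng.uniform(0.8, 2.0, (POOL, D))   # discrimination vectors
d = rng.normal(0.0, 1.0, POOL)         # intercepts

def prob(theta, j):
    """Compensatory M2PL probability of a correct response to item j."""
    return 1.0 / (1.0 + np.exp(-(a[j] @ theta + d[j])))

def item_info(theta, j):
    """Fisher information matrix contributed by item j at theta."""
    p = prob(theta, j)
    return p * (1.0 - p) * np.outer(a[j], a[j])

def map_estimate(items, resp, prior_prec):
    """Bayesian MAP ability estimate under a multivariate normal prior."""
    def neg_log_post(theta):
        z = a[items] @ theta + d[items]
        ll = np.sum(resp * z - np.log1p(np.exp(z)))
        return -ll + 0.5 * theta @ prior_prec @ theta
    return minimize(neg_log_post, np.zeros(D), method="BFGS").x

def run_mcat(true_theta, max_len=40, var_stop=0.25):
    prior_prec = np.eye(D)             # standard MVN prior precision
    info = prior_prec.copy()           # accumulated posterior information
    items, resp = [], []
    theta = np.zeros(D)
    for _ in range(max_len):
        # A-optimality: pick the item minimizing the trace of the posterior covariance
        candidates = [j for j in range(POOL) if j not in items]
        j_best = min(candidates,
                     key=lambda j: np.trace(np.linalg.inv(info + item_info(theta, j))))
        items.append(j_best)
        resp.append(rng.random() < prob(true_theta, j_best))
        theta = map_estimate(np.array(items), np.array(resp, float), prior_prec)
        info = prior_prec + sum(item_info(theta, j) for j in items)
        # stop once every dimension reaches the target error variance
        if np.all(np.diag(np.linalg.inv(info)) < var_stop):
            break
    return theta, len(items)

theta_hat, used = run_mcat(np.array([0.5, -0.3]))
print(f"estimate = {theta_hat.round(2)}, test length = {used}")
```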

2020 ◽  
Vol 44 (7-8) ◽  
pp. 531-547
Author(s):  
Johan Braeken ◽  
Muirne C. S. Paap

Fixed-precision between-item multidimensional computerized adaptive tests (MCATs) are becoming increasingly popular. The current generation of item-selection rules used in these types of MCATs typically optimize a single-valued objective criterion for multivariate precision (e.g., Fisher information volume). In contrast, when all dimensions are of interest, the stopping rule is typically defined in terms of a required fixed marginal precision per dimension. This asymmetry between multivariate precision for selection and marginal precision for stopping, which is not present in unidimensional computerized adaptive tests, has received little attention thus far. In this article, we will discuss this selection-stopping asymmetry and its consequences, and introduce and evaluate three alternative item-selection approaches. These alternatives are computationally inexpensive, easy to communicate and implement, and result in effective fixed-marginal-precision MCATs that are shorter in test length than those produced by the current generation of item-selection approaches.
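The asymmetry described above can be made concrete with a small sketch: selection optimizes a single multivariate criterion (here the determinant of the accumulated Fisher information, i.e., its "volume"), while stopping is checked against a fixed marginal standard error per dimension. The between-item pool and the numeric values below are assumptions for illustration, not the article's simulation setup.

```python
# Assumed illustration of the selection/stopping asymmetry in a between-item MCAT.
import numpy as np

rng = np.random.default_rng(1)
D = 3
info = np.eye(D) * 0.5                 # assumed prior information

def select_by_volume(info, candidate_infos):
    """Single-valued multivariate criterion: maximize the determinant (information volume)."""
    return max(range(len(candidate_infos)),
               key=lambda j: np.linalg.det(info + candidate_infos[j]))

def marginal_precision_met(info, se_target=0.5):
    """Fixed-precision stopping: every dimension's marginal SE below the target."""
    return bool(np.all(np.sqrt(np.diag(np.linalg.inv(info))) < se_target))

n_items = 0
while not marginal_precision_met(info) and n_items < 100:
    # between-item pool: each candidate item informs exactly one dimension
    cands = []
    for _ in range(20):
        m = np.zeros((D, D))
        k = rng.integers(D)
        m[k, k] = rng.uniform(0.2, 0.8)
        cands.append(m)
    info = info + cands[select_by_volume(info, cands)]
    n_items += 1
print(f"stopped after {n_items} items")
```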


At-Turats ◽  
2014 ◽  
Vol 8 (2) ◽  
Author(s):  
Ali Hasmy

This research was conducted to examine the effect of the number of testees (sample size), the number of items (test length), the number of options, and the index of difficulty on various item discrimination statistics and test reliability. The data were simulated and analyzed using the Test Analysis Program (TAP) version 6.65 with a full factorial design. In general, the results show that the number of testees, the number of items, and the index of difficulty (but not the number of options) significantly affect the various item discrimination statistics and test reliability. The only statistics robust to these three factors are the Mean of Item Discrimination and the Spearman-Brown 1-2 Split-Half, while the most sensitive is the Split-Half Odd-Even statistic.
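For readers unfamiliar with the statistics compared above, the following sketch computes two of them on simulated dichotomous data: an odd-even split-half reliability with the Spearman-Brown correction, and an upper-lower item discrimination index. The data-generating model and the 27% group fraction are assumptions for illustration, not TAP's internals.

```python
# Illustrative split-half reliability and item discrimination on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n_testees, n_items = 200, 40
theta = rng.normal(size=(n_testees, 1))            # testee ability (assumed model)
b = rng.normal(size=(1, n_items))                  # item difficulty
scores = (rng.random((n_testees, n_items)) < 1 / (1 + np.exp(b - theta))).astype(int)

def split_half_odd_even(x):
    """Correlate odd- and even-item half scores, then apply the Spearman-Brown correction."""
    odd, even = x[:, 0::2].sum(axis=1), x[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)

def item_discrimination(x, frac=0.27):
    """Upper-lower discrimination: proportion correct in the top group minus the bottom group."""
    n = max(1, int(frac * x.shape[0]))
    order = np.argsort(x.sum(axis=1))
    return x[order[-n:]].mean(axis=0) - x[order[:n]].mean(axis=0)

print("odd-even split-half (SB-corrected):", round(split_half_odd_even(scores), 3))
print("mean item discrimination:", round(item_discrimination(scores).mean(), 3))
```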


2014 ◽  
Author(s):  
Niels Kildebro ◽  
Ilda Amirian ◽  
Ismail Gögenur ◽  
Jacob Rosenberg

Objectives: To determine test-retest reliability and construct validity of the star-track test of manual dexterity. Design: Test-retest reliability was tested in a controlled study. Construct validity was tested in a blinded randomized crossover study. Setting: The study was performed at a university hospital in Denmark. Participants: A total of 11 subjects for the test-retest study and 20 subjects for the construct validity study were included. All were healthy volunteers. Intervention: The test-retest trial had two measurements with a 2-day pause in between. The interventions in the construct validity study included a baseline measurement, intervention 1: fatigue, intervention 2: stress, and intervention 3: fatigue and stress, with a 2-day pause between each intervention. Main outcome measure: An integrated measure of completion time and number of errors. Results: All participants completed the study (test-retest n = 11; construct validity n = 20). Test-retest showed a strong Pearson product-moment correlation (r = 0.90, n = 11, p < 0.01) with no sign of a learning effect. The 20 subjects in the construct validity trial were randomized to the order of the four interventions, so that all subjects completed each intervention once. A repeated-measures ANOVA determined that the mean integrated measure differed between interventions (p = 0.003). Post hoc tests using Bonferroni correction revealed that, compared with baseline, all interventions had significantly higher integrated scores, with differences in means ranging from 47% to 59%. Conclusion: The star-track test of manual dexterity had strong test-retest reliability and was able to discriminate between a subject's normal manual dexterity and dexterity after exposure to fatigue and/or stress.
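A hedged sketch of the analysis pipeline reported above, with fabricated example data rather than the study's measurements: a Pearson test-retest correlation, and Bonferroni-corrected paired comparisons of each intervention against baseline as a simplified stand-in for the repeated-measures ANOVA post hoc tests.

```python
# Fabricated example data; illustrates the reported statistical tests only.
import numpy as np
from scipy.stats import pearsonr, ttest_rel

rng = np.random.default_rng(0)

# test-retest: two sessions for the same 11 subjects
session1 = rng.normal(100, 15, 11)
session2 = session1 + rng.normal(0, 7, 11)
r, p = pearsonr(session1, session2)
print(f"test-retest r = {r:.2f} (p = {p:.3f})")

# construct validity: baseline vs. three interventions for 20 subjects
baseline = rng.normal(100, 15, 20)
interventions = {
    "fatigue": baseline * rng.normal(1.5, 0.2, 20),
    "stress": baseline * rng.normal(1.5, 0.2, 20),
    "fatigue+stress": baseline * rng.normal(1.6, 0.2, 20),
}
alpha, k = 0.05, len(interventions)
for name, scores in interventions.items():
    t, p = ttest_rel(scores, baseline)
    print(f"{name}: p = {p:.4f}, significant after Bonferroni: {p < alpha / k}")
```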


SAGE Open ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 215824401989904
Author(s):  
Wenyi Wang ◽  
Lihong Song ◽  
Teng Wang ◽  
Peng Gao ◽  
Jian Xiong

The purpose of this study is to investigate the relationship between the Shannon entropy procedure and the Jensen–Shannon divergence (JSD) that are used as item selection criteria in cognitive diagnostic computerized adaptive testing (CD-CAT). Because the JSD itself is defined by the Shannon entropy, we apply the well-known relationship between the JSD and Shannon entropy to establish a relationship between the item selection criteria that are based on these two measures. To clarify the relationship between these two item selection criteria further, an alternative derivation is also provided. Theoretical derivations and empirical examples have shown that the Shannon entropy procedure and the JSD in CD-CAT have a linear relation under cognitive diagnostic models. Consistent with our theoretical conclusions, simulation results have shown that the two item selection criteria behaved quite similarly in terms of attribute-level and pattern recovery rates under all conditions, and they selected the same set of items for each examinee from an item bank with item parameters drawn from a uniform distribution U(0.1, 0.3) under post hoc simulations. We provide some suggestions for future studies and a discussion of the relationship between the modified posterior-weighted Kullback–Leibler index and the G-DINA (generalized deterministic inputs, noisy "and" gate) discrimination index.
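The linear relation described above can be checked numerically. The sketch below uses an assumed DINA-style item bank and a random posterior over attribute patterns; for each candidate item it computes the expected posterior Shannon entropy and the JSD of the response-conditional posteriors, and verifies that their sum equals the entropy of the current posterior, so minimizing expected entropy and maximizing JSD select the same item.

```python
# Assumed DINA-like setup; numerically checks SHE_j + JSD_j = H(current posterior).
import numpy as np

rng = np.random.default_rng(0)
K = 3
patterns = np.array([[int(c) for c in np.binary_repr(i, K)] for i in range(2 ** K)])
post = rng.dirichlet(np.ones(2 ** K))          # current posterior over attribute patterns

J = 30
Q = rng.integers(0, 2, (J, K)); Q[Q.sum(1) == 0, 0] = 1   # Q-matrix without empty rows
guess = rng.uniform(0.1, 0.3, J)
slip = rng.uniform(0.1, 0.3, J)

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def p_correct(j):
    """DINA success probability of item j for every attribute pattern."""
    eta = np.all(patterns >= Q[j], axis=1)     # masters all required attributes
    return np.where(eta, 1 - slip[j], guess[j])

she, jsd = [], []
for j in range(J):
    pc = p_correct(j)
    w1 = np.sum(pc * post)                     # predictive P(correct)
    w0 = 1.0 - w1
    post1 = pc * post / w1                     # posterior after a correct response
    post0 = (1 - pc) * post / w0               # posterior after an incorrect response
    exp_H = w1 * entropy(post1) + w0 * entropy(post0)   # Shannon entropy criterion (minimized)
    mix = w1 * post1 + w0 * post0              # predictive mixture (= current posterior)
    she.append(exp_H)
    jsd.append(entropy(mix) - exp_H)           # JSD criterion (maximized)

she, jsd = np.array(she), np.array(jsd)
print("max |SHE + JSD - H(posterior)| =", np.abs(she + jsd - entropy(post)).max())
```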


1993 ◽  
Vol 18 (2) ◽  
pp. 197-206 ◽  
Author(s):  
George A. Marcoulides

Generalizability theory provides a framework for examining the dependability of behavioral measurements. When designing generalizability studies, two important statistical issues are generally considered: power and measurement error. Control over power and error of measurement can be obtained by manipulation of sample size and/or test reliability. In generalizability theory, the mean error variance is an estimate that takes into account both these statistical issues. When limited resources are available, determining an optimal measurement design is not a simple task. This article presents a methodology for minimizing mean error variance in generalizability studies when resource constraints are imposed.
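As a concrete illustration of the optimization problem described above (with made-up variance components and costs, not the article's methodology), the sketch below searches for the facet sample sizes that minimize relative error variance in a two-facet p × i × o design under a simple linear cost constraint.

```python
# Assumed variance components and costs; grid search for a budget-constrained design.
import itertools

var_pi, var_po, var_pio = 0.20, 0.10, 0.30     # assumed G-study variance components
cost_item, cost_occasion, budget = 2.0, 5.0, 120.0

def rel_error_variance(n_i, n_o):
    """Relative error variance for a p x i x o design with n_i items and n_o occasions."""
    return var_pi / n_i + var_po / n_o + var_pio / (n_i * n_o)

best = min(
    ((n_i, n_o) for n_i, n_o in itertools.product(range(1, 61), range(1, 25))
     if n_i * cost_item + n_o * cost_occasion <= budget),
    key=lambda design: rel_error_variance(*design),
)
print(f"optimal design: n_i = {best[0]}, n_o = {best[1]}, "
      f"error variance = {rel_error_variance(*best):.4f}")
```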

