Improved Wald Statistics for Item-Level Model Comparison in Diagnostic Classification Models

Abstract: The current study compared the model fit indices, skill mastery probabilities, and classification accuracy of six Diagnostic Classification Models (DCMs): a general model (G-DINA) and five specific models (LLM, RRUM, ACDM, DINA, and DINO). To do so, response data from the grammar and vocabulary sections of a General English Achievement Test, designed from scratch for cognitive diagnostic purposes, were analyzed. The test-level model fit values provided strong evidence that the G-DINA and LLM models had the best fit. In addition, the fit values of the ACDM and RRUM were almost identical to those of the G-DINA. The fit indices of the DINO and DINA models were very close to each other but larger than those of the G-DINA and LLM. Model fit was also investigated at the item level, and the results revealed that model selection should be performed at the item level rather than the test level, and that most of the specific models might perform well for the test. The findings suggest that the relationships among the attributes of grammar and vocabulary are not ‘either-or’ compensatory or non-compensatory but a combination of both.

Diagnostic Classification Models for Ordinal Item Responses

Frontiers in Psychology ◽ 10.3389/fpsyg.2018.02512 ◽ 2018 ◽ Vol 9 ◽ Cited By ~ 3

Author(s): Ren Liu ◽ Zhehan Jiang

Keyword(s): Diagnostic Classification ◽ Classification Models ◽ Diagnostic Classification Models ◽ Item Responses

Reliably Assessing Growth with Longitudinal Diagnostic Classification Models

Educational Measurement: Issues and Practice ◽ 10.1111/emip.12243 ◽ 2019 ◽ Vol 38 (2) ◽ pp. 68-78

Author(s): Matthew J. Madison

Keyword(s): Diagnostic Classification ◽ Classification Models ◽ Diagnostic Classification Models

Q-Matrix Designs of Longitudinal Diagnostic Classification Models With Hierarchical Attributes for Formative Assessment

Frontiers in Psychology ◽ 10.3389/fpsyg.2020.01694 ◽ 2020 ◽ Vol 11

Author(s): Wei Tian ◽ Jiahui Zhang ◽ Qian Peng ◽ Xiaoguang Yang

Keyword(s): Formative Assessment ◽ Diagnostic Classification ◽ Classification Models ◽ Diagnostic Classification Models ◽ Q Matrix

Using the iterative latent-class analysis approach to improve attribute accuracy in diagnostic classification models

Behavior Research Methods ◽ 10.3758/s13428-018-01191-0 ◽ 2019 ◽ Vol 51 (3) ◽ pp. 1075-1084

Author(s): Zhehan Jiang

Keyword(s): Latent Class Analysis ◽ Latent Class ◽ Analysis Approach ◽ Diagnostic Classification ◽ Class Analysis ◽ Classification Models ◽ Diagnostic Classification Models

Nested diagnostic classification models for multiple‐choice items

British Journal of Mathematical and Statistical Psychology ◽ 10.1111/bmsp.12214 ◽ 2020

Author(s): Ren Liu ◽ Haiyan Liu

Keyword(s): Multiple Choice ◽ Diagnostic Classification ◽ Classification Models ◽ Diagnostic Classification Models ◽ Multiple Choice Items

The Reliability of the Posterior Probability of Skill Attainment in Diagnostic Classification Models

Journal of Educational and Behavioral Statistics ◽ 10.3102/1076998619864550 ◽ 2019 ◽ Vol 45 (1) ◽ pp. 5-31

Author(s): Matthew S. Johnson ◽ Sandip Sinharay

Keyword(s): Posterior Probability ◽ Real Data ◽ Exact Calculation ◽ Diagnostic Classification ◽ Classification Models ◽ Diagnostic Classification Models ◽ Reliability Measure ◽ Skill Mastery

One common score reported from diagnostic classification assessments is the vector of posterior means of the skill mastery indicators. As with any assessment, it is important to derive and report estimates of the reliability of the reported scores. After reviewing a reliability measure suggested by Templin and Bradshaw, this article suggests three new measures of reliability of the posterior means of skill mastery indicators and methods for estimating the measures when the number of items on the assessment and the number of skills being assessed render exact calculation computationally burdensome. The utility of the new measures is demonstrated using simulated and real data examples. Two of the suggested measures are recommended for future use.
