Bayesian Psychometrics for Diagnostic Assessments: A Proof of Concept

2020
Author(s):  
W. Jake Thompson

Diagnostic assessments measure the knowledge, skills, and understandings of students at a smaller and more actionable grain size than traditional scale-score assessments. Results of diagnostic assessments are reported as a mastery profile, indicating which knowledge, skills, and understandings the student has mastered and which ones may need more instruction. These mastery decisions are based on probabilities of mastery derived from diagnostic classification models (DCMs). This report outlines a Bayesian framework for the estimation and evaluation of DCMs. Findings illustrate the utility of the Bayesian framework for estimating and evaluating DCMs in applied settings. Specifically, the findings demonstrate how a variety of DCMs can be defined within the same conceptual framework. Additionally, using this framework, the evaluation of model fit is more straightforward and easier to interpret with intuitive graphics. Throughout, recommendations are made for specific implementation decisions for the estimation process and the assessment of model fit.
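The mastery decisions described above can be illustrated with a minimal sketch of the DINA model, one of the DCMs discussed in this literature. All numbers (slip, guess, the threshold of 0.5) are illustrative assumptions, not values from the report; the report's actual Bayesian estimation is far richer than this rule.

```python
def dina_prob(alpha, q_row, slip, guess):
    """P(correct response) for one item under the DINA item response rule.

    alpha : 0/1 attribute-mastery profile of the examinee.
    q_row : 0/1 Q-matrix row listing the attributes the item requires.
    eta = 1 only when every required attribute is mastered; then the
    examinee answers correctly unless they "slip", otherwise they can
    only "guess".
    """
    eta = all(a >= q for a, q in zip(alpha, q_row))
    return (1 - slip) if eta else guess

def mastery_profile(posterior_probs, threshold=0.5):
    """Turn per-attribute posterior mastery probabilities into the
    reported mastery profile (1 = mastered, 0 = needs instruction)."""
    return [1 if p >= threshold else 0 for p in posterior_probs]
```

For example, with slip = 0.1 and guess = 0.2, an examinee who has mastered all required attributes answers correctly with probability 0.9, while one missing a required attribute succeeds only with the guessing probability 0.2.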

2021
Vol 11 (1) ◽  
Author(s):  
Mahdieh Shafipoor ◽  
Hamdollah Ravand ◽  
Parviz Maftoon

The current study compared the model fit indices, skill mastery probabilities, and classification accuracy of six diagnostic classification models (DCMs): a general model (G-DINA) against five specific models (LLM, RRUM, ACDM, DINA, and DINO). To do so, response data from the grammar and vocabulary sections of a General English Achievement Test, designed from scratch specifically for cognitive diagnostic purposes, were analyzed. The test-level model fit values provided strong evidence that the G-DINA and LLM models had the best fit. In addition, the fit of the ACDM and RRUM was nearly identical to that of the G-DINA. The fit indices of the DINO and DINA models were close to each other but larger than those of the G-DINA and LLM. Model fit was also investigated at the item level, and the results revealed that model selection should be performed at the item level rather than the test level, and that most of the specific models might perform well for the test. The findings of this study suggested that the relationships among the attributes of grammar and vocabulary are not 'either-or' compensatory or non-compensatory but a combination of both.
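Test-level model comparisons of the kind described above commonly rely on information criteria that penalize log-likelihood by model complexity. A minimal sketch, with hypothetical log-likelihoods and parameter counts rather than this study's values:

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: lower is better."""
    return -2 * loglik + 2 * n_params

def bic(loglik, n_params, n_examinees):
    """Bayesian information criterion: penalizes parameters more
    heavily than AIC as the sample size grows."""
    return -2 * loglik + n_params * math.log(n_examinees)

# Illustrative comparison: a general model (more parameters) vs. a
# specific model (fewer parameters) on hypothetical fits.
general_aic = aic(loglik=-4980.0, n_params=60)
specific_aic = aic(loglik=-5005.0, n_params=35)
```

A general model such as the G-DINA will never have a worse log-likelihood than a specific model nested within it, so the penalty term is what lets a simpler model like the LLM tie or win the comparison.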


2019
Vol 45 (1)
pp. 5-31
Author(s):  
Matthew S. Johnson ◽  
Sandip Sinharay

One common score reported from diagnostic classification assessments is the vector of posterior means of the skill mastery indicators. As with any assessment, it is important to derive and report estimates of the reliability of the reported scores. After reviewing a reliability measure suggested by Templin and Bradshaw, this article suggests three new measures of reliability of the posterior means of skill mastery indicators and methods for estimating the measures when the number of items on the assessment and the number of skills being assessed render exact calculation computationally burdensome. The utility of the new measures is demonstrated using simulated and real data examples. Two of the suggested measures are recommended for future use.
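The score this abstract refers to, the posterior mean of each skill mastery indicator, can be sketched directly: it is the total posterior mass on the attribute profiles in which that skill is mastered. The profiles and posterior probabilities below are illustrative numbers, not the article's data, and the article's reliability measures themselves are not reproduced here.

```python
# All attribute profiles for a 2-skill assessment, and a hypothetical
# posterior distribution over them for one examinee.
profiles = [(0, 0), (0, 1), (1, 0), (1, 1)]
posterior = [0.05, 0.15, 0.10, 0.70]

# Posterior mean of skill k's mastery indicator = sum of posterior
# probability over profiles that include skill k.
post_means = [
    sum(p * profile[k] for p, profile in zip(posterior, profiles))
    for k in range(2)
]
# skill 1: 0.10 + 0.70 = 0.80 ; skill 2: 0.15 + 0.70 = 0.85
```

The computational burden the article addresses arises because the number of profiles is 2^K for K skills, so exact sums like the one above become expensive as K and the item count grow.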

