Estimating Probabilities of Passing for Examinees With Incomplete Data in Mastery Tests

2021, pp. 001316442110237
Author(s): Sandip Sinharay

Administrative problems such as computer malfunctions and power outages occasionally lead to missing item scores and hence to incomplete data on mastery tests such as the AP and U.S. Medical Licensing examinations. Investigators are often interested in estimating the probabilities of passing for examinees with incomplete data on mastery tests, yet there is little research on this estimation problem. The goal of this article is to suggest two new approaches, one based on classical test theory and one on item response theory, for estimating the probabilities of passing for examinees with incomplete data on mastery tests. The two approaches are demonstrated to have high accuracy and negligible misclassification rates.
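The abstract does not spell out the estimators, but the IRT-style idea can be illustrated with a short sketch: under an assumed Rasch model with previously calibrated item difficulties, integrate the passing decision over the posterior of ability given only the observed responses. The function name, grid settings, and the Rasch choice are illustrative assumptions, not the paper's method.

```python
import numpy as np

def prob_pass_rasch(responses, difficulties, cutoff,
                    n_grid=201, n_sim=5000, seed=0):
    """Posterior probability of passing under an assumed Rasch model.

    responses    : 0/1 item scores with np.nan for missing items
    difficulties : calibrated item difficulties (taken as known here)
    cutoff       : passing raw score on the full-length test
    """
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(responses)
    x = np.nan_to_num(responses)                      # zeros in missing cells, masked below

    theta = np.linspace(-4.0, 4.0, n_grid)            # ability grid
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulties[None, :])))

    # Posterior over theta: likelihood of the observed items times a N(0, 1) prior
    item_lik = np.where(observed, p**x * (1.0 - p)**(1.0 - x), 1.0)
    post = item_lik.prod(axis=1) * np.exp(-0.5 * theta**2)
    post /= post.sum()

    # Monte Carlo: draw theta, impute the missing items, score, compare to cutoff
    obs_score = x[observed].sum()
    idx = rng.choice(n_grid, size=n_sim, p=post)
    p_missing = p[idx][:, ~observed]                  # P(correct) for each missing item
    imputed = rng.random(p_missing.shape) < p_missing
    totals = obs_score + imputed.sum(axis=1)
    return float((totals >= cutoff).mean())
```

For example, prob_pass_rasch(np.array([1.0, np.nan, 0.0, 1.0]), np.array([-0.5, 0.0, 0.5, 1.0]), cutoff=3) returns the chance that the examinee would have reached a raw score of 3 had the second item been administered.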

2020, Vol 29 (4), pp. 996-1014
Author(s): R Gorter, J-P Fox, I Eekhout, MW Heymans, JWR Twisk

In medical research, repeated questionnaire data are often used to measure and model latent variables across time. Through a novel imputation method, a direct comparison is made between latent growth analysis under classical test theory and under item response theory, while also including the effects of missing item responses. For both frameworks, a simulation study examines the effects of item missingness on latent growth parameter estimates given longitudinal item response data; several missing data mechanisms and conditions are evaluated. The additional effects of missingness on differences between classical test theory- and item response theory-based latent growth analysis are directly assessed by rescaling the multiple imputations. The multiple imputation method generates latent variable and item scores from the posterior predictive distributions to account for missing item responses in observed multilevel binary response data. It is shown that a multivariate probit model, as a novel imputation model, improves the latent growth analysis when dealing with responses that are missing at random (MAR) under classical test theory. The study also shows that the parameter estimates for the latent growth model under item response theory show less bias and smaller MSEs than the estimates under classical test theory.
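As a rough illustration of posterior-predictive imputation for binary item responses, the sketch below fills in missing cells from an assumed two-parameter normal-ogive (probit) model. It is deliberately simplified: the latent variable is drawn from its prior rather than from a full Gibbs sampler over the multivariate probit model the abstract describes, and all names and parameters are assumptions.

```python
import numpy as np
from scipy.stats import norm

def impute_probit(Y, loadings, intercepts, n_imputations=5, seed=0):
    """Simplified posterior-predictive imputation for missing binary items.

    Y          : (n_persons, n_items) 0/1 matrix, np.nan marks missing cells
    loadings   : item discriminations of an assumed probit (normal-ogive) model
    intercepts : item easiness parameters of the same model
    """
    rng = np.random.default_rng(seed)
    n, k = Y.shape
    missing = np.isnan(Y)
    imputations = []
    for _ in range(n_imputations):
        theta = rng.standard_normal(n)                       # latent scores (prior draw)
        prob = norm.cdf(theta[:, None] * loadings + intercepts)
        draw = (rng.random((n, k)) < prob).astype(float)     # simulated responses
        imputations.append(np.where(missing, draw, Y))       # keep observed cells
    return imputations
```

Each completed matrix can then be scored under classical test theory (sum scores) or item response theory and fed into the latent growth model; pooling results across the imputations propagates the missing-data uncertainty.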


2021, Vol 11 (13), pp. 6048
Author(s): Jaroslav Melesko, Simona Ramanauskaite

Feedback is a crucial component of effective, personalized learning, and is usually provided through formative assessment. Introducing formative assessment into a classroom can be challenging because of the complexity of test creation and the need to set aside time for assessment. The newly proposed formative assessment algorithm uses multivariate Elo rating and multi-armed bandit approaches to address these challenges. In a case study involving 106 students of a Cloud Computing course, the algorithm achieves twice the learning-path recommendation precision of classical test theory-based assessment methods. It approaches the precision of an item response theory benchmark with a greatly reduced quiz length and without the need for item difficulty calibration.
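The abstract does not specify the exact Elo variant or bandit policy, so the sketch below pairs a textbook Elo update (one rating per knowledge component, which gives the "multivariate" flavor) with a UCB1-style choice of the next topic to probe; the constants k and c and both function names are illustrative assumptions.

```python
import numpy as np

def elo_update(skill, difficulty, correct, k=0.4):
    """Elo-style update after one response: pull the student's skill and the
    item's difficulty toward the observed outcome (correct is 0 or 1)."""
    expected = 1.0 / (1.0 + np.exp(-(skill - difficulty)))   # P(correct)
    skill += k * (correct - expected)
    difficulty -= k * (correct - expected)
    return skill, difficulty

def next_topic(uncertainty, counts, t, c=1.0):
    """UCB1-style bandit choice: quiz the topic whose rating is least settled,
    with a bonus for topics that have rarely been probed so far (step t)."""
    bonus = c * np.sqrt(np.log(t + 1.0) / (counts + 1e-9))
    return int(np.argmax(uncertainty + bonus))
```

Because the difficulty ratings adapt online with every response, no separate calibration study is needed, which is the property the abstract highlights.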


Psychometrika, 2021
Author(s): Ron D. Hays, Karen L. Spritzer, Steven P. Reise

The reliable change index has been used to evaluate the significance of individual change in health-related quality of life. We estimate reliable change for two measures (physical function and emotional distress) in the Patient-Reported Outcomes Measurement Information System (PROMIS®) 29-item health-related quality of life measure (PROMIS-29 v2.1). Using two waves of data collected 3 months apart in a longitudinal observational study of chronic low back pain and chronic neck pain patients receiving chiropractic care, and simulations, we compare estimates of reliable change based on the fixed standard errors of classical test theory with estimates based on item response theory standard errors from the graded response model. We find that unless true change in the PROMIS physical function and emotional distress scales is substantial, classical test theory estimates of significant individual change are much more optimistic than estimates of change based on item response theory.
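The contrast can be made concrete with the standard reliable change formulas. Under classical test theory the standard error of measurement is one fixed number, while under item response theory each score estimate carries its own standard error (here assumed to come from a graded response model fit); a minimal sketch, with all inputs assumed:

```python
import math

def rci_ctt(x1, x2, sd, reliability):
    """Classical reliable change index: one fixed SEM for every score level."""
    sem = sd * math.sqrt(1.0 - reliability)
    return (x2 - x1) / (sem * math.sqrt(2.0))

def rci_irt(theta1, theta2, se1, se2):
    """IRT analogue: score-specific standard errors, e.g., from a graded
    response model, so the denominator varies across the scale."""
    return (theta2 - theta1) / math.sqrt(se1**2 + se2**2)
```

With the usual |RCI| >= 1.96 criterion, a fixed CTT denominator can understate the standard error where the test provides little information, one plausible source of the optimism the authors report.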

