Prelude to Machine Learning-Based IRT Research: Bayesian Item Parameter Recovery

2021 ◽  
Vol 23 (4) ◽  
pp. 1509-1516
Author(s):  
Taeyoung Kim ◽  
Seungbae Choi ◽  
Hae-Gyung Yoon


2021 ◽  
pp. 107699862199436
Author(s):  
Yue Liu ◽  
Hongyun Liu

The prevalence and serious consequences of noneffortful responses from unmotivated examinees are well known in educational measurement. In this study, we propose to apply an iterative purification process based on a response time residual method with fixed item parameter estimates to detect noneffortful responses. The proposed method is compared with the traditional residual method and a noniterative method with fixed item parameters in two simulation studies in terms of noneffort detection accuracy and parameter recovery. The results show that when the severity of noneffort is high, the proposed method leads to a much higher true positive rate with a small increase in the false discovery rate. In addition, parameter estimation is significantly improved by the strategies of fixing item parameters and iterative cleansing. These results suggest that the proposed method is a potential solution to reduce the impact of data contamination due to severe low test-taking effort and to obtain more accurate parameter estimates. An empirical study is also conducted to show the differences in the detection rate and parameter estimates among different approaches.
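The abstract gives no formulas, so the following is only a rough sketch of what an iterative response time residual procedure could look like, not the authors' method. It assumes a lognormal response time model in which log time equals an item time intensity minus a person speed, with a known item-level standard deviation. The function name `iterative_rt_flags`, the parameters `beta` and `sigma`, the simple mean-based speed estimate, and the cutoff of -1.645 are all hypothetical choices made for illustration.

```python
import math

def iterative_rt_flags(times, beta, sigma, cutoff=-1.645, max_iter=20):
    """Flag suspiciously fast responses for one examinee (illustrative only).

    Assumed model: log(t_i) = beta_i - tau + error, error ~ N(0, sigma_i^2),
    with item parameters beta_i and sigma_i held fixed.
    """
    log_t = [math.log(t) for t in times]
    flags = [False] * len(times)
    for _ in range(max_iter):
        kept = [i for i, flagged in enumerate(flags) if not flagged]
        if not kept:
            break
        # Re-estimate person speed from the currently unflagged responses only.
        tau = sum(beta[i] - log_t[i] for i in kept) / len(kept)
        # Standardized residual: negative means faster than expected.
        new_flags = [
            (log_t[i] - (beta[i] - tau)) / sigma[i] < cutoff
            for i in range(len(times))
        ]
        if new_flags == flags:
            break  # flags are stable, so purification has converged
        flags = new_flags
    return flags

# Hypothetical data: five items with equal time intensity, one 2-second response.
times = [55.0, 50.0, 60.0, 52.0, 2.0]
beta = [4.0] * 5
sigma = [0.5] * 5
flags = iterative_rt_flags(times, beta, sigma)
```

In this toy example only the 2-second response is flagged. On the second pass the speed estimate is recomputed without it, which is the purification step the abstract refers to.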


2017 ◽  
Vol 41 (7) ◽  
pp. 530-544
Author(s):  
Dubravka Svetina ◽  
Arturo Valdivia ◽  
Stephanie Underhill ◽  
Shenghai Dai ◽  
Xiaolin Wang

Information about the psychometric properties of items can be highly useful in assessment development, for example, in item response theory (IRT) applications and computerized adaptive testing. Although literature on parameter recovery in unidimensional IRT abounds, less is known about parameter recovery in multidimensional IRT (MIRT), notably when tests exhibit complex structures or when latent traits are nonnormal. The current simulation study investigates the effects of complex item structures and the shape of examinees’ latent trait distributions on item parameter recovery in compensatory MIRT models for dichotomous items. Outcome variables included bias and root mean square error. Results indicated that when latent traits were skewed, item parameter recovery was generally adversely impacted. In addition, the presence of complexity contributed to decreases in the precision of parameter recovery, particularly for discrimination parameters along one dimension when at least one latent trait was generated as skewed.
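Bias and root mean square error, the two outcome measures named above, can be computed from generating and estimated parameters as in the minimal sketch below. The discrimination values are made up for illustration and do not come from the study; a real simulation would also average over replications.

```python
import math

def bias(true_values, estimates):
    """Mean signed error: average of (estimate - true value)."""
    errors = [e - t for t, e in zip(true_values, estimates)]
    return sum(errors) / len(errors)

def rmse(true_values, estimates):
    """Root mean square error between estimates and true values."""
    squared = [(e - t) ** 2 for t, e in zip(true_values, estimates)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical generating and estimated discrimination parameters.
true_a = [1.0, 1.5, 0.8, 1.2]
est_a = [1.1, 1.4, 1.0, 1.2]

recovery_bias = bias(true_a, est_a)
recovery_rmse = rmse(true_a, est_a)
```

Bias keeps the sign of the error, so over- and underestimation can cancel out, while RMSE also reflects the variability of the estimates. This is why recovery studies usually report both.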


2020 ◽  
Vol 80 (4) ◽  
pp. 775-807
Author(s):  
Yue Liu ◽  
Ying Cheng ◽  
Hongyun Liu

The responses of non-effortful test-takers may have serious consequences, as non-effortful responses can impair model calibration and latent trait inferences. This article introduces a mixture model, using both response accuracy and response time information, to help differentiate non-effortful and effortful individuals, and to improve item parameter estimation based on the effortful group. Two mixture approaches are compared with the traditional response time mixture model (TMM) method and the normative threshold 10 (NT10) method with response behavior effort criteria in four simulation scenarios with regard to item parameter recovery and classification accuracy. The results demonstrate that the mixture methods and the TMM method can reduce the bias of item parameter estimates caused by non-effortful individuals, with the mixture methods showing more advantages when the non-effort severity is high or the response times are not lognormally distributed. An illustrative example is also provided.
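For readers unfamiliar with the NT10 baseline, the sketch below shows one common reading of a normative threshold rule: a response is flagged as rapid when its time falls below 10% of the item's mean response time. The 10-second cap on the threshold, the function names, and the toy data are assumptions made here for illustration and are not taken from the abstract.

```python
def nt_thresholds(response_times, fraction=0.10, cap_seconds=10.0):
    """Per-item threshold: a fraction of the item's mean response time.

    response_times is a list of rows (examinees), each a list of item times
    in seconds. The cap is an assumed convention, not stated in the source.
    """
    n_items = len(response_times[0])
    thresholds = []
    for i in range(n_items):
        mean_rt = sum(row[i] for row in response_times) / len(response_times)
        thresholds.append(min(fraction * mean_rt, cap_seconds))
    return thresholds

def flag_rapid(response_times, thresholds):
    """True where a response time is shorter than the item's threshold."""
    return [
        [rt < thr for rt, thr in zip(row, thresholds)]
        for row in response_times
    ]

# Hypothetical response times (seconds) for four examinees on two items.
rts = [
    [40.0, 60.0],
    [50.0, 2.0],
    [3.0, 58.0],
    [47.0, 80.0],
]
thresholds = nt_thresholds(rts)
flags = flag_rapid(rts, thresholds)
```

A fixed-fraction rule like this uses response time only, whereas the mixture approaches described above also use response accuracy.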


2021 ◽  
Vol 11 ◽  
Author(s):  
Sedat Sen ◽  
Allan S. Cohen

Results of a comprehensive simulation study are reported investigating the effects of sample size, test length, number of attributes, and base rate of mastery on item parameter recovery and classification accuracy of four DCMs (i.e., C-RUM, DINA, DINO, and LCDMREDUCED). Effects were evaluated using bias and RMSE computed between true (i.e., generating) parameters and estimated parameters. Effects of simulated factors on attribute assignment were also evaluated using the percentage of classification accuracy. More precise estimates of item parameters were obtained with larger sample size and longer test length. Recovery of item parameters decreased as the number of attributes increased from three to five, but base rate of mastery had a varying effect on item recovery. Item parameter recovery and classification accuracy were higher for DINA and DINO models.


2020 ◽  
Author(s):  
Joseph Rios ◽  
Jim Soland

As low-stakes testing contexts increase, low test-taking effort may serve as a serious validity threat. One common solution to this problem is to identify noneffortful responses and treat them as missing during parameter estimation via the Effort-Moderated IRT (EM-IRT) model. Although this model has been shown to outperform traditional IRT models (e.g., 2PL) in parameter estimation under simulated conditions, prior research has failed to examine its performance under violations of the model’s assumptions. Therefore, the objective of this simulation study was to examine item and mean ability parameter recovery when violating the assumptions that noneffortful responding occurs randomly (assumption #1) and is unrelated to the underlying ability of examinees (assumption #2). Results demonstrated that, across conditions, the EM-IRT model provided item parameter estimates that were robust to violations of assumption #1. However, bias values greater than 0.20 SDs were observed for the EM-IRT model when violating assumption #2; nonetheless, these values were still lower than those of the 2PL model. In terms of mean ability estimates, model results indicated equal performance between the EM-IRT and 2PL models across conditions. Across both models, mean ability estimates were found to be biased by more than 0.25 SDs when violating assumption #2. However, our accompanying empirical study suggested that this biasing occurred under extreme conditions that may not be present in some operational settings. Overall, these results suggest that the EM-IRT model provides superior item and equal mean ability parameter estimates in the presence of model violations under realistic conditions when compared to the 2PL model.
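The idea of treating noneffortful responses as missing can be illustrated with a 2PL log-likelihood that simply skips flagged responses. This is a minimal sketch under that reading of the abstract. The function names, the effort flags, and the item parameters are hypothetical, and no estimation routine is included.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def effort_moderated_loglik(theta, responses, effortful, a_params, b_params):
    """Log-likelihood for one examinee that ignores noneffortful responses.

    Responses flagged as noneffortful contribute nothing, which is
    equivalent to treating them as missing.
    """
    total = 0.0
    for x, is_effortful, a, b in zip(responses, effortful, a_params, b_params):
        if not is_effortful:
            continue  # treated as missing
        p = p_2pl(theta, a, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total

# Hypothetical examinee: the third response was flagged as noneffortful.
responses = [1, 0, 0]
effortful = [True, True, False]
a_params = [1.0, 1.0, 1.0]
b_params = [0.0, 0.0, 0.0]
ll = effort_moderated_loglik(0.0, responses, effortful, a_params, b_params)
```

Because the flagged response is dropped rather than scored as incorrect, it cannot pull the ability estimate down. The two assumptions examined in the study concern whether dropping such responses is harmless.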

