Score-Based Measurement Invariance Checks for Bayesian Maximum-a-Posteriori Estimates in Item Response Theory

2020 ◽  
Author(s):  
Rudolf Debelak ◽  
Samuel Pawel ◽  
Carolin Strobl ◽  
Edgar C. Merkle

A family of score-based tests has been proposed in recent years for assessing the invariance of model parameters in several item response theory models. These tests were originally developed in a maximum likelihood framework. This study extends the theoretical framework of these tests to Bayesian maximum-a-posteriori estimates and to multiple-group IRT models. We propose two families of statistical tests, based on (a) an approximation using a pooled variance method or (b) a simulation-based approach derived from asymptotic results. The resulting tests were evaluated in a simulation study that investigated their sensitivity to differential item functioning with respect to a categorical or continuous person covariate in the two- and three-parameter logistic models. Whereas the pooled variance method was found to be practically useful with both maximum likelihood and maximum-a-posteriori estimates, the simulation-based approach required large sample sizes to produce satisfactory results.

2021 ◽  
Author(s):  
Metin Bulus ◽  
Wes Bonifay

Comprehending foundational but fairly complex statistical theory may require assistive interactive tools that make the underlying equations and concepts tangible. We provide a collection of interactive Shiny applications to demonstrate and explore fundamental yet complex Item Response Theory (IRT) concepts such as estimation, scoring, and multidimensionality. Users can explore principles of Maximum Likelihood Estimation, such as Newton-Raphson iterations and the influence of starting values and extreme scores on convergence; principles of Expected A Posteriori (EAP) and Maximum A Posteriori (MAP) ability estimation, such as the likelihood, quadrature points, and the influence of prior mean, prior standard deviation, and prior skewness on EAP and MAP estimates; and multidimensional IRT concepts such as item response surfaces, item information, and compensatory and partially compensatory models, for two-, three-, and four-parameter logistic or ogive IRT models. We hope these applications give a head start to emerging practitioners and researchers interested in advanced measurement topics.
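The MAP ability estimation explored in these applications can be sketched concretely. Below is a minimal Python illustration (not taken from the applications themselves; all function names are our own) of Newton-Raphson iterations for a MAP ability estimate in a two-parameter logistic model with a normal prior:

```python
import math

def p2pl(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def map_theta(responses, a, b, prior_mean=0.0, prior_sd=1.0,
              start=0.0, tol=1e-6, max_iter=50):
    """Newton-Raphson MAP estimate of ability under a normal prior.

    responses: 0/1 item scores; a, b: discrimination and difficulty lists.
    """
    theta = start
    for _ in range(max_iter):
        # Gradient and Hessian of the log-posterior: prior part ...
        g = -(theta - prior_mean) / prior_sd ** 2
        h = -1.0 / prior_sd ** 2
        # ... plus the log-likelihood part, item by item.
        for u, ai, bi in zip(responses, a, b):
            p = p2pl(theta, ai, bi)
            g += ai * (u - p)
            h -= ai ** 2 * p * (1.0 - p)
        step = g / h
        theta -= step
        if abs(step) < tol:
            break
    return theta
```

Unlike the pure maximum likelihood estimate, the MAP estimate stays finite for an all-correct response pattern, because the normal prior pulls the estimate toward its mean; this is one of the behaviors such interactive tools let users observe.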


2020 ◽  
Vol 44 (7-8) ◽  
pp. 566-567
Author(s):  
Shaoyang Guo ◽  
Chanjin Zheng ◽  
Justin L. Kern

This article presents the recently released R package IRTBEMM. The package collects several new estimation algorithms (Bayesian EMM, Bayesian E3M, and their maximum likelihood versions) for Item Response Theory (IRT) models with guessing and slipping parameters (e.g., the 3PL, 4PL, 1PL-G, and 1PL-AG models). IRTBEMM should be of interest to researchers working on IRT estimation and to practitioners applying IRT models with guessing and slipping effects to real datasets.
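The guessing and slipping effects these models capture can be illustrated with the four-parameter logistic item response function, of which the 3PL (no slipping) and 2PL (no guessing or slipping) are special cases. This Python sketch is purely illustrative and is not based on IRTBEMM's internals:

```python
import math

def p4pl(theta, a, b, c, d):
    """Four-parameter logistic item response function.

    c = guessing parameter (lower asymptote: even low-ability examinees
        may answer correctly by guessing);
    d = 1 - slipping parameter (upper asymptote: even high-ability
        examinees may slip and answer incorrectly).
    Setting c = 0, d = 1 recovers the 2PL; d = 1 alone gives the 3PL.
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (d - c) * logistic
```

The probability of a correct response is thus bounded between c and d rather than 0 and 1, which is what makes joint estimation of these asymptotes with the other item parameters challenging.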


Author(s):  
Anju Devianee Keetharuth ◽  
Jakob Bue Bjorner ◽  
Michael Barkham ◽  
John Browne ◽  
Tim Croudace ◽  
...  

Abstract Purpose ReQoL-10 and ReQoL-20 have been developed for use as outcome measures with individuals aged 16 and over who are experiencing mental health difficulties. This paper reports modelling results from the item response theory (IRT) analyses that were used for item reduction. Methods A pool of items was developed through several stages of preparatory work, including focus groups and a previous psychometric survey. After confirming that the ReQoL item pool was sufficiently unidimensional for scoring, IRT model parameters were estimated using Samejima’s Graded Response Model (GRM). All 39 mental health items were evaluated with respect to item fit and differential item functioning regarding age, gender, ethnicity, and diagnosis. Scales were evaluated regarding overall measurement precision and known-groups validity (by care setting type and self-rating of overall mental health). Results The study recruited 4266 participants with a wide range of mental health diagnoses from multiple settings. The IRT parameters demonstrated excellent coverage of the latent construct, with the centres of item information functions ranging from − 0.98 to 0.21 and discrimination slope parameters from 1.4 to 3.6. We identified only two poorly fitting items and no evidence of differential item functioning of concern. Scales showed excellent measurement precision and known-groups validity. Conclusion The results from the IRT analyses confirm the robust structural properties and internal construct validity of the ReQoL instruments. The strong psychometric evidence generated guided item selection for the final versions of the ReQoL measures.
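Samejima’s Graded Response Model used in these analyses has a simple generative form: the probability of responding in category k or higher follows a 2PL-type curve, and differences between adjacent cumulative curves give the category probabilities. A minimal Python sketch (function and variable names are ours, not from the study's software):

```python
import math

def grm_probs(theta, a, b):
    """Category probabilities under Samejima's Graded Response Model.

    a: item discrimination; b: sorted list of K-1 threshold parameters
    for K ordered response categories.
    """
    def p_star(bk):
        # Cumulative probability of responding at or above a threshold.
        return 1.0 / (1.0 + math.exp(-a * (theta - bk)))

    cum = [1.0] + [p_star(bk) for bk in b] + [0.0]
    # Adjacent differences of the cumulative curves.
    return [cum[k] - cum[k + 1] for k in range(len(b) + 1)]
```

The threshold parameters locate each category along the latent trait, which is what allows statements such as "the centres of item information functions ranged from −0.98 to 0.21" to describe construct coverage.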


2020 ◽  
Vol 44 (5) ◽  
pp. 362-375
Author(s):  
Tyler Strachan ◽  
Edward Ip ◽  
Yanyan Fu ◽  
Terry Ackerman ◽  
Shyh-Huei Chen ◽  
...  

As a method to derive a “purified” measure along a dimension of interest from response data that are potentially multidimensional in nature, the projective item response theory (PIRT) approach requires first fitting a multidimensional item response theory (MIRT) model to the data and then projecting onto the dimension of interest. This study explores how accurate the PIRT results are when the estimated MIRT model is misspecified. Specifically, we focus on using a (potentially misspecified) two-dimensional (2D) MIRT model for projection because of its advantages, including interpretability, identifiability, and computational stability, over higher dimensional models. Two large simulation studies (I and II) were conducted. Both examined whether fitting a 2D-MIRT is sufficient to recover the PIRT parameters when multiple nuisance dimensions exist in the test items, which were generated, respectively, under compensatory MIRT and bifactor models. Various factors were manipulated, including sample size, test length, latent factor correlation, and number of nuisance dimensions. The results from simulation studies I and II showed that PIRT was overall robust to a misspecified 2D-MIRT. Two smaller simulation studies (III and IV) evaluated recovery of the PIRT model parameters when the correctly specified higher dimensional MIRT or bifactor model was fitted to the response data. In addition, a real data set was used to illustrate the robustness of PIRT.
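The projection step can be illustrated numerically: under a 2D compensatory MIRT model, the projected item response function on the dimension of interest is the expected response probability after integrating out the other dimension given θ1. The sketch below uses simple grid quadrature and illustrative parameter names; it is an assumption-laden toy, not the authors' implementation:

```python
import math

def projected_irf(theta1, a1, a2, d, rho, n_q=41):
    """Project a 2D compensatory MIRT item onto theta1 by numerically
    integrating over theta2 | theta1 ~ N(rho * theta1, 1 - rho**2),
    assuming standard normal latent traits with correlation rho."""
    mean = rho * theta1
    sd = math.sqrt(1.0 - rho ** 2)
    total, weight_sum = 0.0, 0.0
    # Equally spaced quadrature grid over +/- 4 SD of theta2 | theta1.
    for i in range(n_q):
        t2 = mean - 4.0 * sd + 8.0 * sd * i / (n_q - 1)
        w = math.exp(-0.5 * ((t2 - mean) / sd) ** 2)
        p = 1.0 / (1.0 + math.exp(-(a1 * theta1 + a2 * t2 + d)))
        total += w * p
        weight_sum += w
    return total / weight_sum
```

When the loading on the nuisance dimension is zero, the projection reduces exactly to the unidimensional 2PL curve; otherwise the nuisance dimension flattens the projected curve, which is the behavior PIRT aims to account for.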


2020 ◽  
Vol 80 (5) ◽  
pp. 975-994
Author(s):  
Yoonsun Jang ◽  
Allan S. Cohen

A nonconverged Markov chain can lead to invalid inferences about model parameters. The purpose of this study was to assess the effect of a nonconverged Markov chain on the estimation of parameters for mixture item response theory (IRT) models using a Markov chain Monte Carlo (MCMC) algorithm. A simulation study investigated the accuracy of model parameters estimated at different degrees of convergence. Results indicated that the accuracy of the estimated model parameters for the mixture IRT models decreased as the number of iterations of the Markov chain decreased. In particular, increasing the number of burn-in iterations resulted in more accurate estimation of mixture IRT model parameters. In addition, different methods for monitoring convergence of a Markov chain indicated different degrees of convergence despite almost identical accuracy of estimation.
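One widely used convergence monitor of the kind compared in such studies is the Gelman-Rubin potential scale reduction factor (R-hat), which flags nonconvergence when between-chain variance is large relative to within-chain variance. A minimal Python sketch of the classic (non-split) statistic, illustrative rather than the study's code:

```python
def gelman_rubin(chains):
    """Gelman-Rubin potential scale reduction factor (R-hat) for
    m parallel chains of equal length n (one scalar parameter)."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance component.
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    # Average within-chain variance.
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    # Pooled posterior variance estimate and its ratio to W.
    var_hat = (n - 1) / n * W + B / n
    return (var_hat / W) ** 0.5
```

Values near 1 suggest the chains are mixing over the same region; values well above 1 indicate the burn-in was too short or the chain has not converged, which connects to the burn-in effect reported above.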


1998 ◽  
Vol 2 (4) ◽  
pp. 395-403 ◽  
Author(s):  
Craig K. Abbey ◽  
Eric Clarkson ◽  
Harrison H. Barrett ◽  
Stefan P. Müller ◽  
Frank J. Rybicki
