A Comparison of Two MCMC Algorithms for the 2PL IRT Model

The Effect of Alternative Scoring Procedures on the Measurement Properties of a Self-Administered Depression Scale

European Journal of Psychological Assessment ◽

10.1027/1015-5759/a000371 ◽

2019 ◽

Vol 35 (1) ◽

pp. 55-62 ◽

Cited By ~ 1

Author(s):

Noboru Iwata ◽

Akizumi Tsutsumi ◽

Takafumi Wakita ◽

Ryuichi Kumagai ◽

Hiroyuki Noguchi ◽

...

Keyword(s):

Classical Test Theory ◽

Depression Scale ◽

Psychometric Testing ◽

Epidemiologic Studies ◽

Test Theory ◽

Measurement Properties ◽

IRT Model ◽

Response Alternatives ◽

Polytomous Item Response ◽

Θ Point

Abstract. To investigate the effect of response alternatives/scoring procedures on the measurement properties of the Center for Epidemiologic Studies Depression Scale (CES-D), which has four response alternatives, a polytomous item response theory (IRT) model was applied to the responses of 2,061 workers and university students (1,640 males, 421 females). Test information functions derived from the polytomous IRT analyses of the CES-D data with various scoring procedures indicated that: (1) the CES-D with its standard (0-1-2-3) scoring procedure should be useful for screening to detect subjects at high risk of depression if the θ point showing the highest information corresponds to the cut-off point, because of its markedly higher information; (2) the CES-D with the 0-1-1-2 scoring procedure could cover a wider range of depressive severity, suggesting that this scoring procedure might be useful in cases where more exhaustive discrimination in symptomatology is of interest; and (3) the revised version of the CES-D, in which the original positive items were replaced with negatively revised items, outperformed the original version. These findings have never been demonstrated by classical test theory analyses, and thus the utility of this kind of psychometric testing warrants further investigation for the standard measures of psychological assessment.


Modeling Self-Determination Theory Motivation Data by Using Unfolding IRT

European Journal of Psychological Assessment ◽

10.1027/1015-5759/a000629 ◽

2020 ◽

pp. 1-9

Author(s):

Philipp A. Freund ◽

Annette Lohbeck

Keyword(s):

Ideal Point ◽

Self Determination Theory ◽

Self Determination ◽

Dominance Model ◽

IRT Model ◽

Autonomous Behavior ◽

Location Parameters ◽

Unfolding Models ◽

Using Data ◽

Item Location

Abstract. Self-determination theory (SDT) suggests that the degree of autonomous behavior regulation is a characteristic of distinct motivation types which thus can be ordered on the so-called Autonomy-Control Continuum (ACC). The present study employs an item response theory (IRT) model under the ideal point response/unfolding paradigm in order to model the response process to SDT motivation items in theoretical accordance with the ACC. Using data from two independent student samples (measuring SDT motivation for the academic subjects of Mathematics and German as a native language), it was found that an unfolding model exhibited a relatively better fit compared to a dominance model. The item location parameters under the unfolding paradigm showed clusters of items representing the different regulation types on the ACC to be (almost perfectly) empirically separable, as suggested by SDT. Besides theoretical implications, perspectives for the application of ideal point response/unfolding models in the development of measures for non-cognitive constructs are addressed.


Analysis of the Problem-solving strategies in computer-based dynamic assessment: The extension and application of multilevel mixture IRT model

Acta Psychologica Sinica ◽

10.3724/sp.j.1041.2020.00528 ◽

2020 ◽

Vol 52 (4) ◽

pp. 528

Keyword(s):

Problem Solving ◽

Dynamic Assessment ◽

IRT Model ◽

Mixture IRT ◽

Computer Based ◽

Problem Solving Strategies


Investigating the Impact of Noneffortful Responses on Individual-Level Scores: Can the Effort-Moderated IRT Model Serve as a Solution?

Applied Psychological Measurement ◽

10.1177/01466216211013896 ◽

2021 ◽

pp. 014662162110138

Author(s):

Joseph A. Rios ◽

James Soland

Keyword(s):

Classification Accuracy ◽

Negative Impact ◽

Parameter Recovery ◽

Individual Level ◽

IRT Model ◽

Ability Estimates ◽

Misclassification Rates ◽

Individual Scores ◽

3PL Model ◽

The Impact

Suboptimal effort is a major threat to valid score-based inferences. While the effects of such behavior have been frequently examined in the context of mean group comparisons, minimal research has considered its effects on individual score use (e.g., identifying students for remediation). Focusing on the latter context, this study addressed two related questions via simulation and applied analyses. First, we investigated how much including noneffortful responses in scoring using a three-parameter logistic (3PL) model affects person parameter recovery and classification accuracy for noneffortful responders. Second, we explored whether improvements in these individual-level inferences were observed when employing the Effort Moderated IRT (EM-IRT) model under conditions in which its assumptions were met and violated. Results demonstrated that including 10% noneffortful responses in scoring led to average bias in ability estimates and misclassification rates by as much as 0.15 SDs and 7%, respectively. These results were mitigated when employing the EM-IRT model, particularly when model assumptions were met. However, once model assumptions were violated, the EM-IRT model’s performance deteriorated, though still outperforming the 3PL model. Thus, findings from this study show that (a) including noneffortful responses when using individual scores can lead to potential unfounded inferences and potential score misuse, and (b) the negative impact that noneffortful responding has on person ability estimates and classification accuracy can be mitigated by employing the EM-IRT model, particularly when its assumptions are met.
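
To make the scoring contrast in this abstract concrete, the following is a minimal sketch, not the authors' implementation. The 3PL response function is the standard one. The handling of noneffortful responses (flagged responses are skipped when scoring) is an assumption about how an effort-moderated model works, since the abstract does not spell it out. The item parameters, the effort flags, and the function names `p_3pl`, `log_likelihood`, and `estimate_theta` are hypothetical.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response: c + (1 - c) * logistic(a * (theta - b))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items, effortful=None):
    """Log-likelihood of ability theta.

    responses: list of 0/1 scores; items: list of (a, b, c) tuples.
    effortful: optional list of booleans (hypothetical effort flags).
    Responses flagged False are skipped; this is the effort-moderated
    assumption made in this sketch, not a detail taken from the abstract.
    """
    if effortful is None:
        effortful = [True] * len(responses)
    total = 0.0
    for u, (a, b, c), flag in zip(responses, items, effortful):
        if not flag:
            continue
        p = p_3pl(theta, a, b, c)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total


def estimate_theta(responses, items, effortful=None):
    """Grid-search maximum-likelihood estimate of theta on [-4, 4] in steps of 0.01."""
    grid = [i / 100.0 for i in range(-400, 401)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items, effortful))
```

Calling `estimate_theta` once without flags and once with them shows how a single flagged incorrect response can pull the plain 3PL estimate downward, which is the kind of individual-level bias the abstract reports.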


Computer-based application for high school physics exams using IRT model 1P

10.1063/5.0037567 ◽

2021 ◽

Author(s):

Yetti Supriyati ◽

Dwi Susanti ◽

Slamet Maulana

Keyword(s):

High School ◽

IRT Model ◽

High School Physics ◽

Computer Based


Spectral bounds for certain two-factor non-reversible MCMC algorithms

Electronic Communications in Probability ◽

10.1214/ecp.v20-4528 ◽

2015 ◽

Vol 20 (0) ◽

Cited By ~ 4

Author(s):

Jeffrey Rosenthal ◽

Peter Rosenthal

Keyword(s):

MCMC Algorithms ◽

Spectral Bounds


Relaxing Measurement Invariance in Cross-National Consumer Research Using a Hierarchical IRT Model

Journal of Consumer Research ◽

10.1086/518532 ◽

2007 ◽

Vol 34 (2) ◽

pp. 260-278 ◽

Cited By ~ 109

Author(s):

Martijn G. De Jong ◽

Jan-Benedict E. M. Steenkamp ◽

Jean-Paul Fox

Keyword(s):

Measurement Invariance ◽

Consumer Research ◽

IRT Model ◽

Cross National


Using the GLIMMIX Procedure in SAS 9.3 to Fit a Standard Dichotomous Rasch and Hierarchical 1-PL IRT Model

Applied Psychological Measurement ◽

10.1177/0146621612441857 ◽

2012 ◽

Vol 36 (3) ◽

pp. 237-248 ◽

Cited By ~ 2

Author(s):

Ryan A. Black ◽

Stephen F. Butler

Keyword(s):

IRT Model


A multidimensional generalized many-facet Rasch model for rubric-based performance assessment

Behaviormetrika ◽

10.1007/s41237-021-00144-w ◽

2021 ◽

Author(s):

Masaki Uto

Keyword(s):

Performance Assessment ◽

Rasch Model ◽

Measurement Accuracy ◽

Estimation Method ◽

Real Data ◽

Monte Carlo Algorithm ◽

IRT Model ◽

IRT Models ◽

Proposed Model ◽

Problem Item

Abstract. Performance assessment, in which human raters assess examinee performance in a practical task, often involves the use of a scoring rubric consisting of multiple evaluation items to increase the objectivity of evaluation. However, even when using a rubric, assigned scores are known to depend on characteristics of the rubric’s evaluation items and the raters, thus decreasing ability measurement accuracy. To resolve this problem, item response theory (IRT) models that can estimate examinee ability while considering the effects of these characteristics have been proposed. These IRT models assume unidimensionality, meaning that a rubric measures one latent ability. In practice, however, this assumption might not be satisfied because a rubric’s evaluation items are often designed to measure multiple sub-abilities that constitute a targeted ability. To address this issue, this study proposes a multidimensional IRT model for rubric-based performance assessment. Specifically, the proposed model is formulated as a multidimensional extension of a generalized many-facet Rasch model. Moreover, a No-U-Turn variant of the Hamiltonian Markov chain Monte Carlo algorithm is adopted as a parameter estimation method for the proposed model. The proposed model is useful not only for improving the ability measurement accuracy, but also for detailed analysis of rubric quality and rubric construct validity. The study demonstrates the effectiveness of the proposed model through simulation experiments and application to real data.


A Distributed Computer System for Parallel Markov Chain Monte Carlo (MCMC)

Inquiry@Queen's Undergraduate Research Conference Proceedings ◽

10.24908/iqurcp.9597 ◽

2018 ◽

Author(s):

Michael Hynes

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Parameter Space ◽

Heterogeneous Computing ◽

Peer To Peer ◽

Parallel Tempering ◽

Single Chain ◽

Distributed Computing Systems ◽

MCMC Algorithms

A ubiquitous problem in physics is to determine expectation values of observables associated with a system. This problem is typically formulated as an integration of some likelihood over a multidimensional parameter space. In Bayesian analysis, numerical Markov Chain Monte Carlo (MCMC) algorithms are employed to solve such integrals using a fixed number of samples in the Markov Chain. In general, MCMC algorithms are computationally expensive for large datasets and have difficulties sampling from multimodal parameter spaces. An MCMC implementation that is robust and inexpensive for researchers is desired. Distributed computing systems have shown the potential to act as virtual supercomputers, such as in the SETI@home project in which millions of private computers participate. We propose that a clustered peer-to-peer (P2P) computer network serves as an ideal structure to run Markovian state exchange algorithms such as Parallel Tempering (PT). PT overcomes the difficulty in sampling from multimodal distributions by running multiple chains in parallel with different target distributions and exchanging their states in a Markovian manner. To demonstrate the feasibility of peer-to-peer Parallel Tempering (P2P PT), a simple two-dimensional dataset consisting of two Gaussian signals separated by a region of low probability was used in a Bayesian parameter fitting algorithm. A small connected peer-to-peer network was constructed using separate processes on a Linux kernel, and P2P PT was applied to the dataset. These sampling results were compared with those obtained from sampling the parameter space with a single chain. It was found that the single chain was unable to sample both modes effectively, while the P2P PT method explored the target distribution well, visiting both modes approximately equally. Future work will involve scaling to many dimensions and large networks, and convergence conditions with highly heterogeneous computing capabilities of members within the network.
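
The state-exchange idea described in this abstract can be illustrated with a minimal single-process sketch. It is not the peer-to-peer system from the paper: the one-dimensional two-mode target with modes at -4 and +4, the inverse-temperature ladder `betas`, the proposal width `step`, and all function names are assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density: equal mixture of unit-variance Gaussians at -4 and +4."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(rng, log_ratio):
    """Metropolis rule: accept with probability min(1, exp(log_ratio))."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def parallel_tempering(n_steps, betas, step=1.0, seed=0):
    """Return the samples of the first chain; betas[0] should be 1.0 (the untempered target)."""
    rng = random.Random(seed)
    states = [-4.0] * len(betas)  # all chains start in the left mode
    cold_samples = []
    for _ in range(n_steps):
        # Random-walk Metropolis update of each chain against target**beta.
        for i, beta in enumerate(betas):
            proposal = states[i] + rng.gauss(0.0, step)
            if accept(rng, beta * (log_target(proposal) - log_target(states[i]))):
                states[i] = proposal
        # Propose a state exchange between one random pair of adjacent temperatures.
        i = rng.randrange(len(betas) - 1)
        log_ratio = (betas[i] - betas[i + 1]) * (
            log_target(states[i + 1]) - log_target(states[i])
        )
        if accept(rng, log_ratio):
            states[i], states[i + 1] = states[i + 1], states[i]
        cold_samples.append(states[0])
    return cold_samples
```

A call such as `parallel_tempering(20000, [1.0, 0.3, 0.1, 0.03])` returns the untempered chain's samples. The share of samples above zero gives a rough check of whether both modes were visited.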
