Considering Local Dependencies: Person Parameter Estimation for IRT Models of Forced-Choice Data

Author(s):
Safir Yousfi


2018 ◽
Vol 43 (3) ◽
pp. 226-240 ◽
Author(s):
Philseok Lee ◽
Seang-Hwane Joo ◽
Stephen Stark ◽
Oleksandr S. Chernyshenko

Historically, multidimensional forced choice (MFC) measures have been criticized because conventional scoring methods can lead to ipsativity problems that render scores unsuitable for interindividual comparisons. However, with the recent advent of item response theory (IRT) scoring methods that yield normative information, MFC measures are surging in popularity and becoming important components in high-stakes evaluation settings. This article aims to add to burgeoning methodological advances in MFC measurement by focusing on statement and person parameter recovery for the GGUM-RANK (generalized graded unfolding-RANK) IRT model. A Markov chain Monte Carlo (MCMC) algorithm was developed for estimating GGUM-RANK statement and person parameters directly from MFC rank responses. Simulation studies examined how the psychometric properties of the statements composing MFC items, test length, and sample size influenced statement and person parameter estimation, and explored the benefits of measurement using MFC triplets relative to pairs. To demonstrate this methodology, an empirical validity study was then conducted using an MFC triplet personality measure. The results and implications of these studies for future research and practice are discussed.
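For readers who want to see the moving parts, the following Python sketch illustrates the building blocks implied above: the dichotomous GGUM agreement probability, the Luce-style sequential-choice (RANK) decomposition that turns agreement probabilities into a triplet rank-order probability, and a generic random-walk Metropolis update for a person parameter. It is a minimal illustration under a standard-normal prior on theta, not the authors' estimation code; all function names are ours.

```python
import numpy as np

def ggum_agree(theta, alpha, delta, tau):
    """Dichotomous GGUM probability of agreeing with a statement.

    theta : person trait level
    alpha : statement discrimination
    delta : statement location
    tau   : subjective-response threshold (tau_1; tau_0 is fixed at 0)
    """
    d = theta - delta
    # z = 0 (disagree) and z = 1 (agree) terms; M = 2C + 1 = 3 for C = 1
    disagree = 1.0 + np.exp(alpha * 3.0 * d)
    agree = np.exp(alpha * (d - tau)) + np.exp(alpha * (2.0 * d - tau))
    return agree / (agree + disagree)

def rank_prob_triplet(p):
    """Probability of observing the rank order s1 > s2 > s3 for a triplet,
    given the three GGUM agreement probabilities p = (p1, p2, p3), via the
    sequential-choice decomposition: pick the best of three, then of two."""
    p1, p2, p3 = p
    num1 = p1 * (1 - p2) * (1 - p3)
    den1 = num1 + (1 - p1) * p2 * (1 - p3) + (1 - p1) * (1 - p2) * p3
    num2 = p2 * (1 - p3)
    den2 = num2 + (1 - p2) * p3
    return (num1 / den1) * (num2 / den2)

def mh_update_theta(theta, loglik, step=0.3, rng=None):
    """One random-walk Metropolis update for a person parameter under a
    standard-normal prior; loglik(theta) is the rank-response log-likelihood."""
    rng = np.random.default_rng() if rng is None else rng
    prop = theta + step * rng.standard_normal()
    log_ratio = (loglik(prop) - 0.5 * prop**2) - (loglik(theta) - 0.5 * theta**2)
    return prop if np.log(rng.uniform()) < log_ratio else theta

# example: rank-order probability for one triplet at theta = 0.5
alphas, deltas, taus = [1.2, 1.0, 0.9], [-1.0, 0.0, 1.0], [-0.5, -0.6, -0.4]
ps = [ggum_agree(0.5, a, d, t) for a, d, t in zip(alphas, deltas, taus)]
print(rank_prob_triplet(ps))
```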


2020 ◽  
Article 001316442093486
Author(s):  
Niklas Schulte ◽  
Heinz Holling ◽  
Paul-Christian Bürkner

Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high. To determine the necessary number of traits under varying sample sizes, factor loadings, and intertrait correlations, simulations were performed for the two most widely used scoring methods, namely the classical (ipsative) approach and Thurstonian item response theory (IRT) models. Results demonstrate that although Thurstonian IRT models in particular perform well under ideal conditions, both methods yield insufficient reliabilities in most conditions resembling applied contexts. Moreover, not only the classical estimates but also the Thurstonian IRT estimates for questionnaires with equally keyed items remain (partially) ipsative, even when the number of traits is very high (i.e., 30). This result not only questions earlier assumptions regarding the use of classical scores in high-dimensional questionnaires, but it also raises doubts about many validation studies on Thurstonian IRT models, because correlations of (partially) ipsative scores with external criteria cannot be interpreted in the usual way.
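The ipsativity at the heart of these results is easy to demonstrate. The sketch below uses the simplest classical scheme, one point to the trait of the statement ranked first on each item (an illustrative assumption, not the paper's exact scoring), and shows that every respondent's trait scores sum to the same constant, so between-person comparisons of score profiles are uninterpretable.

```python
import numpy as np

rng = np.random.default_rng(0)

n_persons, n_items, n_traits = 5, 30, 6
# classical scoring: each forced-choice item awards 1 point to the trait
# whose statement was ranked first and 0 to all other traits
chosen_trait = rng.integers(0, n_traits, size=(n_persons, n_items))
scores = np.zeros((n_persons, n_traits), dtype=int)
for i in range(n_persons):
    for j in range(n_items):
        scores[i, chosen_trait[i, j]] += 1

# every row sums to n_items regardless of the response pattern
print(scores.sum(axis=1))  # -> [30 30 30 30 30]
```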


1971 ◽  
Vol 32 (2) ◽  
pp. 533-534 ◽  
Author(s):  
Carl Auerbach

A method for correcting two-alternative forced-choice data for response bias is presented which requires only a table of integrals of a normal distribution.
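A correction in this spirit (our sketch, not necessarily Auerbach's exact procedure) averages the two position-conditional proportions correct on the z-scale, which removes an additive position bias under the equal-variance normal model; scipy's norm.cdf and norm.ppf stand in for the printed table of normal integrals.

```python
from scipy.stats import norm

def corrected_pc(p1, p2):
    """Bias-corrected proportion correct for two-alternative forced choice.

    p1, p2 : observed proportions correct when the target appears in the
             first vs. second position (or interval).  Averaging on the
             z-scale cancels an additive position bias."""
    z = 0.5 * (norm.ppf(p1) + norm.ppf(p2))
    return norm.cdf(z)

# example: a strong position bias (p1 = .95, p2 = .55) corrects to about .81
print(round(corrected_pc(0.95, 0.55), 3))
```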


2021 ◽  
Author(s):  
Qiuli Ma ◽  
Jeffrey Joseph Starns ◽  
David Kellen

We explored a two-stage recognition memory paradigm in which people first make single-item “studied”/“not studied” decisions and then have a chance to correct their errors in forced-choice trials. Each forced-choice trial included one studied word (“target”) and one non-studied word (“lure”) that received the same previous single-item response. For example, a “studied”-“studied” trial would have a target that was correctly called “studied” and a lure that was incorrectly called “studied.” The two-high-threshold (2HT) model and the unequal-variance signal detection (UVSD) model predict opposite effects of biasing the initial single-item responses on subsequent forced-choice accuracy. Results from two experiments showed that the bias effect is actually near zero and well outside the range of effects predicted by either model. Follow-up analyses showed that the model failures were not a function of experimental artifacts such as changing memory states between the two types of recognition trials. Follow-up analyses also showed that the dual-process signal detection (DPSD) model made better predictions for the forced-choice data than the 2HT and UVSD models.
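The conditional forced-choice accuracy at issue can be made concrete with a small Monte Carlo sketch of the UVSD account (our illustration, not the authors' code): simulate strengths, apply a first-stage criterion, keep targets and lures that both drew a “studied” response, and score the forced choice by the higher strength. Shifting the criterion shows how a bias manipulation propagates to conditional accuracy under this one model; the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def fc_accuracy(criterion, d=1.0, sigma=1.25, n=200_000):
    """Conditional forced-choice accuracy for "studied"-"studied" pairs
    under UVSD: targets ~ N(d, sigma), lures ~ N(0, 1); a word is called
    'studied' if its strength exceeds the criterion, and the forced choice
    picks whichever item has the higher strength."""
    targets = rng.normal(d, sigma, n)
    lures = rng.normal(0.0, 1.0, n)
    # keep only items that received a 'studied' response at stage one
    t = targets[targets > criterion]
    l = lures[lures > criterion]
    m = min(len(t), len(l))          # pair them up at random
    return np.mean(t[:m] > l[:m])

for c in (-0.5, 0.0, 0.5):           # liberal -> conservative first-stage bias
    print(c, round(fc_accuracy(c), 3))
```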


2020 ◽  
Vol 44 (4) ◽  
pp. 327-328
Author(s):  
Pere J. Ferrando ◽  
David Navarro-González

InDisc is an R package that implements procedures for estimating and fitting unidimensional Item Response Theory (IRT) Dual Models (DMs). DMs are intended for personality and attitude measures and are, essentially, extended standard IRT models with an extra person parameter that models the discriminating power of the individual. The package consists of a main function, which calls subfunctions for fitting binary, graded, and continuous responses. The program, a detailed user’s guide, and an empirical example are available at no cost to the interested practitioner.
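The core idea of a DM can be sketched in a few lines, assuming, for illustration only, that the extra person parameter scales the item slope so that less consistent respondents produce flatter item response functions; consult the package's user's guide for InDisc's exact parameterization.

```python
import numpy as np

def dual_model_prob(theta, gamma, a, b):
    """Illustrative dual-model IRF for a binary item: the person
    discrimination gamma multiplies the item slope a, so low-gamma
    respondents answer less consistently at any trait level.  A sketch of
    the general idea, not InDisc's exact parameterization."""
    return 1.0 / (1.0 + np.exp(-a * gamma * (theta - b)))

# the same trait level under two levels of person consistency
for gamma in (0.4, 2.0):
    print(gamma, round(dual_model_prob(theta=1.0, gamma=gamma, a=1.2, b=0.0), 3))
```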


Assessment ◽  
2016 ◽  
Vol 25 (4) ◽  
pp. 513-526 ◽  
Author(s):  
Nigel Guenole ◽  
Anna A. Brown ◽  
Andrew J. Cooper

This article describes an investigation of whether Thurstonian item response modeling is a viable method for assessment of maladaptive traits. Forced-choice responses from 420 working adults to a broad-range personality inventory assessing six maladaptive traits were considered. The Thurstonian item response model’s fit to the forced-choice data was adequate, while the fit of a counterpart item response model to responses to the same items but arranged in a single-stimulus design was poor. Monotrait heteromethod correlations indicated corresponding traits in the two formats overlapped substantially, although they did not measure equivalent constructs. A better goodness of fit and higher factor loadings for the Thurstonian item response model, coupled with a clearer conceptual alignment to the theoretical trait definitions, suggested that the single-stimulus item responses were influenced by biases that the independent clusters measurement model did not account for. Researchers may wish to consider forced-choice designs and appropriate item response modeling techniques such as Thurstonian item response modeling for personality questionnaire applications in industrial psychology, especially when assessing maladaptive traits. We recommend further investigation of this approach in actual selection situations and with different assessment instruments.
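For reference, the pairwise-comparison kernel of a Thurstonian item response model can be written compactly under the standard latent-utility formulation, in which a binary outcome records whether statement i's utility exceeds statement k's. The sketch below is illustrative, with parameter values chosen arbitrarily rather than taken from this study.

```python
from math import sqrt
from scipy.stats import norm

def thurstonian_pair_prob(eta_a, eta_b, lam_i, lam_k, gamma, psi2_i, psi2_k):
    """Probability of preferring statement i (loading lam_i on trait a)
    over statement k (loading lam_k on trait b) in a Thurstonian IRT model:
    utilities t = mu + lam * eta + eps, choice y = 1 iff t_i >= t_k,
    gamma = mu_k - mu_i is the pair threshold, psi2 are the utility
    uniquenesses."""
    num = -gamma + lam_i * eta_a - lam_k * eta_b
    return norm.cdf(num / sqrt(psi2_i + psi2_k))

# a respondent high on trait a and low on trait b, equally keyed statements
print(round(thurstonian_pair_prob(1.0, -1.0, 0.8, 0.7, 0.0, 0.36, 0.51), 3))
```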

