Considering Local Dependencies: Person Parameter Estimation for IRT Models of Forced-Choice Data

Author(s):
Safir Yousfi


2018 ◽
Vol 43 (3) ◽
pp. 226-240 ◽
Author(s):
Philseok Lee ◽
Seang-Hwane Joo ◽
Stephen Stark ◽
Oleksandr S. Chernyshenko

Historically, multidimensional forced choice (MFC) measures have been criticized because conventional scoring methods can lead to ipsativity problems that render scores unsuitable for interindividual comparisons. However, with the recent advent of item response theory (IRT) scoring methods that yield normative information, MFC measures are surging in popularity and becoming important components in high-stakes evaluation settings. This article aims to add to burgeoning methodological advances in MFC measurement by focusing on statement and person parameter recovery for the GGUM-RANK (generalized graded unfolding-RANK) IRT model. A Markov chain Monte Carlo (MCMC) algorithm was developed for estimating GGUM-RANK statement and person parameters directly from MFC rank responses. Simulation studies examined how the psychometric properties of the statements composing MFC items, test length, and sample size influenced statement and person parameter estimation, and explored the benefits of measurement using MFC triplets relative to pairs. To demonstrate this methodology, an empirical validity study was then conducted using an MFC triplet personality measure. The results and implications of these studies for future research and practice are discussed.
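For readers who want to see the moving parts, the following Python sketch illustrates the building blocks implied above: the dichotomous GGUM agreement probability, the Luce-style sequential-choice (RANK) decomposition that turns agreement probabilities into a triplet rank-order probability, and a generic random-walk Metropolis update for a person parameter. It is a minimal illustration under a standard-normal prior on theta, not the authors' estimation code; all function names are ours.

```python
import numpy as np

def ggum_agree(theta, alpha, delta, tau):
    """Dichotomous GGUM probability of agreeing with a statement.

    theta : person trait level
    alpha : statement discrimination
    delta : statement location
    tau   : subjective-response threshold (tau_1; tau_0 is fixed at 0)
    """
    d = theta - delta
    # z = 0 (disagree) and z = 1 (agree) terms; M = 2C + 1 = 3 for C = 1
    disagree = 1.0 + np.exp(alpha * 3.0 * d)
    agree = np.exp(alpha * (d - tau)) + np.exp(alpha * (2.0 * d - tau))
    return agree / (agree + disagree)

def rank_prob_triplet(p):
    """Probability of observing the rank order s1 > s2 > s3 for a triplet,
    given the three GGUM agreement probabilities p = (p1, p2, p3), via the
    sequential-choice decomposition: pick the best of three, then of two."""
    p1, p2, p3 = p
    num1 = p1 * (1 - p2) * (1 - p3)
    den1 = num1 + (1 - p1) * p2 * (1 - p3) + (1 - p1) * (1 - p2) * p3
    num2 = p2 * (1 - p3)
    den2 = num2 + (1 - p2) * p3
    return (num1 / den1) * (num2 / den2)

def mh_update_theta(theta, loglik, step=0.3, rng=None):
    """One random-walk Metropolis update for a person parameter under a
    standard-normal prior; loglik(theta) is the rank-response log-likelihood."""
    rng = np.random.default_rng() if rng is None else rng
    prop = theta + step * rng.standard_normal()
    log_ratio = (loglik(prop) - 0.5 * prop**2) - (loglik(theta) - 0.5 * theta**2)
    return prop if np.log(rng.uniform()) < log_ratio else theta

# example: rank-order probability for one triplet at theta = 0.5
alphas, deltas, taus = [1.2, 1.0, 0.9], [-1.0, 0.0, 1.0], [-0.5, -0.6, -0.4]
ps = [ggum_agree(0.5, a, d, t) for a, d, t in zip(alphas, deltas, taus)]
print(rank_prob_triplet(ps))
```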


2020 ◽  
Article 001316442093486
Author(s):  
Niklas Schulte ◽  
Heinz Holling ◽  
Paul-Christian Bürkner

Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high. To determine the necessary number of traits under varying sample sizes, factor loadings, and intertrait correlations, simulations were performed for the two most widely used scoring methods, namely the classical (ipsative) approach and Thurstonian item response theory (IRT) models. Results demonstrate that although Thurstonian IRT models in particular perform well under ideal conditions, both methods yield insufficient reliabilities in most conditions resembling applied contexts. Moreover, not only the classical estimates but also the Thurstonian IRT estimates for questionnaires with equally keyed items remain (partially) ipsative, even when the number of traits is very high (i.e., 30). This result not only questions earlier assumptions regarding the use of classical scores in high-dimensional questionnaires, but it also raises doubts about many validation studies on Thurstonian IRT models, because correlations of (partially) ipsative scores with external criteria cannot be interpreted in the usual way.
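The ipsativity at the heart of these results is easy to demonstrate. The sketch below uses the simplest classical scheme, one point to the trait of the statement ranked first on each item (an illustrative assumption, not the paper's exact scoring), and shows that every respondent's trait scores sum to the same constant, so between-person comparisons of score profiles are uninterpretable.

```python
import numpy as np

rng = np.random.default_rng(0)

n_persons, n_items, n_traits = 5, 30, 6
# classical scoring: each forced-choice item awards 1 point to the trait
# whose statement was ranked first and 0 to all other traits
chosen_trait = rng.integers(0, n_traits, size=(n_persons, n_items))
scores = np.zeros((n_persons, n_traits), dtype=int)
for i in range(n_persons):
    for j in range(n_items):
        scores[i, chosen_trait[i, j]] += 1

# every row sums to n_items regardless of the response pattern
print(scores.sum(axis=1))  # -> [30 30 30 30 30]
```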


1971 ◽  
Vol 32 (2) ◽  
pp. 533-534 ◽  
Author(s):  
Carl Auerbach

A method for correcting two-alternative forced-choice data for response bias is presented which requires only a table of integrals of a normal distribution.
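A correction in this spirit (our sketch, not necessarily Auerbach's exact procedure) averages the two position-conditional proportions correct on the z-scale, which removes an additive position bias under the equal-variance normal model; scipy's norm.cdf and norm.ppf stand in for the printed table of normal integrals.

```python
from scipy.stats import norm

def corrected_pc(p1, p2):
    """Bias-corrected proportion correct for two-alternative forced choice.

    p1, p2 : observed proportions correct when the target appears in the
             first vs. second position (or interval).  Averaging on the
             z-scale cancels an additive position bias."""
    z = 0.5 * (norm.ppf(p1) + norm.ppf(p2))
    return norm.cdf(z)

# example: a strong position bias (p1 = .95, p2 = .55) corrects to about .81
print(round(corrected_pc(0.95, 0.55), 3))
```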


2021 ◽  
Author(s):  
Qiuli Ma ◽  
Jeffrey Joseph Starns ◽  
David Kellen

We explored a two-stage recognition memory paradigm in which people first make single-item “studied”/“not studied” decisions and then have a chance to correct their errors in forced-choice trials. Each forced-choice trial included one studied word (“target”) and one non-studied word (“lure”) that received the same previous single-item response. For example, a “studied”-“studied” trial would have a target that was correctly called “studied” and a lure that was incorrectly called “studied.” The two-high-threshold (2HT) model and the unequal-variance signal detection (UVSD) model predict opposite effects of biasing the initial single-item responses on subsequent forced-choice accuracy. Results from two experiments showed that the bias effect is actually near zero and well outside the range of effects predicted by either model. Follow-up analyses showed that the model failures were not a function of experimental artifacts such as changing memory states between the two types of recognition trials. Follow-up analyses also showed that the dual-process signal detection (DPSD) model made better predictions for the forced-choice data than the 2HT and UVSD models.
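The conditional forced-choice accuracy at issue can be made concrete with a small Monte Carlo sketch of the UVSD account (our illustration, not the authors' code): simulate strengths, apply a first-stage criterion, keep targets and lures that both drew a “studied” response, and score the forced choice by the higher strength. Shifting the criterion shows how a bias manipulation propagates to conditional accuracy under this one model; the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def fc_accuracy(criterion, d=1.0, sigma=1.25, n=200_000):
    """Conditional forced-choice accuracy for "studied"-"studied" pairs
    under UVSD: targets ~ N(d, sigma), lures ~ N(0, 1); a word is called
    'studied' if its strength exceeds the criterion, and the forced choice
    picks whichever item has the higher strength."""
    targets = rng.normal(d, sigma, n)
    lures = rng.normal(0.0, 1.0, n)
    # keep only items that received a 'studied' response at stage one
    t = targets[targets > criterion]
    l = lures[lures > criterion]
    m = min(len(t), len(l))          # pair them up at random
    return np.mean(t[:m] > l[:m])

for c in (-0.5, 0.0, 0.5):           # liberal -> conservative first-stage bias
    print(c, round(fc_accuracy(c), 3))
```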


2020 ◽  
Vol 44 (4) ◽  
pp. 327-328
Author(s):  
Pere J. Ferrando ◽  
David Navarro-González

InDisc is an R package that implements procedures for estimating and fitting unidimensional Item Response Theory (IRT) Dual Models (DMs). DMs are intended for personality and attitude measures and are, essentially, extended standard IRT models with an extra person parameter that models the discriminating power of the individual. The package consists of a main function, which calls subfunctions for fitting binary, graded, and continuous responses. The program, a detailed user’s guide, and an empirical example are available at no cost to the interested practitioner.
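The core idea of a DM can be sketched in a few lines, assuming, for illustration only, that the extra person parameter scales the item slope so that less consistent respondents produce flatter item response functions; consult the package's user's guide for InDisc's exact parameterization.

```python
import numpy as np

def dual_model_prob(theta, gamma, a, b):
    """Illustrative dual-model IRF for a binary item: the person
    discrimination gamma multiplies the item slope a, so low-gamma
    respondents answer less consistently at any trait level.  A sketch of
    the general idea, not InDisc's exact parameterization."""
    return 1.0 / (1.0 + np.exp(-a * gamma * (theta - b)))

# the same trait level under two levels of person consistency
for gamma in (0.4, 2.0):
    print(gamma, round(dual_model_prob(theta=1.0, gamma=gamma, a=1.2, b=0.0), 3))
```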


Assessment ◽  
2016 ◽  
Vol 25 (4) ◽  
pp. 513-526 ◽  
Author(s):  
Nigel Guenole ◽  
Anna A. Brown ◽  
Andrew J. Cooper

This article describes an investigation of whether Thurstonian item response modeling is a viable method for assessment of maladaptive traits. Forced-choice responses from 420 working adults to a broad-range personality inventory assessing six maladaptive traits were considered. The Thurstonian item response model’s fit to the forced-choice data was adequate, while the fit of a counterpart item response model to responses to the same items but arranged in a single-stimulus design was poor. Monotrait heteromethod correlations indicated corresponding traits in the two formats overlapped substantially, although they did not measure equivalent constructs. A better goodness of fit and higher factor loadings for the Thurstonian item response model, coupled with a clearer conceptual alignment to the theoretical trait definitions, suggested that the single-stimulus item responses were influenced by biases that the independent clusters measurement model did not account for. Researchers may wish to consider forced-choice designs and appropriate item response modeling techniques such as Thurstonian item response modeling for personality questionnaire applications in industrial psychology, especially when assessing maladaptive traits. We recommend further investigation of this approach in actual selection situations and with different assessment instruments.
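For reference, the pairwise-comparison kernel of a Thurstonian item response model can be written compactly under the standard latent-utility formulation, in which a binary outcome records whether statement i's utility exceeds statement k's. The sketch below is illustrative, with parameter values chosen arbitrarily rather than taken from this study.

```python
from math import sqrt
from scipy.stats import norm

def thurstonian_pair_prob(eta_a, eta_b, lam_i, lam_k, gamma, psi2_i, psi2_k):
    """Probability of preferring statement i (loading lam_i on trait a)
    over statement k (loading lam_k on trait b) in a Thurstonian IRT model:
    utilities t = mu + lam * eta + eps, choice y = 1 iff t_i >= t_k,
    gamma = mu_k - mu_i is the pair threshold, psi2 are the utility
    uniquenesses."""
    num = -gamma + lam_i * eta_a - lam_k * eta_b
    return norm.cdf(num / sqrt(psi2_i + psi2_k))

# a respondent high on trait a and low on trait b, equally keyed statements
print(round(thurstonian_pair_prob(1.0, -1.0, 0.8, 0.7, 0.0, 0.36, 0.51), 3))
```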

