Modeling Faking in the Multidimensional Forced-Choice Format: The Faking Mixture Model

Psychometrika ◽  
2021 ◽  
Author(s):  
Susanne Frick

Abstract: The multidimensional forced-choice (MFC) format has been proposed to reduce faking because items within blocks can be matched on desirability. However, the desirability of individual items might not transfer to the item blocks. The aim of this paper is to propose a mixture item response theory model for faking in the MFC format, termed the Faking Mixture model, that allows the fakability of MFC blocks to be estimated. Given current computing capabilities, within-subject data from both high- and low-stakes contexts are needed to estimate the model. A simulation showed good parameter recovery under various conditions. An empirical validation showed that matching was necessary but not sufficient to create an MFC questionnaire that can reduce faking. The Faking Mixture model can be used to reduce fakability during test construction.
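To make the block-level idea concrete, a generic mixture formulation is sketched below; this is an illustration only and may differ from the paper's exact parameterization. Here π_b denotes the fakability of block b, and the trait-driven term is whatever MFC response model describes honest (low-stakes) responding.

```latex
% Sketch only: a generic block-level mixture, not necessarily the paper's
% exact model. \pi_b = fakability of MFC block b.
\[
P(y_{ib} \mid \text{high stakes})
  = \pi_b \, P_{\mathrm{fake}}(y_{ib})
  + (1 - \pi_b) \, P(y_{ib} \mid \boldsymbol{\theta}_i)
\]
```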

2020 ◽  
pp. 109442812095982
Author(s):  
Philseok Lee ◽  
Seang-Hwane Joo ◽  
Stephen Stark

Although modern item response theory (IRT) methods of test construction and scoring have overcome ipsativity problems historically associated with multidimensional forced choice (MFC) formats, there has been little research on MFC differential item functioning (DIF) detection, where item refers to a block, or group, of statements presented for an examinee’s consideration. This research investigated DIF detection with three-alternative MFC items based on the Thurstonian IRT (TIRT) model, using omnibus Wald tests on loadings and thresholds. We examined constrained- and free-baseline model comparison strategies with different types and magnitudes of DIF, latent trait correlations, sample sizes, and levels of impact in an extensive Monte Carlo study. Results indicated the free-baseline strategy was highly effective in detecting DIF, with power approaching 1.0 in the large-sample-size and large-magnitude-of-DIF conditions, and similar effectiveness in the impact and no-impact conditions. This research also included an empirical example to demonstrate the viability of the best-performing method with real examinees and showed how DIF and DTF effect size measures can be used to assess the practical significance of MFC DIF findings.
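The omnibus Wald test compares a block's loading and threshold estimates across reference and focal groups. A minimal numerical sketch of that statistic is given below; the parameter estimates and covariance matrices are hypothetical, not values from the study.

```python
import numpy as np
from scipy import stats

# Hypothetical TIRT block parameters (loadings and thresholds) estimated
# separately in a reference and a focal group; values are made up for illustration.
est_ref = np.array([0.85, 0.70, 0.60, -0.20, 0.35, 0.10])
est_foc = np.array([0.80, 0.55, 0.62, -0.05, 0.40, 0.12])
cov_ref = np.eye(6) * 0.004    # hypothetical error covariance, reference group
cov_foc = np.eye(6) * 0.005    # hypothetical error covariance, focal group

diff = est_ref - est_foc
wald = diff @ np.linalg.inv(cov_ref + cov_foc) @ diff   # omnibus Wald statistic
df = diff.size
p_value = stats.chi2.sf(wald, df)
print(f"Wald = {wald:.2f}, df = {df}, p = {p_value:.4f}")
```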


Assessment ◽  
2019 ◽  
Vol 27 (4) ◽  
pp. 706-718 ◽  
Author(s):  
Kate E. Walton ◽  
Lina Cherkasova ◽  
Richard D. Roberts

Forced choice (FC) measures may be a desirable alternative to single stimulus (SS) Likert items, which are easier to fake and can carry associated response biases. However, classical methods of scoring FC measures lead to ipsative data, which have a number of psychometric problems. A Thurstonian item response theory (TIRT) model has been introduced as a way to overcome these issues, but few empirical validity studies have been conducted to ensure its effectiveness. This was the goal of the current three studies, which used FC measures of domains from popular personality frameworks, including the Big Five and HEXACO, and both statement and adjective item stems. We computed TIRT and ipsative scores and compared their validity estimates. Convergent and discriminant validity of the scores were evaluated by correlating them with SS scores, and test-criterion validity evidence was evaluated by examining their relationships with meaningful outcomes. In all three studies, there was evidence for the convergent and test-criterion validity of the TIRT scores, though at times this was on par with the validity of the ipsative scores. The discriminant validity of the TIRT scores was problematic and often worse than that of the ipsative scores.
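The convergent/discriminant comparison described above amounts to inspecting same-trait versus cross-trait correlations between the FC-based scores and the SS scores. A toy sketch of that computation, with simulated scores standing in for real data, is:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated "true" traits and three sets of observed scores: single-stimulus
# (SS), Thurstonian IRT (TIRT), and classically scored ipsative. Illustrative only.
traits = rng.normal(size=(n, 5))
ss = traits + rng.normal(scale=0.5, size=(n, 5))
tirt = traits + rng.normal(scale=0.6, size=(n, 5))
ipsative = tirt - tirt.mean(axis=1, keepdims=True)   # row-centering mimics ipsativity

def cross_corr(a, b):
    """Correlation of each column of a with each column of b."""
    k = a.shape[1]
    return np.corrcoef(a, b, rowvar=False)[:k, k:]

for name, fc_scores in [("TIRT", tirt), ("ipsative", ipsative)]:
    r = cross_corr(fc_scores, ss)
    convergent = np.diag(r).mean()                                 # same trait
    off_diag = np.abs(r[~np.eye(r.shape[0], dtype=bool)]).mean()  # different traits
    print(f"{name}: mean convergent r = {convergent:.2f}, "
          f"mean |cross-trait r| = {off_diag:.2f}")
```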


2019 ◽  
Vol 80 (3) ◽  
pp. 578-603
Author(s):  
HyeSun Lee ◽  
Weldon Z. Smith

Based on the framework of testlet models, the current study proposes the Bayesian random block item response theory (BRB IRT) model to fit forced-choice formats in which an item block is composed of three or more items. To account for local dependence among items within a block, the BRB IRT model incorporates a random block effect into the response function and uses a Markov chain Monte Carlo procedure for simultaneous estimation of item and trait parameters. The simulation results demonstrated that the BRB IRT model performed well for the estimation of item and trait parameters and for screening examinees with relatively low scores on target traits. As found in the literature, the composition of item blocks was crucial for model performance; negatively keyed items were required within item blocks. The empirical application showed that the performance of the BRB IRT model was equivalent to that of the Thurstonian IRT model. The potential advantage of the BRB IRT model as a base for more complex measurement models was also demonstrated by incorporating gender as a covariate into the BRB IRT model to explain response probabilities. Recommendations for the adoption of forced-choice formats are provided, along with a discussion of using negatively keyed items.
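In testlet-style notation, local dependence within a block can be absorbed by a person-by-block random term added to the response function. The expression below is a generic sketch of this idea for the comparison of statements s and t in block b; it is not necessarily the exact BRB IRT parameterization.

```latex
% Sketch: \gamma_{ib} is person i's random effect for block b, absorbing
% local dependence among the pairwise comparisons within that block.
\[
P(s \succ t \mid \boldsymbol{\theta}_i, \gamma_{ib})
  = \frac{\exp\{\lambda_s \theta_{i,d(s)} - \lambda_t \theta_{i,d(t)} + \gamma_{ib}\}}
         {1 + \exp\{\lambda_s \theta_{i,d(s)} - \lambda_t \theta_{i,d(t)} + \gamma_{ib}\}},
  \qquad \gamma_{ib} \sim N(0, \sigma_b^2)
\]
```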


2021 ◽  
pp. 073428292110037
Author(s):  
Carlos Calderón Carvajal ◽  
Carmen Ximénez Gómez ◽  
Siu Lay-Lisboa ◽  
Mauricio Briceño

Kolb’s Learning Style Inventory (LSI) continues to generate great debate among researchers, given the contradictory evidence concerning its psychometric properties. One primary criticism focuses on the artificiality of the results derived from its internal structure because of the ipsative nature of the forced-choice format. This study seeks to contribute to the resolution of this debate. A short version of Kolb’s LSI with a forced-choice format and an additional inventory scored on a Likert scale were completed by a sample of students at the Universidad Católica del Norte in Antofagasta, Chile. The data obtained from the two forms of the reduced version of the LSI were compared using principal component analysis, confirmatory factor analysis, and the Thurstonian item response theory model. The results support the hypothesis of four learning mode dimensions. However, they do not support the existence of the learning styles proposed by Kolb, indicating that the reported styles are a product of the artificial structure generated by the ipsative forced-choice format.
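The artificiality argument rests on a basic property of ipsative scores: because every respondent's scores sum to the same constant, negative correlations among scales are forced regardless of the true trait structure. A brief simulated illustration (not LSI data):

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_scales = 300, 4

# Uncorrelated "true" scale scores on a Likert-like metric (illustrative data only).
raw = rng.normal(loc=3.0, scale=1.0, size=(n_persons, n_scales))

# Ipsatize: after row-centering, each person's scores sum to the same constant.
ipsative = raw - raw.mean(axis=1, keepdims=True)

print(np.corrcoef(raw, rowvar=False).round(2))       # off-diagonals near 0
print(np.corrcoef(ipsative, rowvar=False).round(2))  # forced toward -1/(k-1) = -0.33
```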


2017 ◽  
Vol 41 (8) ◽  
pp. 600-613 ◽  
Author(s):  
Wen-Chung Wang ◽  
Xue-Lan Qiu ◽  
Chia-Wen Chen ◽  
Sage Ro ◽  
Kuan-Yu Jin

There is re-emerging interest in adopting forced-choice items to address the issue of response bias in Likert-type items for noncognitive latent traits. Multidimensional pairwise comparison (MPC) items are commonly used forced-choice items. However, few studies have aimed at developing item response theory models for MPC items, owing to the challenges associated with ipsativity. Acknowledging that the absolute scales of latent traits are not identifiable in ipsative tests, this study developed a Rasch ipsative model for MPC items that has desirable measurement properties, yields a single utility value for each statement, and allows for comparing psychological differentiation between and within individuals. The simulation results showed good parameter recovery for the new model using existing computer programs. This article also provides an empirical example of an ipsative test on work style and behaviors.
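For orientation, a generic Rasch-type response function for an MPC item is sketched below, in which person i chooses between statements s and t measuring dimensions d(s) and d(t), with statement utilities δ. This is a schematic version, not the authors' exact ipsative parameterization.

```latex
% Sketch: \delta_s, \delta_t are statement utilities; \theta_{i,d(s)} is person
% i's level on the dimension measured by statement s.
\[
P(s \succ t \mid \boldsymbol{\theta}_i)
  = \frac{\exp\{(\theta_{i,d(s)} + \delta_s) - (\theta_{i,d(t)} + \delta_t)\}}
         {1 + \exp\{(\theta_{i,d(s)} + \delta_s) - (\theta_{i,d(t)} + \delta_t)\}}
\]
```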


2021 ◽  
pp. 001316442199841
Author(s):  
Pere J. Ferrando ◽  
David Navarro-González

Item response theory “dual” models (DMs), in which both items and individuals are viewed as sources of differential measurement error, have so far been proposed only for unidimensional measures. This article proposes two multidimensional extensions of existing DMs: the M-DTCRM (dual Thurstonian continuous response model), intended for (approximately) continuous responses, and the M-DTGRM (dual Thurstonian graded response model), intended for ordered-categorical responses (including binary). A rationale for the extension to the multiple-content-dimensions case, based on the concept of the multidimensional location index, is first proposed and discussed. Then, the models are described using both the factor-analytic and the item response theory parameterizations. Procedures for (a) calibrating the items, (b) scoring individuals, (c) assessing model appropriateness, and (d) assessing measurement precision are finally discussed. The simulation results suggest that the proposal is quite feasible, and an illustrative example based on personality data is also provided. The proposals should be of particular interest for multidimensional questionnaires in which the number of items per scale is not enough to arrive at stable estimates when the existing unidimensional DMs are fitted on a separate-scale basis.
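Schematically, the “dual” idea for a continuous response is that the residual variance decomposes into an item-specific and a person-specific component, as sketched below; this conveys the general structure rather than the full M-DTCRM specification.

```latex
% Sketch of the "dual" idea for a continuous response: error variance has
% an item-specific and a person-specific component.
\[
X_{ji} = \mu_i + \lambda_i \theta_{j,d(i)} + \varepsilon_{ji},
\qquad
\operatorname{Var}(\varepsilon_{ji}) = \sigma^2_{(i)} + \sigma^2_{(j)}
\]
```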

