Situational Judgment Tests as a method for measuring personality: Development and validity evidence for a test of Dependability

PLoS ONE ◽  
2019 ◽  
Vol 14 (2) ◽  
pp. e0211884 ◽  
Author(s):  
Gabriel Olaru ◽  
Jeremy Burrus ◽  
Carolyn MacCann ◽  
Franklin M. Zaromb ◽  
Oliver Wilhelm ◽  
...


2016 ◽  
Vol 9 (1) ◽  
pp. 71-77 ◽  
Author(s):  
Deborah L. Whetzel ◽  
Matthew C. Reeder

Situational judgment tests (SJTs) occasionally fail to predict job performance in criterion-related validation studies, often despite much effort to follow scholarly recipes for their development. This commentary offers some plausible explanations for why this may occur, as well as some tips for SJT development. In most cases, we frame the issue from an implicit trait policy (ITP) perspective (Motowidlo, Hooper, & Jackson, 2006a, 2006b) and the measurement of general domain knowledge. In other instances, the issue has no direct tie to the ITP concept, but our experience suggests it is of sufficient importance to include in this response. The first two issues involve challenges in gathering validity evidence to support the use of SJTs; the remaining issues deal more directly with SJT design considerations.
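Motowidlo and colleagues operationalize an implicit trait policy as a within-person association between a respondent's effectiveness judgments of SJT response options and the degree to which each option expresses the focal trait. A minimal sketch of that scoring idea in Python, assuming expert trait codings are available for each option (the function name and example data are illustrative, not taken from the commentary):

```python
import numpy as np

def itp_score(effectiveness_ratings, trait_expression):
    """Implicit trait policy (ITP) score for one respondent.

    effectiveness_ratings: the respondent's rated effectiveness of each
        SJT response option (e.g., on a 1-7 scale), pooled across items.
    trait_expression: expert-judged degree to which each option
        expresses the focal trait (e.g., agreeableness).

    The ITP is the within-person correlation between the two vectors:
    higher values mean the respondent judges trait-expressive behavior
    to be more effective.
    """
    return np.corrcoef(effectiveness_ratings, trait_expression)[0, 1]

# Illustrative data for 8 response options (values are invented).
ratings = [6, 5, 2, 7, 3, 4, 6, 2]      # respondent's effectiveness judgments
expression = [5, 4, 1, 7, 2, 3, 6, 1]   # expert trait-expression codings
print(f"ITP = {itp_score(ratings, expression):.2f}")
```

Respondents with higher ITP scores judge trait-expressive behavior as more effective, which is the kind of general domain knowledge the commentary argues SJTs capture.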


2020 ◽  
Author(s):  
Michael D. Wolcott ◽  
Nikki G. Lobczowski ◽  
Jacqueline M. Zeeman ◽  
Jacqueline E. McLaughlin

Abstract

Background: Situational judgment tests (SJTs) are used in health sciences education to measure knowledge using case-based scenarios. Despite their popularity, there is a significant gap in the validity evidence and research on the response process demonstrating how SJTs measure their intended constructs. Models of the SJT response process have been proposed in the literature; however, few studies explore and expand these models beyond surface-level attributes. The purpose of this study was to describe the factors and strategies involved in the cognitive process examinees use as they respond to SJT items.

Methods: Thirty participants (15 students and 15 experienced practitioners) completed a 12-item SJT designed to measure empathy. Each participant engaged in a think-aloud interview while completing the SJT, followed by a cognitive interview probing their decision-making processes. Interviews were transcribed and independently coded by three researchers to identify salient themes and factors that contributed to the response process.

Results: Results suggested that the SJT response process included the complex integration of comprehension, retrieval, judgment, and response selection. Each of these response process stages was influenced by attributes such as the perceived objective of the task, job-specific knowledge, assumptions about the scenario, and item setting.

Conclusions: This study provides an evaluation of the SJT response process and contributes exploratory information to the validity evidence of SJTs; these findings can inform the design, interpretation, and utility of SJTs.
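The abstract does not report an agreement statistic, but independent coding by multiple researchers is commonly checked with a chance-corrected index such as Fleiss' kappa. A minimal sketch under that assumption (the code categories and counts below are invented for illustration, not taken from the study):

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for agreement among a fixed number of raters.

    counts: (n_segments, n_codes) array; counts[i, j] is how many
        raters assigned transcript segment i to code j. Row sums must
        all equal the number of raters (here, 3 coders).
    """
    counts = np.asarray(counts, dtype=float)
    n_raters = counts.sum(axis=1)[0]
    # Observed agreement per segment, averaged across segments.
    p_i = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    # Chance agreement from the marginal code proportions.
    p_j = counts.sum(axis=0) / counts.sum()
    p_e = np.square(p_j).sum()
    return (p_bar - p_e) / (1 - p_e)

# Illustrative: 5 transcript segments, 3 coders, 3 codes
# (comprehension, retrieval, judgment).
table = [[3, 0, 0],
         [2, 1, 0],
         [0, 3, 0],
         [0, 1, 2],
         [3, 0, 0]]
print(f"Fleiss' kappa = {fleiss_kappa(table):.2f}")
```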


2010 ◽  
Author(s):  
James N. Kurtessis ◽  
Kelley J. Krokos ◽  
Barbara A. Fritzsche

Author(s):  
Kelley J. Krokos ◽  
Adam W. Meade ◽  
April R. Cantwell ◽  
Samuel B. Pond ◽  
Mark A. Wilson

2018 ◽  
Vol 23 (2) ◽  
pp. 342-366 ◽  
Author(s):  
Jiyun Zu ◽  
Patrick C. Kyllonen

We evaluated the use of the nominal response model (NRM) to score multiple-choice (also known as “select the best option”) situational judgment tests (SJTs). Using data from two large studies, we compared the reliability and external correlations of NRM scores with those from various classical test theory and item response theory (IRT) scoring methods. The SJTs measured emotional management (Study 1) and teamwork and collaboration (Study 2). In Study 1, NRM scoring yielded higher reliability and higher correlations with external measures than three classical test theory–based and four other IRT-based methods. In Study 2, only slight differences between scoring methods were observed. One explanation for the discrepancy is that where item keys are ambiguous (as in Study 1), the NRM accommodates that ambiguity, whereas where item keys are clear (as in Study 2), the different methods provide interchangeable scores. We characterize ambiguous and clear keys using category response curves based on NRM parameter estimates and discuss the relationship between our findings and the wisdom-of-the-crowd literature.
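For context, Bock's nominal response model assigns each response option its own slope and intercept, and the category response curves mentioned above are the resulting multinomial-logit probabilities: P(X = k | θ) = exp(a_k·θ + c_k) / Σ_j exp(a_j·θ + c_j). A minimal sketch of those probabilities in Python (the item parameters are invented for illustration, not estimates from either study):

```python
import numpy as np

def nrm_category_probs(theta, slopes, intercepts):
    """Category response probabilities under Bock's nominal response model.

    P(X = k | theta) = exp(a_k * theta + c_k) / sum_j exp(a_j * theta + c_j)

    theta: trait level(s), scalar or 1-D array.
    slopes (a), intercepts (c): one per response option; for model
        identification they are typically constrained to sum to zero.
    """
    theta = np.atleast_1d(theta)
    z = np.outer(theta, slopes) + intercepts   # (n_theta, n_options)
    z -= z.max(axis=1, keepdims=True)          # guard against overflow
    expz = np.exp(z)
    return expz / expz.sum(axis=1, keepdims=True)

# Illustrative parameters for one 4-option SJT item (each set sums to 0).
slopes = np.array([1.2, 0.3, -0.5, -1.0])
intercepts = np.array([0.2, 0.5, -0.1, -0.6])
probs = nrm_category_probs(np.linspace(-3, 3, 7), slopes, intercepts)
print(np.round(probs, 2))  # each row sums to 1 across the four options
```

A clearly keyed item shows one option whose curve dominates at high θ; an ambiguously keyed item shows two or more options with similar slopes, which is the pattern the NRM can exploit and the likely source of its advantage in Study 1.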

