Situational Judgment Tests as a method for measuring personality: Development and validity evidence for a test of Dependability

PLoS ONE ◽  
2019 ◽  
Vol 14 (2) ◽  
pp. e0211884 ◽  
Author(s):  
Gabriel Olaru ◽  
Jeremy Burrus ◽  
Carolyn MacCann ◽  
Franklin M. Zaromb ◽  
Oliver Wilhelm ◽  
...


2016 ◽  
Vol 9 (1) ◽  
pp. 71-77 ◽  
Author(s):  
Deborah L. Whetzel ◽  
Matthew C. Reeder

Situational judgment tests (SJTs) occasionally fail to predict job performance in criterion-related validation studies, often despite much effort to follow scholarly recipes for their development. This commentary offers some plausible explanations for why this may occur, as well as some tips for SJT development. In most cases, we frame the issue from an implicit trait policy (ITP) perspective (Motowidlo, Hooper, & Jackson, 2006a, 2006b) and the measurement of general domain knowledge. In other instances, the issue has no direct tie to the ITP concept, but our experience suggests it is of sufficient importance to include in this response. The first two issues involve challenges in gathering validity evidence to support the use of SJTs; the remaining issues deal more directly with SJT design considerations.
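Motowidlo and colleagues operationalize an implicit trait policy as a within-person association between a respondent's effectiveness judgments of SJT response options and the degree to which each option expresses the focal trait. A minimal sketch of that scoring idea in Python, assuming expert trait codings are available for each option (the function name and example data are illustrative, not taken from the commentary):

```python
import numpy as np

def itp_score(effectiveness_ratings, trait_expression):
    """Implicit trait policy (ITP) score for one respondent.

    effectiveness_ratings: the respondent's rated effectiveness of each
        SJT response option (e.g., on a 1-7 scale), pooled across items.
    trait_expression: expert-judged degree to which each option
        expresses the focal trait (e.g., agreeableness).

    The ITP is the within-person correlation between the two vectors:
    higher values mean the respondent judges trait-expressive behavior
    to be more effective.
    """
    return np.corrcoef(effectiveness_ratings, trait_expression)[0, 1]

# Illustrative data for 8 response options (values are invented).
ratings = [6, 5, 2, 7, 3, 4, 6, 2]      # respondent's effectiveness judgments
expression = [5, 4, 1, 7, 2, 3, 6, 1]   # expert trait-expression codings
print(f"ITP = {itp_score(ratings, expression):.2f}")
```

Respondents with higher ITP scores judge trait-expressive behavior as more effective, which is the kind of general domain knowledge the commentary argues SJTs capture.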


2020 ◽  
Author(s):  
Michael D. Wolcott ◽  
Nikki G. Lobczowski ◽  
Jacqueline M. Zeeman ◽  
Jacqueline E. McLaughlin

Abstract

Background: Situational judgment tests (SJTs) are used in health sciences education to measure knowledge using case-based scenarios. Despite their popularity, there is a significant gap in the validity evidence and research on the response process demonstrating how SJTs measure their intended constructs. Models of the SJT response process have been proposed in the literature; however, few studies explore and expand these models beyond surface-level attributes. The purpose of this study was to describe the factors and strategies involved in the cognitive process examinees use as they respond to SJT items.

Methods: Thirty participants (15 students and 15 experienced practitioners) completed a 12-item SJT designed to measure empathy. Each participant engaged in a think-aloud interview while completing the SJT, followed by a cognitive interview probing their decision-making processes. Interviews were transcribed and independently coded by three researchers to identify salient themes and factors that contributed to the response process.

Results: Results suggested that the SJT response process included the complex integration of comprehension, retrieval, judgment, and response selection. Each of these response process stages was influenced by attributes such as the perceived objective of the task, job-specific knowledge, assumptions about the scenario, and item setting.

Conclusions: This study provides an evaluation of the SJT response process and contributes exploratory information to the validity evidence of SJTs; these findings can inform the design, interpretation, and utility of SJTs.
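The abstract does not report an agreement statistic, but independent coding by multiple researchers is commonly checked with a chance-corrected index such as Fleiss' kappa. A minimal sketch under that assumption (the code categories and counts below are invented for illustration, not taken from the study):

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for agreement among a fixed number of raters.

    counts: (n_segments, n_codes) array; counts[i, j] is how many
        raters assigned transcript segment i to code j. Row sums must
        all equal the number of raters (here, 3 coders).
    """
    counts = np.asarray(counts, dtype=float)
    n_raters = counts.sum(axis=1)[0]
    # Observed agreement per segment, averaged across segments.
    p_i = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    # Chance agreement from the marginal code proportions.
    p_j = counts.sum(axis=0) / counts.sum()
    p_e = np.square(p_j).sum()
    return (p_bar - p_e) / (1 - p_e)

# Illustrative: 5 transcript segments, 3 coders, 3 codes
# (comprehension, retrieval, judgment).
table = [[3, 0, 0],
         [2, 1, 0],
         [0, 3, 0],
         [0, 1, 2],
         [3, 0, 0]]
print(f"Fleiss' kappa = {fleiss_kappa(table):.2f}")
```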


2010 ◽  
Author(s):  
James N. Kurtessis ◽  
Kelley J. Krokos ◽  
Barbara A. Fritzsche

Author(s):  
Kelley J. Krokos ◽  
Adam W. Meade ◽  
April R. Cantwell ◽  
Samuel B. Pond ◽  
Mark A. Wilson

2018 ◽  
Vol 23 (2) ◽  
pp. 342-366 ◽  
Author(s):  
Jiyun Zu ◽  
Patrick C. Kyllonen

We evaluated the use of the nominal response model (NRM) to score multiple-choice (also known as “select the best option”) situational judgment tests (SJTs). Using data from two large studies, we compared the reliability and external correlations of NRM scores with those from various classical test theory and item response theory (IRT) scoring methods. The SJTs measured emotional management (Study 1) and teamwork and collaboration (Study 2). In Study 1, NRM scoring yielded higher reliability and higher correlations with external measures than three classical test theory–based and four other IRT-based methods. In Study 2, only slight differences between scoring methods were observed. One explanation for the discrepancy is that where item keys are ambiguous (as in Study 1), the NRM accommodates that ambiguity, whereas where item keys are clear (as in Study 2), the different methods provide interchangeable scores. We characterize ambiguous and clear keys using category response curves based on NRM parameter estimates and discuss the relationship between our findings and the wisdom-of-the-crowd literature.
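For context, Bock's nominal response model assigns each response option its own slope and intercept, and the category response curves mentioned above are the resulting multinomial-logit probabilities: P(X = k | θ) = exp(a_k·θ + c_k) / Σ_j exp(a_j·θ + c_j). A minimal sketch of those probabilities in Python (the item parameters are invented for illustration, not estimates from either study):

```python
import numpy as np

def nrm_category_probs(theta, slopes, intercepts):
    """Category response probabilities under Bock's nominal response model.

    P(X = k | theta) = exp(a_k * theta + c_k) / sum_j exp(a_j * theta + c_j)

    theta: trait level(s), scalar or 1-D array.
    slopes (a), intercepts (c): one per response option; for model
        identification they are typically constrained to sum to zero.
    """
    theta = np.atleast_1d(theta)
    z = np.outer(theta, slopes) + intercepts   # (n_theta, n_options)
    z -= z.max(axis=1, keepdims=True)          # guard against overflow
    expz = np.exp(z)
    return expz / expz.sum(axis=1, keepdims=True)

# Illustrative parameters for one 4-option SJT item (each set sums to 0).
slopes = np.array([1.2, 0.3, -0.5, -1.0])
intercepts = np.array([0.2, 0.5, -0.1, -0.6])
probs = nrm_category_probs(np.linspace(-3, 3, 7), slopes, intercepts)
print(np.round(probs, 2))  # each row sums to 1 across the four options
```

A clearly keyed item shows one option whose curve dominates at high θ; an ambiguously keyed item shows two or more options with similar slopes, which is the pattern the NRM can exploit and the likely source of its advantage in Study 1.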

