The Fakeability of Personality Measurement with Graded Paired Comparisons

2021
Author(s):
Niklas Schulte
Lucas Kaup
Paul-Christian Bürkner
Heinz Holling

This pre-registered study compares the faking resistance of Likert scales and graded paired comparisons (GPCs) analyzed with Thurstonian IRT models. Based on findings on other forced-choice formats, we hypothesized that GPCs would be more resistant to faking than Likert scales by resulting in lower score inflation and better recovery of applicants’ true (i.e., honest) trait scores. A total of N = 573 participants completed either the Likert or GPC version of a personality questionnaire first honestly and then in an applicant scenario. Results show that participants were able to increase their scores in both the Likert and GPC format, though their score inflation was smaller in the GPC than in the Likert format. However, GPCs did not exhibit higher honest–faking correlations than Likert scales; under certain conditions, we even observed negative associations. These results challenge mean score inflation as the dominant paradigm for judging the utility of forced-choice questionnaires in high-stakes situations. Even if FC factor scores are less inflated, their ability to recover true trait standings in high-stakes situations might be lower compared with Likert scales. Moreover, in the GPC format, faking effects correlated almost perfectly with the social desirability differences of the corresponding statements, highlighting the importance of matching statements equal in social desirability when constructing forced-choice questionnaires.

2020
pp. 001316442093486
Author(s):
Niklas Schulte
Heinz Holling
Paul-Christian Bürkner

Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high. To determine the necessary number of traits under varying sample sizes, factor loadings, and intertrait correlations, simulations were performed for the two most widely used scoring methods, namely the classical (ipsative) approach and Thurstonian item response theory (IRT) models. Results demonstrate that while especially Thurstonian IRT models perform well under ideal conditions, both methods yield insufficient reliabilities in most conditions resembling applied contexts. Moreover, not only the classical estimates but also the Thurstonian IRT estimates for questionnaires with equally keyed items remain (partially) ipsative, even when the number of traits is very high (i.e., 30). This result not only questions earlier assumptions regarding the use of classical scores in high-dimensional questionnaires, but it also raises doubts about many validation studies on Thurstonian IRT models because correlations of (partially) ipsative scores with external criteria cannot be interpreted in a usual way.
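
As a rough illustration of why classical forced-choice scores are called ipsative, the sketch below scores simulated rank-order responses and checks that every respondent ends up with the same total across traits. The function names (`classical_fc_scores`, `simulate_person`), the triplet design, and the rank-sum scoring rule are assumptions made for this example; they are not taken from the study and do not reproduce its simulations.

```python
import random


def classical_fc_scores(rankings, block_traits, n_traits):
    """Sum the rank given to each item into the trait that item measures.

    rankings: one list of ranks per block (0 = least like me).
    block_traits: one list of trait indices per block, aligned with rankings.
    """
    scores = [0] * n_traits
    for ranks, traits in zip(rankings, block_traits):
        for rank, trait in zip(ranks, traits):
            scores[trait] += rank
    return scores


def simulate_person(rng, block_traits):
    """Draw a random full ranking of the items within every block."""
    rankings = []
    for traits in block_traits:
        ranks = list(range(len(traits)))
        rng.shuffle(ranks)
        rankings.append(ranks)
    return rankings


# Hypothetical design: 4 triplets, each with one item per trait.
rng = random.Random(0)
n_traits = 3
block_traits = [[0, 1, 2]] * 4

# Collect the total score (summed over traits) of 100 simulated respondents.
totals = set()
for _ in range(100):
    person = simulate_person(rng, block_traits)
    totals.add(sum(classical_fc_scores(person, block_traits, n_traits)))

# Every block hands out the ranks 0 + 1 + 2 = 3, so all totals equal 4 * 3.
print(totals)
```

Because the total is fixed by the design, a high score on one trait forces lower scores elsewhere, which is the reason such scores support only within-person comparisons.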


1969
Vol 25 (1)
pp. 93-94
Author(s):
John R. Braun
John J. Tinley

The novel approach used in trying to control the social desirability of the alternatives in Fricke's forced-choice Self-descriptive Check List is discussed. Evidence is presented which indicates that the test does not succeed in producing clusters of alternatives which are functionally equivalent in desirability.


2019
Vol 79 (5)
pp. 827-854
Author(s):
Paul-Christian Bürkner
Niklas Schulte
Heinz Holling

Forced-choice questionnaires have been proposed to avoid common response biases typically associated with rating scale questionnaires. To overcome ipsativity issues of trait scores obtained from classical scoring approaches of forced-choice items, advanced methods from item response theory (IRT) such as the Thurstonian IRT model have been proposed. For convenient model specification, we introduce the thurstonianIRT R package, which uses Mplus, lavaan, and Stan for model estimation. Based on practical considerations, we establish that items within one block need to be equally keyed to achieve similar social desirability, which is essential for creating forced-choice questionnaires that have the potential to resist faking intentions. According to extensive simulations, measuring up to five traits using blocks of only equally keyed items does not yield sufficiently accurate trait scores and inter-trait correlation estimates, neither for frequentist nor for Bayesian estimation methods. As a result, persons’ trait scores remain partially ipsative and, thus, do not allow for valid comparisons between persons. However, we demonstrate that trait scores based on only equally keyed blocks can be improved substantially by measuring a sizable number of traits. More specifically, in our simulations of 30 traits, scores based on only equally keyed blocks were non-ipsative and highly accurate. We conclude that in high-stakes situations where persons are motivated to give fake answers, Thurstonian IRT models should only be applied to tests measuring a sizable number of traits.


2020
Author(s):
Eunike Wetzel
Susanne Frick
Anna Brown

A common concern with self-reports of personality traits in selection contexts is faking. The multidimensional forced-choice (MFC) format has been proposed as an alternative to rating scales (RS) that could prevent faking. The goal of this study was to compare the susceptibility of the MFC format and RS format to faking in a simulated high-stakes setting when using normative scoring for both formats. Participants were randomly assigned to three groups (total N = 1,867) and filled out the Big Five Triplets once under an honest instruction and once under a fake-good instruction. Latent mean differences between the honest and fake-good administrations indicated that the Big Five domains were faked in the expected direction. Faking effects for all traits were larger for RS compared to MFC. Faking effects were also larger for the MFC version with mixed triplets compared to the MFC version with triplets that were fully matched regarding their social desirability. The MFC format does not prevent faking completely, but it reduces faking substantially. Faking can be further reduced in the MFC format by matching the items presented in a block regarding their social desirability.


2018
Author(s):
Paul-Christian Bürkner
Niklas Schulte
Heinz Holling

Forced-choice questionnaires have been proposed to avoid common response biases typically associated with rating scale questionnaires. To overcome ipsativity issues of trait scores obtained from classical scoring approaches of forced-choice items, advanced methods from item response theory (IRT) such as the Thurstonian IRT model have been proposed. For convenient model specification, we introduce the thurstonianIRT R package, which uses Mplus, lavaan, and Stan for model estimation. Based on practical considerations, we establish that items within one block need to be equally keyed to achieve similar social desirability, which is essential for creating forced-choice questionnaires that have the potential to resist faking intentions. According to extensive simulations, measuring up to 5 traits using blocks of only equally keyed items does not yield sufficiently accurate trait scores and inter-trait correlation estimates, neither for frequentist nor for Bayesian estimation methods. As a result, persons' trait scores remain partially ipsative and, thus, do not allow for valid comparisons between persons. However, we demonstrate that trait scores based on only equally keyed blocks can be improved substantially by measuring a sizeable number of traits. More specifically, in our simulations of 30 traits, scores based on only equally keyed blocks were non-ipsative and highly accurate. We conclude that in high-stakes situations where persons are motivated to give fake answers, Thurstonian IRT models should only be applied to tests measuring a sizeable number of traits.


2019
Vol 35 (6)
pp. 855-867
Author(s):
John T. Kulas
Rachael Klahr
Lindsey Knights

Abstract. Many investigators have noted “reverse-coding” method factors when exploring response pattern structure with psychological inventory data. The current article probes for the existence of a confound in these investigations, whereby an item’s level of saturation with socially desirable content tends to covary with the item’s substantive scale keying. We first investigate its existence, demonstrating that 15 of 16 measures that have been previously implicated as exhibiting a reverse-scoring method effect can also be reasonably characterized as exhibiting a scoring key/social desirability confound. A second set of analyses targets the extent to which the confounding variable may confuse interpretation of factor analytic results and documents strong social desirability associations. The results suggest that assessment developers perhaps consider the social desirability scale value of indicators when constructing scale aggregates (and possibly scales when investigating inter-construct associations). Future investigations would ideally disentangle the confound via experimental manipulation.


2013
Vol 44 (3)
pp. 209-218
Author(s):
Benoît Testé
Samantha Perrin

The present research examines the social value attributed to endorsing the belief in a just world for self (BJW-S) and for others (BJW-O) in a Western society. We conducted four studies in which we asked participants to assess a target who endorsed BJW-S vs. BJW-O either strongly or weakly. Results showed that endorsement of BJW-S was socially valued and had a greater effect on social utility judgments than it did on social desirability judgments. In contrast, the main effect of endorsement of BJW-O was to reduce the target’s social desirability. The results also showed that the effect of BJW-S on social utility is mediated by the target’s perceived individualism, whereas the effect of BJW-S and BJW-O on social desirability is mediated by the target’s perceived collectivism.


Author(s):  
Ann Marie Ryan
Jacob Bradburn
Sarena Bhatia
Evan Beals
Anthony S. Boyce
...

1994
Vol 12 (2)
pp. 211-226
Author(s):
Dick Leith

Abstract: To non-specialists, academic disciplines invariably seem homogeneous, even monolithic. But even a relatively young discipline such as modern linguistics is more diverse in its procedures and concerns than might appear to those working in other fields. In this paper I attempt to show how certain kinds of linguistic inquiry might be relevant to those whose primary concern is rhetoric. I argue that these practices are often opposed to what I call the dominant paradigm in modern linguistics, with its commitment to abstraction and idealization. I discuss first those strands of linguistics, such as discourse analysis, text-linguistics, and stylistics, which tend to take the social formation for granted; I end by considering recent trends in so-called critical language study. Finally, I offer some thoughts on how linguistics may proceed in order to achieve a more programmatic rapprochement with rhetoric.


Author(s):  
Christian List

Abstract: The aim of this exploratory paper is to review an under-appreciated parallel between group agency and artificial intelligence. As both phenomena involve non-human goal-directed agents that can make a difference to the social world, they raise some similar moral and regulatory challenges, which require us to rethink some of our anthropocentric moral assumptions. Are humans always responsible for those entities’ actions, or could the entities bear responsibility themselves? Could the entities engage in normative reasoning? Could they even have rights and a moral status? I will tentatively defend the (increasingly widely held) view that, under certain conditions, artificial intelligent systems, like corporate entities, might qualify as responsible moral agents and as holders of limited rights and legal personhood. I will further suggest that regulators should permit the use of autonomous artificial systems in high-stakes settings only if they are engineered to function as moral (not just intentional) agents and/or there is some liability-transfer arrangement in place. I will finally raise the possibility that if artificial systems ever became phenomenally conscious, there might be a case for extending a stronger moral status to them, but argue that, as of now, this remains very hypothetical.

