Unimodal and cross-modal identity judgements using an audio-visual sorting task: Evidence for independent processing of faces and voices

2020 ◽  
Author(s):  
Nadine Lavan ◽  
Harriet M. J. Smith ◽  
Carolyn McGettigan

Unimodal and cross-modal information provided by faces and voices can contribute to identity percepts. To examine how these unimodal and cross-modal sources of information interact, we devised a novel audio-visual sorting task in which participants were required to group video-only and audio-only clips into two identities. In a series of three experiments, we show that unimodal face and voice sorting were more accurate than cross-modal sorting: While face sorting was consistently most accurate, followed by voice sorting, cross-modal sorting was at chance level or below. In Experiment 1, we contextualised performance in our novel audio-visual sorting task by comparing it to a traditional identity matching task, finding that unimodal and cross-modal identity perception were more accurate in the matching task. In Experiment 2, separating unimodal from cross-modal sorting led to small improvements in accuracy for unimodal sorting, but no change in cross-modal sorting performance. Finally, in Experiment 3, we explored the effect of minimal audio-visual training: Participants were shown an audio-visual clip of the two identities in conversation prior to completing the sorting task. This minimal training led to small but non-significant improvements in accuracy for both unimodal and cross-modal sorting. Our results indicate that, for unfamiliar people, face and voice perception operate relatively independently, with no evidence of mutual benefit. We also show that extracting reliable cross-modal information for identity judgements is challenging.


Perception ◽  
2019 ◽  
Vol 48 (2) ◽  
pp. 175-184 ◽  
Author(s):  
Robin S. S. Kramer ◽  
Sophie Mohamed ◽  
Sarah C. Hardy

Matching two different images of an unfamiliar face is difficult, although we rely on this process every day when proving our identity. Although previous work with laboratory photosets has shown that performance is error-prone, few studies have focussed on how accurately people carry out this matching task using photographs taken from official forms of identification. In Experiment 1, participants matched high-resolution, colour face photos with current UK driving licence photos of the same group of people in a sorting task. With an average of 19 mistaken pairings out of 30, our results showed that this task was both difficult and error-prone. In Experiment 2, high-resolution photographs were paired with either driving licence or passport photographs in a typical pairwise matching paradigm. We found no difference in performance levels for the two types of ID image, with both producing unacceptable levels of accuracy (around 75%–79% correct). The current work benefits from increased ecological validity and provides a clear demonstration that these forms of official identification are ineffective and that alternatives should be considered.


2020 ◽  
Author(s):  
Jens Kreitewolf ◽  
Nadine Lavan ◽  
Jonas Obleser ◽  
Carolyn McGettigan

Familiar and unfamiliar voice perception are often understood as being distinct from each other. For identity perception, theoretical work has proposed that listeners use acoustic information in different ways to perceive identity from familiar and unfamiliar voices: Unfamiliar voices are thought to be processed based on close comparisons of acoustic properties, while familiar voices are processed based on diagnostic acoustic features that activate a stored person-specific representation of that voice. To date, no empirical study has directly examined whether and how familiar and unfamiliar listeners differ in their use of acoustic information for identity perception. Here, we tested this theoretical claim by linking listeners’ judgements in voice identity tasks to complex acoustic representations—spectral similarity of the heard voice recordings. Participants (N = 150) who were either familiar or unfamiliar with a set of voices completed an identity discrimination task (Experiment 1) or an identity sorting task (Experiment 2). In both experiments, identity judgements for familiar and unfamiliar voices alike were guided by spectral similarity: Pairs of recordings with greater acoustic similarity were more likely to be perceived as belonging to the same voice identity. However, while there were no differences in how familiar and unfamiliar listeners used acoustic information for identity discrimination, differences were apparent for identity sorting. Our study therefore challenges proposals that view familiar and unfamiliar voice perception as being distinct at all times and suggests a critical role of the listening situation in which familiar and unfamiliar voices are evaluated.


2021 ◽  
Author(s):  
Njie ◽  
Nadine Lavan ◽  
Carolyn McGettigan

Familiarity benefits in voice identity perception have been frequently described in the literature. Typically, studies have contrasted listeners who were either familiar or unfamiliar with the target voices, thus manipulating talker familiarity. In these studies, familiarity with a voice results in more accurate voice identity perception. Such talker familiarity is, however, only one way in which listeners can be familiar with the stimuli used: Another type of familiarity that has been shown to benefit voice identity perception is language or accent familiarity. In the current study, we examine and compare the effects of talker and accent familiarity in the context of a voice identity sorting task, using naturally varying voice recordings from the TV show “Derry Girls”. Voice samples were thus all spoken with a regional accent of UK/Irish English (Northern Irish). We tested four listener groups: Listeners were either familiar or unfamiliar with the TV show (and therefore the talker identities) and were either highly familiar or relatively less familiar with the accent. We find that both talker and accent familiarity significantly improve accuracy of voice identity perception. However, the effect sizes for talker familiarity are overall larger. We discuss our findings in light of existing models of voice perception, arguing that they provide evidence for interactions of speech and identity processing pathways in voice perception. We conclude that voice perception is a highly interactive process, during which listeners make use of any available information to achieve their perceptual goals.


2021 ◽  
pp. 174702182110097
Author(s):  
Niamh Hunnisett ◽  
Simone Favelle

Unfamiliar face identification is concerningly error prone, especially across changes in viewing conditions. Within-person variability has been shown to improve matching performance for unfamiliar faces, but this has only been demonstrated using images of a front view. In this study, we test whether the advantage of within-person variability from front views extends to matching to target images of a face rotated in view. Participants completed either a simultaneous matching task (Experiment 1) or a sequential matching task (Experiment 2) in which they were tested on their ability to match the identity of a face shown in an array of either one or three ambient front-view images, with a target image shown in front, three-quarter, or profile view. While the effect was stronger in Experiment 2, we found a consistent pattern in match trials across both experiments in that there was a multiple image matching benefit for front, three-quarter, and profile-view targets. We found multiple image effects for match trials only, indicating that providing observers with multiple ambient images confers an advantage for recognising different images of the same identity but not for discriminating between images of different identities. Signal detection measures also indicate a multiple image advantage despite a more liberal response bias for multiple image trials. Our results show that within-person variability information for unfamiliar faces can be generalised across views and can provide insights into the initial processes involved in the representation of familiar faces.


2019 ◽  
Author(s):  
Nicholas Blauch ◽  
Marlene Behrmann ◽  
David C. Plaut

Humans are generally thought to be experts at face recognition, and yet identity perception for unfamiliar faces is surprisingly poor compared to that for familiar faces. Prior theoretical work has argued that unfamiliar face identity perception suffers because the majority of identity-invariant visual variability is idiosyncratic to each identity, and thus, each face identity must be learned essentially from scratch. Using a high-performing deep convolutional neural network, we evaluate this claim by examining the effects of visual experience in untrained, object-expert and face-expert networks. We found that only face training led to substantial generalization in an identity verification task of novel unfamiliar identities. Moreover, generalization increased with the number of previously learned identities, highlighting the generality of identity-invariant information in face images. To better understand how familiarity builds upon generic face representations, we simulated familiarization with face identities by fine-tuning the network on images of the previously unfamiliar identities. Familiarization produced a sharp boost in verification, but only approached ceiling performance in the networks that were highly trained on faces. Moreover, in these face-expert networks, the sharp familiarity benefit was seen only at the identity-based output layer, and did not depend on changes to perceptual representations; rather, familiarity effects required learning only at the level of identity readout from a fixed expert representation. Our results thus reconcile the existence of a large familiar face advantage with claims that both familiar and unfamiliar face identity processing depend on shared expert perceptual representations.


2018 ◽  
Author(s):  
Anna K Bobak ◽  
Viktoria Roumenova Mileva ◽  
Peter Hancock

The role of image colour in face identification has received little attention in research, despite the importance of identifying people from photographs in identity documents (IDs). Here, in two experiments, we investigated whether the colour congruency of two photographs shown side by side affects face matching accuracy. Participants were presented with two images from the Models Face Matching Test (Experiment 1) and a newly devised matching task incorporating female faces (Experiment 2) and asked to decide whether they showed the same person or two different people. The photographs were either both in colour, both in grayscale, or mixed (one in grayscale and one in colour). Participants were more likely to accept a pair of images as a “match” (i.e., the same person) in the mixed condition, regardless of whether the identity of the pair was the same or not. This demonstrates a clear shift in bias between the “congruent” colour conditions and the mixed trials. In addition, there was a small decline in accuracy in the mixed condition, relative to when both images were presented in colour. Our study provides the first evidence that the colour format of document photographs matters for face matching performance. This finding has important implications for the design and regulation of photographic ID worldwide.


Perception ◽  
2018 ◽  
Vol 47 (4) ◽  
pp. 414-431 ◽  
Author(s):  
Robin S. S. Kramer ◽  
Michael G. Reynolds

Research has systematically examined how laboratory participants and real-world practitioners decide whether two face photographs show the same person or not using frontal images. In contrast, research has not examined face matching using profile images. In Experiment 1, we asked whether matching unfamiliar faces is easier with frontal compared with profile views. Participants completed the original, frontal version of the Glasgow Face Matching Test, and also an adapted version in which all face pairs were presented in profile. There was no difference in performance across the two tasks, suggesting that both views were similarly useful for face matching. Experiments 2 and 3 examined whether matching unfamiliar faces is improved when both frontal and profile views are provided. We compared face matching accuracy when both a frontal and a profile image of each face were presented with accuracy using each view alone. Surprisingly, we found no benefit when both views were presented together in either experiment. Overall, these results suggest that either frontal and profile views provide substantially overlapping information regarding identity, or participants are unable to utilise both sources of information when making decisions. Each of these conclusions has important implications for face matching research and real-world identification.


1996 ◽  
Vol 49 (2) ◽  
pp. 295-314 ◽  
Author(s):  
Ruth Campbell ◽  
Barbara Brooks ◽  
Edward de Haan ◽  
Tony Roberts

The separability of different subcomponents of face processing has been regularly affirmed, but not always so clearly demonstrated. In particular, the ability to extract speech from faces (lip-reading) has been shown to doubly dissociate from face identification in neurological but not in other populations. In this series of experiments with undergraduates, the classification of speech sounds (lip-reading) from personally familiar and unfamiliar face photographs was explored using speeded manual responses. The independence of lip-reading from identity-based processing was confirmed. Furthermore, the established pattern of independence of expression-matching from, and dependence of identity-matching on, face familiarity was extended to personally familiar faces and “difficult”-emotion decisions. The implications of these findings are discussed.


2012 ◽  
Vol 12 (9) ◽  
pp. 981-981
Author(s):  
R. Bennetts ◽  
D. Burke ◽  
K. Brooks ◽  
J. Kim ◽  
S. Lucey ◽  
...  
