McGurk Illusion
Recently Published Documents


TOTAL DOCUMENTS: 29 (five years: 9)

H-INDEX: 9 (five years: 1)

Author(s): Basil Wahn, Laura Schmitz, Alan Kingstone, Anne Böckler-Raettig

Abstract: Eye contact is a dynamic social signal that captures attention and plays a critical role in human communication. In particular, direct gaze often accompanies communicative acts in an ostensive function: a speaker directs her gaze towards the addressee to highlight the fact that this message is being intentionally communicated to her. The addressee, in turn, integrates the speaker’s auditory and visual speech signals (i.e., her vocal sounds and lip movements) into a unitary percept. It is an open question whether the speaker’s gaze affects how the addressee integrates the speaker’s multisensory speech signals. We investigated this question using the classic McGurk illusion, an illusory percept created by presenting mismatching auditory (vocal sounds) and visual (speaker’s lip movements) information. Specifically, we manipulated whether the speaker (a) moved his eyelids up/down (i.e., opened/closed his eyes) prior to speaking or showed no eye motion, and (b) spoke with open or closed eyes. When the speaker’s eyes moved (i.e., opened or closed) before an utterance, and when the speaker spoke with closed eyes, the McGurk illusion was weakened (i.e., addressees reported significantly fewer illusory percepts). In line with previous research, this suggests that the motion (opening or closing) as well as the closed state of the speaker’s eyes captured addressees’ attention, thereby reducing the influence of the speaker’s lip movements on the addressees’ audiovisual integration process. Our findings reaffirm the power of speaker gaze to guide attention, showing that its dynamics can modulate low-level processes such as the integration of multisensory speech signals.


2021
Author(s): Kennis Ma, Jan Schnupp

The “unity assumption hypothesis” contends that higher-level factors, such as a perceiver’s beliefs and prior experience, modulate multisensory integration. The McGurk illusion exemplifies such integration: when a visual velar /ga/ is dubbed with an auditory bilabial /ba/, listeners unify the discrepant signals, knowing that open lips cannot produce /ba/, and perceive the fusion percept /da/. Previous research claimed to have falsified this theory by demonstrating that the McGurk effect occurs even when a face is dubbed with a gender-incongruent voice. But perhaps stronger evidence than just an apparent incongruence between unfamiliar faces and voices is needed to prevent perceptual unity. Here we investigated whether the McGurk illusion with gender-incongruent stimuli can be disrupted by priming with the appropriate pairing of face and voice. In an online experiment, 89 participants aged 18-62 were randomly allocated to experimental trials containing either a male or a female face paired with a gender-incongruent voice. The number of times participants experienced a McGurk illusion was measured before and after a training block that familiarized them with the true pairings of face and voice. After training and priming, susceptibility to the McGurk effect decreased significantly on average. The findings support the notion that unity assumptions modulate intersensory bias, and they confirm and extend previous studies using gender-incongruent McGurk stimuli.


2021, Vol. 149 (4), pp. A32-A32
Author(s): Kristin J. Van Engen

2021, Vol. 15
Author(s): Mariel G. Gonzales, Kristina C. Backer, Brenna Mandujano, Antoine J. Shahin

The McGurk illusion occurs when listeners hear an illusory percept (i.e., “da”), resulting from mismatched pairings of audiovisual (AV) speech stimuli (i.e., auditory /ba/ paired with visual /ga/). Hearing a third percept, distinct from both the auditory and visual input, has been used as evidence of AV fusion. We examined whether the McGurk illusion is instead driven by visual dominance, whereby the third percept, e.g., “da,” represents a default percept for visemes with an ambiguous place of articulation (POA), like /ga/. Participants watched videos of a talker uttering various consonant-vowel (CV) syllables, with (AV) and without (V-only) the audio of /ba/. Individuals transcribed the CV they saw (V-only) or heard (AV). In the V-only condition, individuals predominantly saw “da”/“ta” when viewing CVs with indiscernible POAs. Likewise, in the AV condition, upon perceiving an illusion, they predominantly heard “da”/“ta” for CVs with indiscernible POAs. The illusion was stronger in individuals who exhibited weak /ba/ auditory encoding (examined using a control auditory-only task). In Experiment 2, we attempted to replicate these findings using stimuli recorded from a different talker. The V-only results were not replicated, but again individuals predominantly heard “da”/“ta”/“tha” as an illusory percept for various AV combinations, and the illusion was stronger in individuals who exhibited weak /ba/ auditory encoding. These results demonstrate that when visual CVs with indiscernible POAs are paired with a weakly encoded auditory /ba/, listeners default to hearing “da”/“ta”/“tha”, thus tempering the AV fusion account and favoring a default mechanism triggered when both AV stimuli are ambiguous.


PLoS ONE, 2021, Vol. 16 (2), pp. e0246986
Author(s): Alma Lindborg, Tobias S. Andersen

Speech is perceived with both the ears and the eyes. Adding congruent visual speech improves the perception of a faint auditory speech stimulus, whereas adding incongruent visual speech can alter the perception of the utterance. The latter phenomenon is exemplified by the McGurk illusion, where an auditory stimulus such as “ba” dubbed onto a visual stimulus such as “ga” produces the illusion of hearing “da”. Bayesian models of multisensory perception suggest that both the enhancement and the illusion can be described as a two-step process of binding (informed by prior knowledge) and fusion (informed by the information reliability of each sensory cue). However, no study to date has accounted for how binding and fusion each contribute to audiovisual speech perception. In this study, we expose subjects to both congruent and incongruent audiovisual speech, manipulating the binding and the fusion stages simultaneously. This is done by varying both the temporal offset (binding) and the auditory and visual signal-to-noise ratios (fusion). We fit two Bayesian models to the behavioural data and show that both can account for the enhancement effect in congruent audiovisual speech as well as for the McGurk illusion. This modelling approach allows us to disentangle the effects of binding and fusion on behavioural responses. Moreover, we find that these models have greater predictive power than a forced-fusion model. This study provides a systematic and quantitative approach to measuring audiovisual integration in the perception of the McGurk illusion as well as of congruent audiovisual speech, which we hope will inform future work on audiovisual speech perception.
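To make the binding-and-fusion distinction concrete, the sketch below follows a standard Bayesian causal-inference formulation from the multisensory literature; it is not necessarily the specific pair of models fitted in this study. "Binding" appears as the posterior probability that the auditory and visual cues share a common cause, and "fusion" as the inverse-variance (reliability-weighted) combination of the cues. The 1-D phonetic axis, parameter values, and function names are illustrative assumptions.

```python
import numpy as np

# Minimal sketch, not the authors' fitted models: a standard Bayesian
# causal-inference formulation on an arbitrary 1-D phonetic axis.
# "Binding" = posterior probability of a common cause for the two cues;
# "fusion"  = inverse-variance (reliability-weighted) cue combination.
# All parameter values are illustrative assumptions.

def fuse(x_a, x_v, var_a, var_v):
    """Forced-fusion estimate: average the cues weighted by their reliability."""
    w_a = (1.0 / var_a) / (1.0 / var_a + 1.0 / var_v)
    return w_a * x_a + (1.0 - w_a) * x_v

def p_common(x_a, x_v, var_a, var_v, var_p=100.0, p_c=0.5):
    """Posterior probability that the cues share one source (the binding stage)."""
    # Likelihood of the cue pair given a single source, integrated over a
    # zero-mean Gaussian prior on the source location (variance var_p).
    d_one = var_a * var_v + var_a * var_p + var_v * var_p
    quad_one = ((x_a - x_v) ** 2 * var_p + x_a ** 2 * var_v + x_v ** 2 * var_a) / d_one
    like_one = np.exp(-0.5 * quad_one) / (2.0 * np.pi * np.sqrt(d_one))
    # Likelihood of the cue pair given two independent sources.
    quad_two = x_a ** 2 / (var_a + var_p) + x_v ** 2 / (var_v + var_p)
    like_two = np.exp(-0.5 * quad_two) / (2.0 * np.pi * np.sqrt((var_a + var_p) * (var_v + var_p)))
    return like_one * p_c / (like_one * p_c + like_two * (1.0 - p_c))

def heard_location(x_a, x_v, var_a, var_v, var_p=100.0, p_c=0.5):
    """Model-averaged auditory estimate: fused when bound, auditory-only otherwise."""
    pc = p_common(x_a, x_v, var_a, var_v, var_p, p_c)
    return pc * fuse(x_a, x_v, var_a, var_v) + (1.0 - pc) * x_a

# Illustrative McGurk-like case: auditory /ba/ at 0.0, visual /ga/ at 2.0,
# with the visual cue more reliable; the estimate is pulled towards an
# intermediate (/da/-like) location.
print(heard_location(x_a=0.0, x_v=2.0, var_a=1.0, var_v=0.5))
```

In this formulation, a small audiovisual discrepancy yields a high binding probability and a percept dominated by the more reliable cue, whereas a large mismatch (e.g., from temporal offset or low signal-to-noise ratio) lowers the binding probability and pulls the report back towards the auditory cue, which is qualitatively the manipulation described in the abstract.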


Cortex, 2020, Vol. 129, pp. 266-280
Author(s): Stephanie Rosemann, Dakota Smith, Marie Dewenter, Christiane M. Thiel

2018, Vol. 120 (6), pp. 2988-3000
Author(s): Noelle T. Abbott, Antoine J. Shahin

In spoken language, audiovisual (AV) perception occurs when the visual modality influences the encoding of acoustic features (e.g., phonetic representations) at the auditory cortex. We examined how visual speech (mouth movements) transforms phonetic representations, indexed by changes to the N1 auditory evoked potential (AEP). EEG was acquired while human subjects watched and listened to videos of a speaker uttering the consonant-vowel (CV) syllables /ba/ and /wa/, presented in auditory-only contexts, in AV congruent or incongruent contexts, or in a context in which the consonants were replaced by white noise (noise-replaced). Subjects reported whether they heard “ba” or “wa.” We hypothesized that the auditory N1 amplitude during illusory perception (caused by incongruent AV input, as in the McGurk illusion, or by white-noise-replaced consonants in CV utterances) should shift to reflect the auditory N1 characteristics of the phonemes conveyed visually (by mouth movements) as opposed to acoustically. Indeed, the N1 AEP became larger and occurred earlier when listeners experienced illusory “ba” (video /ba/, audio /wa/, heard as “ba”) and vice versa when they experienced illusory “wa” (video /wa/, audio /ba/, heard as “wa”), mirroring the N1 AEP characteristics for /ba/ and /wa/ observed in natural acoustic situations (e.g., the auditory-only setting). This visually mediated N1 behavior was also observed for noise-replaced CVs. Taken together, the findings suggest that information relayed by the visual modality modifies phonetic representations at the auditory cortex and that similar neural mechanisms support the McGurk illusion and visually mediated phonemic restoration.

NEW & NOTEWORTHY: Using a variant of the McGurk illusion experimental design (with the syllables /ba/ and /wa/), we demonstrate that lipreading influences phonetic encoding at the auditory cortex. We show that the N1 auditory evoked potential morphology shifts to resemble the N1 morphology of the syllable conveyed visually. We also show similar N1 shifts when the consonants are replaced by white noise, suggesting that the McGurk illusion and visually mediated phonemic restoration rely on common mechanisms.
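For readers unfamiliar with how N1 amplitude and latency are typically quantified, the generic sketch below (not the authors' analysis pipeline) averages epoched single-channel EEG and locates the most negative deflection within a conventional N1 window of roughly 75-150 ms after stimulus onset. The array shapes, sampling rate, and window bounds are illustrative assumptions.

```python
import numpy as np

# Generic sketch (not the authors' pipeline): estimate N1 peak amplitude and
# latency from single-channel epoched EEG by averaging trials and finding the
# most negative deflection in a conventional N1 window (~75-150 ms).

def n1_peak(epochs, sfreq, tmin, window=(0.075, 0.150)):
    """epochs: (n_trials, n_samples) array for one channel; tmin: epoch start (s).
    Returns (amplitude, latency_in_seconds) of the N1 peak."""
    evoked = epochs.mean(axis=0)                        # trial average -> AEP
    times = tmin + np.arange(evoked.size) / sfreq       # time axis of the epoch
    mask = (times >= window[0]) & (times <= window[1])  # restrict to the N1 window
    peak_idx = np.argmin(evoked[mask])                  # most negative point = N1
    return evoked[mask][peak_idx], times[mask][peak_idx]

# Example with synthetic data: 100 trials, 0.6-s epochs starting 0.1 s
# pre-stimulus, sampled at 500 Hz.
rng = np.random.default_rng(0)
fake_epochs = rng.normal(0.0, 1.0, size=(100, 300))
amp, lat = n1_peak(fake_epochs, sfreq=500.0, tmin=-0.1)
print(f"N1 amplitude: {amp:.2f} (arbitrary units), latency: {lat * 1000:.0f} ms")
```

A "larger" N1 in this sense means a more negative peak amplitude, and an "earlier" N1 a shorter latency, which is how the visually driven shifts reported above would show up in such a measure.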

