Left and Right Hemifield Advantages of Fusions and Combinations in Audiovisual Speech Perception

1995 ◽  
Vol 48 (2) ◽  
pp. 320-333 ◽  
Author(s):  
Eugen Diesch

If a place-of-articulation contrast is created between the auditory and the visual component syllables of videotaped speech, the syllable that listeners report hearing frequently differs phonetically from the auditory component. These "McGurk effects", as they have come to be called, show that speech perception may involve some kind of intermodal process. There are two classes of these phenomena: fusions and combinations. Perception of the syllable /da/ when auditory /ba/ and visual /ga/ are presented provides a clear example of the former, and perception of the string /bga/ after presentation of auditory /ga/ and visual /ba/ an unambiguous instance of the latter. Besides perceptual fusions and combinations, responses in which listeners report hearing the visually presented component syllable also show an influence of vision on audition. It is argued that these "visual" responses arise from basically the same underlying processes that yield fusions and combinations, respectively. In the present study, the visual component of audiovisually incongruous CV syllables was presented in either the left or the right visual hemifield. Audiovisual fusion responses showed a left hemifield advantage, and audiovisual combination responses a right hemifield advantage. This finding suggests that the process of audiovisual integration differs between audiovisual fusions and combinations and, furthermore, that the two cerebral hemispheres contribute differentially to the two classes of response.

2012 ◽  
Vol 40 (3) ◽  
pp. 687-700 ◽  
Author(s):  
FERRAN PONS ◽  
LLORENÇ ANDREU ◽  
MONICA SANZ-TORRENT ◽  
LUCÍA BUIL-LEGAZ ◽  
DAVID J. LEWKOWICZ

Abstract Speech perception involves the integration of auditory and visual articulatory information, and thus requires the perception of temporal synchrony between the two streams of information. There is evidence that children with specific language impairment (SLI) have difficulty with auditory speech perception, but it is not known whether this is also true for the integration of auditory and visual speech. Twenty Spanish-speaking children with SLI, twenty typically developing age-matched Spanish-speaking children, and twenty Spanish-speaking children matched for MLU-w participated in an eye-tracking study investigating the perception of audiovisual speech synchrony. Results revealed that children with typical language development perceived an audiovisual asynchrony of 666 ms regardless of whether the auditory or the visual speech attribute led the other. Children with SLI detected the 666 ms asynchrony only when the auditory component followed the visual component. None of the groups perceived an audiovisual asynchrony of 366 ms. These results suggest that the speech-processing difficulties of children with SLI also extend to the integration of the auditory and visual aspects of speech perception.


2011 ◽  
Vol 24 (1) ◽  
pp. 67-90 ◽  
Author(s):  
Riikka Möttönen ◽  
Kaisa Tiippana ◽  
Mikko Sams ◽  
Hanna Puharinen

Abstract Audiovisual speech perception has been considered to operate independently of sound location, since the McGurk effect (altered auditory speech perception caused by conflicting visual speech) has been shown to be unaffected by whether speech sounds are presented in the same location as a talking face or in a different one. Here we show that sound location effects arise when spatial attention is manipulated. Sounds were presented from loudspeakers in five locations: the centre (the location of the talking face) and 45°/90° to the left/right. Auditory spatial attention was focused on a location by presenting the majority (90%) of sounds from that location. In Experiment 1, the majority of sounds emanated from the centre, and the McGurk effect was enhanced there. In Experiment 2, the majority of sounds emanated from 90° to the left, and the McGurk effect was stronger on the left and at the centre than on the right. Under control conditions, when sounds were presented with equal probability from all locations, the McGurk effect tended to be stronger for sounds emanating from the centre, but this tendency was not reliable. Additionally, reaction times were shortest for congruent audiovisual stimuli, independent of location. Our main finding is that sound location can modulate audiovisual speech perception, and that spatial attention plays a role in this modulation.


2019 ◽  
Author(s):  
Kristin J. Van Engen ◽  
Avanti Dey ◽  
Mitchell Sommers ◽  
Jonathan E. Peelle

Although listeners use both auditory and visual cues during speech perception, the cognitive and neural bases for their integration remain a matter of debate. One common approach to measuring multisensory integration is to use McGurk tasks, in which discrepant auditory and visual cues produce auditory percepts that differ from those based solely on unimodal input. Not all listeners show the same degree of susceptibility to the McGurk illusion, and these individual differences in susceptibility are frequently used as a measure of audiovisual integration ability. However, despite their popularity, we argue that McGurk tasks are ill-suited for studying the kind of multisensory speech perception that occurs in real life: McGurk stimuli are often based on isolated syllables (which are rare in conversations) and necessarily rely on audiovisual incongruence that does not occur naturally. Furthermore, recent data show that susceptibility on McGurk tasks does not correlate with performance during natural audiovisual speech perception. Although the McGurk effect is a fascinating illusion, truly understanding the combined use of auditory and visual information during speech perception requires tasks that more closely resemble everyday communication.


2011 ◽  
Vol 26 (S2) ◽  
pp. 1512-1512
Author(s):  
G.R. Szycik ◽  
Z. Ye ◽  
B. Mohammadi ◽  
W. Dillo ◽  
B.T. te Wildt ◽  
...  

Introduction Natural speech perception relies on both auditory and visual information. The two sensory channels provide redundant and complementary information, such that speech perception in healthy subjects is enhanced when both channels are present. Objectives Patients with schizophrenia have been reported to have problems with this audiovisual integration process, but little is known about which neural processes are altered. Aims In this study we investigated functional connectivity of Broca's area in patients with schizophrenia. Methods Functional magnetic resonance imaging (fMRI) was performed in 15 schizophrenia patients and 15 healthy controls to study functional connectivity of Broca's area during perception of videos of bisyllabic German nouns in which audio and video either matched (congruent condition) or did not match (incongruent condition; e.g. video = hotel, audio = island). Results There were differences in connectivity between the experimental groups and between conditions. Broca's area showed connections to more brain areas in the patient group than in the control group. This difference was more prominent in the incongruent condition, for which only one connection, between Broca's area and the supplementary motor area, was found in control participants, whereas patients showed connections to eight widely distributed brain areas. Conclusions The findings imply that audiovisual integration problems in schizophrenia result from maladaptive connectivity of Broca's area, particularly when patients are confronted with incongruent stimuli; they are discussed in light of recent audiovisual speech models.


PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0246986
Author(s):  
Alma Lindborg ◽  
Tobias S. Andersen

Speech is perceived with both the ears and the eyes. Adding congruent visual speech improves the perception of a faint auditory speech stimulus, whereas adding incongruent visual speech can alter the perception of the utterance. The latter phenomenon is the McGurk illusion, where an auditory stimulus such as "ba" dubbed onto a visual stimulus such as "ga" produces the illusion of hearing "da". Bayesian models of multisensory perception suggest that both the enhancement and the illusion can be described as a two-step process of binding (informed by prior knowledge) and fusion (informed by the information reliability of each sensory cue). However, to date no study has accounted for how binding and fusion each contribute to audiovisual speech perception. In this study, we expose subjects to both congruent and incongruent audiovisual speech, manipulating the binding and the fusion stages simultaneously by varying both the temporal offset (binding) and the auditory and visual signal-to-noise ratios (fusion). We fit two Bayesian models to the behavioural data and show that both can account for the enhancement effect in congruent audiovisual speech as well as for the McGurk illusion. This modelling approach allows us to disentangle the effects of binding and fusion on behavioural responses. Moreover, we find that these models have greater predictive power than a forced-fusion model. This study provides a systematic and quantitative approach to measuring audiovisual integration in the perception of both the McGurk illusion and congruent audiovisual speech, which we hope will inform future work on audiovisual speech perception.
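To make the two-step idea concrete, the sketch below implements a minimal version of a binding-plus-fusion computation: a reliability-weighted (forced-fusion) estimate and a causal-inference binding step that weighs a common-cause against an independent-causes interpretation of the two cues. The Gaussian cue model, the zero-mean prior, and all parameter values are illustrative assumptions in the spirit of standard Bayesian causal-inference models, not the specific models fitted in this study.

```python
import numpy as np

def forced_fusion(x_a, x_v, var_a, var_v):
    """Reliability-weighted fusion of an auditory and a visual cue.

    Each cue is modelled as a Gaussian with its own variance; the fused
    estimate weights each cue by its reliability (inverse variance).
    """
    w_a = (1.0 / var_a) / (1.0 / var_a + 1.0 / var_v)
    return w_a * x_a + (1.0 - w_a) * x_v

def p_common_cause(x_a, x_v, var_a, var_v, var_prior, p_common=0.5):
    """Binding step: posterior probability that both cues share one cause.

    Compares the likelihood of the cue pair under a single common source
    with the likelihood under two independent sources, assuming zero-mean
    Gaussian priors with variance var_prior over source values.
    """
    # Likelihood of (x_a, x_v) given one common source (marginalised analytically)
    denom_c = var_a * var_v + var_a * var_prior + var_v * var_prior
    like_c = np.exp(-0.5 * ((x_a - x_v) ** 2 * var_prior
                            + x_a ** 2 * var_v
                            + x_v ** 2 * var_a) / denom_c) / (2 * np.pi * np.sqrt(denom_c))
    # Likelihood of (x_a, x_v) given two independent sources
    like_i = (np.exp(-0.5 * x_a ** 2 / (var_a + var_prior))
              / np.sqrt(2 * np.pi * (var_a + var_prior))
              * np.exp(-0.5 * x_v ** 2 / (var_v + var_prior))
              / np.sqrt(2 * np.pi * (var_v + var_prior)))
    return p_common * like_c / (p_common * like_c + (1 - p_common) * like_i)

# Example: a noisy auditory cue and a conflicting, more reliable visual cue on an
# arbitrary internal feature axis; binding decides how far the percept moves
# toward the fused estimate (model averaging).
p_c = p_common_cause(x_a=0.2, x_v=1.0, var_a=0.5, var_v=0.1, var_prior=2.0)
percept = p_c * forced_fusion(0.2, 1.0, 0.5, 0.1) + (1 - p_c) * 0.2
```

In this framing, a forced-fusion model corresponds to fixing p_c = 1, which is why comparing models with and without the binding stage gives a handle on its contribution.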


2020 ◽  
Vol 63 (7) ◽  
pp. 2245-2254 ◽  
Author(s):  
Jianrong Wang ◽  
Yumeng Zhu ◽  
Yu Chen ◽  
Abdilbar Mamat ◽  
Mei Yu ◽  
...  

Purpose The primary purpose of this study was to explore the audiovisual speech perception strategies adopted by normal-hearing and deaf people when processing familiar and unfamiliar languages. Our primary hypothesis was that the two groups would adopt different perception strategies owing to different sensory experiences at an early age, limitations of the physical device, the developmental gap in language, and other factors. Method Thirty normal-hearing adults and 33 prelingually deaf adults participated in the study. They were asked to perform judgment and listening tasks while watching videos of a Uygur–Mandarin bilingual speaker speaking either a familiar language (Standard Chinese) or an unfamiliar language (Modern Uygur), and their eye movements were recorded with eye-tracking technology. Results Task had a slight influence on the distribution of selective attention, whereas subject group and language had significant influences. Specifically, the normal-hearing participants mainly gazed at the speaker's eyes and the deaf participants mainly gazed at the speaker's mouth; moreover, while the normal-hearing participants stared longer at the speaker's mouth when confronted with the unfamiliar language (Modern Uygur), the deaf participants did not change their attention allocation pattern when perceiving the two languages. Conclusions Normal-hearing and deaf adults adopt different audiovisual speech perception strategies: normal-hearing adults mainly look at the eyes, and deaf adults mainly look at the mouth. Additionally, language and task can also modulate the speech perception strategy.


Author(s):  
Gregor Volberg

Previous studies have often revealed a right-hemisphere specialization for processing the global level of compound visual stimuli. Here we explore whether a similar specialization exists for the detection of intersected contours defined by a chain of local elements. Subjects were presented with arrays of randomly oriented Gabor patches that could contain a global path of collinearly arranged elements in either the left or the right visual hemifield. As expected, detection accuracy was higher for contours presented to the left visual field/right hemisphere. This difference was absent in two control conditions in which the smoothness of the contour was decreased. The results demonstrate that contour detection, often considered to be driven by lateral coactivation in primary visual cortex, relies on higher-level visual representations that differ between the hemispheres. Furthermore, because contour and non-contour stimuli had the same spatial frequency spectra, the results challenge the view that the right-hemisphere advantage in global processing depends on a specialization for processing low spatial frequencies.
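As a rough sketch of how stimuli of this kind can be constructed, the code below renders Gaussian-windowed sinusoids (Gabor patches) and tiles them into an array in which one row shares a common orientation, forming a straight collinear path among randomly oriented elements. The grid size, wavelength, and envelope width are arbitrary assumptions for illustration; the study additionally manipulated contour smoothness, which this sketch does not.

```python
import numpy as np

def gabor(size=32, wavelength=8.0, sigma=5.0, theta=0.0, phase=0.0):
    """Render one Gabor patch: a sinusoidal carrier under a Gaussian envelope.

    theta is the carrier's modulation direction; the stripes run
    perpendicular to it.
    """
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]
    x_theta = x * np.cos(theta) + y * np.sin(theta)
    carrier = np.cos(2.0 * np.pi * x_theta / wavelength + phase)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return carrier * envelope

def gabor_array(rows=8, cols=8, contour_row=3, size=32, seed=0):
    """Tile randomly oriented Gabors; align one row to form a collinear path.

    Elements in contour_row are modulated along y, so their stripes run
    horizontally, parallel to the horizontal path; all other elements get
    random orientations.
    """
    rng = np.random.default_rng(seed)
    image = np.zeros((rows * size, cols * size))
    for r in range(rows):
        for c in range(cols):
            theta = np.pi / 2 if r == contour_row else rng.uniform(0.0, np.pi)
            image[r * size:(r + 1) * size,
                  c * size:(c + 1) * size] = gabor(size=size, theta=theta)
    return image

# Example: an 8 x 8 element array with a horizontal collinear contour in row 3
stimulus = gabor_array()
```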


2012 ◽  
Author(s):  
Joseph D. W. Stephens ◽  
Julian L. Scrivens ◽  
Amy A. Overman

2020 ◽  
Author(s):  
Elmira Zaynagutdinova ◽  
Karina Karenina ◽  
Andrey Giljov

Abstract Behavioural lateralization, which reflects the functional specializations of the two brain hemispheres, is assumed to play an important role in cooperative intraspecific interactions. However, few studies have focused on lateralization in the cooperative behaviour of individuals, especially in natural settings. In the present study, we investigated lateralized spatial interactions between partners in life-long monogamous pairs. Male-female pairs of two goose species (the barnacle goose, Branta leucopsis, and the white-fronted goose, Anser albifrons) were observed during different stages of the annual cycle in a variety of conditions. In flocks, we recorded which visual hemifield (left/right) the following partner used to monitor the leading partner, in relation to the type of behaviour and to disturbance factors. In a significant majority of pairs, the following bird viewed the leading partner with the left eye during routine behaviours such as resting and feeding in undisturbed conditions. This behavioural lateralization, implicating right-hemisphere processing, was consistent across the different aggregation sites and years of the study. In contrast, no significant bias was found in a variety of behaviours associated with enhanced disturbance (when alert on water, when flying or fleeing after disturbance, when feeding during the hunting period, when feeding in urban areas, and during moulting). We hypothesize that the increased demands for right-hemisphere processing to deal with stressful and emergency situations may interfere with the manifestation of lateralization in social interactions.

