Audiovisual Speech Perception
Recently Published Documents

TOTAL DOCUMENTS: 157 (FIVE YEARS: 36)
H-INDEX: 26 (FIVE YEARS: 2)

2021, pp. JN-RM-0114-21. Author(s): Jonathan E. Peelle, Brent Spehar, Michael S. Jones, Sarah McConkey, Joel Myerson, ...

2021, Vol 150 (4), pp. A275-A275. Author(s): Laura Koenig, Melissa Randazzo, Paul J. Smith, Ryan Priefer

2021, pp. 174702182110444. Author(s): Yuta Ujiie, Kohske Takahashi

The other-race effect indicates a perceptual advantage when processing own-race faces. This effect has been demonstrated in individuals' recognition of facial identity and emotional expressions. However, it remains unclear whether the other-race effect also exists in multisensory domains. We conducted two experiments to provide evidence for the other-race effect in facial speech recognition, using the McGurk effect. Experiment 1 tested this issue among East Asian adults, examining the magnitude of the McGurk effect with stimuli produced by speakers of two different races (own-race vs. other-race). We found that own-race faces induced a stronger McGurk effect than other-race faces. Experiment 2 indicated that the other-race effect was not simply due to different levels of attention being paid to the mouths of own- and other-race speakers. Our findings demonstrate that own-race faces enhance the weight of visual input during audiovisual speech perception, providing evidence of the own-race effect in the audiovisual interaction for speech perception in adults.


Author(s): Yi Yuan, Kelli Meyers, Kayla Borges, Yasneli Lleo, Katarina A. Fiorentino, ...

Purpose: This study investigated the effects of visually presented speech envelope information with various modulation rates and depths on audiovisual speech perception in noise.
Method: Forty adults (21.25 ± 1.45 years) participated in audiovisual sentence recognition measurements in noise. Target speech sentences were presented auditorily in multitalker babble noise at a −3 dB SNR. Acoustic amplitude envelopes of the target signals were extracted through low-pass filters with different cutoff frequencies (4, 10, and 30 Hz) at a fixed modulation depth of 100% (Experiment 1), or with various modulation depths (0%, 25%, 50%, 75%, and 100%) at a fixed 10-Hz modulation rate (Experiment 2). The extracted target envelopes were synchronized with the amplitude of a sphere and presented as visual stimuli. Subjects were instructed to attend to both the auditory and visual stimuli of the target sentences and to type their answers. Sentence recognition accuracy was compared between the audio-only and audiovisual conditions.
Results: In Experiment 1, a significant improvement in speech intelligibility was observed when the visual analog (a sphere) was synced with the acoustic amplitude envelope modulated at a 10-Hz rate, compared to the audio-only condition. In Experiment 2, the visual analog with 75% modulation depth yielded better audiovisual speech perception in noise than the other modulation depth conditions.
Conclusion: An abstract visual analog of acoustic amplitude envelopes can be efficiently delivered by the visual system and integrated online with auditory signals to enhance speech perception in noise, independent of particular articulation movements.
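A minimal Python/SciPy sketch of the envelope-extraction step described in the Method. The abstract does not specify the exact filter design or how modulation depth was rescaled, so the Hilbert-based envelope, the 4th-order Butterworth low-pass filter, and the depth-scaling rule below are illustrative assumptions only, not the authors' implementation.

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def amplitude_envelope(speech, fs, cutoff_hz):
        """Smoothed amplitude envelope of a speech waveform.

        Assumptions (not stated in the abstract): envelope = magnitude of the
        analytic signal, smoothed with a 4th-order Butterworth low-pass filter
        at the chosen cutoff (4, 10, or 30 Hz in Experiment 1).
        """
        env = np.abs(hilbert(speech))            # instantaneous amplitude
        b, a = butter(4, cutoff_hz / (fs / 2))   # low-pass, normalized cutoff
        return filtfilt(b, a, env)               # zero-phase smoothing

    def scale_depth(env, depth):
        """Rescale modulation depth (0-1) around the mean level; an assumed
        rule used here to mimic the 0%-100% depth conditions of Experiment 2."""
        return env.mean() + depth * (env - env.mean())

    # Hypothetical usage: 10-Hz envelope at 75% depth for a 16-kHz recording
    fs = 16000
    speech = np.random.randn(2 * fs)             # placeholder for a sentence
    env = scale_depth(amplitude_envelope(speech, fs, cutoff_hz=10), depth=0.75)

The resulting envelope would then drive the size of the visual sphere frame by frame; that rendering step is omitted here.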


2021, pp. 1-17. Author(s): Yuta Ujiie, Kohske Takahashi

While visual information from facial speech modulates auditory speech perception, it is less influential on audiovisual speech perception among autistic individuals than among typically developed individuals. In this study, we investigated the relationship between autistic traits (Autism-Spectrum Quotient; AQ) and the influence of visual speech on the recognition of Rubin's vase-type speech stimuli with degraded facial speech information. Participants were 31 university students (13 males and 18 females; mean age: 19.2 years, SD: 1.13) who reported normal (or corrected-to-normal) hearing and vision. All participants completed three speech recognition tasks (visual, auditory, and audiovisual stimuli) and the AQ (Japanese version). The results showed that speech recognition accuracy for visual (i.e., lip-reading) and auditory stimuli was not significantly related to participants' AQ. In contrast, audiovisual speech perception was less influenced by facial speech among individuals with high rather than low autistic traits. The weaker influence of visual information on audiovisual speech perception in autism spectrum disorder (ASD) was robust regardless of the clarity of the visual information, suggesting a difficulty in the process of audiovisual integration rather than in the visual processing of facial speech.


PLoS ONE, 2021, Vol 16 (2), pp. e0246986. Author(s): Alma Lindborg, Tobias S. Andersen

Speech is perceived with both the ears and the eyes. Adding congruent visual speech improves the perception of a faint auditory speech stimulus, whereas adding incongruent visual speech can alter the perception of the utterance. The latter phenomenon is the McGurk illusion, where an auditory stimulus such as "ba" dubbed onto a visual stimulus such as "ga" produces the illusion of hearing "da". Bayesian models of multisensory perception suggest that both the enhancement case and the illusion case can be described as a two-step process of binding (informed by prior knowledge) and fusion (informed by the reliability of each sensory cue). However, no study to date has accounted for how each of these stages contributes to audiovisual speech perception. In this study, we expose subjects to both congruent and incongruent audiovisual speech, manipulating the binding and fusion stages simultaneously by varying both the temporal offset (binding) and the auditory and visual signal-to-noise ratios (fusion). We fit two Bayesian models to the behavioural data and show that both can account for the enhancement effect in congruent audiovisual speech as well as the McGurk illusion. This modelling approach allows us to disentangle the effects of binding and fusion on behavioural responses. Moreover, we find that these models have greater predictive power than a forced fusion model. This study provides a systematic and quantitative approach to measuring audiovisual integration in the perception of the McGurk illusion as well as congruent audiovisual speech, which we hope will inform future work on audiovisual speech perception.
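For orientation only, a hedged sketch of the two stages described above, written in the notation commonly used for such models rather than the authors' exact formulation: the fusion stage combines the auditory and visual estimates x_A and x_V weighted by their reliabilities (inverse variances), and the binding stage computes the posterior probability that the two signals share a common cause C = 1.

    % Fusion: reliability-weighted (maximum-likelihood) combination of cues
    \hat{s}_{AV} = \frac{x_A/\sigma_A^{2} + x_V/\sigma_V^{2}}
                        {1/\sigma_A^{2} + 1/\sigma_V^{2}}

    % Binding: posterior probability of a common cause, C = 1 (vs. C = 2)
    P(C{=}1 \mid x_A, x_V) =
      \frac{P(x_A, x_V \mid C{=}1)\,P(C{=}1)}
           {P(x_A, x_V \mid C{=}1)\,P(C{=}1) + P(x_A, x_V \mid C{=}2)\,P(C{=}2)}

A forced fusion model corresponds to fixing P(C = 1 | x_A, x_V) = 1, i.e., always combining the cues regardless of temporal offset, which is why comparing it against binding-plus-fusion models is informative.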

