Somatosensory contribution to audio-visual speech processing

Abstract: Natural speech is processed in the brain as a mixture of auditory and visual features. An example of the importance of visual speech is the McGurk effect and related perceptual illusions that result from mismatching auditory and visual syllables. Although the McGurk effect has been widely applied to the exploration of audio-visual speech processing, it relies on isolated syllables, which severely limits the conclusions that can be drawn from the paradigm. In addition, the extreme variability and the quality of the stimuli usually employed prevent comparability across studies. To overcome these limitations, we present an innovative methodology using 3D virtual characters with realistic lip movements synchronized to computer-synthesized speech. We used commercially accessible and affordable tools to facilitate reproducibility and comparability, and the set-up was validated with 24 participants performing a perception task. Within complete and meaningful French sentences, we paired a labiodental fricative viseme (i.e. /v/) with a bilabial occlusive phoneme (i.e. /b/). This audiovisual mismatch is known to induce the illusion of hearing /v/ in a proportion of trials. We tested the rate of the illusion while varying the magnitude of background noise and audiovisual lag. Overall, the effect was observed in 40% of trials. The proportion rose to about 50% with added background noise and up to 66% when controlling for phonetic features. Our results indicate that computer-generated speech stimuli are a sound choice, and that they can supplement natural speech with greater control over stimulus timing and content.

Correlations between auditory and visual speech processing ability: Evidence for a modality‐independent source of variance

The Journal of the Acoustical Society of America; DOI 10.1121/1.404789; 1992; Vol 92 (4); pp. 2385-2385; Cited By ~ 1

Author(s): Charles S. Watson, William W. Qiu, Mary Chamberlain

Keyword(s): Speech Processing, Visual Speech, Independent Source, Processing Ability

Audio-visual speech processing in age-related hearing loss: Stronger integration and increased frontal lobe recruitment

NeuroImage; DOI 10.1016/j.neuroimage.2018.04.023; 2018; Vol 175; pp. 425-437; Cited By ~ 21

Author(s): Stephanie Rosemann, Christiane M. Thiel

Keyword(s): Hearing Loss, Frontal Lobe, Speech Processing, Visual Speech, Age Related, Age Related Hearing Loss

Facial speech gestures: the relation between visual speech processing, phonological awareness, and developmental dyslexia in 10-year-olds

Developmental Science; DOI 10.1111/desc.12346; 2015; Vol 19 (6); pp. 1020-1034; Cited By ~ 7

Author(s): Gesa Schaadt, Claudia Männel, Elke van der Meer, Ann Pannekamp, Angela D. Friederici

Keyword(s): Phonological Awareness, Developmental Dyslexia, Speech Processing, Visual Speech

Plasticity in bilateral superior temporal cortex: Effects of deafness and cochlear implantation on auditory and visual speech processing

Hearing Research; DOI 10.1016/j.heares.2016.07.013; 2017; Vol 343; pp. 138-149; Cited By ~ 18

Author(s): Carly A. Anderson, Diane S. Lazard, Douglas E.H. Hartley

Keyword(s): Speech Processing, Cochlear Implantation, Temporal Cortex, Visual Speech, Superior Temporal Cortex

Approaches to visual speech processing based on the MPEG-4 Face Animation standard

2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532); DOI 10.1109/icme.2000.869667; 2002; Cited By ~ 2

Author(s): E. Petajan

Keyword(s): Speech Processing, Visual Speech, Face Animation

Lip movements entrain the observers’ low-frequency brain oscillations to facilitate speech intelligibility

eLife; DOI 10.7554/elife.14521; 2016; Vol 5; Cited By ~ 65

Author(s): Hyojin Park, Christoph Kayser, Gregor Thut, Joachim Gross

Keyword(s): Visual Cortex, Speech Processing, Speech Intelligibility, Brain Activity, Low Frequency, Visual Speech, Visual Signals, Partial Coherence, Auditory Speech, Oscillatory Brain Activity

During continuous speech, lip movements provide visual temporal signals that facilitate speech processing. Here, using MEG, we directly investigated how these visual signals interact with rhythmic brain activity in participants listening to and seeing the speaker. First, we investigated coherence between oscillatory brain activity and the speaker’s lip movements and demonstrated significant entrainment in visual cortex. We then used partial coherence to remove contributions of the coherent auditory speech signal from the lip-brain coherence. Comparing this synchronization between different attention conditions revealed that attending to visual speech enhances the coherence between activity in visual cortex and the speaker’s lips. Further, we identified a significant partial coherence between left motor cortex and lip movements, and this partial coherence directly predicted comprehension accuracy. Our results emphasize the importance of visually entrained and attention-modulated rhythmic brain activity for the enhancement of audiovisual speech processing.
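
For readers unfamiliar with the partial-coherence step mentioned in this abstract, the textbook definition can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the article: x stands for the brain signal, y for the lip-movement signal, z for the auditory speech signal, and S denotes (cross-)spectral density at frequency f. The estimator actually used in the study may differ in detail.

```latex
% Ordinary (complex) coherency between two signals a and b at frequency f
C_{ab}(f) = \frac{S_{ab}(f)}{\sqrt{S_{aa}(f)\, S_{bb}(f)}}

% Partial coherency between x (brain) and y (lips),
% with the linear contribution of z (auditory speech) removed
C_{xy \mid z}(f) =
  \frac{C_{xy}(f) - C_{xz}(f)\, C_{zy}(f)}
       {\sqrt{\bigl(1 - |C_{xz}(f)|^{2}\bigr)\bigl(1 - |C_{zy}(f)|^{2}\bigr)}}

% Partial coherence is the squared magnitude of the partial coherency
\mathrm{PCoh}_{xy \mid z}(f) = \bigl| C_{xy \mid z}(f) \bigr|^{2}
```

Under this definition, if z is unrelated to both x and y, then C_{xz} and C_{zy} are zero and the partial coherency reduces to the ordinary coherency C_{xy}.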
