Talking Heads and Speech Recognisers That Can See: The Computer Processing of Visual Speech Signals

Author(s):  
N. Michael Brooke
2018 ◽  
Vol 52 ◽  
pp. 165-190 ◽  
Author(s):  
Helen L. Bear ◽  
Richard Harvey

2021 ◽  
Author(s):  
Mate Aller ◽  
Heidi Solberg Økland ◽  
Lucy J MacGregor ◽  
Helen Blank ◽  
Matthew H. Davis

Speech perception in noisy environments is enhanced by seeing the facial movements of communication partners. However, the neural mechanisms by which auditory and visual speech are combined are not fully understood. We explored phase locking to auditory and visual signals in MEG recordings from 14 human participants (6 female) who reported words from single spoken sentences. We manipulated acoustic clarity and visual speech signals so that critical speech information was present in the auditory modality, the visual modality, or both. MEG coherence analysis revealed that both auditory and visual speech envelopes (auditory amplitude modulations and lip aperture changes) were phase-locked to 2-6 Hz brain responses in auditory and visual cortex, consistent with entrainment to syllable-rate speech components. Partial coherence analysis was used to separate neural responses to correlated audio-visual signals and showed non-zero phase locking to the auditory envelope in occipital cortex during audio-visual (AV) speech. Furthermore, phase locking to auditory signals in visual cortex was enhanced for AV speech compared with audio-only (AO) speech matched for intelligibility. Conversely, auditory regions of the superior temporal gyrus (STG) did not show above-chance partial coherence with visual speech signals during AV conditions, but did show partial coherence during visual-only (VO) conditions. Hence, visual speech enabled stronger phase locking to auditory signals in visual areas, whereas phase locking to visual speech in auditory regions occurred only during silent lip-reading. These differences in cross-modal interactions between auditory and visual speech signals are interpreted in line with cross-modal predictive mechanisms operating during speech perception.
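
The coherence and partial-coherence measures named in this abstract have standard spectral definitions: coherence is |Sxy|^2 / (Sxx Syy), and partial coherence recomputes the cross- and auto-spectra after conditioning on a third signal. As a rough illustrative sketch only (this is not the authors' MEG pipeline; the function names, simulated signals, and window parameters below are invented for the example), the following Python code estimates both quantities from Welch spectra:

import numpy as np
from scipy.signal import csd, welch

def msc(x, y, fs, nperseg=1024):
    # Magnitude-squared coherence: |Sxy|^2 / (Sxx * Syy)
    f, Sxy = csd(x, y, fs=fs, nperseg=nperseg)
    _, Sxx = welch(x, fs=fs, nperseg=nperseg)
    _, Syy = welch(y, fs=fs, nperseg=nperseg)
    return f, np.abs(Sxy) ** 2 / (Sxx * Syy)

def partial_msc(x, y, z, fs, nperseg=1024):
    # Coherence between x and y after removing the linear contribution
    # of a third signal z from the cross- and auto-spectra.
    f, Sxy = csd(x, y, fs=fs, nperseg=nperseg)
    _, Sxz = csd(x, z, fs=fs, nperseg=nperseg)
    _, Szy = csd(z, y, fs=fs, nperseg=nperseg)
    _, Sxx = welch(x, fs=fs, nperseg=nperseg)
    _, Syy = welch(y, fs=fs, nperseg=nperseg)
    _, Szz = welch(z, fs=fs, nperseg=nperseg)
    Sxy_z = Sxy - Sxz * Szy / Szz
    Sxx_z = Sxx - np.abs(Sxz) ** 2 / Szz
    Syy_z = Syy - np.abs(Szy) ** 2 / Szz
    return f, np.abs(Sxy_z) ** 2 / (Sxx_z * Syy_z)

# Hypothetical example: a simulated "brain" signal driven by an auditory
# envelope that is itself correlated with a lip-aperture signal.
fs = 200.0
t = np.arange(0, 60, 1 / fs)
lip = np.sin(2 * np.pi * 4 * t) + 0.5 * np.random.randn(t.size)
aud = 0.8 * lip + 0.6 * np.random.randn(t.size)
meg = 0.7 * aud + np.random.randn(t.size)
f, c = msc(meg, lip, fs)                # inflated by the shared drive from aud
f, pc = partial_msc(meg, lip, aud, fs)  # lip coupling left after removing aud

Plain coherence between the brain signal and the lip signal is inflated by their shared correlation with the auditory envelope; the partial measure is one way to separate responses to correlated audio-visual signals, in the spirit of the analysis the abstract describes.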


2009 ◽  
pp. 439-461
Author(s):  
Lynne E. Bernstein ◽  
Jintao Jiang

The information in optical speech signals is phonetically impoverished compared with the information in acoustic speech signals presented under good listening conditions. Yet high lipreading scores among prelingually deaf adults show that optical speech signals are in fact rich in phonetic information. Hearing lipreaders are not as accurate as deaf lipreaders, but they too demonstrate perception of detailed optical phonetic information. This chapter briefly sketches the historical context of, and impediments to, knowledge about optical phonetics and visual speech perception (lipreading). We review findings on deaf and hearing lipreaders, and then recent results on relationships between optical speech signals and visual speech perception. We extend the discussion of these relationships to the development of visual speech synthesis, and we advocate a close relationship between visual speech perception research and the development of synthetic visible speech.


1989 ◽  
Vol 27 (11) ◽  
pp. 65-71 ◽  
Author(s):  
B.P. Yuhas ◽  
M.H. Goldstein ◽  
T.J. Sejnowski

SAGE Open ◽  
2015 ◽  
Vol 5 (4) ◽  
pp. 215824401561193
Author(s):  
Mohammad Hossein Sadaghiani ◽  
Niusha Shafiabady ◽  
Dino Isa
