Auditory-visual speech recognition by hearing-impaired subjects: Consonant recognition, sentence recognition, and auditory-visual integration

1998 ◽  
Vol 103 (5) ◽  
pp. 2677-2690 ◽  
Author(s):  
Ken W. Grant ◽  
Brian E. Walden ◽  
Philip F. Seitz

1981 ◽  
Vol 24 (2) ◽  
pp. 207-216 ◽  
Author(s):  
Brian E. Walden ◽  
Sue A. Erdman ◽  
Allen A. Montgomery ◽  
Daniel M. Schwartz ◽  
Robert A. Prosek

The purpose of this research was to determine some of the effects of consonant recognition training on the speech recognition performance of hearing-impaired adults. Two groups of ten subjects each received seven hours of either auditory or visual consonant recognition training, in addition to a standard two-week, group-oriented, inpatient aural rehabilitation program. A third group of fifteen subjects received the standard two-week program but no supplementary individual consonant recognition training. An audiovisual sentence recognition test, as well as tests of auditory and visual consonant recognition, was administered both before and following training. Subjects in all three groups significantly increased their audiovisual sentence recognition performance, but subjects receiving the individual consonant recognition training improved significantly more than subjects receiving only the standard two-week program. A significant increase in consonant recognition performance was observed in the two groups receiving the auditory or visual consonant recognition training. The data are discussed from varying statistical and clinical perspectives.


2017 ◽  
Vol 29 (1) ◽  
pp. 105-113 ◽  
Author(s):  
Kazuhiro Nakadai ◽  
Tomoaki Koiwa

[Figure: System architecture of AVSR based on missing feature theory and P-V grouping]

Audio-visual speech recognition (AVSR) is a promising approach to improving the noise robustness of speech recognition in the real world. For AVSR, the auditory and visual units are the phoneme and the viseme, respectively. However, these are often misclassified in the real world because of noisy input. To solve this problem, we propose two psychologically inspired approaches. One is audio-visual integration based on missing feature theory (MFT), which copes with missing or unreliable audio and visual features during recognition. The other is phoneme and viseme grouping based on coarse-to-fine recognition. Preliminary experiments show that both approaches are effective for audio-visual speech recognition. Integration based on MFT with an appropriate weight improves recognition performance even at a signal-to-noise ratio of −5 dB, a noisy condition in which most speech recognition systems do not work properly. Phoneme and viseme grouping further improves AVSR performance, particularly at low signal-to-noise ratios.

This work is an extension of our publication: Tomoaki Koiwa et al., "Coarse speech recognition by audio-visual integration based on missing feature theory," IROS 2007, pp. 1751-1756, 2007.
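The MFT-based integration the abstract describes can be sketched compactly: each stream contributes only the features judged reliable, and the two streams are combined with a tunable weight. The Python fragment below is a minimal illustrative sketch, not the authors' system; the function name mft_av_score, the boolean reliability masks, and the fixed global stream_weight are assumptions made for exposition.

```python
# Minimal sketch of MFT-style audio-visual stream fusion (illustrative only;
# names and conventions here are assumptions, not the paper's implementation).
import numpy as np

def mft_av_score(audio_logp, audio_mask, visual_logp, visual_mask,
                 stream_weight=0.7):
    """Fuse per-feature audio and visual log-likelihoods for one model unit.

    audio_logp, visual_logp : (T, D) per-frame, per-feature log-likelihoods
        under the unit's (e.g., phoneme/viseme) model.
    audio_mask, visual_mask : (T, D) booleans; True marks a reliable feature.
        Missing feature theory marginalizes out the unreliable features,
        which here reduces to dropping their log-likelihood contribution.
    stream_weight : audio weight w in [0, 1]; the visual stream gets 1 - w.

    Returns a (T,) array of fused per-frame scores.
    """
    a = np.where(audio_mask, audio_logp, 0.0).sum(axis=1)
    v = np.where(visual_mask, visual_logp, 0.0).sum(axis=1)
    return stream_weight * a + (1.0 - stream_weight) * v

# Toy usage: 5 frames, 13 audio features, 6 visual features.
rng = np.random.default_rng(0)
audio_logp = rng.normal(-2.0, 0.5, (5, 13))
visual_logp = rng.normal(-2.0, 0.5, (5, 6))
audio_mask = rng.random((5, 13)) > 0.3   # ~30% of audio features "missing"
visual_mask = np.ones((5, 6), dtype=bool)
print(mft_av_score(audio_logp, audio_mask=audio_mask,
                   visual_logp=visual_logp, visual_mask=visual_mask))
```

In a full system the stream weight would normally be tied to an estimate of the acoustic signal-to-noise ratio rather than fixed, so that the recognizer leans more heavily on the visual stream as the audio degrades.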


Author(s):  
Guillaume Gravier ◽  
Gerasimos Potamianos ◽  
Chalapathy Neti

2007 ◽  
Vol 1 (1) ◽  
pp. 7-20 ◽  
Author(s):  
Alin G. Chiţu ◽  
Leon J. M. Rothkrantz ◽  
Pascal Wiggers ◽  
Jacek C. Wojdel

Author(s):  
Adriano de Andrade Bresolin ◽  
Diamantino Rui da Silva Freitas ◽  
Adriao Duarte Doria Neto ◽  
Pablo Javier Alsina
