Application of the Articulation Index and the Speech Transmission Index to the Recognition of Speech by Normal-Hearing and Hearing-Impaired Listeners

1986 ◽  
Vol 29 (4) ◽  
pp. 447-462 ◽  
Author(s):  
Larry E. Humes ◽  
Donald D. Dirks ◽  
Theodore S. Bell ◽  
Christopher Ahlstrom ◽  
Gail E. Kincaid

The present article is divided into four major sections dealing with the application of acoustical indices to the prediction of speech recognition performance. In the first section, two acoustical indices, the Articulation Index (AI) and the Speech Transmission Index (STI), are described. In the next section, the effectiveness of the AI and the STI in describing the performance of normal-hearing and hearing-impaired subjects listening to spectrally distorted (filtered) and temporally distorted (reverberant) speech is examined retrospectively. In the third section, the results of a prospective investigation that examined the recognition of nonsense syllables under conditions of babble competition, filtering and reverberation are described. Finally, in the fourth section, the ability of the acoustical indices to describe the performance of 10 hearing-impaired listeners, 5 listening in quiet and 5 in babble, is examined. It is concluded that both the AI and the STI have significant shortcomings. A hybrid index, designated mSTI, which takes the best features from each procedure, is described and demonstrated to be the best alternative presently available.
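The band-audibility logic behind the AI can be sketched as a weighted sum of per-band audibility terms, each derived from the band signal-to-noise ratio clipped to a 30 dB dynamic range. The band count, weights, and SNR values below are illustrative only, not figures from the study; real AI procedures (e.g., ANSI S3.5-style formulations) specify band importance functions.

```python
def articulation_index(band_snrs, band_weights):
    """Illustrative AI-style index: each band's SNR (dB) is mapped to an
    audibility value in [0, 1] over a 30 dB range centered at 0 dB SNR,
    then combined as an importance-weighted sum."""
    assert abs(sum(band_weights) - 1.0) < 1e-9, "weights must sum to 1"
    total = 0.0
    for snr, weight in zip(band_snrs, band_weights):
        audibility = min(max((snr + 15.0) / 30.0, 0.0), 1.0)
        total += weight * audibility
    return total

# Four hypothetical bands with equal importance weights:
ai = articulation_index([20.0, 5.0, -5.0, -20.0], [0.25] * 4)  # -> 0.5
```

A fully audible band contributes its full weight, a fully masked band contributes nothing, which is why filtering and noise both drive the index down.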

2021 ◽  
Vol 69 (2) ◽  
pp. 173-179
Author(s):  
Nikolina Samardzic ◽  
Brian C.J. Moore

Traditional methods for predicting the intelligibility of speech in the presence of noise inside a vehicle, such as the Articulation Index (AI), the Speech Intelligibility Index (SII), and the Speech Transmission Index (STI), are not accurate, probably because they do not take binaural listening into account; the signals reaching the two ears can differ markedly depending on the positions of the talker and listener. We propose a new method for predicting the intelligibility of speech in a vehicle, based on the ratio of the binaural loudness of the speech to the binaural loudness of the noise, each calculated using the method specified in ISO 532-2 (2017). The method was found to give accurate predictions of the speech reception threshold (SRT) measured under a variety of conditions and for different positions of the talker and listener in a car. The typical error in the predicted SRT was 1.3 dB, which is markedly smaller than estimated using the SII and STI (2.0 dB and 2.1 dB, respectively).
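The "typical error" figures quoted above are summary deviations between measured and predicted SRTs across conditions. A minimal sketch of that comparison is below; the SRT values are hypothetical, and the ISO 532-2 binaural loudness computation itself is far too involved to reproduce here.

```python
import math

def rms_error(measured, predicted):
    """Root-mean-square deviation (dB) between measured and predicted
    speech reception thresholds across listening conditions."""
    diffs = [m - p for m, p in zip(measured, predicted)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical SRTs (dB) for four talker/listener positions in a car:
measured  = [-6.0, -4.5, -8.0, -5.5]
predicted = [-7.0, -4.0, -6.5, -5.0]
err = rms_error(measured, predicted)  # just under 1 dB for these values
```

A lower RMS error, as reported for the loudness-ratio method versus the SII and STI, means the predictor tracks the measured thresholds more closely across positions.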


1992 ◽  
Vol 35 (4) ◽  
pp. 942-949 ◽  
Author(s):  
Christopher W. Turner ◽  
David A. Fabry ◽  
Stephanie Barrett ◽  
Amy R. Horwitz

This study examined the possibility that hearing-impaired listeners, in addition to displaying poorer-than-normal recognition of speech presented in background noise, require a larger signal-to-noise ratio for the detection of the speech sounds. Psychometric functions for the detection and recognition of stop consonants were obtained from both normal-hearing and hearing-impaired listeners. When the speech levels were expressed in terms of their short-term spectra, detection of the consonants occurred at the same signal-to-noise ratio for both subject groups. In contrast, the hearing-impaired listeners displayed poorer recognition performance than the normal-hearing listeners. These results imply that the higher signal-to-noise ratios required for a given level of recognition by some subjects with hearing loss are not due, even in part, to a deficit in detection of the signals in the masking noise, but rather are due exclusively to a deficit in recognition.
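Psychometric functions like those described are commonly summarized by a threshold read off a fitted curve. The sketch below uses a logistic function of SNR and inverts it for an arbitrary target proportion correct; this is a generic illustration, not the fitting procedure used in the study, and all parameter values are hypothetical.

```python
import math

def logistic_pc(snr, midpoint, slope, chance=0.0):
    """Proportion correct as a logistic function of SNR (dB), rising
    from `chance` performance to 1.0 with its 50%-of-range point at
    `midpoint` and steepness `slope` (per dB)."""
    core = 1.0 / (1.0 + math.exp(-slope * (snr - midpoint)))
    return chance + (1.0 - chance) * core

def threshold_snr(midpoint, slope, target=0.5, chance=0.0):
    """Invert the logistic to find the SNR yielding `target` proportion
    correct; detection and recognition thresholds can then be compared
    directly on the SNR axis."""
    p = (target - chance) / (1.0 - chance)
    return midpoint + math.log(p / (1.0 - p)) / slope

# Hypothetical listener: 50% point at -4 dB SNR, slope 0.5 per dB.
t = threshold_snr(-4.0, 0.5)  # -> -4.0
```

On this axis, the study's finding amounts to detection functions overlapping across groups while the recognition functions of hearing-impaired listeners sit at higher SNRs.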


1987 ◽  
Vol 30 (3) ◽  
pp. 403-410 ◽  
Author(s):  
Larry E. Humes ◽  
Stephen Boney ◽  
Faith Loven

The present article further evaluates the accuracy of speech-recognition predictions made according to two forms of the Speech Transmission Index (STI) for normal-hearing listeners. The first portion of this article describes the application of the modified Speech Transmission Index (mSTI) to an extensive set of speech-recognition data. Performance of normal-hearing listeners on a nonsense-syllable recognition task in 216 conditions involving different speech levels, background noise levels, reverberation times, and filter passbands was found to be monotonically related to the mSTI. The second portion of this article describes a retrospective and prospective analysis of an extended sound-field version of the STI, referred to here as STIx. This extended STI considers many of the variables relevant to sound-field speech recognition, some of which are not incorporated in the mSTI. These variables include: (a) reverberation time; (b) speech level; (c) noise level; (d) talker-to-listener distance; (e) directivity of the speech source; and (f) directivity of the listener (e.g., monaural vs. binaural listening). For both the retrospective and prospective analyses, speech recognition was found to vary monotonically with STIx.
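The STI family of indices rests on the modulation transfer function: each modulation transfer value m is converted to an apparent SNR, clipped to ±15 dB, mapped onto a 0–1 transmission index, and averaged. The sketch below uses equal weights for simplicity; actual STI procedures (and the mSTI and STIx weightings discussed above) apply specific band and modulation-frequency weights.

```python
import math

def sti_from_mtf(m_values):
    """Simplified STI: convert each modulation transfer value m (0 < m < 1)
    to an apparent SNR, clip to +/-15 dB, map to a transmission index in
    [0, 1], and average with equal weights (real STI uses weighted bands)."""
    indices = []
    for m in m_values:
        snr_apparent = 10.0 * math.log10(m / (1.0 - m))
        snr_apparent = max(-15.0, min(15.0, snr_apparent))
        indices.append((snr_apparent + 15.0) / 30.0)
    return sum(indices) / len(indices)

# m = 0.5 (modulations half preserved) maps to the midpoint of the scale;
# m near 1 (modulations intact) clips at the top.
mid = sti_from_mtf([0.5])    # -> 0.5
top = sti_from_mtf([0.999])  # -> 1.0
```

Reverberation and noise both smear the speech envelope and pull m (hence the index) down, which is why a single MTF-based number can track such different distortions.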


1990 ◽  
Vol 33 (4) ◽  
pp. 726-735 ◽  
Author(s):  
Larry E. Humes ◽  
Lisa Roberts

The role that sensorineural hearing loss plays in the speech-recognition difficulties of the hearing-impaired elderly is examined. One approach to this issue was to make between-group comparisons of performance for three groups of subjects: (a) young normal-hearing adults; (b) elderly hearing-impaired adults; and (c) young normal-hearing adults with simulated sensorineural hearing loss equivalent to that of the elderly subjects produced by a spectrally shaped masking noise. Another approach to this issue employed correlational analyses to examine the relation between audibility and speech recognition within the group of elderly hearing-impaired subjects. An additional approach was pursued in which an acoustical index incorporating adjustments for threshold elevation was used to examine the role audibility played in the speech-recognition performance of the hearing-impaired elderly. A wide range of listening conditions was sampled in this experiment. The conclusion was that the primary determiner of speech-recognition performance in the elderly hearing-impaired subjects was their threshold elevation.


1990 ◽  
Vol 33 (3) ◽  
pp. 440-449 ◽  
Author(s):  
Fan-Gang Zeng ◽  
Christopher W. Turner

The purpose of this study was to investigate the perceptual cues sufficient for the recognition of four voiceless fricative consonants [s, f, θ, ʃ] followed by the same vowel [i:] in normal-hearing and hearing-impaired adult listeners. Subjects identified the four CV speech tokens in a closed-set response task across a range of presentation levels. Fricative syllables were either produced by a human speaker in the natural stimulus set or generated by a computer program in the synthetic stimulus set. By comparing conditions in which the subjects were presented with equivalent degrees of audibility for individual fricatives, it was possible to isolate the factor of lack of audibility from that of loss of suprathreshold discriminability. Results indicate that (a) the frication burst portion may serve as a sufficient cue for correct recognition of voiceless fricatives by normal-hearing subjects, whereas the more intense CV transition portion, though it may not be necessary, can also assist these subjects in distinguishing place information, particularly at low presentation levels; (b) hearing-impaired subjects achieved close-to-normal recognition performance when given equivalent degrees of audibility of the frication cue, but they obtained poorer-than-normal performance if only given equivalent degrees of audibility of the transition cue; and (c) the difficulty that hearing-impaired subjects have in perceiving fricatives under normal circumstances may be due to two factors: the lack of audibility of the frication cue and the loss of discriminability of the transition cue.


2006 ◽  
Vol 27 (3) ◽  
pp. 263-278 ◽  
Author(s):  
Matthew H. Burk ◽  
Larry E. Humes ◽  
Nathan E. Amos ◽  
Lauren E. Strauser

Author(s):  
Amin Ebrahimi ◽  
Mohammad Ebrahim Mahdavi ◽  
Hamid Jalilvand

Background and Aim: Digits are suitable speech materials for evaluating recognition of speech in noise in clients with a wide range of language abilities. The Farsi Auditory Recognition of Digit-in-Noise (FARDIN) test has been developed and validated in learning-disabled children showing a dichotic listening deficit. This study was conducted to further validate FARDIN and to survey the effects of noise type on recognition performance in individuals with sensorineural hearing impairment.

Methods: Persian monosyllabic digits 1−10 were extracted from the audio file of the FARDIN test. Ten lists were compiled using a random order of the triplets. The first five lists were mixed with multi-talker babble noise (MTBN) and the second five lists with speech-spectrum noise (SSN). Signal-to-noise ratio (SNR) varied from +5 to −15 dB in 5 dB steps. Twenty normal-hearing and 19 hearing-impaired individuals participated in the study.

Results: Both types of noise differentiated hearing-impaired from normal-hearing listeners. The hearing-impaired group showed weaker digit-recognition performance in both MTBN and SSN, requiring a 4−5.6 dB higher SNR (50%) than the normal-hearing group. MTBN was more challenging than SSN for normal-hearing listeners.

Conclusion: The Farsi Auditory Recognition of Digit-in-Noise test is a validated test for estimating SNR (50%) in clients with hearing loss. SSN appears more appropriate for use as background noise when testing auditory recognition of digits in noise.

Keywords: Auditory recognition; hearing loss; speech perception in noise; digit recognition in noise
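An SNR (50%) like the one reported above is often estimated by interpolating percent-correct scores measured at the fixed SNR steps. The sketch below uses simple linear interpolation between the two tested SNRs that bracket 50%; the scores are hypothetical, and this is not necessarily the estimation method used in the FARDIN study.

```python
def snr50(snrs, pcts):
    """Estimate the SNR (dB) giving 50% correct by linear interpolation
    between consecutive test points. Assumes `snrs` is sorted ascending
    and the scores cross 50% once within the tested range."""
    points = list(zip(snrs, pcts))
    for (s0, p0), (s1, p1) in zip(points, points[1:]):
        if p0 <= 50.0 <= p1:
            return s0 + (50.0 - p0) * (s1 - s0) / (p1 - p0)
    raise ValueError("scores do not cross 50% within the tested range")

# Hypothetical digit-triplet scores at the +5 to -15 dB steps used above:
snrs = [-15, -10, -5, 0, 5]
pcts = [5.0, 20.0, 45.0, 80.0, 98.0]
est = snr50(snrs, pcts)  # between -5 and 0 dB for these scores
```

A hearing-impaired listener's curve would sit shifted toward higher SNRs, and the 4−5.6 dB group difference reported above is exactly the horizontal offset between two such estimates.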

