Extended speech intelligibility index for the prediction of the speech reception threshold in fluctuating noise

2006 ◽  
Vol 120 (6) ◽  
pp. 3988-3997 ◽  
Author(s):  
Koenraad S. Rhebergen ◽  
Niek J. Versfeld ◽  
Wouter A. Dreschler


2021 ◽  
Vol 69 (2) ◽  
pp. 173-179
Author(s):  
Nikolina Samardzic ◽ 
Brian C.J. Moore

Traditional methods for predicting the intelligibility of speech in the presence of noise inside a vehicle, such as the Articulation Index (AI), the Speech Intelligibility Index (SII), and the Speech Transmission Index (STI), are not accurate, probably because they do not take binaural listening into account; the signals reaching the two ears can differ markedly depending on the positions of the talker and listener. We propose a new method for predicting the intelligibility of speech in a vehicle, based on the ratio of the binaural loudness of the speech to the binaural loudness of the noise, each calculated using the method specified in ISO 532-2 (2017). The method gave accurate predictions of the speech reception threshold (SRT) measured under a variety of conditions and for different positions of the talker and listener in a car. The typical error in the predicted SRT was 1.3 dB, markedly smaller than the errors obtained with the SII and STI (2.0 and 2.1 dB, respectively).
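
As an illustration of the general approach (not the authors' implementation), the sketch below shows how an SRT could be read off from a binaural loudness ratio. The function toy_binaural_loudness is a crude, made-up stand-in for the ISO 532-2 loudness model, and predicted_srt, the level grid, and the criterion ratio are hypothetical names introduced only for this example.

```python
import numpy as np

def toy_binaural_loudness(left_band_db, right_band_db):
    """Crude stand-in for ISO 532-2 binaural loudness: compressive sum of
    band intensities over frequency and over the two ears. It is NOT the
    standard's model; it only makes the sketch runnable."""
    left = 10.0 ** (np.asarray(left_band_db, dtype=float) / 10.0)
    right = 10.0 ** (np.asarray(right_band_db, dtype=float) / 10.0)
    return float(np.sum(left ** 0.3) + np.sum(right ** 0.3))

def predicted_srt(speech_l, speech_r, noise_l, noise_r,
                  criterion_ratio, gains_db=np.arange(-30.0, 30.5, 0.5)):
    """Return the speech gain (dB re: nominal level) at which the ratio of
    binaural speech loudness to binaural noise loudness first reaches the
    criterion; the criterion itself would be fixed in a reference condition."""
    noise_loudness = toy_binaural_loudness(noise_l, noise_r)
    for gain in gains_db:
        speech_loudness = toy_binaural_loudness(
            np.asarray(speech_l, dtype=float) + gain,
            np.asarray(speech_r, dtype=float) + gain)
        if speech_loudness / noise_loudness >= criterion_ratio:
            return float(gain)
    return None  # criterion not reached within the tested range

# Made-up octave-band levels (dB SPL) at the two ears, for illustration only.
speech_left, speech_right = [60, 62, 58, 52, 45], [57, 59, 55, 49, 42]
noise_left, noise_right = [58, 60, 61, 55, 48], [58, 60, 61, 55, 48]
print(predicted_srt(speech_left, speech_right, noise_left, noise_right,
                    criterion_ratio=1.0))
```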


2019 ◽  
Vol 62 (5) ◽  
pp. 1517-1531 ◽  
Author(s):  
Sungmin Lee ◽  
Lisa Lucks Mendel ◽  
Gavin M. Bidelman

Purpose: Although the speech intelligibility index (SII) has been widely applied in the field of audiology and other related areas, application of this metric to cochlear implants (CIs) has yet to be investigated. In this study, SIIs for CI users were calculated to investigate whether the SII could be an effective tool for predicting speech perception performance in a population with CI.
Method: Fifteen pre- and postlingually deafened adults with CI participated. Speech recognition scores were measured using the AzBio sentence lists. CI users also completed questionnaires and performed psychoacoustic (spectral and temporal resolution) and cognitive function (digit span) tests. Obtained SIIs were compared with predicted SIIs using a transfer function curve. Correlation and regression analyses were conducted on perceptual and demographic predictor variables to investigate the association between these factors and speech perception performance.
Results: Because of the considerably poor hearing and large individual variability in performance, the SII did not predict speech performance for this CI group using the traditional calculation. However, new SII models were developed incorporating predictive factors, which improved the accuracy of SII predictions in listeners with CI.
Conclusions: Conventional SII models are not appropriate for predicting speech perception scores for CI users. Demographic variables (aided audibility and duration of deafness) and perceptual–cognitive skills (gap detection and auditory digit span outcomes) are needed to improve the use of the SII for listeners with CI. Future studies are needed to improve our CI-corrected SII model by considering additional predictive factors.
Supplemental Material: https://doi.org/10.23641/asha.8057003
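
For readers unfamiliar with the metric, the sketch below shows the core band-audibility idea behind a conventional SII calculation, i.e., the kind of prediction the study found inadequate for CI users. It is a simplification of ANSI S3.5-1997: the level-distortion and hearing-threshold terms are omitted, and the band-importance weights in the example are made up rather than taken from the standard.

```python
import numpy as np

def simplified_sii(speech_db, noise_db, band_importance):
    """Simplified SII: band audibility is derived from the band SNR
    ((SNR + 15) / 30, clipped to [0, 1]) and weighted by band importance.
    The standard's level-distortion and threshold terms are omitted."""
    snr = np.asarray(speech_db, dtype=float) - np.asarray(noise_db, dtype=float)
    audibility = np.clip((snr + 15.0) / 30.0, 0.0, 1.0)
    weights = np.asarray(band_importance, dtype=float)
    return float(np.sum(weights / weights.sum() * audibility))

# Made-up octave-band levels (dB SPL) and equal importance weights.
speech = [55, 60, 62, 58, 50, 45]
noise = [50, 52, 60, 63, 55, 40]
print(simplified_sii(speech, noise, np.ones(6)))  # value between 0 and 1
```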


2020 ◽  
Vol 24 ◽  
Art. No. 233121652097563
Author(s):  
Christopher F. Hauth ◽  
Simon C. Berning ◽  
Birger Kollmeier ◽  
Thomas Brand

The equalization cancellation model is often used to predict the binaural masking level difference. Previously its application to speech in noise has required separate knowledge about the speech and noise signals to maximize the signal-to-noise ratio (SNR). Here, a novel, blind equalization cancellation model is introduced that can use the mixed signals. This approach does not require any assumptions about particular sound source directions. It uses different strategies for positive and negative SNRs, with the switching between the two steered by a blind decision stage utilizing modulation cues. The output of the model is a single-channel signal with enhanced SNR, which we analyzed using the speech intelligibility index to compare speech intelligibility predictions. In a first experiment, the model was tested on experimental data obtained in a scenario with spatially separated target and masker signals. Predicted speech recognition thresholds were in good agreement with measured speech recognition thresholds with a root mean square error less than 1 dB. A second experiment investigated signals at positive SNRs, which was achieved using time compressed and low-pass filtered speech. The results demonstrated that binaural unmasking of speech occurs at positive SNRs and that the modulation-based switching strategy can predict the experimental results.
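
The sketch below illustrates the equalization-cancellation idea and the modulation-based branch selection in a stripped-down form: a single broadband channel, integer-sample delays, and a simple 1-8 Hz envelope-modulation cue. It is an assumption-laden toy rather than the published model, which operates on auditory filter bands, includes binaural processing inaccuracies, and feeds its output into the SII; the function names (ec_output, modulation_depth, blind_ec) are invented for the example.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def ec_output(left, right, delay_samples, gain):
    """Equalize one ear (gain and integer-sample delay), then cancel by
    subtraction. np.roll wraps around, which is acceptable for a toy."""
    return gain * left - np.roll(right, delay_samples)

def modulation_depth(x, fs, lo_hz=1.0, hi_hz=8.0):
    """Envelope-modulation energy in the speech-relevant 1-8 Hz range,
    normalized by the mean envelope, as a crude speech-likeness cue."""
    envelope = np.abs(hilbert(x))
    b, a = butter(2, [lo_hz / (fs / 2), hi_hz / (fs / 2)], btype="band")
    modulation = filtfilt(b, a, envelope)
    return float(np.std(modulation) / (np.mean(envelope) + 1e-12))

def blind_ec(left, right, fs, max_delay=16, gains=(0.5, 0.7, 1.0, 1.4, 2.0)):
    """Blind EC toy: build candidate EC outputs from the mixed ear signals,
    keep the minimum-power one (cancels the dominant source) and the
    maximum-power one, then let the modulation cue decide which branch
    carries the speech."""
    candidates = [ec_output(left, right, d, g)
                  for d in range(-max_delay, max_delay + 1) for g in gains]
    powers = [float(np.mean(c ** 2)) for c in candidates]
    branch_min = candidates[int(np.argmin(powers))]
    branch_max = candidates[int(np.argmax(powers))]
    if modulation_depth(branch_min, fs) >= modulation_depth(branch_max, fs):
        return branch_min
    return branch_max
```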


1995 ◽  
Vol 38 (1) ◽  
pp. 211-221 ◽  
Author(s):  
Ronald A. van Buuren ◽  
Joost M. Festen ◽  
Reinier Plomp

The long-term average frequency spectrum of speech was modified to 25 target frequency spectra in order to determine the effect of each of these spectra on speech intelligibility in noise and on sound quality. Speech intelligibility was evaluated using the test as developed by Plomp and Mimpen (1979), whereas sound quality was examined through judgments of loudness, sharpness, clearness, and pleasantness of speech fragments. Subjects had different degrees of sensorineural hearing loss and sloping audiograms, but not all of them were hearing aid users. The 25 frequency spectra were defined such that the entire dynamic range of each listener, from dB above threshold to 5 dB below UCL, was covered. Frequency shaping of the speech was carried out on-line by means of Finite Impulse Response (FIR) filters. The tests on speech reception in noise indicated that the Speech-Reception Thresholds (SRTs) did not differ significantly for the majority of spectra. Spectra with high levels, especially at low frequencies (probably causing significant upward spread of masking), and also those with steep negative slopes resulted in significantly higher SRTs. Sound quality judgments led to conclusions virtually identical to those from the SRT data: frequency spectra with an unacceptably low sound quality were in most of the cases significantly worse on the SRT test as well. Because the SRT did not vary significantly among the majority of frequency spectra, it was concluded that a wide range of spectra between the threshold and UCL levels of listeners with hearing losses is suitable for the presentation of speech energy. This is very useful in everyday listening, where the frequency spectrum of speech may vary considerably.
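
A minimal sketch of the kind of frequency shaping described above, assuming a linear-phase FIR filter designed with the window method via SciPy's firwin2; the band edges, target gains, and the shape_speech helper are illustrative placeholders rather than the spectra or filters used in the study.

```python
import numpy as np
from scipy.signal import firwin2, lfilter

def shape_speech(x, fs, band_edges_hz, band_gains_db, numtaps=513):
    """Impose a target spectral shape on a signal with a linear-phase FIR
    filter. band_edges_hz and band_gains_db must have the same length; the
    gain curve is held flat below the first and above the last edge."""
    nyquist = fs / 2.0
    freqs = np.concatenate(([0.0], np.asarray(band_edges_hz) / nyquist, [1.0]))
    gains_db = np.concatenate(([band_gains_db[0]], band_gains_db,
                               [band_gains_db[-1]]))
    gains = 10.0 ** (gains_db / 20.0)
    taps = firwin2(numtaps, freqs, gains)  # odd numtaps: type I linear phase
    return lfilter(taps, 1.0, x)

# Illustration with noise standing in for a speech fragment and a made-up
# low-frequency-emphasis target spectrum.
fs = 16000
x = np.random.randn(2 * fs)
shaped = shape_speech(x, fs, band_edges_hz=[250, 500, 1000, 2000, 4000],
                      band_gains_db=[6.0, 3.0, 0.0, -3.0, -6.0])
```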

