Using the short-time speech transmission index to predict speech reception thresholds in fluctuating noise

2014 ◽  
Vol 135 (4) ◽  
pp. 2224-2225 ◽  
Author(s):  
Matthew Ferreira ◽  
Karen Payton
2021 ◽  
Vol 69 (2) ◽  
pp. 173-179
Author(s):  
Nikolina Samardzic ◽  
Brian C.J. Moore

Traditional methods for predicting the intelligibility of speech in the presence of noise inside a vehicle, such as the Articulation Index (AI), the Speech Intelligibility Index (SII), and the Speech Transmission Index (STI), are not accurate, probably because they do not take binaural listening into account; the signals reaching the two ears can differ markedly depending on the positions of the talker and listener. We propose a new method for predicting the intelligibility of speech in a vehicle, based on the ratio of the binaural loudness of the speech to the binaural loudness of the noise, each calculated using the method specified in ISO 532-2 (2017). The method was found to give accurate predictions of the speech reception threshold (SRT) measured under a variety of conditions and for different positions of the talker and listener in a car. The typical error in the predicted SRT was 1.3 dB, which is markedly smaller than the errors obtained using the SII and STI (2.0 dB and 2.1 dB, respectively).


2017 ◽  
Vol 42 (3) ◽  
pp. 385-394
Author(s):  
Jedrzej Kocinski ◽  
Edward Ozimek

Abstract The paper deals with the relationship between speech recognition and objective parameters of enclosures. Six enclosures were chosen: a church, an assembly hall of a music school, two courtrooms of different volumes, a typical auditorium and a university concert hall. Dirac 4.1 software was used to record impulse responses (IRs) at the chosen measurement points of each enclosure. On this basis, the following objective parameters of the enclosure were determined: Reverberation Time (RT), Early Decay Time (EDT), Weighted Clarity (C50) and Speech Transmission Index (STI). The IRs were convolved with logatome tests and the Polish Sentence Test (PST). Logatome recognition and the speech reception threshold (SRT, i.e., the SNR yielding 50% speech recognition) were evaluated and their dependence on the objective parameters was determined. Generally, a linear relationship between logatome recognition or SRT and RT or EDT was found. However, speech recognition was nonlinearly related (according to a psychometric function) to STI values. The most sensitive range of logatome and sentence recognition relative to STI changes corresponded to the middle range of STI values. Below and above this range, logatome and sentence recognition were much less dependent on STI changes.
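
The definition of the SRT as the SNR yielding 50% recognition can be illustrated with a short sketch. The logistic shape, the slope value, and the function names below are assumptions chosen for illustration; the abstract does not state which psychometric function or fitting procedure the authors used.

```python
import math

def psychometric(snr_db, srt_db, slope_per_db):
    """Logistic psychometric function (assumed shape, not the authors' model).

    Returns the proportion of speech recognized at a given SNR.
    srt_db is the SNR giving 50% recognition; slope_per_db is the
    slope of the curve at that point, in proportion per dB.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))

def srt_from_scores(snrs_db, scores):
    """Estimate the SRT by linear interpolation to the 50% point.

    snrs_db must be sorted in ascending order; scores are proportions
    correct measured at those SNRs.
    """
    points = list(zip(snrs_db, scores))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= 0.5 <= y1 and y1 > y0:
            return x0 + (0.5 - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("scores never cross 50% in the measured range")

# Hypothetical data: a listener with an SRT of -5 dB, sampled every 2 dB.
snrs = [-12.0, -10.0, -8.0, -6.0, -4.0, -2.0, 0.0, 2.0]
scores = [psychometric(s, srt_db=-5.0, slope_per_db=0.15) for s in snrs]
estimated_srt = srt_from_scores(snrs, scores)
```

The curve is steepest around its midpoint and flattens toward both ends, which matches the abstract's remark that recognition is most sensitive to changes in the middle range and much less so below and above it.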


2016 ◽  
Vol 41 (2) ◽  
pp. 255-264
Author(s):  
Jędrzej Kociński ◽  
Edward Ozimek

Abstract The aim of this work was to measure subjective speech intelligibility in an enclosure with a long reverberation time and to compare the results with objective parameters. Impulse Responses (IRs) were first determined with a dummy head at different measurement points of the enclosure. The following objective parameters were calculated with Dirac 4.1 software: Reverberation Time (RT), Early Decay Time (EDT), weighted Clarity (C50) and Speech Transmission Index (STI). For the chosen measurement points, the IRs were convolved with the Polish Sentence Test (PST) and logatome tests. The PST was presented against a background of babble noise, and the speech reception threshold (SRT, i.e. the SNR yielding 50% speech intelligibility) was evaluated for those points. The relationship of sentence and logatome recognition to STI was determined. It was found that the final SRT data are well correlated with the speech transmission index (STI) and can be expressed by a psychometric function. The difference between the SRT determined without reverberation and the SRT determined in reverberant conditions appeared to be a good measure of the effect of reverberation on speech intelligibility in a room. In addition, speech intelligibility with and without the sound amplification system installed in the enclosure was compared.


2020 ◽  
Vol 10 (15) ◽  
pp. 5257
Author(s):  
Nathan Berwick ◽  
Hyunkook Lee

This study examined whether the spatial unmasking effect operates on speech reception thresholds (SRTs) in the median plane. SRTs were measured using an adaptive staircase procedure, with target speech sentences and speech-shaped noise maskers presented via loudspeakers at −30°, 0°, 30°, 60° and 90°. Results indicated a significant median plane spatial unmasking effect, with the largest SRT gain obtained for the −30° elevation of the masker. Head-related transfer function analysis suggests that the result is associated with the energy weighting of the ear-input signal of the masker at upper-mid frequencies relative to the maskee.
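
An adaptive staircase of the kind mentioned above can be sketched as follows. The abstract does not give the tracking rule, step size, or stopping criterion, so the 1-up/1-down rule (which converges on the 50% point), the 2 dB step, and all names here are assumptions for illustration only.

```python
def staircase_srt(respond, start_snr_db=0.0, step_db=2.0,
                  n_reversals=6, max_trials=200):
    """Estimate an SRT with a simple 1-up/1-down staircase (assumed rule).

    respond(snr_db) returns True if the listener repeated the sentence
    correctly at that SNR. After a correct response the SNR is lowered
    (harder); after an incorrect one it is raised (easier). The SRT
    estimate is the mean SNR at the recorded reversals.
    """
    snr = start_snr_db
    last_direction = 0  # 0 = no trial yet, -1 = going down, +1 = going up
    reversal_levels = []
    for _ in range(max_trials):
        direction = -1 if respond(snr) else 1
        if last_direction != 0 and direction != last_direction:
            reversal_levels.append(snr)
            if len(reversal_levels) == n_reversals:
                break
        last_direction = direction
        snr += direction * step_db
    if not reversal_levels:
        raise ValueError("no reversals observed within max_trials")
    return sum(reversal_levels) / len(reversal_levels)

# Hypothetical deterministic listener: correct whenever SNR >= -7 dB.
srt_estimate = staircase_srt(lambda snr_db: snr_db >= -7.0)
```

With a real listener the responses are probabilistic, so the track wanders around the threshold rather than alternating cleanly, and more reversals are usually averaged.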

