Speech Recognition in an Enclosure with a Long Reverberation Time

2016 ◽  
Vol 41 (2) ◽  
pp. 255-264
Author(s):  
Jędrzej Kociński ◽  
Edward Ozimek

Abstract The aim of this work was to measure subjective speech intelligibility in an enclosure with a long reverberation time and to compare the results with objective parameters. Impulse responses (IRs) were first determined with a dummy head at different measurement points in the enclosure. The following objective parameters were calculated with Dirac 4.1 software: Reverberation Time (RT), Early Decay Time (EDT), weighted Clarity (C50) and Speech Transmission Index (STI). For the chosen measurement points, the IRs were convolved with the Polish Sentence Test (PST) and logatome tests. The PST was presented against a background of babble noise, and the speech reception threshold (SRT, i.e. the SNR yielding 50% speech intelligibility) was evaluated for those points. The relationship between sentence and logatome recognition and STI was determined. The final SRT data were found to be well correlated with the STI, and the relationship can be expressed by a psychometric function. The difference between the SRT determined without reverberation and the SRT determined in reverberant conditions appeared to be a good measure of the effect of reverberation on speech intelligibility in a room. In addition, speech intelligibility with and without the sound amplification system installed in the enclosure was compared.
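
The stimulus preparation described above (convolving test speech with a measured IR, then presenting it in babble noise at a chosen SNR) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' processing chain: the function names `convolve`, `rms` and `mix_at_snr` are hypothetical, signals are plain Python lists, and the SNR is assumed to be defined on broadband RMS levels.

```python
import math

def convolve(signal, impulse_response):
    """Direct-form linear convolution; output length is len(signal) + len(ir) - 1."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that 20*log10(rms(speech)/rms(noise)) equals snr_db, then add.

    Assumption: both inputs have the same length; if not, the mix is truncated
    to the shorter one while the levels are still taken over the full inputs.
    """
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))
    n = min(len(speech), len(noise))
    return [speech[k] + gain * noise[k] for k in range(n)]
```

In an adaptive SRT measurement, `snr_db` would be varied from trial to trial until roughly half of the material is recognised. For real recordings an FFT-based convolution would be far faster than this direct form.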

2017 ◽  
Vol 42 (3) ◽  
pp. 385-394
Author(s):  
Jedrzej Kocinski ◽  
Edward Ozimek

Abstract The paper deals with the relationship between speech recognition and objective parameters of enclosures. Six enclosures were chosen: a church, an assembly hall of a music school, two courtrooms of different volumes, a typical auditorium and a university concert hall. Dirac 4.1 software was used to record impulse responses (IRs) at the chosen measurement points of each enclosure. On this basis, the following objective parameters of the enclosure were determined: Reverberation Time (RT), Early Decay Time (EDT), Weighted Clarity (C50) and Speech Transmission Index (STI). The IRs were convolved with logatome tests and the Polish Sentence Test (PST). Logatome recognition and the speech reception threshold (SRT, i.e., the SNR yielding 50% speech recognition) were evaluated, and their dependence on the objective parameters was determined. Generally, a linear relationship between logatome recognition or SRT and RT or EDT was found. However, speech recognition was nonlinearly related to STI values, following a psychometric function. Logatome and sentence recognition were most sensitive to STI changes in the middle range of STI values. Below and above this range, logatome and sentence recognition were much less dependent on STI changes.
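
The middle-range sensitivity reported above is what a sigmoid psychometric function predicts. The sketch below uses a logistic curve purely for illustration; the abstract does not state the functional form used, and the `midpoint` and `steepness` values are hypothetical placeholders, not parameters fitted in the paper.

```python
import math

def recognition(sti, midpoint=0.5, steepness=10.0):
    """Logistic psychometric function: recognised fraction as a function of STI.

    midpoint and steepness are illustrative assumptions, not fitted values.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (sti - midpoint)))

def sensitivity(sti, midpoint=0.5, steepness=10.0):
    """Slope of the curve: change in recognised fraction per unit change in STI."""
    p = recognition(sti, midpoint, steepness)
    return steepness * p * (1.0 - p)
```

With any such curve the slope peaks at the midpoint and flattens toward both ends of the STI scale, so the same STI change shifts recognition much more in the middle range than near the extremes.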


2021 ◽  
Vol 69 (2) ◽  
pp. 173-179
Author(s):  
Nikolina Samardzic ◽  
Brian C.J. Moore

Traditional methods for predicting the intelligibility of speech in the presence of noise inside a vehicle, such as the Articulation Index (AI), the Speech Intelligibility Index (SII), and the Speech Transmission Index (STI), are not accurate, probably because they do not take binaural listening into account; the signals reaching the two ears can differ markedly depending on the positions of the talker and listener. We propose a new method for predicting the intelligibility of speech in a vehicle, based on the ratio of the binaural loudness of the speech to the binaural loudness of the noise, each calculated using the method specified in ISO 532-2 (2017). The method was found to give accurate predictions of the speech reception threshold (SRT) measured under a variety of conditions and for different positions of the talker and listener in a car. The typical error in the predicted SRT was 1.3 dB, which is markedly smaller than the errors obtained using the SII and STI (2.0 dB and 2.1 dB, respectively).


2019 ◽  
Vol XXII (2) ◽  
pp. 268-275
Author(s):  
Pazara T.

In a lecture hall it is vital to ensure proper teaching conditions, meaning that the information from the speaker or teacher must be received correctly by the listeners. Speech intelligibility is the main objective when a lecture hall is evaluated. In this paper, the authors discuss the importance of the acoustics of a lecture hall and the influence of various parameters on speech transmission from the speaker (the professor) to the listeners (the students). The number of acoustical parameters is very large, but the Speech Transmission Index (STI) and Reverberation Time (RT) are commonly used to evaluate the acoustics of a teaching room. Other parameters, such as room geometry and seat placement, have a great influence on speech intelligibility. As an example, a lecture hall with 120 seats at the Naval Academy „Mircea cel Batran“ is investigated using virtual simulations with ODEON software. The results of the simulations are discussed and some remarks are made regarding the current condition of the lecture hall.


Author(s):  
Eriberto Oliveira do Nascimento ◽  
Paulo Henrique Trombetta Zannin

Poor acoustic quality in a classroom reduces speech intelligibility and thereby directly affects the educational relationship between student and teacher. In addition, inadequate acoustic comfort burdens the vocal health of teachers. This study evaluated a classroom at the Federal University of Paraná, Campus Centro Politécnico, to verify its acoustic quality. The acoustic descriptors Reverberation Time (RT), Definition (D50), Central Time (Ts) and Early Decay Time (EDT) were measured according to the ISO 3382-2 standard, while Noise Curves (NC) and Background Noise (BGN) were evaluated according to the NBR 10152 and S12.2 standards. The Speech Transmission Index (STI) was measured according to IEC 60268-16 and evaluated according to ISO 9921. The useful-detrimental ratio (U50) and the other descriptors were simulated in the ODEON software, version 11. The results showed that the evaluated room did not meet the minimum requirements in terms of acoustic quality for the descriptors RT, STI, Ts, D50, RF, and NC. The RT and STI were also outside the limits established by the German and Finnish standards. It is therefore concluded that the evaluated classroom did not reach the minimum acoustic quality requirements.
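
Two of the descriptors named above, Definition (D50) and Central Time (Ts, often called centre time), follow directly from the energy of a measured impulse response. The sketch below uses their usual textbook definitions and is not the study's measurement procedure: the function names are hypothetical, the impulse response is assumed to be broadband, and sample 0 is assumed to coincide with the arrival of the direct sound.

```python
def definition_d50(ir, fs):
    """D50: fraction of the total energy arriving within the first 50 ms.

    ir is the impulse response (list of samples), fs the sampling rate in Hz.
    Assumes ir[0] is the direct-sound arrival.
    """
    n50 = int(round(0.050 * fs))
    energy = [h * h for h in ir]
    return sum(energy[:n50]) / sum(energy)

def centre_time(ir, fs):
    """Ts: energy-weighted mean arrival time of the impulse response, in seconds."""
    energy = [h * h for h in ir]
    return sum((k / fs) * e for k, e in enumerate(energy)) / sum(energy)
```

A higher D50 and a shorter Ts both indicate that more of the energy arrives early, which generally favours speech intelligibility. In practice these quantities are usually evaluated per octave band rather than on the broadband response.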


2021 ◽  
Vol 3 (5 (111)) ◽  
pp. 47-56
Author(s):  
Arkadiy Prodeus ◽  
Maryna Didkovska

Speech intelligibility scores obtained using objective and subjective methods are presented for three university lecture rooms of small, medium, and large size with different degrees of occupancy. The problem of achieving high speech intelligibility is relevant to students and university administrations as well as to architects designing or reconstructing lecture rooms. Speech intelligibility was assessed using binaural room impulse responses measured with an artificial head and non-professional audio equipment. The Speech Transmission Index served as the objective measure of speech intelligibility, while the subjective evaluation was carried out using the articulation method. A comparative analysis of impulse response parameters as measures of speech intelligibility showed that Early Decay Time was more effective than the T30 reverberation time but was ineffective in the small lecture room. The C50 clarity index was the most informative for all the lecture rooms considered. Several patterns determined by the influence of early sound reflections on speech intelligibility were detected. Specifically, it was shown that an increase in the ratio of the energy of early reflections to the energy of direct sound leads to a decrease in speech intelligibility. The exceptions are small distances, up to 30‒40 cm, from the back wall of the room, where speech intelligibility is usually slightly higher than in the middle of the room. At a distance of 0.7–1.7 m from the side walls of the room, speech intelligibility is usually worse for the ear that is closer to the wall. The usefulness of the results lies in refining the quantitative characteristics of the influence of early sound reflections on speech intelligibility at different points in lecture rooms.


2017 ◽  
Vol 13 (7) ◽  
pp. 69
Author(s):  
Pasit Leeniva ◽  
Prapatpong Upala

The objectives of this research are to evaluate acoustic environments and to forecast STI values from spatial component variables in large classrooms of a Thai public university, in which the room finishing materials (floor, walls, and ceiling) were controlled to be the same. The five spatial component factors were (1) Room Volume (RV), (2) Ceiling Height (CH), (3) the Ratio of Depth to Width (Rdw), (4) Total Room Surface (TS), and (5) Percentage of Absorbing Surface areas (PAS). The research tools were smartphones running applications for acoustical evaluation and speech intelligibility analysis. The Speech Transmission Index (STI), Reverberation Time (RT), and Background Noise Level (BNL) were collected with a calibrated microphone at nine points distributed across the entire room. The test sounds were a balloon burst and a STIPA signal played through a sound generator. The Thailand Speech Intelligibility (T-SI) model was developed by multiple regression analysis at a confidence level of 95%. The results showed that the T-SI model, which predicts STI values, depended on a strongly positive relationship with PAS and slightly positive relationships with CH and TS, while RV and Rdw had slightly negative relationships. Moreover, the most influential variable in the T-SI model was CH and the least influential was PAS. The research implies that room acoustic quality could be improved by adjusting the sound-absorbing surface area, e.g., by adding cloth curtains or using other appropriate methods.
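
A multiple regression of the kind described above fits an intercept and one coefficient per predictor by ordinary least squares. The following is a generic sketch of that fitting step and does not reproduce the T-SI model: the abstract gives neither the coefficients nor the underlying data, and the function name `fit_linear` is hypothetical.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    rows: one list of predictor values per observation (for instance
    RV, CH, Rdw, TS, PAS); an intercept column is prepended here.
    Returns [intercept, coef_1, ..., coef_k].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented normal-equation matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]
```

The signs of the fitted coefficients correspond to the positive and negative relationships reported in the abstract. Comparing the influence of predictors measured in different units would normally require standardising them first.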

