Listening tests in room acoustics: Comparison of overall difference protocols regarding operational power

2021 ◽  
Vol 182 ◽  
pp. 108186
Author(s):  
Daniel de la Prida ◽  
Antonio Pedrero ◽  
Luis Antonio Azpicueta-Ruiz ◽  
María Ángeles Navacerrada
2011 ◽  
Vol 131 (4) ◽  
pp. 800-807 ◽  
Author(s):  
Ayuko Shigeta ◽  
Takeshi Koike ◽  
Kazuhiko Hamamoto ◽  
Kiyoshi Nosu

Author(s):  
Yuri Khokhlov ◽  
Alexander Zatvornitskiy ◽  
Ivan Medennikov ◽  
Ivan Sorokin ◽  
Tatiana Prisyach ◽  
...  

2020 ◽  
Author(s):  
Lieber Po-Hung Li ◽  
Ji-Yan Han ◽  
Wei-Zhong Zheng ◽  
Ren-Jie Huang ◽  
Ying-Hui Lai

BACKGROUND Cochlear implant technology is a well-known approach to help deaf patients hear speech again. It can improve speech intelligibility in quiet conditions; however, it still has room for improvement in noisy conditions. More recently, it has been shown that deep learning–based noise reduction (NR), such as noise classification and deep denoising autoencoder (NC+DDAE), can benefit the intelligibility performance of patients with cochlear implants compared to classical noise reduction algorithms.
OBJECTIVE Following the successful implementation of the NC+DDAE model in our previous study, this study aimed to (1) propose an advanced noise reduction system using knowledge transfer technology, called NC+DDAE_T, (2) examine the proposed NC+DDAE_T noise reduction system using objective evaluations and subjective listening tests, and (3) investigate which layer substitution of the knowledge transfer technology in the NC+DDAE_T noise reduction system provides the best outcome.
METHODS The knowledge transfer technology was adopted to reduce the number of parameters of the NC+DDAE_T compared with the NC+DDAE. We investigated which layer should be substituted using short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ) scores, as well as t-distributed stochastic neighbor embedding to visualize the features in each model layer. Moreover, we enrolled ten cochlear implant users for listening tests to evaluate the benefits of the newly developed NC+DDAE_T.
RESULTS The experimental results showed that substituting the middle layer (ie, the second layer in this study) of the noise-independent DDAE (NI-DDAE) model achieved the best performance gain regarding STOI and PESQ scores. Therefore, the parameters of the second layer in the NI-DDAE were chosen to be replaced, thereby establishing the NC+DDAE_T. Both objective and listening test results showed that the proposed NC+DDAE_T noise reduction system achieved performance similar to that of the previous NC+DDAE in several noisy test conditions. However, the proposed NC+DDAE_T needs only a quarter of the number of parameters of the NC+DDAE.
CONCLUSIONS This study demonstrated that knowledge transfer technology can help to reduce the number of parameters in an NC+DDAE while maintaining similar performance. This suggests that the proposed NC+DDAE_T model may reduce the implementation costs of this noise reduction system and provide more benefits for cochlear implant users.


2021 ◽  
Vol 11 (3) ◽  
pp. 1150
Author(s):  
Stephan Werner ◽  
Florian Klein ◽  
Annika Neidhardt ◽  
Ulrike Sloma ◽  
Christian Schneiderwind ◽  
...  

For spatial audio reproduction in the context of augmented reality, a position-dynamic binaural synthesis system can be used to synthesize the ear signals for a moving listener. The goal is the fusion of the auditory perception of the virtual audio objects with the real listening environment. Such a system has several components, each of which helps to enable a plausible auditory simulation. For each possible position of the listener in the room, a set of binaural room impulse responses (BRIRs) congruent with the expected auditory environment is required to avoid room divergence effects. Methods that synthesize new BRIRs from very few measurements of the listening room offer an adequate and efficient approach. The required spatial resolution of the BRIR positions can be estimated from spatial auditory perception thresholds. Retrieving and processing the tracking data of the listener’s head pose and position, as well as convolving BRIRs with an audio signal, needs to be done in real time. This contribution presents the authors' work on such a system and describes several of its technical components in detail. It shows how the individual components are affected by psychoacoustics. Furthermore, the paper discusses the perceptual effect by means of listening tests that demonstrate the appropriateness of the approaches.
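The convolution step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a mono source and one already-selected BRIR pair; the function names `convolve` and `binaural_render` and the toy impulse responses are hypothetical and not taken from the paper. A real-time system would more likely use partitioned FFT convolution and switch or interpolate BRIRs as the listener moves, which this sketch does not attempt.

```python
from typing import List, Sequence, Tuple


def convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Direct-form linear convolution.

    The output has len(signal) + len(impulse_response) - 1 samples.
    """
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def binaural_render(
    mono: Sequence[float],
    brir_left: Sequence[float],
    brir_right: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Produce left and right ear signals by convolving one mono source
    with the left-ear and right-ear impulse responses of a single BRIR."""
    return convolve(mono, brir_left), convolve(mono, brir_right)


# Toy data (illustrative only): a short click and two very short "BRIRs".
source = [1.0, 2.0, 3.0]
left_ir = [1.0, 0.5]
right_ir = [0.0, 1.0]
left_ear, right_ear = binaural_render(source, left_ir, right_ir)
```

Direct convolution costs on the order of the signal length times the impulse response length, which is one reason measured room responses with many thousands of taps are usually processed block-wise in the frequency domain instead.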


Electronics ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 1349
Author(s):  
Stefan Lattner ◽  
Javier Nistal

Lossy audio codecs compress (and decompress) digital audio streams by removing information that tends to be inaudible in human perception. Under high compression rates, such codecs may introduce a variety of impairments in the audio signal. Many works have tackled the problem of audio enhancement and compression artifact removal using deep-learning techniques. However, only a few works tackle the restoration of heavily compressed audio signals in the musical domain. In such a scenario, there is no unique solution for the restoration of the original signal. Therefore, in this study, we test a stochastic generator of a Generative Adversarial Network (GAN) architecture for this task. Such a stochastic generator, conditioned on highly compressed musical audio signals, could one day generate outputs indistinguishable from high-quality releases. The present study may therefore yield insights into more efficient musical data storage and transmission. We train stochastic and deterministic generators on MP3-compressed audio signals at 16, 32, and 64 kbit/s. We perform an extensive evaluation of the different experiments using objective metrics and listening tests. We find that the models can improve the quality of the audio signals over the MP3 versions for 16 and 32 kbit/s and that the stochastic generators are capable of generating outputs that are closer to the original signals than those of the deterministic generators.


2021 ◽  
Vol 13 (13) ◽  
pp. 7320
Author(s):  
Tobias Pietrzyk ◽  
Markus Georgi ◽  
Sabine Schlittmeier ◽  
Katharina Schmitz

In this study, sound measurements of an axial piston pump and an internal gear pump were performed, and subjective pleasantness judgements were collected in listening tests. Subjective pleasantness can be seen as the inverse of the subjective annoyance of hydraulic drives. Pumps are the dominant sound source in hydraulic systems, and the noise generation of displacement machines is a subject of current research. However, that research has considered only the sound pressure level (SPL). Psychoacoustic metrics offer new possibilities to analyze the sound of hydraulic drive technology and to improve the sound quality. For this purpose, instrumental measurements of the acoustic and psychoacoustic parameters are evaluated for both pump types. The recorded sounds are played back to the participants in listening tests. Participants rate their subjective pleasantness by means of paired comparison, which is an indirect scaling method. The dependence of the subjective pleasantness on speed and pressure was analyzed for both pump types. Different regression analyses were carried out to predict the subjectively perceived pleasantness or annoyance of the pumps. Results show that a lower speed is the decisive operating parameter for reducing both the SPL and the annoyance of a hydraulic pump.
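As an illustration of how paired-comparison judgements can be turned into a scale, the sketch below fits a Bradley-Terry model with the standard iterative (minorization-maximization) update. The abstract does not say which scaling model the authors used, so the choice of Bradley-Terry, the function name `bradley_terry`, and the win counts are all assumptions made for illustration.

```python
from typing import List, Sequence


def bradley_terry(wins: Sequence[Sequence[int]], iterations: int = 200) -> List[float]:
    """Estimate Bradley-Terry strengths from a square win-count matrix.

    wins[i][j] is the number of times stimulus i was preferred over
    stimulus j. Returned strengths are positive and sum to 1.
    Assumption: every stimulus is preferred at least once; otherwise its
    strength estimate would collapse to zero.
    """
    n = len(wins)
    total_wins = [sum(wins[i][j] for j in range(n) if j != i) for i in range(n)]
    if any(w == 0 for w in total_wins):
        raise ValueError("each stimulus must win at least one comparison")

    strengths = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Comparisons involving i, weighted by the current pair strengths.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strengths[i] + strengths[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins[i] / denom)
        norm = sum(updated)
        strengths = [s / norm for s in updated]
    return strengths


# Hypothetical counts for three pump sounds, ten judgements per pair.
example_wins = [
    [0, 8, 9],
    [2, 0, 7],
    [1, 3, 0],
]
pleasantness = bradley_terry(example_wins)
```

The resulting strengths give a ratio-type scale: under the model, the probability that sound i is preferred over sound j is strengths[i] / (strengths[i] + strengths[j]). Values scaled in this way could then serve as the dependent variable in a regression on operating parameters such as speed and pressure.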

