Speech-in-noise enhancement using amplification and dynamic range compression controlled by the speech intelligibility index

2015 ◽  
Vol 138 (5) ◽  
pp. 2692-2706 ◽  
Author(s):  
Henning Schepker ◽  
Jan Rennies ◽  
Simon Doclo
2020 ◽  
Vol 24 ◽  
pp. 233121652097563
Author(s):  
Christopher F. Hauth ◽  
Simon C. Berning ◽  
Birger Kollmeier ◽  
Thomas Brand

The equalization cancellation model is often used to predict the binaural masking level difference. Previously its application to speech in noise has required separate knowledge about the speech and noise signals to maximize the signal-to-noise ratio (SNR). Here, a novel, blind equalization cancellation model is introduced that can use the mixed signals. This approach does not require any assumptions about particular sound source directions. It uses different strategies for positive and negative SNRs, with the switching between the two steered by a blind decision stage utilizing modulation cues. The output of the model is a single-channel signal with enhanced SNR, which we analyzed using the speech intelligibility index to compare speech intelligibility predictions. In a first experiment, the model was tested on experimental data obtained in a scenario with spatially separated target and masker signals. Predicted speech recognition thresholds were in good agreement with measured speech recognition thresholds with a root mean square error less than 1 dB. A second experiment investigated signals at positive SNRs, which was achieved using time compressed and low-pass filtered speech. The results demonstrated that binaural unmasking of speech occurs at positive SNRs and that the modulation-based switching strategy can predict the experimental results.
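The equalization-cancellation principle behind this abstract can be sketched in a few lines. This is not the authors' model. It assumes a single broadband channel, integer-sample delays, a small hypothetical gain grid, and a caller-supplied `minimize` flag standing in for the blind modulation-based decision stage. Minimizing output power when the masker dominates and maximizing it otherwise is an assumption made here for illustration.

```python
def shift(x, d):
    """Delay x by d >= 0 samples, zero-padding the start and keeping the length."""
    return [0.0] * d + list(x[:len(x) - d])


def power(x):
    """Mean squared value of a signal."""
    return sum(v * v for v in x) / len(x)


def ec_process(left, right, max_delay=4,
               gains=(0.25, 0.5, 1.0, 2.0, 4.0), minimize=True):
    """Grid-search an interaural delay and gain, then subtract the ears.

    Negative d delays the left ear, positive d delays the right ear.
    Returns (delay, gain, single-channel output).
    """
    best = None
    for d in range(-max_delay, max_delay + 1):
        l = shift(left, max(-d, 0))   # equalization: delay one ear
        r = shift(right, max(d, 0))
        for g in gains:
            out = [a - g * b for a, b in zip(l, r)]  # cancellation
            p = power(out)
            better = best is None or (p < best[0] if minimize else p > best[0])
            if better:
                best = (p, d, g, out)
    return best[1], best[2], best[3]
```

With a masker that reaches the right ear two samples later and at half the amplitude, the search should select a two-sample delay of the left ear and a gain of 2, which cancels the masker completely.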


2005 ◽  
Vol 48 (3) ◽  
pp. 702-714 ◽  
Author(s):  
Peninah S. Rosengard ◽  
Karen L. Payton ◽  
Louis D. Braida

The purpose of this study was twofold: (a) to determine the extent to which 4-channel, slow-acting wide dynamic range amplitude compression (WDRC) can counteract the perceptual effects of reduced auditory dynamic range and (b) to examine the relation between objective measures of speech intelligibility and categorical ratings of speech quality for sentences processed with slow-acting WDRC. Multiband expansion was used to simulate the effects of elevated thresholds and loudness recruitment in normal hearing listeners. While some previous studies have shown that WDRC can improve both speech intelligibility and quality, others have found no benefit. The current experiment shows that moderate amounts of compression can provide a small but significant improvement in speech intelligibility, relative to linear amplification, for simulated-loss listeners with small dynamic ranges (i.e., flat, moderate hearing loss). This benefit was found for speech at conversational levels, both in quiet and in a background of babble. Simulated-loss listeners with large dynamic ranges (i.e., sloping, mild-to-moderate hearing loss) did not show any improvement. Comparison of speech intelligibility scores and subjective ratings of intelligibility showed that listeners with simulated hearing loss could accurately judge the overall intelligibility of speech. However, in all listeners, ratings of pleasantness decreased as the compression ratio increased. These findings suggest that subjective measures of speech quality should be used in conjunction with either objective or subjective measures of speech intelligibility to ensure that participant-selected hearing aid parameters optimize both comfort and intelligibility.
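As a rough illustration of what the compression ratio does in a WDRC channel, the sketch below computes the static gain for one band. The compression threshold of 45 dB and the ratio of 3 are hypothetical values chosen for the example, not parameters from the study, and the slow-acting time constants are omitted.

```python
def wdrc_gain_db(input_level_db, threshold_db=45.0, ratio=3.0, linear_gain_db=0.0):
    """Static WDRC gain for one channel.

    Below the threshold the channel is linear. Above it, each extra dB of
    input yields only 1/ratio dB of extra output.
    """
    excess = max(input_level_db - threshold_db, 0.0)
    return linear_gain_db + excess / ratio - excess


def wdrc_output_db(input_level_db, **kwargs):
    """Output level in dB after applying the static gain."""
    return input_level_db + wdrc_gain_db(input_level_db, **kwargs)
```

With these example values, an input 30 dB above the threshold ends up only 10 dB above it at the output, which is how compression fits speech into a reduced dynamic range.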


2016 ◽  
Vol 59 (6) ◽  
pp. 1543-1554 ◽  
Author(s):  
Paul N. Reinhart ◽  
Pamela E. Souza

Purpose: The purpose of this study was to examine the effects of varying wide dynamic range compression (WDRC) release time on intelligibility and clarity of reverberant speech. The study also considered the role of individual working memory. Method: Thirty older listeners with mild to moderately-severe sloping sensorineural hearing loss participated. Individuals were divided into high and low working memory groups on the basis of the results of a reading span test. Participants listened binaurally to sentence stimuli simulated at a range of reverberation conditions and WDRC release times using a high compression ratio. Outcome measures included objective intelligibility and subjective clarity ratings. Results: Speech intelligibility and clarity ratings both decreased as a function of reverberation. The low working memory group demonstrated a greater decrease in intelligibility with increasing amounts of reverberation than the high working memory group. Both groups, regardless of working memory, had higher speech intelligibility and clarity ratings with longer WDRC release times. WDRC release time had a larger effect on speech intelligibility under more reverberant conditions. Conclusions: Reverberation significantly affects speech intelligibility, particularly for individuals with lower working memory. In addition, longer release times in hearing aids may improve listener speech intelligibility and clarity in reverberant environments.


2017 ◽  
Vol 60 (6) ◽  
pp. 1674-1680 ◽  
Author(s):  
In-Ki Jin ◽  
James M. Kates ◽  
Kathryn H. Arehart

Purpose: This study aims to evaluate the sensitivity of the speech intelligibility index (SII) to the assumed speech dynamic range (DR) in different languages and with different types of stimuli. Method: Intelligibility prediction uses the absolute transfer function (ATF) to map the SII value to the predicted intelligibility for a given stimulus. To evaluate the sensitivity of the predicted intelligibility to the assumed DR, ATF-transformed SII scores for English (words), Korean (sentences), and Mandarin (sentences) were derived for DRs ranging from 10 dB to 60 dB. Results: Increasing the assumed DR caused steeper ATFs for all languages. However, high correlation coefficients between predicted and measured intelligibility scores were observed for DRs from 20 dB to 60 dB for ATFs in English, Korean, and Mandarin. Conclusions: Results of the present study indicate that the intelligibility computed from the SII is not sensitive to the assumed DR. The 30-dB DR commonly used in computing the SII is thus a reasonable assumption that produces accurate predictions for different languages and different types of stimuli.
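To make the role of the assumed dynamic range concrete, here is a simplified sketch of how it can enter an SII-style calculation. It keeps only the band-audibility step and omits the hearing threshold, spread of masking, and level distortion terms of the full procedure, as well as the ATF mapping. Placing the DR symmetrically around the speech RMS level is an assumption; for a 30 dB DR it reduces to the familiar (SNR + 15) / 30 mapping. The band importance weights are supplied by the caller and are hypothetical here.

```python
def band_audibility(snr_db, dynamic_range_db=30.0):
    """Fraction of the speech dynamic range that lies above the noise in one band.

    Assumes the speech range spans +/- DR/2 around the speech RMS level.
    """
    a = (snr_db + dynamic_range_db / 2.0) / dynamic_range_db
    return min(1.0, max(0.0, a))


def simplified_sii(snr_db_per_band, importance, dynamic_range_db=30.0):
    """Importance-weighted mean of band audibility (weights are normalized)."""
    total = sum(importance)
    return sum(
        w * band_audibility(s, dynamic_range_db)
        for s, w in zip(snr_db_per_band, importance)
    ) / total
```

In this simplified form, a band at +15 dB SNR counts as fully audible with a 30 dB DR but only three-quarters audible with a 60 dB DR, so a wider assumed DR stretches the same SNR range over a smaller change in the index.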


2020 ◽  
Vol 24 ◽  
pp. 233121652093053
Author(s):  
Borys Kowalewski ◽  
Torsten Dau ◽  
Tobias May

Dynamic range compression is a compensation strategy commonly used in modern hearing aids. Fast-acting systems respond relatively quickly to the fluctuations in the input level. This allows for more effective compression of the dynamic range of speech and hence enhances the audibility of its low-intensity components. However, such processing also amplifies the background noise, distorts the modulation spectra of both the speech and the background, and can reduce the output signal-to-noise ratio (SNR). Recently, May et al. proposed a novel SNR-aware compression strategy, in which the compression speed is adapted depending on whether speech is present or absent. Fast-acting compression is applied to speech-dominated time–frequency (T-F) units, while noise-dominated T-F units are processed using slow-acting compression. It has been shown that this strategy provides a similar effective compression of the speech dynamic range as conventional fast-acting compression, while introducing fewer distortions of the modulation spectrum of the background and providing an improved output SNR. In this study, this SNR-aware compression strategy was compared with conventional fast- and slow-acting compression in terms of speech intelligibility and subjective preference in a group of 17 hearing-impaired listeners with varying degrees of hearing loss. The results show a speech intelligibility benefit of the SNR-aware compression strategy over the conventional slow-acting system. Furthermore, the SNR-aware approach demonstrates an increased subjective preference compared with both conventional fast- and slow-acting systems.
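The switching idea described in this abstract can be sketched for a single frequency channel as a level estimator whose smoothing time constant depends on a per-frame speech-dominance flag. This is a minimal illustration, not the published algorithm: the flag is assumed to be given, the time constants and frame rate are hypothetical, and separate attack and release behavior is left out.

```python
import math


def smoothing_coeff(time_constant_s, frame_rate_hz):
    """One-pole smoothing coefficient for a given time constant."""
    return math.exp(-1.0 / (time_constant_s * frame_rate_hz))


def snr_aware_level(levels_db, speech_dominated, frame_rate_hz=100.0,
                    fast_tau_s=0.01, slow_tau_s=1.0):
    """Track the level in one channel, fast in speech frames, slow in noise frames.

    levels_db: per-frame input levels in dB.
    speech_dominated: per-frame booleans (assumed to come from an SNR estimator).
    """
    if not levels_db:
        return []
    a_fast = smoothing_coeff(fast_tau_s, frame_rate_hz)
    a_slow = smoothing_coeff(slow_tau_s, frame_rate_hz)
    estimate = levels_db[0]
    tracked = []
    for level, is_speech in zip(levels_db, speech_dominated):
        a = a_fast if is_speech else a_slow
        estimate = a * estimate + (1.0 - a) * level
        tracked.append(estimate)
    return tracked
```

The tracked level would then drive a compressive gain rule, so speech-dominated units receive fast-acting compression while noise-dominated units are effectively processed with slowly varying gain.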

