Modeling Binaural Unmasking of Speech Using a Blind Binaural Processing Stage

2020 ◽  
Vol 24 ◽  
pp. 233121652097563
Author(s):  
Christopher F. Hauth ◽  
Simon C. Berning ◽  
Birger Kollmeier ◽  
Thomas Brand

The equalization-cancellation model is often used to predict the binaural masking level difference. Previously, its application to speech in noise required separate knowledge of the speech and noise signals to maximize the signal-to-noise ratio (SNR). Here, a novel, blind equalization-cancellation model is introduced that can operate on the mixed signals. This approach does not require any assumptions about particular sound source directions. It uses different strategies for positive and negative SNRs, with the switching between the two steered by a blind decision stage utilizing modulation cues. The output of the model is a single-channel signal with enhanced SNR, which we analyzed using the speech intelligibility index to compare speech intelligibility predictions. In a first experiment, the model was tested on experimental data obtained in a scenario with spatially separated target and masker signals. Predicted speech recognition thresholds were in good agreement with measured speech recognition thresholds, with a root-mean-square error of less than 1 dB. A second experiment investigated signals at positive SNRs, achieved using time-compressed and low-pass-filtered speech. The results demonstrated that binaural unmasking of speech occurs at positive SNRs and that the modulation-based switching strategy can predict the experimental results.
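The core equalization-cancellation operation can be illustrated with a minimal sketch: equalize one ear's signal in gain and delay, then subtract to cancel a masker common to both ears. This is only a toy illustration of the principle, not the authors' blind implementation; the signals, delay, and gain here are all assumed for the example.

```python
import numpy as np

def ec_cancel(left, right, delay, gain):
    """Equalization-cancellation: align the right-ear signal to the
    left (integer-sample delay and gain), then subtract so that a
    masker common to both ears cancels."""
    equalized = gain * np.roll(right, delay)
    return left - equalized

# Toy scenario: a diotic masker (identical at both ears) plus a
# target present only at the left ear.
rng = np.random.default_rng(0)
masker = rng.standard_normal(1000)
target = np.sin(2 * np.pi * 0.01 * np.arange(1000))

left = target + masker
right = masker.copy()

out = ec_cancel(left, right, delay=0, gain=1.0)
# The masker cancels (up to floating-point error), leaving the
# target, so the output SNR is greatly enhanced.
print(np.allclose(out, target))
```

In the blind model described above, the delay and gain are not known a priori but must be estimated from the mixed signals themselves, which is what the modulation-steered decision stage addresses.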

2019 ◽  
Vol 62 (5) ◽  
pp. 1517-1531 ◽  
Author(s):  
Sungmin Lee ◽  
Lisa Lucks Mendel ◽  
Gavin M. Bidelman

Purpose Although the speech intelligibility index (SII) has been widely applied in audiology and related fields, its application to cochlear implants (CIs) has yet to be investigated. In this study, SIIs for CI users were calculated to investigate whether the SII could be an effective tool for predicting speech perception performance in a CI population. Method Fifteen pre- and postlingually deafened adults with CIs participated. Speech recognition scores were measured using the AzBio sentence lists. CI users also completed questionnaires and performed psychoacoustic (spectral and temporal resolution) and cognitive function (digit span) tests. Obtained SIIs were compared with predicted SIIs using a transfer function curve. Correlation and regression analyses were conducted on perceptual and demographic predictor variables to investigate the association between these factors and speech perception performance. Results Because of the considerably poor hearing and large individual variability in performance, the SII did not predict speech performance for this CI group using the traditional calculation. However, new SII models were developed incorporating predictive factors, which improved the accuracy of SII predictions in listeners with CIs. Conclusion Conventional SII models are not appropriate for predicting speech perception scores for CI users. Demographic variables (aided audibility and duration of deafness) and perceptual–cognitive skills (gap detection and auditory digit span outcomes) are needed to improve the use of the SII for listeners with CIs. Future studies are needed to improve our CI-corrected SII model by considering additional predictive factors. Supplemental Material https://doi.org/10.23641/asha.8057003
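The corrected-SII idea sketched in this abstract, regressing observed speech scores on the conventional SII plus additional demographic and perceptual predictors, can be illustrated with ordinary least squares. The predictor columns and all data values below are made up for illustration; this is not the study's fitted model.

```python
import numpy as np

# Illustrative data: per listener, the conventional SII plus two
# extra predictors (aided audibility, digit-span score), and the
# observed percent-correct speech score. All values are invented.
X = np.array([
    [0.30, 0.45, 5],
    [0.42, 0.55, 7],
    [0.35, 0.50, 6],
    [0.50, 0.65, 8],
    [0.28, 0.40, 4],
    [0.46, 0.60, 7],
])
y = np.array([38.0, 55.0, 47.0, 66.0, 33.0, 60.0])

# Ordinary least squares with an intercept column: the fitted
# model predicts speech scores from the SII plus the extra factors.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
predicted = A @ coef
print(np.round(predicted, 1))
```

With an intercept in the model, the residuals sum to zero, so the fitted scores match the observed scores on average; the question the study addresses is which predictors reduce the residual spread enough to make the SII useful for CI listeners.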


2020 ◽  
Vol 16 (3) ◽  
pp. 217-225
Author(s):  
Hongyeop Oh ◽  
Soon-Je Choi ◽  
In-Ki Jin

Purpose: This study aimed to derive band-importance functions (BIFs) and transfer functions (TFs) according to contextual predictability clues, to determine the influence of contextual predictability clues in Korean speech material on the speech intelligibility index (SII). Methods: This study was conducted on 156 native speakers of Korean with normal hearing. The Korean speech perception in noise test material, composed of 120 high-predictability and 120 low-predictability sentences, was used for stimuli. To obtain intelligibility data, participants were tested in various frequency ranges and signal-to-noise ratio conditions. To derive the BIF and the TF, a nonlinear optimization procedure in MATLAB (MathWorks, Inc.) was used. Results: The BIF derived from the high-predictability sentences showed peaks at 700 Hz (7.0%), 1,850 Hz (8.5%), and 4,800 Hz (7.6%). The crossover frequency for the high-predictability sentences was around 1,370 Hz. The BIF derived from the low-predictability sentences showed peaks at 570 Hz (7.5%), 1,850 Hz (9.3%), and 4,000 Hz (8.0%). The crossover frequency for the low-predictability sentences was around 1,600 Hz. The TF curves derived from high-predictability sentences were steeper than those derived from low-predictability sentences. Conclusion: In the SII model, speech intelligibility differs according to contextual predictability clues. In particular, at identical audibility, the more contextual predictability clues available, the higher the intelligibility predicted by the SII. Therefore, accurate speech intelligibility prediction requires an SII that takes into account the contextual predictability clues characteristic of the stimulus.
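The SII framework underlying this work combines per-band audibility with a band-importance function. A minimal sketch of that weighted sum follows; the band count, importance weights, and audibility values are illustrative placeholders, not the Korean BIFs derived in the study.

```python
import numpy as np

def sii(band_audibility, band_importance):
    """Speech Intelligibility Index as a weighted sum of band
    audibilities (each clipped to [0, 1]) and band-importance
    weights (which sum to 1)."""
    a = np.clip(np.asarray(band_audibility, dtype=float), 0.0, 1.0)
    w = np.asarray(band_importance, dtype=float)
    return float(np.sum(w * a))

# Illustrative 5-band example; the weights sum to 1.
importance = [0.10, 0.25, 0.30, 0.25, 0.10]
audibility = [1.0, 0.8, 0.5, 0.2, 0.0]
print(round(sii(audibility, importance), 3))  # 0.5
```

Deriving a BIF, as in the study, is the inverse problem: given measured intelligibility across filtering and SNR conditions, find the weights (and a transfer function mapping SII to percent correct) that best explain the data, which is why a nonlinear optimization procedure is required.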


2018 ◽  
Vol 27 (4) ◽  
pp. 581-593 ◽  
Author(s):  
Lisa Brody ◽  
Yu-Hsiang Wu ◽  
Elizabeth Stangl

Purpose The aim of this study was to compare the benefit of self-adjusted personal sound amplification products (PSAPs) to audiologist-fitted hearing aids based on speech recognition, listening effort, and sound quality in ecologically relevant test conditions, in order to estimate real-world effectiveness. Method Twenty-five older adults with bilateral mild-to-moderate hearing loss completed the single-blinded, crossover study. Participants underwent aided testing using 3 PSAPs and a traditional hearing aid, as well as unaided testing. PSAPs were adjusted based on participant preference, whereas the hearing aid was configured using best-practice verification protocols. Audibility provided by the devices was quantified using the Speech Intelligibility Index (American National Standards Institute, 2012). Outcome measures assessing speech recognition, listening effort, and sound quality were administered in ecologically relevant laboratory conditions designed to represent real-world speech listening situations. Results All devices significantly improved the Speech Intelligibility Index compared to unaided listening, with the hearing aid providing more audibility than all PSAPs. Results further revealed that, in general, the hearing aid improved speech recognition performance and reduced listening effort significantly more than all PSAPs. Few differences in sound quality were observed between devices. All PSAPs improved speech recognition and listening effort compared to unaided testing. Conclusions Hearing aids fitted using best-practice verification protocols provided more aided audibility, better speech recognition performance, and lower listening effort than the PSAPs tested in the current study. Differences in sound quality between the devices were minimal. However, because all PSAPs tested in the study significantly improved participants' speech recognition performance and reduced listening effort compared to unaided listening, PSAPs could serve as a budget-friendly option for those who cannot afford traditional amplification.


2021 ◽  
Vol 15 ◽  
Author(s):  
Jing Chen ◽  
Zhe Wang ◽  
Ruijuan Dong ◽  
Xinxing Fu ◽  
Yuan Wang ◽  
...  

Objective: This study was aimed at evaluating improvements in speech-in-noise recognition ability as measured by signal-to-noise ratio (SNR) with the use of wireless remote microphone technology. These microphones transmit digital signals via radio frequency directly to hearing aids and may be a valuable assistive listening device for the hearing-impaired population of Mandarin speakers in China. Methods: Twenty-three adults (aged 19–80 years old) and fourteen children (aged 8–17 years old) with bilateral sensorineural hearing loss were recruited. The Mandarin Hearing in Noise Test was used to test speech recognition ability in adult subjects, and the Mandarin Hearing in Noise Test for Children was used for children. The subjects’ perceived SNR was measured using sentence recognition ability at three different listening distances of 1.5, 3, and 6 m. At each distance, SNR was obtained under three device settings: hearing aid microphone alone, wireless remote microphone alone, and hearing aid microphone and wireless remote microphone simultaneously. Results: At each test distance, for both adult and pediatric groups, speech-in-noise recognition thresholds were significantly lower with the use of the wireless remote microphone in comparison with the hearing aid microphones alone (P < 0.05), indicating better SNR performance with the wireless remote microphone. Moreover, when the wireless remote microphone was used, test distance had no effect on speech-in-noise recognition for either adults or children. Conclusion: Wireless remote microphone technology can significantly improve speech recognition performance in challenging listening environments for Mandarin-speaking hearing aid users in China.


Author(s):  
Julie Beadle ◽  
Jeesun Kim ◽  
Chris Davis

Purpose: Listeners understand significantly more speech in noise when the talker's face can be seen (visual speech) in comparison to an auditory-only baseline (a visual speech benefit). This study investigated whether the visual speech benefit is reduced when the correspondence between auditory and visual speech is uncertain, and whether any reduction is affected by listener age (older vs. younger) and by how severely the auditory signal is masked. Method: Older and younger adults completed a speech recognition in noise task that included an auditory-only condition and four auditory–visual (AV) conditions in which one, two, four, or six silent talking face videos were presented. One face always matched the auditory signal; the other face(s) did not. Auditory speech was presented in noise at −6 and −1 dB signal-to-noise ratio (SNR). Results: When the SNR was −6 dB, for both age groups, the standard-sized visual speech benefit reduced as more talking faces were presented. When the SNR was −1 dB, younger adults received the standard-sized visual speech benefit even when two talking faces were presented, whereas older adults did not. Conclusions: The size of the visual speech benefit obtained by older adults was always smaller when AV correspondence was uncertain; this was not the case for younger adults. Difficulty establishing AV correspondence may be a factor that limits older adults' speech recognition in noisy AV environments. Supplemental Material https://doi.org/10.23641/asha.16879549


1995 ◽  
Vol 38 (1) ◽  
pp. 234-243 ◽  
Author(s):  
Sarah E. Hargus ◽  
Sandra Gordon-Salant

This study examined whether the accuracy of Speech Intelligibility Index (SII) predictions is affected by subject age when between-groups auditory sensitivity differences are controlled. SII predictive accuracy was assessed for elderly listeners with hearing impairment (EHI) and for young noise-masked listeners with normal hearing (NMN). SII predictive accuracy was poorer for the EHI subjects than for the NMN subjects across a range of test conditions and stimuli. Speech test redundancy, speech presentation level, signal-to-babble ratio, and babble level also affected SII predictive accuracy. The results suggest that the speech recognition difficulties experienced in noise by elderly listeners do not result solely from reduced auditory sensitivity.


2019 ◽  
Vol 23 ◽  
pp. 233121652091919
Author(s):  
Gertjan Dingemanse ◽  
André Goedegebure

This study examines whether speech-in-noise tests that use adaptive procedures to assess a speech reception threshold in noise (SRT50n) can be optimized using stochastic approximation (SA) methods, especially in cochlear-implant (CI) users. A simulation model was developed that simulates intelligibility scores for words from sentences in noise for both CI users and normal-hearing (NH) listeners. The model was used in Monte Carlo simulations. Four different SA algorithms were optimized for use in both groups and compared with clinically used adaptive procedures. The simulation model proved to be valid, as its results agreed very well with existing experimental data. The four optimized SA algorithms all provided an efficient estimation of the SRT50n. They were equally accurate and produced smaller standard deviations (SDs) than the clinical procedures. In CI users, SRT50n estimates had a small bias and larger SDs than in NH listeners. At least 20 sentences per condition and an initial signal-to-noise ratio below the real SRT50n were required to ensure sufficient reliability. In CI users, bias and SD became unacceptably large for a maximum speech intelligibility score in quiet below 70%. In conclusion, SA algorithms with word scoring in adaptive speech-in-noise tests are applicable to various listeners, from CI users to NH listeners. In CI users, they lead to efficient estimation of the SRT50n as long as speech intelligibility in quiet is greater than 70%. SA procedures can be considered a valid and more efficient alternative to the clinical adaptive procedures currently used in CI users.
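The basic stochastic approximation idea behind such adaptive procedures can be sketched with a classic Robbins–Monro update: after each trial, step the SNR down if the response was correct and up if it was incorrect, with a step size that shrinks as 1/trial number, so the track converges toward the 50% point of the psychometric function. The logistic function, its parameters, and the step constant below are all assumed for illustration; the study's four optimized algorithms are more refined than this sketch.

```python
import math
import random

def psychometric(snr, srt=-6.0, spread=1.5):
    """Assumed logistic psychometric function: probability of a
    correct response as a function of SNR (dB), with the 50%
    point at `srt`."""
    return 1.0 / (1.0 + math.exp(-(snr - srt) / spread))

def robbins_monro(n_trials=400, start_snr=0.0, c=4.0, seed=1):
    """Robbins-Monro stochastic approximation of the SRT50n:
    shrink the step as c / trial number and move against the
    difference between the observed response and the 50% target."""
    rng = random.Random(seed)
    snr = start_snr
    for k in range(1, n_trials + 1):
        correct = rng.random() < psychometric(snr)
        snr -= (c / k) * ((1.0 if correct else 0.0) - 0.5)
    return snr

est = robbins_monro()
print(round(est, 1))  # estimate lies near the assumed true SRT of -6 dB
```

Monte Carlo simulation, as used in the study, amounts to running many such simulated tracks against a listener model and examining the bias and standard deviation of the resulting estimates.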


1995 ◽  
Vol 38 (3) ◽  
pp. 706-713 ◽  
Author(s):  
Sandra Gordon-Salant ◽  
Peter J. Fitzgibbons

An index of equivalent performance in noise was developed to compare recognition in different forms of speech distortion. Speech-recognition performance of young and elderly listeners with and without hearing loss was evaluated for undistorted speech presented in quiet and noise, and for speech distorted by four time-compression ratios and by four reverberation times. The data obtained in noise on young subjects with normal hearing served to generate a normalized regression equation, which was used to convert percent-correct performance in different distortion conditions to equivalent performance for undistorted speech at a particular S/N ratio. Comparisons of the equivalent S/N ratios obtained in the various conditions allowed rank-ordering of speech recognition performance in different types of degradation. The data also show that age and hearing loss affect recognition of speech degraded by reverberation or time compression. However, age effects are evident primarily in the most severe distortion conditions. Recognition of undistorted speech in noise was affected by hearing loss but not by age. These findings support a hypothesis that stipulates that increased age produces a reduction in the functional S/N ratio.
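The normalization idea above, mapping a percent-correct score obtained under a distortion to the S/N ratio at which undistorted speech would yield the same score, can be sketched with an assumed logistic performance-intensity function. The midpoint and slope parameters are illustrative; the study's actual regression equation is not reproduced here.

```python
import math

def percent_correct(snr, midpoint=-2.0, slope=1.2):
    """Assumed logistic performance-intensity function for
    undistorted speech in noise: percent correct vs. SNR (dB)."""
    return 100.0 / (1.0 + math.exp(-(snr - midpoint) / slope))

def equivalent_snr(score, midpoint=-2.0, slope=1.2):
    """Invert the function: the S/N ratio at which undistorted
    speech would produce the same percent-correct score."""
    p = score / 100.0
    return midpoint + slope * math.log(p / (1.0 - p))

# Example: a 70% score measured under time compression maps to an
# equivalent SNR for undistorted speech, which allows different
# distortions to be rank-ordered on a common scale.
snr_eq = equivalent_snr(70.0)
print(round(snr_eq, 2))
```

Because the mapping is monotonic, comparing equivalent SNRs preserves the ordering of recognition scores, which is what makes the index usable for rank-ordering different types of degradation.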

