Effects of Contextual Predictability Clues in Speech Materials on the Korean Speech Intelligibility Index

2020 ◽  
Vol 16 (3) ◽  
pp. 217-225
Author(s):  
Hongyeop Oh ◽  
Soon-Je Choi ◽  
In-Ki Jin

Purpose: This study aimed to derive band-importance functions (BIFs) and transfer functions (TFs) for different levels of contextual predictability clues in order to determine how such clues in Korean speech material influence the speech intelligibility index (SII). Methods: The participants were 156 native speakers of Korean with normal hearing. The stimuli were Korean speech perception in noise test material, composed of 120 high-predictability and 120 low-predictability sentences. To obtain intelligibility data, participants were tested in various frequency ranges and signal-to-noise ratio conditions. The BIFs and TFs were derived with a nonlinear optimization procedure in MATLAB (MathWorks, Inc.). Results: The BIF derived from the high-predictability sentences showed peaks around 700 Hz (7.0%), 1,850 Hz (8.5%), and 4,800 Hz (7.6%). The crossover frequency for the high-predictability sentences was around 1,370 Hz. The BIF derived from the low-predictability sentences showed peaks around 570 Hz (7.5%), 1,850 Hz (9.3%), and 4,000 Hz (8.0%). The crossover frequency for the low-predictability sentences was around 1,600 Hz. The TF curves derived from high-predictability sentences were steeper than those derived from low-predictability sentences. Conclusion: In the SII model, speech intelligibility differs according to contextual predictability clues. In particular, at identical audibility, the more contextual predictability clues the material contains, the higher the intelligibility predicted by the SII. Therefore, accurate speech intelligibility prediction requires an SII that accounts for the contextual predictability clues characteristic of the stimulus.
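The quantities named in this abstract (band importance, SII, transfer function, crossover frequency) can be illustrated with a minimal Python sketch. It assumes the commonly used weighted-sum form of the SII (band importance times band audibility, summed over bands) and a Fletcher-Galt style transfer function; the abstract does not state which transfer-function form the authors fitted, so that choice is an assumption. The band edges, importance weights, and the parameters `q` and `n` are hypothetical illustration values, not data from the study.

```python
def sii(importance, audibility):
    """Weighted sum of band audibility (each 0..1) by band importance (sums to 1)."""
    if len(importance) != len(audibility):
        raise ValueError("importance and audibility must have the same length")
    return sum(i * a for i, a in zip(importance, audibility))


def transfer(sii_value, q, n):
    """Assumed Fletcher-Galt style transfer function: SII -> proportion correct.

    q and n are fitting constants; a smaller q gives a steeper curve.
    """
    return (1.0 - 10.0 ** (-sii_value / q)) ** n


def crossover_frequency(band_edges_hz, importance):
    """Frequency at which cumulative band importance reaches 50%.

    Interpolates linearly in Hz inside the band where the 50% point falls
    (a simplification chosen for this sketch).
    """
    cumulative = 0.0
    bands = zip(band_edges_hz, band_edges_hz[1:])
    for (low, high), weight in zip(bands, importance):
        if weight > 0 and cumulative + weight >= 0.5:
            return low + (0.5 - cumulative) / weight * (high - low)
        cumulative += weight
    raise ValueError("importance weights must sum to at least 0.5")


# Hypothetical four-band example (not the study's BIF).
edges_hz = [100, 500, 1000, 2000, 8000]
weights = [0.2, 0.2, 0.3, 0.3]
audible = [1.0, 1.0, 0.0, 0.0]  # only the two lowest bands are audible

index = sii(weights, audible)
steep = transfer(index, q=0.3, n=2)    # hypothetical "high-predictability" curve
shallow = transfer(index, q=0.6, n=2)  # hypothetical "low-predictability" curve
print(index, steep, shallow, crossover_frequency(edges_hz, weights))
```

Under these assumptions, the steeper curve yields a higher predicted score at the same SII, which mirrors the abstract's conclusion that more contextual clues raise predicted intelligibility at identical audibility.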

2020 ◽  
Vol 24 ◽  
pp. 233121652097563
Author(s):  
Christopher F. Hauth ◽  
Simon C. Berning ◽  
Birger Kollmeier ◽  
Thomas Brand

The equalization cancellation model is often used to predict the binaural masking level difference. Previously, its application to speech in noise has required separate knowledge about the speech and noise signals to maximize the signal-to-noise ratio (SNR). Here, a novel, blind equalization cancellation model is introduced that can use the mixed signals. This approach does not require any assumptions about particular sound source directions. It uses different strategies for positive and negative SNRs, with the switching between the two steered by a blind decision stage utilizing modulation cues. The output of the model is a single-channel signal with enhanced SNR, which we analyzed using the speech intelligibility index to compare speech intelligibility predictions. In a first experiment, the model was tested on experimental data obtained in a scenario with spatially separated target and masker signals. Predicted speech recognition thresholds were in good agreement with measured speech recognition thresholds, with a root mean square error of less than 1 dB. A second experiment investigated signals at positive SNRs, which was achieved using time-compressed and low-pass filtered speech. The results demonstrated that binaural unmasking of speech occurs at positive SNRs and that the modulation-based switching strategy can predict the experimental results.
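The agreement criterion mentioned in this abstract, a root mean square error between predicted and measured speech recognition thresholds, can be sketched as follows. The function name `rmse_db` and the threshold values are hypothetical and only illustrate the arithmetic; they are not data from the study.

```python
import math


def rmse_db(predicted, measured):
    """Root mean square error between paired threshold values (in dB)."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical speech recognition thresholds in dB SNR, one pair per condition.
predicted_srt = [-8.0, -12.5, -15.0, -10.0]
measured_srt = [-7.5, -13.0, -14.0, -10.5]

error = rmse_db(predicted_srt, measured_srt)
print(f"RMSE = {error:.2f} dB")
```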


2019 ◽  
Vol 62 (9) ◽  
pp. 3290-3301
Author(s):  
Jingjing Guan ◽  
Chang Liu

Purpose: Degraded speech intelligibility in background noise is a common complaint of listeners with hearing loss. The purpose of the current study is to explore whether 2nd formant (F2) enhancement improves speech perception in noise for older listeners with hearing impairment (HI) and normal hearing (NH). Method: Target words (e.g., color and digit) were selected and presented based on the paradigm of the coordinate response measure corpus. Speech recognition thresholds with original and F2-enhanced speech in 2- and 6-talker babble were examined for older listeners with NH and HI. Results: The thresholds for both the NH and HI groups improved for enhanced speech signals primarily in 2-talker babble, but not in 6-talker babble. The F2 enhancement benefits did not correlate significantly with listeners' age and their average hearing thresholds in most listening conditions. However, speech intelligibility index values increased significantly with F2 enhancement in babble for listeners with HI, but not for NH listeners. Conclusions: Speech sounds with F2 enhancement may improve listeners' speech perception in 2-talker babble, possibly due to a greater amount of speech information available in temporally modulated noise or a better capacity to separate speech signals from background babble.


2020 ◽  
Vol 24 ◽  
pp. 233121652097034
Author(s):  
Florian Langner ◽  
Andreas Büchner ◽  
Waldo Nogueira

Cochlear implant (CI) sound processing typically uses a front-end automatic gain control (AGC), reducing the acoustic dynamic range (DR) to control the output level and protect the signal processing against large amplitude changes. It can also introduce distortions into the signal and does not allow a direct mapping between acoustic input and electric output. For speech in noise, a reduction in DR can result in lower speech intelligibility due to compressed modulations of speech. This study proposes to implement a CI signal processing scheme consisting of a full acoustic DR with adaptive properties to improve the signal-to-noise ratio and overall speech intelligibility. Measurements based on the Short-Time Objective Intelligibility measure and an electrodogram analysis, as well as behavioral tests in up to 10 CI users, were used to compare performance with a single-channel, dual-loop, front-end AGC and with an adaptive back-end multiband dynamic compensation system (Voice Guard [VG]). Speech intelligibility in quiet and at a +10 dB signal-to-noise ratio was assessed with the Hochmair–Schulz–Moser sentence test. A logatome discrimination task with different consonants was performed in quiet. Speech intelligibility was significantly higher in quiet for VG than for AGC, but intelligibility was similar in noise. Participants obtained significantly better scores with VG than AGC in the logatome discrimination task. The objective measurements predicted significantly better performance estimates for VG. Overall, a dynamic compensation system can outperform a single-stage compression (AGC + linear compression) for speech perception in quiet.


2021 ◽  
Vol 69 (2) ◽  
pp. 173-179
Author(s):  
Nikolina Samardzic ◽  
Brian C.J. Moore

Traditional methods for predicting the intelligibility of speech in the presence of noise inside a vehicle, such as the Articulation Index (AI), the Speech Intelligibility Index (SII), and the Speech Transmission Index (STI), are not accurate, probably because they do not take binaural listening into account; the signals reaching the two ears can differ markedly depending on the positions of the talker and listener. We propose a new method for predicting the intelligibility of speech in a vehicle, based on the ratio of the binaural loudness of the speech to the binaural loudness of the noise, each calculated using the method specified in ISO 532-2 (2017). The method was found to give accurate predictions of the speech reception threshold (SRT) measured under a variety of conditions and for different positions of the talker and listener in a car. The typical error in the predicted SRT was 1.3 dB, which is markedly smaller than estimated using the SII and STI (2.0 dB and 2.1 dB, respectively).


Author(s):  
Serafeim Moustakidis ◽  
Athanasios Anagnostis ◽  
Apostolos Chondronasios ◽  
Patrik Karlsson ◽  
Kostas Hrissagis

Many industries make extensive use of composite materials in their respective sectors. This rise in composites’ use has necessitated the development of new non-destructive inspection techniques that focus on manufacturing quality assurance, as well as in-service damage testing. Active infrared thermography is now a popular non-destructive testing method for detecting defects in composite structures. Non-uniform emissivity, uneven heating of the test surface, and variation in the thermal properties of the test material are some of the crucial factors in experimental thermography. These unwanted thermal effects are typically handled by applying a number of well-established thermographic techniques, including pulse phase thermography and thermographic signal reconstruction. This article addresses the problem of induced uneven heating at the pre-processing phase, prior to the application of the thermographic processing techniques. To accomplish this, a number of excitation-invariant pre-processing techniques were developed and tested to address the unwanted effect of non-uniform excitation in the collected thermographic data. Various fitting approaches were validated for modeling the non-uniform heating effect, and new normalization approaches were proposed following a time-dependent framework. The proposed pre-processing techniques were validated on a composite test sample with pre-determined defects. The results demonstrated the effectiveness of the proposed processing algorithms in terms of both the removal of the unwanted heat distribution effect and the signal-to-noise ratio of the produced infrared images.


2020 ◽  
Vol 14 (3) ◽  
pp. 329-332
Author(s):  
Simone dos Santos Barreto ◽  
Karin Zazo Ortiz

ABSTRACT. Foreign accent syndrome (FAS) is an extremely rare disorder, with 112 cases described until 2019. We compare two cases of foreign accent syndrome in native speakers of Brazilian Portuguese, one in its classic form (FAS) and one in its psychiatric variant (FALS). Two cases were analyzed: (1) a right-handed, 69-year-old man with a prior history of stroke, and (2) a right-handed, 43-year-old woman diagnosed with schizophrenia. They were evaluated for language and speech, including speech intelligibility. Both patients had speech impairment complaints, similar to a new accent, without previous exposure to a foreign language. However, the onset of the speech disorder was sudden in case 1 and insidious and with transient events in case 2, with speech intelligibility scores of 95.5 and 55.3%, respectively. Besides neurologic impairment, the clinical presentation of FALS was extremely severe and differed from that expected in FAS cases, in which speech intelligibility is preserved.


2019 ◽  
Vol 62 (5) ◽  
pp. 1517-1531
Author(s):  
Sungmin Lee ◽  
Lisa Lucks Mendel ◽  
Gavin M. Bidelman

Purpose: Although the speech intelligibility index (SII) has been widely applied in the field of audiology and other related areas, application of this metric to cochlear implants (CIs) has yet to be investigated. In this study, SIIs for CI users were calculated to investigate whether the SII could be an effective tool for predicting speech perception performance in a population with CI. Method: Fifteen pre- and postlingually deafened adults with CI participated. Speech recognition scores were measured using the AzBio sentence lists. CI users also completed questionnaires and performed psychoacoustic (spectral and temporal resolution) and cognitive function (digit span) tests. Obtained SIIs were compared with predicted SIIs using a transfer function curve. Correlation and regression analyses were conducted on perceptual and demographic predictor variables to investigate the association between these factors and speech perception performance. Result: Because of the considerably poor hearing and large individual variability in performance, the SII did not predict speech performance for this CI group using the traditional calculation. However, new SII models were developed incorporating predictive factors, which improved the accuracy of SII predictions in listeners with CI. Conclusion: Conventional SII models are not appropriate for predicting speech perception scores for CI users. Demographic variables (aided audibility and duration of deafness) and perceptual–cognitive skills (gap detection and auditory digit span outcomes) are needed to improve the use of the SII for listeners with CI. Future studies are needed to improve our CI-corrected SII model by considering additional predictive factors. Supplemental Material: https://doi.org/10.23641/asha.8057003

