Predicting individual speech intelligibility from the cortical tracking of acoustic- and phonetic-level speech representations

2018 ◽  
Author(s):  
D Lesenfants ◽  
J Vanthornhout ◽  
E Verschueren ◽  
L Decruy ◽  
T Francart

ABSTRACT Objective: To objectively measure the speech intelligibility of individual subjects from the EEG, based on cortical tracking of different representations of speech: low-level acoustical, higher-level discrete, or a combination of both; and to compare each model's prediction of the speech reception threshold (SRT) with the behaviorally measured SRT for each individual. Methods: Nineteen participants listened to Flemish Matrix sentences presented at different signal-to-noise ratios (SNRs), corresponding to different levels of speech understanding. For each EEG frequency band (delta, theta, alpha, beta, or low-gamma), a model was built to predict the EEG signal from various speech representations: envelope, spectrogram, phonemes, phonetic features, or a combination of phonetic features and spectrogram (FS). The same model was used for all subjects. The model predictions were then compared to the actual EEG of each subject at the different SNRs, and the prediction accuracy as a function of SNR was used to predict the SRT. Results: The model based on the FS speech representation and the theta EEG band yielded the best SRT predictions, with a difference between the behavioral and objective SRT below 1 decibel for 53% and below 2 decibels for 89% of the subjects. Conclusion: A model including both low- and higher-level speech features allows prediction of the speech reception threshold from the EEG of people listening to natural speech.
It has potential applications in diagnostics of the auditory system. Search Terms: cortical speech tracking, objective measure, speech intelligibility, auditory processing, speech representations. Highlights: Objective EEG-based measure of speech intelligibility. Improved prediction of speech intelligibility by combining speech representations. Cortical tracking of speech in the delta EEG band monotonically increased with SNR. Cortical responses in the theta EEG band best predicted the speech reception threshold. Disclosure: The authors report no disclosures relevant to the manuscript.
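The pipeline described in this abstract — fit a linear forward model from a speech representation to EEG, then track how well the model predicts the EEG as the SNR changes — can be sketched as follows. This is a minimal illustration on synthetic data using only the envelope representation; the lag range, regularization, and noise levels are invented for the example and are not the study's parameters.

```python
# Hedged sketch of a linear forward ("TRF-style") model: ridge regression
# from time-lagged speech envelope samples to EEG, with prediction accuracy
# (Pearson correlation) evaluated at several noise levels standing in for SNRs.
import numpy as np

rng = np.random.default_rng(0)

def lagged_design(stimulus, n_lags):
    """Design matrix of time-lagged copies of the stimulus."""
    n = len(stimulus)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = stimulus[:n - lag]
    return X

def fit_trf(stimulus, eeg, n_lags=16, lam=1.0):
    """Ridge regression from lagged stimulus to EEG (linear forward model)."""
    X = lagged_design(stimulus, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ eeg)

def prediction_accuracy(stimulus, eeg, w, n_lags=16):
    """Correlation between the model-predicted and the recorded EEG."""
    pred = lagged_design(stimulus, n_lags) @ w
    return np.corrcoef(pred, eeg)[0, 1]

# Synthetic data: "EEG" = envelope passed through an assumed response kernel,
# plus noise whose level plays the role of the (inverse) SNR.
n = 2000
true_w = np.exp(-np.arange(16) / 4.0)
envelope = rng.random(n)
clean_eeg = lagged_design(envelope, 16) @ true_w

w = fit_trf(envelope, clean_eeg + 0.5 * rng.standard_normal(n))
accs = []
for noise_level in [4.0, 2.0, 1.0, 0.5]:   # decreasing noise ~ increasing SNR
    eeg = clean_eeg + noise_level * rng.standard_normal(n)
    accs.append(prediction_accuracy(envelope, eeg, w))
print(accs)  # accuracy rises as the synthetic "SNR" improves
```

In the study, the mapping from this accuracy-versus-SNR curve to a predicted SRT is an additional step; the sketch only shows the forward-model core.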

2013 ◽  
Vol 24 (04) ◽  
pp. 307-328 ◽  
Author(s):  
Joshua G.W. Bernstein ◽  
Van Summers ◽  
Elena Grassi ◽  
Ken W. Grant

Background: Hearing-impaired (HI) individuals with similar ages and audiograms often demonstrate substantial differences in speech-reception performance in noise. Traditional models of speech intelligibility focus primarily on average performance for a given audiogram, failing to account for differences between listeners with similar audiograms. Improved prediction accuracy might be achieved by simulating differences in the distortion that speech may undergo when processed through an impaired ear. Although some attempts to model particular suprathreshold distortions can explain general speech-reception deficits not accounted for by audibility limitations, little has been done to model suprathreshold distortion and predict speech-reception performance for individual HI listeners. Auditory-processing models incorporating individualized measures of auditory distortion, along with audiometric thresholds, could provide a more complete understanding of speech-reception deficits by HI individuals. A computational model capable of predicting individual differences in speech-recognition performance would be a valuable tool in the development and evaluation of hearing-aid signal-processing algorithms for enhancing speech intelligibility. Purpose: This study investigated whether biologically inspired models simulating peripheral auditory processing for individual HI listeners produce more accurate predictions of speech-recognition performance than audiogram-based models. Research Design: Psychophysical data on spectral and temporal acuity were incorporated into individualized auditory-processing models consisting of three stages: a peripheral stage, customized to reflect individual audiograms and spectral and temporal acuity; a cortical stage, which extracts spectral and temporal modulations relevant to speech; and an evaluation stage, which predicts speech-recognition performance by comparing the modulation content of clean and noisy speech. 
To investigate the impact of different aspects of peripheral processing on speech predictions, individualized details (absolute thresholds, frequency selectivity, spectrotemporal modulation [STM] sensitivity, compression) were incorporated progressively, culminating in a model simulating level-dependent spectral resolution and dynamic-range compression. Study Sample: Psychophysical and speech-reception data from 11 HI and six normal-hearing listeners were used to develop the models. Data Collection and Analysis: Eleven individualized HI models were constructed and validated against psychophysical measures of threshold, frequency resolution, compression, and STM sensitivity. Speech-intelligibility predictions were compared with measured performance in stationary speech-shaped noise at signal-to-noise ratios (SNRs) of −6, −3, 0, and 3 dB. Prediction accuracy for the individualized HI models was compared to the traditional audibility-based Speech Intelligibility Index (SII). Results: Models incorporating individualized measures of STM sensitivity yielded significantly more accurate within-SNR predictions than the SII. Additional individualized characteristics (frequency selectivity, compression) improved the predictions only marginally. A nonlinear model including individualized level-dependent cochlear-filter bandwidths, dynamic-range compression, and STM sensitivity predicted performance more accurately than the SII but was no more accurate than a simpler linear model. Predictions of speech-recognition performance simultaneously across SNRs and individuals were also significantly better for some of the auditory-processing models than for the SII. Conclusions: A computational model simulating individualized suprathreshold auditory-processing abilities produced more accurate speech-intelligibility predictions than the audibility-based SII. Most of this advantage was realized by a linear model incorporating audiometric and STM-sensitivity information. 
Although more consistent with known physiological aspects of auditory processing, modeling level-dependent changes in frequency selectivity and gain did not result in more accurate predictions of speech-reception performance.
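The "evaluation stage" idea above — predicting intelligibility by comparing the modulation content of clean and noisy speech — can be illustrated with a deliberately simplified sketch. The modulation analysis and similarity measure below are generic stand-ins for the paper's cortical-stage model, and all parameters (envelope rate, modulation cutoff, noise levels) are invented for illustration.

```python
# Simplified sketch: similarity between the slow-modulation spectra of clean
# and degraded speech envelopes, used as a crude intelligibility index.
import numpy as np

rng = np.random.default_rng(1)

def modulation_spectrum(envelope, fs, max_mod_hz=16.0):
    """Magnitude spectrum of the envelope, restricted to slow modulations."""
    spec = np.abs(np.fft.rfft(envelope - envelope.mean()))
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / fs)
    return spec[freqs <= max_mod_hz]

def similarity_index(clean, noisy, fs):
    """Correlation between clean and degraded modulation spectra."""
    a = modulation_spectrum(clean, fs)
    b = modulation_spectrum(noisy, fs)
    return float(np.corrcoef(a, b)[0, 1])

fs = 100.0                                # envelope sample rate (assumed)
t = np.arange(0, 10, 1 / fs)
clean = 1.0 + np.sin(2 * np.pi * 4 * t)   # speech-like 4 Hz modulation

scores = []
for noise_scale in [2.0, 1.0, 0.25]:      # decreasing noise
    noisy = clean + noise_scale * rng.standard_normal(len(t))
    scores.append(similarity_index(clean, noisy, fs))
print(scores)  # similarity grows as the noise decreases
```

The study's model additionally filters the modulations through individualized peripheral and cortical stages before the comparison; this sketch keeps only the final compare-and-score structure.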


2021 ◽  
Vol 15 ◽  
Author(s):  
Florian Worschech ◽  
Damien Marie ◽  
Kristin Jünemann ◽  
Christopher Sinke ◽  
Tillmann H. C. Krüger ◽  
...  

Understanding speech in background noise poses a challenge in daily communication, particularly among the elderly. Although musical expertise has often been suggested to contribute to speech intelligibility, the evidence is mostly correlational. In the present multisite study conducted in Germany and Switzerland, 156 healthy, normal-hearing elderly adults were randomly assigned to either a piano-playing group or a music listening/musical culture group. The speech reception threshold was assessed using the International Matrix Test before and after a 6-month intervention. Bayesian multilevel modeling revealed an improvement in both groups over time under binaural conditions. Additionally, the speech reception threshold of the piano group decreased for stimuli presented to the left ear. A right-ear improvement occurred only in the German piano group. Furthermore, improvements were predominantly found in women. These findings are discussed in light of current neuroscientific theories on hemispheric lateralization and biological sex differences. The study indicates a positive transfer from musical training to speech processing, probably supported by enhanced auditory processing and improved general cognitive functions.


2021 ◽  
Vol 69 (2) ◽  
pp. 173-179
Author(s):  
Nikolina Samardzic ◽  
Brian C.J. Moore

Traditional methods for predicting the intelligibility of speech in the presence of noise inside a vehicle, such as the Articulation Index (AI), the Speech Intelligibility Index (SII), and the Speech Transmission Index (STI), are not accurate, probably because they do not take binaural listening into account; the signals reaching the two ears can differ markedly depending on the positions of the talker and listener. We propose a new method for predicting the intelligibility of speech in a vehicle, based on the ratio of the binaural loudness of the speech to the binaural loudness of the noise, each calculated using the method specified in ISO 532-2 (2017). The method gave accurate predictions of the speech reception threshold (SRT) measured under a variety of conditions and for different positions of the talker and listener in a car. The typical error in the predicted SRT was 1.3 dB, markedly smaller than the errors obtained using the SII and STI (2.0 dB and 2.1 dB, respectively).
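The structure of the proposed predictor — a speech-to-noise ratio of binaural loudness — can be sketched as follows. A full ISO 532-2 loudness calculation is far more elaborate (specific-loudness patterns, binaural inhibition); here a simple Stevens-style power law stands in for it, and all signal levels are made-up illustrative values.

```python
# Hedged sketch of a binaural speech-to-noise loudness ratio.
# simple_loudness is a crude proxy, NOT the ISO 532-2 model.
import numpy as np

def simple_loudness(rms_pressure):
    """Rough loudness proxy: Stevens power law on intensity (exponent 0.3)."""
    return (rms_pressure ** 2) ** 0.3

def binaural_loudness(left_rms, right_rms):
    """Combine per-ear loudness; ISO 532-2 uses a more elaborate summation."""
    return simple_loudness(left_rms) + simple_loudness(right_rms)

def loudness_ratio_db(speech_lr, noise_lr):
    """Speech-to-noise binaural loudness ratio, expressed in dB."""
    ls = binaural_loudness(*speech_lr)
    ln = binaural_loudness(*noise_lr)
    return 10 * np.log10(ls / ln)

# Talker close to the listener's left ear: speech reaches the ears unequally,
# while the vehicle noise here is assumed diffuse (equal at both ears).
ratio = loudness_ratio_db(speech_lr=(0.20, 0.08), noise_lr=(0.10, 0.10))
print(f"binaural loudness ratio: {ratio:.2f} dB")
```

The point of the structure is that both ears contribute to the speech term, which a single-channel SNR-based predictor such as the SII cannot capture.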


2011 ◽  
Vol 26 (S2) ◽  
pp. 170-170 ◽  
Author(s):  
S.E. Pape ◽  
M.P. Collins

Introduction: Research shows that anxiety clusters within families: a greater proportion of children with anxious parents develop symptoms of anxiety than children with non-anxious parents. Anxious children often describe their parents as over-controlling and intrusive, lacking in affection and warmth, with reports of decreased parental support. Objectives: (1) To identify whether parenting behaviors differ between anxious and non-anxious parents; (2) to discuss whether these differences in behavior can contribute to transgenerational transmission of anxiety. Aims: To identify whether behaviour modification could reduce familial transmission rates of anxiety. Method: A search of OvidSP Medline, Google Scholar, and PubMed was performed, covering 1999 to 2010. Search terms used were: parenting, parents, maternal, paternal, or parental; and anxiety, PTSD, OCD, panic disorder, or phobia. Fourteen papers were identified. Results: While most papers identified differences in parenting between anxious and control parents, the conclusions were variable. Two observed increased controlling behaviour, five a decrease in sensitivity, one an exaggeration of behaviours, and five a decrease in granting of autonomy or increased protectiveness. Conclusion: The best-supported differences in anxious parenting are less granting of autonomy and lower levels of sensitivity. Whilst in isolation they cannot explain how anxiety is transmitted, and they appear to be reciprocally related to child anxiety and temperament, they give grounds for further research. In particular, this review identifies the need to study the above behavioral components in longitudinal studies, to observe causal effects between parent behavior and child anxiety.


2021 ◽  
Author(s):  
Marlies Gillis ◽  
Jonas Vanthornhout ◽  
Jonathan Z Simon ◽  
Tom Francart ◽  
Christian Brodbeck

When listening to speech, brain responses time-lock to acoustic events in the stimulus. Recent studies have also reported that cortical responses track linguistic representations of speech. However, tracking of these representations is often described without controlling for acoustic properties, so the response attributed to linguistic representations might reflect unaccounted-for acoustic processing rather than language processing. Here we tested several recently proposed linguistic representations, using audiobook speech, while controlling for acoustic and other linguistic representations. Indeed, some of these linguistic representations were not significantly tracked after controlling for acoustic properties. However, phoneme surprisal, cohort entropy, word surprisal, and word frequency were significantly tracked over and beyond acoustic properties. Additionally, these linguistic representations were tracked similarly across different stories, spoken by different readers. Together, this suggests that these representations characterize processing of the linguistic content of speech and might allow a behaviour-free evaluation of speech intelligibility.
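Two of the representations that survived the acoustic controls, word frequency and word surprisal, are easy to illustrate on a toy corpus: frequency is the negative log unigram probability of a word, and surprisal is the negative log probability of a word given its context (here just the single preceding word; studies like this derive such values from much larger corpora or language models).

```python
# Toy sketch of word-frequency and word-surprisal features from a bigram
# model. No smoothing is applied, so only pairs seen in the corpus are valid.
import math
from collections import Counter

corpus = "the cat sat on the mat and the cat slept".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
total = len(corpus)

def word_frequency_feature(word):
    """-log2 unigram probability: rarer words get larger values."""
    return -math.log2(unigrams[word] / total)

def word_surprisal(prev, word):
    """-log2 P(word | prev) under the bigram model."""
    return -math.log2(bigrams[(prev, word)] / unigrams[prev])

print(word_frequency_feature("the"))   # frequent word -> low value
print(word_surprisal("the", "cat"))    # 'cat' follows 'the' in 2 of 3 cases
```

In EEG analyses these values are placed as impulses at word onsets, giving a regressor that carries linguistic rather than acoustic information.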


2018 ◽  
Vol 22 (04) ◽  
pp. 408-414 ◽  
Author(s):  
Signe Grasel ◽  
Mario Greters ◽  
Maria Goffi-Gomez ◽  
Roseli Bittar ◽  
Raimar Weber ◽  
...  

Introduction: The P3 cognitive evoked potential is recorded when a subject correctly identifies, evaluates, and processes two different auditory stimuli. Objective: To evaluate the latency and amplitude of the P3 evoked potential in 26 cochlear implant users with post-lingual deafness, with good or poor speech recognition scores, as compared with normal-hearing subjects matched for age and educational level. Methods: In this prospective cohort study, auditory cortical responses were recorded from 26 post-lingual deaf adult cochlear implant users (19 with good and 7 with poor speech recognition scores) and 26 control subjects. Results: There was a significant difference in P3 latency between cochlear implant users with poor speech recognition scores (G-) and their control group (CG) (p = 0.04), and between G- and cochlear implant users with good speech discrimination (G+) (p = 0.01). We found no significant difference in P3 latency between the CG and G+. In this study, all G- patients had deafness due to meningitis, which suggests that higher auditory function was impaired as well. Conclusion: Post-lingual deaf adult cochlear implant users in the G- group had prolonged P3 latencies as compared with the CG and the cochlear implant users in the G+ group. The amplitudes were similar between patients and controls. All G- subjects were deaf due to meningitis. These findings suggest that meningitis may have deleterious effects not only on the peripheral auditory system but also on central auditory processing.


Radiocarbon ◽  
2003 ◽  
Vol 45 (2) ◽  
pp. 293-328 ◽  

TIRI was officially launched at the 14th International Radiocarbon Conference in Arizona in 1991. Prior to the conference, 150 laboratories received a letter describing the general intention to organize an intercomparison, and over 90 laboratories from around the world responded positively to the invitation to participate. Simply stated, the aims of this intercomparison were: (1) to function as the third arm of the quality assurance (QA) procedure; (2) to provide an objective measure of the maintenance and improvement in analytical quality; and (3) to assist in the development of a “self-help” scheme for participating laboratories.


1995 ◽  
Vol 38 (1) ◽  
pp. 211-221 ◽  
Author(s):  
Ronald A. van Buuren ◽  
Joost M. Festen ◽  
Reinier Plomp

The long-term average frequency spectrum of speech was modified to match 25 target frequency spectra in order to determine the effect of each of these spectra on speech intelligibility in noise and on sound quality. Speech intelligibility was evaluated using the test developed by Plomp and Mimpen (1979), whereas sound quality was examined through judgments of loudness, sharpness, clearness, and pleasantness of speech fragments. Subjects had different degrees of sensorineural hearing loss and sloping audiograms, but not all of them were hearing aid users. The 25 frequency spectra were defined such that the entire dynamic range of each listener, from dB above threshold to 5 dB below UCL, was covered. Frequency shaping of the speech was carried out on-line by means of Finite Impulse Response (FIR) filters. The tests on speech reception in noise indicated that the Speech Reception Thresholds (SRTs) did not differ significantly for the majority of spectra. Spectra with high levels, especially at low frequencies (probably causing significant upward spread of masking), and also those with steep negative slopes, resulted in significantly higher SRTs. Sound quality judgments led to conclusions virtually identical to those from the SRT data: frequency spectra with an unacceptably low sound quality were in most cases significantly worse on the SRT test as well. Because the SRT did not vary significantly among the majority of frequency spectra, it was concluded that a wide range of spectra between the threshold and UCL levels of listeners with hearing losses is suitable for the presentation of speech energy. This is very useful in everyday listening, where the frequency spectrum of speech may vary considerably.
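The frequency-shaping step can be sketched with a frequency-sampling FIR design: specify a target long-term spectrum as gains at a few frequencies, build the matching linear-phase impulse response, and window it. The target gains, sample rate, and filter length below are invented for illustration and are not the study's spectra.

```python
# Hedged sketch of frequency-sampling FIR design for spectrum shaping.
import numpy as np

def fir_from_target(freqs_hz, gains_db, fs, numtaps=129):
    """FIR filter whose magnitude response follows a piecewise target."""
    grid = np.fft.rfftfreq(numtaps, 1 / fs)
    mag = 10 ** (np.interp(grid, freqs_hz, gains_db) / 20)
    h = np.fft.irfft(mag, n=numtaps)      # zero-phase impulse response
    h = np.roll(h, numtaps // 2)          # make it causal (linear phase)
    return h * np.hamming(numtaps)        # window to reduce ripple

fs = 16000
# Hypothetical target: cut lows, boost highs (e.g. for a sloping loss).
freqs = [0, 500, 2000, 8000]
gains = [-10, -5, 0, 6]
h = fir_from_target(freqs, gains, fs)

# Inspect the realised response on a fine grid; at 4 kHz the interpolated
# target gain is 2 dB, and the realised gain should land near it.
w = np.fft.rfftfreq(4096, 1 / fs)
H = np.abs(np.fft.rfft(h, 4096))
idx = np.argmin(np.abs(w - 4000))
print(20 * np.log10(H[idx]))
```

Because the filter is linear-phase FIR, it can run sample-by-sample ("on-line") with a fixed, known delay of `(numtaps - 1) / 2` samples.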


2003 ◽  
Vol 129 (3) ◽  
pp. 248-254 ◽  
Author(s):  
Jack J. Wazen ◽  
Jaclyn B. Spitzer ◽  
Soha N. Ghossaini ◽  
José N. Fayad ◽  
John K. Niparko ◽  
...  

OBJECTIVES: The purpose of this study was to evaluate the effectiveness of the Bone Anchored Cochlear Stimulator (BAHA) in transcranial routing of signal by implanting the deaf ear. STUDY DESIGN AND SETTINGS: Eighteen patients with unilateral deafness were included in a multisite study. They had a 1-month pre-implantation trial with a contralateral routing of signal (CROS) hearing aid. Their performance with the BAHA was compared with the CROS device using speech reception thresholds, speech recognition performance in noise, and the Abbreviated Profile of Hearing Aid Benefit and Single Sided Deafness questionnaires. RESULTS: Patients reported a significant improvement in speech intelligibility in noise and greater benefit from the BAHA compared with CROS hearing aids. Patients were satisfied with the device and its impact on their quality of life. No major complications were reported. CONCLUSION AND SIGNIFICANCE: The BAHA is effective in unilateral deafness. Auditory stimuli from the deaf side can be transmitted to the good ear, avoiding the limitations inherent in CROS amplification.

