Speech Tracking
Recently Published Documents

Total documents: 47 (last five years: 26)
H-index: 9 (last five years: 2)

2021 ◽ Vol 12 (1)
Author(s): Sarah Tune, Mohsen Alavash, Lorenz Fiedler, Jonas Obleser

Abstract: Successful listening crucially depends on intact attentional filters that separate relevant from irrelevant information. Research into their neurobiological implementation has focused on two potential auditory filter strategies: the lateralization of alpha power and selective neural speech tracking. However, the functional interplay of the two neural filter strategies, and their potency to index listening success in an ageing population, remains unclear. Using electroencephalography and a dual-talker task in a representative sample of listeners (N = 155; age 39–80 years), we here demonstrate an often-missed link from single-trial behavioural outcomes back to trial-by-trial changes in neural attentional filtering. First, we observe preserved attentional-cue-driven modulation of both neural filters across chronological age and hearing levels. Second, neural filter states vary independently of one another, demonstrating complementary neurobiological solutions to spatial selective attention. Stronger neural speech tracking, but not alpha lateralization, boosts trial-to-trial behavioural performance. Our results highlight the translational potential of neural speech tracking as an individualized neural marker of adaptive listening behaviour.
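One of the two filter strategies above, alpha-power lateralization, is commonly summarized as the normalized difference in alpha power between right- and left-hemisphere sensors. Below is a minimal Python sketch of such an index; the 8–12 Hz band edges, the Welch settings, and the function name are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np
from scipy.signal import welch

def alpha_lateralization_index(eeg_left, eeg_right, fs, band=(8.0, 12.0)):
    """Single-trial alpha lateralization index: (right - left) / (right + left).

    eeg_left / eeg_right: arrays of shape (n_channels, n_samples) holding
    data from left- and right-hemisphere sensor groups on one trial.
    """
    def band_power(x):
        # Welch PSD per channel, then average power inside the alpha band.
        f, pxx = welch(x, fs=fs, nperseg=min(x.shape[-1], int(2 * fs)))
        mask = (f >= band[0]) & (f <= band[1])
        return pxx[..., mask].mean()

    p_left, p_right = band_power(eeg_left), band_power(eeg_right)
    return (p_right - p_left) / (p_right + p_left)
```

Values near +1 or -1 indicate strongly lateralized alpha power; tracking such an index trial by trial is one way to obtain the kind of single-trial neural filter state the study relates to behaviour.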


2021
Author(s): Fabian Schmidt, Ya-Ping Chen, Anne Keitel, Sebastian Rösch, Ronny Hannemann, ...

Abstract: The most prominent acoustic features in speech are intensity modulations, represented by the amplitude envelope of speech. Synchronization of neural activity with these modulations is vital for speech comprehension. Because the acoustic modulation of speech is related to the production of syllables, investigations of neural speech tracking rarely distinguish between lower-level acoustic (envelope modulation) and higher-level linguistic (syllable rate) information. Here we manipulated speech intelligibility using noise-vocoded speech and, across two magnetoencephalography studies spanning cortical and subcortical levels of the auditory hierarchy, investigated the spectral dynamics of neural speech processing. Overall, cortical regions mostly track the syllable rate, whereas subcortical regions track the acoustic envelope. Furthermore, with less intelligible speech, tracking of the modulation rate becomes more dominant. Our study highlights the importance of distinguishing between envelope modulation and syllable rate and provides novel possibilities to better understand differences between auditory processing and speech/language processing disorders.
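Noise-vocoding, the intelligibility manipulation used here, preserves each spectral band's amplitude envelope while replacing its fine structure with noise; fewer bands yield less intelligible speech. The sketch below follows that general recipe; the band count, band edges, and filter order are illustrative assumptions, not the study's stimulus parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_bands=8, f_lo=100.0, f_hi=8000.0):
    """Noise-vocode a speech waveform using logarithmically spaced bands."""
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_bands + 1)
    noise = np.random.default_rng(0).standard_normal(len(speech))
    out = np.zeros_like(speech)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, speech)))  # band envelope
        carrier = sosfiltfilt(sos, noise)                # band-limited noise
        out += env * carrier
    # Match the overall level of the original signal.
    return out * np.sqrt(np.mean(speech**2) / np.mean(out**2))
```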


10.2196/20405 ◽ 2021 ◽ Vol 8 (1) ◽ pp. e20405
Author(s): Christiane Völter, Carolin Stöckmann, Christiane Schirmer, Stefan Dazert

Background: Technologies allowing home-based rehabilitation may be a key means of saving financial resources while also facilitating people's access to treatment. After cochlear implantation, auditory training is necessary for the brain to adapt to the new auditory signals transmitted by the cochlear implant (CI). To date, auditory training has been conducted in a face-to-face setting at a specialized center. However, because of the COVID-19 pandemic's impact on health care, the need for new therapeutic settings has intensified.

Objective: The aims of this study are to assess the feasibility of a novel teletherapeutic auditory rehabilitation platform in adult CI recipients and to compare its clinical outcomes and economic benefits with those of conventional face-to-face rehabilitation in a clinic.

Methods: In total, 20 experienced adult CI users with a mean age of 59.4 (SD 16.3) years participated in the study. They completed 3 weeks of standard (face-to-face) therapy, followed by 3 weeks of computer-based auditory training (CBAT) at home. Participants were assessed at three intervals: before face-to-face therapy, after face-to-face therapy, and after CBAT. The primary outcomes were speech understanding in quiet and noisy conditions. The secondary outcomes were the usability of the CBAT system, the participants' subjective ratings of their own listening abilities, and the time that face-to-face and CBAT sessions required of CI users and therapists.

Results: Greater benefits were observed after CBAT than after standard therapy in nearly all speech outcome measures. Significant improvements were found in sentence comprehension in noise (P=.004), speech tracking (P=.004), and phoneme differentiation (vowels: P=.001; consonants: P=.02) after CBAT. Only speech tracking improved significantly after conventional therapy (P=.007). The program's usability was judged to be high: only 2 of 20 participants could not imagine using the program without support. The different features of the training platform were also rated highly. Cost analysis showed a difference in favor of CBAT: therapists spent 120 minutes per week on face-to-face sessions versus 30 minutes per week on computer-based sessions. For CI users, attending standard therapy required an average of approximately 78 (SD 58.6) minutes of travel time per appointment.

Conclusions: The proposed teletherapeutic approach to hearing rehabilitation enables good clinical outcomes while saving time for CI users and clinicians. The promising speech understanding results might be due to users' high satisfaction with the CBAT program. Teletherapy might offer a cost-effective solution to the lack of human resources in health care as well as to the global challenge of current or future pandemics.


2021 ◽ pp. 1-62
Author(s): Orsolya B Kolozsvári, Weiyong Xu, Georgia Gerike, Tiina Parviainen, Lea Nieminen, ...

Speech perception is dynamic and changes across development. In parallel, functional differences in brain development over time have been well documented, and these differences may interact with changes in speech perception during infancy and childhood. Further, there is evidence that the two hemispheres contribute unequally to speech segmentation at the sentence and phonemic levels. To disentangle those contributions, we studied the cortical tracking of variously sized units of speech that are crucial for spoken language processing in children (4.7–9.3 years old, N=34) and adults (N=19). We measured participants' magnetoencephalography (MEG) responses to syllables, words, and sentences, calculated the coherence between the speech signal and MEG responses at the level of words and sentences, and further examined auditory evoked responses to syllables. Age-related differences were found in coherence values at the delta and theta frequency bands. Both frequency bands showed an effect of stimulus type, although this was attributable to the length of the stimulus rather than to linguistic unit size. There was no difference between hemispheres at the source level, either in coherence values for word or sentence processing or in evoked responses to syllables. The results highlight the importance of the lower frequencies for speech tracking in the brain across different lexical units. Further, stimulus length affects speech-brain associations, suggesting that methodological approaches should be selected carefully when studying speech envelope processing at the neural level. Speech tracking in the brain seems decoupled from the more general maturation of the auditory cortex.
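The speech-brain coherence computed here is the magnitude-squared coherence between the speech envelope and a neural time course, averaged within frequency bands. A minimal sketch, assuming illustrative delta/theta band edges and window length rather than the authors' exact settings:

```python
import numpy as np
from scipy.signal import coherence

def speech_brain_coherence(envelope, meg, fs, bands=None):
    """Coherence between the speech envelope and one MEG source/sensor
    time course (both 1-D, same length and sampling rate), averaged
    within each frequency band."""
    bands = bands or {"delta": (1.0, 4.0), "theta": (4.0, 8.0)}
    f, cxy = coherence(envelope, meg, fs=fs, nperseg=int(4 * fs))
    return {name: cxy[(f >= lo) & (f < hi)].mean()
            for name, (lo, hi) in bands.items()}
```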


2021
Author(s): Maansi Desai, Jade Holder, Cassandra Villarreal, Nat Clark, Liberty S. Hamilton

Abstract: In natural conversations, listeners must attend to what others are saying while ignoring extraneous background sounds. Recent studies have used encoding models to predict electroencephalography (EEG) responses to speech in noise-free listening situations, sometimes referred to as "speech tracking" in EEG. Researchers have analyzed how speech tracking changes with different types of background noise. It is unclear, however, whether neural responses from noisy and naturalistic environments generalize to more controlled stimuli. If encoding models for noisy, naturalistic stimuli generalize to other tasks, this could aid data collection from populations who may not tolerate listening to more controlled, less engaging stimuli for long periods of time. We recorded non-invasive scalp EEG while participants listened to speech without noise and to audiovisual speech stimuli containing overlapping speakers and background sounds. We fit multivariate temporal receptive field (mTRF) encoding models to predict EEG responses to pitch, the acoustic envelope, phonological features, and visual cues in both noise-free and noisy stimulus conditions. Our results suggest that neural responses to naturalistic stimuli generalize to more controlled data sets. EEG responses to speech in isolation were predicted accurately using phonological features alone, while predictions of responses to noisy speech were more accurate when both phonological and acoustic features were included. These findings may inform basic science research on speech-in-noise processing. Ultimately, they may also provide insight into auditory processing in people who are hard of hearing and who use a combination of audio and visual cues to understand speech in the presence of noise.

Significance Statement: Understanding spoken language in natural environments requires listeners to parse acoustic and linguistic information in the presence of other distracting stimuli. However, most studies of auditory processing rely on highly controlled stimuli with no background noise, or with background noise inserted at specific times. Here, we compare models in which EEG data are predicted from a combination of acoustic, phonetic, and visual features in highly disparate stimuli: sentences from a speech corpus, and speech embedded within movie trailers. We show that modeling neural responses to highly noisy, audiovisual movies can uncover tuning for acoustic and phonetic information that generalizes to the simpler stimuli typically used in sensory neuroscience experiments.
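At its core, an mTRF encoding model is a ridge regression from time-lagged stimulus features to each EEG channel. The numpy sketch below shows that structure; the lag range, regularization strength, and function name are illustrative assumptions, not the authors' implementation (dedicated packages such as the mTRF-Toolbox exist for this).

```python
import numpy as np

def fit_mtrf(stim, eeg, fs, tmin=-0.1, tmax=0.4, alpha=1.0):
    """Forward mTRF via ridge regression.

    stim: (n_samples, n_features) feature matrix (e.g. envelope, pitch,
    phonological features); eeg: (n_samples, n_channels).
    Returns weights of shape (n_lags, n_features, n_channels).
    """
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    n, k = stim.shape
    X = np.zeros((n, len(lags) * k))
    for i, lag in enumerate(lags):
        # Column block i holds the stimulus shifted by `lag` samples.
        shifted = np.roll(stim, lag, axis=0)
        if lag > 0:
            shifted[:lag] = 0      # zero the wrapped-around edges
        elif lag < 0:
            shifted[lag:] = 0
        X[:, i * k:(i + 1) * k] = shifted
    # Ridge solution: w = (X'X + alpha*I)^-1 X'Y
    w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ eeg)
    return w.reshape(len(lags), k, -1)
```

Model quality is then typically assessed by predicting held-out EEG from the features and correlating prediction with recording, which yields the kind of "speech tracking" score compared across stimulus conditions.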


Author(s): Kevin D. Prinsloo, Edmund C. Lalor

Abstract: In recent years, research on natural speech processing has benefited from recognizing that low-frequency cortical activity tracks the amplitude envelope of natural speech. However, it remains unclear to what extent this tracking reflects speech-specific processing beyond analysis of the stimulus acoustics. In the present study, we aimed to disentangle contributions to cortical envelope tracking that reflect general acoustic processing from those that are functionally related to processing speech. To do so, we recorded EEG from subjects as they listened to "auditory chimeras": stimuli composed of the temporal fine structure (TFS) of one speech stimulus modulated by the amplitude envelope (ENV) of another speech stimulus. By varying the number of frequency bands used in making the chimeras, we obtained some control over which speech stimulus was recognized by the listener. No matter which stimulus was recognized, envelope tracking was always strongest for the ENV stimulus, indicating a dominant contribution from acoustic processing. However, there was also a positive relationship between intelligibility and tracking of the perceived speech, indicating a contribution from speech-specific processing. These findings were supported by a follow-up analysis that assessed envelope tracking as a function of the (estimated) output of the cochlea rather than of the original stimuli used in creating the chimeras. Finally, we sought to isolate the speech-specific contribution to envelope tracking using forward encoding models and found that indices of phonetic feature processing tracked reliably with intelligibility. Together these results show that cortical speech tracking is dominated by acoustic processing but also reflects speech-specific processing.

This work was supported by a Career Development Award from Science Foundation Ireland (CDA/15/3316) and a grant from the National Institute on Deafness and Other Communication Disorders (DC016297). The authors thank Dr. Aaron Nidiffer, Dr. Aisling O'Sullivan, Thomas Stoll, and Lauren Szymula for assistance with data collection, and Dr. Nathaniel Zuk, Dr. Aaron Nidiffer, and Dr. Aisling O'Sullivan for helpful comments on this manuscript.

Significance Statement: Activity in auditory cortex is known to dynamically track the energy fluctuations, or amplitude envelope, of speech. Measures of this tracking are now widely used in research on hearing and language and have had a substantial influence on theories of how auditory cortex parses and processes speech. But how much of this speech tracking is actually driven by speech-specific processing, rather than general acoustic processing, is unclear, limiting its interpretability and usefulness. Here, by merging two speech stimuli to form so-called auditory chimeras, we show that EEG tracking of the speech envelope is dominated by acoustic processing but also reflects linguistic analysis. This has important implications for theories of cortical speech tracking and for using measures of that tracking in applied research.
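Auditory chimeras of this kind are typically built band by band: filter both stimuli into matched frequency bands, take the Hilbert envelope of one and the cosine of the Hilbert phase (the TFS) of the other, multiply, re-filter, and sum across bands. A minimal sketch under assumed parameters (band edges, band count, and filter order are illustrative):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def auditory_chimera(speech_tfs, speech_env, fs, n_bands=8,
                     f_lo=80.0, f_hi=8000.0):
    """ENV of `speech_env` carried on the TFS of `speech_tfs`, band by band."""
    n = min(len(speech_tfs), len(speech_env))
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_bands + 1)
    out = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        a = hilbert(sosfiltfilt(sos, speech_tfs[:n]))
        b = hilbert(sosfiltfilt(sos, speech_env[:n]))
        tfs = np.cos(np.angle(a))            # fine structure of stimulus A
        env = np.abs(b)                      # envelope of stimulus B
        out += sosfiltfilt(sos, env * tfs)   # re-filter product to the band
    return out
```

Varying n_bands shifts which stimulus listeners tend to recognize, which is the manipulation the study exploits.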


2020
Author(s): Eline Verschueren, Jonas Vanthornhout, Tom Francart

Abstract

Objectives: In recent years there has been significant interest in recovering the temporal envelope of a speech signal from the neural response in order to investigate neural speech processing. The research focus is now broadening from neural speech processing in normal-hearing listeners towards hearing-impaired listeners. When testing hearing-impaired listeners, speech has to be amplified to resemble the effect of a hearing aid and to compensate for peripheral hearing loss. To date, it is not known with certainty whether or how neural speech tracking is influenced by sound amplification. As these higher intensities could influence the outcome, we investigated the influence of stimulus intensity on neural speech tracking.

Design: We recorded the electroencephalogram (EEG) of 20 normal-hearing participants while they listened to a narrated story. The story was presented at intensities from 10 to 80 dB A. To investigate the brain responses, we analyzed neural tracking of the speech envelope by reconstructing the envelope from the EEG using a linear decoder and correlating the reconstructed envelope with the actual one. We investigated the delta (0.5-4 Hz) and theta (4-8 Hz) bands for each intensity. We also investigated the latencies and amplitudes of the responses in more detail using temporal response functions (TRFs), the estimated linear response functions between the stimulus envelope and the EEG.

Results: Neural envelope tracking depends on stimulus intensity in both the TRF and envelope reconstruction analyses. However, provided that the decoder is applied to data of the same stimulus intensity as it was trained on, envelope reconstruction is robust to stimulus intensity. In addition, neural envelope tracking in the delta (but not theta) band seems to relate to speech intelligibility. Similar to the linear decoder analysis, TRF amplitudes and latencies depend on stimulus intensity: the amplitude of peak 1 (30-50 ms) increases and the latency of peak 2 (140-160 ms) decreases with increasing stimulus intensity.

Conclusion: Although brain responses are influenced by stimulus intensity, neural envelope tracking is robust to stimulus intensity when the decoder is trained and tested at the same intensity. We can therefore assume that intensity is not a confound when testing hearing-impaired participants with amplified speech using the linear decoder approach. In addition, neural envelope tracking in the delta band appears to be correlated with speech intelligibility, showing the potential of neural envelope tracking as an objective measure of speech intelligibility.
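The linear decoder described here is a backward model: a lagged ridge regression from multichannel EEG to the speech envelope, scored by correlating the reconstruction with the actual envelope. A minimal sketch, with the lag range and regularization as illustrative assumptions (and, for brevity, scored on the training data rather than with the cross-validation a real analysis would use):

```python
import numpy as np

def envelope_decoder(eeg, envelope, fs, tmax=0.25, alpha=1.0):
    """Reconstruct the speech envelope from EEG lags 0..tmax seconds.

    eeg: (n_samples, n_channels); envelope: (n_samples,).
    Returns the decoder weights and the reconstruction correlation.
    """
    lags = np.arange(0, int(tmax * fs) + 1)
    n, c = eeg.shape
    X = np.zeros((n, len(lags) * c))
    for i, lag in enumerate(lags):
        # The envelope at time t is decoded from EEG at t..t+tmax.
        shifted = np.roll(eeg, -lag, axis=0)
        if lag > 0:
            shifted[-lag:] = 0
        X[:, i * c:(i + 1) * c] = shifted
    w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]),
                        X.T @ envelope)
    r = np.corrcoef(X @ w, envelope)[0, 1]
    return w, r
```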


Author(s): Anne Hauswald, Anne Keitel, Ya-Ping Chen, Sebastian Rösch, Nathan Weisz

2020
Author(s): Felix Bröhl, Christoph Kayser

Abstract: The representation of speech in the brain is often examined by measuring the alignment of rhythmic brain activity to the speech envelope. To conveniently quantify this alignment (termed "speech tracking"), many studies consider the overall speech envelope, which combines acoustic fluctuations across the whole spectral range. Using EEG recordings, we show that relying on this overall envelope can give a distorted picture of speech encoding. We systematically investigated the encoding of spectrally limited speech-derived envelopes presented via individual and multiple noise carriers in the human brain. Tracking in the 1 to 6 Hz EEG bands differentially reflected low (0.2-0.83 kHz) and high (2.66-8 kHz) frequency speech-derived envelopes. This was independent of the specific carrier frequency but sensitive to attentional manipulations, and reflects a context-dependent emphasis of information from distinct spectral ranges of the speech envelope in low-frequency brain activity. As low- and high-frequency speech envelopes relate to distinct phonemic features, our results suggest that functionally distinct processes contribute to speech tracking in the same EEG bands, and that these are easily confounded when only the overall speech envelope is considered.

Highlights:
- Delta/theta band EEG tracks band-limited speech-derived envelopes similarly to real speech
- Low- and high-frequency speech-derived envelopes are represented differentially
- High-frequency-derived envelopes are more susceptible to attentional and contextual manipulations
- Delta band tracking shifts towards low-frequency-derived envelopes with more acoustic detail
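A spectrally limited speech-derived envelope is simply the Hilbert envelope of one frequency slice of the signal rather than of the broadband waveform. A minimal sketch; the band edges below follow the low band named above (0.2-0.83 kHz), while the smoothing cutoff and output rate are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def band_limited_envelope(speech, fs, band=(200.0, 830.0), fs_out=120):
    """Envelope of one spectral slice of speech, downsampled for EEG
    tracking analyses (naive decimation; fine for illustration)."""
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    env = np.abs(hilbert(sosfiltfilt(sos, speech)))
    # Smooth before decimating so the envelope is band-limited itself.
    sos_lp = butter(4, 20.0, btype="lowpass", fs=fs, output="sos")
    env = sosfiltfilt(sos_lp, env)
    return env[::max(1, int(fs // fs_out))]
```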

