Neural speech tracking shifts from the syllabic to the modulation rate of speech as intelligibility decreases

2021 ◽  
Author(s):  
Fabian Schmidt ◽  
Ya-Ping Chen ◽  
Anne Keitel ◽  
Sebastian Rösch ◽  
Ronny Hannemann ◽  
...  

The most prominent acoustic features in speech are intensity modulations, represented by the amplitude envelope of speech. Synchronization of neural activity with these modulations is vital for speech comprehension. As the acoustic modulation of speech is related to the production of syllables, investigations of neural speech tracking rarely distinguish between lower-level acoustic (envelope modulation) and higher-level linguistic (syllable rate) information. Here we manipulated speech intelligibility using noise-vocoded speech and, in two magnetoencephalography studies, investigated the spectral dynamics of neural speech processing at cortical and subcortical levels of the auditory hierarchy. Overall, cortical regions mostly track the syllable rate, whereas subcortical regions track the acoustic envelope. Furthermore, with less intelligible speech, tracking of the modulation rate becomes more dominant. Our study highlights the importance of distinguishing between envelope modulation and syllable rate and provides novel possibilities to better understand differences between auditory processing and speech/language processing disorders.
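To make the distinction between the acoustic envelope and its modulation (syllable-like) rate concrete, here is a minimal sketch, assuming only NumPy/SciPy and a mono waveform, of how an amplitude envelope and its dominant modulation rate could be estimated; the sampling rate, low-pass cutoff, and band limits are illustrative choices, not the authors' parameters.

```python
# A minimal sketch (not the authors' pipeline): extract the amplitude envelope
# of a waveform via the Hilbert transform and estimate its dominant modulation
# rate. All parameter values are illustrative assumptions.
import numpy as np
from scipy.signal import hilbert, butter, filtfilt, periodogram

def amplitude_envelope(waveform, fs, lp_cutoff=30.0):
    """Broadband amplitude envelope, low-pass filtered to keep slow modulations."""
    env = np.abs(hilbert(waveform))
    b, a = butter(4, lp_cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, env)

def dominant_modulation_rate(envelope, fs, fmin=1.0, fmax=12.0):
    """Peak of the envelope power spectrum within the syllabic/modulation range."""
    freqs, power = periodogram(envelope - envelope.mean(), fs=fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    return freqs[band][np.argmax(power[band])]

# Toy example: noise carrier amplitude-modulated at ~4 Hz (a syllable-like rate)
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(0, 5, 1 / fs)
speechlike = (1 + np.sin(2 * np.pi * 4 * t)) * rng.standard_normal(t.size)
env = amplitude_envelope(speechlike, fs)
print(f"dominant modulation rate ~ {dominant_modulation_rate(env, fs):.1f} Hz")
```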

Author(s):  
Mattson Ogg ◽  
L. Robert Slevc

Music and language are uniquely human forms of communication. What neural structures facilitate these abilities? This chapter conducts a review of music and language processing that follows these acoustic signals as they ascend the auditory pathway from the brainstem to auditory cortex and on to more specialized cortical regions. Acoustic, neural, and cognitive mechanisms are identified where processing demands from both domains might overlap, with an eye to examples of experience-dependent cortical plasticity, which are taken as strong evidence for common neural substrates. Following an introduction describing how understanding musical processing informs linguistic or auditory processing more generally, findings regarding the major components (and parallels) of music and language research are reviewed: pitch perception, syntax and harmonic structural processing, semantics, timbre and speaker identification, attending in auditory scenes, and rhythm. Overall, the strongest evidence that currently exists for neural overlap (and cross-domain, experience-dependent plasticity) is in the brainstem, followed by auditory cortex, with evidence and the potential for overlap becoming less apparent as the mechanisms involved in music and speech perception become more specialized and distinct at higher levels of processing.


2021 ◽  
Author(s):  
Marlies Gillis ◽  
Jonas Vanthornhout ◽  
Jonathan Z Simon ◽  
Tom Francart ◽  
Christian Brodbeck

When listening to speech, brain responses time-lock to acoustic events in the stimulus. Recent studies have also reported that cortical responses track linguistic representations of speech. However, tracking of these representations is often described without controlling for acoustic properties. Therefore, the response to these linguistic representations might reflect unaccounted-for acoustic processing rather than language processing. Here we tested several recently proposed linguistic representations, using audiobook speech, while controlling for acoustic and other linguistic representations. Indeed, some of these linguistic representations were not significantly tracked after controlling for acoustic properties. However, phoneme surprisal, cohort entropy, word surprisal and word frequency were significantly tracked over and beyond acoustic properties. Additionally, these linguistic representations are tracked similarly across different stories, spoken by different readers. Together, this suggests that these representations characterize the processing of the linguistic content of speech and might allow a behaviour-free evaluation of speech intelligibility.
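As an illustration of this control logic (a sketch under stated assumptions, not the authors' pipeline), the snippet below fits time-lagged ridge-regression (forward TRF) models with acoustic predictors only and with acoustic plus linguistic predictors, and asks whether the linguistic feature improves held-out prediction accuracy; all signals, the lag range, and the regularization value are made-up stand-ins.

```python
# Sketch: does a linguistic regressor predict a neural channel over and beyond
# an acoustic regressor? Compare held-out accuracy of two forward TRF models.
import numpy as np

def lagged_design(features, n_lags):
    """Stack 0..n_lags-1 sample delays of each feature column."""
    n, k = features.shape
    X = np.zeros((n, k * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * k:(lag + 1) * k] = features[: n - lag]
    return X

def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression; returns predictions for the test set."""
    XtX = X_train.T @ X_train + alpha * np.eye(X_train.shape[1])
    w = np.linalg.solve(XtX, X_train.T @ y_train)
    return X_test @ w

rng = np.random.default_rng(0)
n = 6000                                   # 60 s at 100 Hz (illustrative)
envelope = rng.standard_normal(n)          # stand-in acoustic regressor
surprisal = rng.standard_normal(n)         # stand-in linguistic regressor
kernel = np.hanning(20)
neural = (np.convolve(envelope, kernel, mode="same")
          + 0.3 * np.convolve(surprisal, kernel, mode="same")
          + rng.standard_normal(n))

half = n // 2
for name, feats in [("acoustic only", [envelope]),
                    ("acoustic + linguistic", [envelope, surprisal])]:
    X = lagged_design(np.column_stack(feats), n_lags=40)   # ~0-400 ms of lags
    pred = ridge_fit_predict(X[:half], neural[:half], X[half:])
    r = np.corrcoef(pred, neural[half:])[0, 1]
    print(f"{name}: held-out prediction accuracy r = {r:.3f}")
```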


2019 ◽  
Vol 30 (3) ◽  
pp. 942-951 ◽  
Author(s):  
Lanfang Liu ◽  
Yuxuan Zhang ◽  
Qi Zhou ◽  
Douglas D Garrett ◽  
Chunming Lu ◽  
...  

Whether auditory processing of speech relies on reference to the articulatory motor information of the speaker remains elusive. Here, we addressed this issue under a two-brain framework. Functional magnetic resonance imaging was applied to record the brain activities of speakers telling real-life stories and, later, of listeners listening to the audio recordings of these stories. Based on between-brain seed-to-voxel correlation analyses, we revealed that neural dynamics in listeners’ auditory temporal cortex are temporally coupled with the dynamics in the speaker’s larynx/phonation area. Moreover, the coupling response in the listener’s left auditory temporal cortex follows the hierarchical organization of speech processing, with response lags in A1+, STG/STS, and MTG increasing linearly. Further, listeners showing greater coupling responses understand the speech better. When comprehension fails, this interbrain auditory-articulatory coupling largely vanishes. These findings suggest that a listener’s auditory system and a speaker’s articulatory system are inherently aligned during naturalistic verbal interaction, and that this alignment is associated with high-level information transfer from the speaker to the listener. Our study provides reliable evidence that reference to the speaker’s articulatory motor information facilitates speech comprehension in naturalistic settings.
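Below is a minimal sketch of the kind of lagged between-brain coupling analysis described here, assuming pre-extracted speaker and listener time courses; the TR, series length, and lag range are hypothetical, and the actual analysis was a seed-to-voxel correlation across the whole listener brain.

```python
# Sketch: speaker-listener coupling as correlation between a speaker seed time
# course and a lag-shifted listener time course; report the best lag.
import numpy as np

def lagged_coupling(speaker_ts, listener_ts, max_lag):
    """Correlation at lags 0..max_lag; positive lags mean the listener follows."""
    lags = np.arange(max_lag + 1)
    rs = []
    for lag in lags:
        if lag == 0:
            r = np.corrcoef(speaker_ts, listener_ts)[0, 1]
        else:
            r = np.corrcoef(speaker_ts[:-lag], listener_ts[lag:])[0, 1]
        rs.append(r)
    return lags, np.array(rs)

rng = np.random.default_rng(1)
tr = 2.0                                            # seconds per volume (toy value)
speaker = rng.standard_normal(300)                  # speaker seed time course
listener = np.roll(speaker, 3) + rng.standard_normal(300)   # delayed copy + noise
lags, rs = lagged_coupling(speaker, listener, max_lag=10)
best = lags[np.argmax(rs)]
print(f"peak coupling r = {rs.max():.2f} at lag {best} TRs (~{best * tr:.0f} s)")
```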


2021 ◽  
Author(s):  
Galit Agmon ◽  
Paz Har-Shai Yahav ◽  
Michal Ben-Shachar ◽  
Elana Zion Golumbic

Daily life is full of situations where many people converse at the same time. Under these noisy circumstances, individuals can employ different listening strategies to deal with the abundance of sounds around them. In this fMRI study we investigated how applying two different listening strategies – Selective vs. Distributed attention – affects the pattern of neural activity. Specifically, in a simulated ‘cocktail party’ paradigm, we compared brain activation patterns when listeners attend selectively to only one speaker and ignore all others, versus when they distribute their attention and attempt to follow two or four speakers at the same time. Results indicate that the two attention types activate a highly overlapping, bilateral fronto-temporal-parietal network of functionally connected regions. This network includes auditory association cortex (bilateral STG/STS) and higher-level regions related to speech processing and attention (bilateral IFG/insula, right MFG, left IPS). Within this network, responses in specific areas were modulated by the type of attention required. Specifically, auditory and speech-processing regions exhibited higher activity during Distributed attention, whereas fronto-parietal regions were activated more strongly during Selective attention. This pattern suggests that a common perceptual-attentional network is engaged when dealing with competing speech inputs, regardless of the specific task at hand. At the same time, local activity within nodes of this network varies when implementing different listening strategies, reflecting the different cognitive demands they impose. These results demonstrate the system’s flexibility to adapt its internal computations to accommodate different task requirements and listener goals.

Significance Statement: Hearing many people talk simultaneously poses substantial challenges for the human perceptual and cognitive systems. We compared neural activity when listeners applied two different listening strategies to deal with these competing inputs: attending selectively to one speaker vs. distributing attention among all speakers. A network of functionally connected brain regions involved in auditory processing, language processing and attentional control was activated under both attention types. However, activity within this network was modulated by the type of attention required and the number of competing speakers. These results suggest a common ‘attention to speech’ network, providing the computational infrastructure to deal effectively with multi-speaker input, but with sufficient flexibility to implement different prioritization strategies and to adapt to different listener goals.


2013 ◽  
Vol 25 (12) ◽  
pp. 2179-2188 ◽  
Author(s):  
Katya Krieger-Redwood ◽  
M. Gareth Gaskell ◽  
Shane Lindsay ◽  
Elizabeth Jefferies

Several accounts of speech perception propose that the areas involved in producing language are also involved in perceiving it. In line with this view, neuroimaging studies show activation of premotor cortex (PMC) during phoneme judgment tasks; however, there is debate about whether speech perception necessarily involves motor processes across all task contexts, or whether the contribution of PMC is restricted to tasks requiring explicit phoneme awareness. Some aspects of speech processing, such as mapping sounds onto meaning, may proceed without the involvement of motor speech areas if PMC specifically contributes to the manipulation and categorical perception of phonemes. We applied TMS to three sites (PMC, posterior superior temporal gyrus, and occipital pole) and, for the first time within the TMS literature, directly contrasted two speech perception tasks that required explicit phoneme decisions and mapping of speech sounds onto semantic categories, respectively. TMS to PMC disrupted explicit phonological judgments but not access to meaning for the same speech stimuli. TMS to the two further sites confirmed that this pattern was site specific and did not reflect a generic difference in the susceptibility of our experimental tasks to TMS: stimulation of pSTG, a site involved in auditory processing, disrupted performance in both language tasks, whereas stimulation of occipital pole had no effect on performance in either task. These findings demonstrate that, although PMC is important for explicit phonological judgments, crucially, it is not necessary for mapping speech onto meanings.


2020 ◽  
Author(s):  
Eline Verschueren ◽  
Jonas Vanthornhout ◽  
Tom Francart

Objectives: In recent years there has been significant interest in attempting to recover the temporal envelope of a speech signal from the neural response in order to investigate neural speech processing. The research focus is now broadening from neural speech processing in normal-hearing listeners towards hearing-impaired listeners. When testing hearing-impaired listeners, speech has to be amplified to resemble the effect of a hearing aid and to compensate for peripheral hearing loss. To date, it is not known with certainty whether or how neural speech tracking is influenced by sound amplification. As these higher intensities could influence the outcome, we investigated the influence of stimulus intensity on neural speech tracking.

Design: We recorded the electroencephalogram (EEG) of 20 normal-hearing participants while they listened to a narrated story. The story was presented at intensities from 10 to 80 dB A. To investigate the brain responses, we analyzed neural tracking of the speech envelope by reconstructing the envelope from the EEG with a linear decoder and correlating the reconstructed envelope with the actual one. We investigated the delta (0.5-4 Hz) and theta (4-8 Hz) bands for each intensity. We also investigated the latencies and amplitudes of the responses in more detail using temporal response functions (TRFs), which are the estimated linear response functions between the stimulus envelope and the EEG.

Results: Neural envelope tracking depends on stimulus intensity in both the TRF and the envelope reconstruction analyses. However, provided that the decoder is applied to data of the same stimulus intensity as it was trained on, envelope reconstruction is robust to stimulus intensity. In addition, neural envelope tracking in the delta (but not theta) band appears to relate to speech intelligibility. Similar to the linear decoder analysis, TRF amplitudes and latencies depend on stimulus intensity: the amplitude of peak 1 (30-50 ms) increases and the latency of peak 2 (140-160 ms) decreases with increasing stimulus intensity.

Conclusion: Although brain responses are influenced by stimulus intensity, neural envelope tracking is robust to stimulus intensity when the same intensity is used to train and test the decoder. Therefore we can assume that intensity is not a confound when testing hearing-impaired participants with amplified speech using the linear decoder approach. In addition, neural envelope tracking in the delta band appears to be correlated with speech intelligibility, showing the potential of neural envelope tracking as an objective measure of speech intelligibility.
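The envelope-reconstruction step described above can be sketched as a backward model trained with ridge regression, assuming preprocessed multichannel EEG and a target envelope; the channel count, lag range, and regularization below are illustrative, and a real pipeline would first band-pass the EEG into the delta or theta band.

```python
# Sketch of a linear backward model ("decoder"): reconstruct the speech envelope
# from time-lagged multichannel EEG and score tracking as the correlation between
# reconstructed and actual envelopes. All values are toy assumptions.
import numpy as np

def build_lagged_eeg(eeg, n_lags):
    """EEG (time x channels) expanded with 0..n_lags-1 sample delays per channel."""
    n, c = eeg.shape
    X = np.zeros((n, c * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * c:(lag + 1) * c] = eeg[: n - lag]
    return X

def train_decoder(X, envelope, alpha=100.0):
    """Closed-form ridge solution mapping lagged EEG onto the envelope."""
    XtX = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ envelope)

rng = np.random.default_rng(2)
fs, channels = 64, 16
n = fs * 120                                     # 2 minutes of toy data
envelope = rng.standard_normal(n)
eeg = np.outer(envelope, rng.standard_normal(channels)) + rng.standard_normal((n, channels))

half = n // 2
X = build_lagged_eeg(eeg, n_lags=16)             # ~0-250 ms of lags at 64 Hz
decoder = train_decoder(X[:half], envelope[:half])
reconstructed = X[half:] @ decoder
r = np.corrcoef(reconstructed, envelope[half:])[0, 1]
print(f"envelope reconstruction accuracy r = {r:.2f}")
```

As in the study, the decoder here is tested on data of the same intensity (here, the same toy dataset) as it was trained on, with train and test halves kept separate.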


2021 ◽  
Vol 15 ◽  
Author(s):  
Florian Worschech ◽  
Damien Marie ◽  
Kristin Jünemann ◽  
Christopher Sinke ◽  
Tillmann H. C. Krüger ◽  
...  

Understanding speech in background noise poses a challenge in daily communication, particularly among the elderly. Although musical expertise has often been suggested to contribute to speech intelligibility, the evidence is mostly correlational. In the present multisite study conducted in Germany and Switzerland, 156 healthy, normal-hearing elderly adults were randomly assigned to either a piano-playing or a music listening/musical culture group. The speech reception threshold was assessed using the International Matrix Test before and after a 6-month intervention. Bayesian multilevel modeling revealed an improvement in both groups over time under binaural conditions. Additionally, the speech reception threshold of the piano group decreased for stimuli presented to the left ear. A right-ear improvement occurred only in the German piano group. Furthermore, improvements were found predominantly in women. These findings are discussed in the light of current neuroscientific theories on hemispheric lateralization and biological sex differences. The study indicates a positive transfer from musical training to speech processing, probably supported by enhanced auditory processing and improved general cognitive functions.


2012 ◽  
Vol 2012 ◽  
pp. 1-7 ◽  
Author(s):  
Joseph P. Pillion

Deficits in central auditory processing may occur in a variety of clinical conditions, including traumatic brain injury, neurodegenerative disease, auditory neuropathy/dyssynchrony syndrome, neurological disorders associated with aging, and aphasia. Subtler deficits in central auditory processing have also been studied extensively in neurodevelopmental disorders in children with learning disabilities, ADD, and developmental language disorders. Illustrative cases are reviewed demonstrating the use of an audiological test battery in patients with auditory neuropathy/dyssynchrony syndrome, bilateral lesions of the inferior colliculi, and bilateral lesions of the temporal lobes. Electrophysiological tests of auditory function were used to define the locus of dysfunction at levels ranging from the auditory nerve through the midbrain to the cortex.


2019 ◽  
Author(s):  
Jérémy Giroud ◽  
Agnès Trébuchon ◽  
Daniele Schön ◽  
Patrick Marquis ◽  
Catherine Liegeois-Chauvel ◽  
...  

Speech perception is mediated by both left and right auditory cortices, but with differential sensitivity to specific acoustic information contained in the speech signal. A detailed description of this functional asymmetry is missing, and the underlying models are widely debated. We analyzed cortical responses from 96 epilepsy patients with electrode implantation in left or right primary, secondary, and/or association auditory cortex. We presented short acoustic transients to reveal the stereotyped spectro-spatial oscillatory response profile of the auditory cortical hierarchy. We show remarkably similar bimodal spectral response profiles in left and right primary and secondary regions, with preferred processing modes in the theta (∼4-8 Hz) and low-gamma (∼25-50 Hz) ranges. These results highlight that the human auditory system employs a two-timescale processing mode. Beyond these first cortical levels of auditory processing, a hemispheric asymmetry emerged, with delta- and beta-band (∼3/15 Hz) responsivity prevailing in the right hemisphere and theta- and gamma-band (∼6/40 Hz) activity in the left. These intracranial data provide a more fine-grained and nuanced characterization of cortical auditory processing in the two hemispheres, shedding light on the neural dynamics that potentially shape auditory and speech processing at different levels of the cortical hierarchy.

Author summary: Speech processing is now known to be distributed across the two hemispheres, but the origin and function of this lateralization continue to be vigorously debated. The asymmetric sampling in time (AST) hypothesis predicts that (1) the auditory system employs a two-timescale processing mode, (2) this mode is present in both hemispheres but with a different ratio of fast and slow timescales, and (3) the asymmetry emerges outside of primary cortical regions. Capitalizing on intracranial data from 96 epileptic patients, we validated each of these predictions and provide a precise estimate of the processing timescales. In particular, we reveal that asymmetric sampling in associative areas is subtended by distinct two-timescale processing modes. Overall, our results shed light on the neurofunctional architecture of cortical auditory processing.
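As a rough illustration of how such a spectral response profile might be summarized (a sketch, not the authors' intracranial pipeline), the snippet below averages Welch power within canonical frequency bands for a toy response; the signal, sampling rate, and band edges are assumptions.

```python
# Sketch: characterize a channel's oscillatory response profile by averaging
# Welch power within canonical bands such as theta (~4-8 Hz) and low gamma
# (~25-50 Hz). All values are illustrative.
import numpy as np
from scipy.signal import welch

def band_power(signal, fs, fmin, fmax):
    """Mean Welch power within [fmin, fmax] Hz."""
    freqs, psd = welch(signal, fs=fs, nperseg=fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    return psd[band].mean()

rng = np.random.default_rng(3)
fs = 500
t = np.arange(0, 10, 1 / fs)
# Toy response carrying both a theta (6 Hz) and a low-gamma (40 Hz) component
response = (np.sin(2 * np.pi * 6 * t)
            + 0.5 * np.sin(2 * np.pi * 40 * t)
            + rng.standard_normal(t.size))

bands = {"theta (4-8 Hz)": (4, 8),
         "beta (15-25 Hz)": (15, 25),
         "low gamma (25-50 Hz)": (25, 50)}
for name, (lo, hi) in bands.items():
    print(f"{name}: mean power = {band_power(response, fs, lo, hi):.3f}")
```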

