Cortical Activity of Children with Dyslexia During Natural Speech Processing: Evidence of Auditory Processing Deficiency

ABSTRACTSpeech is an ecologically essential signal whose processing begins in the subcortical nuclei of the auditory brainstem, but there are few experimental options for studying these early responses under natural conditions. While encoding of continuous natural speech has been successfully probed in the cortex with neurophysiological tools such as electro- and magnetoencephalography, the rapidity of subcortical response components combined with unfavorable signal to noise ratios has prevented application of those methods to the brainstem. Instead, experiments have used thousands of repetitions of simple stimuli such as clicks, tonebursts, or brief spoken syllables, with deviations from those paradigms leading to ambiguity in the neural origins of measured responses. In this study we developed and tested a new way to measure the auditory brainstem response to ongoing, naturally uttered speech. We found a high degree of morphological similarity between the speech-evoked auditory brainstem responses (ABR) and the standard click-evoked ABR, notably a preserved wave V, the most prominent voltage peak in the standard click-evoked ABR. Because this method yields distinct peaks at latencies too short to originate from the cortex, the responses measured can be unambiguously determined to be subcortical in origin. The use of naturally uttered speech to evoke the ABR allows the design of engaging behavioral tasks, facilitating new investigations of the effects of cognitive processes like language processing and attention on brainstem processing.SIGNIFICANCE STATEMENTSpeech processing is usually studied in the cortex, but it starts in the auditory brainstem. However, a paradigm for studying brainstem processing of continuous natural speech in human listeners has been elusive due to practical limitations. Here we adapt methods that have been employed for studying cortical activity to the auditory brainstem. We measure the response to continuous natural speech and show that it is highly similar to the click-evoked response. The method also allows simultaneous investigation of cortical activity with no added recording time. This discovery paves the way for studies of speech processing in the human brainstem, including its interactions with higher order cognitive processes originating in the cortex.

Download Full-text

Speech Processing Disorder in Neural Hearing Loss

Case Reports in Medicine ◽

10.1155/2012/206716 ◽

2012 ◽

Vol 2012 ◽

pp. 1-7 ◽

Cited By ~ 2

Author(s):

Joseph P. Pillion

Keyword(s):

Speech Processing ◽

Auditory Processing ◽

Neurological Disorders ◽

Auditory Neuropathy ◽

Central Auditory Processing ◽

Temporal Lobes ◽

Electrophysiological Tests ◽

Bilateral Lesions ◽

Clinical Conditions ◽

Inferior Colliculi

Deficits in central auditory processing may occur in a variety of clinical conditions including traumatic brain injury, neurodegenerative disease, auditory neuropathy/dyssynchrony syndrome, neurological disorders associated with aging, and aphasia. Deficits in central auditory processing of a more subtle nature have also been studied extensively in neurodevelopmental disorders in children with learning disabilities, ADD, and developmental language disorders. Illustrative cases are reviewed demonstrating the use of an audiological test battery in patients with auditory neuropathy/dyssynchrony syndrome, bilateral lesions to the inferior colliculi, and bilateral lesions to the temporal lobes. Electrophysiological tests of auditory function were utilized to define the locus of dysfunction at neural levels ranging from the auditory nerve, midbrain, and cortical levels.

Download Full-text

Asymmetric sampling in human auditory cortex reveals spectral processing hierarchy

10.1101/581520 ◽

2019 ◽

Author(s):

Jérémy Giroud ◽

Agnès Trébuchon ◽

Daniele Schön ◽

Patrick Marquis ◽

Catherine Liegeois-Chauvel ◽

...

Keyword(s):

Auditory Cortex ◽

Auditory System ◽

Speech Processing ◽

Auditory Processing ◽

Differential Sensitivity ◽

Processing Mode ◽

Cortical Hierarchy ◽

Processing Modes ◽

Left And Right ◽

Cortical Auditory Processing

AbstractSpeech perception is mediated by both left and right auditory cortices, but with differential sensitivity to specific acoustic information contained in the speech signal. A detailed description of this functional asymmetry is missing, and the underlying models are widely debated. We analyzed cortical responses from 96 epilepsy patients with electrode implantation in left or right primary, secondary, and/or association auditory cortex. We presented short acoustic transients to reveal the stereotyped spectro-spatial oscillatory response profile of the auditory cortical hierarchy. We show remarkably similar bimodal spectral response profiles in left and right primary and secondary regions, with preferred processing modes in the theta (∼4-8 Hz) and low gamma (∼25-50 Hz) ranges. These results highlight that the human auditory system employs a two-timescale processing mode. Beyond these first cortical levels of auditory processing, a hemispheric asymmetry emerged, with delta and beta band (∼3/15 Hz) responsivity prevailing in the right hemisphere and theta and gamma band (∼6/40 Hz) activity in the left. These intracranial data provide a more fine-grained and nuanced characterization of cortical auditory processing in the two hemispheres, shedding light on the neural dynamics that potentially shape auditory and speech processing at different levels of the cortical hierarchy.Author summarySpeech processing is now known to be distributed across the two hemispheres, but the origin and function of lateralization continues to be vigorously debated. The asymmetric sampling in time (AST) hypothesis predicts that (1) the auditory system employs a two-timescales processing mode, (2) present in both hemispheres but with a different ratio of fast and slow timescales, (3) that emerges outside of primary cortical regions. Capitalizing on intracranial data from 96 epileptic patients we sensitively validated each of these predictions and provide a precise estimate of the processing timescales. In particular, we reveal that asymmetric sampling in associative areas is subtended by distinct two-timescales processing modes. Overall, our results shed light on the neurofunctional architecture of cortical auditory processing.

Download Full-text

Including measures of high gamma power can improve the decoding of natural speech from EEG

10.1101/785881 ◽

2019 ◽

Cited By ~ 1

Author(s):

Shyanthony R. Synigal ◽

Emily S. Teoh ◽

Edmund C. Lalor

Keyword(s):

Brain Imaging ◽

Neural Activity ◽

Speech Processing ◽

Signal To Noise Ratio ◽

Low Frequency ◽

Natural Speech ◽

Low Frequencies ◽

High Gamma ◽

Gamma Power ◽

Speech Tracking

ABSTRACTThe human auditory system is adept at extracting information from speech in both single-speaker and multi-speaker situations. This involves neural processing at the rapid temporal scales seen in natural speech. Non-invasive brain imaging (electro-/magnetoencephalography [EEG/MEG]) signatures of such processing have shown that the phase of neural activity below 16 Hz tracks the dynamics of speech, whereas invasive brain imaging (electrocorticography [ECoG]) has shown that such rapid processing is even more strongly reflected in the power of neural activity at high frequencies (around 70-150 Hz; known as high gamma). The aim of this study was to determine if high gamma power in scalp recorded EEG carries useful stimulus-related information, despite its reputation for having a poor signal to noise ratio. Furthermore, we aimed to assess whether any such information might be complementary to that reflected in well-established low frequency EEG indices of speech processing. We used linear regression to investigate speech envelope and attention decoding in EEG at low frequencies, in high gamma power, and in both signals combined. While low frequency speech tracking was evident for almost all subjects as expected, high gamma power also showed robust speech tracking in a minority of subjects. This same pattern was true for attention decoding using a separate group of subjects who undertook a cocktail party attention experiment. For the subjects who showed speech tracking in high gamma power, the spatiotemporal characteristics of that high gamma tracking differed from that of low-frequency EEG. Furthermore, combining the two neural measures led to improved measures of speech tracking for several subjects. Overall, this indicates that high gamma power EEG can carry useful information regarding speech processing and attentional selection in some subjects and combining it with low frequency EEG can improve the mapping between natural speech and the resulting neural responses.

Download Full-text

Animated virtual characters to explore audio-visual speech in controlled and naturalistic environments

Scientific Reports ◽

10.1038/s41598-020-72375-y ◽

2020 ◽

Vol 10 (1) ◽

Cited By ~ 1

Author(s):

Raphaël Thézé ◽

Mehdi Ali Gadiri ◽

Louis Albert ◽

Antoine Provost ◽

Anne-Lise Giraud ◽

...

Keyword(s):

Speech Processing ◽

Background Noise ◽

Mcgurk Effect ◽

Visual Speech ◽

Natural Speech ◽

Virtual Characters ◽

Speech Stimuli ◽

Stimulus Timing ◽

Phonetic Features ◽

Set Up

Abstract Natural speech is processed in the brain as a mixture of auditory and visual features. An example of the importance of visual speech is the McGurk effect and related perceptual illusions that result from mismatching auditory and visual syllables. Although the McGurk effect has widely been applied to the exploration of audio-visual speech processing, it relies on isolated syllables, which severely limits the conclusions that can be drawn from the paradigm. In addition, the extreme variability and the quality of the stimuli usually employed prevents comparability across studies. To overcome these limitations, we present an innovative methodology using 3D virtual characters with realistic lip movements synchronized on computer-synthesized speech. We used commercially accessible and affordable tools to facilitate reproducibility and comparability, and the set-up was validated on 24 participants performing a perception task. Within complete and meaningful French sentences, we paired a labiodental fricative viseme (i.e. /v/) with a bilabial occlusive phoneme (i.e. /b/). This audiovisual mismatch is known to induce the illusion of hearing /v/ in a proportion of trials. We tested the rate of the illusion while varying the magnitude of background noise and audiovisual lag. Overall, the effect was observed in 40% of trials. The proportion rose to about 50% with added background noise and up to 66% when controlling for phonetic features. Our results conclusively demonstrate that computer-generated speech stimuli are judicious, and that they can supplement natural speech with higher control over stimulus timing and content.

Download Full-text

Auditory–Articulatory Neural Alignment between Listener and Speaker during Verbal Communication

Cerebral Cortex ◽

10.1093/cercor/bhz138 ◽

2019 ◽

Vol 30 (3) ◽

pp. 942-951 ◽

Cited By ~ 5

Author(s):

Lanfang Liu ◽

Yuxuan Zhang ◽

Qi Zhou ◽

Douglas D Garrett ◽

Chunming Lu ◽

...

Keyword(s):

Speech Processing ◽

Auditory Processing ◽

Information Transfer ◽

Temporal Cortex ◽

Real Life ◽

Hierarchical Organization ◽

Speech Comprehension ◽

Level Information ◽

Motor Information ◽

High Level

Abstract Whether auditory processing of speech relies on reference to the articulatory motor information of speaker remains elusive. Here, we addressed this issue under a two-brain framework. Functional magnetic resonance imaging was applied to record the brain activities of speakers when telling real-life stories and later of listeners when listening to the audio recordings of these stories. Based on between-brain seed-to-voxel correlation analyses, we revealed that neural dynamics in listeners’ auditory temporal cortex are temporally coupled with the dynamics in the speaker’s larynx/phonation area. Moreover, the coupling response in listener’s left auditory temporal cortex follows the hierarchical organization for speech processing, with response lags in A1+, STG/STS, and MTG increasing linearly. Further, listeners showing greater coupling responses understand the speech better. When comprehension fails, such interbrain auditory-articulation coupling vanishes substantially. These findings suggest that a listener’s auditory system and a speaker’s articulatory system are inherently aligned during naturalistic verbal interaction, and such alignment is associated with high-level information transfer from the speaker to the listener. Our study provides reliable evidence supporting that references to the articulatory motor information of speaker facilitate speech comprehension under a naturalistic scene.

Download Full-text

Stuttering and Natural Speech Processing of Semantic and Syntactic Constraints on Verbs

Journal of Speech Language and Hearing Research ◽

10.1044/1092-4388(2008/07-0164) ◽

2008 ◽

Vol 51 (5) ◽

pp. 1058-1071 ◽

Cited By ~ 22

Author(s):

Christine Weber-Fox ◽

Amanda Hampton

Keyword(s):

Speech Processing ◽

Natural Speech ◽

Syntactic Constraints

Download Full-text

Attention to speech: Mapping distributed and selective attention systems

10.1101/2021.02.13.431098 ◽

2021 ◽

Author(s):

Galit Agmon ◽

Paz Har-Shai Yahav ◽

Michal Ben-Shachar ◽

Elana Zion Golumbic

Keyword(s):

Selective Attention ◽

Language Processing ◽

Neural Activity ◽

Speech Processing ◽

Auditory Processing ◽

Brain Regions ◽

Association Cortex ◽

Distributed Attention ◽

Listening Strategies ◽

Auditory Association Cortex

AbstractDaily life is full of situations where many people converse at the same time. Under these noisy circumstances, individuals can employ different listening strategies to deal with the abundance of sounds around them. In this fMRI study we investigated how applying two different listening strategies – Selective vs. Distributed attention – affects the pattern of neural activity. Specifically, in a simulated ‘cocktail party’ paradigm, we compared brain activation patterns when listeners attend selectively to only one speaker and ignore all others, versus when they distribute their attention and attempt to follow two or four speakers at the same time. Results indicate that the two attention types activate a highly overlapping, bilateral fronto-temporal-parietal network of functionally connected regions. This network includes auditory association cortex (bilateral STG/STS) and higher-level regions related to speech processing and attention (bilateral IFG/insula, right MFG, left IPS). Within this network, responses in specific areas were modulated by the type of attention required. Specifically, auditory and speech-processing regions exhibited higher activity during Distributed attention, whereas fronto-parietal regions were activated more strongly during Selective attention. This pattern suggests that a common perceptual-attentional network is engaged when dealing with competing speech-inputs, regardless of the specific task at hand. At the same time, local activity within nodes of this network varies when implementing different listening strategies, reflecting the different cognitive demands they impose. These results nicely demonstrate the system’s flexibility to adapt its internal computations to accommodate different task requirements and listener goals.Significance StatementHearing many people talk simultaneously poses substantial challenges for the human perceptual and cognitive systems. We compared neural activity when listeners applied two different listening strategy to deal with these competing inputs: attending selectively to one speaker vs. distributing attention among all speakers. A network of functionally connected brain regions, involved in auditory processing, language processing and attentional control was activated when applying both attention types. However, activity within this network was modulated by the type of attention required and the number of competing speakers. These results suggest a common ‘attention to speech’ network, providing the computational infrastructure to deal effectively with multi-speaker input, but with sufficient flexibility to implement different prioritization strategies and to adapt to different listener goals.

Download Full-text

Relations between hemispheric asymmetries of grey matter and auditory processing of spoken syllables in 281 healthy adults

10.1101/2020.08.12.247841 ◽

2020 ◽

Author(s):

Tulio Guadalupe ◽

Xiang-Zhen Kong ◽

Sophie E. A. Akkermans ◽

Simon E. Fisher ◽

Clyde Francks

Keyword(s):

Auditory Cortex ◽

Speech Processing ◽

Auditory Processing ◽

Dichotic Listening ◽

Grey Matter ◽

Magnetic Resonance Images ◽

Hemispheric Dominance ◽

Subcortical Structures ◽

Hemispheric Asymmetries ◽

Ear Advantage

AbstractMost people have a right-ear advantage for the perception of spoken syllables, consistent with left hemisphere dominance for speech processing. However, there is considerable variation, with some people showing left-ear advantage. The extent to which this variation is reflected in brain structure remains unclear. We tested for relations between hemispheric asymmetries of auditory processing and of grey matter in 281 adults, using dichotic listening and voxel-based morphometry. This was the largest study of this issue to date. Per-voxel asymmetry indexes were derived for each participant following registration of brain magnetic resonance images to a template that was symmetrized. The asymmetry index derived from dichotic listening was related to grey matter asymmetry in clusters of voxels corresponding to the amygdala and cerebellum lobule VI. There was also a smaller, non-significant cluster in the posterior superior temporal gyrus, a region of auditory cortex. These findings contribute to the mapping of asymmetrical structure-function links in the human brain, and suggest that subcortical structures should be investigated in relation to hemispheric dominance for speech processing, in addition to auditory cortex.

Download Full-text

Developmental Shifts in Detection and Attention for Auditory, Visual, and Audiovisual Speech

Journal of Speech Language and Hearing Research ◽

10.1044/2018_jslhr-h-17-0343 ◽

2018 ◽

Vol 61 (12) ◽

pp. 3095-3112 ◽

Cited By ~ 2

Author(s):

Susan Jerger ◽

Markus F. Damian ◽

Cassandra Karl ◽

Hervé Abdi

Keyword(s):

Speech Processing ◽

Auditory Processing ◽

Response Times ◽

Age Groups ◽

Focused Attention ◽

Optimal Detection ◽

Typically Developing ◽

Speech Detection ◽

Multisensory Cues ◽

Attentional Lapses

Purpose Successful speech processing depends on our ability to detect and integrate multisensory cues, yet there is minimal research on multisensory speech detection and integration by children. To address this need, we studied the development of speech detection for auditory (A), visual (V), and audiovisual (AV) input. Method Participants were 115 typically developing children clustered into age groups between 4 and 14 years. Speech detection (quantified by response times [RTs]) was determined for 1 stimulus, /buh/, presented in A, V, and AV modes (articulating vs. static facial conditions). Performance was analyzed not only in terms of traditional mean RTs but also in terms of the faster versus slower RTs (defined by the 1st vs. 3rd quartiles of RT distributions). These time regions were conceptualized respectively as reflecting optimal detection with efficient focused attention versus less optimal detection with inefficient focused attention due to attentional lapses. Results Mean RTs indicated better detection (a) of multisensory AV speech than A speech only in 4- to 5-year-olds and (b) of A and AV inputs than V input in all age groups. The faster RTs revealed that AV input did not improve detection in any group. The slower RTs indicated that (a) the processing of silent V input was significantly faster for the articulating than static face and (b) AV speech or facial input significantly minimized attentional lapses in all groups except 6- to 7-year-olds (a peaked U-shaped curve). Apparently, the AV benefit observed for mean performance in 4- to 5-year-olds arose from effects of attention. Conclusions The faster RTs indicated that AV input did not enhance detection in any group, but the slower RTs indicated that AV speech and dynamic V speech (mouthing) significantly minimized attentional lapses and thus did influence performance. Overall, A and AV inputs were detected consistently faster than V input; this result endorsed stimulus-bound auditory processing by these children.

Download Full-text