Stuttering and Natural Speech Processing of Semantic and Syntactic Constraints on Verbs

2008 ◽  
Vol 51 (5) ◽  
pp. 1058-1071 ◽  
Author(s):  
Christine Weber-Fox ◽  
Amanda Hampton

2019 ◽  
Author(s):  
Shyanthony R. Synigal ◽  
Emily S. Teoh ◽  
Edmund C. Lalor

ABSTRACT: The human auditory system is adept at extracting information from speech in both single-speaker and multi-speaker situations. This involves neural processing at the rapid temporal scales seen in natural speech. Non-invasive brain imaging (electro-/magnetoencephalography [EEG/MEG]) has shown that the phase of neural activity below 16 Hz tracks the dynamics of speech, whereas invasive recordings (electrocorticography [ECoG]) have shown that such rapid processing is even more strongly reflected in the power of neural activity at high frequencies (around 70–150 Hz; known as high gamma). The aim of this study was to determine whether high gamma power in scalp-recorded EEG carries useful stimulus-related information, despite its reputation for a poor signal-to-noise ratio. Furthermore, we aimed to assess whether any such information is complementary to that reflected in well-established low-frequency EEG indices of speech processing. We used linear regression to investigate speech envelope and attention decoding in EEG at low frequencies, in high gamma power, and in both signals combined. While low-frequency speech tracking was evident for almost all subjects, as expected, high gamma power also showed robust speech tracking in a minority of subjects. The same pattern held for attention decoding in a separate group of subjects who undertook a cocktail-party attention experiment. For the subjects who showed speech tracking in high gamma power, the spatiotemporal characteristics of that tracking differed from those of low-frequency EEG. Furthermore, combining the two neural measures improved speech-tracking measures for several subjects. Overall, this indicates that high gamma power in EEG can carry useful information about speech processing and attentional selection in some subjects, and that combining it with low-frequency EEG can improve the mapping between natural speech and the resulting neural responses.
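The envelope-tracking analysis described above can be sketched in miniature. The toy below is only an illustration, not the study's pipeline: it uses a crude rectify-and-smooth envelope and a plain Pearson correlation in place of band-limited EEG and regularized linear regression, and every signal and parameter is invented.

```python
import math, random

def envelope(signal, win=32):
    """Crude amplitude envelope: rectify, then causal moving-average smooth."""
    rect = [abs(x) for x in signal]
    return [sum(rect[max(0, i - win):i + 1]) / (i + 1 - max(0, i - win))
            for i in range(len(rect))]

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

# Toy "speech": a fast carrier with a slow amplitude modulation, and a
# "neural" signal that follows the speech envelope (as low-frequency EEG
# phase or high-gamma power might) plus recording noise.
random.seed(0)
speech = [math.sin(2 * math.pi * 0.01 * t) * math.sin(2 * math.pi * 0.3 * t)
          for t in range(2000)]
env = envelope(speech)
neural = [e + random.gauss(0, 0.05) for e in env]
r = pearson(env, neural)
print(round(r, 2))
```

The correlation `r` is the tracking measure; in the real analyses the decoder side is a trained linear mapping from multichannel EEG rather than a noisy copy of the envelope.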


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Raphaël Thézé ◽  
Mehdi Ali Gadiri ◽  
Louis Albert ◽  
Antoine Provost ◽  
Anne-Lise Giraud ◽  
...  

Abstract: Natural speech is processed in the brain as a mixture of auditory and visual features. An example of the importance of visual speech is the McGurk effect and related perceptual illusions that result from mismatching auditory and visual syllables. Although the McGurk effect has been widely applied to the exploration of audio-visual speech processing, it relies on isolated syllables, which severely limits the conclusions that can be drawn from the paradigm. In addition, the extreme variability and the quality of the stimuli usually employed prevent comparability across studies. To overcome these limitations, we present an innovative methodology using 3D virtual characters with realistic lip movements synchronized with computer-synthesized speech. We used commercially accessible and affordable tools to facilitate reproducibility and comparability, and the set-up was validated on 24 participants performing a perception task. Within complete and meaningful French sentences, we paired a labiodental fricative viseme (i.e., /v/) with a bilabial occlusive phoneme (i.e., /b/). This audiovisual mismatch is known to induce the illusion of hearing /v/ in a proportion of trials. We tested the rate of the illusion while varying the magnitude of background noise and the audiovisual lag. Overall, the effect was observed in 40% of trials. The proportion rose to about 50% with added background noise and up to 66% when controlling for phonetic features. Our results demonstrate that computer-generated speech stimuli are a sound choice, and that they can supplement natural speech with higher control over stimulus timing and content.
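The per-condition illusion rates reported above are simple trial proportions. The sketch below shows one way such a tally might look; the trial records and condition names are invented for illustration and are not the study's data (they merely echo the reported pattern of more illusions under noise).

```python
# Hypothetical trial records: (condition, whether the illusory /v/ was heard).
trials = [
    ("quiet", True), ("quiet", False), ("quiet", False), ("quiet", True), ("quiet", False),
    ("noise", True), ("noise", True), ("noise", False), ("noise", True), ("noise", False),
]

def illusion_rate(trials, condition):
    """Proportion of trials in a condition on which the illusion occurred."""
    hits = [heard for cond, heard in trials if cond == condition]
    return sum(hits) / len(hits)

print(illusion_rate(trials, "quiet"))  # 0.4
print(illusion_rate(trials, "noise"))  # 0.6
```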


2021 ◽  
Vol 6 ◽  
Author(s):  
Nikole Giovannone ◽  
Rachel M. Theodore

Previous research suggests that individuals with weaker receptive language show increased reliance on lexical information for speech perception relative to individuals with stronger receptive language, which may reflect a difference in how acoustic-phonetic and lexical cues are weighted for speech processing. Here we examined whether this relationship is the consequence of conflict between acoustic-phonetic and lexical cues in the speech input, which has been found to mediate lexical reliance in sentential contexts. Two groups of participants completed standardized measures of language ability and a phonetic identification task to assess lexical recruitment (i.e., a Ganong task). In the high-conflict group, the stimulus input distribution removed natural correlations between acoustic-phonetic and lexical cues, placing the two cues in high competition with each other; in the low-conflict group, these correlations were present, and competition was thus reduced, as in natural speech. The results showed that (1) the Ganong effect was larger in the low-conflict than in the high-conflict condition in single-word contexts, suggesting that cue conflict dynamically influences online speech perception; (2) the Ganong effect was larger for those with weaker compared to stronger receptive language; and (3) the relationship between the Ganong effect and receptive language was not mediated by the degree to which acoustic-phonetic and lexical cues conflicted in the input. These results suggest that listeners with weaker language ability down-weight acoustic-phonetic cues and rely more heavily on lexical knowledge, even when stimulus input distributions reflect characteristics of natural speech input.
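A Ganong effect is typically quantified as the lexical shift in identification of an ambiguous token. The sketch below shows that computation on hypothetical response data; the continua names and response counts are invented for illustration, not the study's stimuli or results.

```python
# Hypothetical identification responses for the same ambiguous /g/-/k/ token
# placed in two continua: one where "g" makes a word (gift vs. *kift) and one
# where "k" does (*giss vs. kiss). The Ganong effect is the lexical shift:
# more "g" responses when "g" yields a word.
responses = {
    "gift_kift": ["g", "g", "g", "k", "g", "g", "k", "g"],  # "g" forms a word
    "giss_kiss": ["k", "g", "k", "k", "k", "g", "k", "k"],  # "k" forms a word
}

def prop_g(resps):
    """Proportion of 'g' identifications."""
    return resps.count("g") / len(resps)

ganong = prop_g(responses["gift_kift"]) - prop_g(responses["giss_kiss"])
print(ganong)  # 0.75 - 0.25 = 0.5
```

A larger difference indicates heavier weighting of lexical knowledge over acoustic-phonetic cues, which is the quantity the study relates to receptive language ability.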


10.1038/5757 ◽  
1999 ◽  
Vol 2 (2) ◽  
pp. 191-196 ◽  
Author(s):  
Karsten Steinhauer ◽  
Kai Alter ◽  
Angela D. Friederici

Author(s):  
Kevin D. Prinsloo ◽  
Edmund C. Lalor

Abstract: In recent years, research on natural speech processing has benefited from recognizing that low-frequency cortical activity tracks the amplitude envelope of natural speech. However, it remains unclear to what extent this tracking reflects speech-specific processing beyond the analysis of the stimulus acoustics. In the present study, we aimed to disentangle contributions to cortical envelope tracking that reflect general acoustic processing from those that are functionally related to processing speech. To do so, we recorded EEG from subjects as they listened to “auditory chimeras” – stimuli composed of the temporal fine structure (TFS) of one speech stimulus modulated by the amplitude envelope (ENV) of another speech stimulus. By varying the number of frequency bands used in making the chimeras, we obtained some control over which speech stimulus was recognized by the listener. No matter which stimulus was recognized, envelope tracking was always strongest for the ENV stimulus, indicating a dominant contribution from acoustic processing. However, there was also a positive relationship between intelligibility and the tracking of the perceived speech, indicating a contribution from speech-specific processing. These findings were supported by a follow-up analysis that assessed envelope tracking as a function of the (estimated) output of the cochlea rather than the original stimuli used in creating the chimeras. Finally, we sought to isolate the speech-specific contribution to envelope tracking using forward encoding models and found that indices of phonetic feature processing tracked reliably with intelligibility. Together these results show that cortical speech tracking is dominated by acoustic processing, but also reflects speech-specific processing.
This work was supported by a Career Development Award from Science Foundation Ireland (CDA/15/3316) and a grant from the National Institute on Deafness and Other Communication Disorders (DC016297). The authors thank Dr. Aaron Nidiffer, Dr. Aisling O’Sullivan, Thomas Stoll and Lauren Szymula for assistance with data collection, and Dr. Nathaniel Zuk, Dr. Aaron Nidiffer and Dr. Aisling O’Sullivan for helpful comments on this manuscript.
Significance Statement: Activity in auditory cortex is known to dynamically track the energy fluctuations, or amplitude envelope, of speech. Measures of this tracking are now widely used in research on hearing and language and have had a substantial influence on theories of how auditory cortex parses and processes speech. But how much of this speech tracking is actually driven by speech-specific processing, rather than general acoustic processing, is unclear, limiting its interpretability and its usefulness. Here, by merging two speech stimuli together to form so-called auditory chimeras, we show that EEG tracking of the speech envelope is dominated by acoustic processing, but also reflects linguistic analysis. This has important implications for theories of cortical speech tracking and for using measures of that tracking in applied research.
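The chimera construction can be caricatured in a single-band toy. The study used multi-band filterbanks and proper TFS/ENV decompositions; the version below merely rectifies and smooths to obtain an envelope, divides one signal by its own envelope, and imposes the other signal's envelope on it. All signals are invented for illustration.

```python
import math

def envelope(sig, win=64):
    """Crude amplitude envelope: rectify, then causal moving-average smooth."""
    rect = [abs(x) for x in sig]
    return [sum(rect[max(0, i - win):i + 1]) / (i + 1 - max(0, i - win))
            for i in range(len(rect))]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def chimera(tfs_source, env_source, eps=1e-9):
    """Impose env_source's envelope on tfs_source's fine structure.
    Single-band simplification of the multi-band filterbank procedure."""
    env_a = envelope(tfs_source)
    env_b = envelope(env_source)
    # Normalize out the carrier's own envelope, then apply the other's.
    return [s / (ea + eps) * eb for s, ea, eb in zip(tfs_source, env_a, env_b)]

# Two toy signals with different carriers (TFS) and different slow envelopes.
a = [math.sin(0.30 * t) * (1.5 + math.sin(0.010 * t)) for t in range(1000)]
b = [math.sin(0.45 * t) * (1.5 + math.cos(0.013 * t)) for t in range(1000)]
c = chimera(a, b)

# The chimera's envelope should now follow b's envelope rather than a's.
r_b = pearson(envelope(c), envelope(b))
r_a = pearson(envelope(c), envelope(a))
print(round(r_b, 2), round(r_a, 2))
```

In the EEG analysis, tracking of `envelope(b)` versus `envelope(a)` is what distinguishes acoustic (ENV-driven) from speech-specific (percept-driven) contributions.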


2018 ◽  
Author(s):  
Eline Verschueren ◽  
Jonas Vanthornhout ◽  
Tom Francart

ABSTRACT Objectives: Recently, an objective measure of speech intelligibility based on brain responses derived from the electroencephalogram (EEG) has been developed using isolated Matrix sentences as a stimulus. We investigated whether this objective measure of speech intelligibility can also be used with natural speech as a stimulus, as this would be beneficial for clinical applications.
Design: We recorded the EEG in 19 normal-hearing participants while they listened to two types of stimuli: Matrix sentences and a natural story. Each stimulus was presented at different levels of speech intelligibility by adding speech-weighted noise. Speech intelligibility was assessed in two ways for both stimuli: (1) behaviorally and (2) objectively, by reconstructing the speech envelope from the EEG using a linear decoder and correlating it with the acoustic envelope. We also calculated temporal response functions (TRFs) to investigate the temporal characteristics of the brain responses in the EEG channels covering different brain areas.
Results: For both stimulus types, the correlation between the speech envelope and the reconstructed envelope increased with increasing speech intelligibility. In addition, correlations were higher for the natural story than for the Matrix sentences. Similar to the linear decoder analysis, TRF amplitudes increased with increasing speech intelligibility for both stimuli. Remarkably, although speech intelligibility remained unchanged between the no-noise and +2.5 dB SNR conditions, neural speech processing was affected by the addition of this small amount of noise: TRF amplitudes across the entire scalp decreased between 0 and 150 ms, while amplitudes between 150 and 200 ms increased in the presence of noise. TRF latency changes as a function of speech intelligibility appeared to be stimulus specific: the latency of the prominent negative peak in the early responses (50-300 ms) increased with increasing speech intelligibility for the Matrix sentences, but remained unchanged for the natural story.
Conclusions: These results show (1) the feasibility of natural speech as a stimulus for the objective measure of speech intelligibility, (2) that neural tracking of speech is enhanced using a natural story compared to Matrix sentences, and (3) that noise and the stimulus type can change the temporal characteristics of the brain responses. These results might reflect the integration of incoming acoustic features and top-down information, suggesting that the choice of stimulus has to be considered based on the intended purpose of the measurement.
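The core comparison, correlating the envelope decoded from EEG with the acoustic envelope across noise levels, can be mimicked with a toy simulation. Everything below is invented: the "decoder output" is simply the true envelope plus condition-dependent noise, standing in for an actual trained EEG decoder, and the condition labels are generic rather than the study's SNRs.

```python
import math, random

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

random.seed(1)
# Toy acoustic envelope: a slow oscillation, as a stand-in for speech.
env = [1 + math.sin(2 * math.pi * t / 200) for t in range(2000)]

# Toy model of the finding: more noise (lower intelligibility) gives weaker
# neural tracking, so the reconstructed envelope correlates less well with
# the true one. Noise levels here are arbitrary illustrations.
rs = {}
for label, sd in [("low noise", 0.3), ("medium noise", 0.8), ("high noise", 2.0)]:
    reconstructed = [e + random.gauss(0, sd) for e in env]  # stand-in decoder output
    rs[label] = pearson(env, reconstructed)
print({k: round(v, 2) for k, v in rs.items()})
```

In the actual measure, the monotonic rise of this correlation with SNR is what lets it serve as an objective proxy for behavioral intelligibility.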


2019 ◽  
Author(s):  
A. Thiede ◽  
E. Glerean ◽  
T. Kujala ◽  
L. Parkkonen

Abstract: Listening to speech elicits brain activity time-locked to the speech sounds. This so-called neural entrainment to speech has been found to be atypical in dyslexia, a reading impairment associated with neural speech processing deficits. We hypothesized that the brain responses of dyslexic readers to real-life speech would differ from those of typical readers, that this difference would be reflected in the strength of inter-subject correlation (ISC), and that ISC would be associated with reading-related measures.
We recorded magnetoencephalograms (MEG) of 23 dyslexic and 21 typically-reading adults while they listened to ∼10 min of natural Finnish speech consisting of excerpts from radio news, a podcast, a self-recorded audiobook chapter and small talk. The amplitude envelopes of band-pass-filtered MEG source signals were correlated between subjects in a cortically-constrained source space in six frequency bands. The resulting ISCs of dyslexic and typical readers were compared with a permutation-based t-test. Neuropsychological measures of phonological processing, technical reading, and working memory were correlated with the ISCs using the Mantel test.
During listening to speech, ISCs were reduced in dyslexic compared to typical readers in the delta (0.5–4 Hz), alpha (8–12 Hz), low gamma (25–45 Hz) and high gamma (55–90 Hz) frequency bands. In the beta (12–25 Hz) band, dyslexics mainly had enhanced ISC to speech compared to controls. Furthermore, we found that ISCs across both groups were associated with phonological processing, technical reading, and working memory.
The atypical ISC to natural speech in dyslexics supports the temporal sampling deficit theory of dyslexia. It also suggests over-synchronization to phoneme-rate information in speech, which could indicate more effort-demanding sampling of phonemes from speech in dyslexia. These irregularities in parsing speech are likely among the complex neural factors contributing to dyslexia. The associations between neural coupling and reading-related skills further support this notion.
Research Highlights:
- MEG inter-subject correlation (ISC) of dyslexics was atypical while listening to speech.
- Depending on the frequency band, dyslexics had stronger or weaker ISC than controls.
- Reading-related measures correlated with the strength of ISC.
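The ISC computation itself, correlating band-envelope time courses pairwise across subjects, can be sketched as below. The per-subject signals are invented stand-ins for band-limited MEG source envelopes: each "subject" is a shared speech-driven component plus individual noise, with the two toy groups differing only in how tightly they follow the shared component.

```python
import math, random

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def isc(envelopes):
    """Inter-subject correlation: mean pairwise Pearson r across subjects."""
    n = len(envelopes)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(pearson(envelopes[i], envelopes[j]) for i, j in pairs) / len(pairs)

random.seed(2)
shared = [math.sin(2 * math.pi * t / 150) for t in range(600)]  # speech-driven part
# Toy groups: "typical" subjects track the shared signal closely, "dyslexic"
# subjects less so, mimicking the reported reduced delta-band ISC.
typical = [[s + random.gauss(0, 0.3) for s in shared] for _ in range(5)]
dyslexic = [[s + random.gauss(0, 1.0) for s in shared] for _ in range(5)]
isc_typ, isc_dys = isc(typical), isc(dyslexic)
print(round(isc_typ, 2), round(isc_dys, 2))
```

The study additionally compared group ISCs with a permutation-based t-test and related them to neuropsychological scores via the Mantel test, which this sketch does not reproduce.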


2017 ◽  
Author(s):  
Ross K Maddox ◽  
Adrian KC Lee

ABSTRACT: Speech is an ecologically essential signal whose processing begins in the subcortical nuclei of the auditory brainstem, but there are few experimental options for studying these early responses under natural conditions. While encoding of continuous natural speech has been successfully probed in the cortex with neurophysiological tools such as electro- and magnetoencephalography, the rapidity of subcortical response components combined with unfavorable signal-to-noise ratios has prevented the application of those methods to the brainstem. Instead, experiments have used thousands of repetitions of simple stimuli such as clicks, tonebursts, or brief spoken syllables, with deviations from those paradigms leading to ambiguity in the neural origins of measured responses. In this study we developed and tested a new way to measure the auditory brainstem response to ongoing, naturally uttered speech. We found a high degree of morphological similarity between the speech-evoked auditory brainstem response (ABR) and the standard click-evoked ABR, notably a preserved wave V, the most prominent voltage peak in the standard click-evoked ABR. Because this method yields distinct peaks at latencies too short to originate from the cortex, the responses measured can be unambiguously determined to be subcortical in origin. The use of naturally uttered speech to evoke the ABR allows the design of engaging behavioral tasks, facilitating new investigations of the effects of cognitive processes such as language processing and attention on brainstem processing.
SIGNIFICANCE STATEMENT: Speech processing is usually studied in the cortex, but it starts in the auditory brainstem. However, a paradigm for studying brainstem processing of continuous natural speech in human listeners has been elusive due to practical limitations. Here we adapt methods that have been employed for studying cortical activity to the auditory brainstem. We measure the response to continuous natural speech and show that it is highly similar to the click-evoked response. The method also allows simultaneous investigation of cortical activity with no added recording time. This discovery paves the way for studies of speech processing in the human brainstem, including its interactions with higher-order cognitive processes originating in the cortex.
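One way such continuous-speech responses have been derived is by cross-correlating a half-wave-rectified version of the stimulus with the recording, and that idea can be sketched in a toy form. Everything below is invented for illustration: the "speech," the short response kernel, the noise level, and the lag range are arbitrary, and real ABR estimation involves far longer recordings and careful filtering.

```python
import math, random

random.seed(3)
# Toy "speech" and its half-wave-rectified version, used as the regressor.
speech = [math.sin(0.7 * t) + 0.5 * math.sin(1.3 * t) for t in range(4000)]
regressor = [max(0.0, s) for s in speech]  # half-wave rectification

# Simulated recording: the regressor convolved with a short known kernel
# (a stand-in "ABR" peaking at lag 2) plus recording noise.
kernel = [0.0, 0.2, 1.0, 0.4, 0.1]
eeg = [sum(kernel[k] * regressor[t - k] for k in range(len(kernel)) if t - k >= 0)
       + random.gauss(0, 0.5) for t in range(len(regressor))]

def xcorr(x, y, max_lag):
    """Cross-correlation estimate of the response at lags 0..max_lag."""
    n = len(x)
    return [sum(x[t] * y[t + lag] for t in range(n - max_lag)) / (n - max_lag)
            for lag in range(max_lag + 1)]

resp = xcorr(regressor, eeg, 8)
peak_lag = max(range(len(resp)), key=lambda k: resp[k])
print(peak_lag)  # the peak should fall near the true kernel's peak (lag 2)
```

In the real method, the latency of the recovered peak is what certifies the response as subcortical: it is too short for a cortical origin.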

