Transcranial alternating current stimulation with speech envelopes modulates speech comprehension

2017 ◽  
Author(s):  
Anna Wilsch ◽  
Toralf Neuling ◽  
Jonas Obleser ◽  
Christoph S. Herrmann

Abstract
Cortical entrainment of the auditory cortex to the broadband temporal envelope of a speech signal is crucial for speech comprehension. Entrainment results in phases of high and low neural excitability, which structure and decode the incoming speech signal. Entrainment to speech is strongest in the theta frequency range (4–8 Hz), the average frequency of the speech envelope. If a speech signal is degraded, entrainment to the speech envelope is weaker and speech intelligibility declines. Besides perceptually evoked cortical entrainment, transcranial alternating current stimulation (tACS) entrains neural oscillations by applying an electric signal to the brain. Accordingly, tACS-induced entrainment in auditory cortex has been shown to improve auditory perception. The aim of the current study was to modulate speech intelligibility externally by means of tACS such that the electric current corresponds to the envelope of the presented speech stream (i.e., envelope-tACS). Participants performed the Oldenburg sentence test with sentences presented in noise in combination with envelope-tACS. Critically, tACS was induced at time lags of 0 to 250 ms in 50-ms steps relative to sentence onset (auditory stimuli were simultaneous with or preceded tACS). We performed single-subject sinusoidal, linear, and quadratic fits to sentence comprehension performance across the time lags, and found that the sinusoidal fit described the modulation of sentence comprehension best. Importantly, the average frequency of the sinusoidal fit was 5.12 Hz, corresponding to the peaks of the amplitude spectrum of the stimulated envelopes. This finding was supported by a significant 5-Hz peak in the average power spectrum of individual performance time series.
Altogether, envelope-tACS modulates intelligibility of speech in noise, presumably by enhancing and disrupting (time lags with in- or out-of-phase stimulation, respectively) cortical entrainment to the speech envelope in auditory cortex.
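The per-subject model comparison described above can be sketched in Python. Everything below is illustrative: the lag grid matches the abstract (0-250 ms in 50-ms steps), but the comprehension scores, starting values, and frequency bounds are invented for the example and are not the authors' data or analysis code.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical single-subject comprehension scores at lags 0-250 ms in 50-ms steps
lags = np.arange(6) * 0.05                               # seconds
scores = np.array([0.60, 0.70, 0.61, 0.50, 0.59, 0.69])  # invented values

def sine_model(t, amp, freq, phase, offset):
    return amp * np.sin(2 * np.pi * freq * t + phase) + offset

# Sinusoidal fit, with frequency bounded to a plausible theta range
p_sin, _ = curve_fit(sine_model, lags, scores,
                     p0=[0.05, 5.0, 0.0, float(scores.mean())],
                     bounds=([0.0, 2.0, -np.pi, 0.0], [1.0, 8.0, np.pi, 1.0]))

# Linear and quadratic fits for comparison
p_lin = np.polyfit(lags, scores, 1)
p_quad = np.polyfit(lags, scores, 2)

def rss(pred):
    """Residual sum of squares against the observed scores."""
    return float(np.sum((scores - pred) ** 2))

rss_sin = rss(sine_model(lags, *p_sin))
rss_lin = rss(np.polyval(p_lin, lags))
rss_quad = rss(np.polyval(p_quad, lags))
print(f"fitted frequency: {p_sin[1]:.2f} Hz")
print(f"RSS: sine={rss_sin:.4f}, linear={rss_lin:.4f}, quadratic={rss_quad:.4f}")
```

Note that the sinusoid has more free parameters than the linear and quadratic models, so a fair comparison would also penalize model complexity (e.g., via AIC); the sketch only compares raw residuals.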

2020 ◽  
Vol 15 ◽  
pp. 263310552093662
Author(s):  
Jules Erkens ◽  
Michael Schulte ◽  
Matthias Vormann ◽  
Christoph S Herrmann

In recent years, several studies have reported beneficial effects of transcranial alternating current stimulation (tACS) in experiments on sound and speech perception. A new development in this field is envelope-tACS: the goal of this method is to improve cortical entrainment to the speech signal by stimulating with a waveform based on the speech envelope. One challenge of this stimulation method is timing: the electrical stimulation needs to be phase-aligned with the naturally occurring cortical entrainment to the auditory stimuli. Due to individual differences in anatomy and processing speed, the optimal time-lag between presenting the sound and applying envelope-tACS varies between participants. To better investigate the effects of envelope-tACS, we performed a speech comprehension task with a larger number of time-lags than previous experiments, as well as an equal number of sham conditions. No significant difference was found between the optimal stimulation time-lag condition and the best sham condition. Further investigation of the data revealed a significant difference between the positive and negative half-cycles of the stimulation conditions, but not for sham. However, we also found a significant learning effect over the course of the experiment, which was of comparable size to the effects of envelope-tACS found in previous auditory tACS studies. In this article, we discuss possible explanations for why our findings did not match those of previous studies, and the issues that come with researching and developing envelope-tACS.


2019 ◽  
Author(s):  
Sankar Mukherjee ◽  
Alice Tomassini ◽  
Leonardo Badino ◽  
Aldo Pastore ◽  
Luciano Fadiga ◽  
...  

Abstract
Cortical entrainment to the (quasi-)rhythmic components of speech seems to play an important role in speech comprehension. It has been suggested that neural entrainment may reflect top-down temporal predictions of sensory signals. Key properties of a predictive model are its anticipatory nature and its ability to reconstruct missing information. Here we put both of these properties to experimental test. We acoustically presented sentences and measured cortical entrainment to both the acoustic speech envelope and the lip kinematics acquired from the speaker but not visible to the participants. We then analyzed speech-brain and lips-brain coherence at multiple negative and positive lags. Besides the well-known cortical entrainment to the acoustic speech envelope, we found significant entrainment in the delta range to the (latent) lip kinematics. Most interestingly, the two entrainment phenomena were temporally dissociated. While entrainment to the acoustic speech peaked around a +0.3-s lag (i.e., when the EEG followed the speech by 0.3 s), entrainment to the lips was significantly anticipated and peaked around a 0-0.1-s lag (i.e., when the EEG was virtually synchronous with the putative lip movement). Our results demonstrate that neural entrainment during speech listening involves the anticipatory reconstruction of missing information related to lip movement production, indicating its fundamentally predictive nature and thus supporting analysis-by-synthesis models.
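The idea of scanning lags for peak stimulus-brain coupling can be illustrated on synthetic data. The paper measures coherence; this simplified stand-in uses lagged correlation of delta-band signals instead, and the sampling rate, built-in delay, and noise level are all invented.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

rng = np.random.default_rng(0)
fs = 100                                   # Hz, hypothetical EEG sampling rate
n = 60 * fs                                # 60 s of data

# Synthetic delta-band "speech envelope" and an "EEG" that follows it by 300 ms
sos = butter(2, [1, 4], btype="bandpass", fs=fs, output="sos")
envelope = sosfiltfilt(sos, rng.standard_normal(n))
delay = int(0.3 * fs)
eeg = np.roll(envelope, delay) + 0.1 * rng.standard_normal(n)

def lagged_corr(x, y, shift):
    """Correlation between x[t] and y[t + shift] (shift in samples, >= 0)."""
    return float(np.corrcoef(x[:x.size - shift] if shift else x, y[shift:])[0, 1])

lags = np.arange(0, 0.61, 0.05)            # scan lags from 0 to 600 ms
values = [lagged_corr(envelope, eeg, int(round(l * fs))) for l in lags]
best = float(lags[int(np.argmax(values))])
print(f"envelope-EEG correlation peaks at lag = {best:.2f} s")
```

With the 300-ms delay built into the simulated EEG, the correlation peaks near a +0.3-s lag, mirroring the acoustic-entrainment result reported above.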


2018 ◽  
Author(s):  
Jonas Vanthornhout ◽  
Lien Decruy ◽  
Jan Wouters ◽  
Jonathan Z. Simon ◽  
Tom Francart

Abstract
Speech intelligibility is currently measured by scoring how well a person can identify a speech signal. The results of such behavioral measures reflect neural processing of the speech signal, but are also influenced by language processing, motivation, and memory. Electrophysiological measures of hearing often give insight into the neural processing of sound. However, most methods use non-speech stimuli, making it hard to relate the results to behavioral measures of speech intelligibility. The use of natural running speech as a stimulus in electrophysiological measures of hearing is a paradigm shift that allows us to bridge the gap between behavioral and electrophysiological measures. Here, by decoding the speech envelope from the electroencephalogram and correlating it with the stimulus envelope, we demonstrate an electrophysiological measure of neural processing of running speech. We show that behaviorally measured speech intelligibility is strongly correlated with our electrophysiological measure. Our results pave the way towards an objective and automatic way of assessing neural processing of speech presented through auditory prostheses, reducing confounds such as attention and cognitive capabilities. We anticipate that our electrophysiological measure will allow better differential diagnosis of the auditory system, and will allow the development of closed-loop auditory prostheses that automatically adapt to individual users.
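The decode-and-correlate approach can be sketched as a backward (ridge-regression) model on synthetic data. The channel count, lag window, regularization strength, and simulated "EEG" below are all invented for illustration; this is not the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)
fs, dur, n_ch = 64, 120, 8                 # hypothetical rate, duration, channel count
n = fs * dur

# Smoothed noise as a stand-in for a speech envelope
kernel = np.hanning(fs // 2)
kernel /= kernel.sum()
envelope = np.convolve(rng.standard_normal(n), kernel, mode="same")

# Synthetic "EEG": each channel is a delayed, noisy copy of the envelope
delays = rng.integers(2, 16, n_ch)
eeg = np.stack([np.roll(envelope, d) + 0.5 * rng.standard_normal(n)
                for d in delays], axis=1)

# Backward model: reconstruct the envelope from EEG at lags 0-250 ms (ridge regression)
lag_samples = range(int(0.25 * fs))
X = np.hstack([np.roll(eeg, -lag, axis=0) for lag in lag_samples])
half = n // 2                              # train on the first half, test on the second
w = np.linalg.solve(X[:half].T @ X[:half] + 1e2 * np.eye(X.shape[1]),
                    X[:half].T @ envelope[:half])
reconstruction = X[half:] @ w
r = float(np.corrcoef(reconstruction, envelope[half:])[0, 1])
print(f"envelope reconstruction accuracy: r = {r:.2f}")
```

The held-out correlation `r` plays the role of the electrophysiological measure; in the study, such values are then related to behaviorally measured speech intelligibility.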


2021 ◽  
Author(s):  
Na Xu ◽  
Baotian Zhao ◽  
Lu Luo ◽  
Kai Zhang ◽  
Xiaoqiu Shao ◽  
...  

The envelope is essential for speech perception. Recent studies have shown that cortical activity can track the acoustic envelope. However, whether the tracking strength reflects the extent of speech intelligibility processing remains controversial. Here, using stereo-electroencephalography (sEEG), we directly recorded activity in human auditory cortex while subjects listened to either natural or noise-vocoded speech. These two stimuli have approximately identical envelopes, but the noise-vocoded speech is not intelligible. We found two stages of envelope tracking in auditory cortex: an early high-γ (60-140 Hz) power stage (delay ≈ 49 ms) that preferred the noise-vocoded speech, and a late θ (4-8 Hz) phase stage (delay ≈ 178 ms) that preferred the natural speech. Furthermore, the decoding performance of high-γ power was better in primary than in non-primary auditory cortex, consistent with its short tracking delay. We also found distinct lateralization effects: high-γ power envelope tracking dominated in left auditory cortex, while θ phase showed better decoding performance in right auditory cortex. In sum, we suggest a functional dissociation between high-γ power and θ phase: the former reflects fast and automatic processing of brief acoustic features, while the latter correlates with slow build-up processing facilitated by speech intelligibility.
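The two neural signatures named above, high-γ power and θ phase, are commonly extracted with band-pass filtering plus the Hilbert transform. A minimal sketch on a random stand-in channel; the sampling rate and filter order are assumptions, not taken from the paper.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 1000                                  # Hz, hypothetical sEEG sampling rate
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(2)
channel = rng.standard_normal(t.size)      # stand-in for one sEEG channel

def band_analytic(x, lo, hi):
    """Band-pass the signal and return its analytic (Hilbert) representation."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return hilbert(sosfiltfilt(sos, x))

# High-γ (60-140 Hz) instantaneous power, and θ (4-8 Hz) instantaneous phase
high_gamma_power = np.abs(band_analytic(channel, 60, 140)) ** 2
theta_phase = np.angle(band_analytic(channel, 4, 8))
print(high_gamma_power.shape, theta_phase.shape)
```

The stimulus-tracking delays reported above (≈49 ms for high-γ power, ≈178 ms for θ phase) could then be estimated by cross-correlating these signals with the speech envelope.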


2021 ◽  
Author(s):  
Mahmoud Keshavarzi ◽  
Enrico Varano ◽  
Tobias Reichenbach

Abstract
Understanding speech in background noise is a difficult task. The tracking of speech rhythms, such as the rate of syllables and words, by cortical activity has emerged as a key neural mechanism for speech-in-noise comprehension. In particular, recent investigations have used transcranial alternating current stimulation (tACS) with the envelope of a speech signal to influence cortical speech tracking, demonstrating that this type of stimulation modulates comprehension and thereby evidencing a functional role of cortical tracking in speech processing. Cortical activity has been found to track the rhythms of a background speaker as well, but the functional significance of this neural response remains unclear. Here we employ a speech-comprehension task with a target speaker in the presence of a distractor voice to show that tACS with the speech envelope of the target voice, as well as tACS with the envelope of the distractor speaker, both modulate the comprehension of the target speech. Because the envelope of the distractor speech does not carry information about the target speech stream, the modulation of speech comprehension through tACS with this envelope evidences that the cortical tracking of the background speaker affects the comprehension of the foreground speech signal. The phase dependency of the resulting modulation of speech comprehension is, however, opposite to that obtained from tACS with the envelope of the target speech signal. This suggests that the cortical tracking of the ignored speech stream and that of the attended speech stream may compete for neural resources.

Significance Statement
Loud environments such as busy pubs or restaurants can make conversation difficult. However, they also allow us to eavesdrop on other conversations that occur in the background. In particular, we often notice when somebody else mentions our name, even if we have not been listening to that person. The neural mechanisms by which background speech is processed, however, remain poorly understood. Here we employ transcranial alternating current stimulation, a technique through which neural activity in the cerebral cortex can be influenced, to show that cortical responses to rhythms in the distractor speech modulate the comprehension of the target speaker. Our results evidence that the cortical tracking of background speech rhythms plays a functional role in speech processing.


2019 ◽  
Author(s):  
Peng Zan ◽  
Alessandro Presacco ◽  
Samira Anderson ◽  
Jonathan Z. Simon

Abstract
Aging is associated with an exaggerated representation of the speech envelope in auditory cortex. The relationship between this age-related exaggerated response and a listener's ability to understand speech in noise remains an open question. Here, information-theory-based analysis methods are applied to magnetoencephalography (MEG) recordings of human listeners, investigating their cortical responses to continuous speech, using the novel non-linear measure of phase-locked mutual information between the speech stimuli and cortical responses. The cortex of older listeners shows an exaggerated level of mutual information, compared to younger listeners, for both attended and unattended speakers. The mutual information peaks at several distinct latencies: early (∼50 ms), middle (∼100 ms), and late (∼200 ms). For the late component, the neural enhancement of attended over unattended speech is affected by stimulus SNR, but the direction of this dependency is reversed by aging. Critically, in older listeners and for the same late component, greater cortical exaggeration is correlated with decreased behavioral inhibitory control. This negative correlation also carries over to speech intelligibility in noise, where greater cortical exaggeration in older listeners is correlated with worse speech intelligibility scores. Finally, an age-related lateralization difference is also seen for the ∼100-ms latency peaks, where older listeners show a bilateral response compared to younger listeners' right-lateralization. Thus, this information-theory-based analysis provides new, less coarse-grained results regarding age-related change in auditory cortical speech processing, and its correlation with cognitive measures, compared to related linear measures.

New & Noteworthy
Cortical representations of natural speech are investigated using a novel non-linear approach based on mutual information. Cortical responses, phase-locked to the speech envelope, show an exaggerated level of mutual information associated with aging, appearing at several distinct latencies (∼50, ∼100, and ∼200 ms). Critically, for older listeners only, the ∼200-ms latency response components are correlated with specific behavioral measures, including behavioral inhibition and speech comprehension.
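A minimal histogram-based estimator conveys the core quantity, mutual information between a stimulus and a response. The paper's phase-locked MI measure is more elaborate; the signal lengths, bin count, and simulated "response" here are invented for illustration.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram-based mutual-information estimate (in bits) between two 1-D signals."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p = joint / joint.sum()                      # joint probability table
    px = p.sum(axis=1, keepdims=True)            # marginal of x
    py = p.sum(axis=0, keepdims=True)            # marginal of y
    nz = p > 0                                   # avoid log(0)
    return float(np.sum(p[nz] * np.log2(p[nz] / (px @ py)[nz])))

rng = np.random.default_rng(3)
stim = rng.standard_normal(20000)
response = stim + 0.5 * rng.standard_normal(20000)   # "response" tracking the stimulus

mi_tracked = mutual_information(stim, response)
mi_shuffled = mutual_information(stim, rng.permutation(response))
print(f"MI(stim, response) = {mi_tracked:.2f} bits")
print(f"MI(stim, shuffled) = {mi_shuffled:.2f} bits")
```

Shuffling destroys the stimulus-response dependence, so the shuffled value approximates the estimator's bias floor; an "exaggerated" cortical representation would correspond to a larger tracked-vs-shuffled difference.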


2017 ◽  
Vol 114 (24) ◽  
pp. 6352-6357 ◽  
Author(s):  
Geoffrey Brookshire ◽  
Jenny Lu ◽  
Howard C. Nusbaum ◽  
Susan Goldin-Meadow ◽  
Daniel Casasanto

Despite immense variability across languages, people can learn to understand any human language, spoken or signed. What neural mechanisms allow people to comprehend language across sensory modalities? When people listen to speech, electrophysiological oscillations in auditory cortex entrain to slow (<8 Hz) fluctuations in the acoustic envelope. Entrainment to the speech envelope may reflect mechanisms specialized for auditory perception. Alternatively, flexible entrainment may be a general-purpose cortical mechanism that optimizes sensitivity to rhythmic information regardless of modality. Here, we test these proposals by examining cortical coherence to visual information in sign language. First, we develop a metric to quantify visual change over time. We find quasiperiodic fluctuations in sign language, characterized by lower frequencies than fluctuations in speech. Next, we test for entrainment of neural oscillations to visual change in sign language, using electroencephalography (EEG) in fluent signers of American Sign Language (ASL) as they watch videos in ASL. We find significant cortical entrainment to visual oscillations in sign language below 5 Hz, peaking at ∼1 Hz. Coherence to sign is strongest over occipital and parietal cortex, in contrast to speech, where coherence is strongest over auditory cortex. Nonsigners also show coherence to sign language, but entrainment at frontal sites is reduced relative to fluent signers. These results demonstrate that flexible cortical entrainment to language does not depend on neural processes that are specific to auditory speech perception. Low-frequency oscillatory entrainment may reflect a general cortical mechanism that maximizes sensitivity to informational peaks in time-varying signals.
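The abstract's "metric to quantify visual change over time" is not specified beyond its purpose. A naive stand-in is the mean absolute difference between consecutive frames, whose spectrum then reveals quasiperiodic fluctuations. All parameters of this synthetic "video" are invented.

```python
import numpy as np

fps, seconds, size = 30, 10, 64
t = np.arange(fps * seconds) / fps

# Synthetic grayscale "video": a bright patch whose position oscillates at 1 Hz
frames = np.zeros((t.size, size, size))
for i, ti in enumerate(t):
    x = int(32 + 20 * np.sin(2 * np.pi * 1.0 * ti))
    frames[i, 28:36, x - 4:x + 4] = 1.0

# Visual change: mean absolute difference between consecutive frames
visual_change = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))

# Dominant frequency of the visual-change signal (DC removed first)
spec = np.abs(np.fft.rfft(visual_change - visual_change.mean()))
freqs = np.fft.rfftfreq(visual_change.size, d=1 / fps)
peak = float(freqs[int(np.argmax(spec))])
print(f"visual change fluctuates mostly at {peak:.1f} Hz")
```

Because frame differencing rectifies motion (change is largest at peak velocity regardless of direction), the dominant frequency of this signal is a harmonic of the underlying movement rate, not the movement rate itself.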


2021 ◽  
Author(s):  
Marlies Gillis ◽  
Jonas Vanthornhout ◽  
Jonathan Z Simon ◽  
Tom Francart ◽  
Christian Brodbeck

When listening to speech, brain responses time-lock to acoustic events in the stimulus. Recent studies have also reported that cortical responses track linguistic representations of speech. However, tracking of these representations is often described without controlling for acoustic properties, so the response attributed to linguistic representations might reflect unaccounted-for acoustic processing rather than language processing. Here we tested several recently proposed linguistic representations, using audiobook speech, while controlling for acoustic and other linguistic representations. Indeed, some of these linguistic representations were not significantly tracked after controlling for acoustic properties. However, phoneme surprisal, cohort entropy, word surprisal, and word frequency were significantly tracked over and beyond acoustic properties. Additionally, these linguistic representations were tracked similarly across different stories, spoken by different readers. Together, this suggests that these representations characterize processing of the linguistic content of speech and might allow a behavior-free evaluation of speech intelligibility.
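Controlling for acoustic properties before crediting a linguistic representation can be illustrated with a nested-model comparison: does adding a linguistic predictor improve held-out prediction of the response beyond an acoustic-only model? The simulated response and predictor weights below are invented; this is not the authors' analysis.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
acoustic = rng.standard_normal(n)        # e.g. an acoustic-envelope predictor
linguistic = rng.standard_normal(n)      # e.g. a word-surprisal predictor
response = acoustic + 0.6 * linguistic + rng.standard_normal(n)  # simulated "EEG"

half = n // 2                            # fit on the first half, test on the second

def prediction_accuracy(X):
    """Least-squares fit on the training half; correlation on the held-out half."""
    w = np.linalg.lstsq(X[:half], response[:half], rcond=None)[0]
    return float(np.corrcoef(X[half:] @ w, response[half:])[0, 1])

r_acoustic = prediction_accuracy(acoustic[:, None])
r_both = prediction_accuracy(np.column_stack([acoustic, linguistic]))
print(f"acoustic-only r = {r_acoustic:.3f}, acoustic+linguistic r = {r_both:.3f}")
```

Only when the combined model beats the acoustic-only model on held-out data is the linguistic representation tracked "over and beyond" acoustics, which is the logic of the control described above.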


2013 ◽  
Vol 278-280 ◽  
pp. 1124-1128
Author(s):  
Yi Long You ◽  
Fei Zhang ◽  
Bu Lei Zuo ◽  
Feng Xiang You

Although traditional algorithms can suppress noise in speech, some distortion of the voice is inevitable. This paper introduces speech signal enhancement with an improved threshold method. Through MATLAB simulation comparing the method against a traditional enhancement algorithm, the paper aims to verify that it can effectively remove noise from the signal, enhance voice quality, improve speech intelligibility, and thereby achieve effective speech enhancement.
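The paper's improved threshold method is not detailed in the abstract. As a generic illustration of threshold-based enhancement (in Python rather than the paper's MATLAB), here is a frequency-domain soft-thresholding denoiser; the test signal, threshold rule, and noise-floor estimate are all invented.

```python
import numpy as np

rng = np.random.default_rng(5)
fs = 8000
t = np.arange(0, 1, 1 / fs)
clean = np.sin(2 * np.pi * 440 * t)          # stand-in for a voiced speech segment
noisy = clean + 0.3 * rng.standard_normal(t.size)

def soft_threshold_denoise(x, k=3.0):
    """Soft thresholding in the frequency domain: shrink every DFT
    magnitude toward zero by an estimated noise floor."""
    spec = np.fft.rfft(x)
    mag, phase = np.abs(spec), np.angle(spec)
    noise_floor = k * np.median(mag)          # crude noise estimate (median is noise-dominated)
    mag = np.maximum(mag - noise_floor, 0.0)  # soft threshold: shrink, don't just gate
    return np.fft.irfft(mag * np.exp(1j * phase), n=x.size)

def snr_db(ref, sig):
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((ref - sig) ** 2))

denoised = soft_threshold_denoise(noisy)
print(f"SNR noisy    = {snr_db(clean, noisy):.1f} dB")
print(f"SNR denoised = {snr_db(clean, denoised):.1f} dB")
```

Soft thresholding shrinks surviving coefficients instead of merely gating them, which is one common way threshold methods reduce the "musical noise" distortion that the abstract attributes to traditional algorithms.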

