Delta-band Cortical Tracking of Acoustic and Linguistic Features in Natural Spoken Narratives

2020 ◽  
Author(s):  
Cheng Luo ◽  
Nai Ding

Speech contains rich acoustic and linguistic information. During speech comprehension, cortical activity tracks the acoustic envelope of speech. Recent studies also observe cortical tracking of higher-level linguistic units, such as words and phrases, using synthesized speech that lacks a delta-band acoustic envelope. It remains unclear, however, how cortical activity jointly encodes the acoustic and linguistic information in natural speech. Here, we investigate the neural encoding of words and demonstrate that delta-band cortical activity tracks the rhythm of multi-syllabic words during natural listening to narratives. Furthermore, by dissociating the word rhythm from the acoustic envelope, we find that cortical activity primarily tracks the word rhythm during speech comprehension. When listeners' attention is diverted, however, neural tracking of words diminishes, and delta-band activity becomes phase-locked to the acoustic envelope. These results suggest that large-scale cortical dynamics in the delta band are primarily coupled to the rhythm of linguistic units during natural speech comprehension.

2021 ◽  
Author(s):  
Seung-Goo Kim ◽  
Federico De Martino ◽  
Tobias Overath

Speech comprehension entails the neural mapping of the acoustic speech signal onto learned linguistic units. This acousto-linguistic transformation is bi-directional, whereby higher-level linguistic processes (e.g., semantics) modulate the acoustic analysis of individual linguistic units. Here, we investigated the cortical topography and linguistic modulation of the most fundamental linguistic unit, the phoneme. We presented natural speech and 'phoneme quilts' (pseudo-randomly shuffled phonemes) in either a familiar (English) or unfamiliar (Korean) language to native English speakers while recording fMRI. This design dissociates the contribution of acoustic and linguistic processes towards phoneme analysis. We show that (1) the four main phoneme classes (vowels, nasals, plosives, fricatives) are differentially and topographically encoded in human auditory cortex, and that (2) their acoustic analysis is modulated by linguistic analysis. These results suggest that the linguistic modulation of cortical sensitivity to phoneme classes minimizes prediction error during natural speech perception, thereby aiding speech comprehension in challenging listening situations.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Cheng Luo ◽  
Nai Ding

Speech contains rich acoustic and linguistic information. Using highly controlled speech materials, previous studies have demonstrated that cortical activity is synchronous to the rhythms of perceived linguistic units, for example, words and phrases, on top of basic acoustic features, for example, the speech envelope. When listening to natural speech, it remains unclear, however, how cortical activity jointly encodes acoustic and linguistic information. Here we investigate the neural encoding of words using electroencephalography and observe neural activity synchronous to multi-syllabic words when participants naturally listen to narratives. An amplitude modulation (AM) cue for word rhythm enhances the word-level response, but the effect is only observed during passive listening. Furthermore, words and the AM cue are encoded by spatially separable neural responses that are differentially modulated by attention. These results suggest that bottom-up acoustic cues and top-down linguistic knowledge separately contribute to cortical encoding of linguistic units in spoken narratives.


2020 ◽  
Vol 32 (1) ◽  
pp. 155-166 ◽  
Author(s):  
Hugo Weissbart ◽  
Katerina D. Kandylaki ◽  
Tobias Reichenbach

Speech comprehension requires rapid online processing of a continuous acoustic signal to extract structure and meaning. Previous studies on sentence comprehension have found neural correlates of the predictability of a word given its context, as well as of the precision of such a prediction. However, they have focused on single sentences and on particular words in those sentences. Moreover, they compared neural responses to words with low and high predictability, as well as with low and high precision. However, in speech comprehension, a listener hears many successive words whose predictability and precision vary over a large range. Here, we show that cortical activity in different frequency bands tracks word surprisal in continuous natural speech and that this tracking is modulated by precision. We obtain these results through quantifying surprisal and precision from naturalistic speech using a deep neural network and through relating these speech features to EEG responses of human volunteers acquired during auditory story comprehension. We find significant cortical tracking of surprisal at low frequencies, including the delta band as well as in the higher frequency beta and gamma bands, and observe that the tracking is modulated by the precision. Our results pave the way to further investigate the neurobiology of natural speech comprehension.
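The surprisal measure tracked in this study can be illustrated with a toy model. The authors quantify surprisal with a deep neural network language model; the sketch below instead uses a minimal add-alpha-smoothed bigram model (the corpus and all names are illustrative, not from the paper) purely to show the quantity itself: surprisal in bits, -log2 P(word | context).

```python
import math
from collections import Counter

def train_bigram(tokens):
    """Count unigrams and bigrams from a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def surprisal(prev, word, unigrams, bigrams, vocab_size, alpha=1.0):
    """Surprisal in bits: -log2 P(word | prev), with add-alpha smoothing."""
    p = (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab_size)
    return -math.log2(p)

corpus = "the cat sat on the mat and the dog sat on the rug".split()
uni, bi = train_bigram(corpus)
v = len(uni)

# An attested continuation ("the" -> "cat") is less surprising
# than an unattested one ("the" -> "and").
s_common = surprisal("the", "cat", uni, bi, v)
s_rare = surprisal("the", "and", uni, bi, v)
```

In the study, per-word surprisal values computed this way (but from a far richer model) serve as a regressor against the EEG; precision corresponds to the sharpness of the predictive distribution.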


2021 ◽  
Author(s):  
Jeremy Giroud ◽  
Jacques Pesnot Lerousseau ◽  
Francois Pellegrino ◽  
Benjamin Morillon

Humans are expert at processing speech but how this feat is accomplished remains a major question in cognitive neuroscience. Capitalizing on the concept of channel capacity, we developed a unified measurement framework to investigate the respective influence of seven acoustic and linguistic features on speech comprehension, encompassing acoustic, sub-lexical, lexical and supra-lexical levels of description. We show that comprehension is independently impacted by all these features, but at varying degrees and with a clear dominance of the syllabic rate. Comparing comprehension of French words and sentences further reveals that when supra-lexical contextual information is present, the impact of all other features is dramatically reduced. Finally, we estimated the channel capacity associated with each linguistic feature and compared them with their generic distribution in natural speech. Our data point towards supra-lexical contextual information as the feature limiting the flow of natural speech. Overall, this study reveals how multilevel linguistic features constrain speech comprehension.


2019 ◽  
Vol 122 (6) ◽  
pp. 2206-2219 ◽  
Author(s):  
A. Alishbayli ◽  
J. G. Tichelaar ◽  
U. Gorska ◽  
M. X. Cohen ◽  
B. Englitz

Understanding the relation between large-scale potentials (M/EEG) and their underlying neural activity can improve the precision of research and clinical diagnosis. Recent insights into cortical dynamics highlighted a state of strongly reduced spike count correlations, termed the asynchronous state (AS). The AS has received considerable attention from experimenters and theorists alike, regarding its implications for cortical dynamics and coding of information. However, how reconcilable are these vanishing correlations in the AS with large-scale potentials such as M/EEG observed in most experiments? Typically the latter are assumed to be based on underlying correlations in activity, in particular between subthreshold potentials. We survey the occurrence of the AS across brain states, regions, and layers and argue for a reconciliation of this seeming disparity: large-scale potentials are either observed, first, at transitions between cortical activity states, which entail transient changes in population firing rate, as well as during the AS, and, second, on the basis of sufficiently large, asynchronous populations that only need to exhibit weak correlations in activity. Cells with no or little spiking activity can contribute to large-scale potentials via their subthreshold currents, while they do not contribute to the estimation of spiking correlations, defining the AS. Furthermore, third, the AS occurs only within particular cortical regions and layers associated with the currently selected modality, allowing for correlations at other times and between other areas and layers.


2020 ◽  
Author(s):  
Michael P. Broderick ◽  
Edmund C. Lalor

Prior knowledge facilitates perception and allows us to interpret our sensory environment. However, the neural mechanisms underlying this process remain unclear. Theories of predictive coding propose that feedback connections between cortical levels carry predictions about upcoming sensory events, whereas feedforward connections carry the error between the prediction and the sensory input. Although predictive coding has gained much ground as a viable mechanism for perception, in the context of spoken language comprehension it lacks empirical support from more naturalistic stimuli. In this study, we investigated theories of predictive coding using continuous, everyday speech. EEG recordings from human participants listening to an audiobook were analysed using a 2-stage regression framework. This tested the effect of top-down linguistic information, estimated using computational language models, on the bottom-up encoding of acoustic and phonetic speech features. Our results show enhanced encoding of both semantic predictions and surprising words, based on preceding context. This suggests that signals pertaining to prediction and error units can be observed in the same electrophysiological responses to natural speech. In addition, temporal analysis of these signals reveals support for theories of predictive coding that propose that perception is first biased towards what is expected and then towards what is informative.

Significance Statement
Over the past two decades, predictive coding has grown in popularity as an explanatory mechanism for perception. However, there has been a lack of empirical support for this theory in research studying natural speech comprehension. We address this issue by developing an analysis framework that tests the effects of top-down linguistic information on the auditory encoding of continuous speech. Our results provide evidence for the co-existence of prediction and error signals and support theories of predictive coding using more naturalistic stimuli.
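The 2-stage regression idea can be caricatured with simulated data: first regress the neural signal on lagged acoustic features (a minimal temporal response function), then ask whether a linguistic predictor explains the residual. This is only a hedged sketch under simplifying assumptions (closed-form ridge regression, a white-noise "envelope", made-up coefficients), not the authors' actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Stage 1: encode "EEG" from lagged acoustic features (a minimal TRF).
n, lags = 2000, 5
env = rng.standard_normal(n + lags)                  # toy speech envelope
X = np.column_stack([env[i:i + n] for i in range(lags)])
true_trf = np.array([0.0, 0.5, 1.0, 0.5, 0.0])       # invented filter shape
ling = rng.standard_normal(n)                         # toy linguistic predictor
eeg = X @ true_trf + 0.3 * ling + 0.1 * rng.standard_normal(n)

w_acoustic = ridge_fit(X, eeg)
residual = eeg - X @ w_acoustic

# Stage 2: test whether the linguistic feature explains residual variance.
r = np.corrcoef(ling, residual)[0, 1]
```

Here `r` stays well above zero because the simulated linguistic signal carries variance the acoustic regressors cannot absorb; in the study, the analogous residual effect is what licenses the claim of top-down linguistic encoding.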


2020 ◽  
Vol 6 (3) ◽  
pp. 158-164
Author(s):  
Navruza Yakhyayeva

In this article, the quality and content of information in media texts are examined on the basis of a scientific classification of linguistic features. The study of the functional styles of speech, the identification of their linguistic markers, the discovery of the functional properties of linguistic units, and their differentiation on the basis of linguistic facts are among the tasks that modern linguistics has yet to solve. Text linguistics, which deals with the creation of texts, the modeling of their structure, and the study of the process of such activity, is of interest to journalists today as a science.


Author(s):  
Xiang Zhang ◽  
Erjing Lin ◽  
Yulian Lv

In this article, the authors propose a novel search model: Multi-Target Search (MT search for short). MT search is a keyword-based search model over Semantic Associations in Linked Data. Each search contains multiple sub-queries, each of which represents a user need for a certain object in a group relationship. The authors first formalize the problem of association search and then introduce their approach to discovering Semantic Associations in large-scale Linked Data. Next, they elaborate their novel search model, the notion of the Virtual Document they use to extract linguistic features, and the details of the search process. They then discuss how search results are organized and summarized. Quantitative experiments are conducted on DBpedia to validate the effectiveness and efficiency of the approach.
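The Virtual Document idea, representing each Linked Data entity by a bag of text drawn from its own and its neighbours' labels and then matching query keywords against it, can be sketched as follows. The entity names, toy documents, and the simple hit-count scoring here are all illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

# Toy "virtual documents": each entity is represented by the concatenated
# text of its own label and the labels of its neighbours (invented data).
virtual_docs = {
    "dbr:Alan_Turing": "alan turing computer scientist cambridge enigma",
    "dbr:Cambridge": "cambridge university city england",
    "dbr:Enigma": "enigma cipher machine war",
}

def build_index(docs):
    """Invert the virtual documents into a term -> entities map."""
    index = defaultdict(set)
    for entity, text in docs.items():
        for term in text.split():
            index[term].add(entity)
    return index

def search(index, keywords):
    """Rank entities by how many query keywords their virtual document matches."""
    scores = defaultdict(int)
    for kw in keywords:
        for entity in index.get(kw, ()):
            scores[entity] += 1
    return sorted(scores, key=scores.get, reverse=True)

idx = build_index(virtual_docs)
hits = search(idx, ["cambridge", "enigma"])
```

An entity whose virtual document covers more of the query keywords ranks first, which is the intuition behind matching keywords to group relationships rather than to single resources.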


PLoS ONE ◽  
2012 ◽  
Vol 7 (2) ◽  
pp. e30757 ◽  
Author(s):  
Vicente Botella-Soler ◽  
Mario Valderrama ◽  
Benoît Crépon ◽  
Vincent Navarro ◽  
Michel Le Van Quyen
