Cortical encoding of acoustic and linguistic rhythms in spoken narratives

eLife, 2020, Vol 9
Author(s): Cheng Luo, Nai Ding

Speech contains rich acoustic and linguistic information. Using highly controlled speech materials, previous studies have demonstrated that cortical activity is synchronous to the rhythms of perceived linguistic units, for example, words and phrases, on top of basic acoustic features, for example, the speech envelope. When listening to natural speech, it remains unclear, however, how cortical activity jointly encodes acoustic and linguistic information. Here we investigate the neural encoding of words using electroencephalography and observe neural activity synchronous to multi-syllabic words when participants naturally listen to narratives. An amplitude modulation (AM) cue for word rhythm enhances the word-level response, but the effect is only observed during passive listening. Furthermore, words and the AM cue are encoded by spatially separable neural responses that are differentially modulated by attention. These results suggest that bottom-up acoustic cues and top-down linguistic knowledge separately contribute to cortical encoding of linguistic units in spoken narratives.
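To make the word-rate analysis concrete, here is a minimal frequency-tagging sketch (not the authors' analysis code; the sampling rate, word rate, and data shapes are assumed placeholders) showing how a response at the rhythm of multi-syllabic words can be read out of the spectrum of the evoked EEG response.

```python
import numpy as np

fs = 100.0           # EEG sampling rate in Hz (assumed)
word_rate = 2.0      # hypothetical rate of multi-syllabic words in Hz
epochs = np.random.randn(40, 3000)   # placeholder: 40 trials x 30 s of one channel

# Average over trials first so that non-phase-locked activity cancels out,
# then take the amplitude spectrum of the evoked response.
evoked = epochs.mean(axis=0)
spectrum = np.abs(np.fft.rfft(evoked)) / evoked.size
freqs = np.fft.rfftfreq(evoked.size, d=1.0 / fs)

# Normalise the word-rate amplitude by neighbouring frequency bins, a common
# way to quantify a frequency-tagged response.
target = int(np.argmin(np.abs(freqs - word_rate)))
neighbours = np.r_[target - 3:target - 1, target + 2:target + 4]
snr = spectrum[target] / spectrum[neighbours].mean()
print(f"word-rate bin: {freqs[target]:.2f} Hz, SNR: {snr:.2f}")
```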

2020
Author(s): Cheng Luo, Nai Ding

Speech contains rich acoustic and linguistic information. During speech comprehension, cortical activity tracks the acoustic envelope of speech. Recent studies also observe cortical tracking of higher-level linguistic units, such as words and phrases, using synthesized speech deprived of the delta-band acoustic envelope. It remains unclear, however, how cortical activity jointly encodes the acoustic and linguistic information in natural speech. Here, we investigate the neural encoding of words and demonstrate that delta-band cortical activity tracks the rhythm of multi-syllabic words when participants naturally listen to narratives. Furthermore, by dissociating the word rhythm from the acoustic envelope, we find that cortical activity primarily tracks the word rhythm during speech comprehension. When listeners’ attention is diverted, however, neural tracking of words diminishes, and delta-band activity becomes phase locked to the acoustic envelope. These results suggest that large-scale cortical dynamics in the delta band are primarily coupled to the rhythm of linguistic units during natural speech comprehension.
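As a rough illustration of the delta-band analysis described here (not the authors' pipeline; the filter settings, sampling rate, and signals below are assumed placeholders), one can band-pass the EEG to the delta range and compare its phase locking to the acoustic envelope versus a word-rhythm regressor.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 100.0
b, a = butter(4, [1.0, 4.0], btype="bandpass", fs=fs)   # delta band, 1-4 Hz

def plv(x, y):
    """Phase-locking value between two equally long signals."""
    phase_diff = np.angle(hilbert(x)) - np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * phase_diff)))

eeg = np.random.randn(6000)            # placeholder single-channel EEG (60 s)
envelope = np.random.randn(6000)       # placeholder acoustic envelope, same rate
word_rhythm = np.random.randn(6000)    # placeholder word-onset regressor

eeg_delta = filtfilt(b, a, eeg)
print("PLV with envelope:   ", plv(eeg_delta, filtfilt(b, a, envelope)))
print("PLV with word rhythm:", plv(eeg_delta, filtfilt(b, a, word_rhythm)))
```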


Author(s): Alvin Cheng-Hsien Chen

In this study, we aim to demonstrate the effectiveness of network science in exploring the emergence of constructional semantics from the connectedness and relationships between linguistic units. With Mandarin locative constructions (MLCs) as a case study, we extracted constructional tokens from a representative corpus, including their respective space particles (SPs) and the head nouns of the landmarks (LMs), which constitute the nodes of the network. We computed edges based on the lexical similarities of word embeddings learned from large text corpora and the SP-LM contingency from collostructional analysis. We address three issues: (1) For each LM, how prototypical is it of the meaning of the SP? (2) For each SP, how semantically cohesive are its LM exemplars? (3) What are the emerging semantic fields from the constructional network of MLCs? We address these questions by examining the quantitative properties of the network at three levels: microscopic (i.e., node centrality and local clustering coefficient), mesoscopic (i.e., community structure), and macroscopic (i.e., small-worldness and scale-free structure). Our network analyses bring to the foreground the importance of repeated language experiences in the shaping and entrenchment of linguistic knowledge.
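A minimal sketch of this kind of constructional network analysis with networkx is given below; the toy landmark nouns, random embeddings, and similarity threshold are assumptions, whereas the original study derives edge weights from corpus-based word embeddings and SP-LM collostructional contingency.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
landmarks = ["table", "road", "river", "mountain", "room", "box", "street", "hill"]  # toy LM head nouns
emb = {w: rng.normal(size=50) for w in landmarks}       # placeholder word embeddings

G = nx.Graph()
G.add_nodes_from(landmarks)
for i, u in enumerate(landmarks):
    for v in landmarks[i + 1:]:
        sim = float(np.dot(emb[u], emb[v]) / (np.linalg.norm(emb[u]) * np.linalg.norm(emb[v])))
        if sim > 0.0:                                    # assumed similarity threshold for an edge
            G.add_edge(u, v, weight=sim)

# Microscopic properties: node centrality and local clustering coefficient
centrality = nx.degree_centrality(G)
clustering = nx.clustering(G, weight="weight")

# Mesoscopic property: community structure, read as emerging semantic fields
communities = nx.algorithms.community.greedy_modularity_communities(G, weight="weight")

# Macroscopic properties: ingredients of small-world indices
if nx.is_connected(G):
    print("average shortest path:", nx.average_shortest_path_length(G))
print("transitivity:", nx.transitivity(G))
print("communities:", [sorted(c) for c in communities])
print("most central landmark:", max(centrality, key=centrality.get))
```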


2009, Vol 102 (3), pp. 1606-1622
Author(s): Paweł Kuśmierek, Josef P. Rauschecker

Responses of neural units in two areas of the medial auditory belt (middle medial area [MM] and rostral medial area [RM]) were tested with tones, noise bursts, monkey calls (MC), and environmental sounds (ES) in microelectrode recordings from two alert rhesus monkeys. For comparison, recordings were also performed from two core areas (primary auditory area [A1] and rostral area [R]) of the auditory cortex. All four fields showed cochleotopic organization, with best (center) frequency [BF(c)] gradients running in opposite directions in A1 and MM compared with R and RM. The medial belt, located medially to the core areas, was characterized by a stronger preference for band-pass noise than for pure tones. Response latencies were shorter for the two more posterior (middle) areas MM and A1 than for the two rostral areas R and RM, reaching values as low as 6 ms for high BF(c) in MM and A1, and strongly depended on BF(c). The medial belt areas exhibited a higher selectivity to all stimuli, in particular to noise bursts, than the core areas. An increased selectivity to tones and noise bursts was also found in the anterior fields; the opposite was true for highly temporally modulated ES. Analysis of the structure of neural responses revealed that neurons were driven by low-level acoustic features in all fields. Thus medial belt areas RM and MM have to be considered early stages of auditory cortical processing. The anteroposterior difference in temporal processing indices suggests that R and RM may belong to a different hierarchical level or a different computational network than A1 and MM.


2020, Vol 32 (12), pp. 2260-2271
Author(s): Cécile J. Bouvet, Benoît G. Bardy, Peter E. Keller, Simone Dalla Bella, Sylvie Nozaradan, ...

Human rhythmic movements spontaneously synchronize with auditory rhythms at various frequency ratios. The emergence of more complex relationships—for instance, frequency ratios of 1:2 and 1:3—is enhanced by adding a congruent accentuation pattern (binary for 1:2 and ternary for 1:3), resulting in a 1:1 movement–accentuation relationship. However, this benefit of accentuation on movement synchronization appears to be stronger for the ternary pattern than for the binary pattern. Here, we investigated whether this difference in accent-induced movement synchronization may be related to a difference in the neural tracking of these accentuation profiles. Accented and control unaccented auditory sequences were presented to participants who concurrently produced finger taps at their preferred frequency, and spontaneous movement synchronization was measured. EEG was recorded during passive listening to each auditory sequence. The results revealed that enhanced movement synchronization with ternary accentuation was accompanied by enhanced neural tracking of this pattern. Larger EEG responses at the accentuation frequency were found for the ternary pattern compared with the binary pattern. Moreover, the amplitude of accent-induced EEG responses was positively correlated with the magnitude of accent-induced movement synchronization across participants. Altogether, these findings show that the dynamics of spontaneous auditory–motor synchronization are strongly driven by the multi-time-scale sensory processing of auditory rhythms, highlighting the importance of considering neural responses to rhythmic sequences for understanding and enhancing synchronization performance.
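A small sketch (with simulated numbers, not the study's data) of the across-participant correlation reported here: the accent-induced gain in the EEG response at the accentuation frequency is correlated with the accent-induced gain in movement synchronization.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_participants = 20                                        # assumed sample size
# Accented minus unaccented EEG amplitude at the accentuation frequency (simulated)
eeg_gain = rng.normal(1.0, 0.3, n_participants)
# Accented minus unaccented movement synchronization (simulated to co-vary with the EEG gain)
sync_gain = 0.5 * eeg_gain + rng.normal(0.0, 0.2, n_participants)

r, p = pearsonr(eeg_gain, sync_gain)
print(f"across-participant correlation: r = {r:.2f}, p = {p:.3f}")
```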


eLife, 2020, Vol 9
Author(s): Peiqing Jin, Yuhan Lu, Nai Ding

Chunking is a key mechanism for sequence processing. Studies on speech sequences have suggested low-frequency cortical activity tracks spoken phrases, that is, chunks of words defined by tacit linguistic knowledge. Here, we investigate whether low-frequency cortical activity reflects a general mechanism for sequence chunking and can track chunks defined by temporarily learned artificial rules. The experiment records magnetoencephalographic (MEG) responses to a sequence of spoken words. To dissociate word properties from the chunk structures, two tasks separately require listeners to group pairs of semantically similar or semantically dissimilar words into chunks. In the MEG spectrum, a clear response is observed at the chunk rate. More importantly, the chunk-rate response is task-dependent. It is phase locked to chunk boundaries rather than driven by the semantic relatedness between words. The results strongly suggest that cortical activity can track chunks constructed based on task-related rules and potentially reflects a general mechanism for chunk-level representations.
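One way to test phase locking to chunk boundaries is inter-trial phase coherence at the chunk rate; the sketch below (with assumed rates and placeholder data, not the authors' MEG pipeline) shows the computation for a single sensor.

```python
import numpy as np

fs = 100.0                            # sampling rate in Hz (assumed)
chunk_rate = 1.0                      # two-word chunks at a 2 Hz word rate -> 1 Hz (assumed)
trials = np.random.randn(60, 1000)    # placeholder: 60 trials x 10 s, one MEG sensor

# Fourier coefficient at the chunk rate for each trial; the inter-trial coherence (ITC)
# is the magnitude of the mean unit-normalised coefficient (1 = perfect phase locking).
t = np.arange(trials.shape[1]) / fs
coeff = trials @ np.exp(-2j * np.pi * chunk_rate * t)
itc = np.abs(np.mean(coeff / np.abs(coeff)))
print(f"inter-trial coherence at {chunk_rate:.1f} Hz: {itc:.2f}")
```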


2021, Vol 72, pp. 1343-1384
Author(s): Vassilina Nikoulina, Maxat Tezekbayev, Nuradil Kozhakhmet, Madina Babazhanova, Matthias Gallé, ...

There is an ongoing debate in the NLP community whether modern language models contain linguistic knowledge, recovered through so-called probes. In this paper, we study whether linguistic knowledge is a necessary condition for the good performance of modern language models, which we call the rediscovery hypothesis. First, we show that language models that are significantly compressed but perform well on their pretraining objectives retain good scores when probed for linguistic structures. This result supports the rediscovery hypothesis and leads to the second contribution of our paper: an information-theoretic framework that relates language modeling objectives to linguistic information. This framework also provides a metric to measure the impact of linguistic information on the word prediction task. We reinforce our analytical results with various experiments, both on synthetic and on real NLP tasks in English.
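The probing setup referenced here can be sketched as a linear classifier trained on frozen model representations; in the toy example below, the random "activations" and labels are placeholders standing in for representations of a pretrained or compressed language model and a linguistic annotation such as part-of-speech tags.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_tokens, dim, n_tags = 2000, 256, 12
reps = rng.normal(size=(n_tokens, dim))            # placeholder frozen LM activations
tags = rng.integers(0, n_tags, size=n_tokens)      # placeholder linguistic labels (e.g., POS)

# Train a linear probe on the frozen representations; probe accuracy is the
# quantity compared across (compressed) models.
X_train, X_test, y_train, y_test = train_test_split(reps, tags, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```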


2021
Author(s): Nicholas J Audette, WenXi Zhou, David M Schneider

Many of the sensations experienced by an organism are caused by its own actions, and accurately anticipating both the sensory features and timing of self-generated stimuli is crucial to a variety of behaviors. In the auditory cortex, neural responses to self-generated sounds exhibit frequency-specific suppression, suggesting that movement-based predictions may be implemented early in sensory processing. Yet it remains unknown whether this modulation results from a behaviorally specific and temporally precise prediction, nor is it known whether corresponding expectation signals are present locally in the auditory cortex. To address these questions, we trained mice to expect the precisely timed acoustic outcome of a forelimb movement using a closed-loop sound-generating lever. Dense neuronal recordings in the auditory cortex revealed suppression of responses to self-generated sounds that was specific to the expected acoustic features, specific to a precise time within the movement, and specific to the movement that was coupled to sound during training. Predictive suppression was concentrated in L2/3 and L5, where deviations from expectation also recruited a population of prediction-error neurons that was otherwise unresponsive. Recording in the absence of sound revealed abundant movement signals in deep layers that were biased toward neurons tuned to the expected sound, as well as temporal expectation signals that were present throughout the cortex and peaked at the time of expected auditory feedback. Together, these findings reveal that predictive processing in the mouse auditory cortex is consistent with a learned internal model linking a specific action to its temporally precise acoustic outcome, while identifying distinct populations of neurons that anticipate expected stimuli and differentially process expected versus unexpected outcomes.


2021
Author(s): Octave Etard, Rémy Ben Messaoud, Gabriel Gaugain, Tobias Reichenbach

Speech and music are spectro-temporally complex acoustic signals that are highly relevant for humans. Both contain a temporal fine structure that is encoded in the neural responses of subcortical and cortical processing centres. The subcortical response to the temporal fine structure of speech has recently been shown to be modulated by selective attention to one of two competing voices. Music similarly often consists of several simultaneous melodic lines, and a listener can selectively attend to a particular one at a time. However, the neural mechanisms that enable such selective attention remain largely enigmatic, not least since most investigations to date have focussed on short and simplified musical stimuli. Here we study the neural encoding of classical musical pieces in human volunteers, using scalp electroencephalography (EEG) recordings. We presented volunteers with continuous musical pieces composed of one or two instruments. In the latter case, the participants were asked to selectively attend to one of the two competing instruments and to perform a vibrato identification task. We used linear encoding and decoding models to relate the recorded EEG activity to the stimulus waveform. We show that we can measure neural responses to the temporal fine structure of melodic lines played by a single instrument, at the population level as well as for most individual subjects. The neural response peaks at a latency of 7.6 ms and is not measurable past 15 ms. When analysing the neural responses elicited by competing instruments, we find no evidence of attentional modulation. Our results show that, much like speech, the temporal fine structure of music is tracked by neural activity. In contrast to speech, however, this response appears unaffected by selective attention in the context of our experiment.
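The linear decoding (backward) model mentioned above can be sketched as time-lagged ridge regression from the EEG channels to the stimulus waveform; the shapes, lag window, and regularisation below are assumptions rather than the authors' settings.

```python
import numpy as np
from sklearn.linear_model import Ridge

fs = 1000                                # sampling rate in Hz (assumed; fine structure needs a high rate)
n_samples, n_channels = 20000, 64
eeg = np.random.randn(n_samples, n_channels)    # placeholder EEG (20 s x 64 channels)
stimulus = np.random.randn(n_samples)           # placeholder stimulus-waveform feature

max_lag = 15                             # decode each stimulus sample from EEG 0-15 ms later (assumed window)
X = np.column_stack([np.roll(eeg, -lag, axis=0) for lag in range(max_lag + 1)])
X, y = X[: n_samples - max_lag], stimulus[: n_samples - max_lag]

model = Ridge(alpha=1.0).fit(X, y)       # regularised linear backward (decoding) model
reconstruction = model.predict(X)
print("reconstruction accuracy (r):", np.corrcoef(reconstruction, y)[0, 1])
```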


2013, Vol 416-417, pp. 1552-1557
Author(s): Xiao Xu Hu

Hypothesis combination is a main method for improving the performance of machine translation (MT) systems. State-of-the-art strategies include sentence-level and word-level methods, each of which has its own advantages and disadvantages. Moreover, current strategies depend mainly on statistical methods, with little guidance from rich linguistic knowledge. This paper proposes a hybrid framework that combines the strengths of the sentence-level and word-level methods. In the word-level stage, the method selects well-translated words according to each word's part of speech and the translation ability, for that part of speech, of the MT system that generated the word. Experimental results with different MT systems prove the effectiveness of this approach.
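A toy sketch of the word-level selection idea (the per-POS reliability scores, alignment, and candidate words below are invented placeholders, not the paper's data): at each aligned position, keep the word from the system whose estimated translation quality for that word's part of speech is highest.

```python
# Per-system, per-POS translation-quality estimates (invented placeholder values)
reliability = {
    "sysA": {"NOUN": 0.82, "VERB": 0.64, "ADJ": 0.71},
    "sysB": {"NOUN": 0.75, "VERB": 0.78, "ADJ": 0.69},
}

# Word-aligned candidates: (word, POS) from each system at each position (toy example)
candidates = [
    {"sysA": ("economy", "NOUN"), "sysB": ("economics", "NOUN")},
    {"sysA": ("grow", "VERB"), "sysB": ("grows", "VERB")},
]

combined = []
for position in candidates:
    # Keep the word from the system judged most reliable for this part of speech
    best_system = max(position, key=lambda s: reliability[s][position[s][1]])
    combined.append(position[best_system][0])

print(" ".join(combined))
```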

