Neural tracking as an objective measure of auditory perception and speech intelligibility

2021
Author(s): Jana Van Canneyt, Marlies Gillis, Jonas Vanthornhout, Tom Francart

The neural tracking framework enables the analysis of neural responses (EEG) to continuous natural speech, e.g., a story or a podcast. This allows for objective investigation of a range of auditory and linguistic processes in the brain during natural speech perception. This approach is more ecologically valid than traditional auditory evoked responses and has great potential for both research and clinical applications. In this article, we review the neural tracking framework and highlight three prominent examples of neural tracking analyses: neural tracking of the fundamental frequency of the voice (f0), of the speech envelope, and of linguistic features. Each of these analyses provides a unique point of view on the hierarchical stages of speech processing in the human brain. f0-tracking assesses the encoding of fine temporal information in the early stages of the auditory pathway, i.e., from the auditory periphery up to early processing in the primary auditory cortex. This fundamental processing in (mostly) subcortical stages forms the foundation of speech perception in the cortex. Envelope tracking reflects bottom-up and top-down speech-related processes in the auditory cortex and is likely necessary, but not sufficient, for speech intelligibility. To study neural processes more directly related to speech intelligibility, neural tracking of linguistic features can be used. This analysis focuses on the encoding of linguistic features (e.g., word or phoneme surprisal) in the brain. Together, these analyses form a multi-faceted and time-efficient objective assessment of an individual's auditory and linguistic processing.
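
As an illustration of the envelope-tracking idea reviewed above, here is a minimal Python sketch of a backward model: a ridge-regression decoder reconstructs the speech envelope from time-lagged multichannel EEG, and the reconstruction is scored by its correlation with the acoustic envelope. Variable names, the lag range, and the regularization value are illustrative assumptions, not the authors' pipeline.

```python
# Minimal sketch of a backward model for envelope tracking (assumed
# names and parameters): reconstruct the speech envelope from
# time-lagged multichannel EEG with ridge regression, then correlate
# the reconstruction with the acoustic envelope.
import numpy as np
from scipy.signal import hilbert
from scipy.stats import pearsonr

def acoustic_envelope(audio):
    """Amplitude envelope of the speech signal via the Hilbert transform."""
    return np.abs(hilbert(audio))

def lagged(eeg, max_lag):
    """Stack 0..max_lag sample delays of each EEG channel as features."""
    n, ch = eeg.shape
    X = np.zeros((n, ch * (max_lag + 1)))
    for lag in range(max_lag + 1):
        X[lag:, lag * ch:(lag + 1) * ch] = eeg[:n - lag]
    return X

def train_decoder(eeg, envelope, max_lag=32, ridge=1e3):
    """Ridge-regression decoder weights mapping lagged EEG to envelope."""
    X = lagged(eeg, max_lag)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]),
                           X.T @ envelope)

# Hypothetical, time-aligned data: eeg_* is samples x channels sampled
# at the same rate as audio_*.
# w = train_decoder(eeg_train, acoustic_envelope(audio_train))
# r, _ = pearsonr(lagged(eeg_test, 32) @ w, acoustic_envelope(audio_test))
```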

2018
Author(s): Eline Verschueren, Jonas Vanthornhout, Tom Francart

ABSTRACT
Objectives: Recently, an objective measure of speech intelligibility based on brain responses derived from the electroencephalogram (EEG) has been developed using isolated Matrix sentences as the stimulus. We investigated whether this objective measure of speech intelligibility can also be used with natural speech as the stimulus, as this would be beneficial for clinical applications.
Design: We recorded the EEG in 19 normal-hearing participants while they listened to two types of stimuli: Matrix sentences and a natural story. Each stimulus was presented at different levels of speech intelligibility by adding speech-weighted noise. Speech intelligibility was assessed in two ways for both stimuli: (1) behaviorally and (2) objectively, by reconstructing the speech envelope from the EEG with a linear decoder and correlating it with the acoustic envelope. We also calculated temporal response functions (TRFs) to investigate the temporal characteristics of the brain responses in the EEG channels covering different brain areas.
Results: For both stimulus types, the correlation between the speech envelope and the reconstructed envelope increased with increasing speech intelligibility. In addition, correlations were higher for the natural story than for the Matrix sentences. As in the linear decoder analysis, TRF amplitudes increased with increasing speech intelligibility for both stimuli. Remarkably, although speech intelligibility remained unchanged between the no-noise and +2.5 dB SNR conditions, neural speech processing was affected by the addition of this small amount of noise: TRF amplitudes across the entire scalp decreased between 0 and 150 ms, while amplitudes between 150 and 200 ms increased in the presence of noise. TRF latency changes as a function of speech intelligibility appeared to be stimulus specific: the latency of the prominent negative peak in the early responses (50-300 ms) increased with increasing speech intelligibility for the Matrix sentences, but remained unchanged for the natural story.
Conclusions: These results show (1) the feasibility of natural speech as a stimulus for the objective measure of speech intelligibility, (2) that neural tracking of speech is enhanced for a natural story compared with Matrix sentences, and (3) that noise and stimulus type can change the temporal characteristics of the brain responses. These results may reflect the integration of incoming acoustic features and top-down information, suggesting that the choice of stimulus should be based on the intended purpose of the measurement.
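
To make the TRF analysis concrete, here is a hedged sketch of a forward model: each EEG channel is regressed on time-lagged copies of the stimulus envelope with ridge regularization. The lag window and ridge parameter are placeholder choices, not values from the study.

```python
# Hedged sketch of a forward TRF: regress each EEG channel on
# time-lagged copies of the stimulus envelope (ridge-regularized).
# Lag window and ridge value are placeholders.
import numpy as np

def estimate_trf(envelope, eeg, fs, tmin=-0.05, tmax=0.4, ridge=1e2):
    """Return (lags in seconds, TRF weights of shape lags x channels)."""
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    n = len(envelope)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = envelope[:n - lag]
        else:
            X[:lag, j] = envelope[-lag:]
    weights = np.linalg.solve(X.T @ X + ridge * np.eye(len(lags)),
                              X.T @ eeg)  # eeg: samples x channels
    return lags / fs, weights
```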


2011
Vol. 105 (2), pp. 582–600
Author(s): Pingbo Yin, Jeffrey S. Johnson, Kevin N. O'Connor, Mitchell L. Sutter

Conflicting results have led to different views about how temporal modulation is encoded in primary auditory cortex (A1). Some studies find a substantial population of neurons that change firing rate without synchronizing to temporal modulation, whereas other studies fail to see these nonsynchronized neurons. As a result, the role and scope of synchronized temporal and nonsynchronized rate codes in amplitude modulation (AM) processing in A1 remain unresolved. We recorded the responses of A1 neurons in awake macaques to sinusoidally amplitude-modulated noise. We find that most neurons (37–78%) synchronize to at least one modulation frequency (MF) without exhibiting nonsynchronized responses. However, we find both exclusively nonsynchronized neurons (7–29%) and "mixed-mode" neurons (13–40%) that synchronize to at least one MF and fire nonsynchronously to at least one other. We introduce new measures of modulation encoding and temporal synchrony that can improve the analysis of how neurons encode temporal modulation, including comparing AM responses with responses to unmodulated sounds, and a vector strength measure that is suitable for single-trial analysis. Our data support a transformation from a temporally based population code for AM to a rate-based code as information ascends the auditory pathway. The number of mixed-mode neurons found in A1 indicates that this transformation is not yet complete, and A1 neurons may carry multiplexed temporal and rate codes.
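
For readers unfamiliar with vector strength, the sketch below computes the classical measure per trial and a phase-projected variant in the spirit of the paper's single-trial measure, weighting each trial's vector strength by the cosine of its phase deviation from the across-trial mean phase. The exact form of the published measure may differ; treat this as an illustration.

```python
# Illustrative single-trial synchrony measures (assumed details):
# classical vector strength per trial, plus a phase-projected variant
# that weights each trial's VS by cos(trial phase - mean phase).
import numpy as np

def vector_strength(spike_times, mod_freq):
    """Classical VS: resultant length of spike phases at mod_freq (0-1)."""
    t = np.asarray(spike_times, float)
    if t.size == 0:
        return 0.0
    return float(np.abs(np.mean(np.exp(2j * np.pi * mod_freq * t))))

def phase_projected_vs(trials, mod_freq):
    """Per-trial VS projected onto the across-trial mean phase."""
    vs, phase = [], []
    for t in trials:  # each trial: array of spike times (assumed non-empty)
        z = np.mean(np.exp(2j * np.pi * mod_freq * np.asarray(t, float)))
        vs.append(np.abs(z))
        phase.append(np.angle(z))
    mean_phase = np.angle(np.mean(np.exp(1j * np.array(phase))))
    return np.array(vs) * np.cos(np.array(phase) - mean_phase)
```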


2009
Vol. 102 (3), pp. 1483–1490
Author(s): Francois D. Szymanski, Jose A. Garcia-Lazaro, Jan W. H. Schnupp

Neurons in primary auditory cortex (A1) are known to exhibit stimulus-specific adaptation (SSA): when tested with pure tones, they respond more strongly to a particular frequency if it is presented as a rare, unexpected "oddball" stimulus than when the same stimulus forms part of a series of common, "standard" stimuli. Although SSA has occasionally been observed in midbrain neurons that form part of the paralemniscal auditory pathway, it is thought to be weak, rare, or nonexistent among neurons of the lemniscal pathway that provide the main afferent input to A1, so the SSA seen in A1 is likely generated within A1 by local mechanisms. To study the contributions that neural processing within the different cytoarchitectonic layers of A1 may make to SSA, we recorded local field potentials in rat A1 in response to standard and oddball tones and subjected these to current source density analysis. Although our results show that SSA can be observed throughout all layers of A1, right from the earliest part of the response, there are nevertheless significant differences between layers: SSA becomes significantly stronger as stimulus-related activity passes from the main thalamorecipient layers III and IV to layer V.
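
Current source density analysis boils down to the second spatial derivative of the laminar LFP profile; a minimal sketch follows. The contact spacing and conductivity values are placeholders.

```python
# Minimal current source density (CSD) sketch: second spatial
# derivative of the laminar LFP across equally spaced contacts.
# Spacing and conductivity are placeholder values.
import numpy as np

def csd(lfp, spacing_mm=0.1, sigma=0.3):
    """lfp: contacts x samples; returns CSD at interior contacts,
    -sigma * (phi(z+h) - 2*phi(z) + phi(z-h)) / h**2."""
    h = spacing_mm
    return -sigma * (lfp[2:] - 2.0 * lfp[1:-1] + lfp[:-2]) / h**2
```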


2014
Vol. 369 (1651), pp. 20130297
Author(s): Jeremy I. Skipper

What do we hear when someone speaks, and what does auditory cortex (AC) do with that sound? Given how meaningful speech is, it might be hypothesized that AC is most active when other people talk, so that their productions get decoded. Here, neuroimaging meta-analyses show the opposite: AC was least active, and sometimes deactivated, when participants listened to meaningful speech compared with less meaningful sounds. The results are explained by an active hypothesis-and-test mechanism in which speech production (SP) regions are neurally reused to predict auditory objects associated with the available context. On this model, more AC activity for less meaningful sounds occurs because predictions from context are less successful, requiring that further hypotheses be tested. This also explains the large overlap of AC co-activity for less meaningful sounds with meta-analyses of SP. An experiment showed a similar pattern of results for non-verbal context: words produced less activity in AC and SP regions when preceded by co-speech gestures that visually described those words than when presented without gestures. Collectively, the results suggest that what we 'hear' during real-world speech perception may come more from the brain than from our ears, and that the function of AC is to confirm or deny internal predictions about the identity of sounds.


2021
Author(s): Swapna Agarwalla, Sharba Bandyopadhyay

Syllable sequences in male mouse ultrasonic vocalizations (USVs), or songs, contain structure that can be quantified through predictability, like birdsong and aspects of speech. The apparent innateness of USVs and their lack of learnability have discounted mouse USVs as a model for speech-like social communication and its deficits. We theoretically extracted informative contextual natural sequences (SN), and female mice preferred them. Supragranular neurons in primary auditory cortex (A1) show differential selectivity to the same syllables embedded in SN versus random sequences (SR). Excitatory neurons (EXNs) in females showed increased selectivity to whole SNs over SRs depending on the extent of social exposure to males, while syllable selectivity remained unchanged. Thus, single A1 neurons adaptively represent the entire order of acoustic units without altering the selectivity to individual units, a property fundamental to speech perception. Additionally, the observed plasticity was replicated by silencing somatostatin-positive neurons, which showed plastic effects opposite to those of EXNs, pointing to possible pathways involved in the perception of sound sequences.
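
The abstract quantifies sequence structure through predictability; one generic way to do that, shown below, is to fit first-order Markov transition probabilities to syllable sequences and score a sequence by its mean per-syllable surprisal. This is an illustrative approach, not necessarily how the SN stimuli were derived.

```python
# Generic illustration of quantifying sequence structure via
# predictability: fit first-order Markov transition probabilities and
# score a sequence by mean surprisal (-log2 P(next | current)).
# Not necessarily the paper's method for deriving SN.
import numpy as np
from collections import defaultdict

def fit_transitions(sequences):
    """Transition probabilities from a corpus of syllable sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}

def mean_surprisal(seq, p, floor=1e-6):
    """Lower mean surprisal = more predictable sequence."""
    s = [-np.log2(p.get(a, {}).get(b, floor)) for a, b in zip(seq, seq[1:])]
    return float(np.mean(s))

# e.g. p = fit_transitions([list("ABABCD"), list("ABABAB")])
#      mean_surprisal(list("ABAB"), p)
```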


2020
Author(s): Gavin M. Bidelman, Sara Momtaz

ABSTRACT
Scalp-recorded frequency-following responses (FFRs) reflect a mixture of phase-locked activity across the auditory pathway. FFRs have been widely used as a neural barometer of complex listening skills, especially speech-in-noise (SIN) perception. Applying individually optimized source reconstruction to speech FFRs recorded via EEG (FFR_EEG), we assessed the relative contributions of subcortical [auditory nerve (AN), brainstem/midbrain (BS)] and cortical [bilateral primary auditory cortex (PAC)] source generators, with the aim of identifying which source(s) drive the brain-behavior relation between FFRs and SIN listening skills. We found that FFR strength declined precipitously from AN to PAC, consistent with diminishing phase locking along the ascending auditory neuraxis. FFRs at the speech fundamental frequency (F0) were robust to noise across sources but were largest in subcortical sources (BS > AN > PAC). PAC FFRs were only weakly observed above the noise floor, and only at the low pitch of speech (F0 ≈ 100 Hz). Brain-behavior regressions revealed that (i) AN and BS FFRs were sufficient to describe listeners' QuickSIN scores and (ii) contrary to neuromagnetic (MEG) FFRs, neither left nor right PAC FFR_EEG predicted SIN performance. Our preliminary findings suggest that subcortical sources dominate not only the electrical FFR but also the link between speech FFRs and SIN processing observed in previous EEG studies.
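
As a concrete illustration of quantifying FFR strength at the fundamental, the sketch below estimates the spectral amplitude at F0 relative to a noise floor taken from flanking frequency bins. The windowing and bin choices are assumptions, not the paper's method.

```python
# Illustrative measure of FFR strength at the speech fundamental:
# spectral amplitude at F0 relative to a noise floor estimated from
# flanking bins. Window and bin choices are assumptions.
import numpy as np

def ffr_f0_snr(ffr, fs, f0=100.0, noise_bw=20.0):
    """Amplitude at f0 divided by the mean amplitude 2-20 Hz away."""
    spec = np.abs(np.fft.rfft(ffr * np.hanning(len(ffr))))
    freqs = np.fft.rfftfreq(len(ffr), 1.0 / fs)
    k = np.argmin(np.abs(freqs - f0))
    flank = (np.abs(freqs - f0) > 2.0) & (np.abs(freqs - f0) < noise_bw)
    return float(spec[k] / spec[flank].mean())
```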


2020
Author(s): Emmanuel Biau, Danying Wang, Hyojin Park, Ole Jensen, Simon Hanslmayr

ABSTRACT
Audiovisual speech perception relies, among other things, on our expertise in mapping a speaker's lip movements onto speech sounds. This multimodal matching is facilitated by salient syllable features that align lip movements and acoustic envelope signals in the 4–8 Hz theta band. Although non-exclusive, the predominance of theta rhythms in speech processing has been firmly established by studies showing that neural oscillations track the acoustic envelope in the primary auditory cortex. Equivalently, theta oscillations in the visual cortex entrain to lip movements, and the auditory cortex is recruited during silent speech perception. These findings suggest that neuronal theta oscillations may play a functional role in organising information flow across visual and auditory sensory areas. We presented silent speech movies while participants performed a pure-tone detection task to test whether entrainment to lip movements directs the auditory system and drives behavioural outcomes. We showed that auditory detection varied with the ongoing theta phase conveyed by lip movements in the movies. In a complementary experiment presenting the same movies while recording participants' electroencephalogram (EEG), we found that silent lip movements entrained neural oscillations in the visual and auditory cortices, with the visual phase leading the auditory phase. These results support the idea that the visual cortex, entrained by lip movements, filtered the sensitivity of the auditory cortex via theta phase synchronisation.
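
The phase analysis described here can be sketched as follows: band-pass the lip-movement signal in the theta band, extract its instantaneous phase with the Hilbert transform, and bin tone-detection performance by the phase at tone onset. Filter order, band edges, and bin count below are illustrative.

```python
# Sketch of the phase analysis: theta-band filter the lip signal, take
# its Hilbert phase, and bin detection performance by the phase at tone
# onset. Filter settings and bin count are illustrative.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def theta_phase(signal, fs, band=(4.0, 8.0)):
    """Instantaneous theta phase of a lip-movement time series."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)],
                  btype="bandpass")
    return np.angle(hilbert(filtfilt(b, a, signal)))

def hit_rate_by_phase(phase_at_tone, hit, n_bins=8):
    """Mean detection rate within each theta phase bin."""
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    idx = np.clip(np.digitize(phase_at_tone, edges) - 1, 0, n_bins - 1)
    hit = np.asarray(hit, float)
    return np.array([hit[idx == b].mean() if np.any(idx == b) else np.nan
                     for b in range(n_bins)])
```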


2011
Vol. 106 (2), pp. 849–859
Author(s): Edward L. Bartlett, Srivatsun Sadagopan, Xiaoqin Wang

The frequency resolution of neurons throughout the ascending auditory pathway is important for understanding how sounds are processed. In many animal studies, frequency tuning widths are about 1/5 octave in auditory nerve fibers and much wider in auditory cortex neurons. Psychophysical studies show that humans can discriminate far finer frequency differences. A recent study suggested that this is perhaps attributable to fine frequency tuning of neurons in human auditory cortex (Bitterman Y, Mukamel R, Malach R, Fried I, Nelken I. Nature 451: 197–201, 2008). We investigated whether such fine frequency tuning is restricted to human auditory cortex by examining frequency tuning widths in the awake common marmoset monkey. We show that 27% of neurons in primary auditory cortex exhibit frequency tuning finer than the typical tuning of the auditory nerve and substantially finer than previously reported cortical data obtained from anesthetized animals. Fine frequency tuning is also present in 76% of neurons of the auditory thalamus in awake marmosets. Tuning was narrower during the sustained response than during the onset response in auditory cortex neurons, but not in thalamic neurons, suggesting that thalamocortical or intracortical dynamics shape time-dependent frequency tuning in cortex. These findings challenge the notion that fine frequency tuning is unique to human auditory cortex and that it is a de novo cortical property, suggesting that the broader tuning observed in previous animal studies may arise from the use of anesthesia during physiological recordings or from species differences.
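
Frequency tuning width in octaves, the quantity compared across stages here, can be computed from a tuning curve as log2 of the ratio of the upper and lower frequencies at a criterion fraction of the peak response. A minimal sketch follows; the 50% criterion is an assumed choice.

```python
# Sketch of tuning width in octaves: bandwidth of the tuning curve at a
# criterion fraction of the peak rate, expressed as log2(f_high/f_low).
# The 50% criterion is an assumed choice.
import numpy as np

def tuning_width_octaves(freqs, rates, criterion=0.5):
    """Octave bandwidth where the response exceeds criterion * peak."""
    f = np.asarray(freqs, float)
    r = np.asarray(rates, float)
    above = f[r >= criterion * r.max()]
    return float(np.log2(above.max() / above.min()))
```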


1973
Vol. 38 (3), pp. 320–325
Author(s): Ronald R. Tasker, L. W. Organ

Auditory hallucinations were produced by electrical stimulation of the human upper brain stem during stereotaxic operations. The responses were confined to stimulation of the inferior colliculus, the brachium of the inferior colliculus, the medial geniculate body, and the auditory radiations. Anatomical confirmation of an auditory site was obtained in one patient. The hallucination produced was a low-pitched, nonspecific auditory "paresthesia," independent of the structure stimulated, the conditions of stimulation, and sonotopic factors. The effect was identical to that reported from stimulating the primary auditory cortex, and virtually all responses were contralateral. These observations have led to the following generalizations concerning electrical stimulation of the somesthetic, auditory, vestibular, and visual pathways within the human brain stem: the hallucination induced in each is the response to comparable conditions of stimulation; it is nonspecific, independent of stimulation site, confined to the primary pathway concerned, and chiefly contralateral; and it is identical to that induced by stimulating the corresponding primary sensory cortex. No sensory responses are found in the brain stem corresponding to those from the sensory association cortex.

