Multiscale integration organizes hierarchical computation in human auditory cortex

Author(s):  
Sam V Norman-Haignere ◽  
Laura K. Long ◽  
Orrin Devinsky ◽  
Werner Doyle ◽  
Ifeoma Irobunda ◽  
...  

Abstract: To derive meaning from sound, the brain must integrate information across tens (e.g. phonemes) to hundreds (e.g. words) of milliseconds, but the neural computations that enable multiscale integration remain unclear. Prior evidence suggests that human auditory cortex analyzes sound using both generic acoustic features (e.g. spectrotemporal modulation) and category-specific computations, but how these putatively distinct computations integrate temporal information is unknown. To answer this question, we developed a novel method to estimate neural integration periods and applied the method to intracranial recordings from human epilepsy patients. We show that integration periods increase three-fold as one ascends the auditory cortical hierarchy. Moreover, we find that electrodes with short integration periods (~50-150 ms) respond selectively to spectrotemporal modulations, while electrodes with long integration periods (~200-300 ms) show prominent selectivity for sound categories such as speech and music. These findings reveal how multiscale temporal analysis organizes hierarchical computation in human auditory cortex.
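The abstract does not spell out the estimation method, but the logic it implies (a response that integrates over at most T milliseconds should be invariant to acoustic context outside a T-millisecond window) can be illustrated on synthetic data. In the minimal Python sketch below, the boxcar "neuron", the sampling rate, and every parameter are illustrative assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 100  # neural sampling rate in Hz, i.e. 10 ms bins (illustrative)

def simulate_response(stimulus, integration_ms):
    """Toy neuron: boxcar-average the stimulus over its integration period."""
    width = max(1, int(integration_ms / 1000 * fs))
    return np.convolve(stimulus, np.ones(width) / width, mode="same")

def cross_context_corr(segment, integration_ms, context_len=200, n_contexts=20):
    """Mean pairwise correlation of responses to one segment across contexts."""
    responses = []
    for _ in range(n_contexts):
        # Embed the same segment in fresh random context on both sides.
        stim = np.concatenate([rng.standard_normal(context_len), segment,
                               rng.standard_normal(context_len)])
        resp = simulate_response(stim, integration_ms)
        responses.append(resp[context_len:context_len + segment.size])
    corrs = np.corrcoef(np.array(responses))
    return corrs[np.triu_indices(n_contexts, k=1)].mean()

# A 150 ms integrator: correlations are low for short segments (the response
# still reflects the surrounding context) and approach 1 for long ones.
for ms in (50, 100, 200, 400, 800):
    segment = rng.standard_normal(int(ms / 1000 * fs))
    print(f"{ms:3d} ms segment: cross-context r = "
          f"{cross_context_corr(segment, integration_ms=150):.2f}")
```

Reading off the shortest segment duration at which the cross-context correlation saturates gives a rough integration-period estimate; here it rises sharply once segments exceed the simulated 150 ms window.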

2016 ◽  
Author(s):  
Liberty S. Hamilton ◽  
Erik Edwards ◽  
Edward F. Chang

Abstract: To derive meaning from speech, we must extract multiple dimensions of concurrent information from incoming speech signals, including phonetic and prosodic cues. Equally important is the detection of acoustic cues that give structure and context to the information we hear, such as sentence boundaries. How the brain organizes this information processing is unknown. Here, using data-driven computational methods on an extensive set of high-density intracranial recordings, we reveal a large-scale partitioning of the entire human speech cortex into two spatially distinct regions that detect important cues for parsing natural speech. These caudal (Zone 1) and rostral (Zone 2) regions work in parallel to detect onsets and prosodic information, respectively, within naturally spoken sentences. In contrast, local processing within each region supports phonetic feature encoding. These findings demonstrate a previously unrecognized, fundamental organizational property of the human auditory cortex.
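The abstract does not name its data-driven method, so the sketch below stands in with one generic choice: soft-clustering electrode response profiles with non-negative matrix factorization (scikit-learn's NMF). The synthetic onset-like and sustained-like electrodes, the component count, and all parameters are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 200)              # 1 s of response at 200 Hz
onset = np.exp(-t / 0.08)               # transient, onset-like profile
sustained = 1 - np.exp(-t / 0.15)       # slow, sustained profile

# 40 synthetic electrodes: 20 dominated by each profile, plus noise.
w = np.concatenate([rng.uniform(0.7, 1.0, 20), rng.uniform(0.0, 0.3, 20)])
X = np.outer(w, onset) + np.outer(1 - w, sustained)
X = np.clip(X + 0.05 * rng.standard_normal(X.shape), 0, None)  # nonnegative

model = NMF(n_components=2, init="nndsvda", max_iter=1000)
W = model.fit_transform(X)              # electrode weights per component
H = model.components_                   # recovered component time courses

zone = W.argmax(axis=1)                 # hard-assign electrodes to a "zone"
print("electrodes per zone:", np.bincount(zone, minlength=2))
```

Assigning each electrode to its dominant component recovers the two planted response families; on real recordings the analogous assignment would be inspected against electrode locations to reveal any spatial partitioning.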


PLoS ONE ◽  
2009 ◽  
Vol 4 (4) ◽  
pp. e5183 ◽  
Author(s):  
David L. Woods ◽  
G. Christopher Stecker ◽  
Teemu Rinne ◽  
Timothy J. Herron ◽  
Anthony D. Cate ◽  
...  

2019 ◽  
Vol 5 (11) ◽  
pp. eaay6279 ◽  
Author(s):  
Yulia Oganian ◽  
Edward F. Chang

The most salient acoustic features in speech are the modulations in its intensity, captured by the amplitude envelope. Perceptually, the envelope is necessary for speech comprehension. Yet, the neural computations that represent the envelope, and their linguistic implications, are heavily debated. We used high-density intracranial recordings while participants listened to speech to determine how the envelope is represented in human speech cortical areas on the superior temporal gyrus (STG). We found that a well-defined zone in middle STG detects acoustic onset edges (local maxima in the envelope's rate of change). Acoustic analyses demonstrated that the timing of acoustic onset edges cues the onset of syllabic nuclei, while their slope cues syllabic stress. Synthesized amplitude-modulated tone stimuli showed that steeper slopes elicited greater responses, confirming cortical encoding of amplitude change rather than absolute amplitude. Overall, STG encoding of the timing and magnitude of acoustic onset edges underlies the perception of speech temporal structure.
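The onset-edge definition here (local maxima in the envelope's rate of change) is concrete enough to sketch directly. In the toy Python example below, the Hilbert-magnitude envelope, the 10 Hz low-pass, the 440 Hz tone bursts, and the peak-picking thresholds are all assumptions chosen for illustration, not the authors' parameters:

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt, find_peaks

fs = 16000
t = np.arange(0, 1.0, 1 / fs)

# Toy stimulus: three 440 Hz tone bursts with increasingly slow rise times.
x = np.zeros_like(t)
for start, rise in [(0.1, 0.005), (0.4, 0.02), (0.7, 0.08)]:
    ramp = np.clip((t - start) / rise, 0, 1) * (t < start + 0.2)
    x += ramp * np.sin(2 * np.pi * 440 * t)

# Amplitude envelope: magnitude of the analytic signal, low-passed at 10 Hz.
env = np.abs(hilbert(x))
env = sosfiltfilt(butter(2, 10, btype="low", output="sos", fs=fs), env)

# Onset edges = local maxima of the envelope's rate of change; the peak
# height is the slope cue the abstract relates to syllabic stress.
denv = np.gradient(env, 1 / fs)
peaks, _ = find_peaks(denv, height=0.2 * denv.max(), distance=int(0.1 * fs))
for p in peaks:
    print(f"onset edge at {1000 * t[p]:5.0f} ms, slope = {denv[p]:.1f} amp/s")
```

The printed slope at each detected edge is the quantity the abstract links to syllabic stress; steeper rise times yield proportionally larger derivative peaks.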


2021 ◽  
Author(s):  
Na Xu ◽  
Baotian Zhao ◽  
Lu Luo ◽  
Kai Zhang ◽  
Xiaoqiu Shao ◽  
...  

The envelope is essential for speech perception. Recent studies have shown that cortical activity can track the acoustic envelope. However, whether tracking strength reflects the extent of speech-intelligibility processing remains controversial. Here, using stereo-electroencephalography (sEEG), we directly recorded activity in human auditory cortex while subjects listened to either natural or noise-vocoded speech. The two stimuli have approximately identical envelopes, but the noise-vocoded speech is unintelligible. We found two stages of envelope tracking in auditory cortex: an early high-γ (60-140 Hz) power stage (delay ≈ 49 ms) that preferred the noise-vocoded speech, and a late θ (4-8 Hz) phase stage (delay ≈ 178 ms) that preferred the natural speech. Furthermore, the decoding performance of high-γ power was better in primary than in non-primary auditory cortex, consistent with its short tracking delay. We also found distinct lateralization effects: high-γ power envelope tracking was dominant in left auditory cortex, while θ phase showed better decoding performance in right auditory cortex. In sum, we suggest a functional dissociation between high-γ power and θ phase: the former reflects fast, automatic processing of brief acoustic features, while the latter reflects slower build-up processing facilitated by speech intelligibility.
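Both tracking measures contrasted here reduce to a band-pass filter followed by a Hilbert transform, with power taken from the magnitude and phase from the angle. The sketch below uses the band edges from the abstract (60-140 Hz, 4-8 Hz) but synthetic data; the simulated 50 ms lag, the filter design, and the simple lagged-correlation delay estimate are illustrative assumptions:

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

fs = 1000
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(2)

def bandpass(x, lo, hi):
    return sosfiltfilt(butter(4, [lo, hi], btype="band", output="sos", fs=fs), x)

# Synthetic speech envelope (slow random fluctuation scaled to [0, 1]) and a
# fake neural signal whose high-gamma amplitude follows it with a 50 ms lag.
envelope = bandpass(rng.standard_normal(t.size), 1, 8)
envelope = (envelope - envelope.min()) / np.ptp(envelope)
neural = (np.roll(envelope, int(0.05 * fs))
          * bandpass(rng.standard_normal(t.size), 60, 140)
          + 0.05 * rng.standard_normal(t.size))

# The two measures from the abstract: band-pass plus Hilbert transform, with
# power from the magnitude (high-gamma) and phase from the angle (theta).
hg_power = np.abs(hilbert(bandpass(neural, 60, 140)))
theta_phase = np.angle(hilbert(bandpass(neural, 4, 8)))

def tracking_delay_ms(feature, stim, max_ms=300):
    """Lag (ms) at which the feature best correlates with the envelope."""
    f = (feature - feature.mean()) / feature.std()
    s = (stim - stim.mean()) / stim.std()
    corr = [np.mean(f[l:] * s[:s.size - l]) for l in range(int(max_ms / 1000 * fs))]
    return 1000 * int(np.argmax(corr)) / fs

print(f"high-gamma tracking delay ≈ {tracking_delay_ms(hg_power, envelope):.0f} ms")
```

A phase-based measure like the θ-stage result would additionally need circular statistics (e.g. phase-locking values) rather than the plain correlation used for power; that step is omitted from this sketch.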


2008 ◽  
Vol 28 (52) ◽  
pp. 14301-14310 ◽  
Author(s):  
J. Besle ◽  
C. Fischer ◽  
A. Bidet-Caulet ◽  
F. Lecaignard ◽  
O. Bertrand ◽  
...  
