Modulating Cortical Instrument Representations During Auditory Stream Segregation and Integration With Polyphonic Music

2021 ◽  
Vol 15 ◽  
Author(s):  
Lars Hausfeld ◽  
Niels R. Disbergen ◽  
Giancarlo Valente ◽  
Robert J. Zatorre ◽  
Elia Formisano

Numerous neuroimaging studies have demonstrated that the auditory cortex tracks ongoing speech and that, in multi-speaker environments, tracking of the attended speaker is enhanced compared to that of irrelevant speakers. In contrast to speech, multi-instrument music can be appreciated by attending not only to its individual entities (i.e., segregation) but also to multiple instruments simultaneously (i.e., integration). We investigated the neural correlates of these two modes of music listening using electroencephalography (EEG) and sound envelope tracking. To this end, we presented uniquely composed music pieces played by two instruments, a bassoon and a cello, in combination with a previously validated music auditory scene analysis behavioral paradigm (Disbergen et al., 2018). Similar to results obtained through selective listening tasks for speech, relevant instruments could be reconstructed better than irrelevant ones during the segregation task. A delay-specific analysis showed higher reconstruction accuracy for the relevant instrument during a middle-latency window for both the bassoon and cello, and during a late window for the bassoon. During the integration task, we did not observe significant attentional modulation when reconstructing the overall music envelope. Subsequent analyses indicated that this null result might be due to the heterogeneous strategies listeners employ during the integration task. Overall, our results suggest that, subsequent to a common processing stage, top-down modulations consistently enhance the relevant instrument’s representation during an instrument segregation task, whereas no such enhancement is observed during an instrument integration task. These findings extend previous results from speech tracking to the tracking of multi-instrument music and, furthermore, inform current theories of polyphonic music perception.
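
As an illustration of the envelope-tracking analysis described above, the following is a minimal sketch of a linear backward ("stimulus reconstruction") model: ridge-regularized least squares maps time-lagged multichannel EEG onto a sound envelope, and reconstruction accuracy is the correlation between decoded and true envelopes. Function names, the lag range, and the ridge parameter are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

def lagged(eeg, n_lags):
    """Stack time-lagged copies of the EEG (samples x channels*n_lags)."""
    n, ch = eeg.shape
    X = np.zeros((n, ch * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * ch:(lag + 1) * ch] = eeg[:n - lag]
    return X

def train_decoder(eeg, envelope, n_lags=32, ridge=1e3):
    """Backward model: map lagged multichannel EEG to the sound envelope."""
    X = lagged(eeg, n_lags)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]),
                           X.T @ envelope)

def reconstruction_accuracy(eeg, envelope, w, n_lags=32):
    """Pearson r between the decoded and the true envelope."""
    decoded = lagged(eeg, n_lags) @ w
    return np.corrcoef(decoded, envelope)[0, 1]
```

In a segregation analysis of this kind, one would train and test such decoders per instrument (e.g., bassoon vs. cello envelopes) with cross-validation and compare accuracies for attended versus unattended instruments.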

Author(s):  
Neha Banerjee ◽  
Prashanth Prabhu

Background and Aim: The central auditory nervous system can perceptually group similar sounds and segregate different sounds, an ability known as auditory stream segregation, auditory streaming, or auditory scene analysis. Identifying a change in spectral shape when the amplitude of one component of a complex tone is altered is referred to as spectral profile analysis; it serves as an important cue in auditory stream segregation because the spectra of sound sources vary. The aim of the study was to assess auditory stream segregation in individuals with cochlear pathology (CP) and auditory neuropathy spectrum disorder (ANSD). Methods: Three groups of participants were included: two experimental groups of 21 ears each, with cochlear hearing loss or ANSD, and a control group of 21 ears with normal hearing. Profile analysis was assessed using the "mlp" toolbox, which implements a maximum-likelihood procedure in MATLAB, at four frequencies (250 Hz, 500 Hz, 750 Hz, and 1000 Hz) for all three groups. Results: Profile-analysis thresholds at all four frequencies were significantly poorer for individuals with CP or ANSD than for the control group, although the CP group performed better than the ANSD group. Conclusion: These deficits could reflect poor spectral and temporal processing, caused by loss of outer hair cells along the basilar membrane in CP patients and by demyelination of auditory neurons in individuals with ANSD.
Keywords: Auditory stream segregation; auditory scene analysis; spectral profiling; spectral profile analysis; cochlear pathology; auditory neuropathy spectrum disorders
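
The "mlp" toolbox mentioned above is a MATLAB implementation of an adaptive maximum-likelihood procedure. As a language-neutral illustration, the Python sketch below shows the core idea under stated assumptions (logistic psychometric functions, a 2AFC guess rate of 0.5, fixed slope): after each trial, the likelihood of a set of candidate thresholds is updated, and the next stimulus level is placed at the target-probability point of the most likely function. Function names and parameters are hypothetical, not the toolbox's API.

```python
import numpy as np

def p_correct(level, midpoint, slope=1.0, guess=0.5):
    """Logistic psychometric function for a 2AFC task."""
    return guess + (1 - guess) / (1 + np.exp(-slope * (level - midpoint)))

def ml_threshold(respond, midpoints, start_level, n_trials=30,
                 target=0.79, slope=1.0, guess=0.5):
    """Adaptive maximum-likelihood threshold tracking (after Green, 1993)."""
    loglik = np.zeros(len(midpoints))  # one hypothesis per candidate threshold
    level = start_level                # begin with an easy stimulus
    for _ in range(n_trials):
        correct = respond(level)       # listener's response at this level
        p = p_correct(level, midpoints, slope, guess)
        loglik += np.log(p) if correct else np.log(1 - p)
        best = midpoints[np.argmax(loglik)]
        # next trial at the target-probability point of the best-fitting function
        level = best - np.log((1 - guess) / (target - guess) - 1) / slope
    return best

# Example: simulate an observer whose true threshold is 10 dB.
rng = np.random.default_rng(0)
thr = ml_threshold(lambda L: rng.random() < p_correct(L, 10.0),
                   midpoints=np.linspace(0, 20, 200), start_level=20.0)
```

Here respond(level) would present one profile-analysis trial at the given amplitude increment and return whether the listener correctly identified the changed component.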


2006 ◽  
Vol 18 (1) ◽  
pp. 1-13 ◽  
Author(s):  
Joel S. Snyder ◽  
Claude Alain ◽  
Terence W. Picton

A general assumption underlying auditory scene analysis is that the initial grouping of acoustic elements is independent of attention. The effects of attention on auditory stream segregation were investigated by recording event-related potentials (ERPs) while participants either attended to sound stimuli and indicated whether they heard one or two streams, or watched a muted movie. The stimuli were pure-tone ABA− patterns that repeated for 10.8 sec, with a stimulus onset asynchrony between A and B tones of 100 msec, in which the A tone was fixed at 500 Hz, the B tone could be 500, 625, 750, or 1000 Hz, and "−" was a silence. In both listening conditions, an enhancement of the auditory-evoked response (P1-N1-P2 and N1c) to the B tone varied with Δf and correlated with perception of streaming. The ERP from 150 to 250 msec after the beginning of the repeating ABA− patterns became more positive during the course of the trial and was diminished when participants ignored the tones, consistent with behavioral studies indicating that streaming takes several seconds to build up. The N1c enhancement and the buildup over time were larger at right than at left temporal electrodes, suggesting a right-hemisphere dominance for stream segregation. Sources in Heschl's gyrus accounted for the ERP modulations related to Δf-based segregation and buildup. These findings provide evidence for two cortical mechanisms of streaming: automatic segregation of sounds, and an attention-dependent buildup process that integrates successive tones within streams over several seconds.
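
For concreteness, the sketch below generates the kind of repeating ABA− triplet sequence described above (A fixed at 500 Hz, B at 500–1000 Hz, 100-ms SOA, 10.8-s trains, fourth slot silent). Tone duration, ramp length, and sampling rate are not given in the abstract and are assumed here.

```python
import numpy as np

SR = 44100  # sampling rate (Hz); an assumption, not stated in the abstract

def tone(freq, dur=0.050, ramp=0.005, sr=SR):
    """Pure tone with raised-cosine onset/offset ramps."""
    t = np.arange(int(dur * sr)) / sr
    y = np.sin(2 * np.pi * freq * t)
    n = int(ramp * sr)
    env = 0.5 * (1 - np.cos(np.pi * np.arange(n) / n))
    y[:n] *= env
    y[-n:] *= env[::-1]
    return y

def aba_sequence(f_b, f_a=500.0, soa=0.100, total_dur=10.8, sr=SR):
    """Repeating A-B-A-'-' triplets; the fourth 100-ms slot is silent."""
    slot = int(soa * sr)
    pattern = np.zeros(4 * slot)
    for i, f in enumerate((f_a, f_b, f_a)):
        y = tone(f, sr=sr)
        pattern[i * slot:i * slot + y.size] = y
    return np.tile(pattern, int(total_dur / (4 * soa)))

# One trial per B-tone frequency used in the study:
trials = {f_b: aba_sequence(f_b) for f_b in (500, 625, 750, 1000)}
```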


2012 ◽  
Vol 107 (9) ◽  
pp. 2366-2382 ◽  
Author(s):  
Yonatan I. Fishman ◽  
Christophe Micheyl ◽  
Mitchell Steinschneider

The ability to detect and track relevant acoustic signals embedded in a background of other sounds is crucial for hearing in complex acoustic environments. This ability is exemplified by a perceptual phenomenon known as “rhythmic masking release” (RMR). To demonstrate RMR, a sequence of tones forming a target rhythm is intermingled with physically identical “Distracter” sounds that perceptually mask the rhythm. The rhythm can be “released from masking” by adding “Flanker” tones in adjacent frequency channels that are synchronous with the Distracters. RMR represents a special case of auditory stream segregation, whereby the target rhythm is perceptually segregated from the background of Distracters when they are accompanied by the synchronous Flankers. The neural basis of RMR is unknown. Previous studies suggest the involvement of primary auditory cortex (A1) in the perceptual organization of sound patterns. Here, we recorded neural responses to RMR sequences in A1 of awake monkeys in order to identify neural correlates and potential mechanisms of RMR. We also tested whether two current models of stream segregation, when applied to these responses, could account for the perceptual organization of RMR sequences. Results suggest a key role for suppression of Distracter-evoked responses by the simultaneous Flankers in the perceptual restoration of the target rhythm in RMR. Furthermore, predictions of stream segregation models paralleled the psychoacoustics of RMR in humans. These findings reinforce the view that preattentive or “primitive” aspects of auditory scene analysis may be explained by relatively basic neural mechanisms at the cortical level.
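
To make the stimulus construction concrete, the sketch below assembles an RMR-style sequence: target tones at the rhythm onsets, physically identical Distracters at other onsets, and optional Flankers in adjacent frequency channels gated synchronously with the Distracters. All frequencies, durations, onset times, and the flanker spacing are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

SR = 44100

def add_tone(buf, freq, onset, dur=0.05, sr=SR):
    """Mix a pure tone into buf at the given onset (seconds)."""
    t = np.arange(int(dur * sr)) / sr
    i = int(onset * sr)
    buf[i:i + t.size] += np.sin(2 * np.pi * freq * t)

def rmr_sequence(rhythm_onsets, distracter_onsets, f_target=1000.0,
                 flanker_ratio=1.5, total_dur=4.0, flankers=True):
    """Target rhythm plus identical Distracters; Flankers synchronous with
    the Distracters release the rhythm from masking."""
    buf = np.zeros(int(total_dur * SR))
    for on in rhythm_onsets:                   # the to-be-heard rhythm
        add_tone(buf, f_target, on)
    for on in distracter_onsets:               # same frequency: masks the rhythm
        add_tone(buf, f_target, on)
        if flankers:                           # synchronous with Distracters only
            add_tone(buf, f_target * flanker_ratio, on)
            add_tone(buf, f_target / flanker_ratio, on)
    return buf / np.abs(buf).max()

# Masked vs. released versions of the same target rhythm:
rhythm = (0.0, 0.4, 0.6, 1.2, 1.6, 1.8, 2.4, 2.8)
distract = (0.2, 0.5, 0.9, 1.4, 2.1, 2.6, 3.1)
masked = rmr_sequence(rhythm, distract, flankers=False)
released = rmr_sequence(rhythm, distract, flankers=True)
```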


2015 ◽  
Vol 33 (1) ◽  
pp. 70-82 ◽  
Author(s):  
Claude Alain ◽  
Lori J. Bernstein

Albert Bregman’s (1990) book Auditory Scene Analysis: The Perceptual Organization of Sound has had a tremendous impact on research in auditory neuroscience. Here, we outline some of its accomplishments. This review is not meant to be exhaustive; rather, it aims to highlight milestones in the brief history of auditory neuroscience. The steady increase in neuroscience research following the book’s pivotal publication has advanced knowledge about how the brain forms representations of auditory objects. This research has far-reaching societal implications for health and quality of life. For instance, it has helped us understand why some people experience difficulties understanding speech in noise, which in turn has led to the development of therapeutic interventions. Importantly, the book has acted as a catalyst, providing scientists with a common conceptual framework for research in such diverse fields as speech perception, music perception, neurophysiology, and computational neuroscience. This interdisciplinary approach to research in audition is one of the book’s legacies.


2017 ◽  
Vol 60 (10) ◽  
pp. 2989-3000 ◽  
Author(s):  
Elyse S. Sussman

Purpose: This review article provides a new perspective on the role of attention in auditory scene analysis.
Method: A framework for understanding how attention interacts with stimulus-driven processes to facilitate task goals is presented. Previously reported data obtained through behavioral and electrophysiological measures in adults with normal hearing are summarized to demonstrate attention effects on auditory perception, from passive processes that organize unattended input to attention effects that act at different levels of the system. The data show that attention can sharpen stream organization toward behavioral goals, identify auditory events obscured by noise, and limit passive processing capacity.
Conclusions: A model of attention is provided that illustrates how the auditory system performs multilevel analyses involving interactions between stimulus-driven input and top-down processes. Overall, these studies show that (a) stream segregation occurs automatically and sets the basis for auditory event formation; (b) attention interacts with automatic processing to facilitate task goals; and (c) information about unattended sounds is not lost when selecting one organization over another. Our results support a neural model in which multiple sound organizations are held in memory and accessed simultaneously through a balance of automatic and task-specific processes, allowing flexibility for navigating noisy environments with competing sound sources.
Presentation Video: http://cred.pubs.asha.org/article.aspx?articleid=2601618


2009 ◽  
Vol 101 (6) ◽  
pp. 3212-3225 ◽  
Author(s):  
Naoya Itatani ◽  
Georg M. Klump

Streaming in auditory scene analysis refers to the perceptual grouping of multiple interleaved sounds having similar characteristics, while sounds with different characteristics are segregated. In human perception, auditory streaming occurs on the basis of temporal features of sounds such as the rate of amplitude modulation. We present results from multiunit recordings in the auditory forebrain of awake European starlings (Sturnus vulgaris) on the representation of sinusoidally amplitude-modulated (SAM) tones, to investigate the effect of temporal envelope structure on neural stream segregation. Different types of rate modulation transfer functions in response to SAM tones were observed. The strongest responses were found for modulation frequencies (fmod) <160 Hz. The streaming stimulus consisted of sequences of alternating SAM tones with the same carrier frequency but differing in fmod (ABA-ABA-ABA-…). A signals had a modulation frequency evoking a large excitation, whereas the fmod of B signals was ≤4 octaves higher. Synchrony of B-signal responses to the modulation decreased as fmod increased, and spike rate in response to B signals dropped as fmod increased. Faster signal repetition resulted in fewer spikes, suggesting a contribution of forward suppression that may be due to both signals having similar spectral energy and that is not related to the temporal pattern of modulation. These two effects are additive and may provide the basis for a more separated representation of A and B signals by two populations of neurons, which can be viewed as a neuronal correlate of segregated streams.
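
Two quantities from this abstract lend themselves to a short sketch: generating a SAM tone, and measuring response synchrony to the modulation. Vector strength is the standard synchrony metric for such data, though the abstract does not name the exact measure used; the carrier, modulation frequencies, and durations below are illustrative assumptions.

```python
import numpy as np

def sam_tone(f_carrier, f_mod, dur, depth=1.0, sr=44100):
    """Sinusoidally amplitude-modulated (SAM) tone."""
    t = np.arange(int(dur * sr)) / sr
    env = 1 + depth * np.sin(2 * np.pi * f_mod * t)
    return env * np.sin(2 * np.pi * f_carrier * t)

def vector_strength(spike_times, f_mod):
    """Synchrony of spikes to the modulation cycle (0 = none, 1 = perfect)."""
    phases = 2 * np.pi * f_mod * np.asarray(spike_times)
    return np.hypot(np.cos(phases).sum(), np.sin(phases).sum()) / len(spike_times)

# A and B signals share a carrier but differ in modulation frequency:
a = sam_tone(f_carrier=3000, f_mod=80, dur=0.1)   # fmod evoking strong excitation
b = sam_tone(f_carrier=3000, f_mod=640, dur=0.1)  # 3 octaves higher (study used up to 4)
```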


2021 ◽  
Vol 12 ◽  
Author(s):  
Kai Siedenburg ◽  
Kirsten Goldmann ◽  
Steven van de Par

Auditory scene analysis is an elementary aspect of music perception, yet little research has scrutinized auditory scene analysis under realistic musical conditions with diverse samples of listeners. This study probed the ability of younger normal-hearing listeners and older hearing-aid users to track individual musical voices or lines in J. S. Bach's The Art of the Fugue. Five-second excerpts with homogeneous or heterogeneous instrumentation of 2–4 musical voices were presented from spatially separated loudspeakers and preceded by a short cue signaling the target voice. Listeners tracked the cued voice and detected whether an amplitude modulation was imposed on the cued voice or on a distractor voice. Results indicated superior performance of young normal-hearing listeners compared to older hearing-aid users. Performance was generally better in conditions with fewer voices. For young normal-hearing listeners, there was an interaction between the number of voices and the instrumentation: performance degraded less drastically with an increase in the number of voices for timbrally heterogeneous mixtures than for homogeneous mixtures. Older hearing-aid users generally showed smaller effects of the number of voices and instrumentation, and no interaction between the two factors. Moreover, tracking performance of older hearing-aid users did not differ whether or not they wore their hearing aids. These results shed light on the role of timbral differentiation in musical scene analysis and suggest reduced musical scene analysis abilities of older hearing-impaired listeners in a realistic musical scenario.
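
As a small illustration of the detection target in this task, the sketch below imposes a sinusoidal amplitude modulation on one voice before mixing. The modulation rate and depth are assumptions; the abstract does not report them.

```python
import numpy as np

def impose_am(voice, f_mod=4.0, depth=0.4, sr=44100):
    """Impose a shallow amplitude modulation on one voice of the mixture."""
    t = np.arange(voice.size) / sr
    return voice * (1 + depth * np.sin(2 * np.pi * f_mod * t))

# On target trials the cued voice carries the modulation, e.g.:
# mixture = impose_am(cued_voice) + sum(distractor_voices)
```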

