auditory grouping
Recently Published Documents

Total documents: 65 (five years: 11)
H-index: 20 (five years: 2)

2022 ◽ Vol 15 ◽ Author(s): Yonghee Oh, Jillian C. Zuwala, Caitlin M. Salvagno, Grace A. Tilbrook

In multi-talker listening environments, the combination of different voice streams can distort each source’s individual message, causing deficits in comprehension. Voice characteristics, such as pitch and timbre, are major dimensions of auditory perception and play a vital role in grouping and segregating incoming sounds based on their acoustic properties. The current study investigated how pitch and timbre cues (determined by fundamental frequency, notated as F0, and spectral slope, respectively) affect perceptual integration and segregation of complex-tone sequences within an auditory streaming paradigm. Twenty normal-hearing listeners participated in a traditional auditory streaming experiment using two alternating sequences of harmonic tone complexes, A and B, whose F0 and spectral slope were manipulated. Grouping ranges, the F0/spectral slope ranges over which auditory grouping occurs, were measured at various F0/spectral slope differences between tones A and B. Results demonstrated that grouping ranges were maximal when tones A and B had no F0/spectral slope difference and decreased by a factor of two as the differences increased to ±1 semitone in F0 and ±1 dB/octave in spectral slope. In other words, increased differences in either F0 or spectral slope allowed listeners to more easily distinguish the harmonic stimuli and thus group them together less. These findings suggest that pitch/timbre difference cues play an important role in how we perceive harmonic sounds in an auditory stream, reflecting our ability to group or segregate human voices in a multi-talker listening environment.
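To make the paradigm concrete, here is a minimal Python sketch (not the authors' code; the sampling rate, tone durations, and number of harmonics are assumptions) that generates an alternating A-B sequence of harmonic complexes in which B differs from A by a chosen F0 step in semitones and a chosen spectral-slope step in dB/octave.

```python
# Minimal sketch (not the authors' code): alternating harmonic complexes A and B
# differing in F0 (semitones) and spectral slope (dB/octave).
import numpy as np

FS = 44100  # sampling rate in Hz (assumed)

def harmonic_complex(f0, dur, slope_db_per_oct, n_harmonics=10):
    """Harmonic tone complex whose partials roll off at a fixed dB/octave slope."""
    t = np.arange(int(dur * FS)) / FS
    tone = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        amp_db = slope_db_per_oct * np.log2(k)  # slope applied per octave above F0
        tone += 10 ** (amp_db / 20) * np.sin(2 * np.pi * k * f0 * t)
    return tone / np.max(np.abs(tone))

def ab_sequence(f0_a=200.0, df_semitones=1.0, slope_a=-6.0, dslope_db_oct=1.0,
                tone_dur=0.1, gap=0.02, n_repeats=10):
    """A-B-A-B sequence; B differs from A in F0 (semitones) and spectral slope."""
    f0_b = f0_a * 2 ** (df_semitones / 12)      # semitone shift of the F0
    a = harmonic_complex(f0_a, tone_dur, slope_a)
    b = harmonic_complex(f0_b, tone_dur, slope_a + dslope_db_oct)
    silence = np.zeros(int(gap * FS))
    pair = np.concatenate([a, silence, b, silence])
    return np.concatenate([pair] * n_repeats)

seq = ab_sequence(df_semitones=1.0, dslope_db_oct=1.0)  # the 1-semitone / 1-dB/oct case
```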


2021 ◽ Vol 150 (4) ◽ pp. A153-A153 ◽ Author(s): Destinee Halverson, Kaylah Lalonde

2021 ◽ Author(s): Joseph Alexander Sollini, Katarina C Poole, Jennifer Kim Bizley

To form complex representations of sounds, i.e. auditory objects, the auditory system needs to decide which information belongs to one object and which belongs to another. These decisions are usually made over a short period at the beginning of the sound, known as build-up, using the available grouping cues. Here we investigate how temporal coherence and temporal stability influence subsequent grouping. We show that these two grouping cues behave independently of one another, except when put into conflict; in those situations, the cues available during the build-up period determine subsequent perception.
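A minimal sketch of the kind of stimulus contrast described here, with assumed frequencies and timing rather than the authors' stimuli: two pure-tone streams whose onsets either coincide (temporally coherent) or are shifted by half a period (incoherent) during the build-up portion of a sequence.

```python
# Minimal sketch (assumed parameters, not the authors' stimuli): two tone streams
# that are either temporally coherent (synchronous onsets) or incoherent (offset).
import numpy as np

FS = 44100

def tone(freq, dur):
    t = np.arange(int(dur * FS)) / FS
    return np.sin(2 * np.pi * freq * t)

def two_stream_sequence(f_low=400.0, f_high=700.0, tone_dur=0.05,
                        period=0.125, n_tones=16, coherent=True):
    """High stream is synchronous with the low stream (coherent) or shifted by half a period."""
    total = int(period * n_tones * FS) + int(tone_dur * FS)
    mix = np.zeros(total)
    offset = 0.0 if coherent else period / 2  # half-period shift breaks temporal coherence
    n = int(tone_dur * FS)
    for i in range(n_tones):
        lo_start = int(i * period * FS)
        hi_start = int((i * period + offset) * FS)
        mix[lo_start:lo_start + n] += tone(f_low, tone_dur)
        mix[hi_start:hi_start + n] += tone(f_high, tone_dur)
    return mix / np.max(np.abs(mix))

buildup_coherent = two_stream_sequence(coherent=True)
buildup_incoherent = two_stream_sequence(coherent=False)
```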


2021 ◽ Vol 38 (5) ◽ pp. 473-498 ◽ Author(s): Manda Fischer, Kit Soden, Etienne Thoret, Marcel Montrey, Stephen McAdams

Timbre perception and auditory grouping principles can provide a theoretical basis for aspects of orchestration. In Experiment 1, 36 excerpts contained two streams and 12 contained one stream as determined by music analysts. Streams—the perceptual connecting of successive events—comprised either single instruments or blended combinations of instruments from the same or different families. Musicians and nonmusicians rated the degree of segregation perceived in the excerpts. Heterogeneous instrument combinations between streams yielded greater segregation than did homogeneous ones. Experiment 2 presented the individual streams from each two-stream excerpt. Blend ratings on isolated individual streams from the two-stream excerpts did not predict global segregation between streams. In Experiment 3, Experiment 1 excerpts were reorchestrated with only string instruments to determine the relative contribution of timbre to segregation beyond other musical cues. Decreasing timbral differences reduced segregation ratings. Acoustic and score-based descriptors were extracted from the recordings and scores, respectively, to statistically quantify the factors involved in these effects. Instrument family, part crossing, consonance, spectral factors related to timbre, and onset synchrony all played a role, providing evidence of how timbral differences enhance segregation in orchestral music.
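As a rough illustration of the descriptor-based analysis, the sketch below uses hypothetical file names and placeholder ratings, with spectral centroid standing in for the paper's full descriptor set: it extracts one acoustic descriptor per excerpt with librosa and regresses segregation ratings on it.

```python
# Minimal sketch (hypothetical files and placeholder ratings, not the authors'
# pipeline): one acoustic descriptor per excerpt, regressed against ratings.
import numpy as np
import librosa
from sklearn.linear_model import LinearRegression

def mean_spectral_centroid(path):
    """Mean spectral centroid (Hz) of one orchestral excerpt."""
    y, sr = librosa.load(path, sr=None)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    return float(np.mean(centroid))

# Placeholders for the 36 two-stream excerpts and their mean segregation ratings.
excerpt_paths = [f"excerpt_{i:02d}.wav" for i in range(36)]   # hypothetical files
segregation_ratings = np.random.default_rng(0).random(36)     # placeholder ratings

X = np.array([[mean_spectral_centroid(p)] for p in excerpt_paths])
model = LinearRegression().fit(X, segregation_ratings)
print("R^2 of a single timbral descriptor:", model.score(X, segregation_ratings))
```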


2020 ◽ Author(s): Emma Holmes, Peter Zeidman, Karl J Friston, Timothy D Griffiths

In our everyday lives, we are often required to follow a conversation when background noise is present (“speech-in-noise” [SPIN] perception). SPIN perception varies widely—and people who are worse at SPIN perception are also worse at fundamental auditory grouping, as assessed by figure-ground tasks. Here, we examined the cortical processes that link difficulties with SPIN perception to difficulties with figure-ground perception using functional magnetic resonance imaging. We found strong evidence that the earliest stages of the auditory cortical hierarchy (left core and belt areas) are similarly disinhibited when SPIN and figure-ground tasks are more difficult (i.e., at target-to-masker ratios corresponding to 60% rather than 90% performance)—consistent with increased cortical gain at lower levels of the auditory hierarchy. Overall, our results reveal a common neural substrate for these basic (figure-ground) and naturally relevant (SPIN) tasks—which provides a common computational basis for the link between SPIN perception and fundamental auditory grouping.


2020 ◽ Vol 63 (7) ◽ pp. 2141-2161 ◽ Author(s): Jonathan H. Venezia, Marjorie R. Leek, Michael P. Lindeman

Purpose: Age-related declines in auditory temporal processing and cognition make older listeners vulnerable to interference from competing speech. This vulnerability may be increased in older listeners with sensorineural hearing loss due to additional effects of spectral distortion and accelerated cognitive decline. The goal of this study was to uncover differences between older hearing-impaired (OHI) listeners and older normal-hearing (ONH) listeners in the perceptual encoding of competing speech signals.
Method: Age-matched groups of 10 OHI and 10 ONH listeners performed the coordinate response measure task with a synthetic female target talker and a male competing talker at a target-to-masker ratio of +3 dB. Individualized gain was provided to OHI listeners. Each listener completed 50 baseline and 800 “bubbles” trials in which randomly selected segments of the speech modulation power spectrum (MPS) were retained on each trial while the remainder was filtered out. Average performance was fixed at 50% correct by adapting the number of segments retained. Multinomial regression was used to estimate weights showing the regions of the MPS associated with performance (a “classification image” or CImg).
Results: The CImg weights were significantly different between the groups in two MPS regions: a region encoding the shared phonetic content of the two talkers and a region encoding the competing (male) talker's voice. The OHI listeners demonstrated poorer encoding of the phonetic content and increased vulnerability to interference from the competing talker. Individual differences in CImg weights explained over 75% of the variance in baseline performance in the OHI listeners, whereas differences in high-frequency pure-tone thresholds explained only 10%.
Conclusion: Suprathreshold deficits in the encoding of low- to mid-frequency (~5–10 Hz) temporal modulations—which may reflect poorer “dip listening”—and auditory grouping at a perceptual and/or cognitive level are responsible for the relatively poor performance of OHI versus ONH listeners on a different-gender competing speech task.
Supplemental Material: https://doi.org/10.23641/asha.12568472
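The "bubbles"/classification-image logic can be illustrated with simulated data. In the sketch below (simulated masks and responses; plain logistic regression on correct/incorrect stands in for the multinomial regression over CRM responses used in the paper), trial accuracy is regressed on which MPS segments were retained, yielding a weight per segment.

```python
# Minimal sketch (simulated data): on each trial a random subset of MPS segments
# is retained (mask = 1); regressing correctness on the masks recovers a weight
# per segment, i.e. a crude classification image.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials, n_segments = 800, 60               # 800 bubbles trials; 60 segments is an assumption

masks = (rng.random((n_trials, n_segments)) < 0.3).astype(float)  # retained segments
true_weights = np.zeros(n_segments)
true_weights[10:20] = 1.0                    # pretend these segments carry the useful information
p_correct = 1 / (1 + np.exp(-(masks @ true_weights - 3.0)))        # ~50% correct on average
correct = rng.random(n_trials) < p_correct

cimg = LogisticRegression(max_iter=1000).fit(masks, correct)
classification_image = cimg.coef_.ravel()    # one weight per MPS segment
print(classification_image[:20].round(2))    # informative segments get the largest weights
```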


2019 ◽ Vol 116 (50) ◽ pp. 25355-25364 ◽ Author(s): Wiktor Młynarski, Josh H. McDermott

Events and objects in the world must be inferred from sensory signals to support behavior. Because sensory measurements are temporally and spatially local, the estimation of an object or event can be viewed as the grouping of these measurements into representations of their common causes. Perceptual grouping is believed to reflect internalized regularities of the natural environment, yet grouping cues have traditionally been identified using informal observation and investigated using artificial stimuli. The relationship of grouping to natural signal statistics has thus remained unclear, and additional or alternative cues remain possible. Here, we develop a general methodology for relating grouping to natural sensory signals and apply it to derive auditory grouping cues from natural sounds. We first learned local spectrotemporal features from natural sounds and measured their co-occurrence statistics. We then learned a small set of stimulus properties that could predict the measured feature co-occurrences. The resulting cues included established grouping cues, such as harmonic frequency relationships and temporal coincidence, but also revealed previously unappreciated grouping principles. Human perceptual grouping was predicted by natural feature co-occurrence, with humans relying on the derived grouping cues in proportion to their informativity about co-occurrence in natural sounds. The results suggest that auditory grouping is adapted to natural stimulus statistics, show how these statistics can reveal previously unappreciated grouping phenomena, and provide a framework for studying grouping in natural signals.
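The sketch below gives a rough, simplified illustration of this kind of pipeline: it learns spectral features from the spectrogram of one hypothetical natural recording using NMF (standing in for the paper's sparse spectrotemporal feature learning) and then measures how feature activations co-occur over time.

```python
# Minimal sketch (NMF as a stand-in for the paper's sparse feature learning):
# learn features from a natural-sound spectrogram and measure their co-occurrence.
import numpy as np
import librosa
from sklearn.decomposition import NMF

y, sr = librosa.load("natural_sound.wav", sr=None)   # hypothetical recording
spec = np.abs(librosa.stft(y, n_fft=1024, hop_length=256))

# Learn 32 spectral features; rows of W are per-frame activations of each feature.
model = NMF(n_components=32, init="random", random_state=0, max_iter=500)
W = model.fit_transform(spec.T)                      # frames x features

# Feature-by-feature co-occurrence: pairs with high correlation are candidate
# grouping cues (e.g., harmonically related components that are active together).
cooccurrence = np.corrcoef(W, rowvar=False)
print(cooccurrence.shape)
```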


2019 ◽ Vol 9 (1) ◽ Author(s): Emma Holmes, Timothy D. Griffiths

Understanding speech when background noise is present is a critical everyday task that varies widely among people. A key challenge is to understand why some people struggle with speech-in-noise perception, despite having clinically normal hearing. Here, we developed new figure-ground tests that require participants to extract a coherent tone pattern from a stochastic background of tones. These tests dissociated variability in speech-in-noise perception related to mechanisms for detecting static (same-frequency) patterns and those for tracking patterns that change frequency over time. In addition, elevated hearing thresholds that are widely considered to be ‘normal’ explained significant variance in speech-in-noise perception, independent of figure-ground perception. Overall, our results demonstrate that successful speech-in-noise perception is related to audiometric thresholds, fundamental grouping of static acoustic patterns, and tracking of acoustic sources that change in frequency. Crucially, speech-in-noise deficits are better assessed by measuring central (grouping) processes alongside audiometric thresholds.
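A minimal sketch of a stochastic figure-ground stimulus of the kind described (assumed parameters, not the authors' exact design): each chord contains random background tones, and for a stretch of consecutive chords a small set of fixed-frequency "figure" tones repeats, which listeners must detect.

```python
# Minimal sketch (assumed parameters): random multi-tone chords with a static
# "figure" of repeated-frequency tones embedded over several consecutive chords.
import numpy as np

FS = 44100

def sfg_stimulus(n_chords=30, chord_dur=0.05, n_background=10,
                 n_figure=4, figure_onset=10, figure_len=8, seed=0):
    rng = np.random.default_rng(seed)
    t = np.arange(int(chord_dur * FS)) / FS
    log_freqs = np.linspace(np.log2(180), np.log2(7000), 120)   # log-spaced frequency grid
    figure_freqs = 2 ** rng.choice(log_freqs, n_figure, replace=False)
    chords = []
    for i in range(n_chords):
        freqs = 2 ** rng.choice(log_freqs, n_background, replace=False)
        if figure_onset <= i < figure_onset + figure_len:
            freqs = np.concatenate([freqs, figure_freqs])       # static figure components
        chord = sum(np.sin(2 * np.pi * f * t) for f in freqs)
        chords.append(chord / np.max(np.abs(chord)))
    return np.concatenate(chords)

stimulus = sfg_stimulus()
```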


2019 ◽ Author(s): Emma Holmes, Peter Zeidman, Karl J. Friston, Timothy D. Griffiths

In our everyday lives, we are often required to follow a conversation when background noise is present (“speech-in-noise” perception). Speech-in-noise perception varies widely—and people who are worse at speech-in-noise perception are also worse at fundamental auditory grouping, as assessed by figure-ground tasks. Here, we examined the cortical processes that link difficulties with speech-in-noise perception to difficulties with figure-ground perception using functional magnetic resonance imaging (fMRI). We found strong evidence that the earliest stages of the auditory cortical hierarchy (left core and belt areas) are similarly disinhibited when speech-in-noise and figure-ground tasks are more difficult (i.e., at target-to-masker ratios corresponding to 60% rather than 90% thresholds)—consistent with increased cortical gain at lower levels of the auditory hierarchy. Overall, our results reveal a common neural substrate for these basic (figure-ground) and naturally relevant (speech-in-noise) tasks—which provides a common computational basis for the link between speech-in-noise perception and fundamental auditory grouping.


2019 ◽ Author(s): Emma Holmes, Timothy D. Griffiths

Understanding speech when background noise is present is a critical everyday task that varies widely among people. A key challenge is to understand why some people struggle with speech-in-noise perception, despite having clinically normal hearing. Here, we developed new figure-ground tests that require participants to extract a coherent tone pattern from a stochastic background of tones. These tests dissociated variability in speech-in-noise perception related to mechanisms for detecting static (same-frequency) patterns and those for tracking patterns that change frequency over time. In addition, elevated hearing thresholds that are widely considered to be ‘normal’ explained significant variance in speech-in-noise perception, independent of figure-ground perception. Overall, our results demonstrate that successful speech-in-noise perception is related to audiometric thresholds, fundamental grouping of static acoustic patterns, and tracking of acoustic sources that change in frequency. Crucially, measuring both peripheral (audiometric thresholds) and central (grouping) processes is required to adequately assess speech-in-noise deficits.

