Effects of target level on release from masking by voice-gender difference and spatial separation between talkers

2021 ◽  
Vol 150 (4) ◽  
pp. A304-A304
Author(s):  
Yonghee Oh ◽  
Hannah Schoenfeld ◽  
Allison O. Layne ◽  
Sarah E. Bridges

2021 ◽  
Vol 1 (8) ◽  
pp. 084404
Author(s):  
Yonghee Oh ◽  
Sarah E. Bridges ◽  
Hannah Schoenfeld ◽  
Allison O. Layne ◽  
David Eddins

2020 ◽  
Vol 31 (04) ◽  
pp. 271-276
Author(s):  
Grant King ◽  
Nicole E. Corbin ◽  
Lori J. Leibold ◽  
Emily Buss

Background: Speech recognition in complex multisource environments is challenging, particularly for listeners with hearing loss. One source of difficulty is the reduced ability of listeners with hearing loss to benefit from spatial separation of the target and masker, an effect called spatial release from masking (SRM). Despite the prevalence of complex multisource environments in everyday life, SRM is not routinely evaluated in the audiology clinic. Purpose: The purpose of this study was to demonstrate the feasibility of assessing SRM in adults using widely available tests of speech-in-speech recognition that can be conducted using standard clinical equipment. Research Design: Participants were 22 young adults with normal hearing. The task was masked sentence recognition, using each of five clinically available corpora with speech maskers. The target always sounded like it originated from directly in front of the listener, and the masker either sounded like it originated from the front (colocated with the target) or from the side (separated from the target). In the real spatial manipulation conditions, source location was manipulated by routing the target and masker to either a single speaker or to two speakers: one directly in front of the participant, and one mounted in an adjacent corner, 90° to the right. In the perceived spatial separation conditions, the target and masker were presented from both speakers with delays that made them sound as if they were either colocated or separated. Results: With real spatial manipulations, the mean SRM ranged from 7.1 to 11.4 dB, depending on the speech corpus. With perceived spatial manipulations, the mean SRM ranged from 1.8 to 3.1 dB. Whereas real separation improves the signal-to-noise ratio in the ear contralateral to the masker, SRM in the perceived spatial separation conditions is based solely on interaural timing cues.
Conclusions: The finding of robust SRM with widely available speech corpora supports the feasibility of measuring this important aspect of hearing in the audiology clinic. The finding of a small but significant SRM in the perceived spatial separation conditions suggests that modified materials could be used to evaluate the use of interaural timing cues specifically.
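As a rough illustration of how an SRM value such as those reported above is typically derived, the sketch below subtracts the speech reception threshold (SRT) measured with a separated masker from the SRT measured with a colocated masker. The function name and the per-listener threshold values are hypothetical and are not taken from the study.

```python
def spatial_release_from_masking(colocated_srt_db, separated_srt_db):
    """Return SRM in dB: the threshold improvement gained from separation.

    A positive value means the listener tolerated a lower (harder)
    signal-to-noise ratio when the masker was spatially separated.
    """
    return colocated_srt_db - separated_srt_db


# Hypothetical per-listener SRTs in dB SNR (illustrative values only).
colocated_srts = [-2.0, -1.5, -3.0]
separated_srts = [-10.0, -9.5, -12.5]

srm_per_listener = [
    spatial_release_from_masking(c, s)
    for c, s in zip(colocated_srts, separated_srts)
]
mean_srm_db = sum(srm_per_listener) / len(srm_per_listener)
print(srm_per_listener, mean_srm_db)
```

With these made-up thresholds the per-listener SRM values are 8.0, 8.0, and 9.5 dB.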


1998 ◽  
Vol 104 (1) ◽  
pp. 422-431 ◽  
Author(s):  
Gerald Kidd ◽  
Christine R. Mason ◽  
Tanya L. Rohtla ◽  
Phalguni S. Deliwala

eLife ◽  
2017 ◽  
Vol 6 ◽  
Author(s):  
Norman Lee ◽  
Andrew C Mason

Spatial release from masking (SRM) occurs when spatial separation between a signal and masker decreases masked thresholds. The mechanically-coupled ears of Ormia ochracea are specialized for hyperacute directional hearing, but the possible role of SRM, or whether such specializations exhibit limitations for sound source segregation, is unknown. We recorded phonotaxis to a cricket song masked by band-limited noise. With a masker, response thresholds increased and localization was diverted away from the signal and masker. Increased separation from 6° to 90° did not decrease response thresholds or improve localization accuracy, thus SRM does not operate in this range of spatial separations. Tympanal vibrations and auditory nerve responses reveal that localization errors were consistent with changes in peripheral coding of signal location and flies localized towards the ear with better signal detection. Our results demonstrate that, in a mechanically coupled auditory system, specialization for directional hearing does not contribute to source segregation.


2018 ◽  
Vol 61 (2) ◽  
pp. 428-435 ◽  
Author(s):  
Navin Viswanathan ◽  
Kostas Kokkinakis ◽  
Brittany T. Williams

Purpose: The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the target. We also assessed whether the spectral resolution of the noise-vocoded stimuli affected the presence of LRM and SRM under these conditions. Method: In Experiment 1, a mixed factorial design was used to simultaneously manipulate the masker language (within-subject, English vs. Dutch), the simulated masker location (within-subject, right, center, left), and the spectral resolution (between-subjects, 6 vs. 12 channels) of noise-vocoded target–masker combinations presented at +25 dB signal-to-noise ratio (SNR). In Experiment 2, the study was repeated using a spectral resolution of 12 channels at +15 dB SNR. Results: In both experiments, listeners' intelligibility of noise-vocoded targets was better when the background masker was Dutch, demonstrating reliable LRM in all conditions. The pattern of results in Experiment 1 was not reliably different across the 6- and 12-channel noise-vocoded speech. Finally, a reliable spatial benefit (SRM) was detected only in the more challenging SNR condition (Experiment 2). Conclusion: The current study is the first to report a clear LRM benefit in noise-vocoded speech-in-speech recognition. Our results indicate that this benefit is available even under spectrally degraded conditions and that it may augment the benefit due to spatial separation of target speech and competing backgrounds.


2019 ◽  
Author(s):  
Ysabel Domingo ◽  
Emma Holmes ◽  
Ewan Macpherson ◽  
Ingrid Johnsrude

The ability to segregate simultaneous speech streams is crucial for successful communication. Recent studies have demonstrated that participants can report 10–20% more words spoken by naturally familiar (e.g., friends or spouses) than unfamiliar talkers in two-voice mixtures. This benefit is commensurate with one of the largest benefits to speech intelligibility currently known—that gained by spatially separating two talkers. However, because of differences in the methods of these previous studies, the relative benefits of spatial separation and voice familiarity are unclear. Here, we directly compared the familiar-voice benefit and spatial release from masking, and examined if and how these two cues interact with one another. We recorded talkers speaking sentences from a published closed-set “matrix” task and then presented listeners with three different sentences played simultaneously. Each target sentence was played at 0° azimuth, and two masker sentences were symmetrically separated about the target. On average, participants reported 10–30% more words correctly when the target sentence was spoken in a familiar than unfamiliar voice (collapsed over spatial separation conditions); we found that participants gain a similar benefit from a familiar target as when an unfamiliar voice is separated from two symmetrical maskers by approximately 15° azimuth.


2019 ◽  
Vol 62 (11) ◽  
pp. 4165-4178 ◽  
Author(s):  
Nematollah Rouhbakhsh ◽  
John Mahdi ◽  
Jacob Hwo ◽  
Baran Nobel ◽  
Fati Mousave

Purpose: Speech recognition in complex listening environments is enhanced by the extent of spatial separation between the speech source and background competing sources, an effect known as spatial release from masking (SRM). The aim of this study was to investigate whether the phase-locked neural activity in the central auditory pathways, reflected in the frequency following response (FFR), exhibits SRM. Method: Eighteen normal-hearing adults (8 men and 10 women, ranging in age from 20 to 42 years) with no known neurological disorders participated in this study. FFRs were recorded from the participants in response to a target vowel /u/ presented with spatially colocated and separated competing talkers at 3 ranges of signal-to-noise ratios (SNRs), with median SNRs of −5.4, 0.5, and 6.8 dB and for different attentional conditions (attention and no attention). Results: Amplitude of the FFR at the fundamental frequency was significantly larger in the spatially separated condition as compared to the colocated condition for only the lowest (< −2.4 dB SNR) of the 3 SNR ranges tested. A significant effect of attention was found when subjects were actively focusing on the target stimuli. No significant interaction effects were found between spatial separation and attention. Conclusions: The enhanced representation of the target stimulus in the separated condition suggests that the temporal pattern of phase-locked brainstem neural activity generating the FFR may contain information relevant to the binaural processes underlying SRM but only in challenging listening environments. Attention may modulate FFR fundamental frequency amplitude but does not seem to modulate spatial processing at the level of generating the FFR. Supplemental Material: https://doi.org/10.23641/asha.9992597
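To make "amplitude of the FFR at the fundamental frequency" concrete, the following minimal sketch estimates the amplitude of one spectral component with a single-frequency discrete Fourier transform. It runs on a synthetic waveform, not on recorded FFR data; the sample rate, the 100 Hz fundamental, and the function name are assumptions chosen for illustration, and the study's actual analysis pipeline is not described in the abstract.

```python
import cmath
import math


def amplitude_at_frequency(samples, sample_rate_hz, frequency_hz):
    """Estimate the amplitude of one sinusoidal component via a one-bin DFT."""
    n = len(samples)
    acc = sum(
        x * cmath.exp(-2j * math.pi * frequency_hz * i / sample_rate_hz)
        for i, x in enumerate(samples)
    )
    # Scale by 2/N so a pure sinusoid of amplitude A yields A.
    return 2.0 * abs(acc) / n


# Synthetic stand-in for an averaged response (assumed values).
sample_rate_hz = 8000
f0_hz = 100.0
n_samples = 1600  # 0.2 s, i.e. a whole number of f0 cycles

waveform = [
    0.5 * math.sin(2 * math.pi * f0_hz * i / sample_rate_hz)
    + 0.1 * math.sin(2 * math.pi * 2 * f0_hz * i / sample_rate_hz)
    for i in range(n_samples)
]

f0_amplitude = amplitude_at_frequency(waveform, sample_rate_hz, f0_hz)
h2_amplitude = amplitude_at_frequency(waveform, sample_rate_hz, 2 * f0_hz)
print(f0_amplitude, h2_amplitude)
```

Comparing this quantity between colocated and separated conditions is one plausible way to express the contrast the abstract reports, although real recordings would also require averaging across trials and an estimate of the noise floor.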


1992 ◽  
Vol 36 (3) ◽  
pp. 253-257
Author(s):  
Michael D. Good ◽  
Robert H. Gilkey

The development of optimal three-dimensional auditory displays requires a more complete understanding of the interactions among spatially separated sounds. Free-field masking was investigated as a function of the spatial separation between signal and masker sounds within the horizontal, frontal, and median planes. The detectability of filtered pulse trains in the presence of noise maskers was measured using a cued, two-alternative, forced-choice, adaptive staircase procedure. Signal and masker combinations in low (below 2.3 kHz), middle (1.0–8.5 kHz), and high (above 3.5 kHz) frequency regions were examined. As the sound sources were separated within the horizontal plane, signal detectability increased dramatically. Similar improvement in detectability was observed within the frontal plane. As suggested by traditional binaural models, interaural time cues and interaural intensity cues are likely to play a major role in mediating masking release in both the horizontal and frontal planes. Because no interaural cues exist for stimuli presented within the median plane, traditional models would not predict a release from masking when the stimuli are separated within this plane. However, with high frequency signals, masking release similar to that observed in the horizontal and frontal planes could be observed in the median plane. The current literature suggests that sound localization in the median plane may depend on direction-specific spectral cues that are introduced by the pinna at high frequencies. The masking release observed here may also depend on these “pinna cues.”
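The adaptive staircase procedure mentioned above can be sketched as follows. The abstract does not state which tracking rule was used, so this example assumes a two-down, one-up rule, and it substitutes a deterministic simulated listener for a real two-alternative forced-choice trial. All names and numeric values are hypothetical.

```python
def run_staircase(respond, start_level_db, step_db, n_reversals, max_trials=1000):
    """Two-down, one-up adaptive track (assumed rule).

    `respond(level)` returns True for a correct trial. The signal level
    drops after two consecutive correct responses and rises after any
    incorrect response. The threshold estimate is the mean reversal level.
    """
    level = start_level_db
    correct_streak = 0
    direction = 0  # -1 while descending, +1 while ascending, 0 at start
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(level):
            correct_streak += 1
            if correct_streak == 2:
                correct_streak = 0
                if direction == +1:
                    reversals.append(level)  # track turns downward here
                direction = -1
                level -= step_db
        else:
            correct_streak = 0
            if direction == -1:
                reversals.append(level)  # track turns upward here
            direction = +1
            level += step_db
    estimate = sum(reversals) / len(reversals) if reversals else None
    return estimate, reversals


# Simulated listener: always correct at or above a 40 dB signal level.
def simulated_listener(level_db):
    return level_db >= 40.0


threshold_db, reversal_levels = run_staircase(
    simulated_listener, start_level_db=60.0, step_db=4.0, n_reversals=6
)
print(threshold_db, reversal_levels)
```

A real listener responds probabilistically, so the track would wander around the level that yields about 70.7% correct under this rule rather than alternating between two fixed levels.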

