Using spatial release from masking to estimate the magnitude of the familiar-voice intelligibility benefit

2019 ◽  
Author(s):  
Ysabel Domingo ◽  
Emma Holmes ◽  
Ewan Macpherson ◽  
Ingrid Johnsrude

The ability to segregate simultaneous speech streams is crucial for successful communication. Recent studies have demonstrated that participants can report 10–20% more words spoken by naturally familiar (e.g., friends or spouses) than unfamiliar talkers in two-voice mixtures. This benefit is commensurate with one of the largest benefits to speech intelligibility currently known—that gained by spatially separating two talkers. However, because of differences in the methods of these previous studies, the relative benefits of spatial separation and voice familiarity are unclear. Here, we directly compared the familiar-voice benefit and spatial release from masking, and examined if and how these two cues interact with one another. We recorded talkers speaking sentences from a published closed-set “matrix” task and then presented listeners with three different sentences played simultaneously. Each target sentence was played at 0° azimuth, and two masker sentences were symmetrically separated about the target. On average, participants reported 10–30% more words correctly when the target sentence was spoken in a familiar than unfamiliar voice (collapsed over spatial separation conditions); we found that participants gain a similar benefit from a familiar target as when an unfamiliar voice is separated from two symmetrical maskers by approximately 15° azimuth.
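The comparison in the final sentence of this abstract can be read as an interpolation: find the masker separation at which the unfamiliar-voice score reaches the score obtained with a familiar voice. The Python sketch below illustrates that idea only. The function name, the choice of linear interpolation, and all of the numbers are assumptions made for illustration; the abstract does not describe the authors' fitting procedure.

```python
def equivalent_separation(separations_deg, unfamiliar_scores, familiar_score):
    """Return the separation (degrees) at which the unfamiliar-voice score
    reaches familiar_score, using linear interpolation between tested points.

    Assumes separations_deg is ascending and unfamiliar_scores is
    non-decreasing. Returns None if familiar_score lies outside the range.
    """
    points = list(zip(separations_deg, unfamiliar_scores))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= familiar_score <= y1:
            if y1 == y0:
                # Flat segment: report its lower edge.
                return float(x0)
            return x0 + (familiar_score - y0) * (x1 - x0) / (y1 - y0)
    return None


# Made-up percent-correct scores, for illustration only (not data from the study).
separations = [0, 15, 30]
unfamiliar = [40.0, 60.0, 70.0]
print(equivalent_separation(separations, unfamiliar, 50.0))  # 7.5
```

A fitted psychometric function would likely be more appropriate than piecewise-linear interpolation when scores approach floor or ceiling.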

2020 ◽  
Vol 31 (04) ◽  
pp. 271-276
Author(s):  
Grant King ◽  
Nicole E. Corbin ◽  
Lori J. Leibold ◽  
Emily Buss

Abstract Background Speech recognition in complex multisource environments is challenging, particularly for listeners with hearing loss. One source of difficulty is the reduced ability of listeners with hearing loss to benefit from spatial separation of the target and masker, an effect called spatial release from masking (SRM). Despite the prevalence of complex multisource environments in everyday life, SRM is not routinely evaluated in the audiology clinic. Purpose The purpose of this study was to demonstrate the feasibility of assessing SRM in adults using widely available tests of speech-in-speech recognition that can be conducted using standard clinical equipment. Research Design Participants were 22 young adults with normal hearing. The task was masked sentence recognition, using each of five clinically available corpora with speech maskers. The target always sounded like it originated from directly in front of the listener, and the masker either sounded like it originated from the front (colocated with the target) or from the side (separated from the target). In the real spatial manipulation conditions, source location was manipulated by routing the target and masker to either a single speaker or to two speakers: one directly in front of the participant, and one mounted in an adjacent corner, 90° to the right. In the perceived spatial separation conditions, the target and masker were presented from both speakers with delays that made them sound as if they were either colocated or separated. Results With real spatial manipulations, the mean SRM ranged from 7.1 to 11.4 dB, depending on the speech corpus. With perceived spatial manipulations, the mean SRM ranged from 1.8 to 3.1 dB. Whereas real separation improves the signal-to-noise ratio in the ear contralateral to the masker, SRM in the perceived spatial separation conditions is based solely on interaural timing cues. 
Conclusions The finding of robust SRM with widely available speech corpora supports the feasibility of measuring this important aspect of hearing in the audiology clinic. The finding of a small but significant SRM in the perceived spatial separation conditions suggests that modified materials could be used to evaluate the use of interaural timing cues specifically.


2017 ◽  
Vol 26 (4) ◽  
pp. 507-518 ◽  
Author(s):  
Kasey M. Jakien ◽  
Sean D. Kampel ◽  
Meghan M. Stansell ◽  
Frederick J. Gallun

Purpose To evaluate the test–retest reliability of a headphone-based spatial-release-from-masking task with two maskers (referred to here as the SR2) and to describe its relationship to the same test done over loudspeakers in an anechoic chamber (the SR2A). We explore what thresholds tell us about certain populations (such as older individuals or individuals with hearing impairment) and discuss how the SR2 might be useful in the clinic. Method Fifty-four participants completed speech intelligibility tests in which a target phrase and two masking phrases from the Coordinate Response Measure corpus (Bolia, Nelson, Ericson, & Simpson, 2000) were presented either via earphones using a virtual spatial array or via loudspeakers in an anechoic chamber. For the SR2, the target sentence was always at 0° azimuth angle, and the maskers were either colocated at 0° or positioned at ±45°. For the SR2A, the target was located at 0°, and the maskers were colocated or located at ±15°, ±30°, ±45°, ±90°, or ±135°. Spatial release from masking was determined as the difference between thresholds in the colocated condition and each spatially separated condition. All participants completed the SR2 at least twice, and 29 of them also participated in the SR2A. In a second experiment, 40 participants completed the SR2 eight times, and the changes in performance were evaluated as a function of test repetition. Results Mean thresholds were slightly better on the SR2 after the first repetition but were consistent across 8 subsequent testing sessions. Performance was consistent for the SR2A, regardless of the number of times testing was repeated. The SR2, which simulates 45° separations of target and maskers, produced spatially separated thresholds that were similar to thresholds obtained with 30° of separation in the anechoic chamber. 
Over headphones and in the anechoic chamber, pure-tone average was a strong predictor of spatial release, whereas age only reached significance for colocated conditions. Conclusions The SR2 is a reliable and effective method of testing spatial release from masking, suitable for screening abnormal listening abilities and for tracking rehabilitation over time. Future work should focus on developing and validating rapid, automated testing to identify the ability of listeners to benefit from high-frequency amplification, smaller spatial separations, and larger spectral differences among talkers.
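The definition of spatial release from masking given in the Method section above (the colocated threshold minus each spatially separated threshold) can be written as a short calculation. The Python sketch below is illustrative: the function name and the threshold values are hypothetical, and it assumes thresholds are expressed in dB, with lower values indicating better performance.

```python
def spatial_release(colocated_db, separated_db_by_angle):
    """Spatial release from masking (dB) for each masker separation.

    SRM = colocated threshold - separated threshold, so a positive value
    means the listener benefited from the spatial separation.
    """
    return {
        angle: colocated_db - threshold
        for angle, threshold in separated_db_by_angle.items()
    }


# Hypothetical thresholds in dB, for illustration only (not data from the study).
colocated = 2.0
separated = {15: -1.0, 30: -4.0, 45: -6.0}
print(spatial_release(colocated, separated))  # {15: 3.0, 30: 6.0, 45: 8.0}
```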


2020 ◽  
Vol 29 (4) ◽  
pp. 907-915
Author(s):  
Nirmal Kumar Srinivasan ◽  
Allison Holtz ◽  
Frederick J. Gallun

Purpose The purpose of this study was to compare the speech identification abilities of individuals of various ages and hearing abilities using traditional methods and the Portable Automated Rapid Testing (PART) iPad app. Method Speech identification data were collected using three techniques: over headphones using a virtual speaker array, using the PART iPad app (UCR Brain Game Center, 2018), and using loudspeaker presentation in a sound-attenuated room. For all three techniques, Coordinate Response Measure sentences were used as the stimuli and “Charlie” was used as the call sign. A progressive tracking procedure was used to estimate the speech identification thresholds for listeners with varying hearing thresholds. The target sentence was always presented at 0° azimuth angle, whereas the maskers were colocated (0°) with the target or symmetrically spatially separated by ±15°, ±30°, or ±45°. Results Data analysis revealed similar speech identification thresholds for the iPad and headphone conditions and slightly poorer thresholds for the loudspeaker array condition across participant groups. This was true for all spatial separations between the target and the maskers. Conclusion The strong correlation between the headphone and iPad data presented in this study indicates that the spatial release from masking module in the PART iPad app can be used as a clinical tool to assess spatial processing ability prior to audiologic evaluation in the clinic and can also be used to make recommendations for and to track progress with aural rehabilitation programs over time.


2021 ◽  
Vol 2069 (1) ◽  
pp. 012162
Author(s):  
G E Puglisi ◽  
A Warzybok ◽  
A Astolfi ◽  
B Kollmeier

Abstract Excessive noise and reverberation times degrade listening abilities in everyday environments. This is particularly true for school settings. Most classrooms in Italy are located in historical buildings that create challenging acoustic environments. So far, few studies have investigated the effect of real acoustics on speech intelligibility and on the spatial release from masking; most have focused on laboratory conditions. In addition, the effect of noise on speech intelligibility has been widely investigated in terms of its energetic rather than its informational content. Therefore, a study involving normal-hearing adults was performed, presenting listening tests via headphones and considering the challenging real acoustics of two primary-school classrooms with reverberation times of 0.4 s and 3.1 s, respectively. The main objective was to investigate the effect of reverberation and noise on the spatial release from masking, to help the design of learning environments. Binaural room impulse responses were acquired, with noise sources at different azimuths from the listener’s head. The spatial release from masking was significantly affected by noise type and reverberation. Longer reverberation times led to worse speech intelligibility, with speech recognition thresholds higher by 6 dB on average. Noise with informational content was more detrimental than energetic noise by 7 dB.

