Development of a Speech-in-Multitalker-Babble Paradigm to Assess Word-Recognition Performance

2003 ◽  
Vol 14 (09) ◽  
pp. 453-470 ◽  
Author(s):  
Richard H. Wilson

A simple word-recognition task in multitalker babble for clinic use was developed in the course of four experiments involving listeners with normal hearing and listeners with hearing loss. In Experiments 1 and 2, psychometric functions for the individual NU No. 6 words from Lists 2, 3, and 4 were obtained with each word in a unique segment of multitalker babble. The test paradigm that emerged involved ten words at each of seven signal-to-babble ratios (S/B) from 0 to 24 dB. Experiment 3 examined the effect that babble presentation level (70, 80, and 90 dB SPL) had on recognition performance in babble, whereas Experiment 4 studied the effect that monaural and binaural listening had on recognition performance. For listeners with normal hearing, the 90th percentile was 6 dB S/B. In comparison to the listeners with normal hearing, the 50% correct points on the functions for listeners with hearing loss were at 5 to 15 dB higher signal-to-babble ratios.

2005 ◽  
Vol 16 (06) ◽  
pp. 367-382 ◽  
Author(s):  
Richard H. Wilson ◽  
Deborah G. Weakley

The purpose of this study was to determine if performances on a 500 Hz MLD task and a word-recognition task in multitalker babble covaried or varied independently for listeners with normal hearing and for listeners with hearing loss. Young listeners with normal hearing (n = 25) and older listeners (25 per decade from 40–80 years, n = 125) with sensorineural hearing loss were studied. Thresholds at 500 and 1000 Hz were ≤30 dB HL and ≤40 dB HL, respectively, with thresholds above 1000 Hz <100 dB HL. There was no systematic relationship between the 500 Hz MLD and word-recognition performance in multitalker babble. Higher SoNo and SπNo thresholds were observed for the older listeners, but the MLDs were the same for all groups. Word recognition in babble in terms of signal-to-babble ratio was on average 6.5 (40- to 49-year-old group) to 10.8 dB (80- to 89-year-old group) poorer for the older listeners with hearing loss. Neither pure-tone thresholds nor word-recognition abilities in quiet accurately predicted word-recognition performance in multitalker babble.


2020 ◽  
Vol 31 (07) ◽  
pp. 531-546
Author(s):  
Mitzarie A. Carlo ◽  
Richard H. Wilson ◽  
Albert Villanueva-Reyes

Abstract Background English materials for speech audiometry are well established. In Spanish, speech-recognition materials are not standardized with monosyllables, bisyllables, and trisyllables used in word-recognition protocols. Purpose This study aimed to establish the psychometric characteristics of common Spanish monosyllabic, bisyllabic, and trisyllabic words for potential use in word-recognition procedures. Research Design Prospective descriptive study. Study Sample Eighteen adult Puerto Ricans (M = 25.6 years) with normal hearing [M = 7.8-dB hearing level (HL) pure-tone average] were recruited for two experiments. Data Collection and Analyses A digital recording of 575 Spanish words was created (139 monosyllables, 359 bisyllables, and 77 trisyllables), incorporating materials from a variety of Spanish word-recognition lists. Experiment 1 (n = 6) used 25 randomly selected words from each of the three syllabic categories to estimate the presentation level ranges needed to obtain recognition performances over the 10 to 90% range. In Experiment 2 (n = 12) the 575 words were presented over five 1-hour sessions using presentation levels from 0- to 30-dB HL in 5-dB steps (monosyllables), 0- to 25-dB HL in 5-dB steps (bisyllables), and −3- to 17-dB HL in 4-dB steps (trisyllables). The presentation order of both the words and the presentation levels were randomized for each listener. The functions for each listener and each word were fit with polynomial equations from which the 50% points and slopes at the 50% point were calculated. Results The mean 50% points and slopes at 50% were 8.9-dB HL, 4.0%/dB (monosyllables), 6.9-dB HL, 5.1%/dB (bisyllables), and 1.4-dB HL, 6.3%/dB (trisyllables). The Kruskal–Wallis test with Mann–Whitney U post-hoc analysis indicated that the mean 50% points and slopes at the 50% points of the individual word functions were significantly different among the syllabic categories. 
Although significant differences were observed among the syllabic categories, substantial overlap was noted in the individual word functions, indicating that the psychometric characteristics of the words were not dictated exclusively by the syllabic number. Influences associated with word difficulty, word familiarity, singular and plural form words, phonetic stress patterns, and gender word patterns also were evaluated. Conclusion The main finding was the direct relation between the number of syllables in a word and word-recognition performance. In general, words with more syllables were more easily recognized; there were, however, exceptions. The current data from young adults with normal hearing established the psychometric characteristics of the 575 Spanish words on which the formulation of word lists for both threshold and suprathreshold measures of word-recognition abilities in quiet and in noise and other word-recognition protocols can be based.
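The abstract above describes fitting each word function with a polynomial and reading off the 50% point and the slope at that point. A minimal sketch of that idea follows, assuming a third-degree polynomial and NumPy; the function name `fifty_percent_point`, the polynomial degree, and the sample data are illustrative assumptions, not details taken from the study.

```python
import numpy as np


def fifty_percent_point(levels_db, pct_correct, degree=3):
    """Fit a polynomial to a psychometric function and return
    (level at 50% correct, slope in %/dB at that level).

    The degree is an assumption; the abstract does not state it.
    """
    levels = np.asarray(levels_db, dtype=float)
    pct = np.asarray(pct_correct, dtype=float)

    # Least-squares polynomial fit, highest power first.
    coeffs = np.polyfit(levels, pct, degree)

    # Solve fit(x) = 50 by shifting the constant term and finding roots.
    shifted = coeffs.copy()
    shifted[-1] -= 50.0
    roots = np.roots(shifted)

    # Keep real roots inside the tested range of presentation levels.
    real_roots = roots[np.abs(np.imag(roots)) < 1e-9].real
    in_range = [r for r in real_roots if levels.min() <= r <= levels.max()]

    # Keep crossings where the fitted function is rising.
    slope_coeffs = np.polyder(coeffs)
    rising = [r for r in in_range if np.polyval(slope_coeffs, r) > 0]
    if not rising:
        raise ValueError("fitted function does not cross 50% in the tested range")

    x50 = min(rising)
    return float(x50), float(np.polyval(slope_coeffs, x50))


if __name__ == "__main__":
    # Hypothetical percent-correct values at 0 to 30 dB HL in 5-dB steps.
    levels = [0, 5, 10, 15, 20, 25, 30]
    pct = [2, 10, 30, 50, 70, 90, 98]
    x50, slope = fifty_percent_point(levels, pct)
    print(f"50% point: {x50:.1f} dB HL, slope: {slope:.2f} %/dB")
```

Restricting the roots to the tested range avoids reporting a 50% crossing where the polynomial is extrapolating rather than describing the data.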


2019 ◽  
Vol 30 (05) ◽  
pp. 370-395 ◽  
Author(s):  
Richard H. Wilson

Abstract The Auditec of St. Louis and the Department of Veterans Affairs (VA) recorded versions of the Northwestern University Auditory Test No. 6 (NU-6) are in common usage. Data on young adults with normal hearing for pure tones (YNH) demonstrate equal recognition performances on the two versions when the VA version is presented 5 dB higher, but similar data on older listeners with sensorineural hearing loss (OHL) are lacking. To compare word-recognition performances on the Auditec and VA versions of NU-6 presented at six presentation levels with YNH and OHL listeners. A quasi-experimental, repeated-measures design was used. Twelve YNH (M = 24.0 years; PTA = 9.9-dB HL) and 36 OHL listeners (M = 71.6 years; PTA = 26.7-dB HL) participated in three, one-hour sessions. Each listener received 100 stimulus words that were randomized by 6 presentation levels for each of two speakers (YNH, −2 to 28-dB SL; OHL, −2 to 38-dB SL). The sessions were limited to 25 practice and 400 experimental words. Digital versions of the 16, 25-word tracks for each session were alternated between speakers. Each of the 48 listeners had higher recognition performances on the Auditec version of NU-6 than on the VA version. The respective overall recognition performances on the Auditec and VA versions were 71.4% and 64.1% (YNH) and 68.7% and 58.2% (OHL). At the highest presentation levels, recognition performances on the two versions differed by only 0.5% (YNH) and 3.3% (OHL). At the 50% correct point, performances on the Auditec version were 3.2 dB (YNH) and 6.1 dB (OHL) better than those on the VA version. The slopes at the 50% points on the mean functions for both speakers were about 4.9%/dB (YNH) and 3.0%/dB (OHL); however, the slopes evaluated from the individual listener data were steeper, 5.2 to 5.3%/dB (YNH) and 3.3 to 3.5%/dB (OHL). When the individual data were transformed from dB SL to dB HL, the differences between the two listener groups were emphasized.
The four functions (2 speakers by 2 listener groups) were plotted for each of the 48 participants and each of the 200 words, which revealed the gamut of relations among the datasets. Examination of the data for each speaker across test sessions, in the traditional 50-word lists, and in the typically used 25-word lists of Randomization A revealed no differences of clinical concern. Finally, introspective reports from the listeners revealed that 91.7% and 83.3% of the YNH and OHL listeners, respectively, thought the Auditec speaker was easier to understand than the VA speaker. Recognition performances on each participant and on each word are presented.


2019 ◽  
Vol 62 (4) ◽  
pp. 1051-1067 ◽  
Author(s):  
Jonathan H. Venezia ◽  
Allison-Graham Martin ◽  
Gregory Hickok ◽  
Virginia M. Richards

Purpose Age-related sensorineural hearing loss can dramatically affect speech recognition performance due to reduced audibility and suprathreshold distortion of spectrotemporal information. Normal aging produces changes within the central auditory system that impose further distortions. The goal of this study was to characterize the effects of aging and hearing loss on perceptual representations of speech. Method We asked whether speech intelligibility is supported by different patterns of spectrotemporal modulations (STMs) in older listeners compared to young normal-hearing listeners. We recruited 3 groups of participants: 20 older hearing-impaired (OHI) listeners, 19 age-matched normal-hearing listeners, and 10 young normal-hearing (YNH) listeners. Listeners performed a speech recognition task in which randomly selected regions of the speech STM spectrum were revealed from trial to trial. The overall amount of STM information was varied using an up–down staircase to hold performance at 50% correct. Ordinal regression was used to estimate weights showing which regions of the STM spectrum were associated with good performance (a “classification image” or CImg). Results The results indicated that (a) large-scale CImg patterns did not differ between the 3 groups; (b) weights in a small region of the CImg decreased systematically as hearing loss increased; (c) CImgs were also nonsystematically distorted in OHI listeners, and the magnitude of this distortion predicted speech recognition performance even after accounting for audibility; and (d) YNH listeners performed better overall than the older groups. Conclusion We conclude that OHI/older normal-hearing listeners rely on the same speech STMs as YNH listeners but encode this information less efficiently. Supplemental Material https://doi.org/10.23641/asha.7859981


2015 ◽  
Vol 26 (04) ◽  
pp. 331-345 ◽  
Author(s):  
Richard H. Wilson ◽  
Rachel McArdle

Background: In developing the PB-50 word lists, J. P. Egan suggested five developmental principles, two of which were “equal average difficulty” and an “equal range of difficulty” among the lists (page 963). Egan was satisfied that each of the 20 PB-50 lists had equivalent ranges of recognition performances and that the lists produced the same average performances. This was accomplished in preliminary studies that measured the recognition performance of each word and eliminated words that were always or never correct. In preparing for studies of interrupted words, we needed to know the range of difficulty inherent in the speaker specific NU-6 and Maryland CNC materials we planned to use when those words were not interrupted. There were only a few studies in the literature that touched on the range of difficulty characteristic of the word-recognition materials in common usage. The paucity of this information prompted this investigation whose scope broadened to include the CID W-22, Maryland CNC, NU-6, and PB-50 materials spoken by a variety of speakers. Purpose: The purpose was to evaluate the homogeneity with respect to intelligibility of the words that comprise several of the common word-recognition materials used in audiologic evaluations. Research Design: Both retrospective (10) and prospective (3) studies were involved. Data from six of the retrospective studies were from our labs. The prospective studies involved both listeners with normal hearing for pure tones and listeners with sensorineural hearing loss. Study Sample: The sample sizes for the 13 data sets ranged from 24 to 1,030, with 24 the typical number for listeners with normal hearing. Data Collection and Analysis: The retrospective data were from published studies and archived data from our laboratories. The prospective studies involved presentation of the word-recognition materials to the listeners at a comfortable level. 
An item analysis was conducted on each data set with descriptive statistics used to characterize the data. Additionally, skewness coefficients were calculated on the distributions of word performances and the interquartile range was used to determine minor and major outliers within each set of 200 words and their component 50-word lists (300 words for the Maryland CNCs). Results: For listeners with normal hearing the majority of performances on the words within a 50-word list were better than the mean performance, which produced negatively skewed distributions with outlier performances in every list. For listeners with sensorineural hearing loss the performances on the words within a 50-word list were evenly distributed above and below the mean performance, which yielded essentially normal distributions with few outliers. There were a few words on which performances were better by the listeners with hearing loss. Conclusions: Every list of word-recognition materials has a few words on which recognition performances are noticeably poorer than performances on the majority of the remaining words. If the intention of an experiment is to evaluate performance at the word level, then identifying these “outliers” becomes a necessity. Although not evaluated in this report, the implications for 25-word lists are they should be based on recognition-performance data and not compiled arbitrarily.
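The item analysis described above relies on skewness coefficients and on the interquartile range to flag minor and major outliers. The sketch below illustrates one common way to do this, assuming Tukey-style fences at 1.5 and 3 times the interquartile range and the moment-based skewness coefficient; the abstract does not state which multipliers or skewness formula were used, so these choices, the function names, and the sample data are assumptions.

```python
import numpy as np


def skewness(values):
    """Moment-based skewness coefficient (third central moment over the
    second central moment raised to 1.5). Negative values mean the long
    tail lies toward low scores."""
    x = np.asarray(values, dtype=float)
    dev = x - x.mean()
    m2 = np.mean(dev ** 2)
    m3 = np.mean(dev ** 3)
    return float(m3 / m2 ** 1.5)


def classify_outliers(values, minor_k=1.5, major_k=3.0):
    """Return indices of minor and major outliers using IQR fences.

    minor_k and major_k are assumed multipliers (Tukey's convention),
    not values reported in the abstract.
    """
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1

    result = {"minor": [], "major": []}
    for index, value in enumerate(x):
        # Distance outside the quartile box; negative when inside it.
        distance = max(q1 - value, value - q3)
        if distance > major_k * iqr:
            result["major"].append(index)
        elif distance > minor_k * iqr:
            result["minor"].append(index)
    return result


if __name__ == "__main__":
    # Hypothetical percent-correct scores for 12 words in one list.
    word_scores = [96, 96, 92, 100, 96, 92, 88, 96, 100, 92, 80, 40]
    print("skewness:", round(skewness(word_scores), 2))
    print("outliers:", classify_outliers(word_scores))
```

With scores clustered near the ceiling and a few difficult words, the skewness comes out negative, which matches the pattern the abstract reports for listeners with normal hearing.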


2020 ◽  
Vol 31 (06) ◽  
pp. 412-441 ◽  
Author(s):  
Richard H. Wilson ◽  
Victoria A. Sanchez

Abstract Background In the 1950s, with monitored live voice testing, the vu meter time constant and the short durations and amplitude modulation characteristics of monosyllabic words necessitated the use of the carrier phrase amplitude to monitor (indirectly) the presentation level of the words. This practice continues with recorded materials. To relieve the carrier phrase of this function, first the influence that the carrier phrase has on word recognition performance needs clarification, which is the topic of this study. Purpose Recordings of Northwestern University Auditory Test No. 6 by two female speakers were used to compare word recognition performances with and without the carrier phrases when the carrier phrase and test word were (1) in the same utterance stream with the words excised digitally from the carrier (VA-1 speaker) and (2) independent of one another (VA-2 speaker). The 50-msec segment of the vowel in the target word with the largest root mean square amplitude was used to equate the target word amplitudes. Research Design A quasi-experimental, repeated measures design was used. Study Sample Twenty-four young normal-hearing adults (YNH; M = 23.5 years; pure-tone average [PTA] = 1.3-dB HL) and 48 older hearing loss listeners (OHL; M = 71.4 years; PTA = 21.8-dB HL) participated in two, one-hour sessions. Data Collection and Analyses Each listener had 16 listening conditions (2 speakers × 2 carrier phrase conditions × 4 presentation levels) with 100 randomized words, 50 different words by each speaker. Each word was presented 8 times (2 carrier phrase conditions × 4 presentation levels [YNH, 0- to 24-dB SL; OHL, 6- to 30-dB SL]). The 200 recorded words for each condition were randomized as 8, 25-word tracks. In both test sessions, one practice track was followed by 16 tracks alternated between speakers and randomized by blocks of the four conditions. Central tendency and repeated measures analyses of variance statistics were used. 
Results With the VA-1 speaker, the overall mean recognition performances were 6.0% (YNH) and 8.3% (OHL) significantly better with the carrier phrase than without the carrier phrase. These differences were in part attributed to the distortion of some words caused by the excision of the words from the carrier phrases. With the VA-2 speaker, recognition performances on the with and without carrier phrase conditions by both listener groups were not significantly different, except for one condition (YNH listeners at 8-dB SL). The slopes of the mean functions were steeper for the YNH listeners (3.9%/dB to 4.8%/dB) than for the OHL listeners (2.4%/dB to 3.4%/dB) and were <1%/dB steeper for the VA-1 speaker than for the VA-2 speaker. Although the mean results were clear, the variability in performance differences between the two carrier phrase conditions for the individual participants and for the individual words was striking and was considered in detail. Conclusion The current data indicate that word recognition performances with and without the carrier phrase (1) were different when the carrier phrase and target word were produced in the same utterance with poorer performances when the target words were excised from their respective carrier phrases (VA-1 speaker), and (2) were the same when the carrier phrase and target word were produced as independent utterances (VA-2 speaker).


2008 ◽  
Vol 19 (06) ◽  
pp. 496-506 ◽  
Author(s):  
Richard H. Wilson ◽  
Rachel McArdle ◽  
Heidi Roberts

Background: So that portions of the classic Miller, Heise, and Lichten (1951) study could be replicated, new recorded versions of the words and digits were made because none of the three common monosyllabic word lists (PAL PB-50, CID W-22, and NU–6) contained the 9 monosyllabic digits (1–10, excluding 7) that were used by Miller et al. It is well established that different psychometric characteristics have been observed for different lists and even for the same materials spoken by different speakers. The decision was made to record four lists of each of the three monosyllabic word sets, the monosyllabic digits not included in the three sets of word lists, and the CID W-1 spondaic words. A professional female speaker with a General American dialect recorded the materials during four recording sessions within a 2-week interval. The recording order of the 582 words was random. Purpose: To determine—on listeners with normal hearing—the psychometric properties of the five speech materials presented in speech-spectrum noise. Research Design: A quasi-experimental, repeated-measures design was used. Study Sample: Twenty-four young adult listeners (M = 23 years) with normal pure-tone thresholds (≤20-dB HL at 250 to 8000 Hz) participated. The participants were university students who were unfamiliar with the test materials. Data Collection and Analysis: The 582 words were presented at four signal-to-noise ratios (SNRs; −7-, −2-, 3-, and 8-dB) in speech-spectrum noise fixed at 72-dB SPL. Although the main metric of interest was the 50% point on the function for each word established with the Spearman-Kärber equation (Finney, 1952), the percentage correct on each word at each SNR was evaluated. The psychometric characteristics of the PB-50, CID W-22, and NU–6 monosyllabic word lists were compared with one another, with the CID W-1 spondaic words, and with the 9 monosyllabic digits. 
Results: Recognition performance on the four lists within each of the three monosyllabic word materials was equivalent, ±0.4 dB. Likewise, word-recognition performance on the PB-50, W-22, and NU–6 word lists was equivalent, ±0.2 dB. The mean recognition performance at the 50% point with the 36 W-1 spondaic words was ~6.2 dB lower than the 50% point with the monosyllabic words. Recognition performance on the monosyllabic digits was 1–2 dB better than mean performance on the monosyllabic words. Conclusions: Word-recognition performances on the three sets of materials (PB-50, CID W-22, and NU–6) were equivalent, as were the performances on the four lists that make up each of the three materials. Phonetic/phonemic balance does not appear to be an important consideration in the compilation of word-recognition lists used to evaluate the ability of listeners to understand speech. A companion paper examines the acoustic, phonetic/phonological, and lexical variables that may predict the relative ease or difficulty with which these monosyllabic words were recognized in noise (McArdle and Wilson, this issue).
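The abstract above reports 50% points established with the Spearman-Kärber equation. A minimal sketch of a commonly cited form of that equation follows, assuming equally spaced presentation levels and a function that runs from 0% below the lowest level to 100% above the highest level: 50% point = highest level + step/2 − step × (sum of the proportions correct). The function name and the sample proportions are hypothetical, and the exact form used in the study may differ.

```python
def spearman_karber_50(levels_db, proportions_correct):
    """Estimate the 50% point from proportions correct at equally
    spaced levels, using: highest + step/2 - step * sum(proportions).

    Assumes performance is 0% below the lowest level and 100% above
    the highest level.
    """
    if len(levels_db) != len(proportions_correct) or len(levels_db) < 2:
        raise ValueError("need matching lists with at least two levels")

    # Sort by level so the input order does not matter.
    pairs = sorted(zip(levels_db, proportions_correct))
    levels = [level for level, _ in pairs]

    # The equation requires a constant step size between levels.
    steps = {round(b - a, 9) for a, b in zip(levels, levels[1:])}
    if len(steps) != 1:
        raise ValueError("levels must be equally spaced")
    step = steps.pop()

    highest = levels[-1]
    total = sum(proportion for _, proportion in pairs)
    return highest + step / 2 - step * total


if __name__ == "__main__":
    # The four SNRs from the abstract; the proportions are hypothetical.
    snrs = [-7, -2, 3, 8]
    proportions = [0.0, 0.25, 0.75, 1.0]
    print(spearman_karber_50(snrs, proportions), "dB SNR")
```

A quick sanity check is a step function: if a word is never correct at −7 and −2 dB and always correct at 3 and 8 dB, the estimate falls midway between −2 and 3 dB.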


2005 ◽  
Vol 16 (08) ◽  
pp. 622-630 ◽  
Author(s):  
Richard H. Wilson ◽  
Christopher A. Burks ◽  
Deborah G. Weakley

The purpose of this experiment was to determine the relationship between psychometric functions for words presented in multitalker babble using a descending presentation level protocol and a random presentation level protocol. Forty veterans (mean = 63.5 years) with mild-to-moderate sensorineural hearing losses were enrolled. Seventy of the Northwestern University Auditory Test No. 6 words spoken by the VA female speaker were presented at seven signal-to-babble ratios from 24 to 0 dB (10 words/step). Although the random procedure required 69 sec longer to administer than the descending protocol, there was no significant difference between the results obtained with the two psychophysical methods. There was almost no relation between the perceived ability of the listeners to understand speech in background noise and their measured ability to understand speech in multitalker babble. Likewise, there was a tenuous relation between pure-tone thresholds and performance on the words in babble and between recognition performance in quiet and performance on the words in babble.


2005 ◽  
Vol 36 (3) ◽  
pp. 219-229 ◽  
Author(s):  
Peggy Nelson ◽  
Kathryn Kohnert ◽  
Sabina Sabur ◽  
Daniel Shaw

Purpose: Two studies were conducted to investigate the effects of classroom noise on attention and speech perception in native Spanish-speaking second graders learning English as their second language (L2) as compared to English-only-speaking (EO) peers. Method: Study 1 measured children’s on-task behavior during instructional activities with and without soundfield amplification. Study 2 measured the effects of noise (+10 dB signal-to-noise ratio) using an experimental English word recognition task. Results: Findings from Study 1 revealed no significant condition (pre/postamplification) or group differences in observations in on-task performance. Main findings from Study 2 were that word recognition performance declined significantly for both L2 and EO groups in the noise condition; however, the impact was disproportionately greater for the L2 group. Clinical Implications: Children learning in their L2 appear to be at a distinct disadvantage when listening in rooms with typical noise and reverberation. Speech-language pathologists and audiologists should collaborate to inform teachers, help reduce classroom noise, increase signal levels, and improve access to spoken language for L2 learners.


2021 ◽  
pp. 1-10
Author(s):  
Ward R. Drennan

Introduction: Normal-hearing people often have complaints about the ability to recognize speech in noise. Such disabilities are not typically assessed with conventional audiometry. Suprathreshold temporal deficits might contribute to reduced word recognition in noise as well as reduced temporally based binaural release of masking for speech. Extended high-frequency audibility (>8 kHz) has also been shown to contribute to speech perception in noise. The primary aim of this study was to compare conventional audiometric measures with measures that could reveal subclinical deficits. Methods: Conventional and extended high-frequency audiometry was done with 119 normal-hearing people ranging in age from 18 to 72. The ability to recognize words in noise was evaluated with and without differences in temporally based spatial cues. A low-uncertainty, closed-set word recognition task was used to limit cognitive influences. Results: In normal-hearing listeners, word recognition in noise ability decreases significantly with increasing pure-tone average (PTA). On average, signal-to-noise ratios worsened by 5.7 and 6.0 dB over the normal range, for the diotic and dichotic conditions, respectively. When controlling for age, a significant relationship remained in the diotic condition. Measurement error was estimated at 1.4 and 1.6 dB for the diotic and dichotic conditions, respectively. Controlling for both PTA and age, EHF-PTAs showed significant partial correlations with SNR50 in both conditions (ρ = 0.30 and 0.23). Temporally based binaural release of masking worsened with age by 1.94 dB from 18 to 72 years old but showed no significant relationship with either PTA. Conclusions: All three assessments in this study demonstrated hearing problems independently of those observed in conventional audiometry.
Considerable degradations in word recognition in noise abilities were observed as PTAs increased within the normal range. The use of an efficient words-in-noise measure might help identify functional hearing problems for individuals that are traditionally normal hearing. Extended audiometry provided additional predictive power for word recognition in noise independent of both the PTA and age. Temporally based binaural release of masking for word recognition decreased with age independent of PTAs within the normal range, indicating multiple mechanisms of age-related decline with potential clinical impact.

