Word Recognition with Segmented-Alternated CVC Words

1984 ◽  
Vol 27 (3) ◽  
pp. 378-386 ◽  
Author(s):  
Richard H. Wilson ◽  
John T. Arcos ◽  
Howard C. Jones

Consonant-vowel-consonant (CVC) monosyllabic words were segmented at the approximate phoneme boundaries and were presented to subjects with normal hearing in the following sequence: (a) the carrier phrase to both ears, (b) the initial consonant segment to one ear, (c) the vowel segment to the other ear, and (d) the final consonant segment to the ear that received the initial consonant. A computer technique, which is described in detail, was used to develop the test materials. The digital editing did not alter appreciably the spectral or temporal characteristics of the words. A series of four experiments produced a list of 50 words on which 10% correct word recognition was achieved by listeners with normal hearing when the vowel segment or the consonant segments of the words were presented monaurally in isolation. When the speech materials were presented binaurally—that is, the vowel segment in one ear and consonant segments in the other ear—word-recognition performance improved to 90% correct.
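The dichotic routing of the word segments described above (initial consonant to one ear, vowel to the other, final consonant back to the first ear) can be sketched in a few lines. This is an illustration only, not the authors' computer technique; the phoneme-boundary times and the function name are hypothetical and are supplied by the caller rather than derived from the segmentation analysis the paper describes.

```python
import numpy as np

def segment_alternate(word, boundaries, fs):
    """Split a CVC word at its two phoneme boundaries (in seconds) and
    route the segments dichotically: initial and final consonants to the
    left channel, vowel to the right. Hypothetical sketch; boundaries
    must be determined externally (e.g., by inspection of the waveform)."""
    b1, b2 = (int(round(b * fs)) for b in boundaries)
    left = np.zeros_like(word)
    right = np.zeros_like(word)
    left[:b1] = word[:b1]        # initial consonant -> left ear
    right[b1:b2] = word[b1:b2]   # vowel -> right ear
    left[b2:] = word[b2:]        # final consonant -> same ear as initial
    return np.column_stack([left, right])  # stereo: [left, right]
```

Summing the two channels reconstructs the original word, which mirrors the paper's observation that the editing leaves the overall spectral and temporal characteristics intact.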

2008 ◽  
Vol 19 (06) ◽  
pp. 496-506 ◽  
Author(s):  
Richard H. Wilson ◽  
Rachel McArdle ◽  
Heidi Roberts

Background: So that portions of the classic Miller, Heise, and Lichten (1951) study could be replicated, new recorded versions of the words and digits were made because none of the three common monosyllabic word lists (PAL PB-50, CID W-22, and NU–6) contained the 9 monosyllabic digits (1–10, excluding 7) that were used by Miller et al. It is well established that different psychometric characteristics have been observed for different lists and even for the same materials spoken by different speakers. The decision was made to record four lists of each of the three monosyllabic word sets, the monosyllabic digits not included in the three sets of word lists, and the CID W-1 spondaic words. A professional female speaker with a General American dialect recorded the materials during four recording sessions within a 2-week interval. The recording order of the 582 words was random. Purpose: To determine—on listeners with normal hearing—the psychometric properties of the five speech materials presented in speech-spectrum noise. Research Design: A quasi-experimental, repeated-measures design was used. Study Sample: Twenty-four young adult listeners (M = 23 years) with normal pure-tone thresholds (≤20-dB HL at 250 to 8000 Hz) participated. The participants were university students who were unfamiliar with the test materials. Data Collection and Analysis: The 582 words were presented at four signal-to-noise ratios (SNRs; −7-, −2-, 3-, and 8-dB) in speech-spectrum noise fixed at 72-dB SPL. Although the main metric of interest was the 50% point on the function for each word established with the Spearman-Kärber equation (Finney, 1952), the percentage correct on each word at each SNR was evaluated. The psychometric characteristics of the PB-50, CID W-22, and NU–6 monosyllabic word lists were compared with one another, with the CID W-1 spondaic words, and with the 9 monosyllabic digits. 
Results: Recognition performance on the four lists within each of the three monosyllabic word materials was equivalent, ±0.4 dB. Likewise, word-recognition performance on the PB-50, W-22, and NU–6 word lists was equivalent, ±0.2 dB. The mean recognition performance at the 50% point with the 36 W-1 spondaic words was ~6.2 dB lower than the 50% point with the monosyllabic words. Recognition performance on the monosyllabic digits was 1–2 dB better than mean performance on the monosyllabic words. Conclusions: Word-recognition performances on the three sets of materials (PB-50, CID W-22, and NU–6) were equivalent, as were the performances on the four lists that make up each of the three materials. Phonetic/phonemic balance does not appear to be an important consideration in the compilation of word-recognition lists used to evaluate the ability of listeners to understand speech. A companion paper examines the acoustic, phonetic/phonological, and lexical variables that may predict the relative ease or difficulty with which these monosyllabic words were recognized in noise (McArdle and Wilson, this issue).
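The Spearman-Kärber estimate of each word's 50% point can be illustrated with a short sketch. This is the generic form of the equation, not the authors' implementation; it assumes percent correct rises monotonically across the tested SNRs and pads the function so it spans 0% and 100% one step outside the tested range.

```python
def spearman_karber_50(snrs, pcorrect):
    """Estimate the 50% point (dB SNR) of a word's psychometric function
    with the Spearman-Karber equation (Finney, 1952). `snrs` must be
    equally spaced and ascending; `pcorrect` holds proportions (0-1)."""
    step = snrs[1] - snrs[0]
    # Pad so the function reaches 0.0 and 1.0 just outside the tested SNRs
    s = [snrs[0] - step] + list(snrs) + [snrs[-1] + step]
    p = [0.0] + list(pcorrect) + [1.0]
    # Threshold = sum of interval midpoints weighted by the rise in p
    t50 = 0.0
    for i in range(len(s) - 1):
        t50 += (p[i + 1] - p[i]) * (s[i] + s[i + 1]) / 2.0
    return t50
```

For example, a word scoring 0%, 25%, 75%, and 100% at the study's four SNRs (−7, −2, 3, and 8 dB) yields a 50% point of 0.5 dB SNR under these assumptions.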


2008 ◽  
Vol 19 (06) ◽  
pp. 507-518 ◽  
Author(s):  
Rachel McArdle ◽  
Richard H. Wilson

Purpose: To analyze the 50% correct recognition data from the Wilson et al (this issue) study, which were obtained from 24 listeners with normal hearing, and to examine whether acoustic, phonetic, or lexical variables can predict recognition performance for monosyllabic words presented in speech-spectrum noise. Research Design: The specific variables are as follows: (a) acoustic variables (i.e., effective root-mean-square sound pressure level, duration), (b) phonetic variables (i.e., consonant features such as manner, place, and voicing for initial and final phonemes; vowel phonemes), and (c) lexical variables (i.e., word frequency, word familiarity, neighborhood density, neighborhood frequency). Data Collection and Analysis: The descriptive, correlational study examined the influence of acoustic, phonetic, and lexical variables on speech-recognition-in-noise performance. Results: Regression analysis demonstrated that 45% of the variance in the 50% point was accounted for by acoustic and phonetic variables, whereas only 3% of the variance was accounted for by lexical variables. These findings suggest that monosyllabic word recognition in noise is more dependent on bottom-up processing than on top-down processing. Conclusions: The results suggest that when speech-in-noise testing is used in a pre- and post-hearing-aid-fitting format, the use of monosyllabic words may be sensitive to changes in audibility resulting from amplification.
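The variance-accounted-for comparison behind the 45% vs. 3% result can be illustrated with a minimal least-squares sketch. This is generic, not the authors' analysis; the predictor matrix here simply stands in for a set of acoustic/phonetic or lexical variables.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of variance in y accounted for by an ordinary
    least-squares linear model on the predictors in X (a column of
    ones is added for the intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()
```

Fitting the per-word 50% points first on one predictor set and then on the other, and comparing the two R² values, is the kind of contrast the regression analysis reports.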


1963 ◽  
Vol 6 (3) ◽  
pp. 263-269 ◽  
Author(s):  
Richard G. Chappell ◽  
James F. Kavanagh ◽  
Stanley Zerlin

Normal-hearing adults demonstrated approximately 20% better intelligibility scores for monosyllabic words presented binaurally (with a background of conversation) than for these words presented monaurally. The test materials were recorded on dual-channel tape through two head-mounted microphones. These microphones were directed toward each of three speakers who in turn produced the monosyllabic words while two simultaneous conversations were carried on by four other participants. Throughout the recording session the experimenters attempted to preserve as naturalistic a situation as possible. The 18 subjects with normal hearing listened through earphones to a single channel of this tape presented monaurally and to both channels delivered binaurally. The difference between the monaural and binaural intelligibility scores is discussed in terms of image-separation in space.


2020 ◽  
Vol 31 (07) ◽  
pp. 531-546
Author(s):  
Mitzarie A. Carlo ◽  
Richard H. Wilson ◽  
Albert Villanueva-Reyes

Abstract Background English materials for speech audiometry are well established. In Spanish, speech-recognition materials are not standardized with monosyllables, bisyllables, and trisyllables used in word-recognition protocols. Purpose This study aimed to establish the psychometric characteristics of common Spanish monosyllabic, bisyllabic, and trisyllabic words for potential use in word-recognition procedures. Research Design Prospective descriptive study. Study Sample Eighteen adult Puerto Ricans (M = 25.6 years) with normal hearing [M = 7.8-dB hearing level (HL) pure-tone average] were recruited for two experiments. Data Collection and Analyses A digital recording of 575 Spanish words was created (139 monosyllables, 359 bisyllables, and 77 trisyllables), incorporating materials from a variety of Spanish word-recognition lists. Experiment 1 (n = 6) used 25 randomly selected words from each of the three syllabic categories to estimate the presentation level ranges needed to obtain recognition performances over the 10 to 90% range. In Experiment 2 (n = 12) the 575 words were presented over five 1-hour sessions using presentation levels from 0- to 30-dB HL in 5-dB steps (monosyllables), 0- to 25-dB HL in 5-dB steps (bisyllables), and −3- to 17-dB HL in 4-dB steps (trisyllables). The presentation orders of both the words and the presentation levels were randomized for each listener. The functions for each listener and each word were fit with polynomial equations from which the 50% points and slopes at the 50% point were calculated. Results The mean 50% points and slopes at 50% were 8.9-dB HL, 4.0%/dB (monosyllables), 6.9-dB HL, 5.1%/dB (bisyllables), and 1.4-dB HL, 6.3%/dB (trisyllables). The Kruskal–Wallis test with Mann–Whitney U post-hoc analysis indicated that the mean 50% points and slopes at the 50% points of the individual word functions were significantly different among the syllabic categories.
Although significant differences were observed among the syllabic categories, substantial overlap was noted in the individual word functions, indicating that the psychometric characteristics of the words were not dictated exclusively by the syllabic number. Influences associated with word difficulty, word familiarity, singular and plural form words, phonetic stress patterns, and gender word patterns also were evaluated. Conclusion The main finding was the direct relation between the number of syllables in a word and word-recognition performance. In general, words with more syllables were more easily recognized; there were, however, exceptions. The current data from young adults with normal hearing established the psychometric characteristics of the 575 Spanish words on which the formulation of word lists for both threshold and suprathreshold measures of word-recognition abilities in quiet and in noise and other word-recognition protocols can be based.
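The per-word polynomial fitting described above (a 50% point and a slope at that point derived from a fitted performance-level function) can be sketched as follows. The polynomial degree and the root-selection rule are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np

def fit_50_point(levels_db, pct_correct, degree=3):
    """Fit a polynomial to a word's performance-level function and
    return (level in dB at 50% correct, slope in %/dB at that level).
    Hypothetical re-creation of the kind of per-word analysis described."""
    coeffs = np.polyfit(levels_db, pct_correct, degree)
    poly = np.poly1d(coeffs)
    # Solve for crossings of the 50% line within the tested level range
    roots = (poly - 50.0).roots
    crossings = [r.real for r in roots
                 if abs(r.imag) < 1e-9
                 and levels_db[0] <= r.real <= levels_db[-1]]
    level_50 = min(crossings)        # first crossing from below
    slope_50 = poly.deriv()(level_50)
    return level_50, slope_50
```

With a symmetric function rising from 0% to 100% over a 30-dB range, the fit recovers a 50% point at the midpoint level and a slope near the local rate of change, analogous to the 4–6%/dB slopes reported for the three syllabic categories.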


1980 ◽  
Vol 45 (2) ◽  
pp. 223-238 ◽  
Author(s):  
Richard H. Wilson ◽  
June K. Antablin

The Picture Identification Task was developed to estimate the word-recognition performance of nonverbal adults. Four lists of 50 monosyllabic words each were assembled and recorded. Each test word and three rhyming alternatives were illustrated and photographed in a quadrant arrangement. The task of the patient was to point to the picture representing the recorded word that was presented through the earphone. In the first experiment with young adults, no significant differences were found between the Picture Identification Task and the Northwestern University Auditory Test No. 6 materials in an open-set response paradigm. In the second experiment, the Picture Identification Task with the picture-pointing response was compared with the Northwestern University Auditory Test No. 6 in both an open-set and a closed-set response paradigm. The results from this experiment demonstrated significant differences among the three response tasks. The easiest task was a closed-set response to words, the next was a closed-set response to pictures, and the most difficult task was an open-set response. At high stimulus-presentation levels, however, the three tasks produced similar results. Finally, the clinical use of the Picture Identification Task is described along with preliminary results obtained from 30 patients with various communicative impairments.


1991 ◽  
Vol 34 (6) ◽  
pp. 1436-1438 ◽  
Author(s):  
Richard H. Wilson ◽  
John P. Preece ◽  
Courtney S. Crowther

The NU No. 6 materials spoken by a female speaker were passed through a notch filter centered at 247 Hz with a 34-dB depth. The filtering reduced the amplitude range within the spectrum of the materials by 10 dB, which was reflected as a 7.5-VU reduction measured on a true VU meter. Thus, the notch filtering in effect changed the level calibration of the materials. Psychometric functions for the NU No. 6 materials, filtered and unfiltered, in 60-dB SPL broadband noise were obtained from 12 listeners with normal hearing. Although the slopes of the functions for the two conditions were the same, the functions were displaced by an average of 5.8 dB, with the function for the filtered materials located at the lower sound-pressure levels.


2017 ◽  
Vol 28 (01) ◽  
pp. 068-079
Author(s):  
Richard H. Wilson ◽  
Kadie C. Sharrett

Abstract Background: Two previous experiments from our laboratory with 70 interrupted monosyllabic words demonstrated that recognition performance was influenced by the temporal location of the interruption pattern. The interruption pattern (10 interruptions/sec, 50% duty cycle) was always the same and referenced word onset; the only difference between the patterns was the temporal location of the on- and off-segments of the interruption cycle. In the first study, both young and older listeners obtained better recognition performances when the initial on-segment coincided with word onset than when the initial on-segment was delayed by 50 msec. The second experiment with 24 young listeners detailed recognition performance as the interruption pattern was incremented in 10-msec steps through the 0- to 90-msec onset range. Across the onset conditions, 95% of the functions were either flat or U-shaped. Purpose: To define the effects that interruption pattern locations had on word recognition by older listeners with sensorineural hearing loss as the interruption pattern was incremented, re: word onset, from 0 to 90 msec in 10-msec steps. Research Design: A repeated-measures design with ten interruption patterns (onset conditions) and one uninterrupted condition. Study Sample: Twenty-four older males (mean = 69.6 yr) with sensorineural hearing loss participated in two 1-hour sessions. The three-frequency pure-tone average was 24.0 dB HL and word recognition was ≥80% correct. Data Collection and Analysis: Seventy consonant-vowel nucleus-consonant words formed the corpus of materials, with 25 additional words used for practice. For each participant, the 700 interrupted stimuli (70 words by 10 onset conditions), the 70 uninterrupted words, and two practice lists each were randomized and recorded on compact disc in 33 tracks of 25 words each. The data were analyzed at the participant and word levels and compared with the results obtained earlier from 24 young listeners with normal hearing.
Results: The mean recognition performance on the 70 uninterrupted words was 91.0%, with an overall mean performance on the ten interruption conditions of 63.2% (range: 57.9–69.3%), compared with 80.4% (range: 73.0–87.7%) obtained earlier from the young adults. The best performances were at the extremes of the onset conditions. Standard deviations ranged from 22.1% to 28.1% (24 participants) and from 9.2% to 12.8% (70 words). An arithmetic algorithm categorized the shapes of the psychometric functions across the ten onset conditions. For the older participants in the current study, 40.0% of the functions were flat, 41.4% were U-shaped, and 18.6% were inverted U-shaped, which compared favorably to the function shapes from the young listeners in the earlier study (50.0%, 41.4%, and 8.6%, respectively). There were two words on which the older listeners had 40% better performances. Conclusions: Collectively, the data are orderly, but at the individual word or participant level the data are somewhat volatile, which may reflect auditory processing differences between the participant groups. The diversity of recognition performances by the older listeners on the ten interruption conditions with each of the 70 words supports the notion that the term hearing loss encompasses processes well beyond the filtering produced by end-organ sensitivity deficits.
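The interruption manipulation (10 interruptions/sec, 50% duty cycle, with the first on-segment delayed re: word onset in 10-msec steps) can be sketched as a periodic gate applied to the waveform. This is a generic illustration of the stimulus pattern, not the authors' stimulus-preparation code.

```python
import numpy as np

def interrupt(signal, fs, rate=10, duty=0.5, onset_ms=0):
    """Gate a word with a periodic on/off pattern: `rate` interruptions
    per second at the given duty cycle, with the first on-segment
    delayed by `onset_ms` relative to word onset (sample 0)."""
    t = np.arange(len(signal)) / fs
    period = 1.0 / rate
    # Phase of each sample within the gating cycle, shifted by the delay
    phase = ((t - onset_ms / 1000.0) % period) / period
    gate = (phase < duty).astype(signal.dtype)
    return signal * gate
```

With `onset_ms=0` the word starts in an on-segment; with `onset_ms=50` (the delayed condition of the first study) the first 50 msec are gated off and the on-segments fall on the alternate half-cycles.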


2002 ◽  
Vol 45 (3) ◽  
pp. 585-597 ◽  
Author(s):  
Sumiko Takayanagi ◽  
Donald D. Dirks ◽  
Anahita Moshfegh

Evidence suggests that word recognition depends on numerous talker-, listener-, and stimulus-related characteristics. The current study examined the effects of talker variability and lexical difficulty on spoken-word recognition among four groups of listeners: native listeners with normal hearing or hearing impairment (moderate sensorineural hearing loss) and non-native listeners with normal hearing or hearing impairment. The ability of listeners to accommodate trial-to-trial variations in talkers' voices was assessed by comparing recognition scores for a single-talker condition to those obtained in a multiple-talker condition. Lexical difficulty was assessed by comparing word-recognition performance between lexically ‘easy’ and ‘hard’ words as determined by frequency of occurrence in the language and the structural characteristics of similarity neighborhoods formalized in the Neighborhood Activation Model. An up-down adaptive procedure was used to determine the sound pressure level for 50% performance. Non-native listeners in both normal-hearing and hearing-impaired groups required greater intensity for equal intelligibility than the native normal-hearing and hearing-impaired listeners. Results, however, showed significant effects of talker variability and lexical difficulty for all four groups. Structural equation modeling demonstrated that an audibility factor accounts for 2–3 times more variance in performance than does a linguistic-familiarity factor; however, the linguistic-familiarity factor is also essential to the model fit. The results demonstrated effects of talker variability and lexical difficulty on word recognition for both native and non-native listeners with normal or impaired hearing and indicate that linguistic and indexical factors should be considered in the development of speech-recognition tests.
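A simple one-up, one-down staircase of the kind described (which converges on the level for 50% performance) can be sketched as follows. The step size, reversal count, and averaging rule are illustrative assumptions, not the specific parameters of this study's adaptive procedure.

```python
def simple_up_down(respond, start_db, step_db, reversals=8):
    """One-up, one-down adaptive track converging on the 50% point.
    `respond(level)` returns True for a correct response; the level
    drops after a correct response and rises after an incorrect one.
    Threshold is taken as the mean of the last six reversal levels."""
    level, direction, rev_levels = start_db, 0, []
    while len(rev_levels) < reversals:
        new_dir = -1 if respond(level) else +1  # correct -> lower level
        if direction and new_dir != direction:  # direction change = reversal
            rev_levels.append(level)
        direction = new_dir
        level += new_dir * step_db
    last = rev_levels[-6:]
    return sum(last) / len(last)
```

For instance, against a deterministic listener who is correct at and above 50 dB, a track starting at 60 dB with 2-dB steps oscillates between 48 and 50 dB and converges near 49 dB.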


2015 ◽  
Vol 24 (1) ◽  
pp. 53-65 ◽  
Author(s):  
Lu-Feng Shi

Purpose Shi (2011, 2013) obtained sensitivity/specificity measures of bilingual listeners' English and relative proficiency ratings as the predictor of English word recognition in quiet. The current study investigated how relative proficiency predicted word recognition in noise. Method Forty-two monolingual and 168 bilingual normal-hearing listeners were included. Bilingual listeners rated their proficiency in listening, speaking, and reading in English and in the other language using an 11-point scale. Listeners were presented with 50 English monosyllabic words in quiet at 45 dB HL and in multitalker babble with a signal-to-noise ratio of +6 and 0 dB. Results Data in quiet confirmed Shi's (2013) finding that relative proficiency with or without dominance predicted well whether bilinguals performed on par with the monolingual norm. Predicting the outcome was difficult for the 2 noise conditions. To identify bilinguals whose performance fell below the normative range, dominance per se or a combination of dominance and average relative proficiency rating yielded the best sensitivity/specificity and summary measures, including Youden's index. Conclusion Bilinguals' word recognition is more difficult to predict in noise than in quiet; however, proficiency and dominance variables can predict reasonably well whether bilinguals may perform at a monolingual normative level.


2003 ◽  
Vol 14 (09) ◽  
pp. 453-470 ◽  
Author(s):  
Richard H. Wilson

A simple word-recognition task in multitalker babble for clinic use was developed in the course of four experiments involving listeners with normal hearing and listeners with hearing loss. In Experiments 1 and 2, psychometric functions for the individual NU No. 6 words from Lists 2, 3, and 4 were obtained with each word in a unique segment of multitalker babble. The test paradigm that emerged involved ten words at each of seven signal-to-babble ratios (S/B) from 0 to 24 dB. Experiment 3 examined the effect that babble presentation level (70, 80, and 90 dB SPL) had on recognition performance in babble, whereas Experiment 4 studied the effect that monaural and binaural listening had on recognition performance. For listeners with normal hearing, the 90th percentile was 6 dB S/B. In comparison to the listeners with normal hearing, the 50% correct points on the functions for listeners with hearing loss were at 5 to 15 dB higher signal-to-babble ratios.

