Recognition of Synthetic Speech by Hearing-Impaired Elderly Listeners

1991
Vol 34 (5)
pp. 1180-1184
Author(s):
Larry E. Humes
Kathleen J. Nelson
David B. Pisoni

The Modified Rhyme Test (MRT), recorded using natural speech and two forms of synthetic speech, DECtalk and Votrax, was used to measure both open-set and closed-set speech-recognition performance. Performance of hearing-impaired elderly listeners was compared to two groups of young normal-hearing adults, one listening in quiet, and the other listening in a background of spectrally shaped noise designed to simulate the peripheral hearing loss of the elderly. Votrax synthetic speech yielded significant decrements in speech recognition compared to either natural or DECtalk synthetic speech for all three subject groups. There were no differences in performance between natural speech and DECtalk speech for the elderly hearing-impaired listeners or the young listeners with simulated hearing loss. The normal-hearing young adults listening in quiet out-performed both of the other groups, but there were no differences in performance between the young listeners with simulated hearing loss and the elderly hearing-impaired listeners. When the closed-set identification of synthetic speech was compared to its open-set recognition, the hearing-impaired elderly gained as much from the reduction in stimulus/response uncertainty as the two younger groups. Finally, among the elderly hearing-impaired listeners, speech-recognition performance was correlated negatively with hearing sensitivity, but scores were correlated positively among the different talker conditions. Those listeners with the greatest hearing loss had the most difficulty understanding speech and those having the most trouble understanding natural speech also had the greatest difficulty with synthetic speech.

1990
Vol 33 (4)
pp. 726-735
Author(s):
Larry E. Humes
Lisa Roberts

The role that sensorineural hearing loss plays in the speech-recognition difficulties of the hearing-impaired elderly is examined. One approach to this issue was to make between-group comparisons of performance for three groups of subjects: (a) young normal-hearing adults; (b) elderly hearing-impaired adults; and (c) young normal-hearing adults with simulated sensorineural hearing loss equivalent to that of the elderly subjects produced by a spectrally shaped masking noise. Another approach to this issue employed correlational analyses to examine the relation between audibility and speech recognition within the group of elderly hearing-impaired subjects. An additional approach was pursued in which an acoustical index incorporating adjustments for threshold elevation was used to examine the role audibility played in the speech-recognition performance of the hearing-impaired elderly. A wide range of listening conditions was sampled in this experiment. The conclusion was that the primary determiner of speech-recognition performance in the elderly hearing-impaired subjects was their threshold elevation.


1994
Vol 37 (2)
pp. 422-428
Author(s):
John H. Grose
Elizabeth A. Poth
Robert W. Peters

This study measured the masking level difference (MLD) for both 500-Hz tone detection and spondee word recognition in two groups of listeners. One group consisted of 9 elderly listeners with normal audiometric sensitivity bilaterally, up to at least 2000 Hz. The other group was a control group of 10 young listeners with normal hearing. The intent was to determine whether the elderly listeners exhibited a reduction in binaural performance that might contribute to the difficulties many such listeners have in understanding speech in noisy situations. By measuring MLDs in elderly listeners in the absence of marked peripheral hearing loss, it was hoped that any observed changes in MLD could be more strongly attributed to central effects. For both tone detection and speech recognition, it was found that the elderly performed more poorly than the young listeners, primarily on the NoSπ condition.


2019
Vol 62 (4)
pp. 1051-1067
Author(s):
Jonathan H. Venezia
Allison-Graham Martin
Gregory Hickok
Virginia M. Richards

Purpose: Age-related sensorineural hearing loss can dramatically affect speech recognition performance due to reduced audibility and suprathreshold distortion of spectrotemporal information. Normal aging produces changes within the central auditory system that impose further distortions. The goal of this study was to characterize the effects of aging and hearing loss on perceptual representations of speech. Method: We asked whether speech intelligibility is supported by different patterns of spectrotemporal modulations (STMs) in older listeners compared to young normal-hearing listeners. We recruited 3 groups of participants: 20 older hearing-impaired (OHI) listeners, 19 age-matched normal-hearing listeners, and 10 young normal-hearing (YNH) listeners. Listeners performed a speech recognition task in which randomly selected regions of the speech STM spectrum were revealed from trial to trial. The overall amount of STM information was varied using an up–down staircase to hold performance at 50% correct. Ordinal regression was used to estimate weights showing which regions of the STM spectrum were associated with good performance (a “classification image” or CImg). Results: The results indicated that (a) large-scale CImg patterns did not differ between the 3 groups; (b) weights in a small region of the CImg decreased systematically as hearing loss increased; (c) CImgs were also nonsystematically distorted in OHI listeners, and the magnitude of this distortion predicted speech recognition performance even after accounting for audibility; and (d) YNH listeners performed better overall than the older groups. Conclusion: We conclude that OHI/older normal-hearing listeners rely on the same speech STMs as YNH listeners but encode this information less efficiently. Supplemental Material: https://doi.org/10.23641/asha.7859981
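The up–down staircase mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state which rule was used; a 1-up/1-down rule is assumed here because it is the simple rule that converges on 50% correct. The function name `run_staircase`, the step size, and the level units (percentage of STM information revealed) are hypothetical and chosen only for illustration.

```python
from typing import List


def run_staircase(responses: List[bool], start: float, step: float,
                  floor: float = 0.0, ceiling: float = 100.0) -> List[float]:
    """Return the levels visited by a 1-up/1-down staircase (assumed rule).

    `level` is a hypothetical measure of how much STM information is
    revealed on a trial. A correct response makes the next trial harder
    (less information); an incorrect response makes it easier.
    """
    level = start
    track = [level]
    for correct in responses:
        # Step down after a correct response, up after an incorrect one.
        level = level - step if correct else level + step
        # Keep the level inside the allowed range.
        level = min(ceiling, max(floor, level))
        track.append(level)
    return track


if __name__ == "__main__":
    # Two correct responses followed by three errors.
    print(run_staircase([True, True, False, False, False], start=50, step=5))
```

Because each correct response lowers the level by the same amount that each error raises it, the track settles around the level at which correct and incorrect responses are equally likely.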


2021
Vol 2021 (1)
Author(s):
Clara Borrelli
Paolo Bestagini
Fabio Antonacci
Augusto Sarti
Stefano Tubaro

Several methods for synthetic audio speech generation have been developed in the literature through the years. With the great technological advances brought by deep learning, many novel synthetic speech techniques achieving incredibly realistic results have recently been proposed. As these methods generate convincing fake human voices, they can be used maliciously to harm today’s society (e.g., impersonating people, spreading fake news, shaping opinion). For this reason, the ability to detect whether a speech recording is synthetic or pristine is becoming an urgent necessity. In this work, we develop a synthetic speech detector. It takes an audio recording as input, extracts a series of hand-crafted features motivated by the speech-processing literature, and classifies them in either a closed-set or an open-set setting. The proposed detector is validated on a publicly available dataset consisting of 17 synthetic speech generation algorithms ranging from old-fashioned vocoders to modern deep learning solutions. Results show that the proposed method outperforms recently proposed detectors in the forensics literature.


2012
Vol 23 (08)
pp. 577-589
Author(s):
Mary Rudner
Thomas Lunner
Thomas Behrens
Elisabet Sundewall Thorén
Jerker Rönnberg

Background: Recently there has been interest in using subjective ratings as a measure of perceived effort during speech recognition in noise. Perceived effort may be an indicator of cognitive load. Thus, subjective effort ratings during speech recognition in noise may covary both with signal-to-noise ratio (SNR) and individual cognitive capacity. Purpose: The present study investigated the relation between subjective ratings of the effort involved in listening to speech in noise, speech recognition performance, and individual working memory (WM) capacity in hearing-impaired hearing aid users. Research Design: In two experiments, participants with hearing loss rated perceived effort during aided speech perception in noise. Noise type and SNR were manipulated in both experiments, and in the second experiment hearing aid compression release settings were also manipulated. Speech recognition performance was measured along with WM capacity. Study Sample: There were 46 participants in all with bilateral mild to moderate sloping hearing loss. In Experiment 1 there were 16 native Danish speakers (eight women and eight men) with a mean age of 63.5 yr (SD = 12.1) and average pure tone (PT) threshold of 47.6 dB (SD = 9.8). In Experiment 2 there were 30 native Swedish speakers (19 women and 11 men) with a mean age of 70 yr (SD = 7.8) and average PT threshold of 45.8 dB (SD = 6.6). Data Collection and Analysis: A visual analog scale (VAS) was used for effort rating in both experiments. In Experiment 1, effort was rated at individually adapted SNRs while in Experiment 2 it was rated at fixed SNRs. Speech recognition in noise performance was measured using adaptive procedures in both experiments with Dantale II sentences in Experiment 1 and Hagerman sentences in Experiment 2. WM capacity was measured using a letter-monitoring task in Experiment 1 and the reading span task in Experiment 2. Results: In both experiments, there was a strong and significant relation between rated effort and SNR that was independent of individual WM capacity, whereas the relation between rated effort and noise type seemed to be influenced by individual WM capacity. Experiment 2 showed that hearing aid compression setting influenced rated effort. Conclusions: Subjective ratings of the effort involved in speech recognition in noise reflect SNRs, and individual cognitive capacity seems to influence relative rating of noise type.


1979
Vol 88 (5)
pp. 676-683
Author(s):
Robert N. Butler
Barbara Gastel

Just as the ear trumpet once symbolized the elderly, so do contemporary approaches to hearing loss in the aged reflect many of the major themes in geriatrics and gerontology today. This paper begins by describing the National Institute on Aging (NIA) with particular emphasis on activities relevant to hearing in the elderly. Next, several areas of research interest, including the typology of presbycusis and related conditions, the epidemiology of auditory impairment in old age, the design of testing and research, and the rehabilitation of the hearing-impaired elderly, are addressed. The NIA and the National Institute of Neurological and Communicative Disorders and Stroke (NINCDS) are coordinating their efforts to stimulate investigation of these and related topics.


2003
Vol 12 (1)
pp. 41-51
Author(s):
Paula Henry
Todd Ricketts

Improving the signal-to-noise ratio (SNR) for individuals with hearing loss who are listening to speech in noise provides an obvious benefit. Although binaural hearing provides the greatest advantage over monaural hearing in noise, some individuals with symmetrical hearing loss choose to wear only one hearing aid. The present study tested the hypothesis that individuals with symmetrical hearing loss fit with one hearing aid would demonstrate improved speech recognition in background noise with increases in head turn. Fourteen individuals were fit monaurally with a Starkey Gemini in-the-ear (ITE) hearing aid with directional and omnidirectional microphone modes. Speech recognition performance in noise was tested using the audiovisual version of the Connected Speech Test (CST v.3). The test was administered in auditory-only conditions as well as with the addition of visual cues for each of three head angles: 0°, 20°, and 40°. Results indicated improvement in speech recognition performance with changes in head angle for the auditory-only presentation mode at the 20° and 40° head angles when compared to 0°. Improvement in speech recognition performance for the auditory + visual mode was noted for the 20° head angle when compared to 0°. Additionally, a decrement in speech recognition performance for the auditory + visual mode was noted for the 40° head angle when compared to 0°. These results support a speech recognition advantage for listeners fit with one ITE hearing aid listening in a close listener-to-speaker distance when they turn their head slightly in order to increase signal intensity.


1982
Vol 25 (1)
pp. 141-148
Author(s):
Judy R. Dubno
Donald D. Dirks
Laurn R. Langhofer

Syllable recognition ability and consonant confusion patterns were evaluated for 38 listeners with mild-to-moderate sensorineural hearing loss using the closed-set Nonsense-Syllable Test (NST). Performance for these materials varies as a function of consonant voicing, the position of the consonant in the syllable, and the accompanying vowel. Scores for listeners with steeply sloping audiometric configurations were consistently poorer than those for listeners with gradually sloping or flat audiograms. Consonant confusion analyses revealed place of articulation errors to be the most frequent, regardless of the listener's audiometric configuration. Analysis of consonant confusion patterns indicates the existence of a systematic relationship between consonant confusions and audiometric configuration. The NST findings are discussed in terms of the test's potential use and are compared to the results of existing confusion analyses.

