The impact of speech recognition on speech synthesis

Author(s):  
M. Ostendorf ◽  
I. Bulyko
2021 ◽  
Vol 32 (08) ◽  
pp. 528-536
Author(s):  
Jessica H. Lewis ◽  
Irina Castellanos ◽  
Aaron C. Moberly

Abstract Background Recent models theorize that neurocognitive resources are deployed differently during speech recognition depending on task demands, such as the severity of degradation of the signal or modality (auditory vs. audiovisual [AV]). This concept is particularly relevant to the adult cochlear implant (CI) population, considering the large amount of variability among CI users in their spectro-temporal processing abilities. However, disentangling the effects of individual differences in spectro-temporal processing and neurocognitive skills on speech recognition in clinical populations of adult CI users is challenging. Thus, this study investigated the relationship between neurocognitive functions and recognition of spectrally degraded speech in a group of young adult normal-hearing (NH) listeners. Purpose The aim of this study was to manipulate the degree of spectral degradation and modality of speech presented to young adult NH listeners to determine whether deployment of neurocognitive skills would be affected. Research Design Correlational study design. Study Sample Twenty-one NH college students. Data Collection and Analysis Participants listened to sentences in three spectral-degradation conditions: no degradation (clear sentences); moderate degradation (8-channel noise-vocoded); and high degradation (4-channel noise-vocoded). Thirty sentences were presented in an auditory-only (A-only) modality and an AV modality. Visual assessments from the National Institutes of Health Toolbox Cognitive Battery were completed to evaluate working memory, inhibition-concentration, cognitive flexibility, and processing speed. Analyses of variance compared speech recognition performance across spectral-degradation conditions and modalities. Bivariate correlation analyses were performed between speech recognition performance and the neurocognitive skills in the various test conditions.
Results Main effects on sentence recognition were found for degree of degradation (p < 0.001) and modality (p < 0.001). Inhibition-concentration skills moderately correlated (r = 0.45, p = 0.02) with recognition scores for sentences that were moderately degraded in the A-only condition. No correlations were found between neurocognitive scores and AV speech recognition scores. Conclusions Inhibition-concentration skills are deployed differentially during sentence recognition, depending on the level of signal degradation. Additional studies will be required to study these relations in actual clinical populations such as adult CI users.


Author(s):  
Bruna S. Mussoi

Purpose Music training has been proposed as a possible tool for auditory training in older adults, as it may improve both auditory and cognitive skills. However, the evidence to support such benefits is mixed. The goal of this study was to determine the differential effects of lifelong musical training and working memory on speech recognition in noise, in older adults. Method A total of 31 musicians and nonmusicians aged 65–78 years took part in this cross-sectional study. Participants had a normal pure-tone average, with most having high-frequency hearing loss. Working memory (memory capacity) was assessed with the backward Digit Span test, and speech recognition in noise was assessed with three clinical tests (Quick Speech in Noise, Hearing in Noise Test, and Revised Speech Perception in Noise). Results Findings from this sample of older adults indicate that neither music training nor working memory was associated with differences on the speech recognition in noise measures used in this study. Similarly, duration of music training was not associated with speech-in-noise recognition. Conclusions Results from this study do not support the hypothesis that lifelong music training benefits speech recognition in noise. Similarly, an effect of working memory (memory capacity) was not apparent. While these findings may be related to the relatively small sample size, results across previous studies that investigated these effects have also been mixed. Prospective randomized music training studies may be able to better control for variability in outcomes associated with pre-existing and music training factors, as well as to examine the differential impact of music training and working memory for speech-in-noise recognition in older adults.


2018 ◽  
Vol 7 (2.28) ◽  
pp. 234 ◽  
Author(s):  
Karolina Kuligowska ◽  
Paweł Kisielewicz ◽  
Aleksandra Włodarz

Present-day speech synthesis systems can be used successfully for a wide range of purposes. However, there are serious limitations in using various synthesizers. Many of these problems can be identified and resolved. The aim of this paper is to present the current state of development of speech synthesis systems and to examine their drawbacks and limitations. The paper discusses the current classification, construction and functioning of speech synthesis systems, which gives an insight into the synthesizers implemented so far. The analysis of disadvantages and limitations of speech synthesis systems focuses on identifying the weak points of these systems, namely: the impact of emotions and prosody, spontaneous speech in terms of naturalness and intelligibility, preprocessing and text analysis, the problem of ambiguity, natural sounding, adaptation to the situation, variety of systems, sparsely spoken languages, speech synthesis for older people, and some other minor limitations. Solving these problems stimulates further development of the speech synthesis domain.


2014 ◽  
Vol 57 (3) ◽  
pp. 1108-1126 ◽  
Author(s):  
Ruth M. Reeder ◽  
Jill B. Firszt ◽  
Laura K. Holden ◽  
Michael J. Strube

Purpose The purpose of this study was to examine the rate of progress in the 2nd implanted ear as it relates to the 1st implanted ear and to bilateral performance in adult sequential cochlear implant recipients. In addition, this study aimed to identify factors that contribute to patient outcomes. Method The authors performed a prospective longitudinal study in 21 adults who received bilateral sequential cochlear implants. Testing occurred at 6 intervals: prebilateral through 12 months postbilateral implantation. Measures evaluated speech recognition in quiet and noise, localization, and perceived benefit. Results Second ear performance was similar to 1st ear performance by 6 months postbilateral implantation. Bilateral performance was generally superior to either ear alone; however, participants with shorter 2nd ear length of deafness (<20 years) had more rapid early improvement and better overall outcomes than those with longer 2nd ear length of deafness (>30 years). All participants reported bilateral benefit. Conclusions Adult cochlear implant recipients demonstrated benefit from 2nd ear implantation for speech recognition, localization, and perceived communication function. Because performance outcomes were related to length of deafness, shorter time between surgeries may be warranted to reduce negative length-of-deafness effects. Future study may clarify the impact of other variables, such as preimplant hearing aid use, particularly for individuals with longer periods of deafness.

