The impact of speech recognition on speech synthesis

Abstract Background Recent models theorize that neurocognitive resources are deployed differently during speech recognition depending on task demands, such as the severity of signal degradation or the modality (auditory vs. audiovisual [AV]). This concept is particularly relevant to the adult cochlear implant (CI) population, given the large variability among CI users in their spectro-temporal processing abilities. However, disentangling the effects of individual differences in spectro-temporal processing and neurocognitive skills on speech recognition in clinical populations of adult CI users is challenging. Thus, this study investigated the relationship between neurocognitive functions and recognition of spectrally degraded speech in a group of young adult normal-hearing (NH) listeners. Purpose The aim of this study was to manipulate the degree of spectral degradation and the modality of speech presented to young adult NH listeners to determine whether deployment of neurocognitive skills would be affected. Research Design Correlational study design. Study Sample Twenty-one NH college students. Data Collection and Analysis Participants listened to sentences in three spectral-degradation conditions: no degradation (clear sentences); moderate degradation (8-channel noise-vocoded); and high degradation (4-channel noise-vocoded). Thirty sentences were presented in an auditory-only (A-only) modality and in an AV modality. Visual assessments from the National Institutes of Health Toolbox Cognitive Battery were completed to evaluate working memory, inhibition-concentration, cognitive flexibility, and processing speed. Analyses of variance compared speech recognition performance across spectral degradation conditions and modalities. Bivariate correlation analyses were performed between speech recognition performance and the neurocognitive skills in the various test conditions. Results Main effects on sentence recognition were found for degree of degradation (p < 0.001) and modality (p < 0.001). Inhibition-concentration skills moderately correlated (r = 0.45, p = 0.02) with recognition scores for sentences that were moderately degraded in the A-only condition. No correlations were found between neurocognitive scores and AV speech recognition scores. Conclusions Inhibition-concentration skills are deployed differentially during sentence recognition, depending on the level of signal degradation. Additional studies will be required to study these relations in actual clinical populations such as adult CI users.

The Impact of Music Training and Working Memory on Speech Recognition in Older Age

Journal of Speech Language and Hearing Research ◽

10.1044/2021_jslhr-20-00426 ◽

2021 ◽

pp. 1-11

Author(s):

Bruna S. Mussoi

Keyword(s):

Older Adults ◽

Working Memory ◽

Speech Recognition ◽

Digit Span ◽

Memory Capacity ◽

Small Sample ◽

Music Training ◽

Speech In Noise ◽

Speech Recognition In Noise ◽

The Impact

Purpose Music training has been proposed as a possible tool for auditory training in older adults, as it may improve both auditory and cognitive skills. However, the evidence to support such benefits is mixed. The goal of this study was to determine the differential effects of lifelong musical training and working memory on speech recognition in noise, in older adults. Method A total of 31 musicians and nonmusicians aged 65–78 years took part in this cross-sectional study. Participants had a normal pure-tone average, with most having high-frequency hearing loss. Working memory (memory capacity) was assessed with the backward Digit Span test, and speech recognition in noise was assessed with three clinical tests (Quick Speech in Noise, Hearing in Noise Test, and Revised Speech Perception in Noise). Results Findings from this sample of older adults indicate that neither music training nor working memory was associated with differences on the speech recognition in noise measures used in this study. Similarly, duration of music training was not associated with speech-in-noise recognition. Conclusions Results from this study do not support the hypothesis that lifelong music training benefits speech recognition in noise. Similarly, an effect of working memory (memory capacity) was not apparent. While these findings may be related to the relatively small sample size, results across previous studies that investigated these effects have also been mixed. Prospective randomized music training studies may be able to better control for variability in outcomes associated with pre-existing and music training factors, as well as to examine the differential impact of music training and working memory for speech-in-noise recognition in older adults.

The impact of implementing speech recognition technology on the accuracy and efficiency (time to complete) clinical documentation by nurses: A systematic review

Journal of Clinical Nursing ◽

10.1111/jocn.15261 ◽

2020 ◽

Vol 29 (13-14) ◽

pp. 2125-2137

Author(s):

Joseph Joseph ◽

Zena E. H. Moore ◽

Declan Patton ◽

Tom O'Connor ◽

Linda Elizabeth Nugent

Keyword(s):

Systematic Review ◽

Speech Recognition ◽

Clinical Documentation ◽

Speech Recognition Technology ◽

The Impact

Mitigating the impact of speech recognition errors on chatbot using sequence-to-sequence model

2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) ◽

10.1109/asru.2017.8268977 ◽

2017 ◽

Cited By ~ 3

Author(s):

Pin-Jung Chen ◽

I-Hung Hsu ◽

Yi-Yao Huang ◽

Hung-Yi Lee

Keyword(s):

Speech Recognition ◽

Recognition Errors ◽

The Impact

Speech synthesis systems: disadvantages and limitations

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.28.12933 ◽

2018 ◽

Vol 7 (2.28) ◽

pp. 234 ◽

Cited By ~ 2

Author(s):

Karolina Kuligowska ◽

Paweł Kisielewicz ◽

Aleksandra Włodarz

Keyword(s):

Text Analysis ◽

Speech Synthesis ◽

Analysis Problem ◽

Current State ◽

Wide Range ◽

Current Classification ◽

Weak Points ◽

The Impact ◽

Further Development ◽

Insight Into

Present-day speech synthesis systems can be used successfully for a wide range of purposes. However, the various synthesizers have serious limitations, many of which can be identified and resolved. The aim of this paper is to present the current state of development of speech synthesis systems and to examine their drawbacks and limitations. The paper discusses the current classification, construction, and functioning of speech synthesis systems, which gives an insight into the synthesizers implemented so far. The analysis of disadvantages and limitations focuses on identifying the weak points of these systems, namely: the impact of emotions and prosody, spontaneous speech in terms of naturalness and intelligibility, preprocessing and text analysis, the problem of ambiguity, natural sounding, adaptation to the situation, variety of systems, sparsely spoken languages, speech synthesis for older people, and some other minor limitations. Solving these problems stimulates further development of the speech synthesis domain.

Evaluation of the impact in reducing the number of parameters for continuous speech recognition for Brazilian Portuguese

2012 ISSNIP Biosignals and Biorobotics Conference: Biosignals and Robotics for Better and Safer Living (BRC) ◽

10.1109/brc.2012.6222182 ◽

2012 ◽

Cited By ~ 1

Author(s):

Daniella Dias Cavalcante da Silva ◽

Cesar Rocha Vasconcelos ◽

Benedito Guimaraes Aguiar Neto ◽

Joseana Macedo Fechine

Keyword(s):

Speech Recognition ◽

Brazilian Portuguese ◽

Continuous Speech ◽

Continuous Speech Recognition ◽

The Impact

A Longitudinal Study in Adults With Sequential Bilateral Cochlear Implants: Time Course for Individual Ear and Bilateral Performance

Journal of Speech Language and Hearing Research ◽

10.1044/2014_jslhr-h-13-0087 ◽

2014 ◽

Vol 57 (3) ◽

pp. 1108-1126 ◽

Cited By ~ 19

Author(s):

Ruth M. Reeder ◽

Jill B. Firszt ◽

Laura K. Holden ◽

Michael J. Strube

Keyword(s):

Longitudinal Study ◽

Speech Recognition ◽

Cochlear Implant ◽

Cochlear Implants ◽

Time Course ◽

Performance Outcomes ◽

Prospective Longitudinal Study ◽

Early Improvement ◽

Prospective Longitudinal ◽

The Impact

Purpose The purpose of this study was to examine the rate of progress in the 2nd implanted ear as it relates to the 1st implanted ear and to bilateral performance in adult sequential cochlear implant recipients. In addition, this study aimed to identify factors that contribute to patient outcomes. Method The authors performed a prospective longitudinal study in 21 adults who received bilateral sequential cochlear implants. Testing occurred at 6 intervals: prebilateral through 12 months postbilateral implantation. Measures evaluated speech recognition in quiet and noise, localization, and perceived benefit. Results Second ear performance was similar to 1st ear performance by 6 months postbilateral implantation. Bilateral performance was generally superior to either ear alone; however, participants with shorter 2nd ear length of deafness (<20 years) had more rapid early improvement and better overall outcomes than those with longer 2nd ear length of deafness (>30 years). All participants reported bilateral benefit. Conclusions Adult cochlear implant recipients demonstrated benefit from 2nd ear implantation for speech recognition, localization, and perceived communication function. Because performance outcomes were related to length of deafness, shorter time between surgeries may be warranted to reduce negative length-of-deafness effects. Future study may clarify the impact of other variables, such as preimplant hearing aid use, particularly for individuals with longer periods of deafness.

The impact of vocabulary size and language model order on the Polish whispery speech recognition

2017 22nd International Conference on Methods and Models in Automation and Robotics (MMAR) ◽

10.1109/mmar.2017.8046899 ◽

2017 ◽

Author(s):

Piotr Kozierski ◽

Talar Sadalla ◽

Szymon Drgas ◽

Adam Dabrowski ◽

Joanna Zietkiewicz

Keyword(s):

Speech Recognition ◽

Language Model ◽

Vocabulary Size ◽

Model Order ◽

The Impact
