Feasibility of Automating Fidelity Monitoring in a Dementia Care Intervention

2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 493-493
Author(s):  
Nancy Hodgson ◽  
Ani Nencova ◽  
Laura Gitlin ◽  
Emily Summerhayes

Abstract Careful fidelity monitoring is critical to implementing evidence-based interventions in dementia care settings, ensuring that the intervention is delivered consistently and as intended. Most approaches to fidelity monitoring rely on human coding of the content covered during a session or of stylistic aspects of the intervention, including rapport, empathy, and enthusiasm, and are unrealistic to implement at scale in real-world settings. Technological advances in automatic speech recognition and in language and speech processing offer potential solutions to overcome these barriers. We compare three commercial automatic speech recognition tools on spoken content drawn from dementia care interactions to determine recognition accuracy and the privacy guarantees offered by each provider. Data were obtained from recorded sessions of the Dementia Behavior Study intervention trial (NCT01892579). We find that, despite their impressive performance in general applications, automatic speech recognition systems work less well for older adults and people of color. We outline a plan for automating fidelity monitoring of interaction style and content, to be integrated into an online program for training dementia care providers.
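The provider comparison described above hinges on scoring each system's transcripts against human reference transcripts, typically as word error rate (WER). The sketch below is not from the study; the provider names and transcripts are illustrative, and it only shows how such a comparison could be scored.

```python
# Minimal sketch: comparing ASR providers by word error rate (WER) against a
# human reference transcript. Provider names and transcripts are illustrative.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

reference = "please tell me how you slept last night"
hypotheses = {
    "provider_a": "please tell me how you slept last night",
    "provider_b": "please tell me how you kept last night",
    "provider_c": "please tell me how slept last night",
}
for provider, hyp in hypotheses.items():
    print(f"{provider}: WER = {word_error_rate(reference, hyp):.2f}")
```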

2020 ◽  
Author(s):  
Ryo Masumura ◽  
Naoki Makishima ◽  
Mana Ihori ◽  
Akihiko Takashima ◽  
Tomohiro Tanaka ◽  
...  

2017 ◽  
Vol 60 (9) ◽  
pp. 2394-2405 ◽  
Author(s):  
Lionel Fontan ◽  
Isabelle Ferrané ◽  
Jérôme Farinas ◽  
Julien Pinquier ◽  
Julien Tardieu ◽  
...  

Purpose The purpose of this article is to assess speech processing for listeners with simulated age-related hearing loss (ARHL) and to investigate whether the observed performance can be replicated using an automatic speech recognition (ASR) system. The long-term goal of this research is to develop a system that will assist audiologists/hearing-aid dispensers in the fine-tuning of hearing aids. Method Sixty young participants with normal hearing listened to speech materials mimicking the perceptual consequences of ARHL at different levels of severity. Two intelligibility tests (repetition of words and sentences) and one comprehension test (responding to oral commands by moving virtual objects) were administered. Several language models were developed and used by the ASR system to fit human performance. Results Strong, significant positive correlations were observed between human and ASR scores, with coefficients up to .99. However, the spectral smearing used to simulate losses in frequency selectivity caused larger declines in ASR performance than in human performance. Conclusion Both intelligibility and comprehension scores for listeners with simulated ARHL are highly correlated with the performance of an ASR-based system. It remains to be determined whether the ASR system is similarly successful in predicting speech processing in noise and by older people with ARHL.
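The headline result above is a strong correlation between human scores and ASR scores across severity levels. As a hedged illustration only (the scores below are invented, not the study's data), such a comparison reduces to a Pearson correlation:

```python
# Minimal sketch of the human-vs-ASR comparison described above: Pearson
# correlation between human intelligibility scores and ASR scores across
# simulated ARHL severity levels. All numbers are hypothetical.

from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical mean scores (0-100) at increasing severity of simulated ARHL.
human_scores = [96, 88, 74, 55, 31]
asr_scores   = [94, 83, 62, 40, 18]
print(f"r = {pearson_r(human_scores, asr_scores):.2f}")
```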


2019 ◽  
Vol 15 (7) ◽  
pp. P162-P163
Author(s):  
Francesca K. Cormack ◽  
Nick Taptiklis ◽  
Jennifer H. Barnett ◽  
Merina Su

Author(s):  
Tim Arnold ◽  
Helen J. A. Fuller

Automatic speech recognition (ASR) systems and speech interfaces are becoming increasingly prevalent, including expanding use of these technologies to support work in health care. Computer-based speech processing has been studied and developed extensively over decades, and speech processing tools have been fine-tuned through the work of speech and language researchers. Researchers have described, and continue to describe, speech processing errors in medicine. The discussion in this paper proposes an ergonomic framework for speech recognition that expands and further describes this view of speech processing in support of clinical work. With this end in mind, we hope to build on previous work, emphasize the need for increased human factors involvement in this area, and facilitate the discussion of speech recognition in contexts that have been explored in the human factors domain. Human factors expertise can contribute by proactively describing and designing these critical, interconnected socio-technical systems with error tolerance in mind.


2019 ◽  
Vol 15 ◽  
pp. P897-P897
Author(s):  
Francesca K. Cormack ◽  
Nick Taptiklis ◽  
Jennifer H. Barnett ◽  
Merina Su

2018 ◽  
Vol 6 (1) ◽  
pp. 1-14 ◽  
Author(s):  
Vered Silber-Varod

Currently, via the mediation of audio mining technology and conversational user interfaces, and after years of constant improvement in Automatic Speech Recognition technology, conversation intelligence is an emerging concept significant to the understanding of human-human communication in its most natural and primitive channel: our voice. This paper introduces the concept of Conversation Intelligence (CI), which is becoming crucial to the study of human-human speech interaction and communication management and is part of the field of speech analytics. CI is demonstrated on two established discourse terms: power relations and convergence. Finally, this paper highlights the importance of visualization for large-scale speech analytics.
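The abstract does not describe an implementation, but one simple speech-analytics measure in the spirit of the power-relations analysis is each speaker's share of total speaking time, computed from diarized segments. The sketch below uses hypothetical speaker labels and timestamps purely for illustration.

```python
# Minimal sketch of a conversation-analytics measure related to power
# relations: each speaker's share of total speaking time, computed from
# hypothetical diarized (speaker, start, end) segments in seconds.

from collections import defaultdict

segments = [
    ("speaker_a", 0.0, 4.2),
    ("speaker_b", 4.5, 12.1),
    ("speaker_a", 12.3, 13.0),
    ("speaker_b", 13.4, 25.8),
]

talk_time = defaultdict(float)
for speaker, start, end in segments:
    talk_time[speaker] += end - start

total = sum(talk_time.values())
for speaker, seconds in sorted(talk_time.items(), key=lambda kv: -kv[1]):
    print(f"{speaker}: {seconds:.1f}s ({100 * seconds / total:.0f}% of speaking time)")
```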

