Effect of Speech Recognition on Problem Solving and Recall in Consumer Digital Health Tasks: Controlled Laboratory Experiment (Preprint)

2019 ◽  
Author(s):  
Jessica Chen ◽  
David Lyell ◽  
Liliana Laranjo ◽  
Farah Magrabi

BACKGROUND Recent advances in natural language processing and artificial intelligence have led to widespread adoption of speech recognition technologies. In consumer health applications, speech recognition is usually applied to support interactions with conversational agents for data collection, decision support, and patient monitoring. However, little is known about the use of speech recognition in consumer health applications, and few studies have evaluated the efficacy of conversational agents in the hands of consumers. In other consumer-facing tools, cognitive load has been observed to be an important factor affecting the use of speech recognition technologies in tasks involving problem solving and recall: users find it more difficult to think and speak at the same time than to type, point, and click. However, the effects of speech recognition on cognitive load when performing health tasks have not yet been explored. OBJECTIVE The aim of this study was to evaluate the use of speech recognition for documentation in consumer digital health tasks involving problem solving and recall. METHODS Fifty university staff and students were recruited to undertake four documentation tasks with a simulated conversational agent in a computer laboratory. The tasks varied in complexity, determined by the amount of problem solving and recall required (simple vs complex), and in input modality (speech recognition vs keyboard and mouse). Cognitive load, task completion time, error rate, and usability were measured. RESULTS Compared with a keyboard and mouse, speech recognition significantly increased cognitive load for complex tasks (Z=–4.08, P<.001) and simple tasks (Z=–2.24, P=.03). Complex tasks took significantly longer to complete with speech recognition (Z=–2.52, P=.01), and speech recognition was found to be overall less usable than a keyboard and mouse (Z=–3.30, P=.001). However, there was no effect on errors.
CONCLUSIONS Use of a keyboard and mouse was preferable to speech recognition for complex tasks involving problem solving and recall. Further studies using a broader variety of consumer digital health tasks of varying complexity are needed to investigate the contexts in which use of speech recognition is most appropriate. The effects of cognitive load on task performance and its significance also need to be investigated.
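Paired comparisons reported with Z statistics of this form are typically Wilcoxon signed-rank tests; the sketch below illustrates that kind of test, not the study's actual procedure, and all scores are invented for illustration (with n = 8 pairs that all favor one modality, the normal-approximation Z comes out near –2.52 by construction).

```python
def wilcoxon_signed_rank(x, y):
    """Paired Wilcoxon signed-rank test with a normal approximation for Z."""
    # Paired differences; zero differences are dropped by convention.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    # Rank the absolute differences, averaging ranks over tied magnitudes.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average rank for this tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Sum ranks of positive and negative differences; W is the smaller sum.
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    w = min(w_plus, w_minus)
    # Normal approximation for the standardized statistic.
    mu = n * (n + 1) / 4
    sigma = (n * (n + 1) * (2 * n + 1) / 24) ** 0.5
    return w, (w - mu) / sigma

# Hypothetical per-participant cognitive-load ratings for the same task
# under each input modality (paired design); not the study's data.
kbm = [3.1, 2.8, 4.0, 3.5, 2.9, 3.3, 3.8, 3.0]
sr = [4.2, 3.9, 4.8, 4.1, 3.7, 4.5, 4.6, 3.9]
w, z = wilcoxon_signed_rank(kbm, sr)
print(f"W={w}, Z={z:.2f}")
```

Because every participant in this toy sample rates speech recognition as more demanding, all signed ranks fall on one side and W is 0; the Z value then depends only on the number of pairs.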

10.2196/14827 ◽  
2020 ◽  
Vol 22 (6) ◽  
pp. e14827


2021 ◽  
Vol 30 (01) ◽  
pp. 191-199
Author(s):  
Tilman Dingler ◽  
Dominika Kwasnicka ◽  
Jing Wei ◽  
Enying Gong ◽  
Brian Oldenburg

Summary Objectives: To describe the use and promise of conversational agents in digital health, including health promotion and prevention, and how they can be combined with other new technologies to provide healthcare at home. Method: A narrative review of recent advances in the technologies underpinning conversational agents and of their use and potential for healthcare and for improving health outcomes. Results: By responding to written and spoken language, conversational agents present a versatile, natural user interface and have the potential to make their services and applications more widely accessible. Historically, conversational interfaces for health applications have focused mainly on mental health, but with an increase in affordable devices and the modernization of health services, conversational agents are becoming more widely deployed across the health system. We present our work on context-aware voice assistants capable of proactively engaging users and delivering health information and services. The proactive voice agents we deploy allow us to conduct experience sampling in people's homes and to collect information about the contexts in which users interact with them. Conclusion: In this article, we describe the state of the art of these and other enabling technologies for speech and conversation and discuss ongoing research efforts to develop conversational agents that "live" with patients and customize their service offerings around their needs. These agents can function as "digital companions" that send reminders about medications and appointments, proactively check in to gather self-assessments, and follow up with patients on their treatment plans. Together with unobtrusive and continuous collection of other health data, conversational agents can provide novel and deeply personalized access to digital healthcare, and they will continue to become an increasingly important part of the ecosystem for future healthcare delivery.


2018 ◽  
Vol 09 (02) ◽  
pp. 326-335 ◽  
Author(s):  
Tobias Hodgson ◽  
Farah Magrabi ◽  
Enrico Coiera

Objective To conduct a replication study to validate previously identified significant risks and inefficiencies associated with the use of speech recognition (SR) for documentation within an electronic health record (EHR) system. Methods Thirty-five emergency department clinicians undertook randomly allocated clinical documentation tasks using keyboard and mouse (KBM) or SR in a commercial EHR system. The experiment design, setting, and tasks (E2) replicated an earlier study (E1), while technical integration issues that may have led to poorer SR performance were addressed. Results Complex tasks were significantly slower to complete using SR (by 16.94%) than KBM (KBM: 191.9 s, SR: 224.4 s; p = 0.009; CI, 11.9–48.3), replicating the task completion times observed in the earlier experiment. Non-typographical errors were significantly higher with SR than with KBM for both simple (KBM: 3, SR: 84; p < 0.001; CI, 1.5–2.5) and complex tasks (KBM: 23, SR: 53; p = 0.001; CI, 0.5–1.0), again replicating earlier results (E1: 170, E2: 163; p = 0.660; CI, 0.0–0.0). Typographical errors were significantly reduced in the new study (E1: 465, E2: 150; p < 0.001; CI, 2.0–3.0). Discussion The results of this study replicate those reported earlier. The use of SR for clinical documentation within an EHR system appears to be consistently associated with decreased time efficiency and increased errors. Modifications implemented to optimize SR integration in the EHR appear to have resulted in only minor improvements that did not fundamentally change the overall results. Conclusion This replication study adds further evidence for the poor performance of SR-assisted clinical documentation within an EHR. Replication studies remain rare in the informatics literature, especially where study results are unexpected or have significant implications; such studies are clearly needed to avoid overdependence on the results of a single study.


2009 ◽  
Vol 23 (2) ◽  
pp. 129-138 ◽  
Author(s):  
Florian Schmidt-Weigand ◽  
Martin Hänze ◽  
Rita Wodzinski

How can worked examples be enhanced to promote complex problem solving? N = 92 students in the 8th grade worked in pairs on a physics problem. Problem solving was supported by (a) a worked example given as a whole, (b) a worked example presented incrementally (i.e., only one solution step at a time), or (c) a worked example presented incrementally and accompanied by strategic prompts. In groups (b) and (c), students self-regulated when to attend to the next solution step. In group (c), each solution step was preceded by a prompt that suggested strategic learning behavior (e.g., note taking, sketching, communicating with the learning partner). Prompts and solution steps were given on separate sheets. The study revealed that incremental presentation led to a better learning experience (higher feeling of competence, lower cognitive load) than a conventional presentation of the worked example. However, only when additional strategic learning behavior was prompted did students remember the solution more accurately and reproduce more solution steps.


2020 ◽  
Author(s):  
André De Faria Pereira Neto ◽  
Leticia Barbosa ◽  
Rodolfo Paolucci

UNSTRUCTURED Billions of people in the world own a smartphone, a low-cost, portable computing device with countless features, among which applications stand out: programs or software developed to meet a specific goal. A wide range of applications is currently available, ranging from entertainment and personal organization to work and education. It is a vast and profitable market. Health applications have been a means of intervention in different areas, including chronic diseases, epidemics, and health emergencies. A recently published paper in the journal with the highest impact factor in Digital Health (the "Journal of Medical Internet Research") proposes a classification of health applications. This study performs a critical analysis of that classification and presents other sorting criteria. This paper also presents and analyzes the "Meu Info Saúde" ("My Health Info") app, a pioneering government initiative focused on primary care launched by the Oswaldo Cruz Foundation. The application classification proposal presented here builds on the intervention strategies in the health-disease process, namely "Health Promotion", "Disease Prevention", and "Care, Treatment and Rehabilitation", as defined in official documents of the World Health Organization and the Centers for Disease Control and Prevention. Most applications in the sample are of private and foreign origin and free to download, but display ads or sell products and services. The sampled applications were classified as "Health Promotion", and some were also categorized as "Disease Prevention" or "Care, Treatment or Rehabilitation" because they have multiple functionalities. The applications identified as "Health Promotion" focused only on individuals' lifestyle and on increasing their autonomy and capacity for self-care management.
From this perspective, the apps analyzed in this paper differ from the “Meu Info-Saúde” application developed at Fiocruz.


2021 ◽  
Vol 11 (1) ◽  
pp. 428
Author(s):  
Donghoon Oh ◽  
Jeong-Sik Park ◽  
Ji-Hwan Kim ◽  
Gil-Jin Jang

Speech recognition consists of converting input sound into a sequence of phonemes and then finding text for the input using language models. Phoneme classification performance is therefore a critical factor in the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics remains a challenging problem even for state-of-the-art classification methods, and classification errors are hard to recover from in the subsequent language processing steps. This paper proposes a hierarchical phoneme clustering method that applies recognition models better suited to different phonemes. The phonemes of the TIMIT database are carefully analyzed using a confusion matrix from a baseline speech recognition model. Using the automatic phoneme clustering results, a set of phoneme classification models optimized for the generated phoneme groups is constructed and integrated into a hierarchical phoneme classification method. In a number of phoneme classification experiments, the proposed hierarchical phoneme group models improved performance over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% for the baseline and 71.7% for the proposed hierarchical models, a 2.2% overall improvement.
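The core idea of deriving phoneme groups from a confusion matrix can be sketched with simple agglomerative clustering: phonemes that are frequently confused are treated as close and merged first. The 4-phoneme confusion matrix below is a toy assumption for illustration, not TIMIT data, and the paper's actual clustering procedure may differ.

```python
# Toy setup: phonemes and a confusion matrix where confusion[i][j] counts
# how often phoneme i was recognized as phoneme j (invented numbers).
phonemes = ["s", "z", "p", "b"]
confusion = [
    [90,  8,  1,  1],
    [ 9, 88,  1,  2],
    [ 1,  1, 85, 13],
    [ 1,  2, 12, 85],
]

def distance(i, j):
    # More mutual confusion -> smaller distance between the two phonemes.
    return 1.0 / (1 + confusion[i][j] + confusion[j][i])

def cluster(n_groups):
    """Single-linkage agglomerative clustering down to n_groups clusters."""
    clusters = [[i] for i in range(len(phonemes))]
    while len(clusters) > n_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(distance(i, j)
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]

groups = cluster(2)
print([[phonemes[i] for i in g] for g in groups])  # [['s', 'z'], ['p', 'b']]
```

With these toy counts, the heavily confused pairs (s, z) and (p, b) merge first, yielding a fricative-like group and a stop-like group; a per-group classifier could then be trained for each, as in the hierarchical scheme the abstract describes.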


Author(s):  
Ana Guerberof Arenas ◽  
Joss Moorkens ◽  
Sharon O’Brien

Abstract This paper presents results of the effect of different translation modalities on users working with the Microsoft Word user interface. An experimental study was set up with 84 Japanese, German, Spanish, and English native speakers working with Microsoft Word in three modalities: the published translated version, a machine-translated (MT) version (with unedited MT strings incorporated into the MS Word interface), and the published English version. An eye tracker measured cognitive load, and usability was assessed according to the ISO/TR 16982 guidelines (effectiveness, efficiency, and satisfaction), followed by a retrospective think-aloud protocol. The results show that users' effectiveness (number of tasks completed) does not differ significantly across translation modalities. However, their efficiency (time for task completion) and self-reported satisfaction are significantly higher when working with the released product than with the unedited MT version, especially for less experienced participants. The eye-tracking results show that users experience a higher cognitive load when working with the MT and human-translated versions than with the English original. The results suggest that language and translation modality play a significant role in the usability of software products, whether or not users complete the given tasks and even if they are unaware that MT was used to translate the interface.


2021 ◽  
Vol 7 ◽  
pp. 233372142098568
Author(s):  
Annie T. Chen ◽  
Frances Chu ◽  
Andrew K. Teng ◽  
Soojeong Han ◽  
Shih-Yin Lin ◽  
...  

Background: There is a need for interventions to promote health management of older adults with pre-frailty and frailty. Technology poses promising solutions, but questions exist about effective delivery. Objectives: We present the results of a mixed-methods pilot evaluation of Virtual Online Communities for Older Adults (VOCALE), an 8-week intervention conducted in the northwestern United States, in which participants shared health-related experiences and applied problem solving skills in a Facebook group. Methods: We performed a mixed-methods process evaluation, integrating quantitative and qualitative data, to characterize the intervention and its effects. We focus on four areas: health-related measures (health literacy and self-efficacy), participation, problem solving skills enacted, and subjective feedback. Results: Eight older adults with pre-frailty and frailty (age = 82.7 ± 6.6 years) completed the study. There was an upward trend in health literacy and health self-efficacy post-intervention. Participants posted at least two times per week. Content analysis of 210 posts showed participants were able to apply the problem solving skills taught, and exit interviews showed participants’ increased awareness of the need to manage health, and enjoyment in learning about others. Conclusion: This mixed-methods evaluation provides insight into feasibility and design considerations for online interventions to promote health management among vulnerable older adults.


2014 ◽  
Vol 2 (3-4) ◽  
pp. 257-280 ◽  
Author(s):  
Gwendolyn Kolfschoten ◽  
Simon French ◽  
Frances Brazier
