Evaluating Web-Based Automatic Transcription for Alzheimer’s Speech Data: Transcript Comparison and Machine Learning Analysis (Preprint)

2021 ◽  
Author(s):  
Thomas Soroski ◽  
Thiago da Cunha Vasco ◽  
Sally Newton-Mason ◽  
Saffrin Granby ◽  
Caitlin Lewis ◽  
...  

BACKGROUND: Speech data for medical research can be collected non-invasively and in large volumes. Speech analysis has shown promise in diagnosing neurodegenerative disease. To leverage speech data effectively, transcription is important because lexical content carries valuable information. Manual transcription, while highly accurate, limits the scalability and cost savings of language-based screening. OBJECTIVE: To better understand the use of automatic transcription for classification of neurodegenerative disease (Alzheimer’s Disease [AD], mild cognitive impairment [MCI] or subjective memory complaints [SMC] versus healthy controls), we compared automatically generated transcripts against transcripts that went through manual correction. METHODS: We recruited individuals from a memory clinic (“patients”) with a diagnosis of mild-moderate AD (n=44), MCI (n=20), or SMC (n=8), and healthy controls living in the community (n=77). Participants were asked to describe a standardized picture, read a paragraph, and recall a pleasant life experience. We compared transcripts generated using Google speech-to-text software to manually verified transcripts by examining transcription confidence scores, transcription error rates, and machine learning classification accuracy. For the classification tasks, Logistic Regression, Gaussian Naive Bayes, and Random Forests were used. RESULTS: The transcription software showed higher confidence scores (P<.001) and lower error rates (P>.05) for speech from healthy controls as compared with patients. Classification models using human-verified transcripts significantly (P<.001) outperformed models using automatically generated transcripts for both spontaneous speech tasks. This comparison showed no difference in the reading task. Manually adding pauses to transcripts had no impact on classification performance. Manually correcting the transcripts of both spontaneous speech tasks led to significantly higher performance in the machine learning models. CONCLUSIONS: We found that automatically transcribed speech data could be used to distinguish patients with a diagnosis of AD, MCI or SMC from controls. We recommend a human verification step to improve the performance of automatic transcripts, especially for spontaneous tasks. Moreover, human verification can focus on correcting errors and adding punctuation to transcripts. Manual addition of pauses, however, is not needed, which can simplify the human verification step and make it more efficient to process large volumes of speech data.
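
The abstract above compares automatic and manually verified transcripts by transcription error rate but does not name the metric. As a minimal sketch, assuming the common word error rate (WER) is meant, the comparison could look like the following. The function name `word_error_rate` and the lowercasing and whitespace tokenization are illustrative assumptions, not the authors' method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    `reference` stands in for a manually verified transcript and
    `hypothesis` for an automatic one (both hypothetical inputs).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the boy is taking a cookie", "the boy taking the cookie")` counts one deletion and one substitution against six reference words.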

2021 ◽  
Vol 3 ◽  
Author(s):  
Yue Guo ◽  
Changye Li ◽  
Carol Roan ◽  
Serguei Pakhomov ◽  
Trevor Cohen

Large amounts of labeled data are a prerequisite to training accurate and reliable machine learning models. However, in the medical domain in particular, this is also a stumbling block as accurately labeled data are hard to obtain. DementiaBank, a publicly available corpus of spontaneous speech samples from a picture description task widely used to study Alzheimer's disease (AD) patients' language characteristics and for training classification models to distinguish patients with AD from healthy controls, is relatively small—a limitation that is further exacerbated when restricting to the balanced subset used in the Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) challenge. We build on previous work showing that the performance of traditional machine learning models on DementiaBank can be improved by the addition of normative data from other sources, evaluating the utility of such extrinsic data to further improve the performance of state-of-the-art deep learning based methods on the ADReSS challenge dementia detection task. To this end, we developed a new corpus of professionally transcribed recordings from the Wisconsin Longitudinal Study (WLS), resulting in 1366 additional Cookie Theft Task transcripts, increasing the available training data by an order of magnitude. Using these data in conjunction with DementiaBank is challenging because the WLS metadata corresponding to these transcripts do not contain dementia diagnoses. However, cognitive status of WLS participants can be inferred from results of several cognitive tests including semantic verbal fluency available in WLS data. In this work, we evaluate the utility of using the WLS ‘controls’ (participants without indications of abnormal cognitive status), and these data in conjunction with inferred ‘cases’ (participants with such indications) for training deep learning models to discriminate between language produced by patients with dementia and healthy controls. 
We find that incorporating WLS data when training a BERT model on ADReSS data improves its performance on the ADReSS dementia detection task, supporting the hypothesis that WLS data add value in this context. We also demonstrate that weighted cost functions and additional prediction targets may be effective ways to address issues arising from class imbalance and confounding effects due to data provenance.
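
The abstract above mentions weighted cost functions as a way to address class imbalance but does not specify the weighting scheme. The sketch below assumes one common choice, inverse-frequency class weights applied to a binary cross-entropy loss. The function names and the normalization by the total weight are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each class a weight proportional to 1 / (its frequency)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_cross_entropy(labels, probs, weights, eps=1e-12):
    """Class-weighted binary cross-entropy.

    labels:  0/1 class labels (e.g. 0 = control, 1 = case; an assumption).
    probs:   predicted probability of class 1 for each sample.
    weights: mapping from class label to its weight.
    """
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        p_true = p if y == 1 else 1.0 - p
        total += -weights[y] * math.log(p_true)
        weight_sum += weights[y]
    # Weighted mean, so the loss scale does not depend on the weights.
    return total / weight_sum
```

With labels `[0, 0, 0, 1]`, the single minority-class sample receives weight 2.0 and each majority-class sample 2/3, so an error on the rare class contributes three times as much to the loss as an error on the common one.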


2020 ◽  
Author(s):  
Caroline Wanderley Espinola ◽  
Juliana Carneiro Gomes ◽  
Jessiane Mônica Silva Pereira ◽  
Wellington Pinheiro dos Santos

Purpose: Diagnosis and treatment in psychiatry are still highly dependent on reports from patients and on clinician judgement. This fact makes them prone to memory and subjectivity biases. As in other medical fields, where objective biomarkers are available, there has been increasing interest in the development of such tools in psychiatry. To this end, vocal acoustic parameters have recently been studied as possible objective biomarkers, as an alternative to otherwise invasive and costly methods. Patients suffering from different mental disorders, such as major depressive disorder (MDD), may present with alterations of speech. These can be described as uninteresting, monotonous, and spiritless speech and a low voice. Methods: Thirty-three individuals (11 males) over 18 years old were selected, 22 of whom had previously been diagnosed with MDD and 11 of whom were healthy controls. Their speech was recorded in naturalistic settings, during a routine medical evaluation for psychiatric patients, and in different environments for healthy controls. Voices from third parties were removed. The recordings were submitted to a vocal feature extraction algorithm and to different machine learning classification techniques. Results: Support vector machine (SVM) models provided the best classification performance across different kernels, with the PUK kernel providing an accuracy of 89.14% for the detection of MDD. Conclusion: The use of machine learning classifiers with vocal acoustic features has been shown to be very promising for the detection of major depressive disorder, but further tests with a larger sample will be necessary to validate our findings.


2007 ◽  
Vol 34 (3) ◽  
pp. 445-471 ◽  
Author(s):  
GISELA SZAGUN ◽  
BARBARA STUMPER ◽  
NINA SONDAG ◽  
MELANIE FRANIK

Abstract The acquisition of noun gender on articles was studied in a sample of 21 young German-speaking children. Longitudinal spontaneous speech data were used. Data analysis is based on 22 two-hourly speech samples per child from 6 children between 1;4 and 3;8 and on 5 two-hourly speech samples per child from 15 children between 1;4 and 2;10. The use of gender-marked articles occurred from 1;5. Error frequencies dropped below 10% by 3;0. Definite and indefinite articles were used with similar frequencies, and error rates did not differ in the two paradigms. Children's errors were systematic. For monosyllabic nouns and for polysyllabic nouns ending in -el, -en and -er, errors were more frequent for nouns which did not conform to the rule that such nouns tend to be masculine. Furthermore, children erred in the direction of the rule, overgeneralizing der. Correct gender marking was also associated with adult frequency of noun use. The present data are evidence for the early use of phonological regularities of noun structure in the acquisition of gender marking.


2008 ◽  
Vol 155 ◽  
pp. 23-52
Author(s):  
Elma Nap-Kolhoff ◽  
Peter Broeder

Abstract This study compares pronominal possessive constructions in Dutch first language (L1) acquisition, second language (L2) acquisition by young children, and untutored L2 acquisition by adults. The L2 learners all have Turkish as L1. In longitudinal spontaneous speech data for four L1 learners, seven child L2 learners, and two adult learners, remarkable differences and similarities between the three learner groups were found. In some respects, the child L2 learners develop in a way that is similar to child L1 learners, for instance in the kind of overgeneralisations that they make. However, the child L2 learners also behave like adult L2 learners; i.e., in the pace of the acquisition process, the frequency and persistence of non-target constructions, and the difficulty in acquiring reduced pronouns. The similarities between the child and adult L2 learners are remarkable, because the child L2 learners were only two years old when they started learning Dutch. L2 acquisition before the age of three is often considered to be similar to L1 acquisition. The findings might be attributable to the relatively small amount of Dutch language input the L2 children received.


2021 ◽  
Vol 15 ◽  
Author(s):  
Alhassan Alkuhlani ◽  
Walaa Gad ◽  
Mohamed Roushdy ◽  
Abdel-Badeeh M. Salem

Background: Glycosylation is one of the most common post-translational modifications (PTMs) in the cells of organisms. It plays important roles in several biological processes, including cell-cell interaction, protein folding, antigen recognition, and immune response. In addition, glycosylation is associated with many human diseases such as cancer, diabetes and coronavirus infections. The experimental techniques for identifying glycosylation sites are time-consuming and expensive and require extensive laboratory work. Therefore, computational intelligence techniques are becoming very important for glycosylation site prediction. Objective: This paper is a theoretical discussion of the technical aspects of biotechnological approaches (e.g., using artificial intelligence and machine learning) to digital bioinformatics research and intelligent biocomputing. Computational intelligence techniques have shown efficient results for predicting N-linked, O-linked and C-linked glycosylation sites. In the last two decades, many studies have used these techniques for glycosylation site prediction. In this paper, we analyze and compare a wide range of the intelligent techniques used in these studies from multiple aspects. The current challenges and difficulties that software developers and knowledge engineers face in predicting glycosylation sites are also included. Method: The studies are compared on criteria including databases, feature extraction and selection, machine learning classification methods, evaluation measures and performance results. Results and conclusions: Many challenges and problems are presented. Consequently, more effort is needed to obtain more accurate prediction models for the three basic types of glycosylation sites.


Author(s):  
Timnit Gebru

This chapter discusses the role of race and gender in artificial intelligence (AI). The rapid permeation of AI into society has not been accompanied by a thorough investigation of the sociopolitical issues that cause certain groups of people to be harmed rather than advantaged by it. For instance, recent studies have shown that commercial automated facial analysis systems have much higher error rates for dark-skinned women, while having minimal errors on light-skinned men. Moreover, a 2016 ProPublica investigation uncovered that machine learning–based tools that assess crime recidivism rates in the United States are biased against African Americans. Other studies show that natural language–processing tools trained on news articles exhibit societal biases. While many technical solutions have been proposed to alleviate bias in machine learning systems, a holistic and multifaceted approach must be taken. This includes standardization bodies determining what types of systems can be used in which scenarios, making sure that automated decision tools are created by people from diverse backgrounds, and understanding the historical and political factors that disadvantage certain groups who are subjected to these tools.


2020 ◽  
Vol 13 (5) ◽  
pp. 508-523 ◽  
Author(s):  
Guan‐Hua Huang ◽  
Chih‐Hsuan Lin ◽  
Yu‐Ren Cai ◽  
Tai‐Been Chen ◽  
Shih‐Yen Hsu ◽  
...  
