Diagnosing Voice Disorder with Machine Learning

Author(s):  
Minh Pham ◽  
Jing Lin ◽  
Yanjia Zhang
2022 ◽  
Vol 43 (2) ◽  
pp. 103327
Author(s):  
Jonathan Reid ◽  
Preet Parmar ◽  
Tyler Lund ◽  
Daniel K. Aalto ◽  
Caroline C. Jeffery

2017 ◽  
Vol 2017 ◽  
pp. 1-13 ◽  
Author(s):  
Tamer A. Mesallam ◽  
Mohamed Farahat ◽  
Khalid H. Malki ◽  
Mansour Alsulaiman ◽  
Zulfiqar Ali ◽  
...  

A voice disorder database is essential for research on automatic voice disorder detection and classification. Ethnicity affects the voice characteristics of a person, so it is necessary to develop a database by collecting voice samples from the targeted ethnic group. Understanding the characteristics of a local group improves the chances of arriving at a global solution for the accurate and reliable diagnosis of voice disorders. Motivated by this idea, an Arabic voice pathology database (AVPD) is designed and developed in this study by recording three vowels, running speech, and isolated words. The perceptual severity is also provided for each recorded sample, which is a unique aspect of the AVPD. During the development of the AVPD, the shortcomings of other voice disorder databases were identified so that they could be avoided. In addition, the AVPD is evaluated using six different types of speech features and four types of machine learning algorithms. The detection and classification results obtained with the sustained vowel and the running speech are also compared with those obtained on an English-language voice disorder database, the Massachusetts Eye and Ear Infirmary (MEEI) database.


2018 ◽  
Vol 2018 ◽  
pp. 1-19 ◽  
Author(s):  
Ugo Cesari ◽  
Giuseppe De Pietro ◽  
Elio Marciano ◽  
Ciro Niri ◽  
Giovanna Sannino ◽  
...  

Objectives. The current study presents a clinical evaluation of Vox4Health, an m-health system that estimates the possible presence of a voice disorder by calculating and analyzing the main measures required for acoustic analysis, namely, the Fundamental Frequency, jitter, shimmer, and Harmonic-to-Noise Ratio. Acoustic analysis is an objective, effective, and noninvasive tool used in clinical practice to perform a quantitative evaluation of voice quality. Materials and Methods. A clinical study was carried out in collaboration with medical staff of the University of Naples Federico II. A total of 208 volunteers were recruited (mean age, 44.2 ± 13.9 years): 58 healthy subjects (mean age, 36.7 ± 13.3 years) and 150 pathological ones (mean age, 47 ± 13.1 years). Vox4Health was evaluated in terms of classification performance, i.e., sensitivity, specificity, and accuracy, by using a rule-based algorithm that considers the most characteristic acoustic parameters to classify the voice as healthy or pathological. The performance was compared with that achieved by using Praat, one of the most commonly used tools in clinical practice. Results. Using the rule-based algorithm, the best accuracy in the detection of voice disorders, 72.6%, was obtained by using the jitter or shimmer value. The best sensitivity, about 96%, was always obtained by using jitter. The best specificity, 56.9%, was achieved by using the Fundamental Frequency. Additionally, in order to improve the classification accuracy of the next version of the Vox4Health app, preliminary tests were conducted with different machine learning techniques able to classify the voice as healthy or pathological. The best accuracy (77.4%) was obtained by the Logistic Model Tree algorithm, the best sensitivity (99.3%) by the Support Vector Machine, and the best specificity (36.2%) by Instance-based Learning. Conclusions. Considering the achieved accuracy, the medical experts judged the current version of Vox4Health to be a “good screening tool” for the detection of voice disorders. However, this accuracy improves when machine learning classifiers are used rather than the rule-based algorithm.
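The three measures reported in this abstract (sensitivity, specificity, and accuracy) all follow from the four counts of a binary confusion matrix. The sketch below is a minimal illustration, not the Vox4Health implementation. The cohort sizes (150 pathological, 58 healthy) come from the abstract; the function name, the class name, and the counts passed to `example` are hypothetical and chosen only to show the arithmetic.

```python
from typing import NamedTuple


class ScreeningMetrics(NamedTuple):
    sensitivity: float  # share of pathological voices flagged as pathological
    specificity: float  # share of healthy voices passed as healthy
    accuracy: float     # share of all voices classified correctly


def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> ScreeningMetrics:
    """Compute screening metrics from confusion-matrix counts.

    tp/fn count pathological subjects (detected / missed);
    tn/fp count healthy subjects (passed / wrongly flagged).
    """
    pathological = tp + fn
    healthy = tn + fp
    if pathological == 0 or healthy == 0:
        raise ValueError("need at least one subject in each class")
    return ScreeningMetrics(
        sensitivity=tp / pathological,
        specificity=tn / healthy,
        accuracy=(tp + tn) / (pathological + healthy),
    )


# Cohort sizes stated in the abstract.
N_PATHOLOGICAL = 150
N_HEALTHY = 58

# Hypothetical classifier outcome (illustrative counts, not from the study).
example = screening_metrics(tp=140, fn=10, tn=30, fp=28)

# Degenerate baseline: label every voice as pathological.
baseline = screening_metrics(tp=N_PATHOLOGICAL, fn=0, tn=0, fp=N_HEALTHY)

if __name__ == "__main__":
    for name, m in (("example", example), ("all-pathological", baseline)):
        print(f"{name}: sensitivity={m.sensitivity:.3f} "
              f"specificity={m.specificity:.3f} accuracy={m.accuracy:.3f}")
```

Because roughly 72% of this cohort is pathological, the all-pathological baseline already reaches about 72.1% accuracy (150 of 208) with zero specificity. This is one reason to read the reported accuracy figures alongside the specificity values rather than in isolation.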


IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 16246-16255 ◽  
Author(s):  
Laura Verde ◽  
Giuseppe De Pietro ◽  
Giovanna Sannino

2020 ◽  
Vol 43 ◽  
Author(s):  
Myrthe Faber

Gilead et al. state that abstraction supports mental travel, and that mental travel critically relies on abstraction. I propose an important addition to this theoretical framework, namely that mental travel might also support abstraction. Specifically, I argue that spontaneous mental travel (mind wandering), much like data augmentation in machine learning, provides variability in mental content and context necessary for abstraction.


2020 ◽  
Vol 63 (1) ◽  
pp. 109-124
Author(s):  
Carly Jo Hosbach-Cannon ◽  
Soren Y. Lowell ◽  
Raymond H. Colton ◽  
Richard T. Kelley ◽  
Xue Bao

Purpose. This study aimed to advance current knowledge of singer physiology by using ultrasonography in combination with acoustic measures to compare physiological differences between musical theater (MT) and opera (OP) singers under controlled phonation conditions. The primary objectives were (a) to determine whether differences in hyolaryngeal and vocal fold contact dynamics occur between two professional voice populations (MT and OP) during singing tasks and (b) to determine whether differences occur between MT and OP singers in oral configuration and associated acoustic resonance during singing tasks. Method. Twenty-one singers (10 MT and 11 OP) were included. All participants were currently enrolled in a music program. Experimental procedures consisted of sustained phonation on the vowels /i/ and /ɑ/ during both a low-pitch task and a high-pitch task. Measures of hyolaryngeal elevation, tongue height, and tongue advancement were assessed using ultrasonography. Vocal fold contact dynamics were measured using electroglottography. Simultaneous acoustic recordings were obtained during all ultrasonography procedures for analysis of the first two formant frequencies. Results. Significant oral configuration differences, reflected by measures of tongue height and tongue advancement, were seen between groups. Measures of acoustic resonance also showed significant differences between groups during specific tasks. Both singer groups significantly raised their hyoid position when singing high-pitched vowels, but hyoid elevation was not statistically different between groups. Likewise, vocal fold contact dynamics did not significantly differentiate the two singer groups. Conclusions. These findings suggest that, under controlled phonation conditions, MT singers alter their oral configuration and achieve differing resultant formants as compared with OP singers. Because singers are at a high risk of developing a voice disorder, understanding how these two groups of singers adjust their vocal tract configuration during their specific singing genre may help to identify risky vocal behavior and provide a basis for prevention of voice disorders.


2017 ◽  
Vol 2 (3) ◽  
pp. 49-56
Author(s):  
Jana Childes ◽  
Alissa Acker ◽  
Dana Collins

Children with voice disorders are typically a low-incidence population in the average caseload of clinicians working within school and general clinic settings. This occurs despite evidence of a fairly high prevalence of childhood voice disorders and the multiple impacts a voice disorder may have on a child's social development, the perception of the child by others, and the child's academic success. Multiple barriers affect the identification of children with abnormal vocal qualities and their access to services. These include the reliance on school personnel, the ability of parents and caretakers to identify abnormal vocal qualities and signs of misuse, access to specialized medical services for appropriate diagnosis and treatment planning, and issues related to speech-language pathologists' perception of their skills and competence regarding voice management for pediatric populations. These barriers and possible solutions to them are discussed with perspectives from the school, clinic, and university settings.


2020 ◽  
Author(s):  
Mohammed J. Zaki ◽  
Wagner Meira, Jr