Generating synthetic dysarthric speech to overcome dysarthria acoustic data scarcity

Author(s):  
Andrew Hu ◽  
Dhruv Phadnis ◽  
Seyed Reza Shahamiri
Sensor Review ◽  
2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Dhanalakshmi M. ◽  
Nagarajan T. ◽  
Vijayalakshmi P.

Purpose: Dysarthria is a neuromotor speech disorder caused by neuromuscular disturbances that affect one or more articulators, resulting in unintelligible speech. Although inter-phoneme articulatory variations are well captured by formant frequency-based acoustic features, these variations are expected to be much higher for dysarthric speakers than for normal speakers. Such substantial variations can be captured by placing sensors at appropriate articulatory positions. This study aims to determine a set of articulatory sensors and parameters for assessing articulatory dysfunctions in dysarthric speech.
Design/methodology/approach: The work determines the significant sensors and their associated parameters using motion path and correlation analyses on the TORGO database of dysarthric speech. Among eight informative sensor channels and six parameters per channel in the positional data, the tongue middle, back and tip sensors, the lower and upper lip sensors, and the parameters (y, z, φ) are found to contribute significantly toward capturing articulatory information. Acoustic and positional data analyses are performed to validate the identified significant sensors. Furthermore, a convolutional neural network-based classifier is developed for both phone- and word-level classification of dysarthric speech using acoustic and positional data.
Findings: The average phone error rate is observed to be lower (up to 15.54%) for positional data than for acoustic-only data. Further, word-level classification using a combination of acoustic and positional information is performed to examine whether positional data acquired using the significant sensors boost classification performance, even for severe dysarthric speakers.
Originality/value: The proposed work shows that the significant sensors and parameters can be used to assess dysfunctions in dysarthric speech effectively. The articulatory sensor data support better assessment than the acoustic data, even for severe dysarthric speakers.


1990 ◽  
Vol 51 (C2) ◽  
pp. C2-939-C2-942 ◽  
Author(s):  
N. DINER ◽  
A. WEILL ◽  
J. Y. COAIL ◽  
J. M. COUDEVILLE

2001 ◽  
Author(s):  
Michael Shinego ◽  
Geoff Edelson ◽  
Francine Menas ◽  
Michael Richman ◽  
Robert Nation

2020 ◽  
Author(s):  
Yuki Takashima ◽  
Ryoichi Takashima ◽  
Tetsuya Takiguchi ◽  
Yasuo Ariki

2020 ◽  
Vol 6 (2) ◽  
pp. 151-183
Author(s):  
Diana B. Archangeli ◽  
Jonathan Yip

Abstract: Based on impressionistic and acoustic data, Assamese is described as having a phonological tongue root harmony system, with blocking by certain phonological configurations and over-application in certain morphological contexts. This study explores physical properties of the patterns using ultrasonic imaging to determine whether the impressionistic descriptions match what speakers actually do. Principal components analysis (PCA) determines that most participants produce a contrast in tongue root position in the appropriate contexts, though there is less of an impact on tongue root with greater distance from the triggering vowel. Analysis uses the root mean squared distance (RMSD) calculation to determine whether both blocking and over-application take effect. The blocking results conform to the impressionistic descriptions. With over-application, [e] and [o] are expected; while some speakers clearly produce these vowels, others articulate a vowel that is indeterminate between the expected [e]/[o] and an unexpected [ɛ]/[ɔ]. No speaker consistently showed the expected tongue root position in all contexts, and some speakers appeared to have lost the contrast entirely, yet all are considered to be speakers of the same dialect of Assamese. Whether this (apparent) loss is a consequence of crude research methodologies or accurately reflects what is happening within the language community remains an open question.
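
The RMSD comparison named in the abstract can be illustrated with a minimal sketch. It assumes two tongue contours sampled at the same number of points and compared point by point; the function name `rmsd` and the sample coordinates are hypothetical and are not taken from the study, whose contour extraction and alignment steps are not described in the abstract.

```python
import math


def rmsd(contour_a, contour_b):
    """Root mean squared distance between two equally sampled contours.

    Each contour is a list of (x, y) points. Point i of one contour is
    compared with point i of the other (an assumption of this sketch).
    """
    if len(contour_a) != len(contour_b) or not contour_a:
        raise ValueError("contours must be non-empty and of equal length")
    # Squared Euclidean distance for each pair of corresponding points.
    squared = [
        (xa - xb) ** 2 + (ya - yb) ** 2
        for (xa, ya), (xb, yb) in zip(contour_a, contour_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical contours in arbitrary units, for illustration only.
advanced = [(0.0, 1.0), (1.0, 2.0), (2.0, 1.5)]
retracted = [(0.0, 1.0), (1.0, 2.5), (2.0, 2.0)]
distance = rmsd(advanced, retracted)
```

A larger value indicates that the two contours lie further apart on average, while identical contours give zero.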


Author(s):  
Shansong Liu ◽  
Shoukang Hu ◽  
Xurong Xie ◽  
Mengzhe Geng ◽  
Mingyu Cui ◽  
...  
