Acoustic-Phonetic Contrasts and Intelligibility in the Dysarthria Associated With Mixed Cerebral Palsy

1992 ◽  
Vol 35 (2) ◽  
pp. 296-308 ◽  
Author(s):  
Beth M. Ansel ◽  
Raymond D. Kent

This study evaluated the relationship between specific acoustic features of speech and perceptual judgments of word intelligibility of adults with cerebral palsy-dysarthria. Use of a contrasting word task allowed for intelligibility analysis and correlated acoustic analysis according to specified spectral and temporal features. Selected phonemic contrasts included syllable-initial voicing; syllable-final voicing; stop-nasal; fricative-affricate; front-back, high-low, and tense-lax vowels. Speech materials included a set of CVC stimulus words. Acoustic data are reported on vowel duration, formant frequency locations, voice onset times, amplitude rise times, and frication durations. Listeners’ perceptual assessment of intelligibility of the 16 dysarthric adults by transcription and rating tasks is also presented. All but one acoustic contrast was successfully made as evidenced by measured acoustic differences between contrast pairs. However, the generally successful acoustic contrasts stood in marked contrast to the poorly rated intelligibility scores and high error percentages that were ascribed to the opposite pair members. A second analysis examined the contribution of these acoustic features towards estimates and prediction of intelligibility deficits in speakers with dysarthria. The scaled intelligibility was predicted by multiple regression analysis with 62.6% accuracy by acoustic measures related to one consonant contrast (fricative-affricate) and three vowel contrasts (front-back, high-low, and tense-lax). Other measured contrasts, such as those related to contrast voicing effects and stop-nasal distinctions, did not seem to contribute in a significant way to variability in the intelligibility estimates. These findings are discussed in relation to specific areas of production deficiency that are consistent across different types of dysarthria with cerebral palsy as the etiology.

2021 ◽  
pp. 216770262110178
Author(s):  
Alex S. Cohen ◽  
Christopher R. Cox ◽  
Tovah Cowan ◽  
Michael D. Masucci ◽  
Thanh P. Le ◽  
...  

Negative schizotypal traits potentially can be digitally phenotyped using objective vocal analysis. Prior attempts have shown mixed success in this regard, potentially because acoustic analysis has relied on small, constrained feature sets. We employed machine learning to (a) optimize and cross-validate predictive models of self-reported negative schizotypy using a large acoustic feature set, (b) evaluate model performance as a function of sex and speaking task, (c) understand potential mechanisms underlying negative schizotypal traits by evaluating the key acoustic features within these models, and (d) examine model performance in its convergence with clinical symptoms and cognitive functioning. Accuracy was good (> 80%) and was improved by considering speaking task and sex. However, the features identified as most predictive of negative schizotypal traits were generally not considered critical to their conceptual definitions. Implications for validating and implementing digital phenotyping to understand and quantify negative schizotypy are discussed.


2001 ◽  
Vol 44 (6) ◽  
pp. 1215-1228 ◽  
Author(s):  
Kate Bunton ◽  
Gary Weismer

This study was designed to explore the relationship between perception of a high-low vowel contrast and its acoustic correlates in tokens produced by persons with motor speech disorders. An intelligibility test designed by Kent, Weismer, Kent, and Rosenbek (1989a) groups target and error words in minimal-pair contrasts. This format allows for construction of phonetic error profiles based on listener responses, thus allowing for a direct comparison of the acoustic characteristics of vowels perceived as the intended target with those heard as something other than the target. The high-low vowel contrast was found to be a consistent error across clinical groups and therefore was selected for acoustic analysis. The contrast was expected to have well-defined acoustic measures or correlates, derived from the literature, that directly relate to a listener's responses for that token. These measures include the difference between the second and first formant frequency (F2-F1), the difference between F1 and the fundamental frequency (F0), and vowel duration. Results showed that the acoustic characteristics of tongue-height errors were not clearly differentiated from the acoustic characteristics of targets. Rather, the acoustic characteristics of errors often looked like noisy (nonprototypical) versions of the targets. Results are discussed in terms of the test from which the errors were derived and within the framework of speech perception theory.
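
The three measures named in the abstract above can be illustrated with a minimal sketch. The function name, argument names, and example values are hypothetical, and the differences are taken in Hz, which is an assumption; the abstract does not state the scale that was used.

```python
def tongue_height_measures(f0_hz, f1_hz, f2_hz, onset_s, offset_s):
    """Return the three correlates of a high-low vowel contrast for one token.

    All names and units here are illustrative assumptions, not the
    authors' analysis procedure.
    """
    if offset_s <= onset_s:
        raise ValueError("vowel offset must be later than vowel onset")
    return {
        "f2_minus_f1_hz": f2_hz - f1_hz,   # spacing between F2 and F1
        "f1_minus_f0_hz": f1_hz - f0_hz,   # F1 relative to fundamental frequency
        "duration_ms": (offset_s - onset_s) * 1000.0,
    }


# Made-up values roughly typical of a high front vowel
measures = tongue_height_measures(f0_hz=120.0, f1_hz=300.0, f2_hz=2300.0,
                                  onset_s=0.10, offset_s=0.25)
print(measures)
```

Comparing these values for tokens heard as the target with tokens heard as an error is the kind of comparison the abstract describes.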


2014 ◽  
Vol 14 (3) ◽  
pp. 689-714
Author(s):  
Suzanne Franks ◽  
Rommel Barbosa

This article studies the acoustic characteristics of some oral vowels in tonic syllables of Brazilian Portuguese (BP) and which acoustic features are important for classifying native versus non-native speakers of BP. We recorded native and non-native speakers of BP for the purpose of the acoustic analysis of the vowels [a], [i], and [u] in tonic syllables. We analyzed the acoustic parameters of each segment using the Support Vector Machines algorithm to identify to which group, native or non-native, a new speaker belongs. When all of the variables were considered, a precision of 91% was obtained. The two most important acoustic cues to determine if a speaker is native or non-native were the durations of [i] and [u] in a word-final position. These findings can contribute to BP speaker identification as well as to the teaching of the pronunciation of Portuguese as a foreign language.


1995 ◽  
Vol 38 (5) ◽  
pp. 1014-1024 ◽  
Author(s):  
Robert L. Whitehead ◽  
Nicholas Schiavetti ◽  
Brenda H. Whitehead ◽  
Dale Evan Metz

The purpose of this investigation was twofold: (a) to determine if there are changes in specific temporal characteristics of speech that occur during simultaneous communication, and (b) to determine if known temporal rules of spoken English are disrupted during simultaneous communication. Ten speakers uttered sentences consisting of a carrier phrase and experimental CVC words under conditions of: (a) speech, (b) speech combined with signed English, and (c) speech combined with signed English for every word except the CVC word that was fingerspelled. The temporal features investigated included: (a) sentence duration, (b) experimental CVC word duration, (c) vowel duration in experimental CVC words, (d) pause duration before and after experimental CVC words, and (e) consonantal effects on vowel duration. Results indicated that for all durational measures, the speech/sign/fingerspelling condition was longest, followed by the speech/sign condition, with the speech condition being shortest. It was also found that for all three speaking conditions, vowels were longer in duration when preceding voiced consonants than vowels preceding their voiceless cognates, and that a low vowel was longer in duration than a high vowel. These findings indicate that speakers consistently reduced their rate of speech when using simultaneous communication, but did not violate these specific temporal rules of English important for consonant and vowel perception.


Phonetica ◽  
2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Jing Yang

This study examined the development of vowel categories in young Mandarin-English bilingual children. The participants included 35 children aged between 3 and 4 years old (15 Mandarin-English bilinguals, six English monolinguals, and 14 Mandarin monolinguals). The bilingual children were divided into two groups: one group had a shorter duration (<1 year) of intensive immersion in English (Bi-low group) and one group had a longer duration (>1 year) of intensive immersion in English (Bi-high group). The participants were recorded producing one list of Mandarin words containing the vowels /a, i, u, y, ɤ/ and/or one list of English words containing the vowels /i, ɪ, e, ɛ, æ, u, ʊ, o, ɑ, ʌ/. Formant frequency values were extracted at five equidistant time locations (the 20–35–50–65–80% point) over the course of vowel duration. Cross-language and within-language comparisons were conducted on the midpoint formant values and formant trajectories. The results showed that children in the Bi-low group produced their English vowels into clusters and showed positional deviations from the monolingual targets. However, they maintained the phonetic features of their native vowel sounds well and mainly used an assimilatory process to organize the vowel systems. Children in the Bi-high group separated their English vowels well. They used both assimilatory and dissimilatory processes to construct and refine the two vowel systems. These bilingual children approximated monolingual English children to a better extent than the children in the Bi-low group. However, when compared to the monolingual peers, they demonstrated observable deviations in both L1 and L2.
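
The five sampling locations described above (20, 35, 50, 65, and 80% of vowel duration) can be sketched as follows. This only computes the time points at which formants would be read; the function name and the example onset and offset are hypothetical, and the formant tracker itself is not shown.

```python
def formant_sample_times(onset_s, offset_s,
                         proportions=(0.20, 0.35, 0.50, 0.65, 0.80)):
    """Return time points (seconds) at fixed proportions of a vowel's duration."""
    if offset_s <= onset_s:
        raise ValueError("vowel offset must be later than vowel onset")
    duration = offset_s - onset_s
    return [onset_s + p * duration for p in proportions]


# A made-up vowel starting at 0.0 s and ending at 0.2 s
times = formant_sample_times(0.0, 0.2)
print([round(t, 3) for t in times])
```

The third value is the vowel midpoint, which is the location used for the midpoint comparisons mentioned in the abstract; the full list gives the trajectory.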


2017 ◽  
Vol 23 (1) ◽  
pp. 1-20
Author(s):  
Kathy Connaughton ◽  
Irena Yanushevskaya

Objective: This study explores the immediate impact of prolonged voice use by professional sports coaches. Method: Speech samples including sustained phonation of vowel /a/ and a short read passage were collected from two professional sports coaches. The audio recordings were made within an hour before and after a coaching session, over three sessions. Perceptual evaluation of voice quality was done using the GRBAS scale. The speech samples were subsequently analyzed using Praat. The acoustic measures included fundamental frequency (f0), jitter, shimmer, Harmonics-to-Noise ratio and Cepstral Peak Prominence. Main results: The results of perceptual and acoustic analysis suggest a slight shift towards a tenser phonation post-coaching session, which is a likely consequence of laryngeal muscle adaptation to prolonged voice use. This tendency was similar in sustained vowels and connected speech. Conclusion: Acoustic measures used in this study can be useful to capture the voice change post-coaching session. It is desirable, however, that more sophisticated and robust and at the same time intuitive and easy-to-use tools for voice assessment and monitoring be made available to clinicians and professional voice users.
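
As a rough illustration of two of the acoustic measures listed above, the sketch below computes jitter and shimmer from already-extracted glottal periods and peak amplitudes. The abstract does not say which variants were used; this assumes the common "local" definition (mean absolute difference between consecutive cycles divided by the mean), and the function name and data are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value.

    Applied to glottal periods this gives local jitter; applied to peak
    amplitudes it gives local shimmer (both as proportions, not percent).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Made-up cycle measurements from a sustained /a/
periods_s = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.78, 0.82, 0.79]
print("jitter (local):", local_perturbation(periods_s))
print("shimmer (local):", local_perturbation(amplitudes))
```

In a before/after design like the one described, the same function would be applied to the pre-session and post-session recordings and the two results compared.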


2014 ◽  
Vol 19 (4) ◽  
pp. 387-398 ◽  
Author(s):  
Vanessa Veis Ribeiro ◽  
Carla Aparecida Cielo

Purpose: Describe and correlate acoustic and auditory-perceptual vocal measures, vocal complaints and professional characteristics of a group of teachers. Methods: Ninety-nine female primary school teachers, aged 20 to 66 years, underwent auditory-perceptual (CAPE-V) and acoustic (Multi-Dimensional Voice Program Advanced) vocal assessments, and answered a questionnaire with questions about personal identification, overall health, occupational activities and vocal complaints. ANOVA and Pearson’s correlation tests were applied. Results: The teachers worked 6.98 hours a day, on average, and had been working as teachers for approximately 12.91 years. Most of them reported vocal complaints and were employed in private schools. Auditory perceptual parameters were normal. All measures of jitter, shimmer, voiceless or unvoiced and subharmonic segments were above the normal range, as well as the standard deviation for fundamental frequency and soft phonation index. Perturbation frequency and age, roughness, breathiness and overall degree of voice were positively correlated with age and length of professional practice. There was also a negative correlation between amplitude perturbation and daily use of voice. Conclusion: The teachers’ voices were considered as normal by the auditory-perceptual assessment, but noise and instability were detected in the acoustic analysis; there were, particularly, vocal complaints, and alteration of vocal acoustic and auditory-perceptual measures with increasing age and length of professional practice.


2019 ◽  
Vol 62 (1) ◽  
pp. 60-69
Author(s):  
Areen Badwal ◽  
JoHanna Poertner ◽  
Robin A. Samlan ◽  
Julie E. Miller

Purpose: The zebra finch is used as a model to study the neural circuitry of auditory-guided human vocal production. The terminology of birdsong production and acoustic analysis, however, differs from human voice production, making it difficult for voice researchers of either species to navigate the literature from the other. The purpose of this research note is to identify common terminology and measures to better compare information across species. Method: Terminology used in the birdsong literature will be mapped onto terminology used in the human voice production literature. Measures typically used to quantify the percepts of pitch, loudness, and quality will be described. Measures common to the literature in both species will be made from the songs of 3 middle-age birds using Praat and Song Analysis Pro. Two measures, cepstral peak prominence (CPP) and Wiener entropy (WE), will be compared to determine if they provide similar information. Results: Similarities and differences in terminology and acoustic analyses are presented. A core set of measures including frequency, frequency variability within a syllable, intensity, CPP, and WE are proposed for future studies. CPP and WE are related yet provide unique information about the syllable structure. Conclusions: Using a core set of measures familiar to both human voice and birdsong researchers, along with both CPP and WE, will allow characterization of similarities and differences among birds. Standard terminology and measures will improve accessibility of the birdsong literature to human voice researchers and vice versa. Supplemental Material: https://doi.org/10.23641/asha.7438964
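
For readers unfamiliar with Wiener entropy, the sketch below shows one standard definition: the logarithm of the ratio of the geometric mean to the arithmetic mean of a power spectrum. It is an illustration under that assumption, not the implementation used by the tools named in the abstract, and the function name and example spectra are hypothetical.

```python
import math


def wiener_entropy(power_spectrum):
    """Log ratio of geometric mean to arithmetic mean of a power spectrum.

    The value is 0 for a perfectly flat (noise-like) spectrum and becomes
    more negative as energy concentrates in fewer frequency bins.
    """
    if not power_spectrum or any(p <= 0 for p in power_spectrum):
        raise ValueError("power values must be positive and non-empty")
    n = len(power_spectrum)
    log_geometric_mean = sum(math.log(p) for p in power_spectrum) / n
    arithmetic_mean = sum(power_spectrum) / n
    return log_geometric_mean - math.log(arithmetic_mean)


flat = [1.0] * 8                 # noise-like spectrum
peaked = [1e-6] * 7 + [1.0]      # energy concentrated in one bin
print(wiener_entropy(flat), wiener_entropy(peaked))
```

Because the measure summarizes spectral flatness rather than harmonic periodicity, it is plausible that it would overlap with, but not duplicate, CPP, which is consistent with the result reported above.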


This paper investigates vowel adaptation in English-based loanwords by a group of Saudi Arabic speakers, concentrating exclusively on vowels shared by the two languages. It examines 5 long vowels shared by the two vowel systems in terms of vowel quality and vowel duration in loanword productions by 22 participants and checks them against the properties of the same vowels in native words. To this end, the study performs an acoustic analysis of 660 tokens (loan and native vowel sounds) in Praat to measure the first two formants (F1: vowel height and F2: vowel advancement) of each vowel sound at two time points (T1: the vowel onset and T2: the peak of the vowel), as well as a durational analysis to examine vowel length. It reports that the first two formants of vowels in native words appear to be stable across the two time points, while values of the same vowels in loanwords fluctuate from T1 to T2. It also reports a durational difference: vowels in native words are longer than the same vowels in loanwords.


2020 ◽  
Vol 63 (4) ◽  
pp. 1002-1017
Author(s):  
Kevin J. Reilly

Purpose: This study investigated vowel and sibilant productions in noise to determine whether responses to noise (a) are sensitive to the spectral characteristics of the noise signal and (b) are modulated by the contribution of vowel or sibilant contrasts to word discrimination. Method: Vowel and sibilant productions were elicited during serial recall of three-word sequences that were produced in quiet or during exposure to speaker-specific noise signals. These signals either masked a speaker's productions of the sibilants /s/ and /ʃ/ or their productions of the vowels /a/ and /æ/. The contribution of the vowel and sibilant contrasts to word discrimination in a sequence was manipulated by varying the number of times that the target sibilant and vowel pairs occurred in the same word position in each sequence. Results: Spectral noise effects were observed for both sibilants and vowels: Responses to noise were larger and/or involved more acoustic features when the noise signal masked the acoustic characteristics of that phoneme class. Word discrimination effects were limited and consisted of only small increases in vowel duration. Interaction effects between noise and similarity indicated that the phonological similarity of sequences containing both sibilants and/or both vowels influenced articulation in ways not related to speech clarity. Conclusion: The findings of this study indicate that sensorimotor control of speech exhibits some sensitivity to noise spectral characteristics. However, productions of sibilants and vowels were not sensitive to their importance in discriminating the words in a sequence. In addition, phonological similarity effects were observed that likely reflected processing demands related to the recall and sequencing of high-similarity words.

