Stability, reliability, and sensitivity of acoustic measures of vowel space: A comparison of vowel space area, formant centralization ratio, and vowel articulation index

2020 ◽  
Vol 148 (3) ◽  
pp. 1436-1444
Author(s):  
Marja W. J. Caverlé ◽  
Adam P. Vogel
Author(s):  
Christopher Dromey ◽  
Michelle Richins ◽  
Tanner Low

Purpose We examined the effect of bite block insertion (BBI) on lingual movements and formant frequencies in corner vowel and diphthong production in a sentence context. Method Twenty young adults produced the corner vowels (/u/, /ɑ/, /æ/, /i/) and the diphthong /ɑɪ/ in sentence contexts before and after BBI. An electromagnetic articulograph measured the movements of the tongue back, middle, and front. Results There were significant decreases in the acoustic vowel articulation index and vowel space area following BBI. The kinematic vowel articulation index decreased significantly for the back and middle of the tongue but not for the front. There were no significant acoustic changes post-BBI for the diphthong, other than a longer transition duration. Diphthong kinematic changes after BBI included smaller movements for the back and middle of the tongue, but not the front. Conclusions BBI led to a smaller acoustic working space for the corner vowels. The adjustments made by the front of the tongue were sufficient to compensate for the BBI perturbation in the diphthong, resulting in unchanged formant trajectories. The back and middle of the tongue were likely biomechanically restricted in their displacement by the fixation of the jaw, whereas the tongue front showed greater movement flexibility.
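As a rough illustration of the acoustic measures named in this abstract, the sketch below computes a quadrilateral vowel space area (shoelace formula over the four corner vowels), the vowel articulation index, and its reciprocal, the formant centralization ratio. The definitions are the ones commonly cited in the literature and are assumed here; the abstract does not state which formulation the authors used. The formant values, dictionary keys, and function names are invented for illustration only.

```python
# Minimal sketch, NOT the authors' analysis code.
# Assumed definitions (commonly cited, not confirmed by the abstract):
#   VAI = (F2i + F1a) / (F1i + F1u + F2u + F2a)
#   FCR = 1 / VAI
#   VSA = area of the polygon spanned by the corner vowels in the F2 x F1 plane


def polygon_area(points):
    """Shoelace formula; points are (x, y) pairs listed in order around the polygon."""
    twice_area = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def vowel_space_area(formants, order=("i", "ae", "a", "u")):
    """Quadrilateral vowel space area in Hz^2 from mean F1/F2 per corner vowel."""
    return polygon_area([(formants[v]["F2"], formants[v]["F1"]) for v in order])


def vowel_articulation_index(formants):
    """VAI from /i/, /u/ and /a/; larger values indicate less centralization."""
    i, u, a = formants["i"], formants["u"], formants["a"]
    return (i["F2"] + a["F1"]) / (i["F1"] + u["F1"] + u["F2"] + a["F2"])


def formant_centralization_ratio(formants):
    """FCR is the reciprocal of VAI; larger values indicate more centralization."""
    return 1.0 / vowel_articulation_index(formants)


# Hypothetical mean formant values in Hz (illustrative, not from the study).
example = {
    "i": {"F1": 300.0, "F2": 2300.0},
    "ae": {"F1": 700.0, "F2": 1800.0},
    "a": {"F1": 750.0, "F2": 1100.0},
    "u": {"F1": 350.0, "F2": 900.0},
}

if __name__ == "__main__":
    print("VSA (Hz^2):", vowel_space_area(example))
    print("VAI:", round(vowel_articulation_index(example), 3))
    print("FCR:", round(formant_centralization_ratio(example), 3))
```

Under these assumed definitions, a reduced working space after a perturbation such as bite block insertion would show up as a smaller VSA and VAI and a larger FCR.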


2016 ◽  
Vol 59 (4) ◽  
pp. 631-646 ◽  
Author(s):  
Jennifer Lam ◽  
Kris Tjaden

Purpose The authors investigated how different variants of clear speech affect segmental and suprasegmental acoustic measures of speech in speakers with Parkinson's disease and a healthy control group. Method A total of 14 participants with Parkinson's disease and 14 control participants served as speakers. Each speaker produced 18 different sentences selected from the Sentence Intelligibility Test (Yorkston & Beukelman, 1996). All speakers produced stimuli in 4 speaking conditions (habitual, clear, overenunciate, and hearing impaired). Segmental acoustic measures included vowel space area and first moment (M1) coefficient difference measures for consonant pairs. Second formant slope of diphthongs and measures of vowel and fricative durations were also obtained. Suprasegmental measures included fundamental frequency, sound pressure level, and articulation rate. Results For the majority of adjustments, all variants of clear speech instruction differed from the habitual condition. The overenunciate condition elicited the greatest magnitude of change for segmental measures (vowel space area, vowel durations) and the slowest articulation rates. The hearing impaired condition elicited the greatest fricative durations and suprasegmental adjustments (fundamental frequency, sound pressure level). Conclusions Findings have implications for a model of speech production for healthy speakers as well as for speakers with dysarthria. Findings also suggest that particular clear speech instructions may target distinct speech subsystems.


2019 ◽  
Vol 62 (11) ◽  
pp. 4001-4014
Author(s):  
Melanie Weirich ◽  
Adrian Simpson

Purpose The study sets out to investigate inter- and intraspeaker variation in German infant-directed speech (IDS) and considers the potential impact that the factors gender, parental involvement, and speech material (read vs. spontaneous speech) may have. In addition, we analyze data from 3 time points prior to and after the birth of the child to examine potential changes in the features of IDS and, in particular, of adult-directed speech (ADS). Here, the gender identity of a speaker is considered as an additional factor. Method IDS and ADS data from 34 participants (15 mothers, 19 fathers) were gathered by means of a reading and a picture description task. For IDS, 2 recordings were made when the baby was approximately 6 and 9 months old, respectively. For ADS, an additional recording was made before the baby was born. Phonetic analyses comprised mean fundamental frequency (f0), variation in f0, the first two formants measured in /i: ɛ a u:/, and the vowel space size. Moreover, social and behavioral data were gathered regarding parental involvement and gender identity. Results German IDS is characterized by an increase in mean f0, a larger variation in f0, vowel- and formant-specific differences, and a larger acoustic vowel space. No effect of gender or parental involvement was found. Also, the phonetic features of IDS were found in both spontaneous and read speech. Regarding ADS, changes in vowel space size in some of the fathers and in mean f0 in mothers were found. Conclusion Phonetic features of German IDS are robust with respect to the factors gender, parental involvement, speech material (read vs. spontaneous speech), and time. Some phonetic features of ADS changed within the child's first year depending on gender and parental involvement/gender identity. Thus, further research on IDS also needs to address potential changes in ADS.


2020 ◽  
Vol 63 (12) ◽  
pp. 3991-3999
Author(s):  
Benjamin van der Woerd ◽  
Min Wu ◽  
Vijay Parsa ◽  
Philip C. Doyle ◽  
Kevin Fung

Objectives This study aimed to evaluate the fidelity and accuracy of a smartphone microphone and recording environment on acoustic measurements of voice. Method A prospective cohort proof-of-concept study. Two sets of prerecorded samples, (a) sustained vowels (/a/) and (b) Rainbow Passage sentence, were played for recording via the internal iPhone microphone and the Blue Yeti USB microphone in two recording environments: a sound-treated booth and a quiet office setting. Recordings were presented using a calibrated mannequin speaker with a fixed signal intensity (69 dBA), at a fixed distance (15 in.). Each set of recordings (iPhone–audio booth, Blue Yeti–audio booth, iPhone–office, and Blue Yeti–office) was time-windowed to ensure the same signal was evaluated for each condition. Acoustic measures of voice, including fundamental frequency (fo), jitter, shimmer, harmonic-to-noise ratio (HNR), and cepstral peak prominence (CPP), were generated using a widely used analysis program (Praat Version 6.0.50). The data gathered were compared using a repeated measures analysis of variance. Two separate data sets were used. The set of vowel samples included both pathologic (n = 10) and normal (n = 10), male (n = 5) and female (n = 15) speakers. The set of sentence stimuli ranged in perceived voice quality from normal to severely disordered with an equal number of male (n = 12) and female (n = 12) speakers evaluated. Results The vowel analyses indicated that the jitter, shimmer, HNR, and CPP were significantly different based on microphone choice and shimmer, HNR, and CPP were significantly different based on the recording environment. Analysis of sentences revealed a statistically significant impact of recording environment and microphone type on HNR and CPP. While statistically significant, the differences across the experimental conditions for a subset of the acoustic measures (viz., jitter and CPP) fell within their respective normative ranges. Conclusions Both microphone and recording setting resulted in significant differences across several acoustic measurements. However, a subset of the acoustic measures that were statistically significant across the recording conditions showed small overall differences that are unlikely to have clinical significance in interpretation. For these acoustic measures, the present data suggest that, although a sound-treated setting is ideal for voice sample collection, a smartphone microphone can capture acceptable recordings for acoustic signal analysis.
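For readers unfamiliar with the perturbation measures compared in this abstract, the sketch below shows the usual "local" definition shared by jitter and shimmer: the mean absolute difference between consecutive cycles divided by the mean value, applied to glottal periods for jitter and to peak amplitudes for shimmer. This is a simplified illustration under that assumed definition, not Praat's implementation, which also handles period detection and excludes implausible cycles. The function name and sample values are hypothetical.

```python
# Minimal sketch of "local" jitter and shimmer, assuming cycle periods and
# peak amplitudes have already been extracted. Not Praat's implementation.


def local_perturbation(values):
    """Mean absolute difference of consecutive values divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical cycle-by-cycle measurements (illustrative values only).
periods_s = [0.0100, 0.0102, 0.0099, 0.0101]  # glottal periods in seconds
amplitudes = [0.80, 0.82, 0.79, 0.81]  # peak amplitudes, arbitrary units

if __name__ == "__main__":
    print("jitter (local):", local_perturbation(periods_s))
    print("shimmer (local):", local_perturbation(amplitudes))
```

Because both measures depend on small cycle-to-cycle differences, added noise from a microphone or room can plausibly shift them, which is consistent with the sensitivity to recording conditions reported above.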


2019 ◽  
Vol 62 (12) ◽  
pp. 4534-4543
Author(s):  
Wei Hu ◽  
Sha Tao ◽  
Mingshuang Li ◽  
Chang Liu

Purpose The purpose of this study was to investigate how the distinctive establishment of 2nd language (L2) vowel categories (e.g., how distinctively an L2 vowel is established from nearby L2 vowels and from the native language counterpart in the 1st formant [F1] × 2nd formant [F2] vowel space) affected L2 vowel perception. Method Identification of 12 natural English monophthongs, and categorization and rating of synthetic English vowels /i/ and /ɪ/ in the F1 × F2 space were measured for Chinese-native (CN) and English-native (EN) listeners. CN listeners were also examined with categorization and rating of Chinese vowels in the F1 × F2 space. Results As expected, EN listeners significantly outperformed CN listeners in English vowel identification. Whereas EN listeners showed distinctive establishment of 2 English vowels, CN listeners had multiple patterns of L2 vowel establishment: both, 1, or neither established. Moreover, CN listeners' English vowel perception was significantly related to the perceptual distance between the English vowel and its Chinese counterpart, and the perceptual distance between the adjacent English vowels. Conclusions L2 vowel perception relied on listeners' capacity to distinctively establish L2 vowel categories that were distant from the nearby L2 vowels.


Author(s):  
Nikitha K. ◽  
Sishir Kalita ◽  
C.M. Vikram ◽  
M. Pushpavathi ◽  
S.R. Mahadeva Prasanna

2017 ◽  
Vol 23 (1) ◽  
pp. 1-20
Author(s):  
Kathy Connaughton ◽  
Irena Yanushevskaya

Objective: This study explores the immediate impact of prolonged voice use by professional sports coaches. Method: Speech samples including sustained phonation of vowel /a/ and a short read passage were collected from two professional sports coaches. The audio recordings were made within an hour before and after a coaching session, over three sessions. Perceptual evaluation of voice quality was done using the GRBAS scale. The speech samples were subsequently analyzed using Praat. The acoustic measures included fundamental frequency (f0), jitter, shimmer, Harmonics-to-Noise ratio and Cepstral Peak Prominence. Main results: The results of perceptual and acoustic analysis suggest a slight shift towards a tenser phonation post-coaching session, which is a likely consequence of laryngeal muscle adaptation to prolonged voice use. This tendency was similar in sustained vowels and connected speech. Conclusion: Acoustic measures used in this study can be useful to capture the voice change post-coaching session. It is desirable, however, that more sophisticated and robust and at the same time intuitive and easy-to-use tools for voice assessment and monitoring be made available to clinicians and professional voice users.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Catherine D. Chong ◽  
Jianwei Zhang ◽  
Jing Li ◽  
Teresa Wu ◽  
Gina Dumkrieger ◽  
...  

Abstract Background/objective Changes in speech can be detected objectively before and during migraine attacks. The goal of this study was to interrogate whether speech changes can be detected in subjects with post-traumatic headache (PTH) attributed to mild traumatic brain injury (mTBI) and whether there are within-subject changes in speech during headaches compared to the headache-free state. Methods Using a series of speech elicitation tasks uploaded via a mobile application, PTH subjects and healthy controls (HC) provided speech samples once every 3 days, over a period of 12 weeks. The following speech parameters were assessed: vowel space area, vowel articulation precision, consonant articulation precision, average pitch, pitch variance, speaking rate and pause rate. Speech samples of subjects with PTH were compared to HC. To assess speech changes associated with PTH, speech samples of subjects during headache were compared to speech samples when subjects were headache-free. All analyses were conducted using a mixed-effect model design. Results Longitudinal speech samples were collected from nineteen subjects with PTH (mean age = 42.5, SD = 13.7) who were an average of 14 days (SD = 32.2) from their mTBI at the time of enrollment and thirty-one HC (mean age = 38.7, SD = 12.5). Regardless of headache presence or absence, PTH subjects had longer pause rates and reductions in vowel and consonant articulation precision relative to HC. On days when speech was collected during a headache, there were longer pause rates, slower sentence speaking rates and less precise consonant articulation compared to the speech production of HC. During headache, PTH subjects had slower speaking rates yet more precise vowel articulation compared to when they were headache-free. Conclusions Compared to HC, subjects with acute PTH demonstrate altered speech as measured by objective features of speech production. For individuals with PTH, speech production may have been more effortful resulting in slower speaking rates and more precise vowel articulation during headache vs. when they were headache-free, suggesting that speech alterations were related to PTH and not solely due to the underlying mTBI.


Languages ◽  
2021 ◽  
Vol 6 (3) ◽  
pp. 114
Author(s):  
Ulrich Reubold ◽  
Sanne Ditewig ◽  
Robert Mayr ◽  
Ineke Mennen

The present study sought to examine the effect of dual language activation on L1 speech in late English–Austrian German sequential bilinguals, and to identify relevant predictor variables. To this end, we compared the English speech patterns of adult migrants to Austria in a code-switched and monolingual condition alongside those of monolingual native speakers in England in a monolingual condition. In the code-switched materials, German words containing target segments known to trigger cross-linguistic interaction in the two languages (i.e., [v–w], [ʃt(ʁ)-st(ɹ)] and [l-ɫ]) were inserted into an English frame; monolingual materials comprised English words with the same segments. To examine whether the position of the German item affects L1 speech, the segments occurred either before the switch (“He wants a Wienerschnitzel”) or after (“I like Würstel with mustard”). Critical acoustic measures of these segments revealed no differences between the groups in the monolingual condition, but significant L2-induced shifts in the bilinguals’ L1 speech production in the code-switched condition for some sounds. These were found to occur both before and after a code-switch, and exhibited a fair amount of individual variation. Only the amount of L2 use was found to be a significant predictor variable for shift size in code-switched compared with monolingual utterances, and only for [w]. These results have important implications for the role of dual activation in the speech of late sequential bilinguals.

