CREAKY VOICE: A NEW FEMININE VOICE QUALITY FOR YOUNG URBAN-ORIENTED UPWARDLY MOBILE AMERICAN WOMEN?

2010 ◽  
Vol 85 (3) ◽  
pp. 315-337 ◽  
Author(s):  
I. P. Yuasa

2010 ◽  
Vol 127 (3) ◽  
pp. 2023-2023 ◽  
Author(s):  
Hiu‐Wai Lam ◽  
Kristine M. Yu

2018 ◽  
Vol 16 (4) ◽  
pp. 377-406
Author(s):  
Léa Burin

Phonetic convergence is the process by which speakers adapt their speech to sound more similar to that of their interlocutor. While most studies analysing this process have been conducted amongst speakers sharing the same language or variety, this experiment focuses on imitation between non-native and native speakers in a repetition task. The data is a fragment from the ANGLISH corpus designed by Anne Tortel (Tortel, 2008). 40 French speakers (10 male intermediate, 10 male advanced, 10 female intermediate and 10 female advanced learners) were asked to repeat a set of 20 sentences produced by British native speakers. Segmental features (vowel quality), suprasegmental features (vowel duration) and voice quality were analysed. Level of proficiency, gender and model talker were taken as independent variables. Level appeared not to be a relevant parameter, owing to a high degree of inter-individual variability within groups. Somewhat contradictory results were observed for vowel duration and F1-F2 distance, as male learners converged more than female learners. Our hypothesis that low vowels display a higher degree of imitation, especially within the F1 dimension (Babel, 2012), was partially validated. Convergence in vowel duration in order to sound more native-like was also observed (Zając, 2013). Regarding the analysis of voice quality, and more particularly of creaky voice, observations suggest that some advanced female learners creaked more than the native speakers and more in the reading task, which indicates both linguistic idiosyncrasy and accommodation towards the native speakers. Low vowels also seem more likely to be produced with creaky voice, especially at the end of prosodic constituents.


Author(s):  
Marc Garellek ◽  
Christina M. Esposito

Hmong languages, particularly White Hmong, are well studied for their complex tone systems that incorporate pitch, phonation, and duration differences. Still, prior work has made use mostly of tones elicited in their citation forms in carrier phrases. In this paper, we provide a detailed description of both the vowel and tone systems of White Hmong from recordings of read speech. We confirm several features of the language, including the presence of nasal vowels (rather than derived nasalized vowels through coarticulation with a coda [ŋ]), the description of certain tone contours, and the systematic presence of breathy and creaky voice on two of the tones. We also find little evidence of additional intonational f0 targets. However, we show that some tones vary greatly by their position in utterance, and propose novel descriptions for several of them. Finally, we show that H1*–H2*, a widely used measure of voice quality and phonation in Hmong and across languages, does not adequately distinguish modal from non-modal phonation in this data set, and argue that noise measures like Cepstral Peak Prominence (CPP) are more robust to phonation differences in corpora with more variability.


2005 ◽  
Vol 118 (3) ◽  
pp. 1965-1965
Author(s):  
Avanti S. Shetye ◽  
Carol Y. Espy‐Wilson

2015 ◽  
Vol 1 ◽  
pp. 21-27
Author(s):  
Francesca Shaw ◽  
Victoria Crocker

This study examines the stylistic use of ‘creaky voice’ in a single speaker: the American actress Scarlett Johansson. Recently, there has been a marked increase in both media and academic interest in creaky voice, with work by Yuasa (2010) and Wolk et al. (2011) confirming the prevalence of this feature among young American female speakers. Our study was directly motivated by the work of Barry Pennock-Speck (2005), who took a qualitative approach to analyzing the speech of three American actresses for stylistic modulation of their voice quality. The present study focuses on only one American actress (Johansson), who was chosen because she was an established, successful young American female at the time of research and was therefore an appropriate subject to represent the social group under discussion. Our materials included six of Johansson’s films that were made while she was between the ages of 18 and 24. This age range is in line with previous work on creaky voice by Wolk et al. (2011), who defined their age bracket of study as 18–25 years old. We contrasted American and British character roles and noted the level of creak present through both quantitative and qualitative analysis of the six films: three in which she played an American and three in which she took on an English (UK) accent. Acoustic data evaluation involved coding for creak on syllabic nuclei and carrying out a statistical analysis to determine significant influences on the pattern we observed. Our qualitative analysis covers the following variables: character traits and personality, time period in which the film is set, and the age of Johansson’s character. Results showed that there was significantly more creak in Johansson’s speech while she was performing in an American role, in line with the study previously conducted by Pennock-Speck. Our qualitative findings suggest that creak is modulated at an additional level, indexing seductiveness and intimacy with the interlocutor.


2020 ◽  
Vol 63 (4) ◽  
pp. 1071-1082
Author(s):  
Theresa Schölderle ◽  
Elisabet Haas ◽  
Wolfram Ziegler

Purpose The aim of this study was to collect auditory-perceptual data on established symptom categories of dysarthria from typically developing children between 3 and 9 years of age, for the purpose of creating age norms for dysarthria assessment. Method One hundred forty-four typically developing children (3;0–9;11 [years;months], 72 girls and 72 boys) participated. We used a computer-based game specifically designed for this study to elicit sentence repetitions and spontaneous speech samples. Speech recordings were analyzed using the auditory-perceptual criteria of the Bogenhausen Dysarthria Scales, a standardized German assessment tool for dysarthria in adults. The Bogenhausen Dysarthria Scales (scales and features) cover clinically relevant dimensions of speech and allow for an evaluation of well-established symptom categories of dysarthria. Results The typically developing children exhibited a number of speech characteristics overlapping with established symptom categories of dysarthria (e.g., breathy voice, frequent inspirations, reduced articulatory precision, decreased articulation rate). Substantial progress was observed between 3 and 9 years of age, but with different developmental trajectories across different dimensions. In several areas (e.g., respiration, voice quality), 9-year-olds still presented with salient developmental speech characteristics, while in other dimensions (e.g., prosodic modulation), features typically associated with dysarthria occurred only exceptionally, even in the 3-year-olds. Conclusions The acquisition of speech motor functions is a prolonged process that is not yet complete at 9 years of age. Various developmental influences (e.g., anatomic–physiological changes) shape children's speech specifically. Our findings are a first step toward establishing auditory-perceptual norms for dysarthria in children of kindergarten and elementary school age. Supplemental Material https://doi.org/10.23641/asha.12133380


2020 ◽  
Vol 63 (12) ◽  
pp. 3991-3999
Author(s):  
Benjamin van der Woerd ◽  
Min Wu ◽  
Vijay Parsa ◽  
Philip C. Doyle ◽  
Kevin Fung

Objectives This study aimed to evaluate the effects of a smartphone microphone and recording environment on the fidelity and accuracy of acoustic measurements of voice. Method This was a prospective cohort proof-of-concept study. Two sets of prerecorded samples, (a) sustained vowels (/a/) and (b) a Rainbow Passage sentence, were played for recording via the internal iPhone microphone and the Blue Yeti USB microphone in two recording environments: a sound-treated booth and a quiet office setting. Recordings were presented using a calibrated mannequin speaker with a fixed signal intensity (69 dBA) at a fixed distance (15 in.). Each set of recordings (iPhone, audio booth; Blue Yeti, audio booth; iPhone, office; and Blue Yeti, office) was time-windowed to ensure the same signal was evaluated for each condition. Acoustic measures of voice, including fundamental frequency (fo), jitter, shimmer, harmonic-to-noise ratio (HNR), and cepstral peak prominence (CPP), were generated using a widely used analysis program (Praat Version 6.0.50). The data gathered were compared using a repeated measures analysis of variance. Two separate data sets were used. The set of vowel samples included both pathologic (n = 10) and normal (n = 10), male (n = 5) and female (n = 15) speakers. The set of sentence stimuli ranged in perceived voice quality from normal to severely disordered, with an equal number of male (n = 12) and female (n = 12) speakers evaluated. Results The vowel analyses indicated that jitter, shimmer, HNR, and CPP were significantly different based on microphone choice, and that shimmer, HNR, and CPP were significantly different based on the recording environment. Analysis of sentences revealed a statistically significant impact of recording environment and microphone type on HNR and CPP. While statistically significant, the differences across the experimental conditions for a subset of the acoustic measures (viz., jitter and CPP) fell within their respective normative ranges. Conclusions Both microphone and recording setting resulted in significant differences across several acoustic measurements. However, a subset of the acoustic measures that were statistically significant across the recording conditions showed small overall differences that are unlikely to have clinical significance in interpretation. For these acoustic measures, the present data suggest that, although a sound-treated setting is ideal for voice sample collection, a smartphone microphone can capture acceptable recordings for acoustic signal analysis.
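As a rough illustration of two of the measures named in the abstract above, the sketch below computes local jitter and local shimmer under their common textbook definitions: the mean absolute difference between consecutive glottal periods (or peak amplitudes), divided by the mean period (or amplitude). It is not the study's pipeline and not Praat's code. The function names are hypothetical, and the per-cycle periods and amplitudes are assumed to have been extracted already by some other tool.

```python
from statistics import mean


def _local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def local_jitter(periods):
    """Cycle-to-cycle period perturbation (hypothetical helper name).

    `periods` is a sequence of consecutive glottal period durations,
    all in the same unit (for example seconds).
    """
    return _local_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle amplitude perturbation (hypothetical helper name).

    `amplitudes` is a sequence of consecutive per-cycle peak amplitudes.
    """
    return _local_perturbation(amplitudes)
```

Both functions return a ratio; multiplying by 100 gives the percentage form in which these measures are often reported. A perfectly periodic signal yields 0 for both.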


2020 ◽  
Vol 63 (12) ◽  
pp. 3974-3981
Author(s):  
Ashwini Joshi ◽  
Isha Baheti ◽  
Vrushali Angadi

Aim The purpose of this study was to develop and assess the reliability of a Hindi version of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). Reliability was assessed by comparing Hindi CAPE-V ratings with English CAPE-V ratings and with ratings on the Grade, Roughness, Breathiness, Asthenia and Strain (GRBAS) scale. Method Hindi sentences were created to match the phonemic load of the corresponding English CAPE-V sentences. The Hindi sentences were adapted for linguistic content. The original English and adapted Hindi CAPE-V and GRBAS were completed for 33 bilingual individuals with normal voice quality. Additionally, the Hindi CAPE-V and GRBAS were completed for 13 Hindi speakers with disordered voice quality. The agreement of CAPE-V ratings was assessed between language versions, GRBAS ratings, and two rater pairs (three raters in total). Pearson product–moment correlation was completed for all comparisons. Results A strong correlation (r > .8, p < .01) was found between the Hindi CAPE-V scores and the English CAPE-V scores for most variables in normal voice participants. A weak correlation was found for the variable of strain (r < .2, p = .400) in the normative group. A strong correlation (r > .6, p < .01) was found between the overall severity/grade, roughness, and breathiness scores in the GRBAS scale and the CAPE-V scale in normal and disordered voice samples. Significant interrater reliability (r > .75) was present in overall severity and breathiness. Conclusions The Hindi version of the CAPE-V demonstrates good interrater reliability and concurrent validity with the English CAPE-V and the GRBAS. The Hindi CAPE-V can be used for the auditory-perceptual voice assessment of Hindi speakers.
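The Pearson product–moment correlation used for the comparisons above can be sketched in a few lines. This is a minimal illustration of the standard formula, not the authors' analysis script; the function name `pearson_r` and any ratings passed to it are assumptions made for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

In a setting like the one described, `x` and `y` might hold one rater's scores for the same speakers on the two language versions of the scale. The function returns only r; a p value would need a separate significance test.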

