Perceptual validation of vowel normalization methods for variationist research

2021
pp. 1-27
Author(s):
Santiago Barreda

Abstract: The evaluation of normalization methods sometimes focuses on the maximization of vowel-space similarity. This focus can lead to the adoption of methods that erase legitimate phonetic variation from our data, that is, overnormalization. First, a production corpus is presented that highlights three types of variation in formant patterns: uniform scaling, nonuniform scaling, and centralization. Then the results of two perceptual experiments are presented, both suggesting that listeners tend to ignore variation according to uniform scaling, while associating nonuniform scaling and centralization with phonetic differences. Overall, results suggest that normalization methods that remove variation not according to uniform scaling can remove legitimate phonetic variation from vowel formant data. As a result, although these methods can provide more similar vowel spaces, they do so by erasing phonetic variation from vowel data that may be socially and linguistically meaningful, including a potential male-female difference in the low vowels in our corpus.
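As a rough illustration of what "uniform scaling" means here, the sketch below builds a second speaker whose formants are all a constant multiple of the first speaker's, then applies a log-mean normalization of the kind associated with Nearey. The formant values, the scale factor, and the function name `log_mean_normalize` are assumptions made for illustration; they are not taken from the paper's corpus or code.

```python
import math

def log_mean_normalize(tokens_hz):
    """Subtract the speaker's grand mean log-formant from every log-formant.

    tokens_hz is a list of (F1, F2) pairs in Hz for one speaker.
    """
    logs = [math.log(f) for token in tokens_hz for f in token]
    speaker_mean = sum(logs) / len(logs)
    return [tuple(math.log(f) - speaker_mean for f in token) for token in tokens_hz]

# Hypothetical (F1, F2) values in Hz, invented for this example.
speaker_a = [(300.0, 2300.0), (700.0, 1200.0), (350.0, 800.0)]

# Speaker B differs from speaker A only by a uniform scale factor (assumed 1.15).
scale = 1.15
speaker_b = [(f1 * scale, f2 * scale) for f1, f2 in speaker_a]

norm_a = log_mean_normalize(speaker_a)
norm_b = log_mean_normalize(speaker_b)

# Largest remaining difference between the two normalized vowel spaces.
max_diff = max(
    abs(a - b)
    for tok_a, tok_b in zip(norm_a, norm_b)
    for a, b in zip(tok_a, tok_b)
)
print(max_diff)
```

Because a constant factor becomes a constant offset on the log scale, subtracting the speaker's mean removes it and leaves any nonuniform differences in place.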

2009
Vol 21 (3)
pp. 413-435
Author(s):
Anne H. Fabricius
Dominic Watt
Daniel Ezra Johnson

Abstract: This article evaluates a speaker-intrinsic vowel formant frequency normalization algorithm initially proposed in Watt & Fabricius (2002). We compare how well this routine, known as the S-centroid procedure, performs as a sociophonetic research tool in three ways: reducing variance in area ratios of vowel spaces (by attempting to equalize vowel space areas); improving overlap of vowel polygons; and reproducing relative positions of vowel means within the vowel space, compared with formant data in raw Hertz. The study uses existing data sets of vowel formant data from two varieties of English, Received Pronunciation and Aberdeen English (northeast Scotland). We conclude that, for the data examined here, the S-centroid W&F procedure performs at least as well as the two speaker-intrinsic, vowel-extrinsic, formant-intrinsic normalization methods rated as best performing by Adank (2003): Lobanov's (1971) z-score procedure and Nearey's (1978) individual log-mean procedure (CLIHi4 in Adank [2003], CLIHi2 as tested here), and in some test cases better than the latter.
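For readers unfamiliar with the comparison methods, a minimal sketch of a Lobanov-style z-score normalization follows. It treats one formant of one speaker at a time, which is what "formant-intrinsic" refers to. The input values are invented, the function name `lobanov` is hypothetical, and the use of the sample standard deviation rather than the population one is an assumption, not something the abstract specifies.

```python
from statistics import mean, stdev

def lobanov(values_hz):
    """Z-score one formant across all of a single speaker's vowel tokens."""
    m = mean(values_hz)
    s = stdev(values_hz)  # sample SD; an assumption made for this sketch
    return [(v - m) / s for v in values_hz]

# Hypothetical F1 values in Hz for one speaker, invented for illustration.
f1_hz = [300.0, 700.0, 350.0, 500.0]
f1_z = lobanov(f1_hz)
print(f1_z)
```

After this step each speaker's formant has mean 0 and standard deviation 1, so differences in both the location and the spread of the vowel space are removed.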


Author(s):  
Yeptain Leung
Jennifer Oates
Siew-Pang Chan
Viktória Papp

Purpose: The aim of the study was to examine associations between speaking fundamental frequency (fos), vowel formant frequencies (F), listener perceptions of speaker gender, and vocal femininity–masculinity. Method: An exploratory study was undertaken to examine associations between fos, F1–F3, listener perceptions of speaker gender (nominal scale), and vocal femininity–masculinity (visual analog scale). For 379 speakers of Australian English aged 18–60 years, fos mode and F1–F3 (12 monophthongs; total of 36 Fs) were analyzed on a standard reading passage. Seventeen listeners rated speaker gender and vocal femininity–masculinity on randomized audio recordings of these speakers. Results: Model building using principal component analysis suggested the 36 Fs could be succinctly reduced to seven principal components (PCs). Generalized structural equation modeling (with the seven PCs of F and fos as predictors) suggested that only F2 and fos predicted listener perceptions of speaker gender (male, female, unable to decide). However, listener perceptions of vocal femininity–masculinity behaved differently and were predicted by F1, F3, and the contrast between monophthongs at the extremities of the F1 acoustic vowel space, in addition to F2 and fos. Furthermore, listeners' perceptions of speaker gender also influenced ratings of vocal femininity–masculinity substantially. Conclusion: Adjusted odds ratios highlighted the substantially larger contribution of F to listener perceptions of speaker gender and vocal femininity–masculinity relative to fos than has previously been reported.


2018
Vol 61 (5)
pp. 1055-1069
Author(s):
Sonia Granlund
Valerie Hazan
Merle Mahon

Purpose: This study aims to examine the clear speaking strategies used by older children when interacting with a peer with hearing loss, focusing on both acoustic and linguistic adaptations in speech. Method: The Grid task, a problem-solving task developed to elicit spontaneous interactive speech, was used to obtain a range of global acoustic and linguistic measures. Eighteen 9- to 14-year-old children with normal hearing (NH) performed the task in pairs, once with a friend with NH and once with a friend with a hearing impairment (HI). Results: In HI-directed speech, children increased their fundamental frequency range and midfrequency intensity, decreased the number of words per phrase, and expanded their vowel space area by increasing F1 and F2 range, relative to NH-directed speech. However, participants did not appear to make changes to their articulation rate, the lexical frequency of content words, or lexical diversity when talking to their friend with HI compared with their friend with NH. Conclusions: Older children show evidence of listener-oriented adaptations to their speech production; although their speech production systems are still developing, they are able to make speech adaptations to benefit the needs of a peer with HI, even without being given a specific instruction to do so. Supplemental Material: https://doi.org/10.23641/asha.6118817


2001
Vol 44 (3)
pp. 552-563
Author(s):
Harlan Lane
Melanie Matthies
Joseph Perkell
Jennell Vick
Majid Zandipour

In order to examine the role of hearing status in controlling coarticulation, eight English vowels in /bVt/ and /dVt/ syllables, embedded in a carrier phrase, were elicited from 7 postlingually deafened adults and 2 speakers with normal hearing. The deaf adults served in repeated recording sessions both before and up to a year after they received cochlear implants and their speech processors were turned on. Each of the two hearing control speakers served in two recording sessions, separated by about 3 months. Measures were made of second formant frequency at obstruent release and at 25 ms intervals until the final obstruent. An index of coarticulation, based on the ratio of F2 at vowel onset to F2 at midvowel target, was computed. Changes in the amount of coarticulation after the change in hearing status were small and nonsystematic for the /bVt/ syllables; those for the /dVt/ syllables averaged a 3% increase—within the range of reliability measures for the 2 hearing control speakers. Locus equations (F2 at vowel onset vs. F2 at vowel midpoint) and ratios of F2 onsets in point vowels were also calculated. Like the index of coarticulation, these measures tended to confirm that hearing status had little if any effect on coarticulation in the deaf speakers, consistent with the hypothesis that hearing does not play a direct role in regulating anticipatory coarticulation in adulthood. With the restoration of some hearing, 2 implant users significantly increased the average spacing between vowels in the formant plane, whereas the remaining 5 decreased that measure. All speakers but one also reduced vowel duration significantly. Four of the speakers reduced dispersion of vowel formant values around vowel midpoint means, but the other 3 did not show this effect.
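The two F2-based measures named in this abstract can be sketched as follows: a per-token ratio of F2 at vowel onset to F2 at midvowel, and a locus equation obtained by regressing F2 onset on F2 midpoint across vowels. This is only an illustrative sketch under stated assumptions. The data are synthetic (onsets are generated as exactly 0.5 × midpoint + 900 Hz), and the function names are hypothetical rather than the authors' own.

```python
def coarticulation_index(f2_onset, f2_mid):
    """Ratio of F2 at vowel onset to F2 at the midvowel target for one token."""
    return f2_onset / f2_mid

def locus_equation(f2_mid, f2_onset):
    """Ordinary least-squares fit of F2 onset (y) on F2 midpoint (x).

    Returns (slope, intercept).
    """
    n = len(f2_mid)
    mean_x = sum(f2_mid) / n
    mean_y = sum(f2_onset) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_mid)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_mid, f2_onset))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic F2 values in Hz for four vowels after one consonant (invented data).
f2_mid = [2200.0, 1800.0, 1200.0, 900.0]
f2_onset = [0.5 * x + 900.0 for x in f2_mid]

ratios = [coarticulation_index(on, mid) for on, mid in zip(f2_onset, f2_mid)]
slope, intercept = locus_equation(f2_mid, f2_onset)
print(ratios, slope, intercept)
```

In the locus-equation literature, a slope near 1 is usually read as onset F2 tracking the vowel closely (more coarticulation), and a slope near 0 as a fixed consonant locus (less coarticulation).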


2019
Vol 62 (5)
pp. 1278-1295
Author(s):
Laura L. Koenig
Susanne Fuchs

Purpose: This study evaluated how 1st and 2nd vowel formant frequencies (F1, F2) differ between normal and loud speech in multiple speaking tasks to assess claims that loudness leads to exaggerated vowel articulation. Method: Eleven healthy German-speaking women produced normal and loud speech in 3 tasks that varied in the degree of spontaneity: reading sentences that contained isolated /i: a: u:/, responding to questions that included target words with controlled consonantal contexts but varying vowel qualities, and a recipe recall task. Loudness variation was elicited naturalistically by changing interlocutor distance. First and 2nd formant frequencies and average sound pressure level were obtained from the stressed vowels in the target words, and vowel space area was calculated from /i: a: u:/. Results: Comparisons across many vowels indicated that high, tense vowels showed limited formant variation as a function of loudness. Analysis of /i: a: u:/ across speech tasks revealed vowel space reduction in the recipe retell task compared to the other 2. Loudness changes for F1 were consistent in direction but variable in extent, with few significant results for high tense vowels. Results for F2 were quite varied and frequently not significant. Speakers differed in how loudness and task affected formant values. Finally, correlations between sound pressure level and F1 were generally positive but varied in magnitude across vowels, with the high tense vowels showing very flat slopes. Discussion: These data indicate that naturalistically elicited loud speech in typical speakers does not always lead to changes in vowel formant frequencies and call into question the notion that increasing loudness is necessarily an automatic method of expanding the vowel space. Supplemental Material: https://doi.org/10.23641/asha.8061740
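A vowel space area based on the three corner vowels /i: a: u:/ is commonly computed as the area of the triangle they form in the F1 by F2 plane. The sketch below uses the shoelace formula for that. The abstract does not state which formula or frequency scale the authors used, so both are assumptions here, and the formant means are invented for illustration.

```python
def triangle_area(p1, p2, p3):
    """Area of a triangle from three (F1, F2) points via the shoelace formula."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

# Hypothetical mean (F1, F2) values in Hz for one speaker and condition.
vowel_i = (300.0, 2400.0)
vowel_a = (800.0, 1400.0)
vowel_u = (320.0, 800.0)

area_hz2 = triangle_area(vowel_i, vowel_a, vowel_u)
print(area_hz2)  # area in Hz squared
```

Comparing this value between normal and loud speech for the same speaker would give one simple way to quantify expansion or reduction of the vowel space.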


2021
pp. 002383092110149
Author(s):
Sky Onosson
Jesse Stewart

Media Lengua (ML), a mixed language derived from Quichua and Spanish, exhibits a phonological system that largely conforms to that of Quichua acoustically. Yet, it incorporates a large number of vowel sequences from Spanish which do not occur in the Quichua system. This includes the use of mid-vowels, which are phonetically realized in ML as largely overlapping with the high-vowels in acoustic space. We analyze and compare production of vowel sequences by speakers of ML, Quichua, and Spanish through the use of generalized additive mixed models to determine statistically significant differences between vowel formant trajectories. Our results indicate that Spanish-derived ML vowel sequences frequently differ significantly from their Spanish counterparts, largely occupying a more central region of the vowel space and frequently exhibiting markedly reduced trajectories over time. In contrast, we find only one case where an ML vowel sequence differs significantly from its Quichua counterpart—and even in this case the difference from Spanish is substantially greater. Our findings show how the vowel system of ML successfully integrates novel vowel sequence patterns from Spanish into what is essentially Quichua phonology by markedly adapting their production, while still maintaining contrasts which are not expressed in Quichua.


2010
Vol 4 (1)
pp. 203-217
Author(s):
Lei Jiao

In this paper I discuss the necessity and feasibility of applying normalization methods in phonetic studies, especially in the acoustic analysis of vowels. I also compare and evaluate the vowel normalization methods listed in Adank et al. (2004) by calculating the centrality of vowels after normalization. The results show that the Z-score method and the CLIH1 method are the most effective in centralizing the data and eliminating gender differences.


1977
Vol 62 (S1)
pp. S26-S26
Author(s):
Matthew Lennig
Donald Hindle

2018
Vol 41
Author(s):
Duane T. Wegener
Leandre R. Fabrigar

Abstract: Replications can make theoretical contributions, but are unlikely to do so if their findings are open to multiple interpretations (especially violations of psychometric invariance). Thus, just as studies demonstrating novel effects are often expected to empirically evaluate competing explanations, replications should be held to similar standards. Unfortunately, this is rarely done, thereby undermining the value of replication research.

