Some sociolinguistic evaluations of performances of the California Vowel Shift: a matched-guise study

Abstract: In Raleigh, North Carolina, a Southern U.S. city, five decades of in-migration of technology-sector workers from outside the South has resulted in large-scale contact between the local Southern dialect and non-Southern dialects. This paper investigates the speed and magnitude of the reversal of the Southern Vowel Shift (SVS) with respect to the five front vowels, using Trudgill's (1998) model of dialect contact as a framework. The data consist of conversational interviews with 59 white-collar Raleigh natives representing three generations, the first generation having reached adulthood before large-scale contact. Acoustic analysis shows that all vowels shift away from their Southern variants across apparent time. The leveling of SVS variants begins within the first generation to grow up after large-scale contact began, and contrary to predictions, this generation does not show wide inter- or intraspeaker variability. Previous studies of dialect contact and new dialect formation suggest that leveling of regional dialect features and the establishment of stable linguistic norms occur more quickly when children have regular contact with one another. Dialect contact in Raleigh has occurred primarily within the middle and upper classes, the members of which are densely connected by virtue of schools and heavy economic segregation in neighborhood residence.

Intraspeaker Priming across the New Zealand English Short Front Vowel Shift

Language and Speech ◽

10.1177/00238309211053033 ◽

2021 ◽

pp. 002383092110530

Author(s):

Dan Villarreal ◽

Lynn Clark

Keyword(s):

New Zealand ◽

Corpus Linguistics ◽

Repetition Effect ◽

Front Vowel ◽

Repetition Effects ◽

Phonetic Variation ◽

Syntactic Variation ◽

Vowel Shift ◽

And Gender ◽

Linguistic Material

A growing body of research in psycholinguistics, corpus linguistics, and sociolinguistics shows that we have a strong tendency to repeat linguistic material that we have recently produced, seen, or heard. The present paper investigates whether priming effects manifest in continuous phonetic variation the way they have been reported in phonological, morphological, and syntactic variation. We analyzed nearly 60,000 tokens of vowels involved in the New Zealand English short front vowel shift (SFVS), a change in progress in which trap/dress move in the opposite direction to kit, from a topic-controlled corpus of monologues (166 speakers), to test for effects that are characteristic of priming phenomena: repetition, decay, and lexical boost. Our analysis found evidence for all three effects. Tokens that were relatively high and front tended to be followed by tokens that were also high and front; the repetition effect weakened with greater time between the prime and target; and the repetition effect was stronger if the prime and target belonged to (different tokens of) the same word. Contrary to our expectations, however, the cross-vowel effects suggest that the repetition effect responded not to the direction of vowel changes within the SFVS, but rather to the peripherality of the tokens. We also found an interaction between priming behavior and gender, with stronger repetition effects among men than women. While these findings both indicate that priming manifests in continuous phonetic variation and provide further evidence that priming is among the factors providing structure to intraspeaker variation, they also challenge unitary accounts of priming phenomena.

Northern dialect evidence for the chronology of the Great Vowel Shift

Journal of Linguistic Geography ◽

10.1017/jlg.2014.9 ◽

2014 ◽

Vol 2 (2) ◽

pp. 87-102

Author(s):

Hilary Prichard

Keyword(s):

Middle English ◽

Historical Data ◽

Dialect Contact ◽

Chain Analysis ◽

Vowel Shift ◽

New Perspective ◽

Dialect Geography

This paper demonstrates how the tools of dialect geography may fruitfully lend a new perspective to historical data in order to address the lingering questions left by previous analyses. A geographic examination of Survey of English Dialects data provides evidence in favor of a push-chain analysis of the Great Vowel Shift, in which the Middle English high-mid long vowels raised before the high long vowels were diphthongized. It is also demonstrated that the so-called “irregular” dialect outcomes, which have previously been cited as evidence for a lack of unity of the Great Vowel Shift, are no longer problematic when viewed in the light of a theory of dialect contact, and can in fact refine our understanding of the chronology and geographic extent of the shift itself.

Chapter 4 The Anti-clockwise Vowel Shift

English After RP ◽

10.1007/978-3-030-04357-5_5 ◽

2019 ◽

pp. 17-21

Author(s):

Geoff Lindsey

Keyword(s):

Vowel Shift

Advances in Completely Automated Vowel Analysis for Sociophonetics: Using End-to-End Speech Recognition Systems With DARLA

Frontiers in Artificial Intelligence ◽

10.3389/frai.2021.662097 ◽

2021 ◽

Vol 4 ◽

Author(s):

Rolando Coto-Solano ◽

James N. Stanford ◽

Sravana K. Reddy

Keyword(s):

Speech Recognition ◽

Computational Linguistics ◽

North American ◽

Ground Truth ◽

Automated System ◽

The North ◽

Vowel Formants ◽

Southern Vowel Shift ◽

Northern Cities ◽

Vowel Shift

In recent decades, computational approaches to sociophonetic vowel analysis have been steadily increasing, and sociolinguists now frequently use semi-automated systems for phonetic alignment and vowel formant extraction, including FAVE (Forced Alignment and Vowel Extraction, Rosenfelder et al., 2011; Evanini et al., Proceedings of Interspeech, 2009), Penn Aligner (Yuan and Liberman, J. Acoust. Soc. America, 2008, 123, 3878), and DARLA (Dartmouth Linguistic Automation) (Reddy and Stanford, DARLA Dartmouth Linguistic Automation: Online Tools for Linguistic Research, 2015a). Yet these systems still have a major bottleneck: manual transcription. For most modern sociolinguistic vowel alignment and formant extraction, researchers must first create manual transcriptions. This human step is painstaking, time-consuming, and resource intensive. If this manual step could be replaced with completely automated methods, sociolinguists could potentially tap into vast datasets that have previously been unexplored, including legacy recordings that are underutilized due to lack of transcriptions. Moreover, if sociolinguists could quickly and accurately extract phonetic information from the millions of hours of new audio content posted on the Internet every day, a virtual ocean of speech from newly created podcasts, videos, live-streams, and other audio content would now inform research. How close are the current technological tools to achieving such groundbreaking changes for sociolinguistics? Prior work (Reddy et al., Proceedings of the North American Association for Computational Linguistics 2015 Conference, 2015b, 71–75) showed that an HMM-based Automated Speech Recognition system, trained with CMU Sphinx (Lamere et al., 2003), was accurate enough for DARLA to uncover evidence of the US Southern Vowel Shift without any human transcription. Even so, because that automatic speech recognition (ASR) system relied on a small training set, it produced numerous transcription errors.
Six years have passed since that study, and since that time numerous end-to-end automatic speech recognition (ASR) algorithms have shown considerable improvement in transcription quality. One example of such a system is the RNN/CTC-based DeepSpeech from Mozilla (Hannun et al., 2014). (RNN stands for recurrent neural networks, the learning mechanism for DeepSpeech. CTC stands for connectionist temporal classification, the mechanism to merge phones into words.) The present paper combines DeepSpeech with DARLA to push the technological envelope and determine how well contemporary ASR systems can perform in completely automated vowel analyses with sociolinguistic goals. Specifically, we used these techniques on audio recordings from 352 North American English speakers in the International Dialects of English Archive (IDEA), extracting 88,500 tokens of vowels in stressed position from spontaneous, free speech passages. With this large dataset we conducted acoustic sociophonetic analyses of the Southern Vowel Shift and the Northern Cities Chain Shift in the North American IDEA speakers. We compared the results using three different sources of transcriptions: 1) IDEA’s manual transcriptions as the baseline “ground truth”, 2) the ASR built on CMU Sphinx used by Reddy et al. (Proceedings of the North American Association for Computational Linguistics 2015 Conference, 2015b, 71–75), and 3) the latest publicly available Mozilla DeepSpeech system. We input these three different transcriptions to DARLA, which automatically aligned and extracted the vowel formants from the 352 IDEA speakers. Our quantitative results show that newer ASR systems like DeepSpeech show considerable promise for sociolinguistic applications like DARLA. We found that DeepSpeech’s automated transcriptions had significantly lower character error rates than those from the prior Sphinx system (from 46 to 35%).
When we performed the sociolinguistic analysis of the extracted vowel formants from DARLA, we found that the automated transcriptions from DeepSpeech matched the results from the ground truth for the Southern Vowel Shift (SVS): five vowels showed a shift in both transcriptions, and two vowels didn’t show a shift in either transcription. The Northern Cities Shift (NCS) was more difficult to detect, but ground truth and DeepSpeech matched for four vowels: one of the vowels showed a clear shift, and three showed no shift in either transcription. Our study therefore shows how technology has made progress toward greater automation in vowel sociophonetics, while also showing what remains to be done. Our statistical modeling provides a quantified view of both the abilities and the limitations of a completely “hands-free” analysis of vowel shifts in a large dataset. Naturally, when comparing a completely automated system against a semi-automated system involving human manual work, there will always be a tradeoff between accuracy on the one hand versus speed and replicability on the other hand [Kendall and Joseph, Towards best practices in sociophonetics (with Marianna DiPaolo), 2014]. The amount of “noise” that can be tolerated for a given study will depend on the particular research goals and researchers’ preferences. Nonetheless, our study shows that, for certain large-scale applications and research goals, a completely automated approach using publicly available ASR can produce meaningful sociolinguistic results across large datasets, and these results can be generated quickly, efficiently, and with full replicability.
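
The character error rate cited in this abstract is conventionally defined as the character-level edit distance between a reference transcription and an ASR hypothesis, divided by the length of the reference. The following is a minimal sketch of that standard definition, not the authors' actual evaluation code; the function names `edit_distance` and `character_error_rate` are hypothetical, and the abstract does not say whether text was normalized (case, punctuation) before scoring.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn ref into hyp."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a character of ref
                curr[j - 1] + 1,     # insert a character of hyp
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference transcription must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substituted character out of four reference characters.
    print(character_error_rate("abcd", "abxd"))  # 0.25
```

Under this definition, a drop from 46% to 35% means fewer character edits are needed per reference character to correct the automated transcript. Note that the rate can exceed 1.0 when the hypothesis is much longer than the reference.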
