Cross-Linguistic Perceptual Categorization of the Three Corner Vowels: Effects of Listener Language and Talker Age

2020 ◽  
pp. 002383092094324
Author(s):  
Hyunju Chung ◽  
Benjamin Munson ◽  
Jan Edwards

The present study examined the center and size of naïve adult listeners' vowel perceptual space (VPS) in relation to listener language (LL) and talker age (TA). Adult listeners of three different first languages, American English, Greek, and Korean, categorized and rated the goodness of different vowels produced by 2-year-old, 5-year-old, and adult speakers of those languages, as well as by speakers of Cantonese and Japanese. The center (i.e., mean first and second formant frequencies, F1 and F2) and size (i.e., area in the F1/F2 space) of the VPSs categorized as /a/, /i/, or /u/ were calculated for each LL and TA group. All center and size calculations were weighted by the goodness rating of each stimulus. The F1 and F2 values of the vowel category (VC) centers differed significantly by LL and TA. These effects were qualitatively different for the three vowel categories: English listeners had different /a/ and /u/ centers than Greek and Korean listeners. The size of VPSs did not differ significantly by LL, but did differ by TA and VC: Greek and Korean listeners had larger vowel spaces when perceiving vowels produced by 2-year-olds than by 5-year-olds or adults, and English listeners had larger vowel spaces for /a/ than for /i/ or /u/. Findings indicate that listeners' vowel perceptual categories varied with the nature of their native vowel system and were sensitive to TA.
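The two measures described above, a goodness-weighted center and an area in F1/F2 space, can be sketched as follows (a minimal Python illustration; the function names and token values are hypothetical, not the study's actual analysis pipeline, and the area here is taken as the convex hull of the categorized tokens):

```python
def weighted_center(formants, goodness):
    """Goodness-weighted mean (F1, F2) of the tokens assigned to one vowel category.

    formants: list of (F1, F2) pairs in Hz; goodness: matching list of ratings.
    """
    total = sum(goodness)
    f1 = sum(f[0] * w for f, w in zip(formants, goodness)) / total
    f2 = sum(f[1] * w for f, w in zip(formants, goodness)) / total
    return f1, f2

def hull_area(points):
    """Area enclosed by (F1, F2) points: convex hull via Andrew's monotone
    chain, then the shoelace formula on the hull vertices."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    hull = lower[:-1] + upper[:-1]
    n = len(hull)
    return 0.5 * abs(sum(hull[i][0] * hull[(i + 1) % n][1]
                         - hull[(i + 1) % n][0] * hull[i][1]
                         for i in range(n)))
```

For example, two /u/-like tokens at (300, 2300) and (700, 1200) Hz with goodness ratings 1 and 3 give a weighted center pulled toward the better-rated token.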

2018 ◽  
Vol 14 (2) ◽  
pp. 61-80
Author(s):  
Anabela Rato

This study reports the results of a perceptual assimilation task (PAT) used to assess the degree of perceived cross-language (dis)similarity between the vowel inventories of European Portuguese (L1) and American English (L2) and, thus, predict difficulty in the perception and production of non-native vowels. Thirty-four native European Portuguese speakers completed a PAT, in which they mapped both L2 English and L1 Portuguese vowels to native vowel categories and rated them for goodness-of-fit to L1 vowels. The results are discussed in terms of theoretical models of cross-language perception and L2 speech learning (SLM, Flege, 1995, & PAM-L2, Best & Tyler, 2007).


2017 ◽  
Vol 6 (1) ◽  
pp. 71
Author(s):  
Rudha Widagsa ◽  
Ahmad Agung Yuwono Putro

Indonesian is the most widely spoken language in Indonesia. More than 200 million people speak it as a first language. However, acoustic study of the vowel productions of Indonesian learners of English (ILE) remains scarce. The purpose of this study was to examine the influence of the first language (L1) on the production of English vowels as a second language (L2). Based on the perceptual magnet hypothesis (PMH), ILE were predicted to produce sounds close to their L1 where the English vowels are similar to Indonesian vowels. Acoustic analysis was conducted to measure the formant frequencies. This study involved five male Indonesian speakers aged between 20 and 25 years. The data for British English native speakers were taken from a previous study by Hawkins & Midgley (2005). The results illustrate that the first formant frequencies (F1), which correlate with vowel height, of Indonesian learners of English were significantly different from the corresponding frequencies of British English vowels. Surprisingly, significant differences in the second formant (F2) of ILE appeared only in the production of /ɑ, ɒ, ɔ/ (/ɑ/: p = 0.002, /ɒ/: p = 0.001, /ɔ/: p = 0.03). The vowel space area of ILE was slightly less spacious than that of the native speakers. This study is expected to shed light on English language teaching, particularly English as a foreign language.
Keywords: VSA, EFL, Indonesian learners, formant frequencies, acoustic
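A vowel space area (VSA) of the kind compared above is conventionally computed with the shoelace formula over the mean (F1, F2) values of the corner vowels. A minimal sketch (the function name and the example coordinates are illustrative assumptions, not values from the study):

```python
def vowel_space_area(corners):
    """Area (in Hz^2) of the polygon spanned by corner-vowel (F1, F2) means,
    via the shoelace formula. `corners` must be listed in order around the
    perimeter of the vowel space (e.g. /i/ -> /a/ -> /u/)."""
    n = len(corners)
    s = sum(corners[i][0] * corners[(i + 1) % n][1]
            - corners[(i + 1) % n][0] * corners[i][1]
            for i in range(n))
    return abs(s) / 2.0
```

Because the formula works for any simple polygon, the same function handles a three-corner /i a u/ triangle or a four-corner quadrilateral; a smaller returned value corresponds to the "less spacious" vowel space reported for the learners.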


1998 ◽  
Vol 21 (2) ◽  
pp. 275-275 ◽  
Author(s):  
Dominic W. Massaro

Sussman et al. describe an ecological property of the speech signal that is putatively functional in perception. An important issue, however, is whether their putative cue is an emerging feature or whether the second formant (F2) onset and the F2 vowel actually provide independent cues to perceptual categorization. Regardless of the outcome of this issue, an important goal of speech research is to understand how multiple cues are evaluated and integrated to achieve categorization.


1983 ◽  
Vol 50 (1) ◽  
pp. 27-45 ◽  
Author(s):  
M. B. Sachs ◽  
H. F. Voigt ◽  
E. D. Young

Responses of auditory nerve fibers to steady-state vowels presented alone and in the presence of background noise were obtained from anesthetized cats. Representation of vowels based on average discharge rate and representation based primarily on phase-locked properties of responses are considered. Profiles of average discharge rate versus characteristic frequency (CF) ("rate-place" representation) can show peaks of discharge rate in the vicinity of formant frequencies when vowels are presented alone. These profiles change drastically in the presence of background noise, however. At moderate vowel and noise levels and signal/noise ratios of +9 dB, there are no rate peaks near the second and third formant frequencies. In fact, because of two-tone suppression, the rate response to vowels plus noise is less than the response to noise alone for fibers with CFs above the first formant. Rate profiles measured over 5-ms intervals near stimulus onset show clear formant-related peaks at higher sound levels than do profiles measured over intervals later in the stimulus (i.e., in the steady state). However, in background noise, rate profiles at onset are similar to those in the steady state. Specifically, for fibers with CFs above the first formant, response rates to the noise are suppressed by the addition of the vowel at both vowel onset and steady state. When rate profiles are plotted for low spontaneous rate fibers, formant-related peaks appear at stimulus levels higher than those at which peaks disappear for high spontaneous fibers. In the presence of background noise, however, the low spontaneous fibers do not preserve formant peaks better than do the high spontaneous fibers. In fact, the suppression of noise-evoked rate mentioned above is greater for the low spontaneous fibers than for the high. Representations that reflect phase-locked properties as well as discharge rate ("temporal-place" representations) are much less affected by background noise.
We have used synchronized discharge rate averaged over fibers with CFs near (+/- 0.25 octave) a stimulus component as a measure of the population temporal response to that component. Plots of this average localized synchronized rate (ALSR) versus frequency show clear first and second formant peaks at all vowel and noise levels used. Except at the highest level (vowel at 85 dB sound pressure level (SPL), signal/noise = +9 dB), there is also a clear third formant peak. At signal-to-noise ratios where there are no second formant peaks in rate profiles, human observers are able to discriminate second formant shifts of less than 112 Hz. ALSR plots show clear second formant peaks at these signal/noise ratios.
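The ALSR measure defined above, synchronized rate to a stimulus component averaged over fibers whose CF lies within ±0.25 octave of that component, can be sketched as follows (a minimal illustration; the data layout and function name are assumptions, not the authors' code):

```python
def alsr(component_freq, fibers, half_width_oct=0.25):
    """Average localized synchronized rate for one stimulus component.

    component_freq: frequency (Hz) of the stimulus component.
    fibers: list of (cf, sync_rate) pairs, where cf is the fiber's
            characteristic frequency (Hz) and sync_rate its synchronized
            discharge rate (spikes/s) to this component.
    Only fibers with CF within +/- half_width_oct octaves of the
    component contribute to the average.
    """
    lo = component_freq * 2 ** (-half_width_oct)
    hi = component_freq * 2 ** half_width_oct
    rates = [rate for cf, rate in fibers if lo <= cf <= hi]
    return sum(rates) / len(rates) if rates else 0.0
```

Plotting this value against component frequency gives the ALSR profile described in the abstract; a fiber at 600 Hz CF, for instance, falls just outside the ±0.25-octave window around a 500 Hz component (which spans roughly 420-595 Hz) and is excluded.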


2019 ◽  
Vol 4 (4) ◽  
pp. 719-732 ◽  
Author(s):  
Steven Sandoval ◽  
Rene L. Utianski ◽  
Heike Lehnert-LeHouillier

Purpose: The use and study of formant frequencies for the description of vowels is commonplace in acoustic phonetics and in attempts to understand results of speech perception studies. Numerous studies have shown that listeners are better able to distinguish vowels when the acoustic parameters are based on spectral information extracted at multiple time points during the duration of the vowel, rather than at a single point in time. The purpose of this study was to validate an automated method for extracting formant trajectories, using information across the time course of production, and subsequently characterize the formant trajectories of vowels using a large, diverse corpus of speech samples. Method: Using software tools, we automatically extract the first two formant frequencies (F1/F2) at 10 equally spaced points over a vowel's duration. Then, we compute the average trajectory for each vowel token. The 1,600 vowel observations in the Hillenbrand database and the more than 50,000 vowel observations in the TIMIT database are analyzed. Results: First, we validate the automated method by comparing against the manually obtained values in the Hillenbrand database. Analyses reveal a strong correlation between the automated and manual formant estimates. Then, we use the automated method on the 630 speakers in the TIMIT database to compute average formant trajectories. We noted that phonemes that have close F1 and F2 values at the temporal midpoint often exhibit formant trajectories progressing in different directions, hence highlighting the importance of formant trajectory progression. Conclusions: The results of this study support the importance of formant trajectories over single-point measurements for the successful discrimination of vowels. Furthermore, this study provides a baseline for the formant trajectories for men and women across a broad range of dialects of Standard American English.
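The core operations described in the Method, sampling a formant track at 10 equally spaced time points and averaging trajectories across tokens, can be sketched as follows (a minimal Python illustration under assumed data structures; tracks are plain lists of formant values in Hz, not the study's actual extraction tooling):

```python
def sample_track(track, n=10):
    """Linearly interpolate a formant track at n equally spaced time points
    spanning the vowel's duration (first through last frame)."""
    m = len(track)
    out = []
    for k in range(n):
        pos = k * (m - 1) / (n - 1)   # fractional frame index
        i = int(pos)
        frac = pos - i
        if i + 1 < m:
            out.append(track[i] * (1 - frac) + track[i + 1] * frac)
        else:
            out.append(track[i])       # last point: no frame to the right
    return out

def mean_trajectory(tracks, n=10):
    """Point-wise mean trajectory over the tokens of one vowel category."""
    sampled = [sample_track(t, n) for t in tracks]
    return [sum(col) / len(col) for col in zip(*sampled)]
```

Two vowels whose trajectories cross near the midpoint would yield similar values from a single midpoint sample but clearly different 10-point mean trajectories, which is the distinction the Results section highlights.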


2017 ◽  
Vol 3 (1) ◽  
pp. 9-33 ◽  
Author(s):  
Jane F. Hacking ◽  
Bruce L. Smith ◽  
Eric M. Johnson

Previous research has shown that English-speaking learners of Russian, even those with advanced proficiency, often have not acquired the contrast between palatalized and unpalatalized consonants, which is a central feature of the Russian consonant system. The present study examined whether training utilizing electropalatography (EPG) could help a group of Russian learners achieve more native-like productions of this contrast. Although not all subjects showed significant improvements, on average, the Russian learners showed an increase from pre- to post-training in the second formant frequency of vowels preceding palatalized consonants, thus enhancing their contrast between palatalized and unpalatalized consonants. To determine whether these acoustic differences were associated with increased identification accuracy, three native Russian speakers listened to all pre- and post-training productions. A modest increase in identification accuracy was observed. These results suggest that even short-term EPG training can be an effective intervention with adult L2 learners.


1983 ◽  
Vol 48 (2) ◽  
pp. 210-215 ◽  
Author(s):  
Paul R. Hoffman ◽  
Sheila Stager ◽  
Raymond G. Daniloff

Twelve children who consistently misarticulated the consonant [r] and five children who correctly articulated [r] were recorded while repeating sentences that differed only in a single /r/–/w/ contrast. All /r/ and /w/ productions were spectrographically analyzed. Error productions were judged for their similarity to [w]. Each child identified all of the recorded sentences via a picture-pointing task. Misarticulated [r] was identified as /w/ at above-chance levels only by the children who did not misarticulate [r]. The subject groups did not differ in their perception of correctly articulated /r/ and /w/ phones. Children whose misarticulated [r] phones were judged to be /w/-like were most likely to misperceive their own productions of /r/. Children whose misarticulated [r] productions were characterized by higher second formant frequencies were better able to identify their productions of /r/. Results suggest that a subpopulation of children who misarticulate [r] may mark it acoustically in a nonstandard manner.
