second formant
Recently Published Documents


TOTAL DOCUMENTS

147
(FIVE YEARS 38)

H-INDEX

20
(FIVE YEARS 1)

2021 ◽  
Vol 11 (1) ◽  
pp. 3-34
Author(s):  
Jan Auracher

This study aimed to test sound-meaning relations in Japanese poetry. To this end, participants assessed the sentiments expressed in a random selection of Tanka (a specific form of Japanese poetry) on six bipolar scales comprising Evaluation (emotional valence), Potency (dominance), and Activity (arousal). The selected Tanka differed with regard to their average formant dispersion (i.e., the distance between the first and second formant). Corroborating the results of a previous study that tested the relation between formant dispersion and emotional tone in German poetry, the results suggest that poems with an extremely low average formant dispersion have a significantly higher likelihood of expressing dominance and activity than poems with an extremely high formant dispersion. No significant differences regarding the Evaluation dimension were found.
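
The dispersion measure described in this abstract (the distance between the first and second formant, averaged over a poem) can be sketched roughly as follows. This is only an illustration: the formant table, the function names, and the choice to average over vowel tokens are assumptions, not the study's actual procedure or data.

```python
# Illustrative (F1, F2) values in Hz. These are placeholders chosen for the
# example and are NOT measurements from the study.
ILLUSTRATIVE_FORMANTS = {
    "a": (800.0, 1300.0),
    "i": (300.0, 2300.0),
    "u": (350.0, 900.0),
    "e": (450.0, 2000.0),
    "o": (500.0, 900.0),
}


def formant_dispersion(f1, f2):
    """Distance between the second and first formant, in Hz."""
    return f2 - f1


def average_dispersion(vowels, table=ILLUSTRATIVE_FORMANTS):
    """Mean F2 - F1 over the vowel tokens of a text (hypothetical helper).

    Symbols that are not in the table are skipped.
    """
    values = [formant_dispersion(*table[v]) for v in vowels if v in table]
    if not values:
        raise ValueError("no known vowels in input")
    return sum(values) / len(values)


# A text dominated by back vowels gets a lower average than one with front vowels.
low = average_dispersion("uo")
high = average_dispersion("ie")
```

With these placeholder values, a sequence of back vowels such as /u/ and /o/ yields a lower average dispersion than a sequence of front vowels such as /i/ and /e/.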


Author(s):  
Katarzyna Pisanski ◽  
Andrey Anikin ◽  
David Reby

Vocal tract elongation, which uniformly lowers vocal tract resonances (formant frequencies) in animal vocalizations, has evolved independently in several vertebrate groups as a means for vocalizers to exaggerate their apparent body size. Here, we propose that smaller speech-like articulatory movements that alter only individual formants can serve a similar yet less energetically costly size-exaggerating function. To test this, we examine whether uneven formant spacing alters the perceived body size of vocalizers in synthesized human vowels and animal calls. Among six synthetic vowel patterns, those characterized by the lowest first and second formant (the vowel /u/ as in ‘boot’) are consistently perceived as produced by the largest vocalizer. Crucially, lowering only one or two formants in animal-like calls also conveys the impression of a larger body size, and lowering the second and third formants simultaneously exaggerates perceived size to a similar extent as rescaling all formants. As the articulatory movements required for individual formant shifts are minor compared to full vocal tract extension, they represent a rapid and energetically efficient mechanism for acoustic size exaggeration. We suggest that, by favouring the evolution of uneven formant patterns in vocal communication, this deceptive strategy may have contributed to the origins of the phonemic diversification required for articulated speech. This article is part of the theme issue ‘Voice modulation: from origin and mechanism to social impact (Part II)’.
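
Why elongation lowers all resonances uniformly can be illustrated with the textbook idealization of the vocal tract as a uniform tube closed at one end, whose resonances are F_n = (2n - 1)c / (4L). The sketch below assumes that idealization, a round speed of sound of 340 m/s, and a hypothetical 17 cm tract; it is not the synthesis method used in the study.

```python
SPEED_OF_SOUND_M_S = 340.0  # assumed round value, for illustration only


def tube_formants(length_m, count=3, c=SPEED_OF_SOUND_M_S):
    """Resonances (Hz) of an idealized uniform tube closed at one end.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, count + 1)]


baseline = tube_formants(0.17)          # hypothetical 17 cm vocal tract
elongated = tube_formants(0.17 * 1.10)  # the same tract, 10% longer

# Every formant is scaled by the same factor, 1 / 1.10.
ratios = [e / b for e, b in zip(elongated, baseline)]
```

Under these assumptions, lengthening the tube rescales every formant by the same factor, whereas the articulatory movements discussed in the abstract shift individual formants only.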


Author(s):  
Hannah P. Rowe ◽  
Kaila L. Stipancic ◽  
Adam C. Lammert ◽  
Jordan R. Green

Purpose: This study investigated the criterion (analytical and clinical) and construct (divergent) validity of a novel, acoustic-based framework composed of five key components of motor control: Coordination, Consistency, Speed, Precision, and Rate. Method: Acoustic and kinematic analyses were performed on audio recordings from 22 subjects with amyotrophic lateral sclerosis during a sequential motion rate task. Perceptual analyses were completed by two licensed speech-language pathologists, who rated each subject's speech on the five framework components and their overall severity. Analytical and clinical validity were assessed by comparing performance on the acoustic features to their kinematic correlates and to clinician ratings of the five components, respectively. Divergent validity of the acoustic-based framework was then assessed by comparing performance on each pair of acoustic features to determine whether the features represent distinct articulatory constructs. Bivariate correlations and partial correlations with severity as a covariate were conducted for each comparison. Results: Results revealed moderate-to-strong analytical validity for every acoustic feature, both with and without controlling for severity, and moderate-to-strong clinical validity for all acoustic features except Coordination, without controlling for severity. When severity was included as a covariate, the strong associations for Speed and Precision became weak. Divergent validity was supported by weak-to-moderate pairwise associations between all acoustic features except Speed (second-formant [F2] slope of consonant transition) and Precision (between-consonant variability in F2 slope). Conclusions: This study demonstrated that the acoustic-based framework has potential as an objective, valid, and clinically useful tool for profiling articulatory deficits in individuals with speech motor disorders. The findings also suggest that compared to clinician ratings, instrumental measures are more sensitive to subtle differences in articulatory function. With further research, this framework could provide more accurate and reliable characterizations of articulatory impairment, which may eventually increase clinical confidence in the diagnosis and treatment of patients with different articulatory phenotypes.
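
The contrast between bivariate correlations and partial correlations with severity as a covariate can be illustrated with the standard first-order partial correlation formula. The sketch below is a minimal illustration: the variable names and the four data points are hypothetical, and the abstract does not state which software or exact procedure the authors used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after controlling for a single covariate z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: both features track severity but are otherwise unrelated.
severity = [1.0, 2.0, 3.0, 4.0]
feature_a = [2.0, 1.0, 2.0, 5.0]
feature_b = [0.5, 3.5, 1.5, 4.5]

bivariate = pearson(feature_a, feature_b)
controlled = partial_corr(feature_a, feature_b, severity)
```

In this constructed example the two features correlate moderately on their own, but the association vanishes once severity is controlled for, which is the kind of pattern the covariate analysis is designed to expose.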


Author(s):  
Jun Ma ◽  
Hongzhi Yu ◽  
Yan Xu ◽  
Kaiying Deng

Following the relevant specifications, this article segments, annotates, and extracts the recorded speech signals of the Salar language and establishes a speech acoustic parameter database for the language. The vowels of Salar are then analyzed using this database. Four vowel charts are obtained: one each for the average values in word-initial, word-medial, and word-final position, and one for the overall average. The charts show that, in every position in the word, the vowels form an obtuse triangle overall. The high vowel [i], [o], and the low vowel [a] occupy the three vertices. Of the three sides, [i] to [o] is the longest, [i] to [a] is the second longest, and [a] to [o] is the shortest. The lines from [a] to [o] and from [a] to [i] are asymmetric. Based on the vowel charts, the vowels were discretized, with the second formant (F2) frequency used as the X-axis coordinate and the first formant (F1) frequency as the Y-axis coordinate, to draw the region occupied by each vowel and thus form the vowel pattern. These studies provide basic data and parameters for future work in modern phonetics, such as a Salar speech database, speech recognition, and speech synthesis. They also provide basic speech acoustic parameters for research on rare minority languages within the national language project.
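
The geometry described in this abstract, with F2 on the X axis and F1 on the Y axis, can be sketched as below. The formant values are illustrative placeholders rather than measurements from the Salar database, and the helper names are hypothetical.

```python
import math

# (F1, F2) in Hz. Illustrative placeholders, NOT values from the Salar database.
VOWELS = {
    "i": (300.0, 2300.0),
    "a": (800.0, 1300.0),
    "o": (500.0, 900.0),
}


def chart_point(vowel):
    """Map a vowel to chart coordinates: X = F2, Y = F1."""
    f1, f2 = VOWELS[vowel]
    return (f2, f1)


def side_length(v1, v2):
    """Euclidean distance in Hz between two vowels on the chart."""
    (x1, y1), (x2, y2) = chart_point(v1), chart_point(v2)
    return math.hypot(x2 - x1, y2 - y1)


def is_obtuse(a, b, c):
    """True if the triangle spanned by three vowels has an obtuse angle."""
    s = sorted([side_length(a, b), side_length(b, c), side_length(a, c)])
    return s[2] ** 2 > s[0] ** 2 + s[1] ** 2


sides = {
    "i-o": side_length("i", "o"),
    "i-a": side_length("i", "a"),
    "a-o": side_length("a", "o"),
}
```

With these placeholder values the side from [i] to [o] is the longest and the side from [a] to [o] the shortest, and the triangle is obtuse; real measurements would of course have to be substituted before drawing any conclusion.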


2021 ◽  
Vol 13 (5) ◽  
pp. 36
Author(s):  
Yanxiao Ma

The study explores the acoustic properties of syllable-initial [ŋ] in Zhengding dialect, to see whether the younger generation shows the same pattern as the senior group. Sixty items with the vowel realizations [ʌ, a, ɑ, ə, ɤ] and [ai, ɑo, ou] in ‘[ŋ]-V’ and ‘[g]-V’ structures were produced by 8 native speakers. Three experiments were conducted. Experiment I compares ‘[ŋ]-V’ and ‘[g]-V’ structures in the speech of senior speakers. Three acoustic effects of the initial [ŋ] are established: vowels become less distinct from each other through a decrease in the first formant (F1), an increase in the second formant (F2), and a narrowing of the gap between F2 and the third formant (F3). Experiment II compares ‘[ŋ]-V’ and ‘[g]-V’ in the younger speakers, investigating whether they show a pattern similar to that of the seniors. Experiment III compares the younger speakers' speech in Zhengding dialect and Mandarin, to explore whether the generational variation in Zhengding dialect is related to dialect contact, i.e., whether the younger speakers are largely influenced by Mandarin. The results show that the younger generation does not produce the initial [ŋ] with the vowel realizations [ʌ, a, ɑ, ə, ai, ɑo, ou], which traditionally have an initial [ŋ]; the exception is [ɤ]. A fusion process is assumed for [ɤ] in the younger pattern, in which the initial nasal [ŋ] and the following vowel [ɤ] are combined into the single nasalized vowel [ɤ̃]: the nasal effects remain, but the initial nasal itself is deleted. From the sociovariationist perspective, the nasal-initial pronunciation is a partial variation in Zhengding dialect: not all speakers pronounce the velar-initial [ŋ]. The older generation has largely retained the velar-initial variant, whereas the younger generation prefers the zero onset, which might be due to the influence of dialect contact with Mandarin.


2021 ◽  
Vol 12 (5) ◽  
pp. 678-687
Author(s):  
Katja Immonen ◽  
Jemina Kilpeläinen ◽  
Paavo Alku ◽  
Maija S. Peltola

Earlier studies have shown that children are efficient second language learners. Research has also shown that musical background might affect second language learning. A two-day auditory training paradigm was used to investigate whether studying in a music-oriented education program affects children’s ability to acquire a non-native vowel contrast. Training effects were measured with listen-and-repeat production tests. Two groups of monolingual Finnish children (9–11 years, N = 23) attending music-oriented and regular fourth grades were tested. The stimuli were two semisynthetic pseudowords, /ty:ti/ and /tʉ:ti/, with the native vowel /y/ and the non-native vowel /ʉ/ embedded. Both groups changed their pronunciation after the first training session. The change was reflected in the second formant values of /ʉ/, which lowered significantly after three training sessions. The results show that 9–11-year-old children benefit from passive auditory training in second language production learning regardless of whether or not they attend a music-oriented education program.


2021 ◽  
Vol 6 ◽  
Author(s):  
Erika Brandt ◽  
Bernd Möbius ◽  
Bistra Andreeva

Phonetic structures expand temporally and spectrally when they are difficult to predict from their context. To some extent, effects of predictability are modulated by prosodic structure. So far, studies on the impact of contextual predictability and prosody on phonetic structures have neglected the dynamic nature of the speech signal. This study investigates the impact of predictability and prominence on the dynamic structure of the first and second formants of German vowels. We expect to find differences in the formant movements between vowels standing in different predictability contexts and a modulation of this effect by prominence. First and second formant values are extracted from a large German corpus. Formant trajectories of peripheral vowels are modeled using generalized additive mixed models, which estimate nonlinear regressions between a dependent variable and predictors. Contextual predictability is measured as biphone and triphone surprisal based on a statistical German language model. We test for the effects of the information-theoretic measures surprisal and word frequency, as well as prominence, on formant movement, while controlling for vowel phonemes and duration. Primary lexical stress and vowel phonemes are significant predictors of first and second formant trajectory shape. We replicate previous findings that vowels are more dispersed in stressed syllables than in unstressed syllables. The interaction of stress and surprisal explains formant movement: unstressed vowels show more variability in their formant trajectory shape at different surprisal levels than stressed vowels. This work shows that effects of contextual predictability on fine phonetic detail can be observed not only in pointwise measures but also in dynamic features of phonetic segments.
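
Biphone surprisal, the predictability measure named in this abstract, is the negative log probability of a phone given the preceding phone. The sketch below estimates it from raw counts in a toy word list in which each character stands for one phone. Both the toy data and the unsmoothed count-based estimate are assumptions made for illustration; the study derives its values from a statistical German language model.

```python
import math
from collections import Counter


def biphone_counts(words):
    """Count (previous, current) phone pairs and how often each context occurs."""
    pair_counts = Counter()
    context_counts = Counter()
    for word in words:
        for prev, cur in zip(word, word[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return pair_counts, context_counts


def biphone_surprisal(words, prev, cur):
    """Surprisal in bits: -log2 P(cur | prev), estimated from raw counts."""
    pair_counts, context_counts = biphone_counts(words)
    if pair_counts[(prev, cur)] == 0:
        # A real language model would smooth instead of failing here.
        raise ValueError("unseen biphone")
    probability = pair_counts[(prev, cur)] / context_counts[prev]
    return -math.log2(probability)


toy_words = ["aba", "abc"]  # hypothetical corpus, one character per phone

predictable = biphone_surprisal(toy_words, "a", "b")    # "b" always follows "a"
unpredictable = biphone_surprisal(toy_words, "b", "a")  # "a" follows "b" half the time
```

In the toy corpus, "b" after "a" is fully predictable and carries no surprisal, while "a" after "b" occurs in half of the cases and carries one bit.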


Author(s):  
Yi-Fang Chiu ◽  
Amy Neel ◽  
Travis Loux

Purpose: Auditory perceptual judgments are commonly used to diagnose dysarthria and assess treatment progress. The purpose of the study was to examine the acoustic underpinnings of perceptual speech abnormalities in individuals with Parkinson's disease (PD). Method: Auditory perceptual judgments were obtained from sentences produced by 13 speakers with PD and five healthy older adults. Twenty young listeners rated overall ease of understanding, articulatory precision, voice quality, and prosodic adequacy on a visual analog scale. Acoustic measures associated with the speech subsystems of articulation, phonation, and prosody were obtained, including second formant transitions, articulation rate, cepstral and spectral measures of voice, and pitch variations. Regression analyses were performed to assess the relationships between perceptual judgments and acoustic variables. Results: Perceptual impressions of Parkinsonian speech were related to combinations of several acoustic variables. Approximately 36%–49% of the variance in the perceptual ratings was explained by the acoustic measures, indicating a modest acoustic-perceptual relationship. Conclusions: The relationships between perceptual ratings and acoustic signals in Parkinsonian speech are multifactorial and involve a variety of acoustic features simultaneously. The modest acoustic-perceptual relationships, however, suggest that future work is needed to further examine the acoustic bases of perceptual judgments in dysarthria.


Languages ◽  
2021 ◽  
Vol 6 (2) ◽  
pp. 61
Author(s):  
Lisa Kornder ◽  
Ineke Mennen

The purpose of this investigation was to trace first (L1) and second language (L2) segmental speech development in the Austrian German–English late bilingual Arnold Schwarzenegger over a period of 40 years, which makes it the first study to examine a bilingual’s speech development over several decades in both of their languages. To this end, acoustic measurements of voice onset time (VOT) durations of word-initial plosives (Study 1) and of the first and second formant frequencies of Austrian German and English monophthongs (Study 2) were conducted using speech samples collected from broadcast interviews. The results of Study 1 showed a merging of Schwarzenegger’s German and English voiceless plosives in his late productions, as manifested in a significant lengthening of VOT duration in his German plosives and a shortening of VOT duration in his English plosives, closer to L1 production norms. Similar findings emerged in Study 2, revealing that some of Schwarzenegger’s L1 and L2 vowel categories had moved closer together in the course of L2 immersion. These findings suggest that a bilingual’s first and second language accents are both likely to develop and reorganize over time due to dynamic interactions between the first and second language systems.

