Vocal tract area functions from two point acoustic measurements with formant frequency constraints

1984 ◽  
Vol 32 (6) ◽  
pp. 1122-1135 ◽  
Author(s):  
P. Milenkovic
1991 ◽  
Vol 34 (5) ◽  
pp. 1057-1065 ◽  
Author(s):  
Ruth Saletsky Kamen ◽  
Ben C. Watson

This study investigated the effects of long-term tracheostomy on the development of speech. Eight children who underwent tracheotomy during the prelingual period were compared to matched controls on selected spectral parameters of the speech acoustic signal and standard measures of oral-motor, phonologic, and articulatory proficiency. Analysis of formant frequency values revealed significant between-group differences. Children with histories of long-term tracheostomy showed reduced acoustic vowel space, as defined by group formant frequency values. This suggests that these children were limited in their ability to produce extreme vocal tract configurations for vowels /a,i,u/ postdecannulation. Oral motor patterns were less mature, and sound substitutions were not only more variable for this group, but also reflected a persistent overlay of maladaptive compensations developed during cannulation.


1992 ◽  
Vol 35 (4) ◽  
pp. 761-768 ◽  
Author(s):  
Petra Zwirner ◽  
Gary J. Barnes

Acoustic analyses of upper airway and phonatory stability were conducted on samples of sustained phonation to evaluate the relation between laryngeal and articulomotor stability for 31 patients with dysarthria and 12 non-dysarthric control subjects. Significantly higher values were found for the variability in fundamental frequency and formant frequency of patients who have Huntington’s disease compared with normal subjects and patients with Parkinson’s disease. No significant correlations were found between formant frequency variability and the variability of the fundamental frequency for any subject group. These findings are discussed as they pertain to the relationship between phonatory and upper airway subsystems and the evaluation of vocal tract motor control impairments in dysarthria.


Author(s):  
Johan Sundberg

The function of the voice organ is basically the same in classical singing as in speech. However, loud orchestral accompaniment has necessitated the use of the voice in an economical way. As a consequence, the vowel sounds tend to deviate considerably from those in speech. Male voices cluster the third, fourth, and fifth formants, so that a marked peak is produced in the spectrum envelope near 3,000 Hz. This helps them to be heard through a loud orchestral accompaniment. They seem to achieve this effect by widening the lower pharynx, which makes the vowels more centralized than in speech. Singers often sing at fundamental frequencies higher than the normal first formant frequency of the vowel in the lyrics. In such cases they raise the first formant frequency so that it gets somewhat higher than the fundamental frequency. This is achieved by reducing the degree of vocal tract constriction or by widening the lip and jaw openings, constricting the vocal tract in the pharyngeal end and widening it in the mouth. These deviations from speech cause difficulties in vowel identification, particularly at high fundamental frequencies. Actually, vowel identification is almost impossible above 700 Hz (pitch F5). Another great difference between vocal sound produced in speech and in the classical singing tradition concerns female voices, which need to reduce the timbral differences between voice registers. Females normally speak in modal or chest register, and the transition to falsetto tends to happen somewhere above 350 Hz. The great timbral differences between these registers are avoided by establishing control over the register function, that is, over the vocal fold vibration characteristics, so that seamless transitions are achieved. In many other respects, there are more or less close similarities between speech and singing. Thus, marking phrase structure, emphasizing important events, and emotional coloring are common principles, which may make vocal artists deviate considerably from the score’s nominal description of fundamental frequency and syllable duration.
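The pairing of 700 Hz with pitch F5 in the abstract above can be checked with a short calculation. The sketch below assumes twelve-tone equal temperament with A4 = 440 Hz and the MIDI numbering in which A4 is note 69; neither assumption is stated in the abstract, and the function name is illustrative.

```python
def note_to_frequency(midi_note: int, a4_hz: float = 440.0) -> float:
    """Equal-temperament frequency in Hz of a MIDI note number (A4 = 69)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


# F5 lies eight semitones above A4, i.e. MIDI note 77 (assumed numbering).
F5_MIDI = 77
f5_hz = note_to_frequency(F5_MIDI)
print(round(f5_hz, 1))  # 698.5
```

Under these assumptions F5 comes out just below 700 Hz, which is consistent with the rounded figure quoted in the abstract.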


2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Mousmita Sarma ◽  
Kandarpa Kumar Sarma

In spoken word recognition, one of the crucial points is to identify the vowel phonemes. This paper describes an Artificial Neural Network (ANN) based algorithm developed for the segmentation and recognition of the vowel phonemes of the Assamese language from words containing those vowels. A Self-Organizing Map (SOM) trained with varying numbers of iterations is used to segment the word into its constituent phonemes. A Probabilistic Neural Network (PNN) trained with clean vowel phonemes is then used to recognize the vowel segment from the six different SOM-segmented phonemes. One important aspect of the proposed algorithm is that it validates the recognized vowel by checking its first formant frequency. The first formant frequency of each Assamese vowel is predetermined by estimating the pole or formant location from the linear prediction (LP) model of the vocal tract. The proposed algorithm shows high recognition performance in comparison with conventional Discrete Wavelet Transform (DWT) based segmentation.
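The abstract above refers to estimating formant locations from the poles of a linear prediction model. A minimal sketch of that general idea follows; it is not the authors' implementation. It assumes the LP coefficients are already available, and `formants_from_lpc` and `lpc_from_resonances` are hypothetical names introduced here. The second function only builds a synthetic test polynomial with known resonances so the example can run without speech data.

```python
import numpy as np


def formants_from_lpc(a, fs):
    """Return (frequencies, bandwidths) in Hz from LP coefficients.

    a  : coefficients of A(z) = 1 + a1*z^-1 + ... + ap*z^-p
    fs : sampling rate in Hz
    """
    roots = np.roots(a)                     # poles of the all-pole model 1/A(z)
    roots = roots[np.imag(roots) > 0]       # keep one pole per conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)   # pole angle -> frequency
    bws = -np.log(np.abs(roots)) * fs / np.pi    # pole radius -> bandwidth
    order = np.argsort(freqs)
    return freqs[order], bws[order]


def lpc_from_resonances(freqs_hz, radius, fs):
    """Hypothetical helper: build A(z) with pole pairs at the given frequencies."""
    theta = 2 * np.pi * np.asarray(freqs_hz, dtype=float) / fs
    poles = radius * np.exp(1j * theta)
    return np.real(np.poly(np.concatenate([poles, poles.conj()])))


fs = 8000  # assumed sampling rate, for illustration only
a = lpc_from_resonances([500.0, 1500.0], radius=0.97, fs=fs)
freqs, bws = formants_from_lpc(a, fs)
f1 = freqs[0]  # lowest resonance, the first-formant candidate
```

On real speech the coefficients would come from an LP analysis of a windowed frame, and candidate poles are usually filtered by bandwidth before being labelled as formants.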


2016 ◽  
Vol 25 (4) ◽  
pp. 481-492 ◽  
Author(s):  
Jia-Shiou Liao

Purpose: This study investigated the acoustic properties of 6 Taiwan Southern Min vowels produced by 10 laryngeal speakers (LA), 10 speakers with a pneumatic artificial larynx (PA), and 8 esophageal speakers (ES).
Method: Each of the 6 monophthongs of Taiwan Southern Min (/i, e, a, ɔ, u, ə/) was represented by a Taiwan Southern Min character and appeared randomly on a list 3 times (6 Taiwan Southern Min characters × 3 repetitions = 18 tokens). Each Taiwan Southern Min character in this study has the same syllable structure, /V/, and all were read with tone 1 (high and level). Acoustic measurements of the 1st formant, 2nd formant, and 3rd formant were taken for each vowel. Then, vowel space areas (VSAs) enclosed by /i, a, u/ were calculated for each group of speakers. The Euclidean distance between vowels in the pairs /i, a/, /i, u/, and /a, u/ was also calculated and compared across the groups.
Results: PA and ES have higher 1st or 2nd formant values than LA for each vowel. The distance is significantly shorter between vowels in the corner vowel pairs /i, a/ and /i, u/. PA and ES have a significantly smaller VSA compared with LA.
Conclusions: In accordance with previous studies, alaryngeal speakers have higher formant frequency values than LA because they have a shortened vocal tract as a result of their total laryngectomy. Furthermore, the resonance frequencies are inversely related to the length of the vocal tract (on the basis of the assumption of the source filter theory). PA and ES have a smaller VSA and shorter distances between corner vowels compared with LA, which may be related to speech intelligibility. This hypothesis needs further support from future study.
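The two measures named in this abstract, the vowel space area enclosed by /i, a, u/ and the Euclidean distance between corner vowels, can be sketched as follows. This assumes each vowel is represented by a single (F1, F2) point in Hz and uses the shoelace formula for the triangle area. The abstract does not say how the study computed either quantity, and the formant values below are hypothetical illustrations, not data from the study.

```python
import math


def vowel_space_area(i, a, u):
    """Area (Hz^2) of the triangle whose vertices are (F1, F2) pairs."""
    (x1, y1), (x2, y2), (x3, y3) = i, a, u
    # Shoelace formula for a triangle.
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


def formant_distance(p, q):
    """Euclidean distance (Hz) between two (F1, F2) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Hypothetical (F1, F2) values in Hz, for illustration only.
vowel_i = (300.0, 2300.0)
vowel_a = (750.0, 1300.0)
vowel_u = (350.0, 800.0)

vsa = vowel_space_area(vowel_i, vowel_a, vowel_u)
d_ia = formant_distance(vowel_i, vowel_a)
d_iu = formant_distance(vowel_i, vowel_u)
d_au = formant_distance(vowel_a, vowel_u)
```

A smaller `vsa` or shorter pairwise distances for one speaker group would correspond to the compressed vowel space the abstract reports for PA and ES speakers.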


1976 ◽  
Vol 60 (S1) ◽  
pp. S77-S77 ◽  
Author(s):  
W. J. Möller ◽  
B. S. Atal ◽  
M. R. Schroeder

Author(s):  
S.K. Adhikari

Regions of the speech spectrum in which the frequencies have relatively large amplitude are known as formants. For any vocalic sound, several formants may occur in the frequency range 0 to 4000 Hz. The formant frequencies of speech sounds depend directly on the shape and size of the vocal tract. The aim of the study was to examine how formant frequencies vary across Nepalese vowels. Ten Nepalese vowels in word-initial position (/VC/), each spoken three times by 10 male and 10 female Nepali speakers, were recorded in the free field of a partially acoustically treated room. PRAAT software was used to digitize and analyze the data. Linear predictive coding (LPC) spectra were obtained for each of the vowels, and formant frequencies were measured. Their variation is explained by plotting formant frequency against vowel.


1998 ◽  
Vol 41 (5) ◽  
pp. 1042-1051 ◽  
Author(s):  
Michael Blomgren ◽  
Michael Robb ◽  
Yang Chen

Inferences were made regarding vocal tract vowel space during fluently produced utterances through examination of the first two formant frequencies. Fifteen adult males served as subjects, representing separate groups of untreated and treated individuals who stutter and nonstuttering controls. The steady-state portion of formant one (F1) and formant two (F2) was examined in the production of various CVC tokens containing the vowels /i/, /u/, and /α/. Vocal tract vowel space was estimated three ways. The first analysis scheme involved measurement of formant frequency spacing. The second measure involved calculating the area of the vowel space triangle. The third measure was based on calculating the average Euclidean distance from each subject's midpoint "centroid" vocal tract position to the corresponding /i/, /u/, and /α/ points on the vowel triangle. The formant frequency spacing measures proved to be most revealing of group differences, with the untreated stutterers showing significantly greater vowel centralization than the treated group and control group. Discussion focuses on the vocal tract articulation characterizing fluent speech productions and possible treatment implications for persons who stutter.
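The third measure described in this abstract, the average Euclidean distance from a speaker's centroid to the /i/, /u/, and /α/ points, can be sketched as below. It assumes the centroid is the arithmetic mean of the three (F1, F2) points, which the abstract does not state explicitly, and the function name is hypothetical.

```python
import math


def mean_centroid_distance(points):
    """Average Euclidean distance from the centroid to each (F1, F2) point."""
    n = len(points)
    cx = sum(p[0] for p in points) / n   # centroid F1
    cy = sum(p[1] for p in points) / n   # centroid F2
    return sum(math.hypot(p[0] - cx, p[1] - cy) for p in points) / n


# Simple geometric check with a centroid at (3, 3); not formant data.
triangle = [(0.0, 0.0), (6.0, 0.0), (3.0, 9.0)]
spread = mean_centroid_distance(triangle)
```

A smaller value indicates corner vowels lying closer to the centre of the vowel triangle, which is the sense of "vowel centralization" used in the abstract.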


1987 ◽  
Vol 62 (1) ◽  
pp. 259-270 ◽  
Author(s):  
R. Leanderson ◽  
J. Sundberg ◽  
C. von Euler

Esophageal and gastric pressures during singing are measured in four male professional singers performing singing tasks requiring rapid changes of subglottal pressure. Evidence for a consistent use of the diaphragm is found in all subjects. Some subjects punctually activate the diaphragm when there is a need for a rapid decrease of subglottal pressure, such as when singing a falling octave interval, when shifting from a loud to a soft note, to save air during a /p/ explosion, and in performing a trillo involving a repeated switching between glottal adduction and abduction. The first three cases were observed in the beginning of the phrase, presumably over the period that the pressure generated by the passive expiratory recoil forces of the breathing system was higher than the intended subglottal pressure. In addition to this, one subject exhibited a diaphragmatic tonus throughout the entire phrase. The phonatory relevance of a diaphragmatic activity was evaluated in a subsequent experiment. The transdiaphragmatic pressure was displayed on an oscilloscope screen as a visual feedback signal for singers and nonsingers, who performed various phonatory tasks with and without voluntary coactivation of the diaphragm. In most subjects this activity tended to increase the glottal closed/open ratio as well as the amplitude of the glottogram (i.e., the transglottal volume velocity wave-form as determined by inverse filtering). These changes suggest that diaphragmatic coactivation tends to affect phonation. Also, it tended to reduce the formant frequency variability under conditions of changing fundamental frequency suggesting a better stabilization of the vocal tract.

