Formant-Based Recognition of Words and Other Naturalistic Sounds in Rhesus Monkeys

2021 ◽  
Vol 15 ◽  
Author(s):  
Jonathan Melchor ◽  
José Vergara ◽  
Tonatiuh Figueroa ◽  
Isaac Morán ◽  
Luis Lemus

In social animals, identifying sounds is critical for communication. In humans, the acoustic parameters involved in speech recognition, such as the formant frequencies derived from the resonance of the supralaryngeal vocal tract, have been well documented. However, how formants contribute to recognizing learned sounds in non-human primates remains unclear. To address this question, we trained two rhesus monkeys to discriminate target and non-target sounds presented in sequences of 1–3 sounds. After training, we performed three experiments testing: (1) the monkeys’ accuracy and reaction times during the discrimination of various acoustic categories; (2) their ability to discriminate morphing sounds; and (3) their ability to identify sounds consisting of formant 1 (F1), formant 2 (F2), or F1 and F2 (F1F2) pass filters. Our results indicate that macaques can learn diverse sounds and discriminate from morphs and formants F1 and F2, suggesting that information from few acoustic parameters suffices for recognizing complex sounds. We anticipate that future neurophysiological experiments in this paradigm may help elucidate how formants contribute to the recognition of sounds.

2005 ◽  
Vol 272 (1566) ◽  
pp. 941-947 ◽  
Author(s):  
David Reby ◽  
Karen McComb ◽  
Bruno Cargnelutti ◽  
Chris Darwin ◽  
W. Tecumseh Fitch ◽  
...  

While vocal tract resonances or formants are key acoustic parameters that define differences between phonemes in human speech, little is known about their function in animal communication. Here, we used playback experiments to present red deer stags with re-synthesized vocalizations in which formant frequencies were systematically altered to simulate callers of different body sizes. In response to stimuli where lower formants indicated callers with longer vocal tracts, stags were more attentive, replied with more roars and extended their vocal tracts further in these replies. Our results indicate that mammals other than humans use formants in vital vocal exchanges and can adjust their own formant frequencies in relation to those that they hear.
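The link between vocal tract length and formant frequencies that this abstract relies on can be illustrated with the textbook uniform-tube approximation, in which a tube closed at one end (the glottis) and open at the other (the lips) resonates at F_n = (2n - 1)c / (4L). The sketch below is only an illustration under that assumption; it is not the resynthesis procedure used in the study, and the function names, the 350 m/s speed of sound, and the example lengths are all hypothetical choices.

```python
# Uniform-tube (quarter-wavelength) approximation of formant frequencies.
# Assumption: the vocal tract is a straight tube of constant cross-section,
# closed at the glottis and open at the lips. Real vocal tracts deviate from this.

SPEED_OF_SOUND_M_S = 350.0  # assumed value for warm, humid air


def tube_formants(vocal_tract_length_m, n_formants=3,
                  speed_of_sound=SPEED_OF_SOUND_M_S):
    """Return the first n_formants resonances (Hz) of a uniform tube."""
    if vocal_tract_length_m <= 0:
        raise ValueError("vocal tract length must be positive")
    return [
        (2 * n - 1) * speed_of_sound / (4 * vocal_tract_length_m)
        for n in range(1, n_formants + 1)
    ]


def formant_dispersion(formants):
    """Average spacing (Hz) between adjacent formants."""
    if len(formants) < 2:
        raise ValueError("need at least two formants")
    return (formants[-1] - formants[0]) / (len(formants) - 1)


if __name__ == "__main__":
    # Hypothetical tract lengths: a shorter and a longer vocal tract.
    for length_m in (0.175, 0.70):
        formants = tube_formants(length_m)
        print(length_m, [round(f) for f in formants],
              round(formant_dispersion(formants)))
```

Under this approximation, lengthening the tube lowers every formant and narrows the spacing between them, which is why lower, more closely spaced formants can signal a longer vocal tract.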


2014 ◽  
Vol 57 (1) ◽  
pp. 285-296 ◽  
Author(s):  
Verena G. Skuk ◽  
Stefan R. Schweinberger

Purpose: To determine the relative importance of acoustic parameters (fundamental frequency [F0], formant frequencies [FFs], aperiodicity, and spectrum level [SL]) on voice gender perception, the authors used a novel parameter-morphing approach that, unlike spectral envelope shifting, allows the application of nonuniform scale factors to transform formants and more direct comparison of parameter impact. Method: In each of 2 experiments, 16 listeners with normal hearing (8 female, 8 male) classified voice gender for morphs between female and male speakers, using syllable tokens from 2 male–female speaker pairs. Morphs varied single acoustic parameters (Experiment 1) or selected combinations (Experiment 2), keeping residual parameters androgynous, as determined in a baseline experiment. Results: The strongest cue related to gender perception was F0, followed by FF and SL. Aperiodicity did not systematically influence gender perception. Morphing F0 and FF in conjunction produced convincing changes in perceived gender, changes that were equivalent to those for Full morphs interpolating all parameters. Despite the importance of F0, morphing FF and SL in combination produced effective changes in voice gender perception. Conclusions: The most important single parameters for gender perception are, in order, F0, FF, and SL. At the same time, F0 and vocal tract resonances have a comparable impact on voice gender perception.
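The idea of morphing one parameter while holding the others at an androgynous midpoint can be sketched as per-parameter interpolation between two voices. This is a deliberately simplified illustration, not the authors' morphing software: it assumes each parameter is a single number and interpolates linearly, and the parameter names and values below are hypothetical.

```python
# Per-parameter morphing sketch: each parameter gets its own weight between
# voice A (weight 0.0) and voice B (weight 1.0). Parameters without an explicit
# weight fall back to default_weight (0.5, i.e. the midpoint between voices).
# Assumption: linear interpolation on scalar values; real morphing operates on
# time-varying signals and may interpolate on other scales.


def morph_parameters(voice_a, voice_b, weights, default_weight=0.5):
    """Return a dict of morphed parameter values."""
    if voice_a.keys() != voice_b.keys():
        raise ValueError("both voices must define the same parameters")
    morphed = {}
    for name in voice_a:
        w = weights.get(name, default_weight)
        morphed[name] = (1.0 - w) * voice_a[name] + w * voice_b[name]
    return morphed


if __name__ == "__main__":
    # Hypothetical example values, not data from the study.
    female = {"f0_hz": 220.0, "f1_hz": 850.0}
    male = {"f0_hz": 120.0, "f1_hz": 730.0}
    # Morph only F0 fully toward the male voice; F1 stays at the midpoint.
    print(morph_parameters(female, male, {"f0_hz": 1.0}))
```

Setting a single weight to 0.0 or 1.0 while leaving the rest at 0.5 mirrors the single-parameter condition described above; setting several weights at once mirrors the combined condition.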


2012 ◽  
Author(s):  
Hiroaki Hatano ◽  
Tatsuya Kitamura ◽  
Hironori Takemoto ◽  
Parham Mokhtari ◽  
Kiyoshi Honda ◽  
...  

1987 ◽  
Vol 30 (3) ◽  
pp. 301-305 ◽  
Author(s):  
Robert A. Prosek ◽  
Allen A. Montgomery ◽  
Brian E. Walden ◽  
David B. Hawkins

The formant frequencies of 15 adult stutterers' fluent and disfluent vowels and the formant frequencies of stutterers' and nonstutterers' fluent vowels were compared in an F1-F2 vowel space and in a normalized F1-F2 vowel space. The results indicated that differences in formant frequencies observed between the stutterers' and nonstutterers' vowels can be accounted for by differences among the vocal tract dimensions of the talkers. In addition, no differences were found between the formant frequencies of the fluent and disfluent vowels produced by the stutterers. The overall pattern of these results indicates that, contrary to recent reports (Klich & May, 1982), stutterers do not exhibit significantly greater vowel centralization than nonstutterers.
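The abstract does not say which normalization was applied to the F1-F2 vowel space. As one common possibility, the sketch below uses Lobanov-style z-scoring, which rescales each talker's formants by that talker's own mean and standard deviation so that differences attributable to overall vocal tract size are reduced. The function name and the example vowel values are hypothetical.

```python
# Lobanov-style normalization of an F1-F2 vowel space (an assumed technique,
# not necessarily the one used in the study). Each formant is z-scored within
# a single talker, so a uniform scaling of that talker's formants cancels out.

from statistics import mean, stdev


def lobanov_normalize(vowels):
    """vowels: list of (F1, F2) pairs in Hz from ONE talker.

    Returns a list of (z_F1, z_F2) pairs. Requires at least two tokens
    with some spread in both formants.
    """
    f1_values = [f1 for f1, _ in vowels]
    f2_values = [f2 for _, f2 in vowels]
    f1_mean, f1_sd = mean(f1_values), stdev(f1_values)
    f2_mean, f2_sd = mean(f2_values), stdev(f2_values)
    return [
        ((f1 - f1_mean) / f1_sd, (f2 - f2_mean) / f2_sd)
        for f1, f2 in vowels
    ]


if __name__ == "__main__":
    # Hypothetical talker, plus a second talker whose formants are all 20%
    # higher (as a shorter vocal tract would roughly produce).
    talker_a = [(300.0, 2300.0), (700.0, 1200.0), (350.0, 800.0)]
    talker_b = [(1.2 * f1, 1.2 * f2) for f1, f2 in talker_a]
    print(lobanov_normalize(talker_a))
    print(lobanov_normalize(talker_b))
```

Because the two hypothetical talkers differ only by a uniform scale factor, their normalized vowel spaces coincide, which illustrates how normalization can separate vocal tract size effects from differences such as vowel centralization.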


Animals ◽  
2018 ◽  
Vol 8 (10) ◽  
pp. 167 ◽  
Author(s):  
Anton Baotic ◽  
Maxime Garcia ◽  
Markus Boeckle ◽  
Angela Stoeger

African savanna elephants live in dynamic fission–fusion societies and exhibit a sophisticated vocal communication system. Their most frequent call-type is the ‘rumble’, with a fundamental frequency (which refers to the lowest vocal fold vibration rate when producing a vocalization) near or in the infrasonic range. Rumbles are used in a wide variety of behavioral contexts, for short- and long-distance communication, and convey contextual and physical information. For example, maturity (age and size) is encoded in male rumbles by formant frequencies (the resonance frequencies of the vocal tract), which have the most informative power. As sound propagates, however, its spectral and temporal structures degrade progressively. Our study used manipulated and resynthesized male social rumbles to simulate large and small individuals (based on different formant values) to quantify whether this phenotypic information efficiently transmits over long distances. To examine transmission efficiency and the potential influences of ecological factors, we broadcast and re-recorded rumbles at distances of up to 1.5 km in two different habitats at the Addo Elephant National Park, South Africa. Our results show that rumbles were affected by spectral–temporal degradation over distance. Interestingly and unlike previous findings, the transmission of formants was better than that of the fundamental frequency. Our findings demonstrate the importance of formant frequencies for the efficiency of rumble propagation and the transmission of information content in a savanna elephant’s natural habitat.


2011 ◽  
Vol 129 (6) ◽  
pp. 3955-3963 ◽  
Author(s):  
Matthias Echternach ◽  
Johan Sundberg ◽  
Tobias Baumann ◽  
Michael Markl ◽  
Bernhard Richter

2003 ◽  
Vol 46 (3) ◽  
pp. 689-701 ◽  
Author(s):  
Steve An Xue ◽  
Grace Jianping Hao

This investigation used a derivation of acoustic reflection (AR) technology to make cross-sectional measurements of changes due to aging in the oral and pharyngeal lumina of male and female speakers. The purpose of the study was to establish preliminary normative data for such changes and to obtain acoustic measurements of changes due to aging in the formant frequencies of selected spoken vowels and their long-term average spectra (LTAS) analysis. Thirty-eight young men and women and 38 elderly men and women were involved in the study. The oral and pharyngeal lumina of the participants were measured with AR technology, and their formant frequencies were analyzed using the Kay Elemetrics Computerized Speech Lab. The findings have delineated specific and similar patterns of aging changes in human vocal tract configurations in speakers of both genders. Namely, the oral cavity length and volume of elderly speakers increased significantly compared to their young cohorts. The total vocal tract volume of elderly speakers also showed a significant increment, whereas the total vocal tract length of elderly speakers did not differ significantly from their young cohorts. Elderly speakers of both genders also showed similar patterns of acoustic changes of speech production, that is, consistent lowering of formant frequencies (especially F1) across selected vowel productions. Although new research models are still needed to succinctly account for the speech acoustic changes of the elderly, especially for their specific patterns of human vocal tract dimensional changes, this study has innovatively applied the noninvasive and cost-effective AR technology to monitor age-related human oral and pharyngeal lumina changes that have direct consequences for speech production.

