Distributed acoustic cues for caller identity in macaque vocalization

2015 ◽  
Vol 2 (12) ◽  
pp. 150432 ◽  
Author(s):  
Makoto Fukushima ◽  
Alex M. Doyle ◽  
Matthew P. Mullarkey ◽  
Mortimer Mishkin ◽  
Bruno B. Averbeck

Individual primates can be identified by the sound of their voice. Macaques have demonstrated an ability to discern conspecific identity from a harmonically structured ‘coo’ call. Voice recognition presumably requires the integrated perception of multiple acoustic features. However, it is unclear how this is achieved, given considerable variability across utterances. Specifically, the extent to which information about caller identity is distributed across multiple features remains elusive. We examined these issues by recording and analysing a large sample of calls from eight macaques. Single acoustic features, including fundamental frequency, duration and Wiener entropy, were informative but unreliable for the statistical classification of caller identity. A combination of multiple features, however, allowed for highly accurate caller identification. A regularized classifier that learned to identify callers from the modulation power spectrum of calls found that specific regions of spectral–temporal modulation were informative for caller identification. These ranges are related to acoustic features such as the call’s fundamental frequency and FM sweep direction. We further found that the low-frequency spectrotemporal modulation component contained an indexical cue of the caller’s body size. Thus, cues for caller identity are distributed across identifiable spectrotemporal components corresponding to laryngeal and supralaryngeal components of vocalizations, and the integration of those cues can enable highly reliable caller identification. Our results demonstrate a clear acoustic basis by which individual macaque vocalizations can be recognized.
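The core finding above — single features are unreliable but their combination identifies callers accurately — can be illustrated with a small sketch. The data are synthetic (the macaque recordings are not available here), and a ridge-regularized linear classifier stands in for the paper's regularized classifier: each hypothetical "caller" overlaps with the others on any one feature (fundamental frequency, duration, Wiener entropy) but is separable when the features are combined.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per, n_call = 200, 3
means = np.eye(n_call)          # hypothetical per-caller feature means
X = np.vstack([rng.normal(means[c], 0.4, size=(n_per, n_call))
               for c in range(n_call)])
y = np.repeat(np.arange(n_call), n_per)
Y = np.eye(n_call)[y]           # one-hot caller labels

def ridge_accuracy(feats, lam=1.0):
    """Fit a ridge-regularized linear classifier; return its accuracy."""
    A = np.hstack([feats, np.ones((len(feats), 1))])   # bias column
    W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)
    return np.mean((A @ W).argmax(axis=1) == y)

single = [ridge_accuracy(X[:, [j]]) for j in range(n_call)]
combined = ridge_accuracy(X)
print("single-feature accuracies:", [round(a, 2) for a in single])
print(f"combined accuracy: {combined:.2f}")
```

With this construction each single feature classifies callers only moderately well, while the combined feature set is nearly perfect — the distributed-cue pattern the abstract describes.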

Author(s):  
Youssef Elfahm ◽  
Nesrine Abajaddi ◽  
Badia Mounir ◽  
Laila Elmaazouzi ◽  
Ilham Mounir ◽  
...  

Many technology systems use voice recognition applications to transcribe a speaker’s speech into text for further processing. One of the most complex tasks in speech identification is determining which acoustic cues can be used to classify sounds. This study presents an approach for characterizing Arabic fricative consonants in two groups (sibilant and non-sibilant). From an acoustic point of view, our approach is based on analysing the distribution of energy across frequency bands in a syllable of consonant–vowel type. From a practical point of view, our technique was implemented in MATLAB and tested on a corpus built in our laboratory. The results show that the percentage energy distribution in a speech signal is a very powerful parameter for classifying Arabic fricatives. We obtained an accuracy of 92% for the non-sibilant consonants /f, χ, ɣ, ʕ, ћ, h/, 84% for the sibilants /s, sҁ, z, Ӡ, ∫/, and an overall classification rate of 89%. In comparison with other algorithms based on neural networks and support vector machines (SVM), our classification system provided a higher classification rate.
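The band-energy idea can be sketched as follows, using synthetic signals rather than the authors' Arabic corpus: a "sibilant-like" signal concentrates its energy in the high-frequency bands, a "non-sibilant-like" one in the lower bands, and the percentage of energy per band separates the two. The band edges and cutoff frequencies below are illustrative, not taken from the paper.

```python
import numpy as np

fs = 16000
rng = np.random.default_rng(1)

def bandlimited_noise(lo_hz, hi_hz, n=4096):
    """White noise filtered to [lo_hz, hi_hz] in the frequency domain."""
    spec = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, 1 / fs)
    spec[(freqs < lo_hz) | (freqs > hi_hz)] = 0
    return np.fft.irfft(spec, n)

def band_energy_percent(x, edges=(0, 2000, 4000, 8000)):
    """Percentage of signal energy falling in each frequency band."""
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    e = np.array([spec[(freqs >= lo) & (freqs < hi)].sum()
                  for lo, hi in zip(edges[:-1], edges[1:])])
    return 100 * e / e.sum()

sibilant = bandlimited_noise(4000, 7500)      # hypothetical /s/-like frication
non_sibilant = bandlimited_noise(500, 3000)   # hypothetical /h/-like frication
print(band_energy_percent(sibilant))
print(band_energy_percent(non_sibilant))
```

A simple threshold on the top band's energy percentage would then assign a token to the sibilant or non-sibilant group.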


2016 ◽  
Vol 41 (2) ◽  
pp. 233-243 ◽  
Author(s):  
Magdalena Igras ◽  
Bartosz Ziółko

Abstract In this article the authors present experiments on annotating sentence boundaries in Polish speech using acoustic cues as a source of information. The main result of the investigation is an algorithm for detecting the syntactic boundaries that appear at the positions of punctuation marks. In the first stage, the algorithm detects pauses and divides the speech signal into segments. In the second stage, it examines the configuration of acoustic features and hypothesizes the positions of punctuation marks. Classification is performed with parameters describing phone duration and energy, speaking rate, fundamental frequency contours and frequency bands. The best results were achieved with a Naive Bayes classifier: the algorithm attains 52% precision and 98% recall. Another significant outcome of the research is a set of statistical models of the acoustic cues correlated with punctuation in spoken Polish.
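The first stage described above — detecting pauses to segment the signal — can be sketched with a minimal frame-energy detector on a synthetic signal; the feature set and Naive Bayes stage of the actual algorithm are not reproduced here, and the frame length and threshold are illustrative.

```python
import numpy as np

fs = 16000
rng = np.random.default_rng(2)
# Hypothetical utterance: 0.5 s of speech-like noise, 0.3 s pause, 0.5 s more.
speech = rng.standard_normal(int(0.5 * fs)) * 0.5
pause = rng.standard_normal(int(0.3 * fs)) * 0.01
signal = np.concatenate([speech, pause, speech])

def detect_pauses(x, frame_len=400, threshold=0.01):
    """Return (start_frame, end_frame) pairs of low-energy runs."""
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    low = energy < threshold
    pauses, start = [], None
    for i, is_low in enumerate(low):
        if is_low and start is None:
            start = i
        elif not is_low and start is not None:
            pauses.append((start, i))
            start = None
    if start is not None:
        pauses.append((start, len(low)))
    return pauses

pauses = detect_pauses(signal)
print(pauses)   # one low-energy run covering the inserted 0.3 s pause
```

Each detected pause would then become a candidate punctuation position for the second-stage classifier to confirm or reject.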


1998 ◽  
Vol 2 ◽  
pp. 115-122
Author(s):  
Donatas Švitra ◽  
Jolanta Janutėnienė

In the practice of machining metals by cutting, it is necessary to overcome vibration of the cutting tool, the workpiece and the machine-tool units. In many cases these vibrations are an obstacle to increasing the productivity and quality of machining on metal-cutting machine tools. Vibration during metal cutting is a very diverse phenomenon, in both its nature and the form of the oscillatory motion. The most general classification divides cutting vibrations into forced vibrations and self-excited vibrations (autovibrations). The most difficult to remove, and the most poorly investigated, are the autovibrations, i.e. vibrations arising in the absence of external periodic forces. Autovibrations caused by the cutting process on metal-cutting machines are of two types: low-frequency and high-frequency. When low-frequency autovibrations appear, the cutting process should be stopped and the cause of the vibration eliminated; otherwise there is a danger of breaking both the machine and the tool. In the case of high-frequency autovibrations the machine apparently operates quietly, but the machined surface exhibits small-scale roughness. The frequency of autovibrations can reach 5000 Hz or more.


Author(s):  
Vinayaravi R ◽  
Jayaraj Kochupillai ◽  
Kumaresan D ◽  
Asraff A. K

Abstract The objective of this paper is to investigate how higher damping is achieved through energy dissipation as high-frequency vibration due to the addition of an impact mass. In an impact damper system, collision between the primary and impact masses causes an exchange of momentum, resulting in dissipation of energy. A numerical model is developed to study the dynamic behaviour of an impact damper system using an MDOF system with an augmented Lagrangian multiplier contact algorithm. Mathematical modelling and numerical simulations are carried out using the ANSYS FEA package. Studies are carried out for various mass ratios, subjecting the system to low-frequency, high-amplitude excitation. Time responses obtained from numerical simulations at the fundamental mode, with the system excited in the vicinity of its fundamental frequency, are validated by comparison with experimental results. The magnification factor evaluated from the numerical simulations is comparable with that obtained from the experimental data. The transient response obtained from the numerical simulations is used to study the behaviour of the first three modes of the system when excited in the vicinity of its fundamental frequency. It is inferred that, beyond transformation into heat, sound and/or the energy required to deform a body, the dissipation of energy through impacts is the main reason an impact damper system achieves higher damping.
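The dissipation mechanism named above — momentum exchange in an inelastic collision — can be checked with a minimal sketch. Momentum is conserved exactly while kinetic energy drops whenever the coefficient of restitution is below 1; the masses, velocities and restitution value are illustrative, not taken from the paper's ANSYS model.

```python
def collide(m1, v1, m2, v2, e):
    """Post-impact velocities for a 1-D collision with restitution e in [0, 1]."""
    v1_new = (m1 * v1 + m2 * v2 + m2 * e * (v2 - v1)) / (m1 + m2)
    v2_new = (m1 * v1 + m2 * v2 + m1 * e * (v1 - v2)) / (m1 + m2)
    return v1_new, v2_new

def kinetic_energy(m1, v1, m2, v2):
    return 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2

m1, m2, e = 1.0, 0.1, 0.5          # primary mass, impact mass, restitution
v1, v2 = 1.0, -0.5                 # velocities just before impact (m/s)
v1n, v2n = collide(m1, v1, m2, v2, e)

p_before = m1 * v1 + m2 * v2
p_after = m1 * v1n + m2 * v2n
ke_before = kinetic_energy(m1, v1, m2, v2)
ke_after = kinetic_energy(m1, v1n, m2, v2n)
print(f"momentum:       {p_before:.4f} -> {p_after:.4f}")   # conserved
print(f"kinetic energy: {ke_before:.4f} -> {ke_after:.4f}")  # dissipated
```

Repeated over many impacts per excitation cycle, this per-collision energy loss is what removes vibration energy from the primary structure.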


2021 ◽  
Vol 263 (5) ◽  
pp. 1488-1496
Author(s):  
Yunqi Chen ◽  
Chuang Shi ◽  
Hao Mu

Earphones are commonly equipped with miniature loudspeaker units, which cannot deliver sufficient power at low frequencies. Meanwhile, there is often only one loudspeaker unit on each side of the earphone, so multi-channel spatial audio processing cannot be applied. The combined use of virtual bass (VB) and head-related transfer functions (HRTFs) is therefore necessary for an immersive listening experience with earphones. However, the combined effect of the VB and HRTFs has not been comprehensively reported. The VB is based on the missing fundamental effect, whereby the presence of harmonics can be perceived as their fundamental frequency even if the fundamental frequency itself is not present. HRTFs describe the transmission of sound propagating from a source to the human ears. Monaural audio processed by a pair of HRTFs can be perceived by the listener as a sound source located in the direction associated with those HRTFs. This paper carries out subjective listening tests, whose results reveal that the harmonics required by the VB should be generated in the same direction as the high-frequency sound. The bass quality is rarely distorted by the presence of HRTFs, but the localization accuracy is occasionally degraded by the VB.
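The missing fundamental effect underlying the VB can be sketched as follows: a 50 Hz tone is replaced by its harmonics (100, 150, 200 Hz), which a miniature driver can reproduce, and the spectrum confirms that no energy remains at 50 Hz even though the waveform still repeats with a 50 Hz period. The frequencies, harmonic count and 1/k amplitude roll-off are illustrative assumptions, not the paper's synthesis method.

```python
import numpy as np

fs, f0, dur = 8000, 50, 1.0
t = np.arange(int(fs * dur)) / fs
# Harmonics 2..4 of the (absent) 50 Hz fundamental, with 1/k amplitudes.
harmonics = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in (2, 3, 4))

spec = np.abs(np.fft.rfft(harmonics))
freqs = np.fft.rfftfreq(len(harmonics), 1 / fs)

def magnitude_at(f_hz):
    """Spectral magnitude at the bin nearest f_hz (1 Hz resolution here)."""
    return spec[np.argmin(np.abs(freqs - f_hz))]

print(f"magnitude at  50 Hz: {magnitude_at(50):.1f}")   # essentially zero
print(f"magnitude at 100 Hz: {magnitude_at(100):.1f}")  # strong
```

A listener nevertheless tends to hear a 50 Hz pitch, which is why the VB can evoke bass that the driver never physically emits.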


Author(s):  
Syed Akhter Hossain ◽  
M. Lutfar Rahman ◽  
Faruk Ahmed ◽  
M. Abdus Sobhan

The aim of this chapter is to clearly understand the salient features of Bangla vowels and the sources of acoustic variability in Bangla vowels, and to suggest a classification of vowels based on normalized acoustic parameters. Possible applications in automatic speech recognition and speech enhancement have made the classification of vowels an important problem to study. However, Bangla vowels spoken by different native speakers show great variation in their formant values. This further complicates the acoustic comparison of vowels, owing to the different dialect and language backgrounds of the speakers. This variation necessitates normalization procedures to remove the effect of non-linguistic factors. Although several researchers have found a number of acoustic and perceptual correlates of vowels, acoustic parameters that work well in a speaker-independent manner are yet to be found. A further problem area for study is the acoustic features of Bangla dental consonants: identifying the spectral differences between consonants and parameterizing them for synthesis of the segments. The extracted features for both Bangla vowels and dental consonants are tested and yield good synthetic representations, demonstrating the quality of the acoustic features.
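The chapter argues for normalizing formant values to remove speaker-specific variation. One standard procedure — used here as a hedged example, since the chapter does not commit to a specific method — is Lobanov normalization: z-scoring each speaker's formants so that vowels from speakers with different vocal-tract lengths become directly comparable. The formant values below are hypothetical.

```python
import numpy as np

# Hypothetical F1/F2 values (Hz) for two vowels from two speakers; speaker B's
# formants are uniformly scaled up, as for a longer vocal tract.
speaker_a = np.array([[300.0, 2300.0],   # /i/-like vowel
                      [700.0, 1100.0]])  # /a/-like vowel
speaker_b = speaker_a * 1.2

def lobanov(formants):
    """Z-score each formant column within one speaker (Lobanov normalization)."""
    return (formants - formants.mean(axis=0)) / formants.std(axis=0)

norm_a, norm_b = lobanov(speaker_a), lobanov(speaker_b)
print(norm_a)
print(norm_b)   # identical after normalization, despite the raw-Hz differences
```

Because z-scoring is invariant to a uniform scaling of a speaker's formant space, the two speakers' vowels coincide after normalization even though their raw values differ by 20%.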

