Research on Bird Songs Recognition Based on MFCC-HMM

Author(s):  
Xie Shan-shan ◽  
Xu Hai-feng ◽  
Liu Jiang ◽  
Zhang Yan ◽  
Lv Dan-jv
Keyword(s):  
The Auk ◽  
1915 ◽  
Vol 32 (4) ◽  
pp. 535-538 ◽  
Author(s):  
Robert Thomas Moore

2021 ◽  
Author(s):  
Nasim Winchester Vahidi

The mechanisms underlying how single auditory neurons and neuron populations encode natural and acoustically complex vocal signals, such as human speech or bird songs, are not well understood. Classical models focus on individual neurons, whose spike rates vary systematically as a function of change in a small number of simple acoustic dimensions. However, neurons in the caudal medial nidopallium (NCM), an auditory forebrain region in songbirds that is analogous to the secondary auditory cortex in mammals, have composite receptive fields (CRFs) that comprise multiple acoustic features tied to both increases and decreases in firing rates. Here, we investigated the anatomical organization and temporal activation patterns of auditory CRFs in European starlings exposed to natural vocal communication signals (songs). We recorded extracellular electrophysiological responses to various bird songs at auditory NCM sites, including both single and multiple neurons, and we then applied a quadratic model to extract large sets of CRF features that were tied to excitatory and suppressive responses at each measurement site. We found that the superset of CRF features yielded spatially and temporally distributed, generalizable representations of a conspecific song. Individual sites responded to acoustically diverse features, as there was no discernable organization of features across anatomically ordered sites. The CRF features at each site yielded broad, temporally distributed responses that spanned the entire duration of many starling songs, which can last for 50 s or more. Based on these results, we estimated that a nearly complete representation of any conspecific song, regardless of length, can be obtained by evaluating populations as small as 100 neurons. 
We conclude that natural acoustic communication signals drive a distributed yet highly redundant representation across the songbird auditory forebrain, in which adjacent neurons contribute to the encoding of multiple diverse and time-varying spectro-temporal features.


The Auk ◽  
1984 ◽  
Vol 101 (2) ◽  
pp. 307-318 ◽  
Author(s):  
Jonathan Bart ◽  
James D. Schoultz

Field trials in which paired observers were used and indoor simulations in which recorded bird songs were used indicated that, as the number of singing birds audible from a listening station increased from 1 to 4, the fraction of them recorded by observers declined by up to 50%. This reduction in efficiency violates one of the basic assumptions of any index (that the proportion of animals detected remains constant) and could cause surveyors who rely primarily on auditory cues to underestimate changes in population density by up to 25% for common species and 33% for abundant species. The change in efficiency, which is best regarded as measurement error, cannot be detected by a statistical examination of the data and thus may pass undetected in many field studies. It seems unlikely that any general procedure for "correcting" the error would be reliable. The results indicate that singing bird surveys of common species should be supplemented by other methods if accurate estimates of changes in density are needed. A general conclusion of the study is that whenever animals "compete" for a place in the survey, for example by filling up traps or suppressing one another's songs, then the index tends to underestimate a change in density. If efficiency increases with density, then the survey tends to overestimate a change in density. If the sign of the bias can be determined, the survey can be used to provide a minimum or maximum estimate of a change in density even if the magnitude of the bias cannot be estimated.


2018 ◽  
Vol 96 (2) ◽  
pp. 63-78 ◽  
Author(s):  
Danilo Russo ◽  
Leonardo Ancillotto ◽  
Gareth Jones

The recording and analysis of echolocation calls are fundamental methods used to study bat distribution, ecology, and behavior. However, the goal of identifying bats in flight from their echolocation calls is not always possible. Unlike bird songs, bat calls show large variation that often makes identification challenging. The problem has not been fully overcome by modern digital-based hardware and software for bat call recording and analysis. Besides providing fundamental insights into bat physiology, ecology, and behavior, a better understanding of call variation is therefore crucial to best recognize limits and perspectives of call classification. We provide a comprehensive overview of sources of interspecific and intraspecific echolocation call variation, illustrating its adaptive significance and highlighting gaps in knowledge. We remark that further research is needed to better comprehend call variation and control for it more effectively in sound analysis. Despite the state-of-the-art technology in this field, combining acoustic surveys with capture and roost search, as well as limiting identification to species with distinctive calls, still represent the safest way of conducting bat surveys.


2019 ◽  
Vol 42 (1) ◽  
pp. 129-147 ◽  
Author(s):  
Christa A. Baker ◽  
Jan Clemens ◽  
Mala Murthy

Across the animal kingdom, social interactions rely on sound production and perception. From simple cricket chirps to more elaborate bird songs, animals go to great lengths to communicate information critical for reproduction and survival via acoustic signals. Insects produce a wide array of songs to attract a mate, and the intended receivers must differentiate these calls from competing sounds, analyze the quality of the sender from spectrotemporal signal properties, and then determine how to react. Insects use numerically simple nervous systems to analyze and respond to courtship songs, making them ideal model systems for uncovering the neural mechanisms underlying acoustic pattern recognition. We highlight here how the combination of behavioral studies and neural recordings in three groups of insects—crickets, grasshoppers, and fruit flies—reveals common strategies for extracting ethologically relevant information from acoustic patterns and how these findings might translate to other systems.


The Auk ◽  
2018 ◽  
Vol 135 (2) ◽  
pp. 314-325 ◽  
Author(s):  
Karan J. Odom ◽  
Lauryn Benedict

2017 ◽  
Vol 29 (1) ◽  
pp. 213-223 ◽  
Author(s):  
Reiji Suzuki ◽  
Shiho Matsubayashi ◽  
Richard W. Hedley ◽  
Kazuhiro Nakadai ◽  
...  

[Figure: Bird songs recorded and localized by HARKBird]
Understanding auditory scenes is important when deploying intelligent robots and systems in real-world environments. We believe that robot audition can better recognize acoustic events in the field as compared to conventional methods such as human observation or recording using a single-channel microphone array. We are particularly interested in acoustic interactions among songbirds. Birds do not always vocalize at random, for example, but may instead divide a soundscape so that they avoid overlapping their songs with those of other birds. To understand such complex interaction processes, we must collect much spatiotemporal data in which multiple individuals and species are singing simultaneously. However, it is costly and difficult to annotate many or long recorded tracks manually to detect their interactions. In order to solve this problem, we are developing HARKBird, an easily available and portable system consisting of a laptop PC with open-source software for robot audition HARK (Honda Research Institute Japan Audition for Robots with Kyoto University) together with a low-cost and commercially available microphone array. HARKBird enables us to extract the songs of multiple individuals from recordings automatically. In this paper, we introduce the current status of our project and report preliminary results of recording experiments in two different types of forests – one in the USA and the other in Japan – using this system to automatically estimate the direction of arrival of the songs of multiple birds, and separate them from the recordings. We also discuss asymmetries among species in terms of their tendency to partition temporal resources.

