Automatic Environmental Sound Recognition (AESR) Using Convolutional Neural Network

Author(s):  
Md. Rayhan Ahmed ◽  
Towhidul Islam Robin ◽  
Ashfaq Ali Shafin
Author(s):  
Ke Zhang ◽  
Yu Su ◽  
Jingyu Wang ◽  
Sanyu Wang ◽  
Yanhua Zhang

At present, environmental sound recognition systems mainly identify sounds using deep neural networks together with a wide variety of auditory features. It is therefore necessary to analyze which auditory features are most suitable for deep-neural-network-based environmental sound classification and recognition (ESCR) systems. In this paper, we chose three sound features based on two widely used filter banks: the Mel and Gammatone filter banks. Subsequently, the hybrid feature MGCC is presented. Finally, a deep convolutional neural network is proposed to verify which features are more suitable for environmental sound classification and recognition tasks. The experimental results show that the signal-processing features outperform the spectrogram features in the deep-neural-network-based environmental sound recognition system. Among all the acoustic features, the MGCC feature achieves the best performance. The MGCC-CNN model proposed in this paper is then compared with state-of-the-art environmental sound classification models on the UrbanSound8K dataset. The results show that the proposed model achieves the best classification accuracy.
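As an illustration of the hybrid-feature idea described above, the following is a minimal sketch that concatenates Mel-frequency cepstral coefficients with gammatone-filterbank cepstral coefficients into a single feature map for a CNN. It assumes librosa and SciPy are available; the file name, band count, and frame settings are illustrative, and the paper's exact MGCC construction may differ.

```python
# Minimal sketch of a hybrid Mel + Gammatone cepstral feature, assuming librosa
# and SciPy; the exact MGCC construction in the paper may differ.
import numpy as np
import librosa
from scipy.signal import gammatone, lfilter
from scipy.fft import dct

def gfcc(y, sr, n_bands=40, n_coeff=20, frame=1024, hop=512):
    """Gammatone-filterbank cepstral coefficients (illustrative GFCC variant)."""
    # Center frequencies spaced logarithmically between 100 Hz and ~0.9 * Nyquist.
    freqs = np.geomspace(100, sr / 2 * 0.9, n_bands)
    band_energies = []
    for fc in freqs:
        b, a = gammatone(fc, 'iir', fs=sr)           # 4th-order gammatone band filter
        band = lfilter(b, a, y)
        # Frame-wise log energy of the band output.
        frames = librosa.util.frame(band, frame_length=frame, hop_length=hop)
        band_energies.append(np.log(np.mean(frames ** 2, axis=0) + 1e-10))
    E = np.stack(band_energies)                       # (n_bands, n_frames)
    return dct(E, axis=0, norm='ortho')[:n_coeff]     # cepstral compression

y, sr = librosa.load('siren.wav', sr=22050)           # hypothetical audio clip
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20, n_fft=1024, hop_length=512)
gfc = gfcc(y, sr)

# Trim to a common frame count and stack into one hybrid Mel-Gammatone feature map.
n = min(mfcc.shape[1], gfc.shape[1])
mgcc = np.concatenate([mfcc[:, :n], gfc[:, :n]], axis=0)
print(mgcc.shape)                                     # (40, n), input to the CNN
```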


Artificial Neural Networks (ANNs) have evolved through many stages over the last three decades, with many researchers contributing to this challenging field. With the power of mathematics, complex problems can also be solved by ANNs. Architectures such as the Convolutional Neural Network (CNN), Deep Neural Network, Generative Adversarial Network (GAN), Long Short-Term Memory (LSTM) network, Recurrent Neural Network (RNN), and Neural Ordinary Differential Equation network play promising roles in many multinational companies and IT industries because of their predictive power and accuracy. In this paper, a Convolutional Neural Network is used to detect beep sounds at high noise levels. Based on supervised learning, the research develops a CNN architecture for beep sound recognition in noisy situations. The proposed method gives good results, with an accuracy of 96%. The prototype was tested with a few architectures on the training and test data, of which a two-layer CNN classifier produced the best predictions.
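As a rough illustration of the kind of model described above, the following is a minimal sketch of a two-convolutional-layer classifier built with tf.keras. The 64x64 log-spectrogram input patch and the two output classes (beep vs. background) are assumptions, not the paper's reported configuration.

```python
# Sketch of a small two-convolutional-layer CNN classifier, assuming tf.keras.
# Input shape and class count are illustrative, not taken from the paper.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),                   # log-spectrogram patch
    layers.Conv2D(16, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(2, activation='softmax'),             # beep vs. background noise
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=30)
```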


2021 ◽  
Author(s):  
Wenjun Yang

This thesis explores features characterizing temporal dynamics and the use of ensemble techniques to improve the performance of environmental sound recognition (ESR) systems. Firstly, for acoustic scene classification (ASC), the local binary pattern (LBP) technique is applied to extract the temporal evolution of Mel-frequency cepstral coefficient (MFCC) features, and the D3C ensemble classifier is adopted to optimize system performance. The results show that the proposed method achieved a classification improvement of 8% compared to the baseline system. Secondly, a new approach for sound event detection (SED) using Non-negative Matrix Factor 2-D Deconvolution (NMF2D) and RUSBoost is presented. The idea is to capture the joint two-dimensional spectral and temporal information in the time-frequency representation (TFR) while possibly separating the sound mixture into several sources. In addition, the RUSBoost ensemble technique is used in the event detection process to alleviate class imbalance in the training data. This method reduced the total error rate by 5% compared to the baseline method.
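To make the first idea concrete, the following is a minimal sketch that treats an MFCC matrix as a 2-D image and summarizes its local temporal evolution with uniform LBP codes, assuming librosa and scikit-image are available. The P/R parameters and the histogram pooling are illustrative choices, and the D3C ensemble stage is not shown.

```python
# Sketch of extracting LBP statistics from an MFCC time-frequency matrix,
# assuming librosa and scikit-image; parameters are illustrative.
import numpy as np
import librosa
from skimage.feature import local_binary_pattern

y, sr = librosa.load('scene.wav', sr=22050)            # hypothetical ASC recording
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)

# Quantize the MFCC matrix to an 8-bit grayscale image before computing LBP codes.
m = mfcc - mfcc.min()
img = np.uint8(255 * m / (m.max() + 1e-10))

# Encode the local temporal evolution of each coefficient with uniform LBP codes.
P, R = 8, 1
codes = local_binary_pattern(img, P, R, method='uniform')

# Pool the codes into a fixed-length histogram usable by any classifier.
hist, _ = np.histogram(codes, bins=np.arange(P + 3), density=True)
print(hist)                                             # P+2 uniform-LBP bins
```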


Electronics ◽  
2021 ◽  
Vol 10 (21) ◽  
pp. 2622
Author(s):  
Jurgen Vandendriessche ◽  
Nick Wouters ◽  
Bruno da Silva ◽  
Mimoun Lamrini ◽  
Mohamed Yassin Chkouri ◽  
...  

In recent years, Environmental Sound Recognition (ESR) has become a relevant capability for urban monitoring applications. The techniques for automated sound recognition often rely on machine learning approaches, which have increased in complexity in order to achieve higher accuracy. Nonetheless, such machine learning techniques often have to be deployed on resource- and power-constrained embedded devices, which has become a challenge with the adoption of deep learning approaches based on Convolutional Neural Networks (CNNs). Field-Programmable Gate Arrays (FPGAs) are power efficient and highly suitable for computationally intensive algorithms like CNNs. By fully exploiting their parallel nature, they have the potential to accelerate the inference time compared to other embedded devices. Similarly, dedicated architectures to accelerate Artificial Intelligence (AI) such as Tensor Processing Units (TPUs) promise to deliver high accuracy while achieving high performance. In this work, we evaluate existing tool flows to deploy CNN models on FPGAs as well as on TPU platforms. We propose and adjust several CNN-based sound classifiers to be embedded on such hardware accelerators. The results demonstrate the maturity of the existing tools and how FPGAs can be exploited to outperform TPUs.
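As one example of such a tool flow, the following sketch shows full-integer quantization of a trained Keras sound classifier with the TensorFlow Lite converter, a common preparatory step before compiling a model for a Coral Edge TPU. The model file, calibration data, and shapes are placeholders; the actual FPGA and TPU flows evaluated in the work may differ.

```python
# Sketch of a TPU-oriented deployment step: full-integer quantization of a Keras
# classifier via the TensorFlow Lite converter. File names and data are placeholders.
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model('esr_cnn.h5')        # hypothetical trained classifier

def representative_dataset():
    # A few hundred calibration patches drawn from the training features.
    calib = np.load('calib_patches.npy').astype(np.float32)   # hypothetical file
    for patch in calib[:200]:
        yield [patch[np.newaxis, ...]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open('esr_cnn_int8.tflite', 'wb') as f:
    f.write(converter.convert())
# The resulting .tflite model can then be compiled for the Edge TPU, e.g.:
#   edgetpu_compiler esr_cnn_int8.tflite
```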

