Rethinking environmental sound classification using convolutional neural networks: optimized parameter tuning of single feature extraction

Research in sound classification and recognition is rapidly advancing in the field of pattern recognition. One important area in this field is environmental sound recognition, whether it concerns the identification of endangered species in different habitats or the type of interfering noise in urban environments. Since environmental audio datasets are often limited in size, a robust model able to perform well across different datasets is of strong research interest. In this paper, ensembles of classifiers are combined that exploit six data augmentation techniques and four signal representations for retraining five pre-trained convolutional neural networks (CNNs); these ensembles are tested on three freely available environmental audio benchmark datasets: (i) bird calls, (ii) cat sounds, and (iii) the Environmental Sound Classification (ESC-50) database for identifying sources of noise in environments. To the best of our knowledge, this is the most extensive study investigating ensembles of CNNs for audio classification. The best-performing ensembles are compared and shown to either outperform or perform comparably to the best methods reported in the literature on these datasets, including on the challenging ESC-50 dataset. We obtained 97% accuracy on the bird dataset, 90.51% on the cat dataset, and 88.65% on ESC-50 using different approaches. In addition, the same ensemble model trained on the three datasets reached the same results on the bird and cat datasets while losing only 0.1% on ESC-50. Thus, we have created an off-the-shelf ensemble that can be trained on different datasets and reach performance competitive with the state of the art.
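
The abstract above does not say how the individual CNN outputs are combined into an ensemble decision. As a minimal sketch, assuming a plain average of class probabilities (often called the sum rule), fusion could look like the following. The function names and the example scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_by_average(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors produced by several classifiers.

    Assumption: every classifier scores the same classes in the same order.
    """
    if not prob_sets:
        raise ValueError("need at least one classifier output")
    n_classes = len(prob_sets[0])
    if any(len(p) != n_classes for p in prob_sets):
        raise ValueError("all classifiers must score the same classes")
    n_models = len(prob_sets)
    return [sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)]


def predict(prob_sets: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    fused = fuse_by_average(prob_sets)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical softmax outputs of three retrained networks for one audio clip.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
]
fused = fuse_by_average(outputs)
label = predict(outputs)
```

Here the first network alone would pick class 0, but the averaged scores favor class 1, which illustrates how an ensemble can overrule a single member.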

Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification

IEEE Signal Processing Letters, 2017, Vol 24 (3), pp. 279-283. DOI: 10.1109/lsp.2017.2657381. Cited by ~370.

Author(s): Justin Salamon, Juan Pablo Bello

Keyword(s): Neural Networks, Convolutional Neural Networks, Data Augmentation, Deep Convolutional Neural Networks, Environmental Sound, Sound Classification

Convolutional Neural Networks for Scops Owl Sound Classification

Procedia Computer Science, 2021, Vol 179, pp. 81-87. DOI: 10.1016/j.procs.2020.12.010.

Author(s): Alam Ahmad Hidayat, Tjeng Wawan Cenggoro, Bens Pardamean

Keyword(s): Neural Networks, Convolutional Neural Networks, Sound Classification

Image Classification for the Automatic Feature Extraction in Human Worn Fashion Data

Mathematics, 2021, Vol 9 (6), pp. 624. DOI: 10.3390/math9060624.

Author(s): Stefan Rohrmanstorfer, Mikhail Komarov, Felix Mödritscher

Keyword(s): Neural Networks, Feature Extraction, Image Classification, Convolutional Neural Networks, Data Augmentation, State Of The Art, Image Data, Classification Model, Upper Body, Automatic Feature Extraction

With the ever-increasing amount of image data, it has become necessary to automatically search for and process information in these images. As fashion is captured in images, the fashion sector provides a natural foundation for a service or application built on an image classification model. In this article, the state of the art for image classification is analyzed and discussed. Based on this analysis, four different approaches are implemented to extract features from fashion data. For this purpose, a human-worn fashion dataset with 2567 images was created and then significantly enlarged by the image operations performed. The results show that convolutional neural networks are the undisputed standard for classifying images, and that TensorFlow is the best library to build them. Moreover, through the introduction of dropout layers, data augmentation and transfer learning, model overfitting was successfully prevented, and it was possible to incrementally improve the validation accuracy on the created dataset from an initial 69% to a final 84%. More distinct apparel such as trousers, shoes and hats was classified better than upper-body clothing.

Writer adaptive feature extraction based on convolutional neural networks for online handwritten Chinese character recognition

2015 13th International Conference on Document Analysis and Recognition (ICDAR), 2015. DOI: 10.1109/icdar.2015.7333880. Cited by ~8.

Author(s): Jun Du, Jian-Fang Zhai, Jin-Shui Hu, Bo Zhu, Si Wei, ...

Keyword(s): Neural Networks, Feature Extraction, Convolutional Neural Networks, Character Recognition, Chinese Character, Chinese Character Recognition, Handwritten Chinese Character Recognition, Adaptive Feature Extraction

Feature Extraction and Segmentation Processing of Images Based on Convolutional Neural Networks

Optical Memory and Neural Networks, 2021, Vol 30 (1), pp. 67-73. DOI: 10.3103/s1060992x21010069.

Author(s): Shuping Nan

Keyword(s): Neural Networks, Feature Extraction, Convolutional Neural Networks
