Deep Neural Network Bottleneck Features for Acoustic Event Recognition

Author(s):  
Seongkyu Mun ◽  
Suwon Shon ◽  
Wooil Kim ◽  
Hanseok Ko
2017 ◽  
Vol 77 (1) ◽  
pp. 897-916 ◽  
Author(s):  
Yanxiong Li ◽  
Xue Zhang ◽  
Hai Jin ◽  
Xianku Li ◽  
Qin Wang ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6622
Author(s):  
Barış Bayram ◽  
Gökhan İnce

Acoustic scene analysis (ASA) relies on the dynamic sensing and understanding of stationary and non-stationary sounds from various events, background noises and human actions with objects. However, the spatio-temporal characteristics of the sound signals may be non-stationary, and novel events may occur that gradually degrade the performance of the analysis. In this study, a self-learning-based ASA for acoustic event recognition (AER) is presented to detect and incrementally learn novel acoustic events while tackling catastrophic forgetting. The proposed ASA framework comprises six elements: (1) raw acoustic signal pre-processing, (2) low-level and deep audio feature extraction, (3) acoustic novelty detection (AND), (4) acoustic signal augmentation, (5) incremental class-learning (ICL) of the audio features of the novel events and (6) AER. Self-learning on the different types of audio features extracted from the acoustic signals of various events proceeds without human supervision. For the extraction of deep audio representations, in addition to visual geometry group (VGG) and residual neural network (ResNet) models, time-delay neural network (TDNN) and TDNN-based long short-term memory (TDNN–LSTM) networks are pre-trained on a large-scale audio dataset, Google AudioSet. The performance of ICL with AND is validated using Mel-spectrograms, as well as deep features extracted from the Mel-spectrograms by the TDNNs, VGG, and ResNet, on benchmark audio datasets such as ESC-10, ESC-50, UrbanSound8K (US8K), and an audio dataset collected by the authors in a real domestic environment.
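A minimal sketch of how stages (1)–(3), (5) and (6) of such a pipeline could fit together, using a nearest-class-mean classifier whose per-class centroids grow incrementally (one simple way to sidestep catastrophic forgetting; the paper's own AND and ICL schemes may differ, and all names below are illustrative):

```python
import numpy as np
import librosa  # assumed here for Mel-spectrogram extraction

def extract_logmel(path, sr=16000, n_mels=64):
    """Stages 1-2: load the raw signal and compute a log-Mel spectrogram."""
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel)

class IncrementalNCM:
    """Stages 3, 5, 6: nearest-class-mean classifier over embeddings.
    Novelty = low cosine similarity to every known class centroid;
    learning a class only adds a centroid, so old classes are untouched."""

    def __init__(self, threshold=0.5):
        self.centroids = {}         # class label -> mean embedding
        self.threshold = threshold  # minimum similarity to count as "known"

    def classify(self, emb):
        """Return the best-matching known class, or None if novel."""
        if not self.centroids:
            return None
        sims = {c: np.dot(emb, m) / (np.linalg.norm(emb) * np.linalg.norm(m))
                for c, m in self.centroids.items()}
        best = max(sims, key=sims.get)
        return best if sims[best] >= self.threshold else None

    def learn(self, label, embeddings):
        """Stage 5: register a (possibly novel) class from its embeddings,
        e.g. after stage-4 augmentation has expanded the novel samples."""
        self.centroids[label] = np.mean(embeddings, axis=0)
```

In the paper's setting, `emb` would be a deep feature taken from one of the pre-trained TDNN, VGG, or ResNet backbones rather than the raw spectrogram itself.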


Author(s):  
David T. Wang ◽  
Brady Williamson ◽  
Thomas Eluvathingal ◽  
Bruce Mahoney ◽  
Jennifer Scheler

Author(s):  
P.L. Nikolaev

This article deals with a method for binary classification of images containing small text. The classification is based on the fact that the text can have two orientations: it can be positioned horizontally and read from left to right, or it can be rotated 180 degrees, in which case the image must be rotated before the text can be read. This type of text is often found on book covers, so when recognizing covers it is necessary first to determine the orientation of the text before recognizing it directly. The article proposes a deep neural network for determining the text orientation in the context of book cover recognition. The results of training and testing a convolutional neural network on synthetic data, as well as examples of the network operating on real data, are presented.
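A minimal sketch of such an orientation classifier (not the author's exact network; the architecture and the `make_pair` helper below are assumptions), including the synthetic-data trick of generating the second class by rotating an upright crop 180 degrees:

```python
import torch
import torch.nn as nn

class OrientationNet(nn.Module):
    """Small CNN that outputs logits for [upright, rotated-180]."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)

    def forward(self, x):  # x: (N, 1, H, W) grayscale text crops
        return self.head(self.features(x).flatten(1))

def make_pair(upright):
    """Synthetic training pair: flipping both spatial axes of an upright
    (1, H, W) crop is equivalent to a 180-degree rotation."""
    return upright, torch.flip(upright, dims=[1, 2])
```

At inference time, a cover whose crop is classified as rotated would simply be turned 180 degrees before being passed on to the text recognizer.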


2020 ◽  
Author(s):  
Ala Supriya ◽  
Chiluka Venkat ◽  
Aliketti Deepak ◽  
GV Hari Prasad
