Environmental Sound Classification via Time-Frequency Attention and Framewise Self-Attention based Deep Neural Networks

Author(s):  
Bo Wu ◽  
Xiao-Ping Zhang


Author(s):  
Ke Zhang ◽  
Yu Su ◽  
Jingyu Wang ◽  
Sanyu Wang ◽  
Yanhua Zhang

At present, environmental sound recognition systems mainly identify sounds using deep neural networks together with a wide variety of auditory features. It is therefore necessary to analyze which auditory features are best suited to deep neural network based environmental sound classification and recognition (ESCR) systems. In this paper, we choose three sound features based on two widely used filter banks: the Mel and Gammatone filter banks. Subsequently, the hybrid MGCC feature is presented. A deep convolutional neural network is then proposed to verify which features are more suitable for environmental sound classification and recognition tasks. The experimental results show that the signal-processing features outperform spectrogram features in deep neural network based environmental sound recognition, and that among all the acoustic features, the MGCC feature achieves the best performance. Finally, the proposed MGCC-CNN model is compared with state-of-the-art environmental sound classification models on the UrbanSound8K dataset, and it achieves the best classification accuracy.
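A minimal sketch of how such a hybrid Mel/Gammatone cepstral feature could be assembled is given below. The abstract does not specify filter orders, frame parameters, or the fusion scheme, so the simple stacking of MFCCs and Gammatone cepstral coefficients, the parameter values, and the use of the third-party `gammatone` package are assumptions made for illustration only.

```python
# Sketch of a hybrid Mel/Gammatone cepstral feature (assumed construction, not the
# paper's exact MGCC recipe). Requires librosa, scipy, numpy and the third-party
# `gammatone` package.
import numpy as np
import librosa
from scipy.fftpack import dct
from gammatone.gtgram import gtgram

def hybrid_cepstral_features(path, sr=22050, n_coeffs=40):
    y, sr = librosa.load(path, sr=sr)

    # Mel branch: MFCCs from a log-Mel spectrogram.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_coeffs)  # (n_coeffs, frames)

    # Gammatone branch: log Gammatone spectrogram followed by a DCT,
    # i.e. Gammatone cepstral coefficients.
    gt = gtgram(y, sr, window_time=0.025, hop_time=0.010, channels=64, f_min=50)
    gtcc = dct(np.log(gt + 1e-8), axis=0, norm="ortho")[:n_coeffs]  # (n_coeffs, frames)

    # Hybrid feature: stack the two cepstral representations along the feature axis,
    # truncating to a common number of frames.
    t = min(mfcc.shape[1], gtcc.shape[1])
    return np.vstack([mfcc[:, :t], gtcc[:, :t]])  # (2 * n_coeffs, frames)
```

The resulting two-dimensional feature map would then be fed to the deep convolutional network in place of a plain spectrogram.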


Author(s):  
Jinfang Zeng ◽  
Youming Li ◽  
Yu Zhang ◽  
Da Chen

Environmental sound classification (ESC) is a challenging problem due to the complexity of sounds. To date, a variety of signal processing and machine learning techniques have been applied to the ESC task, including matrix factorization, dictionary learning, wavelet filterbanks and deep neural networks. It is observed that features extracted from deeper networks tend to achieve higher performance than those extracted from shallow networks. However, in the ESC task only deep convolutional neural networks (CNNs) with a few layers have been used, while residual networks have been ignored, which leads to degraded performance. Meanwhile, a possible explanation for the limited exploration of CNNs and the difficulty of improving on simpler models is the relative scarcity of labeled data for ESC. In this paper, a residual network called EnvResNet is proposed for the ESC task. In addition, we propose to use audio data augmentation to overcome the problem of data scarcity. The experiments are performed on the ESC-50 database. Combined with data augmentation, the proposed model outperforms baseline implementations relying on mel-frequency cepstral coefficients and achieves results comparable to other state-of-the-art approaches in terms of classification accuracy.
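The following is a minimal sketch of the kind of residual building block such a network relies on. EnvResNet's actual depth, channel widths, and input representation are not given in the abstract, so the values below (and the single-channel spectrogram input) are illustrative assumptions.

```python
# Sketch of a residual block and a small residual stack for 50-class ESC
# (illustrative only, not the EnvResNet architecture). Requires PyTorch.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 conv layers with batch norm and an identity (or 1x1) skip connection."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # Project the shortcut when the shape changes; otherwise pass it through unchanged.
        if stride == 1 and in_ch == out_ch:
            self.shortcut = nn.Sequential()
        else:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))  # the residual connection

# Example: a small stack classifying 50 ESC-50 classes from a 1-channel spectrogram.
model = nn.Sequential(
    ResidualBlock(1, 32), ResidualBlock(32, 64, stride=2), ResidualBlock(64, 128, stride=2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 50),
)
logits = model(torch.randn(8, 1, 64, 200))  # batch of 8 spectrograms -> (8, 50)
```

The skip connection is what distinguishes a residual block from a plain stack of convolutions: it lets gradients flow around the convolutional layers, which is what makes substantially deeper networks trainable. Waveform-level augmentations such as time stretching, pitch shifting, or added noise would be applied before feature extraction to enlarge the training set.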


2020 ◽  
Vol 10 (11) ◽  
pp. 2764-2767
Author(s):  
Chuanbin Ge ◽  
Di Liu ◽  
Juan Liu ◽  
Bingshuai Liu ◽  
Yi Xin

Arrhythmia is a group of conditions in which the heartbeat is irregular. There are many types of arrhythmia, and some can be life-threatening. The electrocardiogram (ECG) is an effective clinical tool used to diagnose arrhythmia, and automatic recognition of different arrhythmia types in ECG signals has become an important and challenging issue. In this article, we propose an algorithm to detect arrhythmia in 12-lead ECG signals and classify the signals into 9 categories. Two 19-layer deep neural networks combining a convolutional neural network and a gated recurrent unit were proposed to realize this work. The first was trained directly on the raw 12-lead ECG data, while the other was trained on 18-"lead" ECG data, where the six extra leads, containing morphology information in the fractional time-frequency domain, were generated using the fractional Fourier transform (FRFT). Overall detection results were obtained by fusing the outputs of these two networks, and the final classification results show that our proposed algorithm obtained an F1 score of 0.855 on the testing dataset. Furthermore, with our proposed algorithm, a better F1 score of 0.81 was attained using the training dataset provided by the China Physiological Signal Challenge held in 2018.
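A minimal sketch of a CNN + GRU classifier of this kind is shown below. The 19-layer configuration is not described in the abstract, so the layer counts, kernel sizes, and pooling are illustrative assumptions, and the `CnnGru` module is a hypothetical stand-in for the paper's networks.

```python
# Sketch of a CNN + GRU classifier for multi-lead ECG (illustrative only; not the
# paper's 19-layer architecture). Requires PyTorch.
import torch
import torch.nn as nn

class CnnGru(nn.Module):
    def __init__(self, n_leads=12, n_classes=9):
        super().__init__()
        # 1-D convolutions extract local beat morphology from the multi-lead signal.
        self.cnn = nn.Sequential(
            nn.Conv1d(n_leads, 32, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
        )
        # A bidirectional GRU summarizes the sequence of convolutional features over time.
        self.gru = nn.GRU(input_size=64, hidden_size=64, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * 64, n_classes)

    def forward(self, x):              # x: (batch, leads, samples)
        h = self.cnn(x)                # (batch, 64, frames)
        h = h.transpose(1, 2)          # (batch, frames, 64) for the GRU
        _, hn = self.gru(h)            # hn: (2, batch, 64) for a bidirectional GRU
        hn = torch.cat([hn[0], hn[1]], dim=1)
        return self.head(hn)           # class logits, (batch, n_classes)

# A second network with n_leads=18 would take the FRFT-augmented "leads"; the two
# networks' outputs would then be fused (e.g. averaged) for the final decision.
logits = CnnGru()(torch.randn(4, 12, 5000))  # 4 recordings, 12 leads, 5000 samples
```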

