Environmental Sound Classification Method Based on Two-Stream Lightweight Convolutional Neural Network

Abstract Neural networks have achieved success in the task of environmental sound classification. However, the traditional neural network model has too many parameters and high computational cost. The lightweight networks solve these problems by compressing parameters, but reduce the classification accuracy. To solve the problems in existing research, we propose a two-stream model based on two lightweight convolutional neural networks, called TSLCNN-DS, which saves memory and improves the classification performance of environmental sounds. Specifically, we first used data patching and data balancing to slightly expand the amount of experimental data. Then we designed two lightweight and efficient classification networks based on the attention mechanism and residual learning. Finally, the Dempster-Shafer evidence theory is used to fuse the output of the two networks, and the two-stream model is integrated. Experiments have shown that the model has achieved a classification accuracy of 97.44% on the UrbanSound8k dataset, using only 0.12 M parameters.

Download Full-text

An Ensemble Stacked Convolutional Neural Network Model for Environmental Event Sound Recognition

Applied Sciences ◽

10.3390/app8071152 ◽

2018 ◽

Vol 8 (7) ◽

pp. 1152 ◽

Cited By ~ 21

Author(s):

Shaobo Li ◽

Yong Yao ◽

Jie Hu ◽

Guokai Liu ◽

Xuemei Yao ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Neural Network Model ◽

Evidence Theory ◽

Sound Recognition ◽

Fusion Method ◽

Environmental Sound ◽

Model Based ◽

Environmental Event ◽

Public Datasets

Convolutional neural networks (CNNs) with log-mel audio representation and CNN-based end-to-end learning have both been used for environmental event sound recognition (ESC). However, log-mel features can be complemented by features learned from the raw audio waveform with an effective fusion method. In this paper, we first propose a novel stacked CNN model with multiple convolutional layers of decreasing filter sizes to improve the performance of CNN models with either log-mel feature input or raw waveform input. These two models are then combined using the Dempster–Shafer (DS) evidence theory to build the ensemble DS-CNN model for ESC. Our experiments over three public datasets showed that our method could achieve much higher performance in environmental sound recognition than other CNN models with the same types of input features. This is achieved by exploiting the complementarity of the model based on log-mel feature input and the model based on learning features directly from raw waveforms.

Download Full-text

Hyperspectral Image Classification Based on a Shuffled Group Convolutional Neural Network with Transfer Learning

Remote Sensing ◽

10.3390/rs12111780 ◽

2020 ◽

Vol 12 (11) ◽

pp. 1780 ◽

Cited By ~ 2

Author(s):

Yao Liu ◽

Lianru Gao ◽

Chenchao Xiao ◽

Ying Qu ◽

Ke Zheng ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Transfer Learning ◽

Classification Accuracy ◽

Hyperspectral Image ◽

Classification Performance ◽

Training Dataset ◽

Hyperspectral Image Classification ◽

Different Levels

Convolutional neural networks (CNNs) have been widely applied in hyperspectral imagery (HSI) classification. However, their classification performance might be limited by the scarcity of labeled data to be used for training and validation. In this paper, we propose a novel lightweight shuffled group convolutional neural network (abbreviated as SG-CNN) to achieve efficient training with a limited training dataset in HSI classification. SG-CNN consists of SG conv units that employ conventional and atrous convolution in different groups, followed by channel shuffle operation and shortcut connection. In this way, SG-CNNs have less trainable parameters, whilst they can still be accurately and efficiently trained with fewer labeled samples. Transfer learning between different HSI datasets is also applied on the SG-CNN to further improve the classification accuracy. To evaluate the effectiveness of SG-CNNs for HSI classification, experiments have been conducted on three public HSI datasets pretrained on HSIs from different sensors. SG-CNNs with different levels of complexity were tested, and their classification results were compared with fine-tuned ShuffleNet2, ResNeXt, and their original counterparts. The experimental results demonstrate that SG-CNNs can achieve competitive classification performance when the amount of labeled data for training is poor, as well as efficiently providing satisfying classification results.

Download Full-text

Environment Sound Classification System Based on Hybrid Feature and Convolutional Neural Network

Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University ◽

10.1051/jnwpu/20203810162 ◽

2020 ◽

Vol 38 (1) ◽

pp. 162-169

Author(s):

Ke Zhang ◽

Yu Su ◽

Jingyu Wang ◽

Sanyu Wang ◽

Yanhua Zhang

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Deep Neural Networks ◽

Recognition System ◽

Sound Recognition ◽

Environmental Sound ◽

Sound Classification ◽

Proposed Model ◽

Auditory Features

At present, the environment sound recognition system mainly identifies environment sounds with deep neural networks and a wide variety of auditory features. Therefore, it is necessary to analyze which auditory features are more suitable for deep neural networks based ESCR systems. In this paper, we chose three sound features which based on two widely used filters:the Mel and Gammatone filter banks. Subsequently, the hybrid feature MGCC is presented. Finally, a deep convolutional neural network is proposed to verify which features are more suitable for environment sound classification and recognition tasks. The experimental results show that the signal processing features are better than the spectrogram features in the deep neural network based environmental sound recognition system. Among all the acoustic features, the MGCC feature achieves the best performance than other features. Finally, the MGCC-CNN model proposed in this paper is compared with the state-of-the-art environmental sound classification models on the UrbanSound 8K dataset. The results show that the proposed model has the best classification accuracy.

Download Full-text

Acoustic traits of bat-pollinated flowers compared to flowers of other pollination syndromes and their echo-based classification using convolutional neural networks

PLoS Computational Biology ◽

10.1371/journal.pcbi.1009706 ◽

2021 ◽

Vol 17 (12) ◽

pp. e1009706

Author(s):

Ralph Simon ◽

Karol Bakunowski ◽

Angel Eduardo Reyes-Vasques ◽

Marco Tschapka ◽

Mirjam Knörnschild ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Plant Species ◽

Classification Accuracy ◽

Classification Performance ◽

Target Strength ◽

South East Asia ◽

Pollination Syndromes ◽

Good Classification

Bat-pollinated flowers have to attract their pollinators in absence of light and therefore some species developed specialized echoic floral parts. These parts are usually concave shaped and act like acoustic retroreflectors making the flowers acoustically conspicuous to the bats. Acoustic plant specializations only have been described for two bat-pollinated species in the Neotropics and one other bat-dependent plant in South East Asia. However, it remains unclear whether other bat-pollinated plant species also show acoustic adaptations. Moreover, acoustic traits have never been compared between bat-pollinated flowers and flowers belonging to other pollination syndromes. To investigate acoustic traits of bat-pollinated flowers we recorded a dataset of 32320 flower echoes, collected from 168 individual flowers belonging to 12 different species. 6 of these species were pollinated by bats and 6 species were pollinated by insects or hummingbirds. We analyzed the spectral target strength of the flowers and trained a convolutional neural network (CNN) on the spectrograms of the flower echoes. We found that bat-pollinated flowers have a significantly higher echo target strength, independent of their size, and differ in their morphology, specifically in the lower variance of their morphological features. We found that a good classification accuracy by our CNN (up to 84%) can be achieved with only one echo/spectrogram to classify the 12 different plant species, both bat-pollinated and otherwise, with bat-pollinated flowers being easier to classify. The higher classification performance of bat-pollinated flowers can be explained by the lower variance of their morphology.

Download Full-text

High Accurate Environmental Sound Classification: Sub-Spectrogram Segmentation versus Temporal-Frequency Attention Mechanism

Sensors ◽

10.3390/s21165500 ◽

2021 ◽

Vol 21 (16) ◽

pp. 5500

Author(s):

Tianhao Qiao ◽

Shunqing Zhang ◽

Shan Cao ◽

Shugong Xu

Keyword(s):

Classification Accuracy ◽

Temporal Frequency ◽

Classification Performance ◽

Optimal Number ◽

Attention Mechanism ◽

Feature Representation ◽

Environmental Sound ◽

The Public ◽

Sound Classification ◽

Classification Framework

In the important and challenging field of environmental sound classification (ESC), a crucial and even decisive factor is the feature representation ability, which can directly affect the accuracy of classification. Therefore, the classification performance often depends to a large extent on whether the effective representative features can be extracted from the environmental sound. In this paper, we firstly propose a sub-spectrogram segmentation with score level fusion based ESC classification framework, and we adopt the proposed convolutional recurrent neural network (CRNN) for improving the classification accuracy. By evaluating numerous truncation schemes, we numerically figure out the optimal number of sub-spectrograms and the corresponding band ranges, and, on this basis, we propose a joint attention mechanism with temporal and frequency attention mechanisms and use the global attention mechanism when generating the attention map. Finally, the numerical results show that the two frameworks we proposed can achieve 82.1% and 86.4% classification accuracy on the public environmental sound dataset ESC-50, respectively, which is equivalent to more than 13.5% improvement over the traditional baseline scheme.

Download Full-text

Improving Sentiment Analysis using Hybrid Deep Learning Model

Recent Advances in Computer Science and Communications ◽

10.2174/2213275912666190328200012 ◽

2020 ◽

Vol 13 (4) ◽

pp. 627-640 ◽

Cited By ~ 1

Author(s):

Avinash Chandra Pandey ◽

Dharmveer Singh Rajpoot

Keyword(s):

Neural Network ◽

Deep Learning ◽

Sentiment Analysis ◽

Classification Accuracy ◽

Short Term Memory ◽

Computational Cost ◽

Extraction Process ◽

Learning Model ◽

Sentiment Classification ◽

Deep Learning Model

Background: Sentiment analysis is a contextual mining of text which determines viewpoint of users with respect to some sentimental topics commonly present at social networking websites. Twitter is one of the social sites where people express their opinion about any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinion of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. The manual feature extraction process is a complicated task since it requires predefined sentiment lexicons. On the other hand, deep learning methods automatically extract relevant features from data hence; they provide better performance and richer representation competency than the traditional methods. Objective: The main aim of this paper is to enhance the sentiment classification accuracy and to reduce the computational cost. Method: To achieve the objective, a hybrid deep learning model, based on convolution neural network and bi-directional long-short term memory neural network has been introduced. Results: The proposed sentiment classification method achieves the highest accuracy for the most of the datasets. Further, from the statistical analysis efficacy of the proposed method has been validated. Conclusion: Sentiment classification accuracy can be improved by creating veracious hybrid models. Moreover, performance can also be enhanced by tuning the hyper parameters of deep leaning models.

Download Full-text

An efficient pruning scheme of deep neural networks for Internet of Things applications

EURASIP Journal on Advances in Signal Processing ◽

10.1186/s13634-021-00744-4 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Chen Qi ◽

Shibo Shen ◽

Rongpeng Li ◽

Zhifeng Zhao ◽

Qing Liu ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Internet Of Things ◽

Deep Neural Networks ◽

Computational Cost ◽

Superior Performance ◽

Compact Structure ◽

Resource Limited ◽

Benchmark Datasets ◽

Iot Devices

AbstractNowadays, deep neural networks (DNNs) have been rapidly deployed to realize a number of functionalities like sensing, imaging, classification, recognition, etc. However, the computational-intensive requirement of DNNs makes it difficult to be applicable for resource-limited Internet of Things (IoT) devices. In this paper, we propose a novel pruning-based paradigm that aims to reduce the computational cost of DNNs, by uncovering a more compact structure and learning the effective weights therein, on the basis of not compromising the expressive capability of DNNs. In particular, our algorithm can achieve efficient end-to-end training that transfers a redundant neural network to a compact one with a specifically targeted compression rate directly. We comprehensively evaluate our approach on various representative benchmark datasets and compared with typical advanced convolutional neural network (CNN) architectures. The experimental results verify the superior performance and robust effectiveness of our scheme. For example, when pruning VGG on CIFAR-10, our proposed scheme is able to significantly reduce its FLOPs (floating-point operations) and number of parameters with a proportion of 76.2% and 94.1%, respectively, while still maintaining a satisfactory accuracy. To sum up, our scheme could facilitate the integration of DNNs into the common machine-learning-based IoT framework and establish distributed training of neural networks in both cloud and edge.

Download Full-text

Deep Convolutional Neural Network with Transfer Learning for Environmental Sound Classification

2021 International Conference on Computer, Control and Robotics (ICCCR) ◽

10.1109/icccr49711.2021.9349393 ◽

2021 ◽

Author(s):

Jianrui Lu ◽

Ruofei Ma ◽

Gongliang Liu ◽

Zhiliang Qin

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Transfer Learning ◽

Deep Convolutional Neural Network ◽

Environmental Sound ◽

Sound Classification

Download Full-text

Rethinking environmental sound classification using convolutional neural networks: optimized parameter tuning of single feature extraction

Neural Computing and Applications ◽

10.1007/s00521-021-06091-7 ◽

2021 ◽

Author(s):

Yousef Abd Al-Hattab ◽

Hasan Firdaus Zaki ◽

Amir Akramin Shafie

Keyword(s):

Neural Networks ◽

Feature Extraction ◽

Convolutional Neural Networks ◽

Parameter Tuning ◽

Environmental Sound ◽

Sound Classification ◽

Single Feature

Download Full-text

Optimization Artificial Neural Network Using Artificial Bee Colony in Letter Recognition Classification

JELIKU (Jurnal Elektronik Ilmu Komputer Udayana) ◽

10.24843/jlk.2020.v08.i04.p13 ◽

2020 ◽

Vol 8 (4) ◽

pp. 469

Author(s):

I Gusti Ngurah Alit Indrawan ◽

I Made Widiartha

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Networks ◽

Classification Accuracy ◽

Artificial Bee Colony Algorithm ◽

Artificial Bee Colony ◽

Letter Recognition ◽

Bee Colony ◽

Artificial Neural ◽

Hidden Layer

Artificial Neural Networks or commonly abbreviated as ANN is one branch of science from the field of artificial intelligence which is often used to solve various problems in fields that involve grouping and pattern recognition. This research aims to classify Letter Recognition datasets using Artificial Neural Networks which are weighted optimally using the Artificial Bee Colony algorithm. The best classification accuracy results from this study were 92.85% using a combination of 4 hidden layers with each hidden layer containing 10 neurons.

Download Full-text