Audio Classification
Recently Published Documents


TOTAL DOCUMENTS
281 (last five years: 94)

H-INDEX
22 (last five years: 5)

2021 · Vol. 10(4) · pp. 72
Author(s):  
Eleni Tsalera ◽  
Andreas Papadakis ◽  
Maria Samarakou

The paper investigates retraining options and the performance of pre-trained Convolutional Neural Networks (CNNs) for sound classification. CNNs were initially designed for image classification and recognition, and were later extended to sound classification. Transfer learning, the retraining of already trained networks on different datasets, is a promising paradigm. We selected three 'Image'-trained and two 'Sound'-trained CNNs, namely GoogLeNet, SqueezeNet, ShuffleNet, VGGish, and YAMNet, and applied transfer learning. We explored the influence of key retraining parameters, including the optimizer, the mini-batch size, the learning rate, and the number of epochs, on classification accuracy and on the processing time required both for sound preprocessing (the preparation of scalograms and spectrograms) and for CNN training. The UrbanSound8K, ESC-10, and Air Compressor open sound datasets were employed. Using a two-fold criterion based on classification accuracy and time needed, we selected the 'champion' transfer-learning parameter combinations, discussed the consistency of the classification results, and explored possible benefits from fusing the classification estimations. The Sound CNNs achieved better classification accuracy, reaching an average of 96.4% for UrbanSound8K, 91.25% for ESC-10, and 100% for the Air Compressor dataset.
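The core transfer-learning recipe the abstract describes can be illustrated in miniature: keep the pre-trained network's feature extractor frozen and retrain only a new classification head, tuning the same parameters the paper studies (mini-batch size, learning rate, epochs). The sketch below is not the paper's implementation; the random "embeddings" stand in for frozen penultimate-layer CNN activations, and all shapes and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for frozen CNN embeddings: 200 clips, 128-dim features
# (e.g., penultimate-layer activations of a pre-trained network), 4 classes.
X = rng.normal(size=(200, 128))
y = rng.integers(0, 4, size=200)

# New softmax head, retrained with the parameters the paper tunes:
# optimizer step (plain SGD here), mini-batch size, learning rate, epochs.
W = np.zeros((128, 4))
b = np.zeros(4)
batch_size, lr, epochs = 32, 0.1, 10

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for _ in range(epochs):
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        xb, yb = X[idx], y[idx]
        probs = softmax(xb @ W + b)
        probs[np.arange(len(idx)), yb] -= 1.0   # dL/dlogits for cross-entropy
        W -= lr * xb.T @ probs / len(idx)
        b -= lr * probs.mean(axis=0)

preds = np.argmax(softmax(X @ W + b), axis=1)
print("training accuracy:", (preds == y).mean())
```

In practice the frozen backbone would be one of the named networks (GoogLeNet, VGGish, etc.) and the head would be retrained on UrbanSound8K-style spectrogram features; the loop structure and tunable parameters are the same.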


2021
Author(s):  
Md. Rafi Ur Rashid ◽  
Mahim Mahbub ◽  
Muhammad Abdullah Adnan

Author(s):  
Amit Rege ◽  
Ravi Sindal

An important task in music information retrieval for Indian art music is the recognition of the larger musicological frameworks, called ragas, on which performances are based. Ragas are characterized by prominent musical notes, motifs, general sequences of notes, and embellishments improvised by the performers. In this work we propose a convolutional neural network-based model that operates on mel-spectrograms to classify steady note regions and note transition regions in vocal melodies, which can be used to find prominent musical notes. We demonstrate that the proposed model achieves good classification accuracy.
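The mel-spectrogram input this model relies on can be computed from scratch with a short-time Fourier transform and a triangular mel filterbank. The sketch below is a minimal numpy version under assumed parameters (sample rate, FFT size, hop, and number of mel bands are all illustrative, not taken from the paper).

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spectrogram(signal, sr=22050, n_fft=1024, hop=256, n_mels=64):
    # Frame the signal and take the power STFT.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (frames, n_fft//2+1)

    # Triangular mel filterbank between 0 Hz and Nyquist.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        if c > l:
            fb[m - 1, l:c] = (np.arange(l, c) - l) / (c - l)   # rising edge
        if r > c:
            fb[m - 1, c:r] = (r - np.arange(c, r)) / (r - c)   # falling edge

    return 10.0 * np.log10(power @ fb.T + 1e-10)   # log-mel in dB

sig = np.sin(2 * np.pi * 440 * np.arange(22050) / 22050)  # 1 s of A4
S = mel_spectrogram(sig)
print(S.shape)  # (frames, n_mels)
```

Patches of such a log-mel matrix, labeled as steady-note or transition regions, would then serve as the CNN's training examples.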


Author(s):  
Adwait Patil

Abstract: The coronavirus outbreak has adversely affected the entire world. This project was developed to help the general public assess their likelihood of being COVID-positive using only a coughing sound and basic patient data. Audio classification is one of the most interesting applications of deep learning. Like image data, audio data is stored as bits; to analyze it we use Mel-frequency cepstral coefficients (MFCCs), which make it possible to feed the audio to our neural network. We use COUGHVID, a crowdsourced dataset consisting of 27,000 audio files and metadata for the corresponding patients. A 1D Convolutional Neural Network (CNN) processes the audio and metadata. As future work, the binary classifier could be replaced by a model that rates how likely a person is to be infected.
Keywords: audio classification, Mel-frequency cepstral coefficients, convolutional neural network, deep learning, COUGHVID
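The MFCC step mentioned above is, concretely, a DCT-II taken across the mel bands of a log-mel spectrogram, followed by some pooling to get a fixed-length vector that can be concatenated with patient metadata. The sketch below is illustrative only: the random matrix stands in for a real cough clip's log-mel spectrogram, and the metadata fields are hypothetical placeholders, not COUGHVID's actual schema.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative stand-in for a cough clip's log-mel spectrogram
# (frames x mel bands); in practice this is computed from the audio.
log_mel = rng.normal(size=(120, 40))

def mfccs(log_mel, n_mfcc=13):
    # MFCCs are the first coefficients of a DCT-II across the mel bands,
    # which compacts and decorrelates the spectral envelope.
    n_mels = log_mel.shape[1]
    k = np.arange(n_mels)
    basis = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * k + 1) / (2 * n_mels))
    return log_mel @ basis.T                      # (frames, n_mfcc)

coeffs = mfccs(log_mel)

# One fixed-length vector per clip: mean-pool over time, then append
# patient metadata (the fields here are hypothetical examples).
clip_features = coeffs.mean(axis=0)               # (13,)
metadata = np.array([54.0, 1.0, 0.0])             # e.g. age, sex flags
model_input = np.concatenate([clip_features, metadata])
print(model_input.shape)
```

A 1D CNN, as used in the project, could instead consume the full `coeffs` sequence frame by frame rather than the pooled vector.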


2021
Author(s):  
Khaled Koutini ◽  
Hamid Eghbal-zadeh ◽  
Florian Henkel ◽  
Jan Schlüter ◽  
Gerhard Widmer

Convolutional Neural Networks (CNNs) have been dominating classification tasks in various domains, such as machine vision, machine listening, and natural language processing. In machine listening, while generally exhibiting very good generalization capabilities, CNNs are sensitive to the specific audio recording device used, which has been recognized as a substantial problem in the acoustic scene classification (DCASE) community. In this study, we investigate the relationship between over-parameterization of acoustic scene classification models, and their resulting generalization abilities. Our results indicate that increasing width improves generalization to unseen devices, even without an increase in the number of parameters.
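"Increasing width" here means scaling each layer's channel count, which grows the parameter count roughly quadratically, since a convolution's weights scale with the product of input and output channels. A minimal sketch, assuming an illustrative three-layer conv stack (not the paper's architecture), makes this concrete:

```python
def cnn_param_count(width_multiplier, base_channels=(16, 32, 64),
                    kernel=3, n_classes=10, in_channels=1):
    # Parameter count of a small conv stack when every layer's channel
    # count is scaled by a width multiplier (architecture is illustrative).
    chans = [in_channels] + [int(c * width_multiplier) for c in base_channels]
    params = 0
    for c_in, c_out in zip(chans, chans[1:]):
        params += c_in * c_out * kernel * kernel + c_out  # conv weights + biases
    # Linear classification head after global average pooling.
    params += chans[-1] * n_classes + n_classes
    return params

for w in (0.5, 1.0, 2.0):
    print(f"width x{w}: {cnn_param_count(w):,} parameters")
```

Doubling the width more than doubles the parameter count, which is why width is a convenient knob for studying over-parameterization against device generalization.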

