Audio Classification
Recently Published Documents


TOTAL DOCUMENTS
281 (last five years: 94)

H-INDEX
22 (last five years: 5)

2021 · Vol. 10(4) · pp. 72
Author(s):  
Eleni Tsalera ◽  
Andreas Papadakis ◽  
Maria Samarakou

The paper investigates retraining options and the performance of pre-trained Convolutional Neural Networks (CNNs) for sound classification. CNNs were initially designed for image classification and recognition, and were later extended to sound classification. Transfer learning, the retraining of already trained networks on different datasets, is a promising paradigm. We selected three 'Image'-trained and two 'Sound'-trained CNNs, namely GoogLeNet, SqueezeNet, ShuffleNet, VGGish, and YAMNet, and applied transfer learning. We explored the influence of key retraining parameters, including the optimizer, the mini-batch size, the learning rate, and the number of epochs, on classification accuracy and on the processing time required both for sound preprocessing (the preparation of scalograms and spectrograms) and for CNN training. The UrbanSound8K, ESC-10, and Air Compressor open sound datasets were employed. Using a two-fold criterion based on classification accuracy and time needed, we selected the 'champion' transfer-learning parameter combinations, discussed the consistency of the classification results, and explored possible benefits from fusing the classification estimations. The Sound CNNs achieved better classification accuracy, reaching an average of 96.4% for UrbanSound8K, 91.25% for ESC-10, and 100% for the Air Compressor dataset.
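The core transfer-learning recipe the abstract describes can be illustrated in miniature: keep the pre-trained network's feature extractor frozen and retrain only a new classification head, tuning the same parameters the paper studies (mini-batch size, learning rate, epochs). The sketch below is not the paper's implementation; the random "embeddings" stand in for frozen penultimate-layer CNN activations, and all shapes and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for frozen CNN embeddings: 200 clips, 128-dim features
# (e.g., penultimate-layer activations of a pre-trained network), 4 classes.
X = rng.normal(size=(200, 128))
y = rng.integers(0, 4, size=200)

# New softmax head, retrained with the parameters the paper tunes:
# optimizer step (plain SGD here), mini-batch size, learning rate, epochs.
W = np.zeros((128, 4))
b = np.zeros(4)
batch_size, lr, epochs = 32, 0.1, 10

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for _ in range(epochs):
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        xb, yb = X[idx], y[idx]
        probs = softmax(xb @ W + b)
        probs[np.arange(len(idx)), yb] -= 1.0   # dL/dlogits for cross-entropy
        W -= lr * xb.T @ probs / len(idx)
        b -= lr * probs.mean(axis=0)

preds = np.argmax(softmax(X @ W + b), axis=1)
print("training accuracy:", (preds == y).mean())
```

In practice the frozen backbone would be one of the named networks (GoogLeNet, VGGish, etc.) and the head would be retrained on UrbanSound8K-style spectrogram features; the loop structure and tunable parameters are the same.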


2021
Author(s):  
Md. Rafi Ur Rashid ◽  
Mahim Mahbub ◽  
Muhammad Abdullah Adnan

Author(s):  
Amit Rege ◽  
Ravi Sindal

An important task in music information retrieval for Indian art music is the recognition of the larger musicological frameworks, called ragas, on which performances are based. Ragas are characterized by prominent musical notes, motifs, general sequences of notes, and embellishments improvised by the performers. In this work we propose a convolutional neural network-based model that operates on mel-spectrograms to classify steady note regions and note transition regions in vocal melodies, which can be used to find prominent musical notes. We demonstrate that the proposed model achieves good classification accuracy.
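The mel-spectrogram input this model relies on can be computed from scratch with a short-time Fourier transform and a triangular mel filterbank. The sketch below is a minimal numpy version under assumed parameters (sample rate, FFT size, hop, and number of mel bands are all illustrative, not taken from the paper).

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spectrogram(signal, sr=22050, n_fft=1024, hop=256, n_mels=64):
    # Frame the signal and take the power STFT.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (frames, n_fft//2+1)

    # Triangular mel filterbank between 0 Hz and Nyquist.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        if c > l:
            fb[m - 1, l:c] = (np.arange(l, c) - l) / (c - l)   # rising edge
        if r > c:
            fb[m - 1, c:r] = (r - np.arange(c, r)) / (r - c)   # falling edge

    return 10.0 * np.log10(power @ fb.T + 1e-10)   # log-mel in dB

sig = np.sin(2 * np.pi * 440 * np.arange(22050) / 22050)  # 1 s of A4
S = mel_spectrogram(sig)
print(S.shape)  # (frames, n_mels)
```

Patches of such a log-mel matrix, labeled as steady-note or transition regions, would then serve as the CNN's training examples.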


Author(s):  
Adwait Patil

Abstract: The coronavirus outbreak has adversely affected the entire world. This project was developed to help the general public assess their likelihood of being COVID-positive using only a coughing sound and basic patient data. Audio classification is one of the most interesting applications of deep learning. Like image data, audio data is stored as bits; to analyze it we use Mel-frequency cepstral coefficients (MFCCs), which make it possible to feed the audio to our neural network. We use COUGHVID, a crowdsourced dataset consisting of 27,000 audio files and metadata for the corresponding patients. A 1D Convolutional Neural Network (CNN) processes the audio and metadata. As future work, the binary classifier could be replaced by a model that rates how likely a person is to be infected.
Keywords: audio classification, Mel-frequency cepstral coefficients, convolutional neural network, deep learning, COUGHVID
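The MFCC step mentioned above is, concretely, a DCT-II taken across the mel bands of a log-mel spectrogram, followed by some pooling to get a fixed-length vector that can be concatenated with patient metadata. The sketch below is illustrative only: the random matrix stands in for a real cough clip's log-mel spectrogram, and the metadata fields are hypothetical placeholders, not COUGHVID's actual schema.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative stand-in for a cough clip's log-mel spectrogram
# (frames x mel bands); in practice this is computed from the audio.
log_mel = rng.normal(size=(120, 40))

def mfccs(log_mel, n_mfcc=13):
    # MFCCs are the first coefficients of a DCT-II across the mel bands,
    # which compacts and decorrelates the spectral envelope.
    n_mels = log_mel.shape[1]
    k = np.arange(n_mels)
    basis = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * k + 1) / (2 * n_mels))
    return log_mel @ basis.T                      # (frames, n_mfcc)

coeffs = mfccs(log_mel)

# One fixed-length vector per clip: mean-pool over time, then append
# patient metadata (the fields here are hypothetical examples).
clip_features = coeffs.mean(axis=0)               # (13,)
metadata = np.array([54.0, 1.0, 0.0])             # e.g. age, sex flags
model_input = np.concatenate([clip_features, metadata])
print(model_input.shape)
```

A 1D CNN, as used in the project, could instead consume the full `coeffs` sequence frame by frame rather than the pooled vector.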


2021
Author(s):  
Khaled Koutini ◽  
Hamid Eghbal-zadeh ◽  
Florian Henkel ◽  
Jan Schlüter ◽  
Gerhard Widmer

Convolutional Neural Networks (CNNs) have been dominating classification tasks in various domains, such as machine vision, machine listening, and natural language processing. In machine listening, while generally exhibiting very good generalization capabilities, CNNs are sensitive to the specific audio recording device used, which has been recognized as a substantial problem in the acoustic scene classification (DCASE) community. In this study, we investigate the relationship between over-parameterization of acoustic scene classification models, and their resulting generalization abilities. Our results indicate that increasing width improves generalization to unseen devices, even without an increase in the number of parameters.
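"Increasing width" here means scaling each layer's channel count, which grows the parameter count roughly quadratically, since a convolution's weights scale with the product of input and output channels. A minimal sketch, assuming an illustrative three-layer conv stack (not the paper's architecture), makes this concrete:

```python
def cnn_param_count(width_multiplier, base_channels=(16, 32, 64),
                    kernel=3, n_classes=10, in_channels=1):
    # Parameter count of a small conv stack when every layer's channel
    # count is scaled by a width multiplier (architecture is illustrative).
    chans = [in_channels] + [int(c * width_multiplier) for c in base_channels]
    params = 0
    for c_in, c_out in zip(chans, chans[1:]):
        params += c_in * c_out * kernel * kernel + c_out  # conv weights + biases
    # Linear classification head after global average pooling.
    params += chans[-1] * n_classes + n_classes
    return params

for w in (0.5, 1.0, 2.0):
    print(f"width x{w}: {cnn_param_count(w):,} parameters")
```

Doubling the width more than doubles the parameter count, which is why width is a convenient knob for studying over-parameterization against device generalization.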

