Semi-Supervised Audio Classification with Partially Labeled Data

Author(s):  
Siddharth Gururani ◽  
Alexander Lerch
2017 ◽  
Author(s):  
Sukanya Sonowal ◽  
Tushar Sandhan ◽  
Inkyu Choi ◽  
Nam Soo Kim

Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1807
Author(s):  
Sascha Grollmisch ◽  
Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. What recent SSL methods have in common is that they rely strongly on the augmentation of unannotated data. This is largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks covering music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNNs) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.
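The following is a minimal sketch of the unlabeled-data loss at the core of FixMatch as it is generally described, not the authors' implementation. It assumes a model has already produced logits for a weakly and a strongly augmented view of each unlabeled clip; the function names and the 0.95 confidence threshold are illustrative assumptions, and the audio augmentations themselves are not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Masked cross-entropy over a batch of unlabeled examples.

    weak_logits / strong_logits: one list of class logits per example,
    computed from the weakly / strongly augmented view (hypothetical inputs).
    threshold: assumed confidence cut-off for accepting a pseudo-label.
    """
    if not weak_logits:
        return 0.0
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        probs = softmax(weak)
        confidence = max(probs)
        if confidence < threshold:
            continue  # low-confidence pseudo-labels contribute zero loss
        pseudo_label = probs.index(confidence)
        # Cross-entropy of the strong view against the hard pseudo-label.
        total += -math.log(softmax(strong)[pseudo_label])
    # Average over the whole batch, including the masked-out examples.
    return total / len(weak_logits)


if __name__ == "__main__":
    weak = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]]   # confident, not confident
    strong = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # uniform predictions
    print(fixmatch_unlabeled_loss(weak, strong))
```

Only the first example passes the threshold here, so the second contributes nothing to the loss. In training, this term is added to the ordinary supervised loss on the labeled examples.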


2021 ◽  
Vol 11 (11) ◽  
pp. 4880
Author(s):  
Abigail Copiaco ◽  
Christian Ritz ◽  
Nidhal Abdulaziz ◽  
Stefano Fasciani

Recent methodologies for audio classification frequently involve cepstral and spectral features, applied to single-channel recordings of acoustic scenes and events. Further, the concept of transfer learning has been widely used over the years, and has proven to provide an efficient alternative to training neural networks from scratch. The lower time and resource requirements when using pre-trained models allow for more versatility in developing system classification approaches. However, information on classification performance when using different features for multi-channel recordings is often limited. Furthermore, pre-trained networks are initially trained on bigger databases and are often unnecessarily large. This poses a challenge when developing systems for devices with limited computational resources, such as mobile or embedded devices. This paper presents a detailed study of the most apparent and widely used cepstral and spectral features for multi-channel audio applications. Accordingly, we propose the use of spectro-temporal features. Additionally, the paper details the development of a compact version of the AlexNet model for computationally limited platforms through studies of performance against various architectural and parameter modifications of the original network. The aim is to minimize the network size while maintaining the series network architecture and preserving the classification accuracy. Considering that other state-of-the-art compact networks present complex directed acyclic graphs, a series architecture offers an advantage in customizability. Experimentation was carried out in MATLAB, using a database that we generated for this task, which is composed of four-channel synthetic recordings of both sound events and scenes. The top-performing methodology resulted in a weighted F1-score of 87.92% for scalogram features classified via the modified AlexNet-33 network, which has a size of 14.33 MB. The AlexNet network returned 86.24% at a size of 222.71 MB.
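For readers unfamiliar with the metric, the sketch below shows one common definition of a weighted F1-score: per-class F1 averaged with weights proportional to each class's number of true instances. The abstract does not state the exact formula used, so this definition, the function name, and the toy labels are assumptions for illustration only.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (assumed definition)."""
    if not y_true:
        return 0.0
    support = Counter(y_true)  # number of true instances per class
    total = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1  # weight each class by its support
    return total / len(y_true)


if __name__ == "__main__":
    # Hypothetical labels, not data from the paper.
    y_true = ["scene", "scene", "scene", "event"]
    y_pred = ["scene", "scene", "event", "event"]
    print(round(weighted_f1(y_true, y_pred), 4))
```

Weighting by support keeps frequent classes from being under-represented in the average, which matters when sound events and scenes are not equally common in the test set.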


Author(s):  
Arooshi Taneja ◽  
Yashvi Gulati ◽  
Tushar Chugh ◽  
Pawan Joshi ◽  
Narina Thakur

2021 ◽  
Vol 544 ◽  
pp. 500-518 ◽  
Author(s):  
Can Gao ◽  
Jie Zhou ◽  
Duoqian Miao ◽  
Jiajun Wen ◽  
Xiaodong Yue

2018 ◽  
Vol 5 (2) ◽  
pp. 239-250 ◽  
Author(s):  
Keyu Liu ◽  
Eric C. C. Tsang ◽  
Jingjing Song ◽  
Hualong Yu ◽  
Xiangjian Chen ◽  
...  
