Ensemble-Guided Model for Performance Enhancement in Model-Complexity-Limited Acoustic Scene Classification

2021 ◽  
Vol 12 (1) ◽  
pp. 44
Author(s):  
Seokjin Lee ◽  
Minhan Kim ◽  
Seunghyeon Shin ◽  
Seungjae Baek ◽  
Sooyoung Park ◽  
...  

Recent acoustic scene classification (ASC) models apply various auxiliary methods to enhance performance, e.g., subsystem ensembles and data augmentation. Ensembles of several submodels can be particularly effective in ASC, but they increase the model size because the ensemble contains several submodels. They are therefore hard to use in model-complexity-limited ASC tasks. In this paper, we seek a performance enhancement method that takes advantage of the model ensemble technique without increasing the model size. Our method is based on the mean-teacher model, which was developed for consistency learning in semi-supervised learning. Because our problem is supervised learning, which differs from the purpose of the conventional mean-teacher model, we modify detailed strategies to maximize the consistency learning performance. To evaluate the effectiveness of our method, experiments were performed with the ASC database from the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 Task 1A. With our proposed method, the small-sized ASC model reached a log loss of 1.009 and an F1-score of 67.12%, whereas the vanilla ASC model showed a log loss of 1.052 and an F1-score of 65.79%.
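The two ingredients of a conventional mean-teacher setup, on which the abstract says the method is based, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the decay value, the loss weight, and the use of mean squared error as the consistency term are all assumptions, and the paper's supervised-learning modifications are not described in the abstract and are not reproduced here.

```python
import math

def softmax(logits):
    """Convert raw scores to class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ema_update(teacher, student, decay=0.99):
    """Teacher weights track an exponential moving average of student weights.

    The decay value is an assumed hyperparameter, not taken from the paper.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def cross_entropy(probs, label):
    """Supervised loss for a single example with an integer class label."""
    return -math.log(probs[label])

def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions (assumed form)."""
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n

def total_loss(student_logits, teacher_logits, label, weight=1.0):
    """Supervised loss plus a weighted consistency term (weight is hypothetical)."""
    sp = softmax(student_logits)
    tp = softmax(teacher_logits)
    return cross_entropy(sp, label) + weight * consistency_loss(sp, tp)
```

Only the student is trained by gradient descent; the teacher is refreshed with `ema_update` after each step. Because the teacher averages many student states, it acts like an ensemble over training time while only one set of weights is kept, which is why the model size does not grow.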

Author(s):  
Jianzhong Wang

This paper continues the development of multiple 1D-embedding-based (or 1D multi-embedding) methods for semi-supervised learning (SSL), which the author first introduced in [J. Wang, Semi-supervised learning using multiple one-dimensional embedding based adaptive interpolation, Int. J. Wavelets, Multiresolut. Inform Process. 14(2) (2016) 11 pp.]. It places the development in a more general framework and creates a new method that employs the ensemble technique to integrate multiple 1D embedding-based regularization and label boosting for SSL. The method combines parallel and serial ensembles. In each stage of the parallel ensemble, the dataset is first smoothly mapped onto multiple 1D sequences. On each 1D embedding of the data, a classical regularization method is applied to construct a weak classifier. All of these weak classifiers are then integrated into an ensemble of 1D labelers (E1DL), which, together with a nearest neighbor cluster (NNC) algorithm, extracts a newly labeled subset from the unlabeled set. This subset is believed to be correctly labeled with high confidence, so it joins the original labeled set for the next learning stage. Repeating this process gradually yields a boosted labeled set, and the process does not stop until the updated labeled set reaches a certain size. Finally, we use E1DL to build the target classifier, which labels all points of the dataset. We also fix universal parameters for all experiments so that the algorithm is parameter-free. Several experiments confirm the validity of our method for the classification of handwritten digits. Compared with several other popular SSL methods, our results are very promising.
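The label-boosting loop described above can be illustrated with a heavily simplified sketch. Every component here is a stand-in and an assumption: linear projections replace the paper's smooth 1D embeddings, a nearest-labeled-neighbor rule replaces the classical regularization method, and a unanimous vote replaces the E1DL plus NNC confidence test. All function and parameter names are hypothetical.

```python
def project_1d(points, direction):
    """Map each point to a 1D coordinate (stand-in for a smooth 1D embedding)."""
    return [sum(x * d for x, d in zip(p, direction)) for p in points]

def weak_labels_1d(coords, labels):
    """Weak classifier: copy the label of the nearest labeled point in 1D."""
    labeled = [(c, l) for c, l in zip(coords, labels) if l is not None]
    return [min(labeled, key=lambda cl: abs(cl[0] - c))[1] for c in coords]

def boost_labels(points, labels, directions, target_size=None):
    """Grow the labeled set until it reaches target_size (default: all points).

    labels uses None for unlabeled points. A point is newly labeled only when
    every weak classifier agrees on it (a simple high-confidence criterion).
    """
    labels = list(labels)
    target = len(points) if target_size is None else target_size
    while sum(l is not None for l in labels) < target:
        # Parallel ensemble: one weak classifier per 1D embedding.
        votes = [weak_labels_1d(project_1d(points, d), labels) for d in directions]
        added = 0
        for i, current in enumerate(labels):
            if current is None:
                candidates = {v[i] for v in votes}
                if len(candidates) == 1:  # unanimous vote
                    labels[i] = candidates.pop()
                    added += 1
        if added == 0:  # nothing confident enough; avoid looping forever
            break
    return labels
```

For example, with two well-separated clusters and one labeled point in each, calling `boost_labels(points, labels, directions=[(1, 0), (0, 1), (1, 1)])` propagates each seed label to the rest of its cluster. The serial part of the ensemble corresponds to the `while` loop, where each stage reuses the labels accepted in the previous one.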


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 371
Author(s):  
Yerin Lee ◽  
Soyoung Lim ◽  
Il-Youp Kwak

Acoustic scene classification (ASC) categorizes an audio file based on the environment in which it was recorded. It has long been studied in the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge. This paper presents the solution to Task 1 of the DCASE 2020 challenge submitted by the Chung-Ang University team. Task 1 addressed two challenges that ASC faces in real-world applications. One is that the classifier should generalize across audio recorded with different recording devices, and the other is that the model should have low complexity. We proposed two models to overcome these problems. First, a more general classification model was proposed by combining harmonic-percussive source separation (HPSS) and delta and delta-delta features with four different models. Second, using the same features, depthwise separable convolution was applied to the convolutional layers to develop a low-complexity model. Moreover, using gradient-weighted class activation mapping (Grad-CAM), we investigated which parts of the features our model attends to when making its decision. Our proposed systems ranked 9th and 7th in the competition for these two subtasks, respectively.
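To see why depthwise separable convolution lowers model complexity, a small parameter count is enough. The sketch below compares a standard 2D convolution with its depthwise separable counterpart; the layer sizes are illustrative assumptions rather than values from the paper, and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution: every output channel
    has its own k x k filter for every input channel."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution: one k x k filter per
    input channel, followed by a 1 x 1 convolution that mixes channels."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
print(standard_conv_params(3, 64, 128))        # 73728
print(depthwise_separable_params(3, 64, 128))  # 8768
```

For this assumed layer the separable version needs about 8.4 times fewer weights, and the saving grows with the kernel size and the number of output channels.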


Author(s):  
Leandro Skowronski ◽  
Paula Martin de Moraes ◽  
Mario Luiz Teixeira de Moraes ◽  
Wesley Nunes Gonçalves ◽  
Michel Constantino ◽  
...  

2021 ◽  
Vol 503 (2) ◽  
pp. 1828-1846
Author(s):  
Burger Becker ◽  
Mattia Vaccari ◽  
Matthew Prescott ◽  
Trienko Grobler

The morphological classification of radio sources is important for gaining a full understanding of galaxy evolution processes and their relation to local environmental properties. Furthermore, the complex nature of the problem, its appeal for citizen scientists, and the large data rates generated by existing and upcoming radio telescopes combine to make the morphological classification of radio sources an ideal test case for the application of machine learning techniques. One approach that has recently shown great promise is convolutional neural networks (CNNs). The literature, however, lacks two major things when it comes to CNNs and radio galaxy morphological classification. First, a proper analysis is needed of whether overfitting occurs when CNNs are trained to perform radio galaxy morphological classification using a small curated training set. Second, a good comparative study of the practical applicability of the CNN architectures in the literature is required. This paper addresses both shortcomings. The comparative study uses multiple performance metrics, such as inference time, model complexity, computational complexity, and mean per-class accuracy. As part of this study, we also investigate the effect that receptive field, stride length, and coverage have on recognition performance. For the sake of completeness, we also investigate the recognition performance gains that can be obtained by employing classification ensembles. A ranking system based on recognition and computational performance is proposed. MCRGNet, Radio Galaxy Zoo, and ConvXpress (a novel classifier) are the architectures that best balance computational requirements with recognition performance.
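Mean per-class accuracy, one of the metrics named above, averages the accuracy obtained within each class so that rare classes count as much as common ones. A minimal sketch follows; the function name and the class labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def mean_per_class_accuracy(y_true, y_pred):
    """Average, over classes present in y_true, of the fraction of that
    class's examples that were predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)

# Imbalanced toy example: four samples of class "A", one of class "B".
y_true = ["A", "A", "A", "A", "B"]
y_pred = ["A", "A", "A", "A", "A"]
print(mean_per_class_accuracy(y_true, y_pred))  # 0.5
```

Here the plain accuracy is 0.8 because four of five predictions are right, but the mean per-class accuracy is 0.5 because class "B" is never recognized. This matters when a small curated training set has uneven class sizes.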


Author(s):  
Wenshuai Chen ◽  
Shuiping Gou ◽  
Xinlin Wang ◽  
Licheng Jiao ◽  
Changzhe Jiao ◽  
...  
