Self-Supervised Transfer Learning from Natural Images for Sound Classification

2021 ◽  
Vol 11 (7) ◽  
pp. 3043
Author(s):  
Sungho Shin ◽  
Jongwon Kim ◽  
Yeonguk Yu ◽  
Seongju Lee ◽  
Kyoobin Lee

We propose the implementation of transfer learning from natural images to audio-based images using self-supervised learning schemes. Through self-supervised learning, convolutional neural networks (CNNs) can learn the general representation of natural images without labels. In this study, a convolutional neural network was pre-trained with natural images (ImageNet) via self-supervised learning; subsequently, it was fine-tuned on the target audio samples. Pre-training with the self-supervised learning scheme significantly improved the sound classification performance when validated on the following benchmarks: ESC-50, UrbanSound8k, and GTZAN. The network pre-trained via self-supervised learning achieved a level of accuracy similar to that of networks pre-trained using a supervised method that requires labels. Therefore, we demonstrated that transfer learning from natural images contributes to improvements in audio-related tasks, and that self-supervised learning with natural images is an adequate pre-training scheme in terms of simplicity and effectiveness.

2021 ◽  
pp. 20201263
Author(s):  
Mohammad Salehi ◽  
Reza Mohammadi ◽  
Hamed Ghaffari ◽  
Nahid Sadighi ◽  
Reza Reiazi

Objective: Pneumonia is a lung infection that causes inflammation of the small air sacs (alveoli) in one or both lungs. Proper and fast diagnosis of pneumonia at an early stage is imperative for optimal patient care. Currently, chest X-ray is considered the best imaging modality for diagnosing pneumonia. However, the interpretation of chest X-ray images is challenging. To this end, we aimed to use an automated convolutional neural network-based transfer-learning approach to detect pneumonia in paediatric chest radiographs. Methods: Herein, an automated convolutional neural network-based transfer-learning approach using four different pre-trained models (i.e. VGG19, DenseNet121, Xception, and ResNet50) was applied to detect pneumonia in chest X-ray images of children (1–5 years). The performance of the proposed models on the test data set was evaluated using five performance metrics: accuracy, sensitivity/recall, precision, area under the curve, and F1 score. Results: All proposed models provide accuracy greater than 83.0% for binary classification. The pre-trained DenseNet121 model provides the highest performance for automated pneumonia classification with 86.8% accuracy, followed by the Xception model with an accuracy of 86.0%. The sensitivity of the proposed models was greater than 91.0%. The Xception and DenseNet121 models achieve the highest classification performance, with F1 scores greater than 89.0%. The areas under the receiver operating characteristic curve of the VGG19, Xception, ResNet50, and DenseNet121 models are 0.78, 0.81, 0.81, and 0.86, respectively. Conclusion: Our data showed that the proposed models achieve a high accuracy for binary classification. Transfer learning was used to accelerate training of the proposed models and resolve the problem associated with insufficient data. We hope that these proposed models can help radiologists quickly diagnose pneumonia in radiology departments. Moreover, our proposed models may be useful to detect other chest-related diseases such as novel coronavirus 2019. Advances in knowledge: Herein, we used transfer learning as a machine learning approach to accelerate training of the proposed models and resolve the problem associated with insufficient data. Our proposed models achieved accuracy greater than 83.0% for binary classification.
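
The threshold-based metrics named in the abstract above (accuracy, sensitivity/recall, precision, and F1 score) all follow from the four counts of a binary confusion matrix. The following is a minimal sketch of that arithmetic; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper. Area under the curve is left out because it needs per-sample scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity (recall), precision, and F1 from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Sensitivity: fraction of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: fraction of predicted positives that were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the study).
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```

With these illustrative counts, sensitivity is higher than accuracy, the same qualitative pattern the abstract reports; the numbers themselves carry no meaning beyond the example.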


2019 ◽  
Vol 12 (1) ◽  
pp. 86 ◽  
Author(s):  
Rafael Pires de Lima ◽  
Kurt Marfurt

Remote-sensing image scene classification can provide significant value, ranging from forest fire monitoring to land-use and land-cover classification. From the first aerial photographs of the early 20th century to the satellite imagery of today, the amount of remote-sensing data has increased geometrically, and at higher resolution. The need to analyze these modern digital data motivated research to accelerate remote-sensing image classification. Fortunately, great advances have been made by the computer vision community to classify natural images or photographs taken with an ordinary camera. Natural image datasets can range up to millions of samples and are, therefore, amenable to deep-learning techniques. Many fields of science, remote sensing included, were able to exploit the success of natural image classification by convolutional neural network models using a technique commonly called transfer learning. We provide a systematic review of transfer learning applications for scene classification using different datasets and different deep-learning models. We evaluate how the specialization of convolutional neural network models affects the transfer learning process by splitting the original models at different points. As expected, we find that the choice of hyperparameters used to train the model has a significant influence on the final performance of the models. Curiously, we find that transfer learning from models trained on larger, more generic natural image datasets outperformed transfer learning from models trained directly on smaller remotely sensed datasets. Nonetheless, results show that transfer learning provides a powerful tool for remote-sensing scene classification.


Mekatronika ◽  
2020 ◽  
Vol 2 (2) ◽  
pp. 23-27
Author(s):  
Amirul Asyraf Abdul Manan ◽  
Mohd Azraai Mohd Razman ◽  
Ismail Mohd Khairuddin ◽  
Muhammad Nur Aiman Shapiee

This study presents an application of a Convolutional Neural Network (CNN) based detector to detect chili and its leaves in images of chili plants. Detecting chili on the plant is essential for the development of robotic vision and monitoring; it helps supervise plant growth and, furthermore, analyse productivity and quality. This paper aims to develop a system that can monitor and identify bird’s eye chili plants by implementing machine learning. First, a methodology for efficient detection of bird’s eye chili and its leaf was developed. A dataset totalling 1866 images of bird’s eye chili and its leaf, after augmentation, was used in this experiment. YOLO Darknet was implemented to train on the dataset. After a series of experiments, the model was compared with other transfer learning models such as YOLO Tiny, Faster R-CNN, and EfficientDet. The classification performance of these transfer learning models was calculated and compared. The experimental results show that the YOLOv4 Darknet model achieves an mAP of 75.69%, followed by EfficientDet at 71.85%, on the augmented dataset.
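
The mAP figures quoted above depend on deciding whether each predicted box matches a ground-truth box, which is conventionally done with intersection over union (IoU). The sketch below shows only that building block; the `(x_min, y_min, x_max, y_max)` box layout, the function name `iou`, and the example boxes are assumptions for illustration, and the abstract does not state which IoU threshold the study used.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: a prediction shifted diagonally from the ground truth.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A detection is typically counted as a true positive when its IoU with a ground-truth box of the same class exceeds a chosen threshold; average precision per class is then computed from the ranked detections and averaged over classes to give mAP.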


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3992 ◽  
Author(s):  
Jingmei Li ◽  
Weifei Wu ◽  
Di Xue ◽  
Peng Gao

Transfer learning can enhance classification performance of a target domain with insufficient training data by utilizing knowledge relating to the target domain from a source domain. Nowadays, it is common to see two or more source domains available for knowledge transfer, which can improve performance of learning tasks in the target domain. However, the classification performance of the target domain decreases due to mismatching of probability distributions. Recent studies have shown that deep learning can build deep structures by extracting more effective features to resist the mismatching. In this paper, we propose a new multi-source deep transfer neural network algorithm, MultiDTNN, based on convolutional neural network and multi-source transfer learning. In MultiDTNN, joint probability distribution adaptation (JPDA) is used for reducing the mismatching between source and target domains to enhance feature transferability of the source domain in deep neural networks. Then, the convolutional neural network is trained by utilizing the datasets of each source and target domain to obtain a set of classifiers. Finally, the designed selection strategy selects the classifier with the smallest classification error on the target domain from the set to assemble the MultiDTNN framework. The effectiveness of the proposed MultiDTNN is verified by comparing it with other state-of-the-art deep transfer learning methods on three datasets.
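
The final selection step described above, picking the classifier with the smallest classification error on the target domain, can be illustrated with a small sketch. This is not the MultiDTNN implementation: the classifiers here are toy threshold functions, the names `classification_error` and `select_classifier` are hypothetical, and the sketch assumes a small labelled target sample is available for measuring error.

```python
def classification_error(classifier, samples, labels):
    """Fraction of samples on which the classifier disagrees with the label."""
    wrong = sum(1 for x, y in zip(samples, labels) if classifier(x) != y)
    return wrong / len(labels)


def select_classifier(classifiers, samples, labels):
    """Return the name of the classifier with the smallest error on the given data."""
    return min(
        classifiers,
        key=lambda name: classification_error(classifiers[name], samples, labels),
    )


# Toy stand-ins for classifiers trained from different source domains (hypothetical).
classifiers = {
    "source_a": lambda x: int(x > 0.5),
    "source_b": lambda x: int(x > 0.3),
    "source_c": lambda x: int(x > 0.8),
}

# Hypothetical labelled target-domain samples.
target_samples = [0.1, 0.4, 0.6, 0.9]
target_labels = [0, 0, 1, 1]

best = select_classifier(classifiers, target_samples, target_labels)
```

In this toy setting `source_a` makes no mistakes on the target samples while the other two each misclassify one of four, so it is the one selected.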


2020 ◽  
Vol 12 (11) ◽  
pp. 1780 ◽  
Author(s):  
Yao Liu ◽  
Lianru Gao ◽  
Chenchao Xiao ◽  
Ying Qu ◽  
Ke Zheng ◽  
...  

Convolutional neural networks (CNNs) have been widely applied in hyperspectral imagery (HSI) classification. However, their classification performance might be limited by the scarcity of labeled data to be used for training and validation. In this paper, we propose a novel lightweight shuffled group convolutional neural network (abbreviated as SG-CNN) to achieve efficient training with a limited training dataset in HSI classification. SG-CNN consists of SG conv units that employ conventional and atrous convolution in different groups, followed by a channel shuffle operation and a shortcut connection. In this way, SG-CNNs have fewer trainable parameters, whilst they can still be accurately and efficiently trained with fewer labeled samples. Transfer learning between different HSI datasets is also applied on the SG-CNN to further improve the classification accuracy. To evaluate the effectiveness of SG-CNNs for HSI classification, experiments were conducted on three public HSI datasets, with models pretrained on HSIs from different sensors. SG-CNNs with different levels of complexity were tested, and their classification results were compared with fine-tuned ShuffleNet2, ResNeXt, and their original counterparts. The experimental results demonstrate that SG-CNNs can achieve competitive classification performance when labeled training data are scarce, while efficiently providing satisfactory classification results.
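
The channel shuffle operation mentioned above mixes information between convolution groups by interleaving their output channels. The sketch below shows the usual formulation (view the channels as a groups-by-channels-per-group grid, transpose, flatten) on a plain list of channel indices; whether SG-CNN applies exactly this variant is an assumption, and `channel_shuffle` is a hypothetical name.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups: reshape to (groups, n), transpose, flatten."""
    total = len(channels)
    if total % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = total // groups
    # Element i of every group is emitted before element i + 1 of any group.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


# Six channels in two groups: [0, 1, 2] and [3, 4, 5].
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
```

After the shuffle, each consecutive block of channels contains members of every original group, so the next grouped convolution sees features from all groups.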


2020 ◽  
Author(s):  
Yuki Hashimoto ◽  
Yosuke Ogata ◽  
Manabu Honda ◽  
Yuichi Yamashita

In this study, we propose a novel deep-learning technique for functional MRI analysis. We introduced an “identity feature” by a self-supervised learning schema, in which a neural network is trained solely based on the MRI scans; furthermore, training does not require any explicit labels. The proposed method demonstrated that each temporal slice of resting state functional MRI contains enough information to identify the subject. The network learned a feature space in which the features were clustered per subject for the test data as well as for the training data; this is unlike the features extracted by conventional methods, including region-of-interest pooled signals and principal component analysis. In addition, using a simple linear classifier for the identity features, we demonstrated that the extracted features could contribute to schizophrenia diagnosis. The classification accuracy of our identity features was higher than that of the conventional functional connectivity. Our results suggested that our proposed training scheme of the neural network captured brain functioning related to the diagnosis of psychiatric disorders as well as the identity of the subject. Our results together highlight the validity of our proposed technique as a design for self-supervised learning.


2022 ◽  
Author(s):  
M. Hongchul Sohn ◽  
Sonia Yuxiao Lai ◽  
Matthew L. Elwin ◽  
Julius P. A. Dewald

Myoelectric control uses electromyography (EMG) signals as human-originated input to enable intuitive interfaces with machines. As such, recent rehabilitation robotics employs myoelectric control to autonomously classify user intent or operation mode using machine learning. However, performance in such applications inherently suffers from the non-stationarity of EMG signals across measurement conditions. Current laboratory-based solutions rely on careful, time-consuming control of the recordings or periodic recalibration, impeding real-world deployment. We propose that robust yet seamless myoelectric control can be achieved using a low-end, easy-to-don-and-doff wearable EMG sensor combined with unsupervised transfer learning. Here, we test the feasibility of one such application using a consumer-grade sensor (Myo armband, 8 EMG channels at 200 Hz) for gesture classification across measurement conditions using an existing dataset: 5 users × 10 days × 3 sensor locations. Specifically, we first train a deep neural network using Temporal-Spatial Descriptors (TSD) with labeled source data from any particular user, day, or location. We then apply the Self-Calibrating Asynchronous Domain Adversarial Neural Network (SCADANN), which automatically adjusts the trained TSD to improve classification performance for unlabeled target data from a different user, day, or sensor location. Compared to the original TSD, SCADANN improves accuracy by 12±5.2% (avg±sd), 9.6±5.0%, and 8.6±3.3% across all possible user-to-user, day-to-day, and location-to-location cases, respectively. In one best-case scenario, accuracy improves by 26% (from 67% to 93%), whereas sometimes the gain is modest (e.g., from 76% to 78%). We also show that the performance of transfer learning can be improved by using a better model trained with good (e.g., incremental) source data. We postulate that the proposed approach is feasible and promising and can be further tailored for seamless myoelectric control of powered prosthetics or exoskeletons.


2016 ◽  
Vol 40 (2) ◽  
pp. 363-374 ◽  
Author(s):  
Ye Tao ◽  
Duzhou Zhang ◽  
Shengjun Cheng ◽  
Xianglong Tang

Semi-supervised learning aims to utilize both labelled and unlabelled data to improve learning performance. This paper shows a distinct way to exploit unlabelled data for traditional semi-supervised learning methods, such as self-training. Self-training is a well-known semi-supervised learning algorithm which iteratively trains a classifier by bootstrapping from unlabelled data. Standard self-training merely selects unlabelled examples for training set augmentation according to the current classifier model, which is trained only on the labelled data. This could be problematic since the underlying classifier is not strong enough, especially when initial labelled data is sparse. Consequently, self-training suffers from too much classification noise accumulated in the training set. In this paper, we propose a novel self-training style algorithm, which exploits a manifold assumption to optimize the self-labelling process. Unlike standard self-training, our algorithm utilizes labelled and unlabelled data as a whole to label and select unlabelled examples for training set augmentation. In detail, two measures are employed to minimize the effect of noise introduced to the labelled training set: first, a transductive method based on controlled graph random walk is incorporated to generate reliable predictions on unlabelled data; second, a mechanism is adopted to sequentially augment the training set. Empirical results suggest that the proposed method can effectively improve classification performance.
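
For orientation, the standard self-training loop that the abstract above takes as its baseline can be sketched as follows. This is the baseline only, not the proposed graph-random-walk method: the base learner is a one-dimensional nearest-centroid classifier chosen for brevity, and the names `self_train`, `min_margin`, and `max_rounds`, as well as the data, are hypothetical.

```python
def centroids(labelled):
    """Mean feature value per class from (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in labelled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(cents, x):
    """Return (label, margin); margin is how much closer the best centroid is
    than the runner-up. Assumes at least two classes."""
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - cents[second]) - abs(x - cents[best])
    return best, margin


def self_train(labelled, unlabelled, min_margin=1.0, max_rounds=10):
    """Iteratively add confidently self-labelled examples to the training set."""
    labelled = list(labelled)
    pool = list(unlabelled)
    for _ in range(max_rounds):
        cents = centroids(labelled)  # retrain on the current labelled set
        confident, remaining = [], []
        for x in pool:
            label, margin = predict(cents, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to add; stop early
        labelled.extend(confident)
        pool = remaining
    return centroids(labelled), pool


# Hypothetical data: two labelled seeds and five unlabelled points.
final_centroids, leftover = self_train(
    labelled=[(0.0, "a"), (10.0, "b")],
    unlabelled=[1.0, 2.0, 8.0, 9.0, 5.2],
)
```

In this toy run the ambiguous point near the midpoint is never self-labelled, which shows the margin threshold guarding against the noise accumulation that the abstract identifies as the weakness of plain self-training.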

