Classification of the Sidewalk Condition Using Self-Supervised Transfer Learning for Wheelchair Safety Driving

Sensors ◽  
2022 ◽  
Vol 22 (1) ◽  
pp. 380
Author(s):  
Ha-Yeong Yoon ◽  
Jung-Hwa Kim ◽  
Jin-Woo Jeong

The demand for wheelchairs has increased recently as the populations of elderly people and patients with disorders grow. However, society still pays little attention to infrastructure that can threaten wheelchair users, such as sidewalks with cracks and potholes. Although various studies have proposed ways to recognize such hazards, they mainly depend on RGB images or IMU sensors, which are sensitive to outdoor conditions such as low illumination, bad weather, and unavoidable vibrations, resulting in unsatisfactory and unstable performance. In this paper, we introduce a novel system based on various convolutional neural networks (CNNs) to automatically classify the condition of sidewalks using images captured with depth and infrared modalities. Moreover, we compare the performance of training CNNs from scratch with that of the transfer learning approach, in which weights learned from the natural image domain (e.g., ImageNet) are fine-tuned on the depth and infrared image domains. In particular, we propose applying a ResNet-152 model pre-trained with self-supervised learning during transfer learning to leverage better image representations. Performance evaluation of sidewalk condition classification was conducted with 100% and 10% of the training data. The experimental results validate the effectiveness and feasibility of the proposed approach and suggest future research directions.
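The abstract does not specify how single-channel depth or infrared frames are fed to an ImageNet-pretrained backbone such as ResNet-152; a minimal sketch of one common approach (channel replication plus the usual ImageNet normalization; image size and value ranges are illustrative assumptions, not the paper's settings) is:

```python
import numpy as np

# Hypothetical single-channel depth frame (values in millimeters).
depth = np.random.default_rng(0).uniform(500, 4000, size=(224, 224))

# Scale to [0, 1], replicate to three channels, then apply the standard
# ImageNet mean/std so an ImageNet-pretrained backbone can consume it.
x = (depth - depth.min()) / (depth.max() - depth.min())
x = np.stack([x, x, x], axis=0)                    # shape (3, 224, 224)
mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)
x = (x - mean) / std
```

The same preprocessing would apply to infrared frames; only the raw value range differs.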

Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1807
Author(s):  
Sascha Grollmisch ◽  
Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. The commonality between recent SSL methods is that they rely strongly on the augmentation of unannotated data, a strategy that remains largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks covering music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNNs) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only on the most challenging dataset, acoustic scene classification, showing that there is still room for improvement.
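The core of FixMatch described above, pseudo-labeling confident predictions on weakly augmented data and penalizing the model's output on the strongly augmented view, can be sketched as follows (the toy probabilities and the 0.95 confidence threshold are illustrative):

```python
import numpy as np

def fixmatch_mask(probs_weak, threshold=0.95):
    """Pseudo-labels and a mask selecting confident unlabeled samples."""
    pseudo = probs_weak.argmax(axis=1)
    mask = probs_weak.max(axis=1) >= threshold
    return pseudo, mask

def unlabeled_loss(probs_strong, pseudo, mask):
    """Cross-entropy on strongly augmented views, only where mask is True."""
    eps = 1e-12
    ce = -np.log(probs_strong[np.arange(len(pseudo)), pseudo] + eps)
    return (ce * mask).sum() / max(mask.sum(), 1)

# Toy predictions for three unlabeled audio clips (rows sum to 1).
p_weak = np.array([[0.98, 0.01, 0.01],
                   [0.50, 0.30, 0.20],
                   [0.02, 0.96, 0.02]])
p_strong = np.array([[0.90, 0.05, 0.05],
                     [0.40, 0.40, 0.20],
                     [0.10, 0.85, 0.05]])
pseudo, mask = fixmatch_mask(p_weak)
loss = unlabeled_loss(p_strong, pseudo, mask)
```

Only the first and third clips pass the confidence threshold, so the second clip contributes nothing to the unlabeled loss; this filtering is why the choice of augmentation matters so much for audio.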


Author(s):  
Christian Horn ◽  
Oscar Ivarsson ◽  
Cecilia Lindhé ◽  
Rich Potter ◽  
Ashely Green ◽  
...  

Rock art carvings, which are best described as petroglyphs, were produced by removing parts of the rock surface to create a negative relief. This tradition was particularly strong during the Nordic Bronze Age (1700–550 BC) in southern Scandinavia with over 20,000 boats and thousands of humans, animals, wagons, etc. This vivid and highly engaging material provides quantitative data of high potential to understand Bronze Age social structures and ideologies. The ability to provide the technically best possible documentation and to automate identification and classification of images would help to take full advantage of the research potential of petroglyphs in southern Scandinavia and elsewhere. We, therefore, attempted to train a model that locates and classifies image objects using faster region-based convolutional neural network (Faster-RCNN) based on data produced by a novel method to improve visualizing the content of 3D documentations. A newly created layer of 3D rock art documentation provides the best data currently available and has reduced inscribed bias compared to older methods. Several models were trained based on input images annotated with bounding boxes produced with different parameters to find the best solution. The data included 4305 individual images in 408 scans of rock art sites. To enhance the models and enrich the training data, we used data augmentation and transfer learning. The successful models perform exceptionally well on boats and circles, as well as with human figures and wheels. This work was an interdisciplinary undertaking which led to important reflections about archaeology, digital humanities, and artificial intelligence. The reflections and the success represented by the trained models open novel avenues for future research on rock art.
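Training and evaluating a detector like Faster-RCNN against annotated bounding boxes relies on the intersection-over-union (IoU) overlap measure; a minimal sketch with hypothetical pixel coordinates:

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# An annotated boat figure vs. a model prediction (pixel coordinates).
score = iou((10, 10, 50, 30), (20, 15, 60, 35))   # overlap of ~0.39
```

A prediction is typically counted as correct when its IoU with an annotation exceeds a threshold such as 0.5; the exact threshold used in the study is not stated here.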


Author(s):  
Jianfang Cao ◽  
Minmin Yan ◽  
Yiming Jia ◽  
Xiaodong Tian ◽  
Zibang Zhang

It is difficult to identify the historical period in which some ancient murals were created because of damage due to artificial and/or natural factors; similarities in content, style, and color among murals; low image resolution; and other reasons. This study proposed a transfer learning-fused Inception-v3 model for dynasty-based classification. First, the model adopted Inception-v3 with frozen fully connected and softmax layers for pretraining over ImageNet. Second, the model fused Inception-v3 with transfer learning for parameter readjustment over small datasets. Third, the corresponding bottleneck files of the mural images were generated, and the deep-level features of the images were extracted. Fourth, the cross-entropy loss function was employed to calculate the loss value at each step of the training, and an algorithm for the adaptive learning rate on the stochastic gradient descent was applied to unify the learning rate. Finally, the updated softmax classifier was utilized for the dynasty-based classification of the images. On the constructed small datasets, the accuracy rate, recall rate, and F1 value of the proposed model were 88.4%, 88.36%, and 88.32%, respectively, which exhibited noticeable increases compared with those of typical deep learning models and modified convolutional neural networks. Comparisons of the classification outcomes for the mural dataset with those for other painting datasets and natural image datasets showed that the proposed model achieved stable classification outcomes with a powerful generalization capacity. The training time of the proposed model was only 0.7 s, and overfitting seldom occurred.
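The cross-entropy loss and a decaying learning rate for stochastic gradient descent, as used in the training loop described above, can be sketched as follows (the exponential decay schedule, its rate, and the step counts are illustrative assumptions, not the paper's values):

```python
import numpy as np

def cross_entropy(probs, label):
    """Loss for one sample given its softmax output and true class index."""
    return -np.log(probs[label] + 1e-12)

def decayed_lr(base_lr, step, decay_rate=0.96, decay_steps=100):
    """Exponentially decaying learning rate, one common 'adaptive' schedule."""
    return base_lr * decay_rate ** (step / decay_steps)

# Toy softmax output for a four-dynasty problem; the true label is dynasty 2.
probs = np.array([0.1, 0.2, 0.6, 0.1])
loss = cross_entropy(probs, 2)

lr0 = decayed_lr(0.01, 0)      # initial learning rate
lr500 = decayed_lr(0.01, 500)  # smaller rate later in training
```

Each SGD step would then update the softmax classifier's weights by the gradient of this loss scaled by the current learning rate.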


Author(s):  
Hiroaki Hashimoto ◽  
Seiji Kameda ◽  
Hitoshi Maezawa ◽  
Satoru Oshino ◽  
Naoki Tani ◽  
...  

To realize a brain–machine interface to assist swallowing, neural signal decoding is indispensable. Eight participants with temporal-lobe intracranial electrode implants for epilepsy were asked to swallow during electrocorticogram (ECoG) recording. Raw ECoG signals or the ECoG power in certain frequency bands were converted into images whose vertical axis was electrode number and whose horizontal axis was time in milliseconds; these images were used as training data. The data were classified with four labels (Rest, Mouth open, Water injection, and Swallowing). Deep transfer learning was carried out using AlexNet, with power in the high-γ band (75–150 Hz) as the training set. Accuracy reached 74.01%, sensitivity reached 82.51%, and specificity reached 95.38%. Notably, using the raw ECoG signals, the accuracy obtained was 76.95%, comparable to that of the high-γ power. We demonstrated that a version of AlexNet pre-trained with visually meaningful images can be used for transfer learning with visually meaningless images made up of ECoG signals. Moreover, we could achieve high decoding accuracy using the raw ECoG signals, allowing us to dispense with the conventional extraction of high-γ power. Thus, the images derived from the raw ECoG signals were equivalent to those derived from the high-γ band for deep transfer learning.
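The two kinds of network input described above, an electrode-by-time image of the raw signals and the high-γ (75–150 Hz) band power, can be sketched as follows (the electrode count, 1 kHz sampling rate, and synthetic signals are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n_electrodes, n_ms = 16, 1000              # 1 kHz sampling assumed
ecog = rng.normal(size=(n_electrodes, n_ms))

# "Image" whose vertical axis is electrode number and whose horizontal
# axis is time in milliseconds -- the raw-signal input. Scale to 8-bit
# so it can be treated like an ordinary grayscale picture.
image_8bit = np.uint8(255 * (ecog - ecog.min()) / (ecog.max() - ecog.min()))

# High-gamma (75-150 Hz) power per electrode via FFT -- the conventional
# feature the raw-signal images render unnecessary.
freqs = np.fft.rfftfreq(n_ms, d=1 / 1000)
spectrum = np.abs(np.fft.rfft(ecog, axis=1)) ** 2
band = (freqs >= 75) & (freqs <= 150)
high_gamma_power = spectrum[:, band].mean(axis=1)
```

In practice the grayscale image would additionally be resized and replicated to the three-channel input size AlexNet expects.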


2020 ◽  
Vol 10 (13) ◽  
pp. 4523 ◽  
Author(s):  
Laith Alzubaidi ◽  
Mohammed A. Fadhel ◽  
Omran Al-Shamma ◽  
Jinglan Zhang ◽  
J. Santamaría ◽  
...  

One of the main challenges of employing deep learning models in the field of medicine is a lack of training data due to difficulty in collecting and labeling data, which needs to be performed by experts. To overcome this drawback, transfer learning (TL) has been utilized to solve several medical imaging tasks using pre-trained state-of-the-art models from the ImageNet dataset. However, there are primary divergences in data features, sizes, and task characteristics between the natural image classification and the targeted medical imaging tasks. Therefore, TL can only slightly improve performance if the source domain is completely different from the target domain. In this paper, we explore the benefit of TL from the same and different domains of the target tasks. To do so, we designed a deep convolutional neural network (DCNN) model that integrates three ideas: traditional and parallel convolutional layers and residual connections, along with global average pooling. We trained the proposed model against several scenarios. We utilized same- and different-domain TL with the diabetic foot ulcer (DFU) classification task and with the animal classification task. We have empirically shown that TL from the same domain can significantly improve performance, even given a reduced number of images in the target dataset's domain. The proposed model achieved an F1-score of 86.6% on the DFU dataset when trained from scratch, 89.4% with TL from a domain different from the target dataset, and 97.6% with TL from the same domain as the target dataset.
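Two of the architectural ingredients named above, residual connections and global average pooling, can be sketched in a few lines (the tensor shapes and the toy transformation are illustrative):

```python
import numpy as np

def global_average_pool(feature_maps):
    """Collapse each (H, W) feature map to its mean: (C, H, W) -> (C,)."""
    return feature_maps.mean(axis=(1, 2))

def residual_block(x, f):
    """Residual connection: the block's output is f(x) plus the input x."""
    return f(x) + x

# Two 4x4 feature maps standing in for a convolutional layer's output.
fm = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
pooled = global_average_pool(fm)              # one value per channel
out = residual_block(fm, lambda t: 0.1 * t)   # toy f; really a conv stack
```

Global average pooling replaces large fully connected layers, which keeps the parameter count down, while residual connections let gradients bypass the transformation f during training.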


Author(s):  
Guokai Liu ◽  
Liang Gao ◽  
Weiming Shen ◽  
Andrew Kusiak

Condition monitoring and fault diagnosis are of great interest to the manufacturing industry. Deep learning algorithms have shown promising results in equipment prognostics and health management. However, their success has been hindered by excessive training time. In addition, deep learning algorithms face the domain adaptation dilemma encountered in dynamic application environments. The emerging concept of broad learning addresses both the training time and the domain adaptation issue. In this paper, a broad transfer learning algorithm is proposed for the classification of bearing faults. Data sampled at the same frequency are used to construct one- and two-dimensional training data sets to analyze the performance of the broad transfer and deep learning algorithms. A broad learning algorithm contains two main layers: an augmented feature layer and a classification layer. The broad learning algorithm with a sparse auto-encoder is employed to extract features. The optimal solution of a redefined cost function, with the sample size limited to ten per class in the target domain, gives the broad learning classifier its domain adaptation capability. The effectiveness of the proposed algorithm has been demonstrated on a benchmark dataset. Computational experiments have demonstrated superior efficiency and accuracy of the proposed algorithm over the deep learning algorithms tested.
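The two-layer structure of broad learning, an augmented feature layer followed by a classification layer solved in closed form, can be sketched as below; the random nonlinear mapping is only a stand-in for the sparse auto-encoder features used in the paper, and all sizes and labels are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))                  # 40 samples, 8 raw features
Y = np.eye(2)[rng.integers(0, 2, size=40)]    # one-hot labels, 2 fault classes

# Augmented feature layer: random nonlinear mappings of the input
# (a stand-in for the sparse auto-encoder features of the paper).
W_feat = rng.normal(size=(8, 20))
A = np.tanh(X @ W_feat)

# Classification layer: closed-form pseudo-inverse solution to the
# least-squares cost, which is what makes broad learning fast to train.
W_out = np.linalg.pinv(A) @ Y
pred = (A @ W_out).argmax(axis=1)
acc = (pred == Y.argmax(axis=1)).mean()
```

Because only W_out is solved, there is no iterative backpropagation, which is the source of the training-time advantage over the deep models discussed above.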


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Lokesh Singh ◽  
Rekh Ram Janghel ◽  
Satya Prakash Sahu

Purpose: The study aims to cope with the problems confronted in skin lesion datasets with limited training data for the classification of melanoma. The vital, challenging issue is the insufficiency of training data that arises when classifying lesions as melanoma and non-melanoma.
Design/methodology/approach: In this work, a transfer learning (TL) framework, Transfer Constituent Support Vector Machine (TrCSVM), is designed for melanoma classification based on feature-based domain adaptation (FBDA), leveraging the support vector machine (SVM) and Transfer AdaBoost (TrAdaBoost). The working of the framework is twofold. First, SVM is utilized for domain adaptation to learn a highly transferable representation between the source and target domains. In the first phase, for homogeneous domain adaptation, it augments features by transforming the data from the source and target (different but related) domains into a shared subspace. In the second phase, for heterogeneous domain adaptation, it leverages knowledge by augmenting features from the source to the target (different and unrelated) domains in a shared subspace. Second, TrAdaBoost is utilized to adjust the weights of wrongly classified data in the newly generated source and target datasets.
Findings: The experimental results empirically prove the superiority of TrCSVM over state-of-the-art TL methods on small datasets, with an accuracy of 98.82%.
Originality/value: Experiments are conducted on six skin lesion datasets, and performance is compared based on accuracy, precision, sensitivity, and specificity. The effectiveness of TrCSVM is evaluated on ten other datasets to test its generalization behavior. Its performance is also compared with two existing TL frameworks (TrResampling, TrAdaBoost) for the classification of melanoma.
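The TrAdaBoost reweighting step mentioned above, which down-weights misclassified source samples while up-weighting misclassified target samples, can be sketched as follows (this follows the standard TrAdaBoost update rule; the sample counts and error patterns are hypothetical):

```python
import numpy as np

def tradaboost_update(w_src, w_tgt, err_src, err_tgt, n_rounds):
    """One TrAdaBoost round. err_* are boolean arrays, True where the
    weak learner misclassified that sample."""
    # Weighted error rate on the target domain.
    eps = (w_tgt * err_tgt).sum() / w_tgt.sum()
    eps = min(eps, 0.499)                      # keep beta_t well defined
    beta_t = eps / (1 - eps)
    beta_src = 1 / (1 + np.sqrt(2 * np.log(len(w_src)) / n_rounds))
    w_src = w_src * beta_src ** err_src                   # shrink wrong source weights
    w_tgt = w_tgt * beta_t ** (-err_tgt.astype(float))    # grow wrong target weights
    return w_src, w_tgt

w_src, w_tgt = np.ones(6), np.ones(4)
err_src = np.array([True, False, True, False, False, False])
err_tgt = np.array([False, True, False, False])
w_src2, w_tgt2 = tradaboost_update(w_src, w_tgt, err_src, err_tgt, n_rounds=10)
```

Over successive rounds, source samples that keep disagreeing with the target task fade out of the training set, which is how the framework copes with small target datasets.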


2019 ◽  
Vol 8 (2S11) ◽  
pp. 3677-3680

Dog breed identification is a specific application of Convolutional Neural Networks (CNNs). Although image classification with CNNs is an efficient method, it still has a few drawbacks: CNNs require a large number of images as training data and considerable training time to achieve high classification accuracy. To reduce this substantial training time, we use Transfer Learning. In computer vision, transfer learning refers to the use of a pre-trained model to train a CNN; a model trained on a similar classification problem is reused to solve the classification problem at hand. In this project we use various pre-trained models, namely VGG16, Xception, and InceptionV3, trained over 1400 images covering 120 breeds, of which 16 dog breeds were used as training classes, and obtain bottleneck features from these pre-trained models. Finally, Logistic Regression, a multiclass classifier, is used to identify the breed of the dog from the images, obtaining validation accuracies of 91%, 94%, and 95% for the pre-trained models VGG16, Xception, and InceptionV3, respectively.
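The final stage of the pipeline above, a multiclass Logistic Regression classifier trained on fixed bottleneck features, can be sketched with synthetic features standing in for the real CNN outputs (the feature size, sample count, and training schedule are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_breeds, n_feat = 16, 32                    # 16 breed classes; feature size illustrative
X = rng.normal(size=(200, n_feat))           # stand-in bottleneck features
y = rng.integers(0, n_breeds, size=200)      # stand-in breed labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Multiclass logistic regression trained by gradient descent on the
# frozen bottleneck features -- no CNN weights are updated at this stage.
W = np.zeros((n_feat, n_breeds))
Y = np.eye(n_breeds)[y]
for _ in range(300):
    P = softmax(X @ W)
    W -= 0.1 * X.T @ (P - Y) / len(X)

train_acc = (softmax(X @ W).argmax(axis=1) == y).mean()
```

Because only this small classifier is trained, the expensive convolutional layers run once per image to produce features, which is where the time savings over training a CNN from scratch come from.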

