The manifold embedded selective pseudo-labeling algorithm and transfer learning of small sample dataset

Author(s):  
Yaoli WANG ◽  
Xiaohui LIU ◽  
Bin LI ◽  
Qing CHANG

Samples for classification and identification tasks in special scenes are hard to obtain, which leads to a shortage of training data. Current research focuses on how to use source domain data (or auxiliary domain data) to build domain adaptation transfer learning models and thereby improve the classification accuracy and performance of small-sample machine learning in these special and difficult scenes. In this paper, a deep convolution and Grassmann manifold embedded selective pseudo-labeling algorithm (DC-GMESPL) is proposed to enable transfer learning classification among multiple small-sample datasets. Firstly, to address the lack of local sample data for forest fire smoke video images in the target domain, the DC-GMESPL algorithm uses satellite remote sensing image sample data as the source domain and extracts smoke features simultaneously from both the source domain and the target domain with a ResNet50 deep transfer network. Secondly, the DC-GMESPL algorithm aligns the source domain feature distribution with the target domain feature distribution: the distance between the two distributions is minimized by removing the correlation among the source domain features and then re-correlating them with the target domain. The target domain data are then pseudo-labeled by a selective pseudo-labeling algorithm in Grassmann manifold space. Finally, a trainable model is constructed to complete the transfer classification between small-sample datasets. The model is evaluated by transfer learning between satellite remote sensing image and video image datasets. Experiments show that the transfer accuracy of DC-GMESPL is higher than that of DC-CMEDA, Easy TL, CMMS and SPL. Compared with our former DC-CMEDA, the transfer accuracy of DC-GMESPL from satellite remote sensing images to video images is improved by 0.50%, and the transfer accuracy from video images to satellite remote sensing images is improved by 8.50%.
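
The alignment step described in this abstract (removing the correlation among source features, then re-correlating them with the target domain) reads like correlation alignment by whitening and re-coloring. The NumPy sketch below illustrates that idea only; it is an assumption about the general technique, not the authors' DC-GMESPL implementation, and it leaves out the ResNet50 feature extraction and the Grassmann manifold pseudo-labeling. The names `coral_align` and `eps` are hypothetical.

```python
import numpy as np


def _sym_matrix_power(mat, power):
    """Raise a symmetric positive-definite matrix to a real power."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * eigvals ** power) @ eigvecs.T


def coral_align(source, target, eps=1.0):
    """Whiten source features, then re-color them with the target covariance.

    source, target: arrays of shape (n_samples, n_features).
    eps: ridge term added to both covariances for numerical stability
         (an assumed default, not a value taken from the abstract).
    """
    dim = source.shape[1]
    cov_s = np.cov(source, rowvar=False) + eps * np.eye(dim)
    cov_t = np.cov(target, rowvar=False) + eps * np.eye(dim)
    whitened = source @ _sym_matrix_power(cov_s, -0.5)   # remove source correlation
    return whitened @ _sym_matrix_power(cov_t, 0.5)      # re-correlate with target
```

After this transformation, the second-order statistics of the source features approximately match those of the target, so a classifier trained on the aligned source features sees inputs distributed more like the target domain.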

2019 ◽  
Vol 11 (11) ◽  
pp. 1358 ◽  
Author(s):  
Xingli Qin ◽  
Jie Yang ◽  
Pingxiang Li ◽  
Weidong Sun ◽  
Wei Liu

The combination of transfer learning and remote sensing image processing technology can effectively improve the automation level of image information extraction from a remote sensing time series. However, in the processing of polarimetric synthetic aperture radar (PolSAR) time-series images, the existing transfer learning methods often cannot make full use of the time-series information of the images and rely too much on the labeled samples in the target domain. Furthermore, the speckle noise inherent in synthetic aperture radar (SAR) imagery aggravates the difficulty of the manual selection of labeled samples, so these methods have difficulty in meeting the processing requirements of large data volumes and high efficiency. In view of these problems and the spatio-temporal relational knowledge of objects in time-series images, this paper introduces the theory of time-series clustering and proposes a new three-phase time-series clustering algorithm. Because it makes full use of the inherent characteristics of the PolSAR images, this algorithm can accurately transfer the labels of the source domain samples to those samples that have not changed in the whole time series without relying on the target domain labeled samples, so as to realize transductive sample label transfer for PolSAR time-series images. Experiments were carried out using three different sets of PolSAR time-series images and the proposed method was compared with two of the existing methods. The experimental results showed that the transfer precision of the proposed method reaches a high level with different data and different objects and that it performs significantly better than the existing methods. With strong reliability and practicability, the proposed method can provide a new solution for the rapid information extraction of remote sensing image time series.


Biostatistics ◽  
2020 ◽  
Author(s):  
Abhirup Datta ◽  
Jacob Fiksel ◽  
Agbessi Amouzou ◽  
Scott L Zeger

Computer-coded verbal autopsy (CCVA) algorithms predict cause of death from high-dimensional family questionnaire data (verbal autopsy) of a deceased individual, which are then aggregated to generate national and regional estimates of cause-specific mortality fractions. These estimates may be inaccurate if CCVA is trained on non-local training data different from the local population of interest. This problem is a special case of transfer learning, i.e., improving classification within a target domain (e.g., a particular population) with the classifier trained in a source domain. Most transfer learning approaches concern individual-level (e.g., a person’s) classification. Social and health scientists such as epidemiologists are often more interested in understanding etiological distributions at the population level. The sample sizes of their data sets are typically orders of magnitude smaller than those used for common transfer learning applications like image classification, document identification, etc. We present a parsimonious hierarchical Bayesian transfer learning framework to directly estimate population-level class probabilities in a target domain, using any baseline classifier trained on the source domain and a small labeled target-domain dataset. To address small sample sizes, we introduce a novel shrinkage prior for the transfer error rates guaranteeing that, in the absence of any labeled target-domain data or when the baseline classifier is perfectly accurate, our transfer learning agrees with direct aggregation of predictions from the baseline classifier, thereby subsuming the default practice as a special case. We then extend our approach to use an ensemble of baseline classifiers producing a unified estimate. Theoretical and empirical results demonstrate how the ensemble model favors the most accurate baseline classifier. We present data analyses demonstrating the utility of our approach.
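
The limiting behavior stated in this abstract (with no labeled target data, or with a perfectly accurate baseline classifier, the estimate equals direct aggregation of baseline predictions) can be illustrated with a much simpler point-estimate analogue. The sketch below is not the authors' hierarchical Bayesian model: it replaces the shrinkage prior with diagonal pseudo-counts that pull an estimated error-rate matrix toward the identity, and it solves a linear system instead of sampling a posterior. All function names and the `prior_strength` parameter are hypothetical.

```python
import numpy as np


def shrunk_error_matrix(true_labels, predicted_labels, n_classes, prior_strength=5.0):
    """Estimate M[i, j] = P(predicted = j | true = i) from labeled target data.

    Diagonal pseudo-counts shrink each row toward the identity, so with no
    labeled data the matrix is exactly the identity.
    """
    true_labels = np.asarray(true_labels, dtype=int)
    predicted_labels = np.asarray(predicted_labels, dtype=int)
    counts = np.zeros((n_classes, n_classes))
    np.add.at(counts, (true_labels, predicted_labels), 1.0)
    counts += prior_strength * np.eye(n_classes)
    return counts / counts.sum(axis=1, keepdims=True)


def population_class_fractions(predicted_unlabeled, error_matrix):
    """Correct aggregated baseline predictions for the estimated error rates."""
    n_classes = error_matrix.shape[0]
    predicted_unlabeled = np.asarray(predicted_unlabeled, dtype=int)
    # q: class fractions obtained by directly aggregating baseline predictions.
    q = np.bincount(predicted_unlabeled, minlength=n_classes) / len(predicted_unlabeled)
    # Solve M^T p = q for the true class fractions p, then project to the simplex.
    p, *_ = np.linalg.lstsq(error_matrix.T, q, rcond=None)
    p = np.clip(p, 0.0, None)
    return p / p.sum()
```

When the error matrix is the identity, the solve returns the aggregated fractions unchanged, which mirrors the "default practice as a special case" property described in the abstract.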


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Jun He ◽  
Xiang Li ◽  
Yong Chen ◽  
Danfeng Chen ◽  
Jing Guo ◽  
...  

In mechanical fault diagnosis, it is impossible in real industrial settings to collect massive numbers of labeled samples that share the same distribution. Transfer learning, a promising method, is usually used to address this critical problem. However, as the number of samples increases, the interdomain distribution discrepancy measurement of the existing methods has a higher computational complexity, which may worsen the generalization ability of the method. To solve the problem, we propose a deep transfer learning method based on 1D-CNN for rolling bearing fault diagnosis. First, a one-dimensional convolutional neural network (1D-CNN), as the basic framework, is used to extract features from the vibration signal. CORrelation ALignment (CORAL) is employed to minimize the marginal distribution discrepancy between the source domain and the target domain. Then, the cross-entropy loss function and the Adam optimizer are used to minimize the classification errors and the second-order statistics of feature distance between the source domain and target domain, respectively. Finally, based on the bearing datasets of Case Western Reserve University and Jiangnan University, seven transfer fault diagnosis comparison experiments are carried out. The results show that our method has better performance.
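
The objective this abstract describes, a classification loss combined with a CORAL term on second-order feature statistics, can be sketched as follows. This assumes the commonly used CORAL loss (squared Frobenius norm of the covariance difference, scaled by 4d²) and a simple weighted sum; the abstract does not give the exact formulation, network, or weighting, so `coral_loss`, `cross_entropy`, `total_loss`, and `weight` are illustrative names only. The 1D-CNN and the Adam update are omitted.

```python
import numpy as np


def coral_loss(source_feats, target_feats):
    """Squared Frobenius distance between feature covariances, scaled by 4*d^2."""
    dim = source_feats.shape[1]
    cov_s = np.cov(source_feats, rowvar=False)
    cov_t = np.cov(target_feats, rowvar=False)
    return np.sum((cov_s - cov_t) ** 2) / (4.0 * dim * dim)


def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def total_loss(logits, labels, source_feats, target_feats, weight=1.0):
    """Classification loss on labeled source data plus weighted CORAL term."""
    return cross_entropy(logits, labels) + weight * coral_loss(source_feats, target_feats)
```

Because the CORAL term only needs two covariance matrices per batch, its cost depends on the feature dimension rather than on pairwise comparisons between samples.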


Author(s):  
Y. Xu ◽  
X. Hu ◽  
Y. Wei ◽  
Y. Yang ◽  
D. Wang

The demand for timely information about the earth’s surface, such as land cover and land use (LC/LU), is consistently increasing. Machine learning methods show their advantage in collecting such information from remotely sensed images, but they require sufficient training samples. For satellite remote sensing images, however, sample datasets covering a large scope are still limited. Most existing sample datasets for satellite remote sensing images are built from a few image frames located in a local area. For a large-scope (national-level) view, choosing a sufficient and unbiased sampling method is crucial for constructing a balanced training sample dataset. Dependable spatial sample locations that consider the spatial heterogeneity of land cover are needed for choosing sample images. This paper introduces ongoing work on establishing a national-scope sample dataset for high spatial-resolution satellite remote sensing image processing. Sufficient sample sites have been chosen using a spatial sampling method, and the divided sample patches have been grouped using a clustering method for further use. A neural network model for road detection trained on a subset of our dataset shows increased performance in both completeness and accuracy compared with two widely used public datasets.


2019 ◽  
Vol 56 (11) ◽  
pp. 111001
Author(s):  
贺琪 Qi He ◽  
李瑶 Yao Li ◽  
宋巍 Wei Song ◽  
黄冬梅 Dongmei Huang ◽  
何盛琪 Shengqi He ◽  
...  

2019 ◽  
Vol 12 (1) ◽  
pp. 86 ◽  
Author(s):  
Rafael Pires de Lima ◽  
Kurt Marfurt

Remote-sensing image scene classification can provide significant value, ranging from forest fire monitoring to land-use and land-cover classification. From the first aerial photographs of the early 20th century to the satellite imagery of today, the amount of remote-sensing data has increased geometrically, with ever higher resolution. The need to analyze these modern digital data motivated research to accelerate remote-sensing image classification. Fortunately, great advances have been made by the computer vision community to classify natural images or photographs taken with an ordinary camera. Natural image datasets can range up to millions of samples and are, therefore, amenable to deep-learning techniques. Many fields of science, remote sensing included, were able to exploit the success of natural image classification by convolutional neural network models using a technique commonly called transfer learning. We provide a systematic review of transfer learning applications for scene classification using different datasets and different deep-learning models. We evaluate how the specialization of convolutional neural network models affects the transfer learning process by splitting original models at different points. As expected, we find the choice of hyperparameters used to train the model has a significant influence on the final performance of the models. Curiously, we find transfer learning from models trained on larger, more generic natural image datasets outperformed transfer learning from models trained directly on smaller remotely sensed datasets. Nonetheless, results show that transfer learning provides a powerful tool for remote-sensing scene classification.

