unlabeled sample
Recently Published Documents


TOTAL DOCUMENTS: 16 (FIVE YEARS: 8)

H-INDEX: 4 (FIVE YEARS: 2)

2021 ◽  
Vol 13 (17) ◽  
pp. 3539
Author(s):  
Chen Ding ◽  
Yu Li ◽  
Yue Wen ◽  
Mengmeng Zheng ◽  
Lei Zhang ◽  
...  

Deep neural networks have underpinned much of the recent progress in the field of hyperspectral image (HSI) classification owing to their powerful ability to learn discriminative features. However, training a deep neural network often requires the availability of a large number of labeled samples to mitigate over-fitting, and these labeled samples are not always available in practical applications. To adapt the deep neural network-based HSI classification approach to cases in which only a very limited number of labeled samples (i.e., few or even only one labeled sample) are provided, we propose a novel few-shot deep learning framework for HSI classification. In order to mitigate over-fitting, the framework borrows supervision from an auxiliary set of unlabeled samples with soft pseudo-labels to assist the training of the feature extractor on few labeled samples. By considering each labeled sample as a reference agent, the soft pseudo-label is assigned by computing the distances between the unlabeled sample and all agents. To demonstrate the effectiveness of the proposed method, we evaluate it on three benchmark HSI classification datasets. The results indicate that our method achieves better performance relative to existing competitors in few-shot and one-shot settings.
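The distance-based soft pseudo-labeling described above can be illustrated with a minimal sketch. It assumes one labeled agent per class, Euclidean distance in feature space, and a softmax over negative distances; the function name `soft_pseudo_label` and the `temperature` parameter are hypothetical, and the abstract does not specify the exact distance or normalization the authors use.

```python
import math

def soft_pseudo_label(sample, agents, temperature=1.0):
    """Return a probability vector over agents (assumed: one labeled agent per class)."""
    # Euclidean distance from the unlabeled sample to each labeled agent
    dists = [math.dist(sample, agent) for agent in agents]
    # Softmax over negative distances: closer agents receive higher probability
    logits = [-d / temperature for d in dists]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 2-D features; real inputs would be embeddings from the feature extractor
agents = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
label = soft_pseudo_label((1.0, 0.0), agents)
```

The resulting vector sums to one and puts most of its mass on the nearest agent, so it can serve as a soft training target instead of a hard class assignment.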


2021 ◽  
pp. 1-45
Author(s):  
Zhaohui Song ◽  
Sanyi Yuan ◽  
Zimeng Li ◽  
Shangxu Wang

Gas-bearing prediction of tight sandstone reservoirs is significant but challenging because the relationship between the gas-bearing property and its seismic response is nonlinear and complex. Although machine learning (ML) methods offer a potential solution, the major challenge in applying ML to gas-bearing prediction is generating accurate and interpretable models from limited training sets. The k-nearest neighbor (kNN) method is a supervised ML method that classifies an unlabeled sample according to its k neighboring labeled samples. We have introduced a kNN-based gas-bearing prediction method. The method automatically extracts a gas-sensitive attribute called the gas-indication local waveform similarity attribute (GLWSA) by combining prestack seismic gathers with interpreted gas-bearing curves. GLWSA uses the local waveform similarity between the samples to be predicted and the gas-bearing training samples to indicate the existence of an exploitable gas reservoir. GLWSA has simple principles and an explicit geophysical meaning. We use a numerical model and field data to test the effectiveness of our method. The results demonstrate that GLWSA characterizes the reservoir morphology and location well qualitatively. When the method is applied to the field data, we evaluate the performance with a blind well. The prediction result is consistent with the geologic law of the work area and reveals more detail than the root-mean-square attribute.
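As a reminder of how plain kNN classification works, here is a minimal sketch using majority voting over Euclidean neighbors. It is not the GLWSA method: the function `knn_predict`, the two-dimensional toy features, and the labels `"gas"` and `"no_gas"` are all hypothetical, whereas the paper works with waveform similarity on prestack seismic gathers.

```python
import math
from collections import Counter

def knn_predict(query, samples, labels, k=3):
    """Classify `query` by majority vote among its k nearest labeled samples."""
    # Rank training indices by Euclidean distance to the query
    order = sorted(range(len(samples)), key=lambda i: math.dist(query, samples[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy training set
train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9)]
train_y = ["gas", "gas", "gas", "no_gas", "no_gas"]
pred = knn_predict((0.1, 0.1), train_x, train_y, k=3)
```

Because the prediction depends only on stored labeled samples and a distance, the decision for any query can be traced back to specific neighbors, which is one reason kNN is considered interpretable.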


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Juan Xu ◽  
Pengfei Xu ◽  
Zhenchun Wei ◽  
Xu Ding ◽  
Lei Shi

In recent years, deep learning has become a popular topic in the intelligent fault diagnosis of industrial equipment. Under practical working conditions, realizing intelligent fault diagnosis across different mechanical components with very few labeled samples is a challenging problem. In other words, training on samples from one component and testing on samples from another remains unresolved. In this paper, we propose a deep convolutional nearest neighbor matching network (DC-NNMN) based on few-shot learning. A 1D convolution embedding network is constructed to extract high-dimensional fault features. The cosine distance is incorporated into the K-Nearest Neighbor method to model the distance distribution between unlabeled samples from the query set and labeled samples from the support set in the high-dimensional fault feature space. Multiple few-shot learning fault diagnosis tasks are constructed as the testing dataset, and the network parameters are then optimized through training on multiple tasks. Thus, a robust network model is obtained to classify unknown fault categories in different components with very few labeled fault samples. We use the CWRU bearing vibration dataset, bearing vibration data collected from a lab-built experimental platform, and another gearing vibration dataset in cross-component experiments to validate the proposed method. Experimental results show that the proposed method achieves fault diagnosis accuracies of 82.19% for gearing and 82.63% for bearings with only one sample of each fault category. The proposed DC-NNMN model provides a new approach to cross-component fault diagnosis in few-shot learning.
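The query-to-support matching step can be sketched as a nearest-neighbor search under cosine distance. This is only an illustration of that one step, assuming the embeddings have already been produced by some embedding network; the names `cosine_distance` and `match_query`, the three-dimensional vectors, and the fault labels are hypothetical and not taken from the paper.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are nonzero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def match_query(query_emb, support_embs, support_labels):
    """Return the label of the support embedding closest to the query."""
    dists = [cosine_distance(query_emb, s) for s in support_embs]
    best = min(range(len(dists)), key=dists.__getitem__)
    return support_labels[best], dists

# One-shot support set: one hypothetical embedding per fault category
support = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
labels = ["inner_race", "outer_race", "ball"]
pred, dists = match_query((0.9, 0.1, 0.0), support, labels)
```

Cosine distance ignores vector magnitude, so two embeddings that point in the same direction match even if their amplitudes differ.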


2020 ◽  
Vol 64 (1) ◽  
pp. 10503-1-10503-9
Author(s):  
Jie Xiao ◽  
Yunpeng Wang ◽  
Hua Su

A classification problem involving multi-class samples is typically divided into a set of two-class sub-problems. The pairwise probabilities produced by the binary classifiers are subsequently combined to generate a final result. However, only the binary classifiers that have been trained with the unknown real class of an unlabeled sample are relevant to the multi-class problem. A distance-based relative competence weighting (DRCW) combination mechanism can estimate the competence of the binary classifiers. In this work, we adapt the DRCW mechanism to the support vector machine (SVM) approach for the classification of remote sensing images. Applying DRCW allows the competence of a binary classifier to be estimated from the spectral information, which makes it possible to distinguish relevant from irrelevant binary classifiers. The SVM+DRCW classification approach is applied to analyze land-use/land-cover patterns in Guangzhou, China, using remotely sensed images from Landsat-5 TM and SPOT-5. The results show that the SVM+DRCW approach achieves higher classification accuracies than the conventional SVM and SVMs combined with other combination mechanisms such as weighted voting (WV) and probability estimates by pairwise coupling (PE).
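One way to picture distance-based competence weighting in a one-vs-one scheme is the sketch below. The abstract does not give the weighting formula, so the weight used here, `d_j^2 / (d_i^2 + d_j^2)` with `d_c` the mean distance from the query to its k nearest training samples of class `c`, is an assumption, as are the function names, the toy samples, and the pairwise probabilities.

```python
import math

def class_distances(query, samples, labels, k=2):
    """Mean distance from the query to its k nearest neighbors within each class."""
    result = {}
    for cls in set(labels):
        dists = sorted(math.dist(query, x) for x, y in zip(samples, labels) if y == cls)
        nearest = dists[:k]
        result[cls] = sum(nearest) / len(nearest)
    return result

def weighted_pairwise_vote(pair_probs, dist):
    """pair_probs[(i, j)] is the probability of class i in the i-vs-j classifier."""
    scores = {cls: 0.0 for cls in dist}
    for (i, j), p in pair_probs.items():
        # Assumed weight: close to 1 when the query is much nearer class i than class j
        # (undefined if the query coincides with samples of both classes)
        w_ij = dist[j] ** 2 / (dist[i] ** 2 + dist[j] ** 2)
        scores[i] += w_ij * p
        scores[j] += (1.0 - w_ij) * (1.0 - p)
    return max(scores, key=scores.get), scores

samples = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
labels = ["water", "water", "urban", "urban", "forest", "forest"]
dist = class_distances((0.5, 0.5), samples, labels)
# The urban-vs-forest classifier is confident but irrelevant for a water sample
pair_probs = {("water", "urban"): 0.6, ("water", "forest"): 0.7, ("urban", "forest"): 0.9}
winner, scores = weighted_pairwise_vote(pair_probs, dist)
```

In this toy case an unweighted sum of pairwise probabilities ties "water" and "urban" at 1.3 each, while the distance weights suppress the irrelevant urban-vs-forest classifier and leave "water" clearly ahead.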


Open Physics ◽  
2019 ◽  
Vol 17 (1) ◽  
pp. 975-983
Author(s):  
Jianhua Zhao ◽  
Ning Liu

In practical applications, there is a large amount of imbalanced data that contains only a small number of labeled samples. To improve classification performance on this kind of problem, this paper proposes a semi-supervised learning algorithm based on mixed sampling for imbalanced data classification (S2MAID), which combines semi-supervised learning, over-sampling, under-sampling and ensemble learning. First, an under-sampling algorithm, UD-density, is provided to select samples with high information content from the majority class set for semi-supervised learning. Second, a safe supervised-learning method is used to label unlabeled samples and expand the labeled set. Third, an over-sampling algorithm, SMOTE-density, is provided to balance the imbalanced data set. Fourth, an ensemble technique is used to generate a strong classifier. Finally, experiments are carried out on imbalanced data containing only a few labeled samples, and the semi-supervised learning process is simulated. The experimental results show that the proposed S2MAID achieves better classification performance.
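The over-sampling step can be pictured with a plain SMOTE-style interpolation, sketched below. This is the generic technique, not the SMOTE-density variant named in the abstract, whose density criterion is not described here; the function `smote_like` and its parameters are hypothetical, and the sketch assumes at least two minority samples.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by interpolating toward near neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point, excluding the point itself
        neighbors = sorted((p for p in minority if p is not base),
                           key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbors)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic

minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
new_points = smote_like(minority, n_new=4)
```

Each synthetic point lies on the segment between two existing minority samples, so the minority class grows without leaving the region it already occupies.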


2019 ◽  
Vol 20 (S25) ◽  
Author(s):  
Ye Wang ◽  
Changqing Mei ◽  
Yuming Zhou ◽  
Yan Wang ◽  
Chunhou Zheng ◽  
...  

Background: The recognition of protein interaction sites is of great significance in many biological processes, signaling pathways and drug designs. However, most sites on protein sequences cannot be defined as interface or non-interface sites because only a small fraction of protein interactions have been identified, which limits the prediction accuracy and generalization ability of predictors of protein interaction sites. It is therefore necessary to improve the prediction of protein interaction sites by using large amounts of unlabeled data together with small amounts of labeled data and background knowledge. Results: In this work, three semi-supervised support vector machine-based methods are proposed to improve protein interaction site prediction by incorporating information from unlabeled protein sites. Five features related to the evolutionary conservation of amino acids are extracted from the HSSP database and the ConSurf Server to represent the residues within a protein sequence: residue spatial sequence spectrum, residue sequence information entropy and relative entropy, residue sequence conserved weight, and residual base evolution rate. Three predictors are then built to identify interface residues on the protein surface using three types of semi-supervised support vector machine algorithms. Conclusion: The experimental results demonstrate that the semi-supervised approaches can effectively improve the prediction of protein interaction sites when unlabeled information is incorporated into the predictors; the best of them achieves an accuracy of 70.7%, a sensitivity of 62.67% and a specificity of 78.72%. Compared with existing studies, the semi-supervised models show improved prediction performance.


2019 ◽  
Vol 11 (16) ◽  
pp. 1933 ◽  
Author(s):  
Yangyang Li ◽  
Ruoting Xing ◽  
Licheng Jiao ◽  
Yanqiao Chen ◽  
Yingte Chai ◽  
...  

Polarimetric synthetic aperture radar (PolSAR) image classification is a recent technology with great practical value in the field of remote sensing. However, due to the time-consuming and labor-intensive data collection, there are few labeled datasets available. Furthermore, most available state-of-the-art classification methods heavily suffer from the speckle noise. To solve these problems, in this paper, a novel semi-supervised algorithm based on self-training and superpixels is proposed. First, the Pauli-RGB image is over-segmented into superpixels to obtain a large number of homogeneous areas. Then, features that can mitigate the effects of the speckle noise are obtained using spatial weighting in the same superpixel. Next, the training set is expanded iteratively utilizing a semi-supervised unlabeled sample selection strategy that elaborately makes use of spatial relations provided by superpixels. In addition, a stacked sparse auto-encoder is self-trained using the expanded training set to obtain classification results. Experiments on two typical PolSAR datasets verified its capability of suppressing the speckle noise and showed excellent classification performance with limited labeled data.
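The iterative expansion of the training set can be sketched as a generic self-training loop. The sketch swaps in a nearest-centroid classifier and a distance-margin confidence rule purely for illustration; the paper instead uses superpixel spatial relations for sample selection and a stacked sparse auto-encoder as the classifier. The names `centroids`, `self_train`, and `margin` are hypothetical.

```python
import math

def centroids(samples, labels):
    """Per-class mean of the labeled samples."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(v / counts[y] for v in acc) for y, acc in sums.items()}

def self_train(labeled, labels, unlabeled, margin=1.0, rounds=5):
    """Move confidently classified unlabeled samples into the training set."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(rounds):
        cents = centroids(labeled, labels)
        remaining = []
        for x in pool:
            ranked = sorted((math.dist(x, c), y) for y, c in cents.items())
            # Confident when the nearest centroid beats the runner-up by `margin`
            if len(ranked) > 1 and ranked[1][0] - ranked[0][0] >= margin:
                labeled.append(x)
                labels.append(ranked[0][1])
            else:
                remaining.append(x)
        if len(remaining) == len(pool):
            break  # nothing new was added, so further rounds would not change anything
        pool = remaining
    return labeled, labels, pool

new_x, new_y, leftover = self_train(
    [(0.0, 0.0), (10.0, 0.0)], ["a", "b"],
    [(1.0, 0.0), (9.0, 0.0), (5.0, 0.0)],
)
```

Ambiguous samples, such as the midpoint in this toy example, stay in the unlabeled pool rather than receiving a possibly wrong pseudo-label.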


Author(s):  
Bingbing Jiang ◽  
Xingyu Wu ◽  
Kui Yu ◽  
Huanhuan Chen

With the increasing data dimensionality, feature selection has become a fundamental task to deal with high-dimensional data. Semi-supervised feature selection focuses on the problem of how to learn a relevant feature subset in the case of abundant unlabeled data with few labeled data. In recent years, many semi-supervised feature selection algorithms have been proposed. However, these algorithms are implemented by separating the processes of feature selection and classifier training, such that they cannot simultaneously select features and learn a classifier with the selected features. Moreover, they ignore the difference of reliability inside unlabeled samples and directly use them in the training stage, which might cause performance degradation. In this paper, we propose a joint semi-supervised feature selection and classification algorithm (JSFS) which adopts a Bayesian approach to automatically select the relevant features and simultaneously learn a classifier. Instead of using all unlabeled samples indiscriminately, JSFS associates each unlabeled sample with a self-adjusting weight to distinguish the difference between them, which can effectively eliminate the irrelevant unlabeled samples via introducing a left-truncated Gaussian prior. Experiments on various datasets demonstrate the effectiveness and superiority of JSFS.


Author(s):  
Qianying Wang ◽  
Ming Lu ◽  
Junhong Li

The semi-supervised boosting strategy aims to improve the performance of a given classifier using a multitude of unlabeled data. In semi-supervised boosting, a similarity measure is needed to select unlabeled samples, each of which is then assigned a pseudo label. A good similarity measure helps assign more appropriate pseudo labels to unlabeled samples. The selected samples, together with their pseudo labels, then serve as labeled samples to train the new component classifier. Similarity is therefore important in semi-supervised boosting. The Gaussian kernel similarity [Formula: see text] is used in semi-supervised boosting. It has two drawbacks: first, the Euclidean distance [Formula: see text] cannot characterize the complicated relationships between data samples; second, the parameter [Formula: see text] needs to be set carefully. This paper therefore proposes a novel adaptive similarity based on sparse representation for semi-supervised boosting. Our sparse representation is learned from a “clean” dictionary, which is a low-rank matrix obtained from the sample matrix. We evaluate the proposed method on the COIL20 database. Experimental results show that the semi-supervised boosting algorithm with sparse representation similarity outperforms the algorithm with Gaussian kernel similarity.


Nanophotonics ◽  
2017 ◽  
Vol 7 (2) ◽  
pp. 489-495 ◽  
Author(s):  
Anna Labno ◽  
Christopher Gladden ◽  
Jeongmin Kim ◽  
Dylan Lu ◽  
Xiaobo Yin ◽  
...  

Three-dimensional (3D) imaging at the nanoscale is key to understanding nanomaterials and complex systems. While scanning probe microscopy (SPM) has been the workhorse of nanoscale metrology, the slow scanning speed of a single probe tip can limit the application of SPM to wide-field imaging of complex 3D nanostructures. Both electron microscopy and optical tomography allow 3D imaging, but they are limited to use in a vacuum environment due to electron scattering and to optical resolution at micron scales, respectively. Here we demonstrate plasmonic Brownian microscopy (PBM) as a way to improve the imaging speed of SPM. Unlike photonic force microscopy, where a single trapped particle is used for serial scanning, PBM utilizes a massive number of plasmonic nanoparticles (NPs) under Brownian diffusion in solution to scan in parallel around the unlabeled sample object. The motion of NPs under an evanescent field is three-dimensionally localized to reconstruct the super-resolution topology of 3D dielectric objects. Our method allows high-throughput imaging of complex 3D structures over a large field of view, even with internal structures such as cavities that cannot be accessed by conventional mechanical tips in SPM.

