Semi-supervised learning using multiple one-dimensional embedding based adaptive interpolation

Author(s):  
Jianzhong Wang

We propose a novel semi-supervised learning (SSL) scheme using adaptive interpolation on multiple one-dimensional (1D) embedded data. For a given high-dimensional dataset, we smoothly map it onto several different 1D sequences, so that the labeled subset is converted to a 1D subset for each of these sequences. Applying cubic interpolation to the labeled subset, we obtain a subset of unlabeled points that are assigned the same label in all interpolations. Selecting a proportion of these points at random and adding them to the current labeled subset, we build a larger labeled subset for the next interpolation. Repeating the embedding and interpolation, we enlarge the labeled subset gradually and finally reach a labeled set of reasonably large size, on which the final classifier is constructed. We explore the use of the proposed scheme in the classification of handwritten digits and show promising results.
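
As a rough illustration of the embed-then-interpolate idea, the following Python sketch orders points along a chain with a greedy nearest-neighbor walk and interpolates +1/-1 labels along that chain. The greedy walk, the piecewise-linear interpolation (used here in place of the cubic interpolation named in the abstract), and the function names are assumptions made for illustration, not the author's implementation.

```python
import math


def greedy_1d_order(points, start=0):
    """Order points along a chain by hopping to the nearest unvisited point.

    Hypothetical stand-in for a smooth 1D embedding; ties go to the lower index.
    """
    remaining = set(range(len(points)))
    remaining.remove(start)
    order = [start]
    while remaining:
        last = points[order[-1]]
        nxt = min(remaining, key=lambda j: (math.dist(last, points[j]), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def interpolate_labels(order, labels):
    """Interpolate +1/-1 labels linearly along the chain positions.

    labels maps point index -> +1 or -1. Returns the sign of the interpolated
    value for every unlabeled point (0 if the value is exactly zero).
    """
    pos = {idx: t for t, idx in enumerate(order)}
    known = sorted((pos[i], y) for i, y in labels.items())
    out = {}
    for idx in order:
        if idx in labels:
            continue
        t = pos[idx]
        left = [(p, y) for p, y in known if p < t]
        right = [(p, y) for p, y in known if p > t]
        if left and right:
            (p0, y0), (p1, y1) = left[-1], right[0]
            value = y0 + (y1 - y0) * (t - p0) / (p1 - p0)
        else:
            # Outside the labeled range: copy the nearest labeled value.
            value = (left[-1] if left else right[0])[1]
        out[idx] = 1 if value > 0 else -1 if value < 0 else 0
    return out


points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
order = greedy_1d_order(points)
predicted = interpolate_labels(order, {0: 1, 5: -1})
```

In the scheme described above, several such orderings would be built, and only points that receive the same label in all of them would be candidates for joining the labeled subset.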

Author(s):  
Jianzhong Wang

This paper continues the development of the multiple 1D-embedding-based (or 1D multi-embedding) methods for semi-supervised learning, which were preliminarily introduced by the author in [J. Wang, Semi-supervised learning using multiple one-dimensional embedding based adaptive interpolation, Int. J. Wavelets, Multiresolut. Inf. Process. 14(2) (2016) 11 pp.]. It puts the development in a more general framework and creates a new method, which employs the ensemble technique to integrate multiple 1D embedding-based regularization and label boosting for semi-supervised learning (SSL). The method combines parallel ensemble and serial ensemble. In each stage of the parallel ensemble, the dataset is first smoothly mapped onto multiple 1D sequences. On each 1D embedded dataset, a classical regularization method is applied to construct a weak classifier. All of these weak classifiers are then integrated into an ensemble of 1D labelers (E1DL), which, together with a nearest neighbor cluster (NNC) algorithm, extracts a newborn labeled subset from the unlabeled set. This subset is believed to be correctly labeled with high confidence, so it joins the original labeled set for the next learning stage. Repeating this process, we gradually obtain a boosted labeled set, and the process does not stop until the updated labeled set reaches a certain size. Finally, we use E1DL to build the target classifier, which labels all points of the dataset. In this paper, we also fix universal parameters for all experiments, so that the algorithm is effectively parameter-free. The validity of our method in the classification of handwritten digits is confirmed by several experiments. Compared with several other popular SSL methods, our results are very promising.


Author(s):  
Luoqing Li ◽  
Chuanwu Yang ◽  
Qiwei Xie

In this paper, we propose a novel semi-supervised multi-category classification method based on one-dimensional (1D) multi-embedding. Using the multiple 1D embedding based interpolation technique, we embed the high-dimensional data into several different 1D manifolds and first perform binary classification. We then construct the multi-category classifiers by means of the one-versus-rest and one-versus-one strategies separately. A weight strategy is employed in our algorithm to improve the classification performance. The proposed method shows promising results in the classification of handwritten digits and facial images.
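
The two multi-category strategies mentioned above can be sketched generically in Python. The function names and the inputs (`scores`, `pairwise_winner`) are hypothetical; the sketch does not include the weight strategy or the 1D-embedding-based binary classifiers from the abstract, only the way binary decisions are combined.

```python
from itertools import combinations


def one_versus_rest(scores):
    """Pick the class whose 'this class vs. the rest' score is largest.

    scores maps class -> real-valued output of a binary classifier
    (assumed: larger means more confident). Ties go to the smallest class.
    """
    return max(sorted(scores), key=lambda c: scores[c])


def one_versus_one(classes, pairwise_winner):
    """Run every pairwise duel and return the class with the most wins.

    pairwise_winner(a, b) is an assumed callable returning a or b.
    Ties go to the smallest class.
    """
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise_winner(a, b)] += 1
    return max(sorted(votes), key=lambda c: votes[c])


ovr_choice = one_versus_rest({0: -0.3, 1: 0.8, 2: 0.1})
# Toy duel rule for illustration: the class closer to 7 wins.
ovo_choice = one_versus_one([3, 7, 9], lambda a, b: min(a, b, key=lambda c: abs(c - 7)))
```

One-versus-rest needs one binary classifier per class, while one-versus-one needs one per pair of classes, which is the usual trade-off between the two.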


Author(s):  
Yalong Song ◽  
Hong Li ◽  
Jianzhong Wang ◽  
Kit Ian Kou

In this paper, we present a novel multiple 1D-embedding based clustering (M1DEBC) scheme for hyperspectral image (HSI) classification. This clustering scheme is an iterative algorithm built on 1D-embedding based regularization, which was first proposed by J. Wang [Semi-supervised learning using ensembles of multiple 1D-embedding-based label boosting, Int. J. Wavelets, Multiresolut. Inf. Process. 14(2) (2016) 33 pp.; Semi-supervised learning using multiple one-dimensional embedding-based adaptive interpolation, Int. J. Wavelets, Multiresolut. Inf. Process. 14(2) (2016) 11 pp.]. Each iteration of the algorithm consists of three steps. First, we construct a 1D multi-embedding, which contains [Formula: see text] different versions of 1D embedding. Each of them is realized by an isometric mapping that maps all the pixels in an HSI onto a line such that the sum of the distances between adjacent pixels in the original space is minimized. Second, for each 1D embedding, we use the regularization method to find a pre-classifier that gives each unlabeled sample a preliminary label. If all of the [Formula: see text] different versions of regularization vote for the same preliminary label, we call the sample a feasible confident sample. All the feasible confident samples and their corresponding labels constitute the auxiliary set. We randomly select some of the elements of the auxiliary set to construct the newborn labeled set. Finally, we add the newborn labeled set to the labeled sample set. Thus, the labeled sample set is gradually enlarged over the iterations. The iteration terminates when the updated labeled set reaches a certain size. Our experimental results on real hyperspectral datasets confirm the effectiveness of the proposed scheme.
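
The voting and enlargement steps of one iteration can be sketched in Python as follows. This is a minimal illustration under assumptions: the preliminary labels are taken as given (the 1D embedding and regularization steps are not implemented), and the names `feasible_confident_samples`, `enlarge_labeled_set`, and `fraction` are hypothetical.

```python
import random


def feasible_confident_samples(votes_per_embedding):
    """Keep the samples on which every embedding's pre-classifier agrees.

    votes_per_embedding is a list of dicts (one per 1D embedding),
    each mapping sample id -> preliminary label.
    """
    first, *rest = votes_per_embedding
    return {s: y for s, y in first.items() if all(v.get(s) == y for v in rest)}


def enlarge_labeled_set(labeled, votes_per_embedding, fraction, rng):
    """Add a random fraction of the auxiliary set to the labeled set."""
    auxiliary = feasible_confident_samples(votes_per_embedding)
    k = int(fraction * len(auxiliary))
    chosen = rng.sample(sorted(auxiliary), k)  # the newborn labeled set
    updated = dict(labeled)
    updated.update({s: auxiliary[s] for s in chosen})
    return updated


votes = [
    {10: "a", 11: "b", 12: "a", 13: "b"},
    {10: "a", 11: "b", 12: "b", 13: "b"},
    {10: "a", 11: "a", 12: "a", 13: "b"},
]
auxiliary = feasible_confident_samples(votes)
updated = enlarge_labeled_set({0: "a"}, votes, fraction=1.0, rng=random.Random(0))
```

In a full loop, this update would be repeated with fresh embeddings until the labeled set reaches the target size.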


Author(s):  
Dan Luo ◽  
Xili Wang

Background: Semi-supervised learning has received widespread attention in the machine learning community. It can use a small number of tagged samples together with a large number of untagged samples for efficient learning. Methods: In 2014, Kim proposed a new semi-supervised learning method, the minimax label propagation (MMLP) method. This method reduces time complexity to O(n), with a smaller computation cost and stronger classification ability than traditional methods. However, its classification results are not accurate for large-scale image classification. Thus, in this paper, we propose a semi-supervised image classification method based on the MMLP algorithm. The main idea is threefold: (1) improving the connectivity of image pixels by pixel sampling, which reduces the image size and, at the same time, the diversity of image characteristics; (2) using a recall feature to improve the MMLP algorithm; (3) obtaining, through classification mapping, the classification of the original data from the classification of the reduced data. Results: Our algorithm also obtains a minimax path from untagged samples to tagged samples. The experimental results show that the algorithm is applicable to semi-supervised learning on small-size images and that it also yields better classification results for large-size images. Conclusion: Considering the connectivity of the neighboring matrix and the diversity of the characteristics, we used the mean-shift clustering algorithm; next, we will apply fuzzy energy clustering in our algorithm and study the function of these paths.
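
To make the notion of a minimax path concrete, the Python sketch below labels each node of a weighted graph with the label of the tagged node it can reach along the path whose largest edge weight is smallest. It uses a Dijkstra-style search with `max` in place of addition; this is an assumption chosen for brevity and is not the O(n) procedure attributed to Kim above, nor the improved method of this paper. The function name and the graph encoding are hypothetical.

```python
import heapq


def minimax_label_propagation(adjacency, seeds):
    """Propagate seed labels along minimax (bottleneck) paths.

    adjacency maps node -> list of (neighbor, weight); for an undirected
    graph each edge must be listed in both directions.
    seeds maps tagged node -> label.
    """
    best = {node: float("inf") for node in adjacency}
    label = {}
    heap = []
    for node, y in seeds.items():
        best[node] = 0.0
        heapq.heappush(heap, (0.0, node, y))
    while heap:
        cost, node, y = heapq.heappop(heap)
        if node in label:
            continue  # already settled with a smaller bottleneck
        label[node] = y
        for nbr, w in adjacency[node]:
            new_cost = max(cost, w)  # bottleneck of the extended path
            if nbr not in label and new_cost < best[nbr]:
                best[nbr] = new_cost
                heapq.heappush(heap, (new_cost, nbr, y))
    return label


graph = {
    "a": [("b", 1)],
    "b": [("a", 1), ("c", 5), ("e", 4)],
    "c": [("b", 5), ("d", 1)],
    "d": [("c", 1), ("e", 3)],
    "e": [("b", 4), ("d", 3)],
}
labels = minimax_label_propagation(graph, {"a": "X", "d": "Y"})
```

Here node `e` takes label `Y` because its bottleneck to `d` (3) is smaller than its bottleneck to `a` (4).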


2020 ◽  
Vol 10 (5) ◽  
pp. 1797 ◽  
Author(s):  
Mera Kartika Delimayanti ◽  
Bedy Purnama ◽  
Ngoc Giang Nguyen ◽  
Mohammad Reza Faisal ◽  
Kunti Robiatul Mahmudah ◽  
...  

Manual classification of sleep stages is a time-consuming but necessary step in the diagnosis and treatment of sleep disorders, and its automation has been an area of active study. Previous works have applied low-dimensional fast Fourier transform (FFT) features together with many machine learning algorithms. In this paper, we demonstrate the use of features extracted from EEG signals via FFT to improve the performance of automated sleep stage classification through machine learning methods. Unlike previous works using FFT, we incorporated thousands of FFT features in order to classify the sleep stages into 2–6 classes. Using the expanded version of the Sleep-EDF dataset with 61 recordings, our method outperformed other state-of-the-art methods. This result indicates that high-dimensional FFT features in combination with simple feature selection are effective for improving automated sleep stage classification.
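
As a small illustration of what spectral features from a signal segment look like, the Python sketch below computes normalized magnitudes of the non-negative frequency bins. It is a sketch under assumptions: it uses a naive O(n^2) discrete Fourier transform so that it needs only the standard library (a real pipeline would use an FFT routine, which computes the same coefficients faster), the test signal is synthetic rather than EEG, and the function name is hypothetical. The paper's actual feature extraction and selection steps are not reproduced here.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Return |X_k| / n for frequency bins k = 0 .. n // 2."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        mags.append(abs(coeff) / n)
    return mags


# Synthetic segment: a unit-amplitude sine completing 3 cycles in 32 samples.
segment = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
features = dft_magnitudes(segment)
peak_bin = max(range(len(features)), key=lambda k: features[k])
```

Each magnitude can serve as one feature, so longer segments or more channels quickly lead to the high-dimensional feature vectors the abstract refers to.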


Mathematics ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 779
Author(s):  
Ruriko Yoshida

A tropical ball is a ball defined by the tropical metric over the tropical projective torus. In this paper, we show several properties of tropical balls over the tropical projective torus and also over the space of phylogenetic trees with a given set of leaf labels. We then discuss their application to the K nearest neighbors (KNN) algorithm, a supervised learning method used to classify a high-dimensional vector into given categories by looking at a ball centered at the vector, which contains K vectors in the space.
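
For readers unfamiliar with the tropical metric, the Python sketch below uses the standard definition d_tr(u, v) = max_i (u_i - v_i) - min_i (u_i - v_i) and plugs it into a plain majority-vote KNN classifier. The function names and the toy data are hypothetical, and the sketch does not cover the properties of tropical balls or the phylogenetic tree spaces studied in the paper.

```python
from collections import Counter


def tropical_distance(u, v):
    """Tropical metric: spread of the coordinate-wise differences."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def tropical_knn(query, training, k):
    """Majority vote among the k training points nearest to query.

    training is a list of (vector, label) pairs.
    """
    nearest = sorted(training, key=lambda item: tropical_distance(query, item[0]))[:k]
    counts = Counter(label for _, label in nearest)
    return counts.most_common(1)[0][0]


training = [
    ((0, 0, 0), "A"),
    ((0, 1, 0), "A"),
    ((0, 5, 9), "B"),
    ((0, 6, 9), "B"),
    ((0, 5, 8), "B"),
]
predicted = tropical_knn((1, 1, 2), training, k=3)
```

Because the distance depends only on differences of coordinates, adding the same constant to every coordinate of a vector leaves all distances unchanged, which matches the structure of the tropical projective torus.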

