Semi-supervised learning using ensembles of multiple 1D-embedding-based label boosting

Author(s):  
Jianzhong Wang

This paper continues the development of multiple 1D-embedding-based (or 1D multi-embedding) methods for semi-supervised learning, which were preliminarily introduced by the author in [J. Wang, Semi-supervised learning using multiple one-dimensional embedding based adaptive interpolation, Int. J. Wavelets, Multiresolut. Inf. Process. 14(2) (2016) 11 pp.]. It places the development in a more general framework and creates a new method that employs the ensemble technique to integrate multiple 1D embedding-based regularization and label boosting for semi-supervised learning (SSL). The method combines a parallel ensemble and a serial ensemble. In each stage of the parallel ensemble, the dataset is first smoothly mapped onto multiple 1D sequences. On each 1D embedded dataset, a classical regularization method is applied to construct a weak classifier. These weak classifiers are then integrated into an ensemble of 1D labelers (E1DL), which, together with a nearest neighbor cluster (NNC) algorithm, extracts a newborn labeled subset from the unlabeled set. The subset is believed to be correctly labeled with high confidence, so it joins the original labeled set for the next learning stage. Repeating this process, we gradually obtain a boosted labeled set; the process stops once the updated labeled set reaches a certain size. Finally, we use E1DL to build the target classifier, which labels all points of the dataset. We also fix universal parameters for all experiments, so that the algorithm is effectively parameter-free. Several experiments confirm the validity of the method for the classification of handwritten digits. Compared with several other popular SSL methods, our results are very promising.
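The label-boosting loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: random projections stand in for the smooth 1D embeddings, a nearest-labeled-point rule on the line stands in for the regularization-based weak classifier, and a random draw from the unanimously labeled points stands in for the NNC selection step. The names `project_1d`, `weak_labels`, and `boost_labels` are hypothetical.

```python
import random

def project_1d(points, rng):
    """Map points to 1D coordinates via a random direction.
    Assumption: a stand-in for the paper's smooth 1D embedding."""
    dim = len(points[0])
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [sum(x * w for x, w in zip(p, direction)) for p in points]

def weak_labels(coords, labeled):
    """Weak classifier on one 1D embedding: copy the label of the
    nearest labeled point on the line (stand-in for regularization)."""
    return [labeled[min(labeled, key=lambda j: abs(coords[j] - c))]
            for c in coords]

def boost_labels(points, labeled, n_embeddings=5, target_size=None,
                 batch=2, seed=0):
    """Grow the labeled set {index: label} stage by stage."""
    rng = random.Random(seed)
    labeled = dict(labeled)
    target = len(points) if target_size is None else target_size
    while len(labeled) < target:
        # Parallel ensemble: one weak labeling per 1D embedding.
        votes = [weak_labels(project_1d(points, rng), labeled)
                 for _ in range(n_embeddings)]
        # Keep unlabeled points on which all weak classifiers agree.
        confident = [i for i in range(len(points))
                     if i not in labeled and len({v[i] for v in votes}) == 1]
        if not confident:
            break  # no agreement left; stop early
        size = min(batch, len(confident), target - len(labeled))
        # Serial ensemble: the newborn labeled subset feeds the next stage.
        for i in rng.sample(confident, size):
            labeled[i] = votes[0][i]
    return labeled
```

Calling `boost_labels(points, {0: "a", 3: "b"}, target_size=4)` on a small list of coordinate tuples returns a dictionary that contains the two seed labels plus any newly accepted points, up to the requested size.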

Author(s):  
Jianzhong Wang

We propose a novel semi-supervised learning (SSL) scheme using adaptive interpolation on multiple one-dimensional (1D) embedded data. For a given high-dimensional dataset, we smoothly map it onto several different 1D sequences, so that the labeled subset is converted to a 1D subset for each of these sequences. Applying cubic interpolation to the labeled subset, we obtain a subset of unlabeled points that are assigned the same label in all interpolations. Selecting a proportion of these points at random and adding them to the current labeled subset, we build a larger labeled subset for the next interpolation. Repeating the embedding and interpolation, we enlarge the labeled subset gradually and finally reach a labeled set of reasonably large size, on which the final classifier is constructed. We explore the use of the proposed scheme in the classification of handwritten digits and show promising results.
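As a rough illustration of the interpolation step on a single 1D sequence, the sketch below interpolates a 0/1 indicator for each class between the labeled points and assigns each point the class with the largest interpolated value. It assumes piecewise-linear interpolation as a simpler substitute for the cubic interpolation named in the abstract; the function names are hypothetical.

```python
def interp_linear(x, knots):
    """Piecewise-linear interpolation through sorted (x, y) knots,
    with constant extrapolation outside the knot range."""
    if x <= knots[0][0]:
        return knots[0][1]
    if x >= knots[-1][0]:
        return knots[-1][1]
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                return y1
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)

def interpolate_labels(coords, labeled, classes):
    """coords: 1D coordinate of every point; labeled: {index: class}.
    Returns one predicted class per point."""
    known = sorted(labeled, key=lambda i: coords[i])
    predictions = []
    for x in coords:
        scores = {}
        for c in classes:
            # Indicator knots: 1.0 where the labeled point has class c.
            knots = [(coords[i], 1.0 if labeled[i] == c else 0.0)
                     for i in known]
            scores[c] = interp_linear(x, knots)
        predictions.append(max(classes, key=lambda c: scores[c]))
    return predictions
```

Running this once per 1D sequence and keeping only the points that receive the same class every time would give the candidate subset described in the abstract.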


Author(s):  
Yalong Song ◽  
Hong Li ◽  
Jianzhong Wang ◽  
Kit Ian Kou

In this paper, we present a novel multiple 1D-embedding based clustering (M1DEBC) scheme for hyperspectral image (HSI) classification. This clustering scheme is an iterative algorithm of 1D-embedding based regularization, which was first proposed by J. Wang [Semi-supervised learning using ensembles of multiple 1D-embedding-based label boosting, Int. J. Wavelets, Multiresolut. Inf. Process. 14(2) (2016) 33 pp.; Semi-supervised learning using multiple one-dimensional embedding-based adaptive interpolation, Int. J. Wavelets, Multiresolut. Inf. Process. 14(2) (2016) 11 pp.]. At each iteration, the algorithm performs the following three steps. First, we construct a 1D multi-embedding, which contains [Formula: see text] different versions of 1D embedding. Each of them is realized by an isometric mapping that maps all the pixels in an HSI onto a line such that the sum of the distances between adjacent pixels in the original space is minimized. Second, for each 1D embedding, we use the regularization method to find a pre-classifier that gives each unlabeled sample a preliminary label. If all of the [Formula: see text] different versions of regularization vote for the same preliminary label, we call the sample a feasible confident sample. All the feasible confident samples and their corresponding labels constitute the auxiliary set. We randomly select a part of the auxiliary set to construct the newborn labeled set. Finally, we add the newborn labeled set to the labeled sample set. Thus, the labeled sample set is gradually enlarged over the iterations. The iteration terminates when the updated labeled set reaches a certain size. Our experimental results on real hyperspectral datasets confirm the effectiveness of the proposed scheme.
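The first step, ordering the samples along a line so that adjacent samples are close in the original space, resembles a shortest-path problem. The sketch below uses a greedy nearest-neighbor heuristic, which is an assumption made for illustration and not necessarily the construction used by the authors; varying the hypothetical `start` index is one simple way to obtain several different orderings.

```python
import math

def greedy_1d_embedding(points, start=0):
    """Order point indices by repeatedly moving to the nearest
    unvisited point (a heuristic for a short path through the data)."""
    remaining = set(range(len(points)))
    remaining.remove(start)
    order = [start]
    while remaining:
        last = points[order[-1]]
        # Break distance ties by index so the result is deterministic.
        nxt = min(remaining, key=lambda j: (math.dist(last, points[j]), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def path_length(points, order):
    """Sum of distances between samples adjacent in the ordering."""
    return sum(math.dist(points[a], points[b])
               for a, b in zip(order, order[1:]))
```

The position of a sample in the returned ordering serves as its 1D coordinate, and `path_length` reports the quantity the embedding tries to keep small.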


2021 ◽  
Vol 12 (1) ◽  
pp. 44
Author(s):  
Seokjin Lee ◽  
Minhan Kim ◽  
Seunghyeon Shin ◽  
Seungjae Baek ◽  
Sooyoung Park ◽  
...  

In recent acoustic scene classification (ASC) models, various auxiliary methods have been applied to enhance performance, e.g., subsystem ensembles and data augmentation. In particular, ensembles of several submodels can be effective in ASC models, but they increase the model size because the ensemble contains several submodels. Ensembles are therefore hard to use in ASC tasks with limited model complexity. In this paper, we seek a performance enhancement method that takes advantage of the model ensemble technique without increasing the model size. Our method is based on the mean-teacher model, which was developed for consistency learning in semi-supervised learning. Because our problem is supervised learning, which differs from the purpose of the conventional mean-teacher model, we modify the detailed strategies to maximize the consistency learning performance. To evaluate the effectiveness of our method, experiments were performed with an ASC database from the Detection and Classification of Acoustic Scenes and Events 2021 Task 1A. The small-sized ASC model with our proposed method improved the log loss to 1.009 and the F1-score to 67.12%, whereas the vanilla ASC model showed a log loss of 1.052 and an F1-score of 65.79%.
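For readers unfamiliar with the mean-teacher idea, the sketch below shows its two standard ingredients: the teacher's weights track an exponential moving average (EMA) of the student's weights, and a consistency term penalizes disagreement between the two models' predictions. It uses plain Python lists in place of real network parameters, and the `decay` value is an assumed example; the authors' modified strategies are not given in the abstract and are not reproduced here.

```python
def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the matching student weight."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher, student)]

def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between the two prediction vectors."""
    diffs = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(diffs) / len(diffs)
```

Because the teacher is only a running average of the student, a single model of the original size can be deployed at inference time, which is consistent with the goal of not increasing the model size.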


Geophysics ◽  
1985 ◽  
Vol 50 (10) ◽  
pp. 1548-1555 ◽  
Author(s):  
Kou‐Yuan Huang ◽  
King‐sun Fu

Syntactic pattern recognition techniques are applied to the analysis of one‐dimensional seismic traces for classification of Ricker wavelets. The system for one‐dimensional seismic analysis includes a likelihood ratio test, optimal amplitude‐dependent encoding, probability of detecting the signal involved in the global and local detection, plus minimum‐distance and nearest‐neighbor classification rules. The relation between error probability and Levenshtein distance is proposed.
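To make the role of the Levenshtein distance concrete, the sketch below computes the edit distance between two symbol strings and uses it in a minimum-distance classification rule. The encoding of a seismic trace into symbols is not described in the abstract, so the strings and the `prototypes` dictionary here are purely hypothetical.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def classify_min_distance(pattern, prototypes):
    """Return the label whose prototype string is closest to pattern."""
    return min(prototypes,
               key=lambda label: levenshtein(pattern, prototypes[label]))
```

A nearest-neighbor rule follows the same pattern, with several stored strings per class instead of a single prototype.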


Author(s):  
M. Jeyanthi ◽  
C. Velayutham

In science and technology development, the brain-computer interface (BCI) plays a vital role in research. Classification is a data mining technique used to predict group membership for data instances. Analysis of BCI data is challenging because feature extraction and classification of these data are more difficult than for raw data. In this paper, we extract statistical Haralick features from the raw EEG data. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes. Classification is then performed on a BCI dataset using different techniques, such as Naïve Bayes, the k-nearest neighbor classifier, and the SVM classifier. Finally, we propose the SVM classification algorithm for the BCI dataset.
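The normalization and binning steps could look like the sketch below. The abstract does not say which normalization or binning scheme was used, so min-max scaling and equal-width bins are assumptions chosen for simplicity, and the function names are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: no spread
    return [(v - lo) / (hi - lo) for v in values]

def equal_width_bins(values, n_bins):
    """Replace each value by the index of its equal-width bin."""
    normalized = min_max_normalize(values)
    # The maximum value (1.0) is clamped into the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in normalized]
```

The binned feature columns would then be passed to whichever classifier is being compared.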


Author(s):  
Herman Herman ◽  
Demi Adidrana ◽  
Nico Surantha ◽  
Suharjito Suharjito

The human population is increasing significantly in crowded urban areas, which reduces the available farming land. A landless planting method is therefore needed to supply food for society. Hydroponics is one solution for gardening without soil. It uses nutrient-enriched mineral water as a nutrition solution for plant growth. Traditionally, hydroponic farming is conducted manually by monitoring nutrition parameters such as acidity or basicity (pH), Total Dissolved Solids (TDS), Electrical Conductivity (EC), and nutrient temperature. In this research, the researchers propose a system that measures pH, TDS, and nutrient temperature values in the Nutrient Film Technique (NFT) using a couple of sensors. The researchers use lettuce as the object of the experiment and apply the k-Nearest Neighbor (k-NN) algorithm to classify nutrient conditions. The prediction is used to command the microcontroller to turn the nutrition controller actuators on or off simultaneously. The experimental results show that the proposed k-NN algorithm achieves 93.3% accuracy when k = 5.
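A k-NN classifier over sensor readings can be sketched in a few lines. The feature order (pH, TDS, temperature), the class names, and all numeric readings in the usage note are made up for illustration and do not come from the paper.

```python
import math
from collections import Counter

def knn_predict(train, query, k=5):
    """train: list of (features, label) pairs; query: feature tuple.
    Returns the majority label among the k nearest training samples."""
    neighbors = sorted(train,
                       key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

For example, with hypothetical training pairs such as `((6.0, 800.0, 25.0), "ok")` and `((6.0, 400.0, 25.0), "low")`, a call like `knn_predict(train, (6.1, 780.0, 25.0), k=5)` returns the majority label among the five closest readings. Because TDS values are numerically much larger than pH values, scaling the features before computing distances would usually be advisable.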


2010 ◽  
Vol 23 (2) ◽  
pp. 025601 ◽  
Author(s):  
Monodeep Chakraborty ◽  
A N Das ◽  
Atisdipankar Chakrabarti
