cluster assumption
Recently Published Documents


Total documents: 11 (five years: 3)
H-index: 4 (five years: 1)

2021, pp. 1-52
Author(s): Taira Tsuchiya, Nontawat Charoenphakdee, Issei Sato, Masashi Sugiyama

Ordinal regression aims to predict an ordinal class label. In this letter, we consider its semisupervised formulation, in which unlabeled data are available alongside ordinal-labeled data to train an ordinal regressor. Several metrics are used to evaluate ordinal regression, such as the mean absolute error, mean zero-one error, and mean squared error. However, existing studies do not take the evaluation metric into account, restrict the choice of model, and offer no theoretical guarantee. To overcome these problems, we propose a novel generic framework for semisupervised ordinal regression, based on the empirical risk minimization principle, that can optimize any of the metrics mentioned above. In addition, our framework allows flexible choices of models, surrogate losses, and optimization algorithms, and it does not rely on common geometric assumptions about unlabeled data such as the cluster assumption or the manifold assumption. We provide an estimation error bound to show that our risk estimator is consistent. Finally, we conduct experiments to show the usefulness of our framework.


2020, Vol 34 (04), pp. 4320-4327
Author(s): Songlei Jian, Liang Hu, Longbing Cao, Kai Lu

Cross-domain representation learning plays an important role in tasks including domain adaptation and transfer learning. However, existing cross-domain representation learning focuses on building one shared space and ignores the unlabeled data in the source domain, so it cannot effectively capture the distribution and structure heterogeneities in cross-domain data. To address this challenge, we propose a new cross-domain representation learning approach: MUltiple Lipschitz-constrained AligNments (MULAN) on partially-labeled cross-domain data. MULAN produces two representation spaces: a common representation space that incorporates knowledge from the source domain, and a complementary representation space that complements the common representation with target local topological information through a Lipschitz-constrained representation transformation. MULAN uses both unlabeled and labeled data in the source and target domains. It addresses distribution heterogeneity by Lipschitz-constrained adversarial distribution alignment and structure heterogeneity by cluster assumption-based class alignment, while keeping the target local topological information in the complementary representation by self alignment. Moreover, MULAN is equipped with a customized learning process and an iterative parameter updating process. MULAN shows superior performance on partially-labeled semi-supervised domain adaptation and few-shot domain adaptation, and outperforms the state-of-the-art visual domain adaptation models by up to 12.1%.


Author(s): M. Pérez-Ortiz, P. Tiňo, R. Mantiuk, C. Hervás-Martínez

Data augmentation is rapidly gaining attention in machine learning. Synthetic data can be generated by simple transformations or through the data distribution. In the latter case, the main challenge is to estimate the label associated with new synthetic patterns. This paper studies the effect of generating synthetic data by convex combination of patterns and of using these as unsupervised information in a semi-supervised learning framework with support vector machines, thus avoiding the need to label synthetic examples. We perform experiments on a total of 53 binary classification datasets. Our results show that this type of data over-sampling supports the well-known cluster assumption in semi-supervised learning, with outstanding results for small high-dimensional datasets and imbalanced learning problems.
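
To make "convex combination of patterns" concrete, here is a minimal sketch of how unlabeled synthetic points could be generated. It is an illustration only, not the authors' procedure: the abstract does not say how pairs or mixing weights are chosen, so sampling pairs uniformly at random and drawing the weight uniformly from [0, 1] are assumptions, and the function name `convex_combinations` is hypothetical.

```python
import random


def convex_combinations(patterns, n_synthetic, seed=0):
    """Create unlabeled synthetic points as convex combinations of pattern pairs.

    Assumptions (not taken from the paper): pairs are sampled uniformly
    without replacement, and the mixing weight is uniform on [0, 1].
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        a, b = rng.sample(patterns, 2)  # two distinct patterns
        lam = rng.random()              # mixing weight in [0, 1)
        # The point lam * a + (1 - lam) * b lies on the segment between a and b.
        synthetic.append([lam * u + (1.0 - lam) * v for u, v in zip(a, b)])
    return synthetic


# Toy patterns; the synthetic points are kept without labels and would be
# passed to a semi-supervised learner as unlabeled data.
patterns = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
unlabeled = convex_combinations(patterns, n_synthetic=5)
```

Because each synthetic point lies on the segment between two existing patterns, no label has to be guessed for it, which is the property the abstract relies on.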


2017, Vol 46 (3), pp. 1031-1042
Author(s): Yunyun Wang, Yan Meng, Zhenyong Fu, Hui Xue

2012, Vol 220-223, pp. 452-458
Author(s): Xian Xin Shi, Zhong Xiang Zhao, Chang Jian Zhu, Xiao Xiao Kong, Jun Fei Chai, ...

This paper proposes a cluster kernel semi-supervised support vector machine (CKS3VM) based on a spectral clustering algorithm and applies it to winch fault classification. The spectral clustering method is used to re-represent the original data samples in an eigenvector space so that samples in the same cluster gather together more tightly. Then, a cluster kernel function is constructed on the eigenvector space. Finally, a cluster kernel S3VM is designed that satisfies the cluster assumption of semi-supervised learning. Experiments on winch fault classification show that the approach achieves high classification accuracy.


2012, Vol 33 (9), pp. 1042-1048
Author(s): Swarnajyoti Patra, Lorenzo Bruzzone
