Label distribution learning with label-specific features

Label distribution learning (LDL) is a novel machine learning paradigm to deal with label ambiguity issues by placing more emphasis on how relevant each label is to a particular instance. Many LDL algorithms have been proposed and most of them concentrate on the learning models, while few of them focus on the feature selection problem. All existing LDL models are built on a simple feature space in which all features are shared by all the class labels. However, this kind of traditional data representation strategy tends to select features that are distinguishable for all labels, but ignores label-specific features that are pertinent and discriminative for each class label. In this paper, we propose a novel LDL algorithm by leveraging label-specific features. The common features for all labels and specific features for each label are simultaneously learned to enhance the LDL model. Moreover, we also exploit the label correlations in the proposed LDL model. The experimental results on several real-world data sets validate the effectiveness of our method.

Label Distribution Learning with Label Correlations via Low-Rank Approximation

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/461 ◽

2019 ◽

Cited By ~ 1

Author(s):

Tingting Ren ◽

Xiuyi Jia ◽

Weiwei Li ◽

Shu Zhao

Keyword(s):

Correlation Matrix ◽

Learning Algorithm ◽

Low Rank ◽

Low Rank Approximation ◽

Real World Data ◽

Label Correlations ◽

Rank Approximation ◽

Label Distribution Learning ◽

Label Distribution ◽

Label Correlation

Label distribution learning (LDL) can be viewed as a generalization of multi-label learning. This paradigm focuses on the relative importance of different labels to a particular instance. Most previous LDL methods either ignore the correlations among labels or exploit them only globally. In this paper, we use both global and local label correlations to provide more information for model training, and we propose a novel label distribution learning algorithm. In particular, a label correlation matrix based on low-rank approximation captures the global label correlations. In addition, the label correlations among local samples are used to modify the label correlation matrix. Experimental results on real-world data sets show that the proposed algorithm outperforms state-of-the-art LDL methods.
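
As a rough illustration of the global step, the sketch below builds a label correlation matrix from a matrix of label distributions and keeps only its leading singular components. The use of Pearson correlation, the truncated SVD, the function name `low_rank_label_correlation`, and the toy Dirichlet data are all assumptions made for illustration; the abstract does not say how the authors construct or optimize their matrix.

```python
import numpy as np

def low_rank_label_correlation(D, rank):
    """Return a rank-`rank` approximation of the label correlation matrix.

    D is an (n_samples, n_labels) matrix whose rows are label distributions.
    """
    # Pearson correlation between label columns (an assumed choice).
    C = np.corrcoef(D, rowvar=False)
    # Truncated SVD gives the best rank-k approximation in Frobenius norm.
    U, s, Vt = np.linalg.svd(C)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

rng = np.random.default_rng(0)
D = rng.dirichlet(np.ones(5), size=50)   # 50 toy label distributions over 5 labels
C_low = low_rank_label_correlation(D, rank=2)
```

The truncation discards weak, possibly noisy pairwise correlations and keeps the dominant structure shared across labels, which is the usual motivation for a low-rank model.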

MULFE: Multi-Label Learning via Label-Specific Feature Space Ensemble

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3451392 ◽

2021 ◽

Vol 16 (1) ◽

pp. 1-24

Author(s):

Yaojin Lin ◽

Qinghua Hu ◽

Jinghua Liu ◽

Xingquan Zhu ◽

Xindong Wu

Keyword(s):

Empirical Studies ◽

Feature Space ◽

Training Data ◽

Data Sets ◽

Learning Framework ◽

Feature Spaces ◽

Public Data ◽

Margin Distribution ◽

Label Correlations ◽

Label Correlation

In multi-label learning, label correlations commonly exist in the data. Such correlations not only provide useful information but also pose significant challenges for multi-label learning. Recently, label-specific feature embedding has been proposed to extract label-specific features from the training data and to learn with features highly customized to the multi-label set. While such feature embedding methods have demonstrated good performance, each feature embedding space is built from a single label only, without considering label correlations in the data. In this article, we propose to combine multiple label-specific feature spaces, using label correlation, for multi-label learning. The proposed algorithm, multi-label-specific feature space ensemble (MULFE), takes into consideration label-specific features, label correlation, and the weighted ensemble principle to form a learning framework. By conducting clustering analysis on each label’s negative and positive instances, MULFE first creates features customized to each label. After that, MULFE uses the label correlation to optimize the margin distribution of the base classifiers, which are induced by the related label-specific feature spaces. By combining multiple label-specific features, label-correlation-based weighting, and ensemble learning, MULFE achieves the goal of maximum-margin multi-label classification through the underlying optimization framework. Empirical studies on 10 public data sets demonstrate the effectiveness of MULFE.
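
The first step, building features customized to one label by clustering its positive and negative instances, can be sketched as follows. This is a minimal illustration under assumptions: it uses a plain k-means and distance-to-center features, the function names `kmeans` and `label_specific_features` are hypothetical, and the correlation-weighted ensemble and margin optimization of MULFE are not reproduced.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns the (k, n_features) cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return centers

def label_specific_features(X, y, k=2):
    """Map X to distances from the centers of the positive and negative clusters."""
    centers = np.vstack([kmeans(X[y == 1], k), kmeans(X[y == 0], k)])
    return np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = (X[:, 0] > 0).astype(int)            # toy binary relevance for one label
Z = label_specific_features(X, y, k=2)   # 2 positive + 2 negative centers
```

Repeating this per label yields one feature space per label, each of which could then feed its own base classifier.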

Auto-weighted concept factorization for joint feature map and data representation learning

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-200298 ◽

2021 ◽

pp. 1-13

Author(s):

Yikai Zhang ◽

Yong Peng ◽

Hongyu Bian ◽

Yuan Ge ◽

Feiwei Qin ◽

...

Keyword(s):

Objective Function ◽

Optimization Procedure ◽

Feature Space ◽

Representation Learning ◽

Data Representation ◽

Data Sets ◽

Reconstruction Process ◽

Factorization Model ◽

Efficient Data ◽

Concept Factorization

Concept factorization (CF) is an effective matrix factorization model that has been widely used in many applications. In CF, a linear combination of data points serves as the dictionary, so CF can be performed in both the original feature space and the reproducing kernel Hilbert space (RKHS). Conventional CF treats each dimension of the feature vector equally during data reconstruction, which conflicts with the common understanding that different features have different discriminative abilities and therefore contribute differently to pattern recognition. In this paper, we introduce an auto-weighting variable into the conventional CF objective function to adaptively learn the contributions of different features, and we propose a new model termed Auto-Weighted Concept Factorization (AWCF). In AWCF, on the one hand, feature importance is quantitatively measured by the auto-weighting variable, which assigns larger weights to features with better discriminative abilities; on the other hand, AWCF yields a more efficient data representation that depicts the semantic information of the data. A detailed optimization procedure for the AWCF objective function is derived, and its complexity and convergence are analyzed. Experiments are conducted on both synthetic and representative benchmark data sets, and the clustering results demonstrate the effectiveness of AWCF in comparison with related models.
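
For orientation, conventional CF approximates a nonnegative data matrix X (one data point per column) as X ≈ X W Vᵀ, so each concept is a combination of data points. The sketch below implements the standard multiplicative updates for that baseline on toy data. It does not include the auto-weighting variable of AWCF, whose update rules the abstract does not give; the function names and the small `eps` guard are assumptions.

```python
import numpy as np

def concept_factorization(X, k, n_iter=200, eps=1e-10, seed=0):
    """Conventional CF: find nonnegative W, V with X ~ X @ W @ V.T."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    K = X.T @ X                       # only inner products are needed (kernelizable)
    W = rng.random((n, k))
    V = rng.random((n, k))
    for _ in range(n_iter):
        # Multiplicative updates keep W and V nonnegative when K is nonnegative.
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
    return W, V

def reconstruction_error(X, W, V):
    return float(np.linalg.norm(X - X @ W @ V.T))

rng = np.random.default_rng(0)
X = rng.random((8, 30))               # 8 features, 30 nonnegative data points
W0, V0 = concept_factorization(X, k=3, n_iter=0)    # random initialization only
W, V = concept_factorization(X, k=3, n_iter=200)    # same seed, after updates
err_before = reconstruction_error(X, W0, V0)
err_after = reconstruction_error(X, W, V)
```

Because the updates touch X only through K = XᵀX, replacing K with a kernel matrix gives the RKHS variant mentioned in the abstract.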

An Incremental Classification Algorithm for Mining Data with Feature Space Heterogeneity

Mathematical Problems in Engineering ◽

10.1155/2014/327142 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Yu Wang

Keyword(s):

Feature Space ◽

Classification Problem ◽

Classification Algorithm ◽

Data Sets ◽

Real World Data ◽

Supervised Clustering ◽

Online Classification ◽

Efficiency And Effectiveness ◽

Feature Relevance ◽

Incremental Classification

Feature space heterogeneity exists in many real-world data sets: features differ in importance for classification across different subsets of the data. Moreover, the pattern of feature space heterogeneity might change dynamically over time as more data are accumulated. In this paper, we develop an incremental classification algorithm, Supervised Clustering for Classification with Feature Space Heterogeneity (SCCFSH), to address this problem. In our approach, supervised clustering is used to obtain a number of clusters such that the samples in each cluster are from the same class. After outliers are removed, the relevance of the features in each cluster is calculated from their variation within that cluster. The feature relevance is then incorporated into the distance calculation for classification. The main advantage of SCCFSH is that it can solve a classification problem with feature space heterogeneity incrementally, which is favorable for online classification tasks with continuously changing data. Experimental results on a series of data sets and an application to a database marketing problem show the efficiency and effectiveness of the proposed approach.
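
One simple way to turn within-cluster variation into a relevance-weighted distance is shown below. The inverse-variance weighting, the normalization, and the function names are assumptions chosen for illustration; the abstract does not state the exact relevance formula SCCFSH uses.

```python
import numpy as np

def cluster_relevance(points, eps=1e-8):
    """Weights that grow as a feature's variance inside the cluster shrinks."""
    inv_var = 1.0 / (points.var(axis=0) + eps)
    return inv_var / inv_var.sum()

def weighted_distance(x, center, weights):
    """Euclidean distance with per-feature relevance weights."""
    return float(np.sqrt(np.sum(weights * (x - center) ** 2)))

# Toy cluster: feature 0 is tightly concentrated, feature 1 is widely spread.
cluster = np.array([[1.0, 0.0], [1.1, 5.0], [0.9, -5.0], [1.0, 10.0]])
w = cluster_relevance(cluster)
center = cluster.mean(axis=0)
d_tight = weighted_distance(center + np.array([1.0, 0.0]), center, w)
d_loose = weighted_distance(center + np.array([0.0, 1.0]), center, w)
```

Under this weighting, a unit deviation along the tight feature counts for far more than the same deviation along the spread-out feature, so each cluster measures distance in its own locally relevant subspace.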

Preventing Disparate Treatment in Sequential Decision Making

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/311 ◽

2018 ◽

Cited By ~ 1

Author(s):

Hoda Heidari ◽

Andreas Krause

Keyword(s):

Decision Making ◽

Learning Algorithm ◽

Feature Space ◽

Sequential Decision Making ◽

Data Sets ◽

Sequential Decision ◽

Real World Data ◽

Time Step ◽

Job Application ◽

Disparate Treatment

We study fairness in sequential decision making environments, where at each time step a learning algorithm receives data corresponding to a new individual (e.g. a new job application) and must make an irrevocable decision about him/her (e.g. whether to hire the applicant) based on observations made so far. In order to prevent cases of disparate treatment, our time-dependent notion of fairness requires algorithmic decisions to be consistent: if two individuals are similar in the feature space and arrive during the same time epoch, the algorithm must assign them to similar outcomes. We propose a general framework for post-processing predictions made by a black-box learning model, that guarantees the resulting sequence of outcomes is consistent. We show theoretically that imposing consistency will not significantly slow down learning. Our experiments on two real-world data sets illustrate and confirm this finding in practice.

Feature Selection for Ridge Regression with Provable Guarantees

Neural Computation ◽

10.1162/neco_a_00816 ◽

2016 ◽

Vol 28 (4) ◽

pp. 716-742 ◽

Cited By ~ 9

Author(s):

Saurabh Paul ◽

Petros Drineas

Keyword(s):

Feature Selection ◽

Ridge Regression ◽

Feature Selection Method ◽

Feature Space ◽

Data Sets ◽

Real World Data ◽

Feature Selection Technique ◽

Worst Case ◽

Single Set ◽

Classification Function

We introduce single-set spectral sparsification as a deterministic sampling-based feature selection technique for regularized least-squares classification, which is the classification analog of ridge regression. The method is unsupervised and gives worst-case guarantees on the generalization power of the classification function after feature selection, relative to the classification function obtained using all features. We also introduce leverage-score sampling as an unsupervised randomized feature selection method for ridge regression. We provide risk bounds for both single-set spectral sparsification and leverage-score sampling on ridge regression in the fixed design setting and show that the risk in the sampled space is comparable to the risk in the full-feature space. We perform experiments on synthetic and real-world data sets, namely a subset of the TechTC-300 data sets, to support our theory. Experimental results indicate that the proposed methods perform better than the existing feature selection methods.
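
Leverage-score sampling in its generic form can be sketched as below: each feature's score is the squared norm of its row in the matrix of right singular vectors, and features are drawn with probability proportional to that score. The rescaling, the sampling with replacement, and the function name `leverage_score_sample` are assumptions based on the usual formulation, not details taken from the paper.

```python
import numpy as np

def leverage_score_sample(X, r, seed=0):
    """Sample r feature columns of X with probability proportional to leverage."""
    rng = np.random.default_rng(seed)
    # Thin SVD; column j of Vt corresponds to feature j.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    rank = int(np.sum(s > 1e-10 * s[0]))
    scores = np.sum(Vt[:rank] ** 2, axis=0)   # one score per feature, sums to rank
    probs = scores / scores.sum()
    idx = rng.choice(X.shape[1], size=r, replace=True, p=probs)
    # Rescale so the sampled Gram matrix is unbiased for X @ X.T.
    scale = 1.0 / np.sqrt(r * probs[idx])
    return X[:, idx] * scale, idx, scores

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))                 # 10 samples, 50 features
X_s, idx, scores = leverage_score_sample(X, r=20)
```

The procedure uses only X and never the labels, which matches the unsupervised character the abstract emphasizes.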

Label Distribution Learning with Label Correlations on Local Samples

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2019.2943337 ◽

2020 ◽

pp. 1-1 ◽

Cited By ~ 1

Author(s):

Xiuyi Jia ◽

Zechao Li ◽

Xiang Zheng ◽

Weiwei Li ◽

Sheng-Jun Huang

Keyword(s):

Label Correlations ◽

Label Distribution Learning ◽

Label Distribution

Neighbor-Based Label Distribution Learning to Model Label Ambiguity for Aerial Scene Classification

Remote Sensing ◽

10.3390/rs13040755 ◽

2021 ◽

Vol 13 (4) ◽

pp. 755

Author(s):

Jianqiao Luo ◽

Yihan Wang ◽

Yang Ou ◽

Biao He ◽

Bailin Li

Keyword(s):

Subspace Learning ◽

Label Propagation ◽

Aerial Images ◽

Aerial Image ◽

Local Similarity ◽

Scene Classification ◽

Sparse Constraint ◽

Label Correlations ◽

Label Distribution Learning ◽

Label Distribution

Many aerial images with similar appearances have different but correlated scene labels, which causes label ambiguity. Label distribution learning (LDL) can express label ambiguity by giving each sample a label distribution. Thus, a sample contributes to the learning of its ground-truth label as well as of correlated labels, which improves data utilization. LDL has been successful in many fields, such as age estimation, in which label ambiguity can be easily modeled on the basis of prior knowledge about local sample similarity and global label correlations. However, LDL has never been applied to scene classification, because there is no knowledge about the local similarity and label correlations and thus it is hard to model label ambiguity. In this paper, we uncover the sample neighbors that cause label ambiguity by jointly capturing the local similarity and label correlations, and we propose neighbor-based LDL (N-LDL) for aerial scene classification. We define a subspace learning problem, which formulates the neighboring relations as a coefficient matrix that is regularized by a sparse constraint and label correlations. The sparse constraint provides a few nearest neighbors, which captures local similarity. The label correlations are predefined according to the confusion matrices on validation sets. During subspace learning, the neighboring relations are encouraged to agree with the label correlations, which ensures that the uncovered neighbors have correlated labels. Finally, label propagation among the neighbors forms the label distributions, which leads to label smoothing in terms of label ambiguity. The label distributions are used to train convolutional neural networks (CNNs). Experiments on the aerial image dataset (AID) and NWPU_RESISC45 (NR) datasets demonstrate that using the label distributions clearly improves the classification performance by assisting feature learning and mitigating over-fitting problems, and our method achieves state-of-the-art performance.
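
The final propagation step can be pictured with a small sketch: each sample keeps most of its own one-hot label and receives the remainder from its neighbors through the coefficient matrix. The mixing weight `alpha`, the row normalization, and the hand-written coefficient matrix `A` are assumptions for illustration; in N-LDL the coefficients come from the regularized subspace learning problem, which is not reproduced here.

```python
import numpy as np

def propagate_labels(Y, A, alpha=0.7):
    """Blend each one-hot label with its neighbors' labels, then normalize rows."""
    mixed = alpha * Y + (1.0 - alpha) * (A @ Y)
    return mixed / mixed.sum(axis=1, keepdims=True)

# Four samples, three scene classes; samples 0-2 are mutual neighbors.
Y = np.eye(3)[[0, 0, 1, 2]]
A = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],   # sample 3 has no neighbors
])
P = propagate_labels(Y, A)  # rows are label distributions
```

Here sample 0 ends up with some mass on class 1 because one of its neighbors carries that label, while the isolated sample 3 keeps its one-hot label; the resulting rows could serve as soft training targets.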

Latent Semantics Encoding for Label Distribution Learning

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/553 ◽

2019 ◽

Cited By ~ 1

Author(s):

Suping Xu ◽

Lin Shang ◽

Furao Shen

Keyword(s):

Empirical Studies ◽

Semantic Space ◽

Learning Ability ◽

Semantic Features ◽

Risk Minimization ◽

Negative Effects ◽

Real World Data ◽

Empirical Risk ◽

Label Distribution Learning ◽

Label Distribution

Label distribution learning (LDL) is a recently introduced learning paradigm for label ambiguity problems, which explores the relative importance of different labels in the description of a particular instance. Although some existing LDL algorithms have proven effective in real applications, most of them emphasize improving learning ability by manipulating the label space, while ignoring the fact that irrelevant and redundant features exist in most practical classification tasks and increase both storage requirements and computational overhead. Furthermore, noise in data acquisition negatively affects the generalization performance of LDL algorithms. In this paper, we propose a novel algorithm, Latent Semantics Encoding for Label Distribution Learning (LSE-LDL), which learns the label distribution and performs feature selection simultaneously under the guidance of latent semantics. Specifically, to alleviate noise disturbances, we seek and encode discriminative original physical/chemical features into advanced latent semantic features, and then construct a mapping from the encoded semantic space to the label space via empirical risk minimization. Empirical studies on 15 real-world data sets validate the effectiveness of the proposed algorithm.

Feature Learning for SAR Target Recognition with Unknown Classes by Using CVAE-GAN

Remote Sensing ◽

10.3390/rs13183554 ◽

2021 ◽

Vol 13 (18) ◽

pp. 3554

Author(s):

Xiaowei Hu ◽

Weike Feng ◽

Yiduo Guo ◽

Qiang Wang

Keyword(s):

Target Recognition ◽

Feature Learning ◽

Feature Space ◽

Automatic Target Recognition ◽

Data Sets ◽

Generative Adversarial Network ◽

Data Set ◽

Adversarial Network ◽

Public Data ◽

Class Labels

Even though deep learning (DL) has achieved excellent results on some public data sets for synthetic aperture radar (SAR) automatic target recognition (ATR), several problems remain. One is the lack of transparency and interpretability of most existing DL networks. Another is the neglect of unknown target classes, which are often present in practice. To solve these problems, a deep generation and recognition model is derived based on the Conditional Variational Auto-encoder (CVAE) and the Generative Adversarial Network (GAN). A feature space for SAR-ATR is built based on the proposed CVAE-GAN model. Using the feature space, clear SAR images can be generated for given class labels and observation angles. In addition, the feature of a SAR image is continuous in the feature space and can represent some attributes of the target. Furthermore, the feature space makes it possible to classify the known classes and reject the unknown target classes. Experiments on the MSTAR data set validate the advantages of the proposed method.
