Incorporating Label Embedding and Feature Augmentation for Multi-Dimensional Classification

Haobo Wang; Chen Chen; Weiwei Liu; Ke Chen; Tianlei Hu; Gang Chen

doi:10.1609/aaai.v34i04.6083

Incorporating Label Embedding and Feature Augmentation for Multi-Dimensional Classification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6083 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6178-6185 ◽

Cited By ~ 1

Author(s):

Haobo Wang ◽

Chen Chen ◽

Weiwei Liu ◽

Ke Chen ◽

Tianlei Hu ◽

...

Keyword(s):

Cross Correlation ◽

Feature Space ◽

Dimensional Classification ◽

Label Information ◽

Label Correlations ◽

Factorization Machine ◽

Feature Augmentation ◽

Real World Datasets ◽

Low Dimensional ◽

Original Feature

Feature augmentation, which manipulates the feature space by integrating label information, is one of the most popular strategies for solving Multi-Dimensional Classification (MDC) problems. However, vanilla feature augmentation approaches fail to consider intra-class exclusiveness and may suffer degraded performance. To fill this gap, a novel neural-network-based model is proposed that seamlessly integrates Label Embedding and Feature Augmentation (LEFA) techniques to learn label correlations. Specifically, based on the attentional factorization machine, a cross-correlation-aware network is introduced to learn a low-dimensional label representation that simultaneously depicts inter-class correlations and intra-class exclusiveness. The learned latent label vector can then be used to augment the original feature space. Extensive experiments on seven real-world datasets demonstrate the superiority of LEFA over state-of-the-art MDC approaches.

Multi-Dimensional Classification via kNN Feature Augmentation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013975 ◽

2019 ◽

Vol 33 ◽

pp. 3975-3982 ◽

Cited By ~ 2

Author(s):

Bin-Bin Jia ◽

Min-Ling Zhang

Keyword(s):

Feature Space ◽

Classification Performance ◽

Classification Model ◽

Data Sets ◽

Specific Class ◽

Class Membership ◽

Dimensional Classification ◽

Augmentation Techniques ◽

Feature Augmentation ◽

Original Feature

Multi-dimensional classification (MDC) deals with the problem where one instance is associated with multiple class variables, each of which specifies its class membership w.r.t. one specific class space. Existing approaches learn from MDC examples by focusing on modeling dependencies among class variables, while the potential usefulness of manipulating the feature space has not been investigated. In this paper, a first attempt towards feature manipulation for MDC is proposed, which enriches the original feature space with kNN-augmented features. Specifically, simple counting statistics on the class membership of neighboring MDC examples are used to generate an augmented feature vector. In this way, discriminative information from the class space is encoded into the feature space to help train the multi-dimensional classification model. To validate the effectiveness of the proposed feature augmentation techniques, extensive experiments over eleven benchmark data sets as well as four state-of-the-art MDC approaches are conducted. Experimental results clearly show that, compared to the original feature space, the classification performance of existing MDC approaches can be significantly improved by incorporating kNN-augmented features.
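
The counting statistics described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_augment`, the use of Euclidean distance, the exclusion of the example itself from its neighbor list, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_augment(X, Y, class_spaces, k=3):
    """Append kNN class-count statistics to every feature vector.

    X: list of feature vectors (lists of floats).
    Y: list of label vectors; Y[i][d] is the class of example i in dimension d.
    class_spaces: class_spaces[d] lists the possible classes of dimension d.
    Assumptions (not from the paper): Euclidean distance, and an example
    is never counted as its own neighbor.
    """
    augmented = []
    for i, x in enumerate(X):
        # Rank all other examples by distance to x and keep the k closest.
        others = [j for j in range(len(X)) if j != i]
        others.sort(key=lambda j: math.dist(x, X[j]))
        neighbours = others[:k]
        extra = []
        for d, classes in enumerate(class_spaces):
            counts = Counter(Y[j][d] for j in neighbours)
            # One counter per class of this dimension (0 if unseen).
            extra.extend(counts[c] for c in classes)
        augmented.append(list(x) + extra)
    return augmented


# Toy data: two class dimensions with two classes each.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
Y = [["a", "u"], ["a", "v"], ["b", "u"], ["b", "v"]]
spaces = [["a", "b"], ["u", "v"]]
X_aug = knn_augment(X, Y, spaces, k=2)
```

Each augmented vector keeps the original features and gains one count per class in every dimension, so the counts belonging to one dimension always sum to `k`.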

Two-Stage Label Embedding via Neural Factorization Machine for Multi-Label Classification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013304 ◽

2019 ◽

Vol 33 ◽

pp. 3304-3311

Author(s):

Chen Chen ◽

Haobo Wang ◽

Weiwei Liu ◽

Xingyuan Zhao ◽

Tianlei Hu ◽

...

Keyword(s):

Learning Model ◽

Two Stage ◽

Second Stage ◽

Proposed Model ◽

Latent Space ◽

Label Correlations ◽

Factorization Machine ◽

Classification Tasks ◽

Real World Datasets ◽

Embedding Methods

Label embedding has been widely used as a method to exploit label dependency with dimension reduction in multi-label classification tasks. However, existing embedding methods attempt to extract label correlations directly, and thus they might be easily trapped by complex label hierarchies. To tackle this issue, we propose a novel Two-Stage Label Embedding (TSLE) paradigm that uses a Neural Factorization Machine (NFM) to jointly project features and labels into a latent space. In the encoding phase, we introduce a Twin Encoding Network (TEN) that digs out pairwise feature and label interactions in the first stage and then efficiently learns higher-order correlations with deep neural networks (DNNs) in the second stage. After the codewords are obtained, a set of hidden layers is applied to recover the output labels in the decoding phase. Moreover, we develop a novel learning model by leveraging a max-margin encoding loss and a label-correlation-aware decoding loss, and we adopt mini-batch Adam to optimize our learning model. Lastly, we also provide a kernel insight to better understand our proposed TSLE. Extensive experiments on various real-world datasets demonstrate that our proposed model significantly outperforms other state-of-the-art approaches.
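
As background for the pairwise interactions mentioned in this abstract, the sketch below shows the pooling identity commonly used in factorization-machine models, which sums all pairwise element-wise products in linear time. It is a generic illustration and not the TEN architecture itself; the function names and the toy embedding table `V` are assumptions.

```python
def pairwise_pooling(x, V):
    """Sum over i < j of x[i] * x[j] * (V[i] * V[j]), element-wise.

    Uses the identity sum_{i<j} a_i a_j = 0.5 * ((sum a_i)^2 - sum a_i^2)
    with a_i = x[i] * V[i], so the cost is linear in the number of inputs.
    """
    dim = len(V[0])
    pooled = []
    for f in range(dim):
        scaled = [x[i] * V[i][f] for i in range(len(x))]
        total = sum(scaled)
        squares = sum(a * a for a in scaled)
        pooled.append(0.5 * (total * total - squares))
    return pooled


def pairwise_pooling_naive(x, V):
    """Same quantity computed with an explicit double loop, for checking."""
    dim = len(V[0])
    pooled = [0.0] * dim
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            for f in range(dim):
                pooled[f] += x[i] * x[j] * V[i][f] * V[j][f]
    return pooled


x = [1, 0, 2]                   # toy input (features or labels)
V = [[1, 2], [3, 4], [5, 6]]    # hypothetical embedding table, one row per input
pooled = pairwise_pooling(x, V)
```

Only the pair of non-zero inputs (indices 0 and 2) contributes here, which is why sparse inputs make this pooling cheap.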

Simple and Efficient Computational Intelligence Strategies for Effective Collaborative Decisions

Future Internet ◽

10.3390/fi11010024 ◽

2019 ◽

Vol 11 (1) ◽

pp. 24

Author(s):

Emelia Opoku Aboagye ◽

Rajesh Kumar

Keyword(s):

Deep Learning ◽

Feature Space ◽

Cold Start ◽

Personalized Recommendation ◽

Tensor Factorization ◽

Scalable Systems ◽

Collaborative Recommendation ◽

Real World Datasets ◽

Low Dimensional ◽

Cold Start Problem

We address the scalability and cold-start problems of collaborative recommendation. We propose an intelligent hybrid filtering framework, based on deep learning, that maximizes feature engineering and solves the cold-start problem for personalized recommendation. Present e-commerce sites mainly recommend pertinent items or products to many users through personalized recommendation. Such personalization depends to a large extent on scalable systems that respond promptly to the requests of the numerous users accessing the site (including new users). Tensor Factorization (TF) provides a scalable and accurate approach to collaborative filtering in such environments. We propose a hybrid system to address scalability problems in such environments, using a multi-task approach that represents multiview data from users according to their purchasing and rating history. We use a deep learning approach to map item and user inter-relationships to a low-dimensional feature space where the resemblance between users and their preferred items is maximized. Evaluation results on real-world datasets show that our novel deep learning multi-task tensor factorization (NeuralFil) analysis is computationally less expensive, scalable, and addresses the cold-start problem through an explicit multi-task approach for optimal recommendation decision making.

Improving Multi-Label Learning by Correlation Embedding

Applied Sciences ◽

10.3390/app112412145 ◽

2021 ◽

Vol 11 (24) ◽

pp. 12145

Author(s):

Jun Huang ◽

Qian Xu ◽

Xiwen Qu ◽

Yaojin Lin ◽

Xiao Zheng

Keyword(s):

Feature Space ◽

Classification Model ◽

Unified Framework ◽

Single Instance ◽

Nonlinear Version ◽

Correlation Graph ◽

Label Correlations ◽

Class Labels ◽

Low Dimensional ◽

Label Correlation

In multi-label learning, each object is represented by a single instance and is associated with more than one class label, where the labels might be correlated with each other. It is well known that exploiting label correlations can improve the performance of a multi-label classification model. Existing methods mainly model label correlations in an indirect way, i.e., by adding extra constraints on the coefficients or outputs of a model based on a pre-learned label correlation graph. Meanwhile, the high dimension of the feature space also poses great challenges to multi-label learning, such as high time and memory costs. To solve the above-mentioned issues, we propose a new approach for Multi-Label Learning by Correlation Embedding, namely MLLCE, in which feature space dimension reduction and multi-label classification are integrated into a unified framework. Specifically, we project the original high-dimensional feature space to a low-dimensional latent space by a mapping matrix. To model label correlation, we learn an embedding matrix from the pre-defined label correlation graph by graph embedding. Then, we construct a multi-label classifier from the low-dimensional latent feature space to the label space, where the embedding matrix is utilized as the model coefficients. Finally, we extend the proposed method MLLCE to a nonlinear version, i.e., NL-MLLCE. Comparison experiments with state-of-the-art approaches show that the proposed method MLLCE achieves competitive performance in multi-label learning.

Discriminative and Correlative Partial Multi-Label Learning

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/512 ◽

2019 ◽

Cited By ~ 4

Author(s):

Haobo Wang ◽

Weiwei Liu ◽

Yang Zhao ◽

Chen Zhang ◽

Tianlei Hu ◽

...

Keyword(s):

Real World ◽

State Of The Art ◽

Feature Space ◽

Gradient Boosting ◽

Training Procedure ◽

Second Stage ◽

Label Correlations ◽

Real World Datasets ◽

Partial Label Learning ◽

Confidence Value

In partial multi-label learning (PML), each instance is associated with a candidate label set that contains multiple relevant labels and other false positive labels. The most challenging issue for PML is that the training procedure is prone to be affected by the labeling noise. We observe that state-of-the-art PML methods are either powerless to disambiguate the correct labels from the candidate labels or incapable of extracting the label correlations sufficiently. To fill this gap, a two-stage DiscRiminative and correlAtive partial Multi-label leArning (DRAMA) algorithm is presented in this work. In the first stage, a confidence value is learned for each label by utilizing the feature manifold, which indicates how likely a label is to be correct. In the second stage, a gradient boosting model is induced to fit the label confidences. Specifically, to explore the label correlations, we augment the feature space with the previously elicited labels on each boosting round. Extensive experiments on various real-world datasets clearly validate the superiority of our proposed method.

Analyzing Intra-Speaker and Inter-Speaker Vocal Tract Impedance Characteristics in a Low-Dimensional Feature Space Using t-SNE

10.21437/interspeech.2019-1492 ◽

2019 ◽

Author(s):

Balamurali B.T. ◽

Jer-Ming Chen

Keyword(s):

Vocal Tract ◽

Feature Space ◽

Impedance Characteristics ◽

Low Dimensional

TransET: Knowledge Graph Embedding with Entity Types

Electronics ◽

10.3390/electronics10121407 ◽

2021 ◽

Vol 10 (12) ◽

pp. 1407

Author(s):

Peng Wang ◽

Jing Zhou ◽

Yuzhang Liu ◽

Xingchen Zhou

Keyword(s):

Link Prediction ◽

State Of The Art ◽

Score Function ◽

Graph Embedding ◽

Vector Spaces ◽

Knowledge Graph ◽

Semantic Features ◽

Knowledge Graphs ◽

Real World Datasets ◽

Low Dimensional

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods focus only on triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circle convolution based on the embeddings of entities and entity types is utilized to map the head entity and tail entity to type-specific representations, and then a translation-based score function is used to learn the representations of triples. We evaluated our model on real-world datasets with two benchmark tasks, link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
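
The two building blocks named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the TransET model: it assumes that "circle convolution" denotes the standard circular convolution of two equal-length vectors, that the translation-based score is a negative L1 distance in the style of TransE, and that an entity is combined with its type by convolving the two embeddings. All names and toy vectors are hypothetical.

```python
def circular_convolution(a, b):
    """Circular convolution: out[k] = sum_i a[i] * b[(k - i) mod n]."""
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]


def translation_score(h, r, t):
    """Negative L1 distance between h + r and t (higher is more plausible)."""
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))


# Hypothetical embeddings of a head entity, its type, a relation, and a tail.
head, head_type = [1, 2, 3], [0, 1, 0]
relation = [1, 1, 1]
tail, tail_type = [4, 2, 3], [1, 0, 0]

# Assumed combination step: map each entity to a type-specific vector.
head_typed = circular_convolution(head, head_type)
tail_typed = circular_convolution(tail, tail_type)
score = translation_score(head_typed, relation, tail_typed)
```

Convolving with a one-hot type vector simply rotates the entity embedding, which makes the toy values easy to follow by hand; learned type embeddings would mix the coordinates instead.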

MULFE: Multi-Label Learning via Label-Specific Feature Space Ensemble

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3451392 ◽

2021 ◽

Vol 16 (1) ◽

pp. 1-24

Author(s):

Yaojin Lin ◽

Qinghua Hu ◽

Jinghua Liu ◽

Xingquan Zhu ◽

Xindong Wu

Keyword(s):

Empirical Studies ◽

Feature Space ◽

Training Data ◽

Data Sets ◽

Learning Framework ◽

Feature Spaces ◽

Public Data ◽

Margin Distribution ◽

Label Correlations ◽

Label Correlation

In multi-label learning, label correlations commonly exist in the data. Such correlation not only provides useful information, but also imposes significant challenges for multi-label learning. Recently, label-specific feature embedding has been proposed to explore label-specific features from the training data, using features highly customized to the multi-label set for learning. While such feature embedding methods have demonstrated good performance, the creation of the feature embedding space is based on only a single label, without considering label correlations in the data. In this article, we propose to combine multiple label-specific feature spaces, using label correlation, for multi-label learning. The proposed algorithm, multi-label-specific feature space ensemble (MULFE), takes into consideration label-specific features, label correlation, and the weighted ensemble principle to form a learning framework. By conducting clustering analysis on each label’s negative and positive instances, MULFE first creates features customized to each label. After that, MULFE utilizes the label correlation to optimize the margin distribution of the base classifiers, which are induced by the related label-specific feature spaces. By combining multiple label-specific features, label-correlation-based weighting, and ensemble learning, MULFE achieves the maximum-margin multi-label classification goal through the underlying optimization framework. Empirical studies on 10 public data sets demonstrate the effectiveness of MULFE.

A Fusion Feature Extraction Method Using EEMD and Correlation Coefficient Analysis for Bearing Fault Diagnosis

Applied Sciences ◽

10.3390/app8091621 ◽

2018 ◽

Vol 8 (9) ◽

pp. 1621 ◽

Cited By ~ 11

Author(s):

Fan Jiang ◽

Zhencai Zhu ◽

Wei Li ◽

Yong Ren ◽

Gongbo Zhou ◽

...

Keyword(s):

Fault Diagnosis ◽

Correlation Coefficient ◽

Feature Space ◽

Support Vector ◽

Vibration Signals ◽

Bearing Fault ◽

Bearing Fault Diagnosis ◽

Acceleration Sensors ◽

Reconstructed Signal ◽

Original Feature

Acceleration sensors are frequently applied to collect vibration signals for bearing fault diagnosis. To fully use these vibration signals of multi-sensors, this paper proposes a new approach to fuse multi-sensor information for bearing fault diagnosis by using ensemble empirical mode decomposition (EEMD), correlation coefficient analysis, and support vector machine (SVM). First, EEMD is applied to decompose the vibration signal into a set of intrinsic mode functions (IMFs), and a correlation coefficient ratio factor (CCRF) is defined to select sensitive IMFs to reconstruct new vibration signals for further feature fusion analysis. Second, an original feature space is constructed from the reconstructed signal. Afterwards, weights are assigned by correlation coefficients among the vibration signals of the considered multi-sensors, and the so-called fused features are extracted by the obtained weights and original feature space. Finally, a trained SVM is employed as the classifier for bearing fault diagnosis. The diagnosis results of the original vibration signals, the first IMF, the proposed reconstruction signal, and the proposed method are 73.33%, 74.17%, 95.83% and 100%, respectively. Therefore, the experiments show that the proposed method has the highest diagnostic accuracy, and it can be regarded as a new way to improve diagnosis results for bearings.
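
The idea of weighting sensors by correlation coefficients before fusing their features can be sketched as below. The abstract does not give the weighting formula, so this example makes its own assumptions: each sensor's weight is its mean absolute Pearson correlation with the other sensors, weights are normalized to sum to one, and the fused feature vector is the weighted sum of the per-sensor feature vectors. The function names and toy signals are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length signals."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def fusion_weights(signals):
    """Assumed rule: mean |correlation| with the other sensors, normalized."""
    raw = []
    for i, s in enumerate(signals):
        others = [abs(pearson(s, o)) for j, o in enumerate(signals) if j != i]
        raw.append(sum(others) / len(others))
    total = sum(raw)
    return [w / total for w in raw]


def fuse(features, weights):
    """Weighted sum of per-sensor feature vectors."""
    return [sum(w * f[k] for w, f in zip(weights, features))
            for k in range(len(features[0]))]


# Toy signals from three sensors and one feature vector per sensor.
signals = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
features = [[3.0, 0.0], [6.0, 3.0], [0.0, 6.0]]
weights = fusion_weights(signals)
fused = fuse(features, weights)
```

In this toy case every pair of signals is perfectly (anti-)correlated, so the three sensors receive equal weight and the fused vector is the plain average of the feature vectors.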

A Machine Learning Approach to Reveal the NeuroPhenotypes of Autisms

International Journal of Neural Systems ◽

10.1142/s0129065718500582 ◽

2019 ◽

Vol 29 (07) ◽

pp. 1850058 ◽

Cited By ~ 8

Author(s):

Juan M. Górriz ◽

Javier Ramírez ◽

F. Segovia ◽

Francisco J. Martínez ◽

Meng-Chuan Lai ◽

...

Keyword(s):

Machine Learning ◽

Brain Structure ◽

Feature Space ◽

Classification Problem ◽

Small Sample ◽

Biological Sex ◽

Machine Learning Approach ◽

Learning Machine ◽

Small Sample Sizes ◽

Low Dimensional

Although much research has been undertaken, the spatial patterns, developmental course, and sexual dimorphism of brain structure associated with autism remain enigmatic. One of the difficulties in investigating differences between the sexes in autism is the small sample sizes of available mixed-sex imaging datasets. Thus, the majority of investigations have involved male samples, with females somewhat overlooked. This paper deploys machine learning on partial least squares feature extraction to reveal differences in regional brain structure between individuals with autism and typically developing participants. A four-class classification problem (sex and condition) is specified, with theoretical restrictions based on the evaluation of a novel upper bound in the resubstitution estimate. These conditions were imposed on the classifier complexity and feature space dimension to ensure that results generalize from the training set to test samples. Accuracies above [Formula: see text] on gray and white matter tissues estimated from voxel-based morphometry (VBM) features are obtained in a sample of equal-sized high-functioning male and female adults with and without autism ([Formula: see text], [Formula: see text]/group). The proposed learning machine revealed how autism is modulated by biological sex using a low-dimensional feature space extracted from VBM. In addition, a spatial overlap analysis on reference maps partially corroborated predictions of the “extreme male brain” theory of autism in sexually dimorphic areas.
