Unsupervised feature selection based on feature relevance

2017 ◽

Cited By ~ 13

Author(s):

Jundong Li ◽

Jiliang Tang ◽

Huan Liu

Keyword(s):

Feature Selection ◽

Real World ◽

Original Data ◽

Reconstruction Error ◽

High Dimensional ◽

Real World Data ◽

Unsupervised Feature Selection ◽

Feature Relevance ◽

Selection Framework ◽

Real World Datasets

Feature selection has been proven to be effective and efficient in preparing high-dimensional data for data mining and machine learning problems. Since real-world data is usually unlabeled, unsupervised feature selection has received increasing attention in recent years. Without label information, unsupervised feature selection needs alternative criteria to define feature relevance. Recently, data reconstruction error has emerged as a new criterion for unsupervised feature selection; it defines feature relevance as the capability of features to approximate the original data via a reconstruction function. Most existing algorithms in this family assume predefined, linear reconstruction functions. However, the reconstruction function should be data dependent and may not always be linear, especially when the original data is high-dimensional. In this paper, we investigate how to learn the reconstruction function from the data automatically for unsupervised feature selection, and propose a novel reconstruction-based unsupervised feature selection framework, REFS, which embeds the reconstruction function learning process into feature selection. Experiments on various types of real-world datasets demonstrate the effectiveness of the proposed framework.
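
To make the reconstruction-error criterion concrete, here is a minimal sketch of the predefined, linear kind of reconstruction function that the abstract contrasts REFS with. It is not the REFS method, which learns the reconstruction function from data. The function names, the single-feature simplification, and the toy matrix are all assumptions made for illustration: each feature is scored by how well it alone reconstructs every column of the data through a one-dimensional least-squares fit, and a lower error means higher relevance.

```python
def single_feature_reconstruction_error(X, j):
    """Squared error when every column of X is linearly rebuilt from column j alone.

    Hypothetical helper for illustration; X is a list of rows (samples).
    """
    d = len(X[0])
    xj = [row[j] for row in X]
    norm_sq = sum(v * v for v in xj)
    total = 0.0
    for k in range(d):
        xk = [row[k] for row in X]
        # Least-squares coefficient for x_k ~ a * x_j (0 if x_j is all zeros).
        a = sum(p * q for p, q in zip(xj, xk)) / norm_sq if norm_sq > 0 else 0.0
        total += sum((q - a * p) ** 2 for p, q in zip(xj, xk))
    return total


def rank_features(X):
    """Return feature indices sorted by ascending reconstruction error, plus the errors."""
    d = len(X[0])
    errors = [single_feature_reconstruction_error(X, j) for j in range(d)]
    order = sorted(range(d), key=lambda j: errors[j])
    return order, errors


# Toy data (assumed): columns 0 and 1 are collinear, column 2 is orthogonal to both.
X = [
    [1.0, 2.0, 1.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, -1.0],
]
order, errors = rank_features(X)
print(order, errors)
```

In this toy case, either of the two collinear columns rebuilds the other exactly, so both score a much lower error than the third column. A practical method would score subsets of features jointly rather than one feature at a time.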

Block Model Guided Unsupervised Feature Selection

Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining ◽

10.1145/3394486.3403173 ◽

2020 ◽

Author(s):

Zilong Bai ◽

Hoa Nguyen ◽

Ian Davidson

Keyword(s):

Feature Selection ◽

Block Model ◽

Unsupervised Feature Selection

Unsupervised Feature Selection With Extended OLSDA via Embedding Nonnegative Manifold Structure

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2020.3045053 ◽

2021 ◽

pp. 1-7

Author(s):

Rui Zhang ◽

Hongyuan Zhang ◽

Xuelong Li ◽

Sheng Yang

Keyword(s):

Feature Selection ◽

Unsupervised Feature Selection ◽

Manifold Structure

Unsupervised Feature Selection using Pseudo Label Approximation

2021 13th International Conference on Machine Learning and Computing ◽

10.1145/3457682.3457758 ◽

2021 ◽

Author(s):

Ren Deng ◽

Ye Liu ◽

Liyan Luo ◽

DongJing Chen ◽

Xijie Li

Keyword(s):

Feature Selection ◽

Unsupervised Feature Selection

Cross-view Locality Preserved Diversity and Consensus Learning for Multi-view Unsupervised Feature Selection

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2020.3048678 ◽

2021 ◽

pp. 1-1

Author(s):

Chang Tang ◽

Xiao Zheng ◽

Xinwang Liu ◽

Wei Zhang ◽

Jing Zhang ◽

...

Keyword(s):

Feature Selection ◽

Unsupervised Feature Selection

An Adaptive Unsupervised Feature Selection Algorithm Based on MDS for Tumor Gene Data Classification

Sensors ◽

10.3390/s21113627 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3627

Author(s):

Bo Jin ◽

Chunling Fu ◽

Yong Jin ◽

Wei Yang ◽

Shengbin Li ◽

...

Keyword(s):

Feature Selection ◽

Local Structure ◽

Gene Selection ◽

Dimensional Space ◽

Original Data ◽

Global Structure ◽

Biological Data ◽

Special Treatment ◽

Selection Scheme ◽

Unsupervised Feature Selection

Identifying the key genes related to tumors from gene expression data with a large number of features is important for the accurate classification of tumors and for making special treatment decisions. In recent years, unsupervised feature selection algorithms have attracted considerable attention in the field of gene selection because they can find the most discriminating subsets of genes, namely the potential information in biological data. Recent research also shows that maintaining the important structure of data is necessary for gene selection. However, most current feature selection methods capture only the local structure of the original data and ignore its global structure. We believe that the global structure and local structure of the original data are equally important, so the selected genes should maintain the essential structure of the original data as far as possible. In this paper, we propose a new, adaptive, unsupervised feature selection scheme that not only reconstructs high-dimensional data into a low-dimensional space under the constraint of feature distance invariance, but also employs the ℓ2,1-norm so that a matrix capable of performing gene selection can be embedded into the local manifold structure-learning framework. Moreover, an effective algorithm is developed to solve the optimization problem based on the proposed scheme. Comparative experiments with some classical schemes on real tumor datasets demonstrate the effectiveness of the proposed method.
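
As a rough illustration of why an ℓ2,1-norm lets a matrix perform selection, the sketch below computes the norm and ranks features by row magnitude. It assumes the common convention that each row of a weight matrix W corresponds to one feature (gene); the abstract does not state the matrix layout, and the function names and toy matrix are hypothetical. The ℓ2,1-norm is the sum of the ℓ2 norms of the rows, so penalizing it pushes whole rows toward zero, and the features whose rows keep a large norm are the ones retained.

```python
import math


def row_norms(W):
    """l2 norm of each row of W (assumed: one row per feature)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """l2,1-norm: the sum of the row-wise l2 norms."""
    return sum(row_norms(W))


def top_features_by_row_norm(W, k):
    """Indices of the k rows with the largest l2 norm, largest first."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:k]


# Toy weight matrix (assumed): 3 features projected into 2 dimensions.
W = [
    [3.0, 4.0],  # strong row: this feature would be kept
    [0.0, 0.0],  # all-zero row: this feature would be discarded
    [0.0, 1.0],
]
print(l21_norm(W), top_features_by_row_norm(W, 2))
```

This covers only the scoring step after a matrix has been learned. It does not implement the paper's optimization, the distance-invariance constraint, or the manifold learning component.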
