Unsupervised feature selection based on feature relevance

Author(s):  
Feng Zhang ◽  
Ya-Jun Zhao ◽  
Jun-Fen
Author(s):  
Jundong Li ◽  
Jiliang Tang ◽  
Huan Liu

Feature selection has been proven to be effective and efficient in preparing high-dimensional data for data mining and machine learning problems. Since real-world data is usually unlabeled, unsupervised feature selection has received increasing attention in recent years. Without label information, unsupervised feature selection needs alternative criteria to define feature relevance. Recently, data reconstruction error has emerged as a new criterion for unsupervised feature selection; it defines feature relevance as the capability of features to approximate the original data via a reconstruction function. Most existing algorithms in this family assume predefined, linear reconstruction functions. However, the reconstruction function should be data dependent and may not always be linear, especially when the original data is high-dimensional. In this paper, we investigate how to learn the reconstruction function from the data automatically for unsupervised feature selection, and propose a novel reconstruction-based unsupervised feature selection framework, REFS, which embeds the reconstruction function learning process into feature selection. Experiments on various types of real-world datasets demonstrate the effectiveness of the proposed framework.
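The reconstruction-error criterion described in this abstract can be illustrated with a minimal sketch. This is not REFS: it assumes the predefined, linear reconstruction function that the abstract attributes to earlier methods (here an ordinary least-squares map) and a simple greedy search. The function names `reconstruction_error` and `greedy_select` and the synthetic data are hypothetical, introduced only for illustration.

```python
import numpy as np

def reconstruction_error(X, selected):
    """Squared Frobenius error of rebuilding all columns of X from the
    columns in `selected`, using a least-squares linear map.
    Assumption: the reconstruction function is linear (not learned as in REFS)."""
    X_s = X[:, selected]
    # W solves min_W ||X - X_s W||_F^2
    W, *_ = np.linalg.lstsq(X_s, X, rcond=None)
    return float(np.linalg.norm(X - X_s @ W, "fro") ** 2)

def greedy_select(X, k):
    """Greedily add the feature that most reduces the reconstruction error."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        best = min(remaining,
                   key=lambda j: reconstruction_error(X, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic data (hypothetical): columns 2 and 3 are linear
# combinations of columns 0 and 1, so the data has rank 2.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 1],
    base[:, 0] + base[:, 1],
    2 * base[:, 0] - base[:, 1],
])

chosen = greedy_select(X, 2)
err = reconstruction_error(X, chosen)
```

Because the synthetic data has rank 2, two selected features suffice to reconstruct all four columns almost exactly, while a single feature does not. A learned, possibly nonlinear reconstruction function would replace the least-squares step.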


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3627
Author(s):  
Bo Jin ◽  
Chunling Fu ◽  
Yong Jin ◽  
Wei Yang ◽  
Shengbin Li ◽  
...  

Identifying the key genes related to tumors from gene expression data with a large number of features is important for the accurate classification of tumors and for making specific treatment decisions. In recent years, unsupervised feature selection algorithms have attracted considerable attention in the field of gene selection, as they can find the most discriminating subsets of genes, namely the potential information in biological data. Recent research also shows that maintaining the important structure of the data is necessary for gene selection. However, most current feature selection methods capture only the local structure of the original data and ignore its global structure. We believe that the global and local structures of the original data are equally important, so the selected genes should preserve the essential structure of the original data as far as possible. In this paper, we propose a new, adaptive, unsupervised feature selection scheme that not only reconstructs high-dimensional data into a low-dimensional space under the constraint of feature distance invariance but also employs the ℓ2,1-norm so that a matrix capable of performing gene selection is embedded into the local manifold structure-learning framework. Moreover, an effective algorithm is developed to solve the optimization problem arising from the proposed scheme. Comparative experiments with some classical schemes on real tumor datasets demonstrate the effectiveness of the proposed method.
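As a rough illustration of why the ℓ2,1-norm is associated with feature (gene) selection, the sketch below computes the norm of a weight matrix as the sum of its row-wise Euclidean norms and ranks features by row norm. It is not the authors' algorithm; the matrix `W` and the function names `l21_norm` and `rank_features` are hypothetical, and the sketch assumes each row of `W` corresponds to one feature.

```python
import numpy as np

def l21_norm(W):
    """l2,1-norm: sum over rows of the Euclidean norm of each row."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def rank_features(W):
    """Order features from largest to smallest row norm.
    Rows driven to zero by an l2,1 penalty correspond to discarded features."""
    return np.argsort(-np.linalg.norm(W, axis=1))

# Hypothetical weight matrix: 3 features (rows), 2 latent dimensions (columns).
W = np.array([
    [3.0, 4.0],   # row norm 5
    [0.0, 0.0],   # row norm 0, feature effectively unused
    [1.0, 0.0],   # row norm 1
])

norm_value = l21_norm(W)      # 6.0
ranking = rank_features(W)    # [0, 2, 1]
```

Penalizing this norm encourages entire rows to shrink to zero, which is what lets a single matrix act as a feature selector.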


2021 ◽  
Author(s):  
Yan Min ◽  
Mao Ye ◽  
Liang Tian ◽  
Yulin Jian ◽  
Ce Zhu ◽  
...  

2021 ◽  
Vol 173 ◽  
pp. 114643
Author(s):  
Jianyu Miao ◽  
Yuan Ping ◽  
Zhensong Chen ◽  
Xiao-Bo Jin ◽  
Peijia Li ◽  
...  

2014 ◽  
Vol 44 (6) ◽  
pp. 793-804 ◽  
Author(s):  
Chenping Hou ◽  
Feiping Nie ◽  
Xuelong Li ◽  
Dongyun Yi ◽  
Yi Wu
