SimulaD: A Novel Feature Selection Heuristics for Discrete Data

2021 ◽  
Author(s):  
Mohammad Reza Besharati ◽  
Mohammad Izadi

Abstract: For discrete big data with a limited range of values, conventional machine learning methods cannot be applied directly, because such data exhibit clutter and class overlap: many data points from different classes coincide. In this paper we introduce a solution to this problem through a novel heuristic method. By applying a running average (with a window size d), we can transform discrete data into broad-range, continuous values. When the data have more than two columns and one of them contains the classification tags (the class column), we can compare and rank the features (the non-class columns) by the R2 coefficient of a regression over their running averages. Parameter tuning then selects the best features, i.e., the non-class columns that correlate best with the class column; both the window size and the row ordering can be tuned to this end. This optimization problem is hard, so an algorithm (or heuristic) is needed to simplify the tuning. We present a novel heuristic, called Simulated Distillation (SimulaD), which yields reasonably good results on this optimization problem.
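The running-average transform and R2-based feature ranking described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the synthetic data, and the fixed window size are assumptions, and the abstract's "ordering" parameter (how rows are sorted before smoothing) is not tuned here.

```python
import numpy as np

def running_average(x, d):
    """Smooth a 1-D discrete series with a moving-average window of size d."""
    kernel = np.ones(d) / d
    return np.convolve(x, kernel, mode="valid")

def r2_score(x, y):
    """R^2 of a simple linear regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rank_features(data, class_col, d):
    """Rank non-class columns by R^2 against the class column after smoothing both."""
    smoothed_class = running_average(data[:, class_col], d)
    scores = {}
    for j in range(data.shape[1]):
        if j == class_col:
            continue
        scores[j] = r2_score(running_average(data[:, j], d), smoothed_class)
    # best-correlated features first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

On a toy table where one feature tracks the class labels and another oscillates independently, the tracking feature ranks first.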


2019 ◽  
Vol 9 (3) ◽  
pp. 437 ◽  
Author(s):  
Shen Su ◽  
Yanbin Sun ◽  
Xiangsong Gao ◽  
Jing Qiu ◽  
Zhihong Tian

Selecting the right features for further data analysis is important in equipment anomaly detection, especially when the original data source involves high-dimensional data with a low value density. However, existing research fails to capture the fact that sensor data are usually correlated (e.g., because of duplicated deployed sensors) and that these correlations break when anomalies occur in the monitored equipment. In this paper, we propose to capture such sensor-data correlation changes to improve the performance of IoT (Internet of Things) equipment anomaly detection. In our feature selection method, we first cluster correlated sensors together to recognize the duplicated deployed sensors according to sensor-data correlations, and we then monitor the data correlation changes in real time, selecting the sensors whose correlations change as the representative features for anomaly detection. To that end, (1) we conducted curve alignment for the sensor clustering; (2) we discuss the appropriate window size for the data correlation calculation; and (3) we adopted MCFS (Multi-Cluster Feature Selection) in our method to adapt it to the online feature selection scenario. In an experimental evaluation on real IoT equipment, our method reduces the false negatives of IoT equipment anomaly detection by 30% at almost the same level of false positives.
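The core idea of monitoring sliding-window correlations between duplicated sensors can be sketched as below. This is a toy illustration under assumed names and thresholds, not the paper's method (which additionally uses curve alignment and MCFS): a pair of sensors that normally track each other loses correlation when an anomaly occurs.

```python
import numpy as np

def window_correlation(a, b):
    """Pearson correlation of two equal-length data windows."""
    return float(np.corrcoef(a, b)[0, 1])

def correlation_change(sensor_a, sensor_b, window, step=None):
    """Track how the correlation between two sensors evolves across sliding windows."""
    step = step or window
    corrs = []
    for start in range(0, len(sensor_a) - window + 1, step):
        corrs.append(window_correlation(sensor_a[start:start + window],
                                        sensor_b[start:start + window]))
    return corrs

def flag_anomalous_windows(corrs, threshold=0.5):
    """Flag windows where a normally-correlated pair drops below the threshold."""
    return [i for i, c in enumerate(corrs) if abs(c) < threshold]
```

With two sensors that are identical for the first half of the trace and decoupled afterwards, only the later windows are flagged.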


2014 ◽  
Vol 536-537 ◽  
pp. 450-453 ◽  
Author(s):  
Jiang Jiang ◽  
Xi Chen ◽  
Hai Tao Gan

In this paper, a sparsity-based model is proposed for feature selection in kernel minimum squared error (KMSE). By imposing a sparsity shrinkage term, we formulate subset selection as an optimization problem. Because only a small portion of the training examples is retained, the computational burden of feature extraction is largely alleviated. Experimental results on several benchmark datasets indicate the effectiveness and efficiency of our method.
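One way to realize a sparsity shrinkage term on the kernel expansion coefficients is l1-penalized least squares solved by ISTA (iterative shrinkage-thresholding); coefficients driven exactly to zero drop their training examples from the expansion, which is the subset-selection effect the abstract describes. This is a hedged sketch of the general idea, not the paper's exact formulation: the kernel, penalty weight, and iteration count are assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gram matrix of the RBF kernel over the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def soft_threshold(z, t):
    """Proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_kmse(K, y, lam=1.0, iters=500):
    """Minimize 0.5*||K b - y||^2 + lam*||b||_1 by ISTA.

    Zeroed coefficients remove their training examples from the
    kernel expansion, shrinking the working set.
    """
    lr = 1.0 / (np.linalg.norm(K, 2) ** 2)  # step size 1/L for the smooth part
    beta = np.zeros(K.shape[0])
    for _ in range(iters):
        grad = K.T @ (K @ beta - y)
        beta = soft_threshold(beta - lr * grad, lr * lam)
    return beta
```

On a small two-class toy problem, the fit uses only a strict subset of the training examples while still reducing the residual.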


2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

Algorithmic search approaches are ineffective at addressing the problem of multi-dimensional feature selection for document categorization. This study proposes a metaheuristic search approach for optimal feature selection. Elephant Optimization (EO) and Ant Colony Optimization (ACO) algorithms, coupled with Naïve Bayes (NB), Support Vector Machine (SVM), and J48 classifiers, were used to highlight the optimization capability of metaheuristic search for the multi-dimensional feature selection problem in document categorization. In addition, the feature selection performance of the two metaheuristic approaches (EO and ACO) was compared with that of the conventional Best First Search (BFS) and Greedy Stepwise (GS) algorithms on news document categorization. The comparative results showed that globally optimal feature subsets were attained through adaptive parameter tuning in the metaheuristic feature selection optimization scheme. In addition, the number of selected features was reduced dramatically for document classification.
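A toy ant-colony search over feature subsets illustrates how pheromone reinforcement steers a metaheuristic toward good subsets. The inclusion-probability rule, evaporation rate, and scoring function below are illustrative assumptions, not the study's EO/ACO configuration (which wraps real classifiers such as NB, SVM, and J48).

```python
import random

def aco_feature_select(n_features, score_fn, n_ants=10, n_iters=30, rho=0.1, seed=0):
    """Toy ant-colony search over feature subsets.

    Each feature i carries a pheromone level; an ant includes feature i
    with probability pheromone[i] / (1 + pheromone[i]).  The best subset
    found in each iteration deposits pheromone on its features.
    """
    rng = random.Random(seed)
    pheromone = [1.0] * n_features
    best_subset, best_score = [], float("-inf")
    for _ in range(n_iters):
        iter_best, iter_score = [], float("-inf")
        for _ in range(n_ants):
            subset = [i for i in range(n_features)
                      if rng.random() < pheromone[i] / (1.0 + pheromone[i])]
            s = score_fn(subset)
            if s > iter_score:
                iter_best, iter_score = subset, s
        if iter_score > best_score:
            best_subset, best_score = iter_best, iter_score
        pheromone = [p * (1.0 - rho) for p in pheromone]   # evaporation
        for i in iter_best:
            pheromone[i] += max(iter_score, 0.0)           # reinforcement
    return best_subset, best_score
```

With a scoring function that rewards three "good" features and penalizes the rest, the search settles on a small, good-dominated subset.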


Author(s):  
Danyang Wu ◽  
Jin Xu ◽  
Xia Dong ◽  
Meng Liao ◽  
Rong Wang ◽  
...  

This paper explores a succinct kernel model for Group-Sparse Projections Learning (GSPL) to handle the multiview feature selection task completely. Compared to previous works, our model has the following useful properties: 1) Strictness: GSPL innovatively learns group-sparse projections strictly on multiview data via an ℓ2,0-norm constraint, unlike previous works, which encourage group-sparse projections only softly. 2) Adaptivity: In the GSPL model, once the total number of selected features is given, the number of features selected from each view is determined adaptively, which avoids artificial settings. Moreover, GSPL adaptively captures the differences among multiple views, which handles the inconsistency problem among views. 3) Succinctness: Apart from the intrinsic parameters of a projection-based feature selection task, GSPL introduces no extra parameters, which guarantees its applicability in practice. To solve the optimization problem involved in GSPL, a novel iterative algorithm is proposed with rigorous theoretical guarantees. Experimental results demonstrate the superb performance of GSPL on synthetic and real datasets.
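The effect of an ℓ2,0-norm (row-sparsity) constraint on a projection matrix can be illustrated by hard-thresholding rows by their ℓ2 norms: keeping only the k largest-norm rows selects exactly k features. This is a simplified illustration of the constraint itself, not the GSPL algorithm; the function name and example matrix are assumptions.

```python
import numpy as np

def group_sparse_project(W, k):
    """Enforce an l2,0 row-sparsity constraint on projection matrix W.

    Keeps the k rows with the largest l2 norms (each row corresponds to
    one feature's projection weights) and zeroes out the rest.
    """
    norms = np.linalg.norm(W, axis=1)
    keep = np.argsort(norms)[-k:]           # indices of the k strongest rows
    W_sparse = np.zeros_like(W)
    W_sparse[keep] = W[keep]
    return W_sparse, sorted(keep.tolist())
```

For a 4-feature projection where rows 0 and 2 dominate, k = 2 selects exactly those two features.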

