Class-Dependent Weighted Feature Selection as a Bi-Level Optimization Problem

Author(s):  
Marwa Hammami ◽  
Slim Bechikh ◽  
Chih-Cheng Hung ◽  
Lamjed Ben Said
Author(s):  
Mohammad Reza Besharati ◽  
Mohammad Izadi

By applying a running average (with window size d), we can transform discrete data into broad-range, continuous values. When a dataset has more than two columns and one of them contains the classification labels (the class column), we can compare and rank the features (the non-class columns) by the R² coefficient of a regression fitted to their running averages. Tuning the parameters helps us select the best features, i.e., the non-class columns that correlate most strongly with the class column. Both the window size and the row ordering can be tuned to this end. This optimization problem is hard, so an algorithm (or heuristic) is needed to simplify the tuning. We present a novel heuristic, called Simulated Distillation (SimulaD), which achieves reasonably good results on this optimization problem.
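A minimal sketch of the scoring step this abstract describes, assuming a simple moving average and a linear fit whose R² ranks each feature against the class column (the function names and the use of `numpy.polyfit` are illustrative choices, not the authors' implementation):

```python
import numpy as np

def running_average(x, d):
    """Smooth a 1-D array with a window of size d (simple moving average)."""
    kernel = np.ones(d) / d
    return np.convolve(x, kernel, mode="valid")

def r2_score_feature(feature, labels, d):
    """Score one feature: smooth both columns, fit a line, return R^2."""
    xs = running_average(feature, d)
    ys = running_average(labels, d)
    slope, intercept = np.polyfit(xs, ys, deg=1)
    residuals = ys - (slope * xs + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((ys - ys.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rank_features(X, y, d):
    """Return feature indices sorted by R^2, best first."""
    scores = [r2_score_feature(X[:, j], y, d) for j in range(X.shape[1])]
    return np.argsort(scores)[::-1]
```

Per the abstract, the row ordering is the second tunable parameter: one would sort the rows (for example by a candidate feature's values, a hypothetical choice here) before smoothing, and tune d jointly with that ordering.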


2014 ◽  
Vol 536-537 ◽  
pp. 450-453 ◽  
Author(s):  
Jiang Jiang ◽  
Xi Chen ◽  
Hai Tao Gan

In this paper, a sparsity-based model is proposed for feature selection in kernel minimum squared error (KMSE). By imposing a sparsity shrinkage term, we formulate the procedure of subset selection as an optimization problem. With only a small portion of the training examples retained, the computational burden of feature extraction is largely alleviated. Experimental results on several benchmark datasets indicate the effectiveness and efficiency of our method.
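One way to read the "sparsity shrinkage term" is as an ℓ1 penalty on the kernel expansion coefficients, so that only a few training examples keep nonzero weight. A minimal sketch under that assumption (not the authors' exact formulation), using a proximal-gradient (ISTA) step:

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian RBF kernel matrix between rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sparse_kmse(K, y, lam=0.1, n_iter=500):
    """Minimize 0.5 * ||K a - y||^2 + lam * ||a||_1 by proximal gradient.
    Nonzero entries of `a` mark the retained training examples."""
    n = K.shape[1]
    a = np.zeros(n)
    lr = 1.0 / np.linalg.norm(K, 2) ** 2  # step size from the Lipschitz bound
    for _ in range(n_iter):
        grad = K.T @ (K @ a - y)
        a = a - lr * grad
        a = np.sign(a) * np.maximum(np.abs(a) - lr * lam, 0.0)  # soft threshold
    return a
```

Increasing `lam` shrinks more coefficients exactly to zero, which is what lets the method discard most training examples before feature extraction.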


Author(s):  
Danyang Wu ◽  
Jin Xu ◽  
Xia Dong ◽  
Meng Liao ◽  
Rong Wang ◽  
...  

This paper explores a succinct kernel model for Group-Sparse Projections Learning (GSPL) to handle the multiview feature selection task completely. Compared to previous works, our model has the following useful properties: 1) Strictness: GSPL innovatively learns group-sparse projections strictly on multiview data via an ℓ2,0-norm constraint, unlike previous works that only encourage group sparsity softly. 2) Adaptivity: In the GSPL model, when the total number of selected features is given, the number of features selected from each view is determined adaptively, which avoids artificial settings. Besides, GSPL captures the differences among multiple views adaptively, which handles the inconsistency among different views. 3) Succinctness: Apart from the intrinsic parameters of any projection-based feature selection task, GSPL introduces no extra parameters, which guarantees its applicability in practice. To solve the optimization problem involved in GSPL, a novel iterative algorithm is proposed with rigorous theoretical guarantees. Experimental results demonstrate the superb performance of GSPL on synthetic and real datasets.
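The ℓ2,0-norm of a projection matrix counts its nonzero rows, so enforcing the constraint strictly amounts to keeping at most k nonzero rows, each corresponding to a selected feature. A minimal sketch of that projection step (illustrative only; the full GSPL solver and its adaptive per-view allocation are not reproduced here):

```python
import numpy as np

def project_l20(W, k):
    """Project W onto {W : at most k nonzero rows} (the l2,0 constraint):
    keep the k rows with the largest l2 norm, zero out the rest.
    Nonzero rows correspond to selected features."""
    row_norms = np.linalg.norm(W, axis=1)
    keep = np.argsort(row_norms)[-k:]   # indices of the top-k rows
    W_proj = np.zeros_like(W)
    W_proj[keep] = W[keep]
    return W_proj, np.sort(keep)
```

This hard projection is what distinguishes "strict" group sparsity from the soft approach of penalizing row norms, which only pushes rows toward zero without guaranteeing an exact feature count.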


2011 ◽  
Vol 44 (1) ◽  
pp. 10553-10558
Author(s):  
Claudio Carnevale ◽  
Giovanna Finzi ◽  
Enrico Pisoni ◽  
Marialuisa Volta

Author(s):  
Xiangrong Zhang ◽  
Fang Liu

The problem of feature selection is fundamental in various tasks such as classification, data mining, image processing, and conceptual learning. Feature selection is usually used to achieve the same or better performance with fewer features. It can be cast as an optimization problem that aims to find an optimal feature subset from the available features according to a certain criterion function. The clonal selection algorithm is a good choice for solving such an optimization problem: it introduces the mechanisms of affinity maturation, cloning, and memory, and its operations are characterized by rapid convergence and good global search capability. In this study, the clonal selection algorithm's rapid convergence to a global optimum is exploited to speed up the search for the most appropriate feature subset among a huge number of possible feature combinations. Compared with traditional genetic-algorithm-based feature selection, clonal-selection-based feature selection finds a better feature subset for classification. Experimental results on datasets from the UCI learning repository, the classification of 16 types of Brodatz textures, and synthetic aperture radar (SAR) image classification demonstrate the effectiveness and good performance of the method in applications.
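A minimal sketch of a CLONALG-style search over binary feature masks, showing the clone-hypermutate-select loop the abstract refers to (the cloning schedule, mutation rates, and `fitness` interface below are illustrative choices, not the paper's exact settings):

```python
import numpy as np

rng = np.random.default_rng(0)

def clonal_feature_search(fitness, n_features, pop=20, n_clones=5,
                          n_gen=50, base_mut=0.05):
    """CLONALG-style search: clone good antibodies, hypermutate clones
    inversely to their affinity, keep the best. `fitness(mask)` scores a
    binary feature mask (e.g., cross-validated classification accuracy)."""
    antibodies = rng.integers(0, 2, size=(pop, n_features)).astype(bool)
    for _ in range(n_gen):
        affinity = np.array([fitness(m) for m in antibodies])
        order = np.argsort(affinity)[::-1]
        antibodies = antibodies[order]
        clones = []
        for rank, parent in enumerate(antibodies):
            for _ in range(max(1, n_clones - rank // 4)):  # more clones for better ranks
                rate = base_mut * (1 + rank / pop)          # worse rank -> heavier mutation
                flips = rng.random(n_features) < rate
                clones.append(parent ^ flips)
        pool = np.vstack([antibodies, np.array(clones)])
        scores = np.array([fitness(m) for m in pool])
        antibodies = pool[np.argsort(scores)[::-1][:pop]]   # elitist memory set
    best = antibodies[0]
    return best, fitness(best)
```

The affinity-proportional cloning and inverse-affinity hypermutation are what give the algorithm its combination of local refinement around good subsets and global exploration.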


Author(s):  
Rahul Hans ◽  
Harjot Kaur

These days, a massive quantity of data is produced online and incorporated into a variety of datasets in the form of features; however, many of these features may not be relevant to the problem at hand. In this context, feature selection helps improve classification accuracy with a smaller number of features, and it can be viewed as an optimization problem. In this paper, the Sine Cosine Algorithm (SCA) is hybridized with the Ant Lion Optimizer (ALO) to form a hybrid Sine Cosine Ant Lion Optimizer (SCALO). The proposed algorithm is mapped to binary versions using the concept of transfer functions, with the objective of eliminating inappropriate features while enhancing (or at least preserving) the accuracy of the classification algorithm. For the experiments, this research considers 18 diverse datasets, and the performance of the binary versions of SCALO is compared with some of the latest metaheuristic algorithms on various criteria. It can be observed that the binary versions of SCALO outperform the other algorithms on these evaluation criteria for solving the feature selection problem.
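Binary versions of continuous metaheuristics are typically obtained by passing each continuous coordinate through a transfer function and sampling a bit from the resulting probability. A minimal sketch of the two common families, S-shaped and V-shaped (illustrative of the general mechanism; the paper's exact transfer-function variants are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(0)

def s_shaped(x):
    """S-shaped transfer: sigmoid maps a continuous coordinate to [0, 1]."""
    return 1.0 / (1.0 + np.exp(-x))

def v_shaped(x):
    """V-shaped transfer: |tanh(x)| maps magnitude to a flip probability."""
    return np.abs(np.tanh(x))

def binarize_s(position):
    """S-shaped rule: bit j is set with probability s_shaped(x_j)."""
    return rng.random(position.shape) < s_shaped(position)

def binarize_v(position, current_bits):
    """V-shaped rule: flip the current bit j with probability v_shaped(x_j)."""
    flip = rng.random(position.shape) < v_shaped(position)
    return np.where(flip, ~current_bits, current_bits)
```

Each bit of the resulting mask indicates whether the corresponding feature is kept, so the continuous SCALO dynamics can drive a discrete feature-subset search unchanged.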

