An effective procedure for feature subset selection in logistic regression based on information criteria

Abstract: This paper addresses the problem of best subset selection in logistic regression. In particular, we consider formulations of the problem that result from adopting information criteria, such as AIC or BIC, as goodness-of-fit measures. Various methods exist to tackle this problem. Heuristic methods are computationally cheap but usually find only low-quality solutions, and methods based on local optimization suffer from similar limitations. Methods based on mixed-integer reformulations of the problem are much more effective, but at the cost of higher computational requirements that become unsustainable as the problem size grows. We therefore propose a new approach that combines mixed-integer programming and decomposition techniques to overcome these scalability issues. We provide a theoretical characterization of the properties of the proposed algorithm. The results of extensive numerical experiments, performed on widely available datasets, show that the proposed method outperforms state-of-the-art techniques.
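To make the problem statement concrete, here is a minimal sketch of subset selection under an information criterion, where each candidate subset is scored as 2 × (negative log-likelihood) + penalty × (number of fitted parameters), with penalty 2 for AIC and log n for BIC. It uses brute-force enumeration of all subsets, which is not the mixed-integer/decomposition method the abstract proposes and is only feasible for a handful of features. The gradient-descent fit, all function names, and the toy data are assumptions made for illustration.

```python
import itertools
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, steps=2000, lr=0.5):
    # Plain gradient descent on the mean negative log-likelihood.
    # An intercept b is always fitted; this is a stand-in for a proper solver.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j, xj in enumerate(xi):
                gw[j] += err * xj
        b -= lr * gb / n
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
    return w, b


def neg_log_likelihood(w, b, X, y):
    # Sum over samples of log(1 + exp(z)) - y * z, for labels y in {0, 1}.
    total = 0.0
    for xi, yi in zip(X, y):
        z = b + sum(wj * xj for wj, xj in zip(w, xi))
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
    return total


def penalty(criterion, n):
    # Per-parameter penalty: 2 for AIC, log(n) for BIC.
    if criterion == "aic":
        return 2.0
    if criterion == "bic":
        return math.log(n)
    raise ValueError("criterion must be 'aic' or 'bic'")


def information_criterion(X, y, criterion):
    w, b = fit_logistic(X, y)
    k = len(w) + 1  # selected features plus the intercept
    return 2.0 * neg_log_likelihood(w, b, X, y) + penalty(criterion, len(y)) * k


def best_subset(X, y, criterion="aic"):
    # Exhaustive search over all 2^p feature subsets (small p only).
    p = len(X[0])
    best = None
    for size in range(p + 1):
        for subset in itertools.combinations(range(p), size):
            Xs = [[row[j] for j in subset] for row in X]
            score = information_criterion(Xs, y, criterion)
            if best is None or score < best[1]:
                best = (subset, score)
    return best


# Hypothetical toy data: two features, eight samples.
X = [[-2.0, 1.0], [-1.5, -1.0], [-1.0, 1.0], [-0.5, -1.0],
     [0.5, 1.0], [1.0, -1.0], [1.5, 1.0], [2.0, -1.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]
print(best_subset(X, y, "aic"))
print(best_subset(X, y, "bic"))
```

The number of subsets doubles with every added feature, which is the scalability issue that motivates heuristics and mixed-integer reformulations.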


Feature subset selection for logistic regression via mixed integer optimization

Computational Optimization and Applications ◽

10.1007/s10589-016-9832-2 ◽

2016 ◽

Vol 64 (3) ◽

pp. 865-880 ◽

Cited By ~ 18

Author(s):

Toshiki Sato ◽

Yuichi Takano ◽

Ryuhei Miyashiro ◽

Akiko Yoshise

Keyword(s):

Logistic Regression ◽

Subset Selection ◽

Feature Subset Selection ◽

Mixed Integer ◽

Feature Subset ◽

Integer Optimization ◽

Mixed Integer Optimization ◽

Selection For


Empirical evaluation of feature subset selection based on a real-world data set

Engineering Applications of Artificial Intelligence ◽

10.1016/j.engappai.2004.03.005 ◽

2004 ◽

Vol 17 (3) ◽

pp. 285-288 ◽

Cited By ~ 5

Author(s):

Petra Perner ◽

Chid Apte

Keyword(s):

Real World ◽

Empirical Evaluation ◽

Subset Selection ◽

Feature Subset Selection ◽

Feature Subset ◽

Real World Data ◽

Data Set ◽

World Data


Threshold accepting trained principal component neural network and feature subset selection: Application to bankruptcy prediction in banks

Applied Soft Computing ◽

10.1016/j.asoc.2007.12.003 ◽

2008 ◽

Vol 8 (4) ◽

pp. 1539-1548 ◽

Cited By ~ 67

Author(s):

V. Ravi ◽

C. Pramodh

Keyword(s):

Neural Network ◽

Subset Selection ◽

Principal Component ◽

Bankruptcy Prediction ◽

Feature Subset Selection ◽

Feature Subset ◽

Threshold Accepting


Interaction between feature subset selection techniques and machine learning classifiers for detecting unsolicited emails

ACM SIGAPP Applied Computing Review ◽

10.1145/2600617.2600622 ◽

2014 ◽

Vol 14 (1) ◽

pp. 53-61 ◽

Cited By ~ 15

Author(s):

Shrawan Kumar Trivedi ◽

Shubhamoy Dey

Keyword(s):

Machine Learning ◽

Subset Selection ◽

Feature Subset Selection ◽

Feature Subset ◽

Machine Learning Classifiers ◽

Learning Classifiers


An intelligent hybrid feature subset selection and production pattern recognition method for modeling steam cracking process

Journal of Analytical and Applied Pyrolysis ◽

10.1016/j.jaap.2021.105352 ◽

2021 ◽

pp. 105352

Author(s):

Qing Li ◽

Mengxuan Zhang ◽

Xiaogang Shi ◽

Xingying Lan ◽

Xuqiang Guo ◽

...

Keyword(s):

Pattern Recognition ◽

Subset Selection ◽

Feature Subset Selection ◽

Feature Subset ◽

Pattern Recognition Method ◽

Recognition Method ◽

Steam Cracking ◽

Production Pattern ◽

Cracking Process


A New Approach for Feature Subset Selection using Quantum Inspired Owl Search Algorithm

2020 10th International Conference on Information Science and Technology (ICIST) ◽

10.1109/icist49303.2020.9202140 ◽

2020 ◽

Author(s):

Ashis Kumar Mandal ◽

Rikta Sen ◽

Saptarsi Goswami ◽

Amlan Chakrabarti ◽

Basabi Chakraborty

Keyword(s):

Search Algorithm ◽

Subset Selection ◽

Feature Subset Selection ◽

Feature Subset ◽

New Approach


Cascading GA & CFS for Feature Subset selection in Medical Data Mining

2009 IEEE International Advance Computing Conference ◽

10.1109/iadcc.2009.4809226 ◽

2009 ◽

Cited By ~ 11

Author(s):

Asha Gowda Karegowda ◽

M.A. Jayaram

Keyword(s):

Data Mining ◽

Subset Selection ◽

Medical Data ◽

Feature Subset Selection ◽

Feature Subset ◽

Medical Data Mining


Acceleration of Feature Subset Selection Using CUDA

2018 14th International Conference on Computational Intelligence and Security (CIS) ◽

10.1109/cis2018.2018.00038 ◽

2018 ◽

Cited By ~ 1

Author(s):

Jun Yang ◽

Siyuan Jing

Keyword(s):

Subset Selection ◽

Feature Subset Selection ◽

Feature Subset


Klasifikasi Resiko Kehamilan Menggunakan Ensemble Learning berbasis Classification Tree (Pregnancy Risk Classification Using Classification Tree-Based Ensemble Learning)

INFORMAL: Informatics Journal ◽

10.19184/isj.v6i3.28396 ◽

2021 ◽

Vol 6 (3) ◽

pp. 177

Author(s):

Muhamad Arief Hidayat

Keyword(s):

Ensemble Learning ◽

Classification Tree ◽

Subset Selection ◽

Feature Subset Selection ◽

Feature Subset ◽

Pregnancy Risk ◽

Risk Status ◽

Cost Sensitive Learning ◽

Best Value ◽

Selection Stage

In health science, the Poedji Rochyati score is a technique for determining the level of pregnancy risk. In this technique, the risk level is calculated from the values of 22 parameters obtained from pregnant women. Under certain conditions, some parameter values are unknown, so the level of pregnancy risk cannot be calculated. A way to predict pregnancy risk status when attribute values are incomplete is therefore needed. Several studies have tried to overcome this problem. The study "classification of pregnancy risk using cost sensitive learning" [3] applies cost-sensitive learning to the classification of pregnancy risk level. In that study, the best classification accuracy achieved was 73% and the best value was 77.9%. To increase the accuracy and recall of pregnancy risk status prediction, this study proposes the following improvements: 1) using ensemble learning based on classification trees, and 2) using the SVMattributeEvaluator evaluator to optimize the feature subset selection stage. In trials using the classification tree-based ensemble learning method with SVMattributeEvaluator at the feature subset selection stage, the best accuracy reached 76% and the best recall reached 89.5%.
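For readers comparing the two metrics reported above, the following is a small sketch of how accuracy and recall are commonly computed from predicted labels. The label names and the example predictions are hypothetical and are not data from the study.

```python
def accuracy(y_true, y_pred):
    # Fraction of samples whose predicted label matches the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive):
    # Fraction of truly positive samples that were predicted as positive.
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels for five cases.
y_true = ["high", "high", "low", "low", "high"]
y_pred = ["high", "low", "low", "low", "high"]

print(accuracy(y_true, y_pred))        # 4 of 5 correct
print(recall(y_true, y_pred, "high"))  # 2 of 3 high-risk cases found
```

Recall is often the more important figure in risk screening, since a missed high-risk case usually costs more than a false alarm.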


IFSS: An Improved Filter-Wrapper Algorithm for Feature Subset Selection

International Journal of Computer Applications ◽

10.5120/16665-6656 ◽

2014 ◽

Vol 95 (14) ◽

pp. 33-35

Author(s):

Saurabh Soni ◽

Pratik Patel

Keyword(s):

Subset Selection ◽

Feature Subset Selection ◽

Feature Subset
