A Comparative Study of Meta-Heuristic and Conventional Search in Optimization of Multi-Dimensional Feature Selection

Algorithmic search approaches are ineffective at addressing the problem of multi-dimensional feature selection for document categorization. This study proposes a meta-heuristic search approach for optimal feature selection. Elephant Optimization (EO) and Ant Colony Optimization (ACO) algorithms, coupled with Naïve Bayes (NB), Support Vector Machine (SVM), and J48 classifiers, were used to demonstrate the optimization capability of meta-heuristic search for the multi-dimensional feature selection problem in document categorization. In addition, the feature selection results of the two meta-heuristic approaches (EO and ACO) were compared with those of the conventional Best First Search (BFS) and Greedy Stepwise (GS) algorithms on news document categorization. The comparison showed that the meta-heuristic feature selection scheme attained globally optimal feature subsets through adaptive parameter tuning. In addition, the number of selected features was reduced dramatically for document classification.
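To illustrate why a stepwise search can miss a globally optimal subset, here is a minimal sketch of greedy forward (stepwise) selection. It is not the study's implementation: the function names and the toy scoring function are hypothetical, and in the setting above the score would presumably be a classifier's accuracy on the candidate subset.

```python
from typing import Callable, FrozenSet, List


def greedy_stepwise(n_features: int,
                    score: Callable[[FrozenSet[int]], float]) -> List[int]:
    """Forward selection: repeatedly add the single feature that raises the
    score the most, and stop as soon as no single addition helps."""
    selected: List[int] = []
    best = score(frozenset())
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(score(frozenset(selected + [f])), f) for f in candidates]
        top_score, top_feature = max(scored)
        if top_score <= best:
            break  # local optimum: no single feature improves the score
        selected.append(top_feature)
        best = top_score
    return selected


def toy_score(subset: FrozenSet[int]) -> float:
    """Hypothetical score: feature 0 is useful alone, while features 1 and 2
    are only useful together. Each selected feature costs a small penalty."""
    base = 0.0
    if 0 in subset:
        base = 0.6
    if 1 in subset and 2 in subset:
        base = 0.9
    return base - 0.05 * len(subset)


greedy_result = greedy_stepwise(3, toy_score)
```

On this toy problem the greedy search stops at feature 0, even though the pair of features 1 and 2 scores higher. Population-based meta-heuristics such as ACO evaluate whole subsets and can therefore reach combinations that no single stepwise move leads to.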

Minimax feature selection problem for constructing a classifier using support vector machines

Computational Mathematics and Mathematical Physics ◽

10.1134/s0965542510050143 ◽

2010 ◽

Vol 50 (5) ◽

pp. 917-925

Author(s):

Yu. V. Goncharov

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Selection Problem ◽

Support Vector ◽

Feature Selection Problem ◽

Vector Machines

Feature Selection Using Random Forest Algorithm to Diagnose Tuberculosis From Lung CT Images

AI Innovation in Medical Imaging Diagnostics - Advances in Medical Technologies and Clinical Practice ◽

10.4018/978-1-7998-3092-4.ch005 ◽

2021 ◽

pp. 92-100

Author(s):

Beaulah Jeyavathana Rajendran ◽

Kanimozhi K. V.

Keyword(s):

Feature Selection ◽

Random Forest ◽

The Body ◽

Support Vector ◽

Feature Descriptor ◽

Feature Sets ◽

Modified Particle Swarm Optimization ◽

Tuberculosis Disease ◽

Optimal Feature ◽

Lung Ct

Tuberculosis is a hazardous infectious disease characterized by the growth of tubercles in the tissues. It mainly affects the lungs but can also affect other parts of the body. The disease can be easily diagnosed by radiologists. The main objective of this chapter is to obtain the best solution, selected by means of modified particle swarm optimization, which is regarded as the optimal feature descriptor. Five stages are used to detect tuberculosis: image pre-processing, lung segmentation, feature extraction, feature selection, and classification. These stages are used in medical image processing to identify tuberculosis. In feature extraction, the gray-level co-occurrence matrix (GLCM) approach is used to extract the features, and the optimal features are then selected from the extracted feature sets by random forest. Finally, a support vector machine classifier is used for image classification. The experimentation is done, and intermediate results are obtained. The proposed system achieves better classification accuracy than the existing method.
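As an illustration of the GLCM feature extraction step, the sketch below builds a normalized gray-level co-occurrence matrix for one pixel offset and derives the contrast feature from it. The function names, the single horizontal offset, and the tiny test image are assumptions made for the example; the chapter's actual offsets, gray-level quantization, and feature set are not given in the abstract.

```python
from typing import List


def glcm(image: List[List[int]], levels: int,
         dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Normalized (non-symmetric) co-occurrence matrix: entry [i][j] is the
    fraction of pixel pairs where a pixel of level i has a neighbor of level j
    at offset (dy, dx). Assumes at least one such pair exists in the image."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]


def contrast(p: List[List[float]]) -> float:
    """GLCM contrast: co-occurrence probabilities weighted by the squared
    difference between the two gray levels."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Two-level toy image; horizontal pairs are (0,0), (0,1), (1,1), (1,0).
toy_image = [[0, 0, 1],
             [1, 1, 0]]
toy_glcm = glcm(toy_image, levels=2)
toy_contrast = contrast(toy_glcm)
```

Features of this kind (contrast and similar statistics) would then form the feature set from which a selector such as random forest picks a subset.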

Binary Political Optimizer for Feature Selection Using Gene Expression Data

Computational Intelligence and Neuroscience ◽

10.1155/2020/8896570 ◽

2020 ◽

Vol 2020 ◽

pp. 1-14

Author(s):

Ghaith Manita ◽

Ouajdi Korbaa

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Gene Expression Data ◽

Transfer Functions ◽

Research Field ◽

Sigmoid Function ◽

Expression Data ◽

Feature Selection Problem ◽

Single Experiment ◽

Comparative Results

DNA microarray technology is an emergent field that makes it possible to estimate the expression levels of several thousand genes in an organism simultaneously in a single experiment. One of the most significant challenges in this research field is to select highly relevant genes from gene expression data. Feature selection is a well-known technique for addressing this problem: it eliminates unnecessary genes in order to ensure accurate classification results. This paper proposes a binary version of the Political Optimizer (PO) to solve the feature selection problem using gene expression data. Two transfer functions are used to design a binary PO. The first is based on the sigmoid function and is denoted BPO-S, while the second is based on a V-shaped function and is denoted BPO-V. The proposed methods are evaluated on 9 biological datasets and compared with 8 well-known binary metaheuristics. The comparative results show the superior performance of the BPO methods, especially BPO-V, in comparison with the other techniques.
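The role of a transfer function is to turn a continuous search position into a binary feature mask. The sketch below shows one common S-shaped rule and one common V-shaped rule; the specific formulas (logistic sigmoid and |tanh|) and all function names are assumptions for illustration, since the abstract does not state which variants BPO-S and BPO-V use.

```python
import math
import random
from typing import List


def s_shaped(x: float) -> float:
    """Logistic sigmoid: maps a continuous value to a probability of bit = 1."""
    return 1.0 / (1.0 + math.exp(-x))


def v_shaped(x: float) -> float:
    """|tanh(x)|: maps a continuous value to a probability of flipping a bit."""
    return abs(math.tanh(x))


def binarize_s(position: List[float], rng: random.Random) -> List[int]:
    """S-shaped rule: each bit is set to 1 with probability s_shaped(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position: List[float], current_bits: List[int],
               rng: random.Random) -> List[int]:
    """V-shaped rule: each current bit is flipped with probability v_shaped(x),
    so values near zero leave the existing selection unchanged."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]


rng = random.Random(0)
mask_s = binarize_s([50.0, 0.3, -0.7], rng)
mask_v = binarize_v([0.0, 0.0, 0.0], [1, 0, 1], rng)
```

A bit value of 1 would mean the corresponding gene is kept. The practical difference is that the S-shaped rule resamples every bit from scratch, whereas the V-shaped rule only perturbs the current mask.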

Identification of Novel COVID-19 Biomarkers by Multiple Feature Selection Strategies

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/2203636 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Shuai Zhang ◽

Renliang Qu ◽

Pengyan Wang ◽

Shenghan Wang

Keyword(s):

Feature Selection ◽

Expression Profiles ◽

Feature Selection Method ◽

Principal Component ◽

Enrichment Analysis ◽

Functional Enrichment ◽

Nucleic Acid Detection ◽

Support Vector ◽

New Biomarkers ◽

Optimal Feature

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has resulted in a global pandemic since it was first reported in December 2019. So far, SARS-CoV-2 nucleic acid detection has been regarded as the gold standard of COVID-19 diagnosis. However, this detection method often produces false negatives, leading to missed COVID-19 diagnoses. Therefore, new biomarkers are urgently needed to increase the accuracy of COVID-19 diagnosis. To explore new biomarkers of COVID-19 in this study, expression profiles were first accessed from the GEO database. On this basis, 500 feature genes were screened by the minimum-redundancy maximum-relevancy (mRMR) feature selection method. Afterwards, the incremental feature selection (IFS) method was used to choose the best-performing classifier from support vector machine (SVM) classifiers built on different numbers of feature genes. The corresponding 66 feature genes were set as the optimal feature genes. Lastly, the optimal feature genes were subjected to GO functional enrichment analysis, principal component analysis (PCA), and protein-protein interaction (PPI) network analysis. Overall, the 66 feature genes were posited to classify positive and negative COVID-19 cases effectively and to serve as new biomarkers of the disease.
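The incremental feature selection step can be sketched as follows: given a ranked feature list (here assumed to come from mRMR), evaluate a classifier on the top-1, top-2, ... prefixes and keep the prefix with the best score. This is a minimal sketch under stated assumptions, not the authors' code; the function name, the gene labels, and the score table are hypothetical stand-ins for a cross-validated SVM evaluation.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_feature_selection(
        ranked: Sequence[str],
        evaluate: Callable[[List[str]], float]) -> Tuple[List[str], float]:
    """Evaluate every prefix of the ranked list and return the best one.
    Ties are resolved in favor of the shorter prefix."""
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked) + 1):
        subset = list(ranked[:k])
        current = evaluate(subset)
        if current > best_score:
            best_subset, best_score = subset, current
    return best_subset, best_score


# Hypothetical ranking and scores; a real run would train and validate
# a classifier on each prefix instead of looking up a table.
ranked_genes = ["g1", "g2", "g3", "g4", "g5"]
scores_by_size = {1: 0.70, 2: 0.85, 3: 0.91, 4: 0.90, 5: 0.88}


def toy_evaluate(subset: List[str]) -> float:
    return scores_by_size[len(subset)]


best_genes, best_score = incremental_feature_selection(ranked_genes, toy_evaluate)
```

In the study's terms, the prefix length at which the score peaks corresponds to the 66 optimal feature genes chosen from the 500 ranked candidates.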

FEATURE SELECTION VIA LEAST SQUARES SUPPORT FEATURE MACHINE

International Journal of Information Technology & Decision Making ◽

10.1142/s0219622007002733 ◽

2007 ◽

Vol 06 (04) ◽

pp. 671-686 ◽

Cited By ~ 37

Author(s):

JIANPING LI ◽

ZHENYU CHEN ◽

LIWEI WEI ◽

WEIXUAN XU ◽

GANG KOU

Keyword(s):

Feature Selection ◽

Least Squares ◽

Credit Card ◽

Feature Selection Method ◽

Support Vector ◽

Credit Risk Management ◽

Feature Selection Problem ◽

Feature Parameters ◽

Single Feature ◽

Effectiveness And Efficiency

In many applications, such as credit risk management, data are represented as high-dimensional feature vectors. This makes feature selection necessary to reduce computational complexity and to improve generalization ability and interpretability. In this paper, we present a novel feature selection method, the "Least Squares Support Feature Machine" (LS-SFM). The proposed method has two advantages compared with the conventional Support Vector Machine (SVM) and LS-SVM. First, convex combinations of basic kernels are used as the kernel, and each basic kernel makes use of a single feature. This transforms the feature selection problem, which cannot be solved in the context of SVM, into an ordinary multiple-parameter learning problem. Second, all parameters are learned by a two-stage iterative algorithm. A 1-norm based regularized cost function is used to enforce sparseness of the feature parameters. The "support features" are the features with nonzero feature parameters. An experimental study on some of the UCI datasets and a commercial credit card dataset demonstrates the effectiveness and efficiency of the proposed approach.
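The kernel construction described above can be sketched as a convex combination of per-feature kernels, K(x, z) = sum_d beta_d * k(x_d, z_d) with beta_d >= 0 and sum_d beta_d = 1. The code below is an illustrative sketch only: the choice of an RBF basic kernel, the gamma parameter, and the function names are assumptions, and the learning of the weights (the paper's two-stage algorithm) is not shown.

```python
import math
from typing import List, Sequence


def feature_kernel(a: float, b: float, gamma: float = 1.0) -> float:
    """Basic kernel on a single feature (an RBF kernel is assumed here)."""
    return math.exp(-gamma * (a - b) ** 2)


def combined_kernel(x: Sequence[float], z: Sequence[float],
                    beta: Sequence[float], gamma: float = 1.0) -> float:
    """Convex combination of single-feature kernels with weights beta."""
    if any(b < 0 for b in beta) or abs(sum(beta) - 1.0) > 1e-9:
        raise ValueError("beta must be non-negative and sum to 1")
    return sum(b * feature_kernel(xi, zi, gamma)
               for b, xi, zi in zip(beta, x, z))


def support_features(beta: Sequence[float], tol: float = 1e-12) -> List[int]:
    """Indices of features whose weight is nonzero."""
    return [d for d, b in enumerate(beta) if b > tol]


beta = [0.5, 0.0, 0.5]  # hypothetical sparse weights
same = combined_kernel([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], beta)
ignores_feature_1 = combined_kernel([1.0, 2.0, 3.0], [1.0, 99.0, 3.0], beta)
kept = support_features(beta)
```

Because a feature with zero weight contributes nothing to the kernel value, driving weights to zero with a sparsity-inducing penalty amounts to removing those features from the model.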

Developing Machine Learning Models to Predict Roadway Traffic Noise: An Opportunity to Escape Conventional Techniques

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198119838514 ◽

2019 ◽

Vol 2673 (4) ◽

pp. 158-172 ◽

Cited By ~ 1

Author(s):

Mohamad Ali Khalil ◽

Khaled Hamad ◽

Abdallah Shanableh

Keyword(s):

Machine Learning ◽

Feature Selection ◽

United Arab Emirates ◽

Traffic Noise ◽

Free Field ◽

Model Performance ◽

Noise Model ◽

Support Vector ◽

Feature Selection Technique ◽

Performance Results

Accurate prediction of roadway traffic noise remains challenging. Many researchers continue to improve the performance of their models by either adding more variables or improving their modeling algorithms. In this research, machine learning (ML) modeling techniques were developed to predict roadway traffic noise accurately. The ML techniques applied were: regression decision trees, support vector machine, ensembles, and artificial neural network. The parameters of each of these models were fine-tuned to achieve the best performance results. In addition, a state-of-the-art hybrid feature-selection technique has been employed to select a minimum set of input features (variables) while maintaining the accuracy of the developed models. By optimizing the number of features used in the model, the resources needed to develop and utilize a model to predict roadway noise would be less, hence decreasing the development cost. The proposed approach has been applied to develop a free-field roadway traffic noise model for Sharjah City in the United Arab Emirates. The best developed ML model was compared with a conventional regression model which was developed earlier under the same conditions. The cross-validated results clearly indicate that the best ML model outperformed the regression modeling. The performance of the ML model was also assessed after reducing the number of its input features based on the outcome of the feature-selection algorithm; the model performance was slightly affected. This result emphasizes the importance of considering only features that greatly influence the roadway traffic noise.

Product Review Based Customer Sentiment Analysis using an Ensemble of mRMR and Forest Optimization Algorithm (FOA)

International Journal of Applied Metaheuristic Computing ◽

10.4018/ijamc.2022010107 ◽

2022 ◽

Vol 13 (1) ◽

pp. 0-0

Keyword(s):

Feature Selection ◽

Sentiment Analysis ◽

Optimization Algorithm ◽

Nearest Neighbor ◽

Hybrid Approach ◽

Support Vector ◽

K Nearest Neighbor ◽

Feature Selection Technique ◽

Feature Selection Problem

This research presents an approach to the feature selection problem for sentiment classification using an ensemble-based classifier. It includes a hybrid feature selection approach that combines the minimum redundancy maximum relevance (mRMR) technique with the Forest Optimization Algorithm (FOA), i.e., mRMR-FOA. Before FOA was applied to sentiment analysis, it was used as a feature selection technique on 10 different classification datasets publicly available in the UCI Machine Learning Repository. Classifiers such as k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Naïve Bayes (NB) were used in the ensemble-based algorithm on the available datasets. mRMR-FOA uses Blitzer's dataset (customer reviews of electronic products) to select the significant features. Sentiment classification was observed to improve by 12 to 18%. The results are further enhanced by the ensemble of k-NN, NB, and SVM, which reaches an accuracy of 88.47% on the sentiment classification task.

Optimal feature selection for support vector machines

Pattern Recognition ◽

10.1016/j.patcog.2009.09.003 ◽

2010 ◽

Vol 43 (3) ◽

pp. 584-591 ◽

Cited By ~ 125

Author(s):

Minh Hoai Nguyen ◽

Fernando de la Torre

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Support Vector ◽

Vector Machines ◽

Optimal Feature Selection ◽

Selection For ◽

Optimal Feature

Comprehensive strategy for classification of voltage sags source location using optimal feature selection applied to support vector machine and ensemble techniques

International Journal of Electrical Power & Energy Systems ◽

10.1016/j.ijepes.2020.106363 ◽

2021 ◽

Vol 124 ◽

pp. 106363

Author(s):

Younes Mohammadi ◽

Amir Salarpour ◽

Roberto Chouhy Leborgne

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Source Location ◽

Support Vector ◽

Voltage Sags ◽

Comprehensive Strategy ◽

Ensemble Techniques ◽

Optimal Feature Selection ◽

Optimal Feature

Wavelet Lp-Norm Support Vector Regression with Feature Selection

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2015.p0407 ◽

2015 ◽

Vol 19 (3) ◽

pp. 407-416 ◽

Cited By ~ 5

Author(s):

Ya-Fen Ye ◽

Yuan-Hai Shao ◽

Chun-Na Li

Keyword(s):

Feature Selection ◽

Real Estate ◽

Theoretical Analysis ◽

Support Vector Regression ◽

Housing Prices ◽

Support Vector ◽

Powerful Factor ◽

Regression Problems ◽

Real Estate Prices ◽

Performance Results

This paper proposes wavelet Lp-norm support vector regression (Lp-WSVR) to solve feature selection and regression problems effectively. Unlike conventional support vector regression (SVR), linear Lp-WSVR ensures, based on theoretical analysis, that useful features are selected. By using the wavelet kernel, Lp-WSVR can approximate any curve in quadratic continuous integral space, which improves regression performance. Experimental results show the superiority of Lp-WSVR in both feature selection and regression performance. Applying Lp-WSVR to Chinese real estate prices shows that the most significant and powerful factor contributing to Chinese housing prices is monetary growth.
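For readers unfamiliar with wavelet kernels, the sketch below implements one widely used translation-invariant form built from a Morlet-type mother wavelet, h(u) = cos(1.75u) exp(-u^2/2), with K(x, z) taken as the product of h((x_i - z_i)/a) over the features. This particular mother wavelet and the dilation parameter a are assumptions; the abstract does not say which wavelet kernel Lp-WSVR uses.

```python
import math
from typing import Sequence


def mother_wavelet(u: float) -> float:
    """Morlet-type mother wavelet (assumed form)."""
    return math.cos(1.75 * u) * math.exp(-u * u / 2.0)


def wavelet_kernel(x: Sequence[float], z: Sequence[float],
                   a: float = 1.0) -> float:
    """Product over features of the mother wavelet applied to the scaled
    difference (x_i - z_i) / a, where a > 0 is the dilation parameter."""
    if a <= 0:
        raise ValueError("dilation parameter a must be positive")
    value = 1.0
    for xi, zi in zip(x, z):
        value *= mother_wavelet((xi - zi) / a)
    return value


k_same = wavelet_kernel([0.2, -1.0], [0.2, -1.0])
k_xz = wavelet_kernel([0.2, -1.0], [0.5, 0.3], a=2.0)
k_zx = wavelet_kernel([0.5, 0.3], [0.2, -1.0], a=2.0)
```

Unlike a Gaussian kernel, this kernel oscillates and can take negative values as points move apart, which is what lets a wavelet-kernel model capture local, multi-scale structure in a regression curve.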
