A Comparative Study of Meta-Heuristic and Conventional Search in Optimization of Multi-Dimensional Feature Selection

2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

Algorithm-based search approaches are ineffective at addressing the problem of multi-dimensional feature selection for document categorization. This study proposes the use of a meta-heuristic-based search approach for optimal feature selection. Elephant Optimization (EO) and Ant Colony Optimization (ACO) algorithms, coupled with Naïve Bayes (NB), Support Vector Machine (SVM), and J48 classifiers, were used to highlight the optimization capability of meta-heuristic search for the multi-dimensional feature selection problem in document categorization. In addition, the feature selection performance of the two meta-heuristic approaches (EO and ACO) was compared with that of the conventional Best First Search (BFS) and Greedy Stepwise (GS) algorithms on news document categorization. The comparative results showed that globally optimal feature subsets were attained using adaptive parameter tuning in the meta-heuristic-based feature selection optimization scheme. In addition, the number of selected features was reduced dramatically for document classification.
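As a rough illustration of the conventional baseline named above, Greedy Stepwise can be sketched as forward selection: start from an empty subset and keep adding the single feature that most improves a score. This is a minimal sketch, not the study's implementation; the `evaluate` callable, the toy term gains, and the size penalty are all hypothetical stand-ins for a classifier's validation accuracy.

```python
def greedy_stepwise(features, evaluate):
    """Forward selection: add the best single feature until nothing improves."""
    selected, best_score = [], evaluate([])
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current subset.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no single addition helps, so the greedy search stops here
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Hypothetical toy scorer: per-term gain minus a small penalty per feature.
gains = {"election": 0.30, "goal": 0.25, "the": 0.0, "said": 0.0}


def toy_score(subset):
    return sum(gains[f] for f in subset) - 0.01 * len(subset)


selected, score = greedy_stepwise(list(gains), toy_score)
print(selected, round(score, 2))
```

Because the search only ever looks one step ahead, it can stop at a locally good subset, which is the kind of limitation that motivates comparing it with meta-heuristic search.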

Author(s):  
Beaulah Jeyavathana Rajendran ◽  
Kanimozhi K. V.

Tuberculosis is a hazardous infectious disease characterized by the growth of tubercles in the tissues. The disease mainly affects the lungs but can also affect other parts of the body, and it can be easily diagnosed by radiologists. The main objective of this chapter is to obtain the best solution, selected by means of modified particle swarm optimization, which is regarded as the optimal feature descriptor. Five stages are used to detect tuberculosis: image pre-processing, lung segmentation, feature extraction, feature selection, and classification. These stages are used in medical image processing to identify tuberculosis. In feature extraction, the gray-level co-occurrence matrix (GLCM) approach is used to extract the features, and from the extracted feature sets the optimal features are selected by random forest. Finally, a support vector machine classifier is used for image classification. Experiments were carried out and intermediate results were obtained. The proposed system achieves better classification accuracy than the existing method.


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Ghaith Manita ◽  
Ouajdi Korbaa

DNA microarray technology is an emerging field that offers the possibility of obtaining simultaneous estimates of the expression levels of several thousand genes in an organism in a single experiment. One of the most significant challenges in this research field is to select highly relevant genes from gene expression data. To address this problem, feature selection is a well-known technique for eliminating unnecessary genes in order to ensure accurate classification results. This paper proposes a binary version of the Political Optimizer (PO) to solve the feature selection problem using gene expression data. Two transfer functions are used to design a binary PO. The first is based on the sigmoid function and is denoted BPO-S, while the second is based on a V-shaped function and is denoted BPO-V. The proposed methods are evaluated using 9 biological datasets and compared with 8 well-known binary metaheuristics. The comparative results show the superior performance of the BPO methods, especially BPO-V, in comparison with other techniques.
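The role of a transfer function can be sketched as follows: it maps each continuous coordinate of an optimizer's position to a probability, which then decides a binary "keep this gene / drop it" bit. The abstract does not give the exact functions or update rules, so the logistic sigmoid, `|tanh(x)|`, and the two bit-update rules below are commonly used choices assumed for illustration, and every function name is hypothetical.

```python
import math
import random


def s_shaped(x):
    """Logistic sigmoid: maps a real value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def v_shaped(x):
    """|tanh(x)|: an assumed V-shaped transfer function, symmetric about 0."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule (assumed): set the bit to 1 with probability s_shaped(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule (assumed): flip the current bit with probability v_shaped(x)."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]


rng = random.Random(0)
position = [-2.0, 0.0, 3.5, 0.4]          # hypothetical continuous position
mask_s = binarize_s(position, rng)
mask_v = binarize_v(position, [1, 0, 1, 0], rng)
print(mask_s, mask_v)
```

Under these assumptions the S-shaped rule sets each bit directly, while the V-shaped rule changes a bit only when the coordinate is far from zero, so a coordinate of exactly 0 always leaves its bit unchanged.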


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Shuai Zhang ◽  
Renliang Qu ◽  
Pengyan Wang ◽  
Shenghan Wang

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has resulted in a global pandemic since it was first reported in December 2019. So far, SARS-CoV-2 nucleic acid detection has been regarded as the gold standard of COVID-19 diagnosis. However, this detection method often leads to false negatives and thus to missed COVID-19 diagnoses. Therefore, new biomarkers are urgently needed to increase the accuracy of COVID-19 diagnosis. To explore new biomarkers of COVID-19 in this study, expression profiles were first accessed from the GEO database. On this basis, 500 feature genes were screened by the minimum-redundancy maximum-relevancy (mRMR) feature selection method. Afterwards, the incremental feature selection (IFS) method was used to choose the best-performing classifier from among support vector machine (SVM) classifiers built on different sets of feature genes. The corresponding 66 feature genes were taken as the optimal feature genes. Lastly, the optimal feature genes were subjected to GO functional enrichment analysis, principal component analysis (PCA), and protein-protein interaction (PPI) network analysis. All in all, it was posited that the 66 feature genes could effectively distinguish positive from negative COVID-19 cases and serve as new biomarkers of the disease.
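The incremental feature selection step can be sketched as a scan over prefixes of a ranked gene list: score the top 1, top 2, ..., top N genes and keep the prefix with the best score. This is a minimal sketch under stated assumptions; `evaluate` is a hypothetical stand-in for training and validating an SVM on the given genes, and the gene names and toy scorer are invented for illustration.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Return (best_subset, best_score) over prefixes of a ranked feature list.

    ranked_features: features ordered from most to least relevant
                     (for example, the output of an mRMR ranking).
    evaluate:        callable taking a list of features and returning a score;
                     here a stand-in for a classifier's validation accuracy.
    """
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        score = evaluate(subset)
        # Strict ">" keeps the smallest prefix when scores tie.
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical toy scorer: reward informative genes, penalize subset size.
informative = {"g1", "g2", "g3"}


def toy_score(subset):
    hits = sum(1 for g in subset if g in informative)
    return hits - 0.1 * len(subset)


ranked = ["g1", "g3", "g7", "g2", "g9"]
best_subset, best_score = incremental_feature_selection(ranked, toy_score)
print(best_subset, round(best_score, 2))
```

In the study's setting the ranked list would hold the 500 mRMR-screened genes, and the selected prefix would correspond to the 66 optimal feature genes.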


Author(s):  
JIANPING LI ◽  
ZHENYU CHEN ◽  
LIWEI WEI ◽  
WEIXUAN XU ◽  
GANG KOU

In many applications such as credit risk management, data are represented as high-dimensional feature vectors. This makes feature selection necessary to reduce computational complexity and to improve generalization ability and interpretability. In this paper, we present a novel feature selection method, the "Least Squares Support Feature Machine" (LS-SFM). The proposed method has two advantages compared with the conventional Support Vector Machine (SVM) and LS-SVM. First, a convex combination of basic kernels is used as the kernel, and each basic kernel makes use of a single feature. This transforms the feature selection problem, which cannot be solved in the context of SVM, into an ordinary multiple-parameter learning problem. Second, all parameters are learned by a two-stage iterative algorithm. A 1-norm-based regularized cost function is used to enforce sparseness of the feature parameters. The "support features" are the features with nonzero feature parameters. An experimental study on some of the UCI datasets and a commercial credit card dataset demonstrates the effectiveness and efficiency of the proposed approach.
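The kernel construction described above can be sketched as a weighted sum of one-dimensional kernels, one per feature, with non-negative weights that sum to 1. This is a minimal sketch only: the RBF basic kernel is an assumed choice, the weights are hypothetical fixed values rather than the output of the two-stage learning algorithm, and the function names are invented for illustration.

```python
import math


def rbf_1d(a, b, gamma=1.0):
    """Basic kernel on a single feature (RBF is an assumed choice)."""
    return math.exp(-gamma * (a - b) ** 2)


def combined_kernel(x, z, weights, gamma=1.0):
    """Convex combination of single-feature kernels: sum_d w_d * k(x_d, z_d)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * rbf_1d(a, b, gamma) for w, a, b in zip(weights, x, z))


def support_features(weights, tol=1e-12):
    """Indices of features whose kernel weight is nonzero."""
    return [d for d, w in enumerate(weights) if w > tol]


weights = [0.7, 0.0, 0.3]   # hypothetical sparse weights, not learned here
x, z = [1.0, 5.0, 2.0], [1.0, -3.0, 2.0]
print(combined_kernel(x, z, weights), support_features(weights))
```

A feature with zero weight contributes nothing to the kernel, so the two points above look identical even though they differ strongly in the second feature; this is how sparse weights act as feature selection.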


Author(s):  
Mohamad Ali Khalil ◽  
Khaled Hamad ◽  
Abdallah Shanableh

Accurate prediction of roadway traffic noise remains challenging. Many researchers continue to improve the performance of their models by either adding more variables or improving their modeling algorithms. In this research, machine learning (ML) modeling techniques were developed to predict roadway traffic noise accurately. The ML techniques applied were regression decision trees, support vector machines, ensembles, and artificial neural networks. The parameters of each of these models were fine-tuned to achieve the best performance. In addition, a state-of-the-art hybrid feature-selection technique was employed to select a minimum set of input features (variables) while maintaining the accuracy of the developed models. Optimizing the number of features used in the model reduces the resources needed to develop and use a roadway noise prediction model, hence decreasing the development cost. The proposed approach was applied to develop a free-field roadway traffic noise model for Sharjah City in the United Arab Emirates. The best ML model was compared with a conventional regression model developed earlier under the same conditions. The cross-validated results clearly indicate that the best ML model outperformed the regression model. The performance of the ML model was also assessed after reducing the number of its input features based on the outcome of the feature-selection algorithm; the model performance was only slightly affected. This result emphasizes the importance of considering only features that greatly influence roadway traffic noise.


2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

This research addresses the feature selection problem for sentiment classification using an ensemble-based classifier. It proposes a hybrid feature selection approach, mRMR-FOA, that combines the minimum redundancy and maximum relevance (mRMR) technique with the Forest Optimization Algorithm (FOA). Before being applied to sentiment analysis, FOA was used as a feature selection technique on 10 different classification datasets publicly available in the UCI machine learning repository. Classifiers such as k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Naïve Bayes (NB) were combined in the ensemble-based algorithm on the available datasets. mRMR-FOA was applied to Blitzer's dataset (customer reviews of electronic products) to select the significant features. Sentiment classification was observed to improve by 12 to 18%. The results are further enhanced by the ensemble of k-NN, NB, and SVM, which reaches an accuracy of 88.47% on the sentiment classification task.
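One simple way to combine three base classifiers is majority voting over their per-sample predictions. The abstract does not state how the ensemble combines k-NN, NB, and SVM, so majority voting is an assumption here, and the prediction lists are invented toy labels rather than results from the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of equal-length label lists, one per base classifier.
    With three voters and two labels, a tie cannot occur.
    """
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns [(label, count)] for the top-voted label.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three already-trained base classifiers.
knn_pred = ["pos", "neg", "pos", "neg"]
nb_pred = ["pos", "pos", "neg", "neg"]
svm_pred = ["neg", "pos", "pos", "neg"]

ensemble_pred = majority_vote([knn_pred, nb_pred, svm_pred])
print(ensemble_pred)
```

Each base classifier in this toy example is wrong on a different sample, so the vote recovers a label that no single classifier predicts consistently, which is the usual motivation for an ensemble.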


Author(s):  
Ya-Fen Ye ◽  
Yuan-Hai Shao ◽  
Chun-Na Li

This paper proposes wavelet Lp-norm support vector regression (Lp-WSVR) to solve feature selection and regression problems effectively. Unlike conventional support vector regression (SVR), linear Lp-WSVR ensures that useful features are selected based on theoretical analysis. By using the wavelet kernel, Lp-WSVR approaches any curve in quadratic continuous integral space, which leads to improved regression performance. Results of experiments show the superiority of Lp-WSVR in both feature selection and regression performance. Applying Lp-WSVR to Chinese real estate prices shows that the most significant and powerful factor contributing to Chinese housing prices is monetary growth.

