feature selection problem
Recently Published Documents


TOTAL DOCUMENTS

114
(FIVE YEARS 55)

H-INDEX

13
(FIVE YEARS 4)

2022 ◽  
Vol 70 (1) ◽  
pp. 557-579
Author(s):  
R. Manjula Devi ◽  
M. Premkumar ◽  
Pradeep Jangir ◽  
B. Santhosh Kumar ◽  
Dalal Alrowaili ◽  
...  

2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

This research presents a way of feature selection problem for classification of sentiments that use ensemble-based classifier. This includes a hybrid approach of minimum redundancy and maximum relevance (mRMR) technique and Forest Optimization Algorithm (FOA) (i.e. mRMR-FOA) based feature selection. Before applying the FOA on sentiment analysis, it has been used as feature selection technique applied on 10 different classification datasets publically available on UCI machine learning repository. The classifiers for example k-Nearest Neighbor (k-NN), Support Vector Machine (SVM) and Naïve Bayes used the ensemble based algorithm for available datasets. The mRMR-FOA uses the Blitzer’s dataset (customer reviews on electronic products survey) to select the significant features. The classification of sentiments has noticed to improve by 12 to 18%. The evaluated results are further enhanced by the ensemble of k-NN, NB and SVM with an accuracy of 88.47% for the classification of sentiment analysis task.


2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

Algorithmic – based search approach is ineffective at addressing the problem of multi-dimensional feature selection for document categorization. This study proposes the use of meta heuristic based search approach for optimal feature selection. Elephant optimization (EO) and Ant Colony optimization (ACO) algorithms coupled with Naïve Bayes (NB), Support Vector Machin (SVM), and J48 classifiers were used to highlight the optimization capability of meta-heuristic search for multi-dimensional feature selection problem in document categorization. In addition, the performance results for feature selection using the two meta-heuristic based approaches (EO and ACO) were compared with conventional Best First Search (BFS) and Greedy Stepwise (GS) algorithms on news document categorization. The comparative results showed that global optimal feature subsets were attained using adaptive parameters tuning in meta-heuristic based feature selection optimization scheme. In addition, the selected number of feature subsets were minimized dramatically for document classification.


2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

Feature selection is performed to eliminate irrelevant features to reduce computational overheads. Metaheuristic algorithms have become popular for the task of feature selection due to their effectiveness and flexibility. Hybridization of two or more such metaheuristics has become popular in solving optimization problems. In this paper, we propose a hybrid wrapper feature selection technique based on binary butterfly optimization algorithm (bBOA) and Simulated Annealing (SA). The SA is combined with the bBOA in a pipeline fashion such that the best solution obtained by the bBOA is passed on to the SA for further improvement. The SA solution improves the best solution obtained so far by searching in its neighborhood. Thus the SA tries to enhance the exploitation property of the bBOA. The proposed method is tested on twenty datasets from the UCI repository and the results are compared with five popular algorithms for feature selection. The results confirm the effectiveness of the hybrid approach in improving the classification accuracy and selecting the optimal feature subset.


2021 ◽  
Vol 4 (2) ◽  
pp. 116-122
Author(s):  
Ibraheem Al-Jadir ◽  
Waleed A. Mahmoud

Optimization methods are considered as one of the highly developed areas in Artificial Intelligence (AI). The success of the Particle Swarm Optimization (PSO) and Genetic Algorithms (GA) has encouraged researchers to develop other methods that can obtain better performance outcomes and to be more responding to the modern needs. The Grey Wolf Optimization (GWO), and the Krill Herd (KH) are some of those methods that showed a great success in different applications in the last few years. In this paper, we propose a comparative study of using different optimization methods including KH and GWO in order to solve the problem of document feature selection for the classification problem. These methods are used to model the feature selection problem as a typical optimization method. Due to the complexity and the non-linearity of this kind of problems, it becomes necessary to use some advanced techniques to make the judgement of which features subset that is optimal to enhance the performance of classification of text documents. The test results showed the superiority of GWO over the other counterparts using the specified evaluation measures.


2021 ◽  
Author(s):  
Jing Wang ◽  
Yuanzi Zhang ◽  
Minglin Hong ◽  
Haiyang He ◽  
Shiguo Huang

Abstract Feature selection is an important data preprocessing method in data mining and machine learning, yet it faces the challenge of “curse of dimensionality” when dealing with high-dimensional data. In this paper, a self-adaptive level-based learning artificial bee colony (SLLABC) algorithm is proposed for high-dimensional feature selection problem. The SLLABC algorithm includes three new mechanisms: (1) A novel level-based learning mechanism is introduced to accelerate the convergence of the basic artificial bee colony algorithm, which divides the population into several levels and the individuals on each level learn from the individuals on higher levels, especially, the individuals on the highest level learn from each other. (2) A self-adaptive method is proposed to keep the balance between exploration and exploitation abilities, which takes the diversity of population into account to determine the number of levels. The lower the diversity is, the fewer the levels are divided. (3) A new update mechanism is proposed to reduce the number of selected features. In this mechanism, if the error rate of an offspring is higher than or is equal to that of its parent but selects more features, then the offspring is discarded and the parent is retained, otherwise, the offspring replaces its parent. Further, we discuss and analyze the contribution of these novelties to the diversity of population and the performance of classification. Finally, the results, compared with 8 state-of-the-art algorithms on 12 high-dimensional datasets, confirm the competitive performance of the proposed SLLABC on both classification accuracy and the size of the feature subset.


Computers ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 136
Author(s):  
Mohammad H. Nadimi-Shahraki ◽  
Mahdis Banaie-Dezfouli ◽  
Hoda Zamani ◽  
Shokooh Taghian ◽  
Seyedali Mirjalili

Advancements in medical technology have created numerous large datasets including many features. Usually, all captured features are not necessary, and there are redundant and irrelevant features, which reduce the performance of algorithms. To tackle this challenge, many metaheuristic algorithms are used to select effective features. However, most of them are not effective and scalable enough to select effective features from large medical datasets as well as small ones. Therefore, in this paper, a binary moth-flame optimization (B-MFO) is proposed to select effective features from small and large medical datasets. Three categories of B-MFO were developed using S-shaped, V-shaped, and U-shaped transfer functions to convert the canonical MFO from continuous to binary. These categories of B-MFO were evaluated on seven medical datasets and the results were compared with four well-known binary metaheuristic optimization algorithms: BPSO, bGWO, BDA, and BSSA. In addition, the convergence behavior of the B-MFO and comparative algorithms were assessed, and the results were statistically analyzed using the Friedman test. The experimental results demonstrate a superior performance of B-MFO in solving the feature selection problem for different medical datasets compared to other comparative algorithms.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yue Li ◽  
Zhiheng Sun ◽  
Xin Liu ◽  
Wei-Tung Chen ◽  
Der-Juinn Horng ◽  
...  

The feature selection problem is a fundamental issue in many research fields. In this paper, the feature selection problem is regarded as an optimization problem and addressed by utilizing a large-scale many-objective evolutionary algorithm. Considering the number of selected features, accuracy, relevance, redundancy, interclass distance, and intraclass distance, a large-scale many-objective feature selection model is constructed. It is difficult to optimize the large-scale many-objective feature selection optimization problem by using the traditional evolutionary algorithms. Therefore, this paper proposes a modified vector angle-based large-scale many-objective evolutionary algorithm (MALSMEA). The proposed algorithm uses polynomial mutation based on variable grouping instead of naive polynomial mutation to improve the efficiency of solving large-scale problems. And a novel worst-case solution replacement strategy using shift-based density estimation is used to replace the poor solution of two individuals with similar search directions to enhance convergence. The experimental results show that MALSMEA is competitive and can effectively optimize the proposed model.


2021 ◽  
pp. 2454-2462
Author(s):  
Thurayaa B. Alhijaj ◽  
Sarab M. Hameed ◽  
Bara'a A. Attea

     In this paper, the botnet detection problem is defined as a feature selection problem and the genetic algorithm (GA) is used to search for the best significant combination of features from the entire search space of set of features. Furthermore, the Decision Tree (DT) classifier is used as an objective function to direct the ability of the proposed GA to locate the combination of features that can correctly classify the activities into normal traffics and botnet attacks. Two datasets  namely the UNSW-NB15 and the Canadian Institute for Cybersecurity Intrusion Detection System 2017 (CICIDS2017), are used as evaluation datasets. The results reveal that the proposed DT-aware GA can effectively find the relevant features from the whole features set. Thus, it obtains efficient botnet detection results in terms of F-score, precision, detection rate, and  number of relevant features, when compared with DT alone.


Symmetry ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 1290
Author(s):  
Le Wang ◽  
Yuelin Gao ◽  
Shanshan Gao ◽  
Xin Yong

In solving classification problems in the field of machine learning and pattern recognition, the pre-processing of data is particularly important. The processing of high-dimensional feature datasets increases the time and space complexity of computer processing and reduces the accuracy of classification models. Hence, the proposal of a good feature selection method is essential. This paper presents a new algorithm for solving feature selection, retaining the selection and mutation operators from traditional genetic algorithms. On the one hand, the global search capability of the algorithm is ensured by changing the population size, on the other hand, finding the optimal mutation probability for solving the feature selection problem based on different population sizes. During the iteration of the algorithm, the population size does not change, no matter how many transformations are made, and is the same as the initialized population size; this spatial invariance is physically defined as symmetry. The proposed method is compared with other algorithms and validated on different datasets. The experimental results show good performance of the algorithm, in addition to which we apply the algorithm to a practical Android software classification problem and the results also show the superiority of the algorithm.


Sign in / Sign up

Export Citation Format

Share Document