A High Efficiency Thyroid Disorders Prediction System with Non-Dominated Sorting Genetic Algorithm NSGA-II as a Feature Selection Algorithm

Feature selection is an important research field in pattern classification, data mining, and related areas. Population-based optimization algorithms (POAs) are highly parallel and are widely used as search algorithms for feature selection. Population-based feature selection algorithms (PFSAs) involve a compromise between precision and time cost, so optimizing a PFSA requires improving the underlying feature selection model. Feature selection algorithms broadly fall into two categories: the filter model and the wrapper model. The filter model is fast but less precise, while the wrapper model is more precise but generally more computationally intensive. In this paper, we propose a new mechanism, the proportional hybrid mechanism (PHM), to combine the advantages of the filter and wrapper models. The mechanism can be applied in PFSAs to improve their performance. Genetic algorithms (GAs) have been applied to many kinds of feature selection problems as search algorithms because of their high efficiency and implicit parallelism, and are therefore used in this paper. To validate the mechanism, seven datasets from the University of California, Irvine (UCI) database and artificial toy datasets are tested. The experiments are carried out for different GAs, classifiers, and evaluation criteria. The results show that with the introduction of PHM, the GA-based feature selection algorithm improves in both time cost and classification accuracy. Moreover, a comparison of GA-based, PSO-based, and some other feature selection algorithms demonstrates that the PHM can be used in other population-based feature selection algorithms and obtain satisfactory results.
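The filter/wrapper distinction in the abstract above can be illustrated with a minimal sketch. It does not implement PHM, whose details the abstract does not give. The function names `filter_score` and `wrapper_score`, the choice of Pearson correlation as the filter criterion, and the 1-nearest-neighbour classifier as the wrapped learner are all assumptions made for illustration. A feature subset is encoded as a boolean mask, as a GA chromosome typically would be.

```python
import numpy as np

def filter_score(X, y, mask):
    """Filter-style score: mean |Pearson correlation| between each selected
    feature and the label. Cheap, but uses no classifier.
    Assumes the selected features are not constant."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return 0.0
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in cols]
    return float(np.mean(scores))

def wrapper_score(X, y, mask):
    """Wrapper-style score: leave-one-out accuracy of a 1-nearest-neighbour
    classifier trained on the selected features only. Costlier, because a
    classifier is evaluated for every candidate subset."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return 0.0
    Z = X[:, cols]
    # Pairwise Manhattan distances between all instances.
    d = np.abs(Z[:, None, :] - Z[None, :, :]).sum(axis=2)
    np.fill_diagonal(d, np.inf)  # an instance may not be its own neighbour
    pred = y[d.argmin(axis=1)]
    return float(np.mean(pred == y))
```

A population-based search would call one or both of these as the fitness of each candidate mask. How the two kinds of evaluation are mixed is exactly what a hybrid mechanism has to decide, and that part is left out here.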

Genetic Algorithm Based Feature Selection Algorithm for Effective Intrusion Detection in Cloud Networks

2012 IEEE 12th International Conference on Data Mining Workshops ◽ 10.1109/icdmw.2012.56 ◽ 2012

Cited By ~ 20

Author(s): Anand Kannan ◽ Gerald Q. Maguire Jr. ◽ Ayush Sharma ◽ Peter Schoo

Keyword(s): Genetic Algorithm ◽ Feature Selection ◽ Intrusion Detection ◽ Selection Algorithm ◽ Feature Selection Algorithm ◽ Cloud Networks

Improved Relief Weight Feature Selection Algorithm Based on Relief and Mutual Information

Information ◽ 10.3390/info12060228 ◽ 2021 ◽ Vol 12 (6) ◽ pp. 228

Author(s): Hongbin Wang ◽ Pengming Wang ◽ Shengchun Deng ◽ Haoran Li

Keyword(s): Feature Selection ◽ Mutual Information ◽ High Efficiency ◽ Feature Subset ◽ Evaluation Function ◽ Selection Algorithm ◽ Feature Selection Algorithm ◽ Relief Algorithm ◽ Sequential Forward Search ◽ Computation Speed

As a classic feature selection algorithm, Relief has the advantages of simple computation and high efficiency. However, the algorithm is limited to binary classification problems, and the feature subset formed by the first K features it selects often contains redundancy, so Relief cannot select the ideal feature subset. When mutual information is used to calculate the correlation and redundancy between features, computation is slow because of the high computational complexity and the need to estimate the probability density function of the corresponding features. To solve these problems, we first improve the weight of the Relief algorithm so that it can be used to evaluate a set of candidate feature sets. We then use the improved joint mutual information evaluation function in place of the basic mutual information computation to address computation speed as well as the correlation and redundancy between features. Finally, we propose a compound correlation feature selection algorithm based on Relief and joint mutual information, using the evaluation function and a heuristic sequential forward search strategy. This algorithm can effectively select feature subsets with low redundancy and strong classification ability, and it computes faster.
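For context, the classic binary-class Relief weighting that the abstract takes as its starting point can be sketched as follows. This is the textbook baseline, not the improved weight or the joint mutual information criterion proposed in the paper. The function name `relief_weights`, its parameters, the Manhattan distance, and the min-max scaling are illustrative assumptions. The sketch also assumes numeric features and at least two instances per class.

```python
import numpy as np

def relief_weights(X, y, n_samples=None, seed=0):
    """Classic Relief for two classes: reward features that differ on the
    nearest miss (other class) and penalise those that differ on the
    nearest hit (same class)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape

    # Min-max scale each feature so per-feature differences lie in [0, 1].
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features contribute zero difference
    Xs = (X - lo) / span

    rng = np.random.default_rng(seed)
    m = n if n_samples is None else n_samples
    idx = rng.choice(n, size=m, replace=(m > n))

    w = np.zeros(d)
    for i in idx:
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to all
        dist[i] = np.inf                       # exclude the instance itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += (np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])) / m
    return w
```

Ranking features by these weights and keeping the first K scores each feature independently. This is why, as the abstract notes, the selected subset can still contain redundant features.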
