The Improvement of Naïve Bayesian Classifier Based on the Strategy of Fuzzy Feature Selection

Author(s):  
Xuefeng Zhang ◽  
Peng Liu ◽  
Jinjin Fan
Author(s):  
Danlei Xu ◽  
Lan Du ◽  
Hongwei Liu ◽  
Penghui Wang

This paper develops a Bayesian classifier for sparsity-promoting feature selection, in which a set of nonlinear mappings is applied to the original data as a pre-processing step. A linear classification model built on these mappings, from the original input space to a nonlinear transformation space, can both construct a nonlinear classification boundary and perform feature selection on the original data. A zero-mean Gaussian prior with Gamma precision and a finite approximation of the Beta process prior are used to promote sparsity in the use of features and of nonlinear mappings, respectively. We derive a Variational Bayesian (VB) inference algorithm for the proposed linear classifier. Experimental results on a synthetic data set, a measured radar data set, a high-dimensional gene expression data set, and several benchmark data sets demonstrate the aggressive and robust feature selection capability of our method, with classification accuracy comparable to that of other existing classifiers.
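
For orientation, the two sparsity-promoting priors named in the abstract can be written in their textbook forms as below. This is only a sketch: the symbols (weights w_j, precisions alpha_j, binary indicators z_k, inclusion probabilities pi_k, hyperparameters a_0, b_0, c_0, d_0, and truncation level K) are assumptions introduced here, and the abstract does not state the paper's exact parameterization or how these variables enter the classifier.

```latex
\begin{align}
w_j \mid \alpha_j &\sim \mathcal{N}\!\left(0,\ \alpha_j^{-1}\right), &
\alpha_j &\sim \mathrm{Gamma}(a_0, b_0), \\
z_k \mid \pi_k &\sim \mathrm{Bernoulli}(\pi_k), &
\pi_k &\sim \mathrm{Beta}\!\left(\tfrac{c_0}{K},\ \tfrac{d_0 (K-1)}{K}\right),
\quad k = 1, \dots, K.
\end{align}
```

In the first pair, a large inferred precision alpha_j shrinks the corresponding weight w_j toward zero. In the second pair, which is the usual finite approximation to a Beta process, most pi_k concentrate near zero as K grows, so only a few indicators z_k stay switched on.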


2021 ◽  
Vol 2129 (1) ◽  
pp. 012022
Author(s):  
Mohamad Faiz Dzulkalnine ◽  
Roselina Sallehuddin ◽  
Yusliza Yussof ◽  
Nor Haizan Mohd Radzi ◽  
Noorfa Haszlinna Binti Mustaffa ◽  
...  

Abstract: In Malaysia, colorectal cancer (CRC) is one of the most common cancers in both men and women. Early detection is crucial: it can significantly increase patients' survival rate, whereas untreated CRC can lead to death. Because high-quality CRC data are lacking, expert systems and machine learning analyses are burdened by irrelevant features, outliers, and noise, which can reduce classification accuracy. It is therefore essential to find a reliable feature selection method that can identify and remove irrelevant features while remaining resistant to noise and outliers. In this paper, Fuzzy Principal Component Analysis (FPCA) was tested for the classification of a Malaysian CRC dataset. With the use of fuzzy membership in FPCA, the experimental results showed that the proposed method produces higher accuracy than PCA and SVM, by almost 2% and 5% respectively. Empirical results showed that FPCA is a reliable feature selection method that can find the most informative features in the CRC dataset, which could assist medical practitioners in making informed decisions.
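
The idea of using fuzzy memberships to make PCA resistant to outliers can be illustrated with a minimal, standard-library Python sketch. It is not the paper's FPCA: the membership function (inverse squared distance to the coordinate-wise median), the function names, and the use of power iteration for the leading component are all assumptions chosen for brevity.

```python
import math

def fuzzy_memberships(X):
    """Hypothetical membership: u_i = 1 / (1 + d_i^2 / s), where d_i is the
    distance of sample i to the coordinate-wise (upper) median and s is the
    (upper) median of the squared distances."""
    n, p = len(X), len(X[0])
    center = [sorted(row[j] for row in X)[n // 2] for j in range(p)]
    d2 = [sum((row[j] - center[j]) ** 2 for j in range(p)) for row in X]
    scale = sorted(d2)[n // 2] or 1.0  # guard against a zero scale
    return [1.0 / (1.0 + d / scale) for d in d2]

def weighted_covariance(X, u):
    """Membership-weighted mean and covariance matrix."""
    n, p = len(X), len(X[0])
    total = sum(u)
    mean = [sum(u[i] * X[i][j] for i in range(n)) / total for j in range(p)]
    cov = [[sum(u[i] * (X[i][a] - mean[a]) * (X[i][b] - mean[b])
                for i in range(n)) / total
            for b in range(p)] for a in range(p)]
    return mean, cov

def leading_component(cov, iters=500):
    """Power iteration for the dominant eigenvector of a symmetric matrix.
    Assumes the start vector is not orthogonal to that eigenvector."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v
```

Passing uniform weights to `weighted_covariance` recovers ordinary PCA, which makes it easy to compare the two on data containing a gross outlier: the outlier receives a membership close to zero and therefore has much less influence on the leading direction.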

