A noise tolerant fine tuning algorithm for the Naïve Bayesian learning algorithm

Author(s):  
Khalil El Hindi
Entropy ◽  
2018 ◽  
Vol 20 (11) ◽  
pp. 857 ◽  
Author(s):  
Khalil El Hindi ◽  
Hussien AlSalman ◽  
Safwan Qasem ◽  
Saad Al Ahmadi

Text classification is one domain in which the naive Bayesian (NB) learning algorithm performs remarkably well. However, making further improvements in performance using ensemble-building techniques has proved to be a challenge because NB is a stable algorithm. This work shows that, while an ensemble of NB classifiers achieves little or no improvement in classification accuracy, an ensemble of fine-tuned NB classifiers can achieve a remarkable improvement in accuracy. We propose a fine-tuning algorithm for text classification that is both more accurate and less stable than the NB algorithm and the fine-tuning NB (FTNB) algorithm. This makes it more suitable than the FTNB algorithm for building ensembles of classifiers using bagging. Our empirical experiments, using 16 benchmark text-classification data sets, show significant improvement for most data sets.
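A minimal sketch of the bagging setup the abstract describes, using scikit-learn. The authors' fine-tuned NB (FTNB) is not available as a library component, so plain MultinomialNB stands in for it here; the toy corpus and its labels are purely illustrative, but the ensemble wiring (bag-of-words features feeding a bagged ensemble of NB classifiers) is the same.

```python
# Sketch: bagging an ensemble of naive Bayesian text classifiers.
# MultinomialNB stands in for the fine-tuned NB variant discussed above.
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hypothetical corpus (1 = sports, 0 = politics).
docs = [
    "the team won the final match",
    "parliament passed the new budget bill",
    "the striker scored twice in the game",
    "the senate debated the tax reform",
]
labels = [1, 0, 1, 0]

# Bag-of-words features followed by a bagged ensemble of NB classifiers.
model = make_pipeline(
    CountVectorizer(),
    BaggingClassifier(MultinomialNB(), n_estimators=10, random_state=0),
)
model.fit(docs, labels)
print(model.predict(["the coach praised the players"]))
```

Each ensemble member is trained on a bootstrap resample of the documents; with a stable base learner such as plain NB the members barely differ, which is why the abstract argues that a less stable, fine-tuned NB benefits more from bagging.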


2011 ◽  
pp. 889-892
Author(s):  
Eric Martin ◽  
Samuel Kaski ◽  
Fei Zheng ◽  
Geoffrey I. Webb ◽  
Xiaojin Zhu ◽  
...  

Author(s):  
CHANG-HWAN LEE

In spite of its simplicity, naive Bayesian learning has been widely used in many data mining applications. However, the unrealistic assumption that all features are equally important negatively impacts the performance of naive Bayesian learning. In this paper, we propose a new method that uses a Kullback–Leibler measure to calculate the weights of the features analyzed in naive Bayesian learning. Its performance is compared to that of other state-of-the-art methods over a number of datasets.
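A minimal sketch of one way such a KL-based feature weighting can be realized on discrete data; the exact weighting formula in the paper may differ. Here each feature's weight is the expected Kullback–Leibler divergence between the class posterior given a feature value and the class prior, and the weights are applied as exponents on the per-feature likelihoods. Both functions and their arguments are illustrative, not the paper's implementation.

```python
# Sketch: KL-divergence-based feature weights for naive Bayesian learning.
import numpy as np

def kl_feature_weights(X, y):
    """X: (n_samples, n_features) integer-coded features; y: integer class labels."""
    classes = np.unique(y)
    prior = np.array([(y == c).mean() for c in classes])
    weights = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        w = 0.0
        for v in np.unique(X[:, j]):
            mask = X[:, j] == v
            p_v = mask.mean()
            # Class posterior given this feature value (epsilon avoids log(0)).
            post = np.array([(y[mask] == c).mean() for c in classes]) + 1e-12
            w += p_v * np.sum(post * np.log(post / (prior + 1e-12)))
        weights[j] = w
    return weights

def weighted_nb_log_scores(X_train, y_train, x, weights, alpha=1.0):
    """Log class scores for one instance x, with feature weights as exponents."""
    classes = np.unique(y_train)
    scores = {}
    for c in classes:
        Xc = X_train[y_train == c]
        s = np.log((y_train == c).mean())
        for j, v in enumerate(x):
            # Laplace-smoothed P(x_j = v | c), weighted by the feature's KL weight.
            n_values = len(np.unique(X_train[:, j]))
            p = ((Xc[:, j] == v).sum() + alpha) / (len(Xc) + alpha * n_values)
            s += weights[j] * np.log(p)
        scores[c] = s
    return scores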


2011 ◽  
Vol 87 (1) ◽  
pp. 93-125 ◽  
Author(s):  
Fei Zheng ◽  
Geoffrey I. Webb ◽  
Pramuditha Suraweera ◽  
Liguang Zhu

2020 ◽  
Vol 4 (1) ◽  
pp. 117-122
Author(s):  
Agus Ambarwari ◽  
Qadhli Jafar Adrian ◽  
Yeni Herdiyeni

Data scaling plays an important role in data preprocessing and affects the performance of machine learning algorithms. This study analyzes the effect of min-max normalization and standardization (zero-mean normalization) on the performance of machine learning algorithms. A data set of leaf venation features was first normalized, and the normalized data were then tested on four machine learning algorithms: KNN, naïve Bayesian, ANN, and SVM with RBF and linear kernels. The analysis was carried out on model evaluations using 10-fold cross-validation and validation on held-out test data. The results show that naïve Bayesian has the most stable performance under both min-max normalization and standardization, and that KNN is fairly stable compared with SVM and ANN. However, combining min-max normalization with an SVM that uses the RBF kernel gives the best overall performance, whereas an SVM with a linear kernel performs best when standardization (zero-mean normalization) is applied. For the ANN algorithm, a number of trials are needed to determine which data normalization technique best matches the algorithm.
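A minimal sketch of the comparison described above: the same set of classifiers is cross-validated under min-max normalization and under standardization. The leaf venation data set is not publicly bundled here, so scikit-learn's iris data stands in; classifier hyperparameters are defaults and purely illustrative.

```python
# Sketch: comparing min-max normalization vs. standardization across classifiers.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in for the leaf venation features

scalers = {"min-max": MinMaxScaler(), "z-score": StandardScaler()}
models = {
    "KNN": KNeighborsClassifier(),
    "NB": GaussianNB(),
    "ANN": MLPClassifier(max_iter=2000, random_state=0),
    "SVM-RBF": SVC(kernel="rbf"),
    "SVM-linear": SVC(kernel="linear"),
}

for s_name, scaler in scalers.items():
    for m_name, model in models.items():
        pipe = make_pipeline(scaler, model)  # scaler fitted inside each CV fold
        scores = cross_val_score(pipe, X, y, cv=10)  # 10-fold CV as in the study
        print(f"{s_name:8s} {m_name:10s} mean accuracy = {scores.mean():.3f}")
```

Putting the scaler inside the pipeline ensures it is refitted on each training fold, so the cross-validation estimate is not biased by information from the test folds.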

