An Improved Naive Bayesian Classification Model Based on Attribute Weighting

2020 ◽  
Vol 1550 ◽  
pp. 022017
Author(s):  
Xi Yue ◽  
Mengxuan Tang
2021 ◽  
Vol 748 (1) ◽  
pp. 012034
Author(s):  
Novriadi Antonius Siagian ◽  
Sutarman Wage ◽  
Sawaluddin

Abstract: The Naïve Bayes method has been shown to be fast on large datasets, but it has a weakness in attribute selection. As a statistical classification method based only on Bayes' theorem, it predicts class-membership probabilities by treating each attribute as independent of the others. It cannot select among attributes that are highly correlated with one another, and this can reduce accuracy. The Weight Naïve Bayesian model provides better accuracy than conventional Naïve Bayesian. The highest accuracy, 88.57%, is obtained on the Water Quality dataset with the Weight Naïve Bayesian classification model, while the lowest accuracy, 78.95%, is obtained on the Haberman dataset with the conventional Naïve Bayesian classification model. The Weight Naïve Bayesian classification model increases accuracy by 2.9% on the Water Quality dataset and by 1.8% on the Haberman dataset, an average increase of 2.35% across the two datasets. Based on the tests carried out on all test data, the Weight Naïve Bayesian classification model gives better accuracy than the conventional Naïve Bayesian classification model.
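
The abstract does not say how the attribute weights are computed or applied. As a rough illustration only, one common formulation of attribute-weighted naive Bayes raises each attribute's conditional probability to the power of its weight, which amounts to multiplying each log-likelihood term by that weight. The sketch below assumes categorical attributes, Laplace smoothing, and weights supplied by the caller; the function names `train` and `predict` and the weight values are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute index, class) -> value counts
    values_seen = defaultdict(set)        # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def predict(row, model, weights):
    """Return the class with the highest weighted log-posterior score.

    Assumption: weights[i] scales the log-likelihood of attribute i.
    With all weights equal to 1 this reduces to conventional naive Bayes.
    """
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)     # log prior
        for i, value in enumerate(row):
            # Laplace-smoothed estimate of P(x_i | c)
            p = (value_counts[(i, c)][value] + 1) / (n_c + len(values_seen[i]))
            score += weights[i] * math.log(p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical toy data: only the first attribute is informative.
rows = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train(rows, labels)
prediction = predict(("rainy", "cool"), model, weights=[1.0, 0.0])
```

Setting a weight to 0 removes that attribute's influence entirely, so a weighting scheme of this kind can act as a soft form of attribute selection.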


2013 ◽  
Vol 325-326 ◽  
pp. 1593-1596
Author(s):  
Yin E Hu ◽  
Ke Luo

The naive Bayesian classifier (NBC) is a simple and effective classification model, but its conditional independence assumption is often violated in practice, which degrades its performance. In this study, we attempt to improve the NBC model through attribute selection based on rough sets. The main idea of the improved model is to select a subset of attributes that is as close to independent as possible, thereby relaxing the independence assumption. Experimental comparison and analysis on UCI datasets show that the model is effective.


2011 ◽  
Vol 36 (4) ◽  
pp. 51-66 ◽  
Author(s):  
Hemanta Saikia ◽  
Dibyojyoti Bhattacharjee

An all-rounder can play an important role in any version of the game of cricket, whether it is a test match or a limited-over format. The study classifies the performance of all-rounders who participated in IPL based on their strike rate and economy rate. Based on these two factors, the all-rounders are divided into four non-overlapping classes, viz., Performer, Batting All-rounder, Bowling All-rounder, and Under-performer. Several predictor variables that are supposed to influence the performance of all-rounders are considered. Step-wise multinomial logistic regression (SMLR) is used to identify the significant predictors. A sample of six incumbent all-rounders who had not participated in the first three seasons of IPL is considered. The significant predictors are then used to predict the expected class of an incumbent all-rounder using a naive Bayesian classification model. The relevant data were collected from the websites www.cricinfo.org and www.cricketnirvana.com.

The key points of this study are as follows:

The training sample consists of 35 all-rounders who had performed in the first three seasons of IPL.

Two variables, viz., strike rate (number of runs scored per 100 balls faced) and economy rate (average number of runs conceded per over by the bowler), are used to classify the all-rounders as follows:
Performer: An all-rounder with strike rate above median and economy rate below median.
Batting All-rounder: An all-rounder with strike rate above median and economy rate above median.
Bowling All-rounder: An all-rounder with strike rate below median and economy rate below median.
Under-performer: An all-rounder with strike rate below median and economy rate above median.

Step-wise multinomial logistic regression (SMLR) was used to identify the significant variables that are responsible for the classification of the all-rounders. The strike rate in ODI, strike rate in Twenty-20, economy rate in ODI, economy rate in Twenty-20 and bowling type (Spin or Fast) of the all-rounders are found to be significant in determining the class of an all-rounder.

The naive Bayesian classification model is used to forecast the expected class, based on the significant predictors, of six incumbent all-rounders who had played only in the fourth season of IPL. The predictions made before IPL IV were then compared with the actual situation at the end of the tournament. Four of the six predictions were correct.

This model would be useful for the management of participating teams when deciding the bid for an all-rounder in the upcoming season of IPL as per their requirement.
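
The four-class labelling rule described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `label_all_rounders`, the player names and the figures are hypothetical, and the handling of values exactly equal to the median (treated here as the less favourable side) is an assumption, since the abstract does not address ties.

```python
from statistics import median

def label_all_rounders(players):
    """Assign one of four classes to each player.

    players maps a name to (strike_rate, economy_rate).
    Assumption: a strike rate equal to the median counts as "below" and an
    economy rate equal to the median counts as "above".
    """
    sr_median = median(sr for sr, _ in players.values())
    er_median = median(er for _, er in players.values())
    labels = {}
    for name, (sr, er) in players.items():
        high_strike_rate = sr > sr_median
        low_economy_rate = er < er_median
        if high_strike_rate and low_economy_rate:
            labels[name] = "Performer"
        elif high_strike_rate:
            labels[name] = "Batting All-rounder"
        elif low_economy_rate:
            labels[name] = "Bowling All-rounder"
        else:
            labels[name] = "Under-performer"
    return labels

# Hypothetical figures, for illustration only.
players = {
    "A": (150.0, 7.0),
    "B": (140.0, 9.0),
    "C": (110.0, 6.5),
    "D": (100.0, 9.5),
}
labels = label_all_rounders(players)
```

With these made-up figures the medians are 125 for strike rate and 8 for economy rate, so each of the four players falls into a different class.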

