Bayesian Prediction Model Based on Attribute Weighting and Kernel Density Estimations

2015 ◽  
Vol 2015 ◽  
pp. 1-7
Author(s):  
Zhong-Liang Xiang ◽  
Xiang-Ru Yu ◽  
Dae-Ki Kang

Although the naïve Bayes learner has been proven to show reasonable performance in machine learning, it often suffers from a few problems when handling real-world data. The first problem is the conditional independence assumption; the second is the use of the frequency estimator. We have therefore proposed methods to solve these two problems revolving around naïve Bayes algorithms. By using an attribute weighting method, we have been able to handle the conditional independence assumption issue, whereas, for the case of the frequency estimator, we have found a way to weaken its negative effects through our proposed smooth kernel method. In this paper, we propose a compact Bayes model in which a smooth kernel augments weights on likelihood estimation. We have also chosen an attribute weighting method that employs a mutual information metric to cooperate with the framework. Experiments have been conducted on UCI benchmark datasets, and the accuracy of our proposed learner has been compared with that of standard naïve Bayes. The experimental results demonstrate the effectiveness and efficiency of our proposed learning algorithm.
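The combination described above can be sketched minimally: per-attribute weights (e.g. derived from mutual information) scale the log-likelihood terms, and each likelihood is a Gaussian-kernel density estimate instead of a frequency estimate. The class name, bandwidth, and uniform default weights below are illustrative assumptions, not the authors' implementation.

```python
import math

class WeightedKernelNB:
    """Naive Bayes with kernel-smoothed likelihoods and attribute weights."""

    def __init__(self, bandwidth=1.0):
        self.bandwidth = bandwidth

    def fit(self, X, y, weights=None):
        self.classes = sorted(set(y))
        n = len(y)
        self.priors = {c: sum(1 for yc in y if yc == c) / n
                       for c in self.classes}
        self.samples = {c: [x for x, yc in zip(X, y) if yc == c]
                        for c in self.classes}
        # Uniform weights by default; mutual-information weights would go here.
        self.weights = weights if weights is not None else [1.0] * len(X[0])
        return self

    def _kde(self, value, points, j):
        # Average of Gaussian kernels centred on the training values of attribute j.
        h = self.bandwidth
        z = len(points) * h * math.sqrt(2 * math.pi)
        return sum(math.exp(-((value - p[j]) ** 2) / (2 * h * h))
                   for p in points) / z

    def predict(self, x):
        def log_posterior(c):
            lp = math.log(self.priors[c])
            for j, v in enumerate(x):
                lik = max(self._kde(v, self.samples[c], j), 1e-300)
                lp += self.weights[j] * math.log(lik)  # weighted log-likelihood
            return lp
        return max(self.classes, key=log_posterior)
```

The weight vector lets informative attributes dominate the posterior while near-irrelevant ones are attenuated, which is the role the mutual-information metric plays in the paper's framework.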

Mathematics ◽  
2021 ◽  
Vol 9 (19) ◽  
pp. 2378
Author(s):  
Shengfeng Gan ◽  
Shiqi Shao ◽  
Long Chen ◽  
Liangjun Yu ◽  
Liangxiao Jiang

Due to its simplicity, efficiency, and effectiveness, multinomial naive Bayes (MNB) has been widely used for text classification. As in naive Bayes (NB), its assumption of the conditional independence of features is often violated and therefore reduces its classification performance. Of the numerous approaches to alleviating this assumption, structure extension has attracted less attention from researchers. To the best of our knowledge, only structure-extended MNB (SEMNB) has been proposed so far. SEMNB averages all weighted super-parent one-dependence multinomial estimators; it is therefore an ensemble learning model. In this paper, we propose a single model called hidden MNB (HMNB) by adapting the well-known hidden NB (HNB). HMNB creates a hidden parent for each feature, which synthesizes the influences of all the other qualified features. For HMNB to learn, we propose a simple but effective learning algorithm that avoids a high-computational-complexity structure-learning process. Our idea can also be used to improve complement NB (CNB) and the one-versus-all-but-one model (OVA); the resulting models are denoted HCNB and HOVA, respectively. Extensive experiments on eleven benchmark text classification datasets validate the effectiveness of HMNB, HCNB, and HOVA.
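For reference, the baseline model that HMNB extends, plain multinomial naive Bayes over bag-of-words counts with Laplace smoothing, can be sketched as follows. This is a generic textbook MNB, not the authors' HMNB code, and the toy documents below are illustrative.

```python
import math
from collections import Counter

class MultinomialNB:
    """Plain multinomial naive Bayes for bag-of-words text classification."""

    def fit(self, docs, labels, alpha=1.0):
        self.classes = sorted(set(labels))
        self.vocab = sorted({w for d in docs for w in d})
        self.log_prior = {}
        self.log_lik = {}
        for c in self.classes:
            class_docs = [d for d, l in zip(docs, labels) if l == c]
            self.log_prior[c] = math.log(len(class_docs) / len(docs))
            counts = Counter(w for d in class_docs for w in d)
            # Laplace smoothing avoids zero likelihoods for unseen words.
            total = sum(counts.values()) + alpha * len(self.vocab)
            self.log_lik[c] = {w: math.log((counts[w] + alpha) / total)
                               for w in self.vocab}
        return self

    def predict(self, doc):
        def score(c):
            return self.log_prior[c] + sum(
                self.log_lik[c][w] for w in doc if w in self.log_lik[c])
        return max(self.classes, key=score)
```

HMNB's departure from this model is in the likelihood term: instead of each word depending only on the class, each feature also conditions on a hidden parent that aggregates the weighted influence of the other features.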


2019 ◽  
Vol 8 (4) ◽  
pp. 2240-2242

Phishing email has become a more dangerous problem in online bank transaction processing, as well as on social networking sites such as Facebook, Twitter, and Instagram. Phishing is normally carried out by spoofing an email, or via text embedded in the email body, which provokes users into entering their credentials. Training users against phishing is not very effective because users do not permanently remember their training or the warning messages; it depends entirely on the action the user takes, at a given moment, on the warning messages shown by software while opening any URL. In this paper, phishing email classification is enhanced using J48, Naïve Bayes, and decision tree classifiers on the Spambase dataset. J48 performs the best classification on Spambase, with 97% for true positives and 0.025% for false negatives. Random forest works best on small datasets, up to 5000 instances with 34 features, but as the dataset size increases and the number of features is reduced, Naïve Bayes works faster.
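The figures reported above, true-positive and false-negative rates, come from a confusion matrix over the classifier's predictions. A minimal sketch of that computation follows; the labels and predictions are illustrative, not the Spambase results.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Count TP, FN, FP, TN with respect to the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fn, fp, tn

def tpr_fnr(y_true, y_pred, positive="spam"):
    """True-positive and false-negative rates; both share the denominator TP+FN."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred, positive)
    total_pos = tp + fn
    return (tp / total_pos, fn / total_pos) if total_pos else (0.0, 0.0)
```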


Author(s):  
Sheela Rani P ◽  
Dhivya S ◽  
Dharshini Priya M ◽  
Dharmila Chowdary A

Machine learning is an analysis discipline that uses data to improve learning, optimizing the training process and the environment in which learning happens. There are two types of machine learning approaches, supervised and unsupervised, which are used to extract the knowledge that helps decision-makers take correct interventions in the future. This paper introduces a model for predicting the factors that influence students' academic performance, using supervised machine learning algorithms such as support vector machine, KNN (k-nearest neighbors), Naïve Bayes, and logistic regression. The results of the various algorithms are compared, and it is shown that the support vector machine and Naïve Bayes perform well, achieving improved accuracy compared to the other algorithms. The final prediction model in this paper has fairly high prediction accuracy. The objective is not just to predict the future performance of students but also to provide the best technique for finding the most impactful features that influence students while studying.
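One of the compared algorithms, k-nearest neighbors, is simple enough to sketch in full: classify a point by a majority vote among its k closest training examples. The feature values and labels below are illustrative, not the paper's student data.

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k Euclidean-nearest training points."""
    neighbours = sorted(
        (math.dist(xt, x), yt) for xt, yt in zip(X_train, y_train))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]
```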


Smart cities, which are becoming overcrowded today, are making human life miserable and prone to more challenges on a daily basis. Overcrowding leads to the vast generation of waste, contributing to air pollution, which in turn affects health and causes various diseases. Even though various measures are taken to recycle waste, the rate at which it is produced keeps growing. This paper deals with the prediction of waste generation using the Naïve Bayes machine learning algorithm (classifier), based on the statistics of previous waste datasets. The datasets used for the prediction are obtained from reliable sources. The algorithm is implemented in PySpark using Anaconda Jupyter. The performance of the classifier on the datasets is analyzed with a confusion matrix, and the accuracy metric is used to rate the efficiency of the classifier. The accuracy obtained indicates that the algorithm can be used effectively for real-time prediction and that, under the independence assumption, it gives more accurate results for large input datasets.
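The evaluation step described above, a confusion matrix plus an accuracy metric, can be sketched minimally; the label values are illustrative, not drawn from the waste datasets.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Counts indexed by (true label, predicted label)."""
    m = {(t, p): 0 for t in labels for p in labels}
    for t, p in zip(y_true, y_pred):
        m[(t, p)] += 1
    return m

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```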


Author(s):  
Liwei Fan ◽  
Kim Leng Poh

A Bayesian Network (BN) encodes a relationship between graphs and probability distributions. In the past, BN was mainly used for knowledge representation and reasoning. Recent years have seen numerous successful applications of BN in classification, among which the Naïve Bayes classifier was found to be surprisingly effective in spite of its simple mechanism (Langley, Iba & Thompson, 1992). It is built upon the strong assumption that different attributes are independent of each other. Despite its many advantages, a major limitation of using the Naïve Bayes classifier is that real-world data may not always satisfy the independence assumption among attributes. This strong assumption can make the prediction accuracy of the Naïve Bayes classifier highly sensitive to correlated attributes. To overcome the limitation, many approaches have been developed to improve the performance of the Naïve Bayes classifier. This article gives a brief introduction to the approaches which attempt to relax the independence assumption among attributes or use certain pre-processing procedures to make the attributes as independent of each other as possible. Previous theoretical and empirical results have shown that the performance of the Naïve Bayes classifier can be improved significantly by using these approaches, while the computational complexity also increases to a certain extent.
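The independence assumption the article discusses can be stated compactly: the Naïve Bayes classifier factorizes the class-conditional likelihood over the attributes, so the posterior over class $c$ given attributes $x_1, \dots, x_n$ is

```latex
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{j=1}^{n} P(x_j \mid c)
```

a factorization that holds exactly only when the attributes are conditionally independent given the class, which is why correlated attributes degrade its accuracy.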

