Learning Naïve Bayes Tree for Conditional Probability Estimation

This study presents an efficient way to deal with discrete as well as continuous values in Big Data in a parallel Naïve Bayes implementation on Hadoop's MapReduce environment. Two approaches were taken: (i) discretizing continuous values using a binning method; and (ii) using a multinomial distribution for probability estimation of discrete values and a Gaussian distribution for probability estimation of continuous values. The models were analyzed and compared for performance with respect to run time and classification accuracy for varying data sizes, data block sizes, and map memory sizes.

Download Full-text

Multinomial Naïve Bayes using similarity based conditional probability

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-181009 ◽

2019 ◽

Vol 36 (2) ◽

pp. 1431-1441 ◽

Cited By ~ 1

Author(s):

B. Santhi ◽

G.R. Brindha

Keyword(s):

Conditional Probability ◽

Naive Bayes ◽

Naïve Bayes

Download Full-text

Improving Tree augmented Naive Bayes for class probability estimation

Knowledge-Based Systems ◽

10.1016/j.knosys.2011.08.010 ◽

2012 ◽

Vol 26 ◽

pp. 239-245 ◽

Cited By ~ 71

Author(s):

Liangxiao Jiang ◽

Zhihua Cai ◽

Dianhong Wang ◽

Harry Zhang

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Probability Estimation ◽

Class Probability Estimation ◽

Class Probability

Download Full-text

A naive Bayes probability estimation model based on self-adaptive differential evolution

Journal of Intelligent Information Systems ◽

10.1007/s10844-013-0279-y ◽

2013 ◽

Vol 42 (3) ◽

pp. 671-694 ◽

Cited By ~ 28

Author(s):

Jia Wu ◽

Zhihua Cai

Keyword(s):

Differential Evolution ◽

Naive Bayes ◽

Naïve Bayes ◽

Probability Estimation ◽

Estimation Model ◽

Model Based ◽

Adaptive Differential Evolution ◽

Self Adaptive

Download Full-text

Learning Naive Bayes for Probability Estimation by Feature Selection

Advances in Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/11766247_43 ◽

2006 ◽

pp. 503-514 ◽

Cited By ~ 5

Author(s):

Liangxiao Jiang ◽

Harry Zhang

Keyword(s):

Feature Selection ◽

Naive Bayes ◽

Naïve Bayes ◽

Probability Estimation

Download Full-text

Mining Aspects on the Social Network

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i1045.0789s19 ◽

2019 ◽

Vol 8 (9S) ◽

pp. 285-289

Keyword(s):

Social Network ◽

Conditional Probability ◽

Significant Role ◽

Naive Bayes ◽

Market Value ◽

Naïve Bayes ◽

Bootstrap Technique ◽

The Social

This paper proposes an effective concept of mining the feedback of product given by the user. In return various solutions are suggested according to the ratings of the aspect and its corresponding weightage. The satisfaction of user is determined by the help of user’s rating and weight of the aspect determines the significance of each aspect in the user’s review. These methodologies are thus, important and play a significant role for the manufacturers and producers to improvise their product and eventually leading to rise in the market value of that particular product. The methodology here extracts the aspects from the feedbacks of users with the help of conditional probability and bootstrap technique. Also an approach that is supervised and is called by the name, Naïve Bayes is used to classify aspect ratings and the sentiment words are considered as properties or features.

Download Full-text

LEARNING DECISION TREES WITH LOG CONDITIONAL LIKELIHOOD

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001410007877 ◽

2010 ◽

Vol 24 (01) ◽

pp. 117-151 ◽

Cited By ~ 2

Author(s):

HAN LIANG ◽

YUHONG YAN ◽

HARRY ZHANG

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Naive Bayes ◽

Naïve Bayes ◽

Probability Estimation ◽

Conditional Likelihood ◽

Learning Models ◽

Class Probability Estimation ◽

Probability Prediction ◽

Class Probability

In machine learning and data mining, traditional learning models aim for high classification accuracy. However, accurate class probability prediction is more desirable than classification accuracy in many practical applications, such as medical diagnosis. Although it is known that decision trees can be adapted to be class probability estimators in a variety of approaches, and the resulting models are uniformly called Probability Estimation Trees (PETs), the performances of these PETs in class probability estimation, have not yet been investigated. We begin our research by empirically studying PETs in terms of class probability estimation, measured by Log Conditional Likelihood (LCL). We also compare a PET called C4.4 with other representative models, including Naïve Bayes, Naïve Bayes Tree, Bayesian Network, KNN and SVM, in LCL. From our experiments, we draw several valuable conclusions. First, among various tree-based models, C4.4 is the best in yielding precise class probability prediction measured by LCL. We provide an explanation for this and reveal the nature of LCL. Second, compared with non tree-based models, C4.4 also performs best. Finally, LCL does not dominate another well-established relevant metric — AUC, which suggests that different decision-tree learning models should be used for different objectives. Our experiments are conducted on the basis of 36 UCI sample sets. We run all the models within a machine learning platform — Weka. We also explore an approach to improve the class probability estimation of Naïve Bayes Tree. We propose a greedy and recursive learning algorithm, where at each step, LCL is used as the scoring function to expand the decision tree. The algorithm uses Naïve Bayes created at leaves to estimate class probabilities of test samples. The whole tree encodes the posterior class probability in its structure. One benefit of improving class probability estimation is that both classification accuracy and AUC can be possibly scaled up. We call the new model LCL Tree (LCLT). Our experiments on 33 UCI sample sets show that LCLT outperforms all state-of-the-art learning models, such as Naïve Bayes Tree, significantly in accurate class probability prediction measured by LCL, as well as in classification accuracy and AUC.

Download Full-text

Naive Bayes Classification Given Probability Estimation Trees

2006 5th International Conference on Machine Learning and Applications (ICMLA'06) ◽

10.1109/icmla.2006.36 ◽

2006 ◽

Cited By ~ 14

Author(s):

Zengchang Qin

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Probability Estimation ◽

Naive Bayes Classification ◽

Naïve Bayes Classification

Download Full-text

Deteksi Komentar Cyberbullying Pada Media Sosial Berbahasa Inggris Menggunakan Naïve Bayes Classification

Jurnal Informatika ◽

10.31311/ji.v7i1.6920 ◽

2020 ◽

Vol 7 (1) ◽

pp. 46-54

Author(s):

Jasman Pardede

Keyword(s):

Feature Selection ◽

Conditional Probability ◽

Naive Bayes ◽

Naïve Bayes ◽

Naive Bayes Classification ◽

Text Preprocessing ◽

Naïve Bayes Classification

Pesatnya perkembangan teknologi dan media sosial dapat memudahkan pengguna untuk menyampaikan informasi. Selain itu, media sosial juga memberikan dampak negatif dengan cara memposting tulisan kejam atau berkomentar semena-mena tanpa memikirkan akibat pada orang lain. Hal inilah yang menjadikan salah satu terjadinya tindak kekerasan dalam dunia maya (Cyberbullying). Tahapan awal yang dilakukan dalam penelitian ini adalah pengolahan bahasa atau yang disebut dengan text preprocessing meliputi tokenizing,casefolding, stopword removal dan stemming. Kemudian feature selection yaitu mengubah dokument teks menjadi matriks dengan tujuan untuk mendapatkan fitur pada setiap kata untuk dijadikan parameter atau kriteria klasifikasi. Untuk pengambilan keputusan apakah komentar mengandung makna bully atau nonbully menggunakan algoritma Naïve Bayes Classification dengan model multinomial naïve bayes. Perhitungan yang dilakukan adalah menghitung nilai probabilitas setiap kata yang muncul berdasarkan classdan nilai perkalian class conditional probability. Berdasarkan hasil eksperimen menggunakan dataset “cyberbullying comments” yang diambil dari Kaggle akurasi yang didapat sebesar 80%, precission 81% dan recall 80%.

Download Full-text