Constructing decision rules from naive bayes model for robust and low complexity classification

A large spectrum of classifiers has been described in the literature. One attractive classification technique is Naïve Bayes (NB), which relies on probability theory. NB has two major limitations. First, it requires rescanning the dataset and applying a set of equations each time an instance is classified, which is expensive if the dataset is relatively large. Second, the inner workings of an NB model may remain difficult for non-statisticians to understand. Rule-based classifiers (RBCs), on the other hand, use IF-THEN rules (henceforth, a rule-set), which are more comprehensible and less complex for classification tasks. To alleviate these NB limitations, this paper presents a method for constructing a rule-set from the NB model, which then serves as an RBC. Experiments with the constructed rule-set were conducted on the Iris, WBC, and Vote datasets. Coverage, Accuracy, M-Estimate, and Laplace are the evaluation metrics applied to the rule-set. On some datasets the rule-set obtains high accuracy, reaching 95.33% and 95.17% for the Iris and Vote datasets, respectively. The constructed rule-set can mimic the classification capability of NB, provide a visual representation of the model, and express rules with fidelity to the model at acceptable accuracy; it is also easier to interpret and adjust than the original model. Hence, the rule-set provides a more comprehensible and lightweight model than NB itself.
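The four rule-quality metrics named in the abstract can be sketched as follows. The abstract does not give the formulas, so this is a minimal sketch that assumes the standard textbook definitions; the `RuleStats` container, the function names, and the example counts are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    """Counts for one IF-THEN rule (hypothetical container, not from the paper)."""
    covered: int  # instances that satisfy the rule's IF part
    correct: int  # covered instances whose class matches the THEN part


def coverage(stats: RuleStats, total: int) -> float:
    # Fraction of the whole dataset that the rule applies to.
    return stats.covered / total


def accuracy(stats: RuleStats) -> float:
    # Fraction of covered instances that the rule classifies correctly.
    return stats.correct / stats.covered


def laplace(stats: RuleStats, num_classes: int) -> float:
    # Laplace-corrected accuracy: pulls small-coverage rules towards 1/num_classes.
    return (stats.correct + 1) / (stats.covered + num_classes)


def m_estimate(stats: RuleStats, prior: float, m: float) -> float:
    # m-estimate: like Laplace, but with an explicit class prior and weight m.
    return (stats.correct + m * prior) / (stats.covered + m)


# Made-up counts for illustration only: a rule covering 50 of 150 instances,
# 48 of them correctly, in a three-class problem.
stats = RuleStats(covered=50, correct=48)
print(coverage(stats, total=150))
print(accuracy(stats))
print(laplace(stats, num_classes=3))
print(m_estimate(stats, prior=1 / 3, m=3))
```

Under these definitions, setting `m` to the number of classes and `prior` to a uniform `1 / num_classes` makes the m-estimate coincide with the Laplace value.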


Twitter Sentiment Analysis towards COVID-19 Vaccines in the Philippines Using Naïve Bayes

Information ◽

10.3390/info12050204 ◽

2021 ◽

Vol 12 (5) ◽

pp. 204

Author(s):

Charlyn Villavicencio ◽

Julio Jerison Macrohon ◽

X. Alphonse Inbaraj ◽

Jyh-Horng Jeng ◽

Jer-Guang Hsieh

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Data Science ◽

Naive Bayes ◽

The Philippines ◽

Naïve Bayes ◽

Social Networking Site ◽

Bayes Model ◽

The Government ◽

Processing Techniques

A year into the COVID-19 pandemic and one of the longest recorded lockdowns in the world, the Philippines received its first delivery of COVID-19 vaccines on 1 March 2021 through WHO’s COVAX initiative. A month into the inoculation of all frontline health professionals and other priority groups, the authors of this study gathered data on the sentiment of Filipinos regarding the Philippine government’s efforts using the social networking site Twitter. Natural language processing techniques were applied to understand the general sentiment, which can help the government in analyzing its response. The tweets were annotated, and a Naïve Bayes model was trained in the RapidMiner data science software to classify English and Filipino language tweets into positive, neutral, and negative polarities. The model reached an accuracy of 81.77%, which exceeds the accuracy of recent sentiment analysis studies using Twitter data from the Philippines.


Session Segmentation Method Based on Naïve Bayes Model

Advanced Engineering Forum ◽

10.4028/www.scientific.net/aef.6-7.576 ◽

2012 ◽

Vol 6-7 ◽

pp. 576-582

Author(s):

Ping Li ◽

Ming Liang Cui ◽

Zhen Shan Hou ◽

Liu Liu Wei ◽

Wen Hao Ying ◽

...

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Time Interval ◽

Segmentation Method ◽

Retrieval Process ◽

Search Activity ◽

Query Suggestion ◽

Bayes Model ◽

Discrimination Model ◽

Naïve Bayes Model

Session segmentation not only contributes to deeper analysis of users’ search behavior but also serves as the foundation for other retrieval research based on users’ complex search behaviors. This paper proposes a session boundary discrimination model that uses time interval and query likelihood on the basis of the Naive Bayes model. Compared with previous studies, the proposed model shows a marked experimental improvement on three measures: recall, precision, and F-value. Owing to its advantage in session boundary discrimination, the model can serve as a tool in personalized information retrieval, query suggestion, search activity analysis, and other fields related to improving search results.
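The three evaluation measures named in the abstract (recall, precision, and F-value) can be illustrated for session boundary detection as below. This is a hedged sketch, not the paper's evaluation code: it assumes a boundary is identified by the index of the query that starts a new session and that the F-value is the balanced F1 score; the function name `boundary_scores` and the example indices are hypothetical.

```python
def boundary_scores(predicted, actual):
    """Return (precision, recall, f_value) for predicted session boundaries.

    Both arguments are collections of query indices at which a new session
    is assumed to start (an assumption made for this sketch).
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)

    # Precision: share of predicted boundaries that are real boundaries.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of real boundaries that were found.
    recall = true_positives / len(actual) if actual else 0.0
    # F-value: harmonic mean of precision and recall (balanced F1).
    if precision + recall == 0:
        return precision, recall, 0.0
    f_value = 2 * precision * recall / (precision + recall)
    return precision, recall, f_value


# Made-up example: four predicted boundaries, three true boundaries, two shared.
p, r, f = boundary_scores(predicted=[3, 7, 12, 20], actual=[3, 7, 15])
print(p, r, f)
```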


An Enhanced Naive Bayes Model for Dissolved Oxygen Forecasting in Shellfish Aquaculture

IEEE Access ◽

10.1109/access.2020.3042180 ◽

2020 ◽

Vol 8 ◽

pp. 217917-217927

Author(s):

Dashe Li ◽

Jiajun Sun ◽

Huanhai Yang ◽

Xueying Wang

Keyword(s):

Dissolved Oxygen ◽

Naive Bayes ◽

Naïve Bayes ◽

Shellfish Aquaculture ◽

Bayes Model ◽

Naïve Bayes Model


Sign prediction by motif naive Bayes model in social networks

Information Sciences ◽

10.1016/j.ins.2020.05.128 ◽

2020 ◽

Vol 541 ◽

pp. 316-331

Author(s):

Si-Yuan Liu ◽

Jing Xiao ◽

Xiao-Ke Xu

Keyword(s):

Social Networks ◽

Naive Bayes ◽

Naïve Bayes ◽

Bayes Model ◽

Naïve Bayes Model


A Multilayer Naïve Bayes Model for Analyzing User’s Retweeting Sentiment Tendency

Computational Intelligence and Neuroscience ◽

10.1155/2015/510281 ◽

2015 ◽

Vol 2015 ◽

pp. 1-11 ◽

Cited By ~ 1

Author(s):

Mengmeng Wang ◽

Wanli Zuo ◽

Ying Wang

Keyword(s):

Information Diffusion ◽

Naive Bayes ◽

Naïve Bayes ◽

Structure Information ◽

Bayes Model ◽

Dynamic Social Network ◽

Dynamic Social Networks ◽

Text Information ◽

Naïve Bayes Model ◽

Tendency Analysis

Microblogging has increasingly become a means of information diffusion via users’ retweeting behavior. Because retweeted content, as context information, reflects a user’s understanding of a microblog, the analysis of users’ retweeting sentiment tendency has gradually become a hot research topic. Targeting online microblogging, a dynamic social network, we investigate how to exploit dynamic retweeting sentiment features in retweeting sentiment tendency analysis. On the basis of time series of a user’s network structure information and published text information, we first model dynamic retweeting sentiment features. We then build Naïve Bayes models from profile-, relationship-, and emotion-based dimensions, respectively. Finally, we build a multilayer Naïve Bayes model on top of these multidimensional Naïve Bayes models to analyze a user’s retweeting sentiment tendency towards a microblog. Experiments on a real-world dataset demonstrate the effectiveness of the proposed framework. Further experiments are conducted to understand the importance of dynamic retweeting sentiment features and temporal information in retweeting sentiment tendency analysis. In addition, we offer a new line of thinking for retweeting sentiment tendency analysis in dynamic social networks.


The application of naive Bayes model averaging to predict Alzheimer's disease from genome-wide data

Journal of the American Medical Informatics Association ◽

10.1136/amiajnl-2011-000101 ◽

2011 ◽

Vol 18 (4) ◽

pp. 370-375 ◽

Cited By ~ 42

Author(s):

Wei Wei ◽

Shyam Visweswaran ◽

Gregory F Cooper

Keyword(s):

Alzheimer's Disease ◽

Naive Bayes ◽

Model Averaging ◽

Naïve Bayes ◽

Bayes Model ◽

Genome Wide ◽

Genome Wide Data ◽

Naïve Bayes Model


COMPARATIVE STUDY OF CLASSIFICATION ALGORITHMS: HOLDOUTS AS ACCURACY ESTIMATION

CogITo Smart Journal ◽

10.31154/cogito.v1i1.2.13-23 ◽

2016 ◽

Vol 1 (1) ◽

pp. 13 ◽

Cited By ~ 1

Author(s):

Debby Erce Sondakh

Keyword(s):

Decision Tree ◽

Nearest Neighbor ◽

Naive Bayes ◽

Decision Rules ◽

Naïve Bayes ◽

Support Vector ◽

Classification Algorithms ◽

K Nearest Neighbor ◽

Accuracy Estimation ◽

F Measure

This study aims to measure and compare the performance of five machine-learning-based text classification algorithms, namely decision rules, decision tree, k-nearest neighbor (k-NN), naïve Bayes, and Support Vector Machine (SVM), using multi-class text documents. The comparison concerns the effectiveness of the algorithms, that is, their ability to classify documents into the correct category, using the holdout or percentage split method. The effectiveness measures used are precision, recall, F-measure, and accuracy. The experimental results show that for the naïve Bayes algorithm, the larger the percentage of training documents, the higher the accuracy of the resulting model. The highest accuracy is reached by naïve Bayes at a 90/10 split, by SVM at 80/20, and by decision tree at 70/30. The experimental results also show that naïve Bayes has the highest effectiveness among the five algorithms tested and the fastest classification model build time, at 0.02 seconds. The decision tree algorithm can classify text documents with higher accuracy than SVM, but its model build time is slower. In terms of model build time, k-NN is the fastest, but its accuracy is lower.
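The holdout (percentage split) estimate used in this comparison can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the study's actual setup: the helper names `holdout_split` and `holdout_accuracy`, the fixed random seed, and the toy data are all hypothetical, and any classifier exposed as a `predict` function could be plugged in.

```python
import random


def holdout_split(items, train_fraction, seed=0):
    """Shuffle the items and split them once into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def holdout_accuracy(predict, test_set):
    """Share of (features, label) pairs in test_set that predict gets right."""
    hits = sum(1 for features, label in test_set if predict(features) == label)
    return hits / len(test_set)


# Toy stand-in for a labelled document collection: (features, label) pairs.
documents = [(i, i % 2) for i in range(100)]

# The three split ratios mentioned in the abstract: 90/10, 80/20, 70/30.
for fraction in (0.9, 0.8, 0.7):
    train, test = holdout_split(documents, fraction)
    print(fraction, len(train), len(test))
```

A single holdout split gives one accuracy estimate per ratio, so the result depends on which documents happen to land in the test portion; fixing the seed only makes that one draw reproducible.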


Mining the crime data using naïve Bayes model

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v23.i2.pp1084-1092 ◽

2021 ◽

Vol 23 (2) ◽

pp. 1084

Author(s):

Lourdes M. Padirayon ◽

Melvin S. Atayan ◽

Jose Sherief Panelo ◽

Carlito R. Fagela, Jr

Keyword(s):

Law Enforcement ◽

Police Officers ◽

Naive Bayes ◽

Naïve Bayes ◽

Police Departments ◽

Data Set ◽

Crime Data ◽

Bayes Model ◽

Massive Number ◽

Index Crime

Police departments worldwide handle a massive number of documents on crime, and today's criminals are becoming technologically sophisticated. One obstacle faced by law enforcement is the complexity of processing voluminous crime data. Approximately 439 crimes have been registered in Sanchez Mira municipality in the past seven years. Police officers have no clear view of the pattern of crimes in the municipality, the peak hours and months of commission, or the locations where crimes are concentrated. The naïve Bayes model, a classification algorithm, was applied through the RapidMiner Auto Model to analyze the crime data set. This approach helps to recognize crime trends; most of the crimes committed were violations of special penal laws. May has the highest number of index and non-index crimes, and Tuesday is the day with the most crimes. The hotspots were Barangay Centro 1 for non-index crimes and Barangay Centro 2 for index crimes. Most non-index crimes were violations of special laws; among index crimes, rape was the most frequently recorded and usually occurred at 2 o’clock in the afternoon. The findings can inform decisions aimed at maximizing the efficacy of crime solutions.
