Illiteracy Classification Using K Means-Naïve Bayes Algorithm

Muhammad Firman Aji Saputra; Triyanna Widiyaningtyas; Aji Prasetya Wibawa

doi:10.30630/joiv.2.3.129

Illiteracy Classification Using K Means-Naïve Bayes Algorithm

JOIV International Journal on Informatics Visualization ◽

10.30630/joiv.2.3.129 ◽

2018 ◽

Vol 2 (3) ◽

pp. 153 ◽

Cited By ~ 1

Author(s):

Muhammad Firman Aji Saputra ◽

Triyanna Widiyaningtyas ◽

Aji Prasetya Wibawa

Keyword(s):

Error Rate ◽

Naive Bayes ◽

Research Result ◽

Naïve Bayes ◽

Optimal Number ◽

Testing Method ◽

The World ◽

Bayes Algorithm ◽

Number Of Classes

Illiteracy is the inability to recognize written characters, whether for reading or for writing. It is a significant problem for countries around the world, including Indonesia, where the illiteracy rate is generally used as an indicator of whether education has been successful. If the problem is not overcome, it will affect people's prosperity. One approach that has been used is to prioritize treatment of the areas with the highest illiteracy rates, followed by areas with lower rates. This approach is much easier to apply when it is supported by a classification process. Because classification requires predefined classes, and no suitable classification of illiteracy rates exists yet, a clustering process is needed before classification. This research aims to find the optimal number of classes through clustering and to evaluate the resulting illiteracy classification. Clustering is performed with the k-means algorithm and classification with the Naïve Bayes algorithm. Classification performance is assessed with 10-fold cross-validation. Based on the results, the optimal number of illiteracy classes is three, with a classification accuracy of 96.4912% and an error rate of 3.5088%. By comparison, classification with two classes achieves an accuracy of 93.8596% and an error rate of 6.1404%, and classification with five classes achieves an accuracy of 90.3509% and an error rate of 9.6491%.
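The cluster-then-classify pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the illiteracy rates are synthetic, a single feature is used, the k-means start is deterministic, the Naïve Bayes variant is assumed to be Gaussian, and the folds are formed by interleaving indices. All function names are hypothetical.

```python
import math
from statistics import mean, pvariance


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalar values; returns one cluster label per value."""
    # Assumption: deterministic start, centroids spread across the sorted values.
    ordered = sorted(values)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels


def fit_gnb(X, y):
    """Gaussian Naive Bayes: per class, a prior plus (mean, variance) per feature."""
    model = {}
    for cls in set(y):
        rows = [x for x, lab in zip(X, y) if lab == cls]
        stats = [(mean(col), pvariance(col) + 1e-9) for col in zip(*rows)]
        model[cls] = (len(rows) / len(X), stats)
    return model


def predict_gnb(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(cls):
        prior, stats = model[cls]
        total = math.log(prior)
        for value, (mu, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


def kfold_accuracy(X, y, folds=10):
    """k-fold cross-validation; fold f holds out indices f, f + folds, ..."""
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(X), folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [lab for i, lab in enumerate(y) if i not in test_idx]
        model = fit_gnb(train_X, train_y)
        correct += sum(predict_gnb(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


# Synthetic illiteracy rates (percent) for 30 hypothetical regions.
rates = [base + 0.1 * i for base in (1.0, 10.0, 25.0) for i in range(10)]
labels = kmeans_1d(rates, k=3)           # step 1: derive classes by clustering
X = [[r] for r in rates]
accuracy = kfold_accuracy(X, labels)     # step 2: classify and cross-validate
error_rate = 1.0 - accuracy
print(f"accuracy={accuracy:.4%} error_rate={error_rate:.4%}")
```

Repeating the two steps for several values of `k` and comparing the cross-validated accuracies mirrors, in spirit, how the abstract compares two, three, and five classes.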


Preparing Annotated Data on Covid -19 by Employing Naïve Bayes

10.5121/csit.2021.111211 ◽

2021 ◽

Author(s):

Dipankar Das ◽

Akash Ghosh ◽

AdityaR Rayala ◽

Dibyajyoti Dhar ◽

Vidit Sarkar ◽

...

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

The Past ◽

Positive Side ◽

The People ◽

The World ◽

Bayes Algorithm ◽

Opening Up

The ongoing pandemic has opened a Pandora's box of hidden problems that society has been concealing for years. The positive side of the present scenario is that it opens up opportunities to solve these problems on the global stage. One place that has been flooded with all kinds of emotions and reactions from people all over the world is Twitter, a microblogging platform. Coronavirus-related hashtags have been trending for many days, unlike any other event in the past. Our experiment mainly deals with the collection, tagging, and classification of these tweets based on the keywords they relate to, using the Naive Bayes algorithm at the core.


Perbandingan Optimasi Feature Selection pada Naïve Bayes untuk Klasifikasi Kepuasan Airline Passenger

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i3.3086 ◽

2021 ◽

Vol 5 (3) ◽

pp. 527-533

Author(s):

Yoga Religia ◽

Amali Amali

Keyword(s):

Feature Selection ◽

Customer Satisfaction ◽

Naive Bayes ◽

Naïve Bayes ◽

Point Of View ◽

Classification Model ◽

Passenger Satisfaction ◽

Airline Passenger ◽

Bayes Algorithm

The quality of an airline's services cannot be measured from the company's point of view, but must be seen from the point of view of customer satisfaction. Data mining techniques make it possible to predict airline customer satisfaction with a classification model. The Naïve Bayes algorithm has demonstrated outstanding classification accuracy, but its independence assumption is currently rarely discussed. Some literature suggests using attribute weighting to relax the independence assumption, which can be done with particle swarm optimization (PSO) and a genetic algorithm (GA) through feature selection. This study compared PSO and GA optimization of Naïve Bayes for classifying Airline Passenger Satisfaction data taken from www.kaggle.com. In testing, the best performance was obtained by the Naïve Bayes model with PSO optimization, with an accuracy of 86.13%, a precision of 87.90%, a recall of 87.29%, and an AUC of 0.923.


Analysis and Classification of Danger Level in Android Applications Using Naive Bayes Algorithm

2018 6th International Conference on Information and Communication Technology (ICoICT) ◽

10.1109/icoict.2018.8528733 ◽

2018 ◽

Author(s):

Ridho Alif Utama ◽

Parman Sukarno ◽

Erwid Musthofa Jadied

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Android Applications ◽

Bayes Algorithm ◽

Danger Level


Analisa Topik Pendidikan Dalam Al-Quran dengan Pendekatan Text Mining

Jurnal Serambi Engineering ◽

10.32672/jse.v6i1.2649 ◽

2021 ◽

Vol 6 (1) ◽

Author(s):

Bustami Yusuf ◽

Muhammad Zaeki ◽

Hendri Ahmadian ◽

Khairan Ar ◽

Sri Wahyuni

Keyword(s):

Evaluation Method ◽

Naive Bayes ◽

Naïve Bayes ◽

Bayes Classifier ◽

Sources Of Knowledge ◽

Scientific Disciplines ◽

Study Results ◽

The World ◽

The Subject ◽

Bayes Algorithm

Education is one of the sciences that improves people through the study of various scientific disciplines. The Al-Quran is one of the sources of knowledge in which Muslims around the world believe. Technology has penetrated almost every domain of our lives, including the world of education. The authors therefore use technology as a tool for researching educational topics in the Al-Quran by implementing text exploration. The research was carried out by selecting basic words related to the subject of education as keywords: “Ajar”, “Bicara”, “Cipta”, “Dengar”, “Ingat” and “Lihat”. The authors then implemented the Naïve Bayes Classifier algorithm and evaluated the results using two measures, recall and precision. The resulting keyword shares are “Cipta” 3.05%, “Ingat” 2.25%, “Ajar” 1.96%, “Lihat” 0.82%, “Dengar” 0.62%, and “Bicara” 0.34%, with a total weight of 3,516 filtered words. Overall, these account for 9.04% of the 38,761 words in the Al-Quran. For the Naïve Bayes algorithm, the recall and precision scores are 0.605 and 0.366, respectively.
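For readers unfamiliar with the two evaluation measures named in this abstract, the sketch below shows how precision and recall are computed from true and predicted labels. The labels and the class names "education" and "other" are invented for illustration and are not taken from the study.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for eight text fragments.
truth = ["education"] * 4 + ["other"] * 4
predicted = ["education", "education", "other", "other",
             "education", "other", "other", "other"]

precision, recall = precision_recall(truth, predicted, positive="education")
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision is the share of fragments predicted as positive that really are positive, while recall is the share of truly positive fragments that the classifier found.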


Classifying the Level of Energy-Environmental Efficiency Rating of Brazilian Ethanol

Energies ◽

10.3390/en13082067 ◽

2020 ◽

Vol 13 (8) ◽

pp. 2067

Author(s):

Nilsa Duarte da Silva Lima ◽

Irenilza de Alencar Nääs ◽

João Gilberto Mendes dos Reis ◽

Raquel Baracat Tosi Rodrigues da Silva

Keyword(s):

Decision Tree ◽

High Efficiency ◽

Rating Scale ◽

Naive Bayes ◽

Naïve Bayes ◽

Environmental Efficiency ◽

Classification Model ◽

Bayes Algorithm ◽

J48 Decision Tree

The present study aimed to assess and classify energy-environmental efficiency levels to reduce greenhouse gas emissions in the production, commercialization, and use of biofuels certified by the Brazilian National Biofuel Policy (RenovaBio). The parameters of the level of energy-environmental efficiency were standardized and categorized according to the Energy-Environmental Efficiency Rating (E-EER). The rating scale ranged from lower efficiency (D) to the highest efficiency (A+). Classification with the J48 decision tree and naive Bayes algorithms was used to build predictive models. Classifying the E-EER scores with the J48 decision tree and the naive Bayes classifier produced models that were efficient at estimating the efficiency level of Brazilian ethanol producers and importers certified by RenovaBio. The rules generated by the models can assess the level classes (efficiency scores) according to the scale discretized into high efficiency (Classification A), average efficiency (Classification B), and standard efficiency (Classification C). These results might generate an ethanol energy-environmental efficiency label for the end consumers and resellers of the product, to assist in making a purchase decision concerning its performance. The best classification model was naive Bayes, compared to the J48 decision tree. The classification of the Energy Efficiency Note levels using the naive Bayes algorithm produced a model capable of estimating the efficiency level of Brazilian ethanol to create labels.


Analisis Klasifikasi Kanker Payudara Menggunakan Algoritma Naive Bayes

INFORMAL: Informatics Journal ◽

10.19184/isj.v4i3.14170 ◽

2020 ◽

Vol 4 (3) ◽

pp. 117

Author(s):

Hardian Oktavianto ◽

Rahman Puji Handri

Keyword(s):

Breast Cancer ◽

Naive Bayes ◽

Naïve Bayes ◽

World Health ◽

Average Percentage ◽

Average Value ◽

Treatment Measures ◽

Bayes Algorithm ◽

Health Organization

Breast cancer is one of the leading causes of death among women, ranking second after lung cancer. According to the World Health Organization, 1 million women are diagnosed with breast cancer every year and half of them die; in general this is due to slow early treatment, with the cancer only being detected after it has entered the final stage. In the field of health and medicine, machine learning-based classification has been used to help doctors and health professionals classify types of cancer and determine which treatment measures should be performed. In this study, breast cancer classification is carried out using the Naive Bayes algorithm to group the types of cancer. The dataset used is from the Wisconsin breast cancer database. The results show that the Naive Bayes algorithm performs well on breast cancer classification: on average, 96.9% of the data is classified correctly and only 3.1% incorrectly. The effectiveness of classification with Naive Bayes is also high, with average precision and recall of around 0.96. The highest precision and recall, 0.974 and 0.973 respectively, are obtained when the test data uses a percentage split of 40%.


Implementasi Data Mining Untuk Memprediksi Penyakit Jantung Mengunakan Metode Naive Bayes

Journal of Innovation Information Technology and Application (JINITA) ◽

10.35970/jinita.v1i01.64 ◽

2019 ◽

Vol 1 (01) ◽

pp. 25-34

Author(s):

Ade Riani ◽

Yessy Susianto ◽

Nur Rahman

Keyword(s):

Data Mining ◽

Heart Rate ◽

Heart Disease ◽

Chest Pain ◽

Naive Bayes ◽

Naïve Bayes ◽

Mining Method ◽

The World ◽

Bayes Algorithm ◽

Exercise Induced

Heart disease is a disease with a high mortality rate in the field of health. Its cause often goes unnoticed. However, several parameters can be used to predict whether a person is at risk of heart disease. This study uses several indicators, including Age, Sex, Chest pain type, Trestbps, Cholesterol, Fasting blood sugar, Resting ECG, Max heart rate, Exercise-induced angina, Oldpeak, Slope, Number of vessels coloured, and Thal. The research performs its calculations using the Data Mining method with the Naive Bayes Algorithm. The study obtains an accuracy of 86% on the 303 data records tested.


KLASIFIKASI TEKS MENGGUNAKAN CHI SQUARE FEATURE SELECTION UNTUK MENENTUKAN KOMIK BERDASARKAN PERIODE, MATERI DAN FISIK DENGAN ALGORITMA NAIVE BAYES

Compiler ◽

10.28989/compiler.v5i2.171 ◽

2016 ◽

Vol 5 (2) ◽

Author(s):

Siti Anisah ◽

Anton Setiawan Honggowibowo ◽

Asih Pujiastuti

Keyword(s):

Feature Selection ◽

Error Rate ◽

Classification System ◽

Naive Bayes ◽

Naïve Bayes ◽

Chi Square ◽

Oracle Database ◽

Category O ◽

The Difference ◽

Bayes Algorithm

A comic has its own characteristics compared with other types of books. The difference between comics and other books can be seen in the categories of period, material, and physical form. An application with a classification system was needed to distinguish comics from other books. To address this problem, a classification system was built using Chi Square Feature Selection and the Naive Bayes algorithm to identify comics based on period, material, and physical form. The Delphi programming language and Oracle Database were used to build the classification system. Chi Square Feature Selection yielded a trait value of 0.10347 for comics and 1.9531 for non-comics. The data were then classified with the Naive Bayes algorithm. Of 120 titles, 60 comic and non-comic titles were used to build the training classes and 60 comic and non-comic titles were used for testing. The Naive Bayes algorithm achieved 96.67% for comics, with a 3.33% error rate, and 90% for non-comics, with a 10% error rate. The classification performs well at identifying comics.
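As a rough illustration of chi-square feature selection, the sketch below scores a single term against a two-class label using the standard 2x2 contingency-table statistic. The counts are invented (chosen only so that they sum to 60 titles) and the function name is hypothetical; the abstract does not describe how the authors computed their values.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table of term presence versus class.

    a: term present, comic        b: term present, non-comic
    c: term absent, comic         d: term absent, non-comic
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a row or column is empty, so the term carries no signal
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts over 60 training titles.
informative = chi_square_2x2(a=25, b=5, c=5, d=25)
uninformative = chi_square_2x2(a=15, b=15, c=15, d=15)
print(f"informative term: {informative:.4f}")
print(f"uninformative term: {uninformative:.4f}")
```

Terms with a higher score depend more strongly on the class, so a feature selector would keep the highest-scoring terms and pass only those to the Naive Bayes classifier.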


Opinion Mining on Culinary Food Customer Satisfaction Using Naïve Bayes Based-on Hybrid Feature Selection

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v15.i1.pp468-475 ◽

2019 ◽

Vol 15 (1) ◽

pp. 468 ◽

Cited By ~ 3

Author(s):

Oman Somantri ◽

Dyah Apriliani

Keyword(s):

Feature Selection ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Classification Model ◽

Consumer Ratings ◽

Bayes Algorithm ◽

Restaurant Owners

Assessing consumer sentiment about culinary food from social media provides useful information for anyone who wants it, especially migrants and tourists; on the other hand, the information is also very valuable to food stall and restaurant owners for improving food quality. To obtain this information, a sentiment analysis classification model using the naïve Bayes (NB) algorithm was applied. The problem is that the classification accuracy for consumer ratings of culinary food is still not optimal, because the value weights in the data preprocessing stage are not optimal. This paper proposes a hybrid feature selection model, combining information gain (IG) and a genetic algorithm (GA), to overcome the suboptimal selection of feature attributes. The experiments showed that, compared with the other algorithms, the proposed model produced the best accuracy, 93%.
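The information gain half of such a hybrid selector can be sketched as below. This covers only the IG score of one feature against the sentiment label; the genetic algorithm stage is omitted, and the tiny dataset and function names are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical reviews: sentiment label plus two binary "term present" features.
sentiment = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg"]
term_predictive = [1, 1, 1, 1, 0, 0, 0, 0]   # appears only in positive reviews
term_noise = [1, 0, 1, 0, 1, 0, 1, 0]        # appears equally in both classes

ig_predictive = information_gain(term_predictive, sentiment)
ig_noise = information_gain(term_noise, sentiment)
print(f"predictive term IG={ig_predictive:.3f}, noise term IG={ig_noise:.3f}")
```

A filter based on this score would rank terms by information gain and keep the top ones; a search method such as a genetic algorithm could then refine the retained subset.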


Image classification of art works based on multiple naive Bayes algorithm

International Journal of Arts and Technology ◽

10.1504/ijart.2021.10039375 ◽

2021 ◽

Vol 13 (2) ◽

pp. 1

Author(s):

Gang Liang

Keyword(s):

Image Classification ◽

Naive Bayes ◽

Naïve Bayes ◽

Art Works ◽

Bayes Algorithm
