Analysis and Classification of Danger Level in Android Applications Using Naive Bayes Algorithm

An airline's service quality cannot be measured from the company's point of view alone; it must be judged by customer satisfaction. Data mining techniques make it possible to predict airline customer satisfaction with a classification model. The Naïve Bayes algorithm has demonstrated strong classification accuracy, but its attribute-independence assumption is rarely discussed. Some literature suggests attribute weighting to relax the independence assumption, which can be done through feature selection with particle swarm optimization (PSO) or a genetic algorithm (GA). This study compares PSO and GA optimization of Naïve Bayes for classifying the Airline Passenger Satisfaction data taken from www.kaggle.com. In testing, the best performance came from Naïve Bayes with PSO optimization, with an accuracy of 86.13%, a precision of 87.90%, a recall of 87.29%, and an AUC of 0.923.

Classifying the Level of Energy-Environmental Efficiency Rating of Brazilian Ethanol

Energies ◽

10.3390/en13082067 ◽

2020 ◽

Vol 13 (8) ◽

pp. 2067

Author(s):

Nilsa Duarte da Silva Lima ◽

Irenilza de Alencar Nääs ◽

João Gilberto Mendes dos Reis ◽

Raquel Baracat Tosi Rodrigues da Silva

Keyword(s):

Decision Tree ◽

High Efficiency ◽

Rating Scale ◽

Naive Bayes ◽

Naïve Bayes ◽

Environmental Efficiency ◽

Classification Model ◽

Bayes Algorithm ◽

J48 Decision Tree

The present study aimed to assess and classify energy-environmental efficiency levels to reduce greenhouse gas emissions in the production, commercialization, and use of biofuels certified by the Brazilian National Biofuel Policy (RenovaBio). The parameters of the level of energy-environmental efficiency were standardized and categorized according to the Energy-Environmental Efficiency Rating (E-EER). The rating scale ranged from lower efficiency (D) to the highest efficiency (A+). The J48 decision tree and naive Bayes algorithms were used to build the classification models. Classifying the E-EER scores with the J48 decision tree and the naive Bayes classifier produced models that efficiently estimate the efficiency level of Brazilian ethanol producers and importers certified by RenovaBio. The rules generated by the models can assess the efficiency classes according to a scale discretized into high efficiency (Classification A), average efficiency (Classification B), and standard efficiency (Classification C). These results might be used to generate an ethanol energy-environmental efficiency label for end consumers and resellers of the product, to support purchase decisions based on its performance. Naive Bayes was the best classification model, outperforming the J48 decision tree. Classifying the Energy Efficiency Note levels with naive Bayes produced a model capable of estimating the efficiency level of Brazilian ethanol for labeling.

Analisis Klasifikasi Kanker Payudara Menggunakan Algoritma Naive Bayes [Breast Cancer Classification Analysis Using the Naive Bayes Algorithm]

INFORMAL: Informatics Journal ◽

10.19184/isj.v4i3.14170 ◽

2020 ◽

Vol 4 (3) ◽

pp. 117

Author(s):

Hardian Oktavianto ◽

Rahman Puji Handri

Keyword(s):

Breast Cancer ◽

Naive Bayes ◽

Naïve Bayes ◽

World Health ◽

Average Percentage ◽

Average Value ◽

Treatment Measures ◽

Bayes Algorithm ◽

Health Organization

Breast cancer is one of the leading causes of death among women, ranking second after lung cancer. According to the World Health Organization, 1 million women are diagnosed with breast cancer every year and half of them die, generally because slow early detection and treatment mean the cancer is only found once it has reached a late stage. In health and medicine, machine learning-based classification has been used to help doctors and health professionals classify types of cancer and determine which treatment measures to take. In this study, breast cancer is classified with the Naive Bayes algorithm to group the types of cancer. The dataset used is from the Wisconsin breast cancer database. The results show that the Naive Bayes algorithm performs well on breast cancer classification: on average 96.9% of the data are classified correctly and only 3.1% incorrectly. The effectiveness of classification with Naive Bayes is also high, with average precision and recall of around 0.96. The highest precision and recall, 0.974 and 0.973 respectively, are obtained when the test data uses a percentage split of 40%.
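For readers unfamiliar with the precision and recall figures quoted above, the following is a minimal sketch of how both are computed for one class from confusion-matrix counts. The counts used here are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) for one class from confusion-matrix counts."""
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that the classifier found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not from the study).
p, r = precision_recall(tp=90, fp=10, fn=5)
print(f"precision={p:.3f} recall={r:.3f}")
```

A precision and recall that are both close to 1, as reported in the abstract, mean that few cases are wrongly flagged and few are missed.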

Opinion Mining on Culinary Food Customer Satisfaction Using Naïve Bayes Based-on Hybrid Feature Selection

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v15.i1.pp468-475 ◽

2019 ◽

Vol 15 (1) ◽

pp. 468 ◽

Cited By ~ 3

Author(s):

Oman Somantri ◽

Dyah Apriliani

Keyword(s):

Feature Selection ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Classification Model ◽

Consumer Ratings ◽

Bayes Algorithm ◽

Restaurant Owners

Assessing consumer sentiment about culinary food from social media yields useful information for anyone who wants it, especially migrants and tourists. On the other hand, this information is also very valuable to food stall and restaurant owners for improving food quality. To obtain this information, a sentiment analysis classification model using the naïve Bayes (NB) algorithm was applied. The problem is that the classification accuracy for consumer ratings of culinary food is still not optimal, because the value weights in the data preprocessing stage are not optimal. This paper proposes a hybrid feature selection model that addresses the suboptimal selection of feature attributes by combining information gain (IG) and a genetic algorithm (GA). The experiments showed that, compared with other algorithms, the proposed model produced the best accuracy, 93%.

Illiteracy Classification Using K Means-Naïve Bayes Algorithm

JOIV International Journal on Informatics Visualization ◽

10.30630/joiv.2.3.129 ◽

2018 ◽

Vol 2 (3) ◽

pp. 153 ◽

Cited By ~ 1

Author(s):

Muhammad Firman Aji Saputra ◽

Triyanna Widiyaningtyas ◽

Aji Prasetya Wibawa

Keyword(s):

Error Rate ◽

Naive Bayes ◽

Research Result ◽

Naïve Bayes ◽

Optimal Number ◽

Testing Method ◽

The World ◽

Bayes Algorithm ◽

Number Of Classes

Illiteracy is the inability to recognize characters, for both reading and writing. It is a significant problem for countries all around the world, including Indonesia. In Indonesia, the illiteracy rate is generally used as an indicator of whether education is successful. If this problem is not overcome, it will affect people's prosperity. One approach that has been used is to prioritize treatment in areas with the highest illiteracy rate, followed by areas with lower rates. This approach is much easier to apply if it is supported by a classification process. Since classification needs classes, and there has not been any well-defined classification of illiteracy rates, a clustering process is needed before classification. This research aims to find the optimal number of classes through clustering and to determine the result of the illiteracy classification process. Clustering is performed with the k-means algorithm, and classification with the Naïve Bayes algorithm. The classification is assessed with 10-fold cross-validation. Based on the results, the optimal number of illiteracy classes is three, with a classification accuracy of 96.4912% and an error rate of 3.5088%. Classification with two classes yields an accuracy of 93.8596% and an error rate of 6.1404%, and classification with five classes yields an accuracy of 90.3509% and an error rate of 9.6491%.
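The accuracy and error-rate pairs reported above are complementary: each error rate is 100% minus the corresponding accuracy. The short sketch below checks this arithmetic for the three reported class counts and picks the one with the highest accuracy. The function and variable names are illustrative, not from the paper.

```python
def error_rate(accuracy_pct):
    """Error rate in percent, as the complement of accuracy in percent."""
    return round(100.0 - accuracy_pct, 4)


# Accuracy values as reported in the abstract, keyed by number of classes.
reported_accuracy = {2: 93.8596, 3: 96.4912, 5: 90.3509}

for k, acc in sorted(reported_accuracy.items()):
    print(f"{k} classes: accuracy={acc}% error={error_rate(acc)}%")

# The class count with the highest accuracy is the one the study calls optimal.
best_k = max(reported_accuracy, key=reported_accuracy.get)
print("best number of classes:", best_k)
```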

Image classification of art works based on multiple naive Bayes algorithm

International Journal of Arts and Technology ◽

10.1504/ijart.2021.10039375 ◽

2021 ◽

Vol 13 (2) ◽

pp. 1

Author(s):

Gang Liang

Keyword(s):

Image Classification ◽

Naive Bayes ◽

Naïve Bayes ◽

Art Works ◽

Bayes Algorithm

Application of the Naïve Bayes Algorithm for Student Graduation Analysis

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.15.23596 ◽

2018 ◽

Vol 7 (4.15) ◽

pp. 421

Author(s):

Erick Akhmad Fahmi Alfa’izy ◽

Khairil Anam ◽

Naidah Naing ◽

Rosanita Tritias Utami ◽

Nur Anim Jauhariyah ◽

...

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Training Data ◽

College System ◽

Student Graduation ◽

Bayes Algorithm ◽

Using Data ◽

Analysis System ◽

Law Student

This study designs an analysis system that predicts graduation by comparing previous data with existing data, in order to overcome errors in a college system. Available data records are processed using the naïve Bayes algorithm. The research was conducted at Universitas Maarif Hasyim Latif. The objective is to analyze student data with the naïve Bayes algorithm to predict graduation. Data from previous Faculty of Law students were sampled as training data, and the full data set was retrieved from records available in the Directorate of Information Systems. The naïve Bayes algorithm can be used to classify data in string or textual form, as shown by the researchers' trials on previously worked example calculations. To evaluate the graduation classification, the naïve Bayes algorithm was tested by comparing a sample of training data against testing data. The resulting accuracy is 77.78%.
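To make the claim about string-valued data concrete, here is a minimal sketch of a categorical naïve Bayes classifier with Laplace (add-one) smoothing. It is not the authors' system: the attribute values, the labels, and the `train` and `predict` helpers are all hypothetical and serve only to illustrate the technique.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute index, label) -> value counts
    values_seen = defaultdict(set)        # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the label with the highest log-posterior under naive Bayes."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            # Laplace smoothing avoids zero probabilities for unseen values.
            score += math.log((count + 1) / (n + len(values_seen[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical string attributes: (attendance, grade band). Not real student data.
rows = [("high", "good"), ("high", "fair"), ("low", "fair"),
        ("low", "poor"), ("high", "good"), ("low", "poor")]
labels = ["on_time", "on_time", "late", "late", "on_time", "late"]

model = train(rows, labels)
print(predict(model, ("high", "good")))
print(predict(model, ("low", "poor")))
```

Working in log space keeps the product of many small probabilities from underflowing, which matters once the number of attributes grows.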

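Pengaruh N-Gram terhadap Klasifikasi Buku menggunakan Ekstraksi dan Seleksi Fitur pada Multinomial Naïve Bayes [The Effect of N-Grams on Book Classification Using Feature Extraction and Selection with Multinomial Naïve Bayes]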

JURNAL MEDIA INFORMATIKA BUDIDARMA ◽

10.30865/mib.v5i1.2672 ◽

2021 ◽

Vol 5 (1) ◽

pp. 264

Author(s):

Esti Mulyani ◽

Fachrul Pralienka Bani Muhamad ◽

Kurnia Adi Cahyanto

Keyword(s):

Naive Bayes ◽

Automatic Classification ◽

Naïve Bayes ◽

Main Task ◽

Test Results ◽

Book Title ◽

Feature Extraction And Selection ◽

N Gram ◽

Bayes Algorithm

A main task of libraries is processing library materials by classifying books according to an established scheme. Dewey Decimal Classification (DDC) is the method most commonly used worldwide to determine book classification (labeling) in libraries. Its advantages are that it is universal and systematic. However, the method is inefficient given the large number of books a library must classify and the need for labels to follow updates to the DDC. An automatic classification system is a suitable solution to this problem, and it can be built with text mining methods. In this study, features are generated from the words in book titles as N-grams (unigram, bigram, trigram), and feature selection is then applied to the generated features. Book titles are classified with the Multinomial Naïve Bayes algorithm. The study examines the effect of unigrams, bigrams, and trigrams on book title classification using feature extraction and selection with Multinomial Naïve Bayes. The test results show that unigrams give the highest accuracy, 74.4%.
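The N-gram feature generation described above can be sketched as follows. This is a minimal illustration that assumes whitespace tokenization and lowercasing, which the abstract does not specify; the `word_ngrams` helper and the sample title are hypothetical.

```python
def word_ngrams(title, n):
    """Return the word-level n-grams of a title as space-joined strings."""
    tokens = title.lower().split()
    # Slide a window of n tokens across the title; too-short titles yield [].
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical book title, used only to show the three feature types.
title = "Introduction to Data Mining"
unigrams = word_ngrams(title, 1)
bigrams = word_ngrams(title, 2)
trigrams = word_ngrams(title, 3)
print(unigrams)
print(bigrams)
print(trigrams)
```

Short titles produce few bigrams and even fewer trigrams, which is one plausible reason unigram features can perform best on title-only data.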

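Analisis Sentimen Sistem E-Tilang Menggunakan Algoritma Naive Bayes Dengan Optimalisasi Information Gain [Sentiment Analysis of the E-Tilang System Using the Naive Bayes Algorithm with Information Gain Optimization]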

Journal of Informatic and Information Security ◽

10.31599/jiforty.v1i1.137 ◽

2020 ◽

Vol 1 (1) ◽

pp. 19-26

Author(s):

Rakhmi Khalida ◽

Siti Setiawati

Keyword(s):

Sentiment Analysis ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Traffic Violations ◽

The Government ◽

Bayes Algorithm ◽

User Friendly

The Government of Indonesia has taken steps to improve public services for traffic violations by implementing the e-ticketing (e-Tilang) system. The system is intended to discipline motorists who commit traffic violations. E-ticketing also helps prevent misconduct by law enforcers, from illegal levies and on-the-spot "settlements" to unaccountable handling of fines. This study performs sentiment analysis, or opinion mining, on the e-ticketing system to classify public comments as giving a positive, negative, or neutral impression. Twitter is one of the platforms where opinions are expressed, because it is user friendly, its topics are up to date, and tweets are openly accessible. Opinions are collected from Twitter and preprocessed; information gain feature selection then helps reduce the noise caused by irrelevant labels; next, sentiments are classified with the Naïve Bayes algorithm; and finally the sentiment polarity is determined. The research resulted in an accuracy of 41.82%, a precision of 50.51%, and a recall of 45.45%. Keywords: Sentiment analysis, E-ticketing, Information Gain, Naive Bayes
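As a rough illustration of information gain feature selection, the sketch below scores a term by how much knowing its presence reduces the entropy of the sentiment labels. It assumes each document is a set of tokens and uses a tiny hypothetical dataset; it is not the authors' pipeline.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in label entropy from splitting documents on a term's presence."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Weighted entropy of the two partitions (empty partitions contribute nothing).
    conditional = sum(len(part) / n * entropy(part)
                      for part in (with_term, without_term) if part)
    return entropy(labels) - conditional


# Hypothetical tokenized tweets and labels, for illustration only.
docs = [{"fast", "fair"}, {"fast", "easy"}, {"slow", "unfair"}, {"slow", "confusing"}]
labels = ["positive", "positive", "negative", "negative"]

ig_fast = information_gain(docs, labels, "fast")
ig_fair = information_gain(docs, labels, "fair")
print(f"IG(fast)={ig_fast:.3f} IG(fair)={ig_fair:.3f}")
```

Terms with the highest scores would be kept as features, and low-scoring terms dropped before training the classifier.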

ALGORITMA KLASIFIKASI NAIVE BAYES DAN SUPPORT VECTOR MACHINE DALAM LAYANAN KOMPLAIN MAHASISWA [Naive Bayes and Support Vector Machine Classification Algorithms in Student Complaint Services]

JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer) ◽

10.33480/jitk.v5i2.1181 ◽

2020 ◽

Vol 5 (2) ◽

pp. 211-220 ◽

Cited By ~ 2

Author(s):

Hermanto Hermanto ◽

Ali Mustopa ◽

Antonius Yadi Kuntoro

Keyword(s):

Support Vector Machine ◽

Text Mining ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Machine Method ◽

The Status ◽

Auc Value ◽

Bayes Algorithm

Service in education is an important element in creating an academic atmosphere conducive to a successful teaching and learning process. Services to students tend to fall short of the minimum service standards that must be provided, so students tend to complain about them. Channels for submitting criticism, complaints, input, or suggestions about dissatisfaction and problems in the university environment are still very limited. Complaints can be constructive if submitted to the right place and party. This research processes email complaints from students submitted to the student academic portal (students.bsi.ac.id). The student complaint data to be processed is in the form of an *.xls complaint file. Before the text data is analyzed with text mining methods, the text must be preprocessed through tokenizing, case folding, stopword removal, and stemming. After preprocessing, each complaint is classified by category and its status is divided into two classes, complaint and not complaint, which is the usual setup in text mining research. The purpose of this study is to find the most accurate algorithm for classifying student complaints by comparing the results of the Naïve Bayes and Support Vector Machine algorithms. The performance of the two algorithms was measured using cross-validation, a confusion matrix, and ROC curves. The Support Vector Machine achieved higher accuracy than Naïve Bayes: on the student complaint dataset (students.bsi.ac.id) it reached an accuracy of 84.45% and an AUC of 0.922, whereas Naïve Bayes reached an accuracy of about 69.75% and an AUC of 0.679.
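The four preprocessing steps named above (tokenizing, case folding, stopword removal, stemming) can be sketched as a small pipeline. This is only an illustration under loud assumptions: the stopword list is a tiny hypothetical one, the suffix stripping is a crude stand-in for a real stemmer, and the example is in English although the original complaints may well be in Indonesian.

```python
import re

# Hypothetical, deliberately tiny stopword list (a real one would be much larger).
STOPWORDS = {"the", "is", "a", "to", "and", "of", "my"}
# Crude suffix stripping as a stand-in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "s")


def preprocess(text):
    """Case-fold, tokenize, remove stopwords, and roughly stem a text."""
    # Case folding plus tokenizing: lowercase, then keep alphabetic runs.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Stopword removal.
    tokens = [t for t in tokens if t not in STOPWORDS]
    stemmed = []
    for token in tokens:
        for suffix in SUFFIXES:
            # Strip at most one suffix, and keep at least three characters.
            if token.endswith(suffix) and len(token) - len(suffix) >= 3:
                token = token[: -len(suffix)]
                break
        stemmed.append(token)
    return stemmed


# Hypothetical complaint text, for illustration only.
result = preprocess("The portal is failing to load my grades")
print(result)
```

The resulting token list is what a bag-of-words classifier such as Naïve Bayes or a Support Vector Machine would then consume.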
