Analisis Optimasi Algoritma Klasifikasi Naive Bayes menggunakan Genetic Algorithm dan Bagging

Rising demand for bank credit has motivated the banking sector to adopt more sophisticated techniques for analyzing credit risk. One such technique is data mining, which extracts meaningful information from large volumes of data, for example through classification. Bank marketing data, however, is imbalanced, so classification applied to it directly gives suboptimal results. Naïve Bayes is a classification algorithm that can be used on imbalanced data and generally performs well, but it needs optimization to produce better results. Several approaches to handling imbalanced data have been developed, among them bagging and genetic algorithms. This study compares the accuracy of the naïve Bayes algorithm after optimization with bagging and a genetic algorithm. The results show that combining bagging with a genetic algorithm improved the performance of naïve Bayes by 4.57%.
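
As a rough illustration of the bagging half of this idea, the sketch below trains several categorical naïve Bayes models on bootstrap samples and combines them by majority vote. It is a minimal sketch and not the study's implementation: the toy rows, the feature values, the class labels, and all function names are assumptions made for illustration, and the genetic algorithm step is left out.

```python
import random
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Collect the counts a categorical Naive Bayes model needs."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(int)   # (feature index, value, label) -> count
    values = defaultdict(set)           # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[(i, v, label)] += 1
            values[i].add(v)
    return class_counts, feature_counts, values

def predict_nb(model, row):
    """Return the class with the largest prior times smoothed likelihoods."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, 0.0
    for label, n in class_counts.items():
        score = n / total
        for i, v in enumerate(row):
            # Laplace smoothing keeps unseen values from zeroing the product.
            score *= (feature_counts[(i, v, label)] + 1) / (n + len(values[i]))
        if best_label is None or score > best_score:
            best_label, best_score = label, score
    return best_label

def bagging_fit(rows, labels, n_models=11, seed=0):
    """Train one Naive Bayes model per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_nb([rows[i] for i in idx], [labels[i] for i in idx]))
    return models

def bagging_predict(models, row):
    """Majority vote over the individual model predictions."""
    votes = Counter(predict_nb(m, row) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical, deliberately imbalanced toy data: 15 "no" rows, 5 "yes" rows.
rows = [("b", "y")] * 15 + [("a", "x")] * 5
labels = ["no"] * 15 + ["yes"] * 5

models = bagging_fit(rows, labels)
print(bagging_predict(models, ("a", "x")))
print(bagging_predict(models, ("b", "y")))
```

Each bootstrap sample sees a slightly different class mix, which is why voting across models can reduce the variance of a single model trained on skewed data.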

Prediksi Tingkat Kelulusan Tepat Waktu Mahasiswa Menggunakan Algoritma Naïve Bayes pada Universitas XYZ

Jurnal ULTIMATICS ◽

10.31937/ti.v12i2.1715 ◽

2020 ◽

Vol 12 (2) ◽

pp. 104-107

Author(s):

Nurhayati ◽

Nuraeny Septianti ◽

Nani Retnowati ◽

Arief Wibowo

Keyword(s):

Data Mining ◽

Information Technology ◽

Data Processing ◽

Naive Bayes ◽

Naïve Bayes ◽

Bayes Method ◽

Processing Data ◽

Student Graduation ◽

Phase Data ◽

Bayes Algorithm

Data processing is essential to the development of information technology. Almost every field of work holds data that can be analyzed to support the job, and processed data increasingly helps workers make decisions. This study predicts students' on-time graduation using the naïve Bayes method. It aims to give universities information they can act on by mining the data of students who have already graduated. The method is naïve Bayes classification, and the researchers followed the six-phase Cross-Industry Standard Process for Data Mining (CRISP-DM). The study concludes that the naïve Bayes algorithm, applied with four parameters (IPS, the semester grade point average; IPK, the cumulative grade point average; the number of credits; and graduation), achieved an accuracy of 80.95%.
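
For reference, the standard naïve Bayes decision rule behind this kind of prediction can be written as follows. The notation is an assumption introduced here, not taken from the paper: c ranges over the graduation classes (labeled "on time" and "late" for illustration) and x_1, ..., x_n are the attribute values of one student.

```latex
\hat{c} \;=\; \arg\max_{c \,\in\, \{\text{on time},\ \text{late}\}} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

The product form follows from the "naïve" assumption that the attributes are conditionally independent given the class.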

Implementation of The Naïve Bayes Algorithm with Feature Selection using Genetic Algorithm for Sentiment Review Analysis of Fashion Online Companies

2018 6th International Conference on Cyber and IT Service Management (CITSM) ◽

10.1109/citsm.2018.8674286 ◽

2018 ◽

Cited By ~ 2

Author(s):

Siti Ernawati ◽

Eka Rini Yulia ◽

Frieyadie ◽

Samudi

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Naive Bayes ◽

Naïve Bayes ◽

Review Analysis ◽

Bayes Algorithm

Optimasi Model Prediksi Kelulusan Mahasiswa Menggunakan Algoritma Naive Bayes

Indonesian Journal of Applied Informatics ◽

10.20961/ijai.v5i1.44379 ◽

2021 ◽

Vol 5 (1) ◽

pp. 32

Author(s):

Hartatik Hartatik

Keyword(s):

Data Mining ◽

Big Data ◽

Prediction Models ◽

Naive Bayes ◽

Program Planning ◽

Naïve Bayes ◽

Bayes Method ◽

Student Graduation ◽

Bayes Algorithm ◽

Naive Bayes Method

Predicting students' graduation status is a persistent problem for universities. In the era of Big Data it is important for universities to predict the academic behavior of active students, so that they can estimate which students are likely to finish their studies on time and identify preventive steps when planning programs. One way to do this is data mining with the Naive Bayes algorithm, one of the methods used to predict student graduation. The researcher applied Naive Bayes using the cumulative grade point average (IPK) as the parameter and compared it with a Naive Bayes prediction based on IPK plus social parameters, namely gender and residence status. In other words, the study starts from academic parameters and optimizes the model with social parameters attached to each student. In the evaluation, Naive Bayes alone reached an accuracy of 75%, while the prediction model with social parameters reached 85%, a difference of 10 percentage points.

Implementasi Data Mining Untuk Memprediksi Penyakit Jantung Mengunakan Metode Naive Bayes

Journal of Innovation Information Technology and Application (JINITA) ◽

10.35970/jinita.v1i01.64 ◽

2019 ◽

Vol 1 (01) ◽

pp. 25-34

Author(s):

Ade Riani ◽

Yessy Susianto ◽

Nur Rahman

Keyword(s):

Data Mining ◽

Heart Rate ◽

Heart Disease ◽

Chest Pain ◽

Naive Bayes ◽

Naïve Bayes ◽

Mining Method ◽

The World ◽

Bayes Algorithm ◽

Exercise Induced

Heart disease has one of the highest mortality rates of any disease worldwide, and its cause often goes unrecognized. However, several parameters can be used to predict whether a person is at risk of heart disease. This study uses the following indicators: Age, Sex, Chest pain type, Trestbps, Cholesterol, Fasting blood sugar, Resting ECG, Max heart rate, Exercise-induced angina, Oldpeak, Slope, Number of vessels coloured, and Thal. The prediction is computed with the data mining method using the Naive Bayes algorithm. The study obtained an accuracy of 86% on the 303 records tested.

The Comparison of Data Mining Methods Using C4.5 Algorithm and Naive Bayes in Predicting Heart Disease

Tech-E ◽

10.31253/te.v4i2.543 ◽

2021 ◽

Vol 4 (2) ◽

pp. 44

Author(s):

Rino Rino

Keyword(s):

Data Mining ◽

Heart Disease ◽

Naive Bayes ◽

Naïve Bayes ◽

Data Set ◽

A Value ◽

C4.5 Algorithm ◽

Calculation Results ◽

Mining Methods ◽

Bayes Algorithm

Heart disease is a condition in which fatty deposits in the coronary arteries change the shape and function of the arteries so that blood flow to the heart is obstructed. Data mining methods can predict this disease; two methods often used in research are the C4.5 algorithm and Naive Bayes. The data set in this research was obtained from the UCI Machine Learning Repository and has 3546 records and 13 attributes. The Naïve Bayes algorithm reached an accuracy of 81.40%, higher than the C4.5 algorithm at 79.07%. Based on the calculation results, Naïve Bayes can be considered a very good classifier because its value falls between 0.709 and 1.00. Because Naïve Bayes has the higher accuracy, the researchers chose it for predicting heart disease.

Optimasi Naive Bayes Menggunakan Algoritma Genetika Sebagai Seleksi Fitur Untuk Memprediksi Performa Siswa

Jurnal Ilmiah Teknologi Informasi Asia ◽

10.32815/jitika.v14i1.400 ◽

2020 ◽

Vol 14 (1) ◽

pp. 31

Author(s):

Suhendro Busono

Keyword(s):

Data Mining ◽

Genetic Algorithm ◽

Student Performance ◽

Parent Education ◽

Naive Bayes ◽

Electronic Media ◽

Naïve Bayes ◽

Parent Support ◽

Long Time ◽

Parent Relation

In this era of globalisation, the morality of teenagers is declining. The phenomenon is visible in mass and electronic media, which frequently report negative incidents among teenagers such as brawls, drug use, gambling, rape, and disobedience to parents. These incidents do not originate in the teenagers themselves but are triggered by bad habits. Lack of parental attention and poor-quality relationships with parents can instil bad habits in children. Parents' education, parents' jobs, and parental support for education can influence a child's mindset, as can how much time the child spends studying, having free time, being with friends, and accessing the internet. These habits can be measured with science and technology: data mining, a branch of computer science, can measure how well a child or adult performs based on habit-forming indicators. In earlier research on student performance using the Naive Bayes method, the number of attributes was large (33 attributes) and the accuracy was 91.15%. In this research, the researcher optimizes the attributes of that earlier work using a Genetic Algorithm, which can select the relevant attributes. Selecting relevant attributes can increase accuracy: after applying the Genetic Algorithm, the accuracy is 97.21%.
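
The attribute selection described here can be sketched as a genetic algorithm over bit masks, where bit i says whether attribute i is kept. This is a minimal sketch under stated assumptions: the `fitness` function is a toy stand-in for the classifier accuracy a real study would measure, the `RELEVANT` set is hypothetical, and the operators (tournament selection, one-point crossover, bit-flip mutation, elitism) are common defaults and not necessarily the ones the paper used. Only the attribute count of 33 comes from the abstract.

```python
import random

N_FEATURES = 33                 # attribute count reported in the abstract
RELEVANT = {0, 3, 7, 12, 20}    # hypothetical: attributes assumed to help the classifier

def fitness(mask):
    """Toy stand-in for Naive Bayes accuracy on the selected attributes."""
    chosen = {i for i, bit in enumerate(mask) if bit}
    return len(chosen & RELEVANT) - 0.1 * len(chosen - RELEVANT)

def tournament(pop, rng, k=3):
    """Pick the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=fitness)

def crossover(a, b, rng):
    """One-point crossover: head of a joined to tail of b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(mask, rng, rate=0.02):
    """Flip each bit independently with a small probability."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]

def select_features(pop_size=40, generations=60, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(pop, key=fitness)]   # elitism: carry the best mask forward
        while len(children) < pop_size:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    best = max(pop, key=fitness)
    return [i for i, bit in enumerate(best) if bit]

selected = select_features()
print("selected attribute indices:", selected)
```

In a real experiment, `fitness` would train and validate the classifier on the chosen columns, which makes each generation far more expensive than in this toy version.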

Genetic Algorithm based CFS and Naive Bayes Algorithm to Enhance the Predictive Accuracy

Indian Journal of Science and Technology ◽

10.17485/ijst/2015/v8i26/53086 ◽

2015 ◽

Vol 8 (26) ◽

Cited By ~ 2

Author(s):

T. Karthikeyan ◽

P. Thangaraju

Keyword(s):

Genetic Algorithm ◽

Naive Bayes ◽

Predictive Accuracy ◽

Naïve Bayes ◽

Bayes Algorithm

Analisis Kelayakan Lokasi Promosi Dalam Penerimaan Mahasiswa Baru (PMB) Dengan Algoritma Naïve Bayes & Decission Tree C4.5

Kilat ◽

10.33322/kilat.v10i1.1196 ◽

2021 ◽

Vol 10 (1) ◽

pp. 169-178

Author(s):

Wulan Wulandari

Keyword(s):

Data Mining ◽

Naive Bayes ◽

Naïve Bayes ◽

Classification Algorithms ◽

Tertiary Institution ◽

Public And Private ◽

Measurement Results ◽

Bayes Algorithm ◽

The City ◽

New Student

Competition for new student admissions among public and private tertiary institutions grows every year, and some institutions spend heavily on promotional activities. To help institutions obtain recommendations on the feasibility of promotion locations, this study measures several criteria using classification algorithms from data mining. The algorithms compared for assessing promotion locations in the city and district of Bekasi are Naïve Bayes and the C4.5 decision tree, using four parameters, including the number of students in a sub-district, the distance to the location, and the previous year's applicants, across 35 sub-districts in Bekasi city and district. Measured with RapidMiner, the accuracy of the Naïve Bayes algorithm is 91.43% and that of the C4.5 decision tree is 94.29%.

Analisis Sentimen Multi-Aspek Berbasis Konversi Ikon Emosi dengan Algoritme Naïve Bayes untuk Ulasan Wisata Kuliner Pada Web Tripadvisor

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2020731907 ◽

2020 ◽

Vol 7 (4) ◽

pp. 737

Author(s):

Sitti Aliyah Azzahra ◽

Arief Wibowo

Keyword(s):

Data Mining ◽

Naive Bayes ◽

Naïve Bayes ◽

Emotional Expressions ◽

Tourist Attraction ◽

Test Results ◽

Tourist Attractions ◽

Bayes Algorithm ◽

Labeling Method ◽

The City

Tourists often look for information about attractions on websites such as TripAdvisor. TripAdvisor lets registered users review culinary attractions in many countries, and other tourists can weigh those reviews before visiting a culinary destination. The comments and reviews on TripAdvisor can be analyzed to determine the sentiment toward a reviewed attraction, and the results are useful to attraction managers, culinary entrepreneurs, and other tourists. A challenge arises when sentiment analysis is applied to review sentences that contain emotion icons (emoticons), because the sentiment of the sentence may differ from that of the emotional expression. This study analyzes reviews of culinary venues in the city of Bandung on TripAdvisor and classifies sentiment into three classes. It uses data mining classification with the Naïve Bayes algorithm, combined with a multi-aspect labeling method and the conversion of emotion icons in the review text. In addition, reviews are weighted according to the number of contributions the reviewer has made on TripAdvisor. The test results show that using all of these methods together in sentiment classification yields an accuracy of 98.67%.
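
One plausible reading of the emotion-icon conversion step is a preprocessing pass that replaces each emoticon with a sentiment-bearing word before the text reaches the classifier. The sketch below illustrates that reading only. The `EMOTICON_WORDS` mapping and the function name are hypothetical, since the abstract does not say which icons or replacement words were used.

```python
import re

# Hypothetical mapping from emoticons to replacement words.
EMOTICON_WORDS = {
    ":)": "happy",
    ":-)": "happy",
    ":D": "delighted",
    ":(": "sad",
    ":-(": "sad",
    ":/": "unsure",
}

# Try longer emoticons first so ":-)" is not split by a shorter alternative.
_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_WORDS, key=len, reverse=True))
)

def convert_emoticons(text):
    """Replace known emoticons with words and return whitespace-split tokens."""
    replaced = _PATTERN.sub(lambda m: " " + EMOTICON_WORDS[m.group(0)] + " ", text)
    return replaced.split()

print(convert_emoticons("The batagor was great :)"))
print(convert_emoticons("Slow service :-("))
```

After conversion, the emoticon contributes an ordinary token, so a word-based Naïve Bayes model can weigh it alongside the rest of the sentence.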

Sentiment Analysis Using Naive Bayes Algorithm with Feature Selection Particle Swarm Optimization (PSO) and Genetic Algorithm

International Journal of Advances in Data and Information Systems ◽

10.25008/ijadis.v2i2.1224 ◽

2021 ◽

Vol 2 (2) ◽

Author(s):

Abi Rafdi ◽

Herman Mawengkang ◽

Syahril Efendi

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Particle Swarm Optimization ◽

Sentiment Analysis ◽

Naive Bayes ◽

Confusion Matrix ◽

Particle Swarm ◽

Naïve Bayes ◽

Swarm Optimization ◽

Bayes Algorithm

This study analyzes sentiment, that is, the opinions, points of view, judgments, attitudes, and emotions toward entities and their aspects as expressed in text. Social media such as Twitter is one of the most widely used means of communication and a frequent research subject. The main problem in sentiment analysis is selecting and using the best features for maximum results. Naive Bayes is among the most widely known classification methods, but it is very sensitive to the features it is given. This study therefore compares feature selection with Particle Swarm Optimization and with a Genetic Algorithm as ways to improve the accuracy of the Naive Bayes algorithm. The analysis compares results before and after feature selection. Validation uses cross-validation, and a confusion matrix is used to measure accuracy. Naïve Bayes accuracy rose from 60.26% to 77.50% with Particle Swarm Optimization feature selection and from 60.26% to 70.71% with the Genetic Algorithm. Particle Swarm Optimization is therefore the better choice of feature selection, with an accuracy gain of 17.24 percentage points.
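
To make the evaluation step concrete, the short sketch below computes accuracy from a confusion matrix and reproduces the gains implied by the accuracies reported above. The 2x2 matrix is a hypothetical example and not data from the study. Only the three accuracy figures come from the abstract.

```python
def accuracy(confusion):
    """Accuracy = correctly classified instances (the diagonal) / all instances."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Hypothetical confusion matrix (rows = actual class, columns = predicted class).
example = [[45, 5],
           [10, 40]]
print(f"{accuracy(example):.2%}")   # 85.00%

# Accuracies reported in the abstract, in percent.
baseline, pso, ga = 60.26, 77.50, 70.71
print(round(pso - baseline, 2), round(ga - baseline, 2))   # 17.24 10.45
```

The second line of output shows that the Genetic Algorithm gain works out to 10.45 percentage points, against 17.24 for Particle Swarm Optimization.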
