Comparison of Classification Data Mining C4.5 and Naïve Bayes Algorithms of EDM Dataset

TEM Journal ◽  
2021 ◽  
pp. 1738-1744
Author(s):  
Joseph Teguh Santoso ◽  
Ni Luh Wiwik Sri Rahayu Ginantra ◽  
Muhammad Arifin ◽  
R Riinawati ◽  
Dadang Sudrajat ◽  
...  

The purpose of this research is to choose the better of two data mining classification methods, C4.5 and Naïve Bayes, for Educational Data Mining (EDM). The data used is student graduation data consisting of 79 records. Both methods are validated with 10-fold cross-validation, and a t-test on the difference between them produces a table ranking the methods. The two methods gave different results, and those results depend strongly on the dataset; across the datasets tested, the area under the curve (AUC) of Naïve Bayes is better than that of C4.5. Under 10-fold cross-validation and the t-test, Naïve Bayes outperforms C4.5, with an average accuracy of 73.41% and an AUC of 0.664.
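The validation procedure described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the random seed, and the per-fold accuracy values in the usage lines are assumptions, and only the record count (79) and the number of folds (10) come from the abstract.

```python
import math
import random

def k_fold_indices(n_records, k=10, seed=0):
    """Shuffle record indices and deal them into k nearly equal folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic over the per-fold scores of two classifiers.

    Assumes at least two folds and non-identical differences
    (a zero variance would cause a division by zero).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# 79 records split into 10 folds; each fold serves once as the test set.
folds = k_fold_indices(79, k=10)
for test_fold in folds:
    held_out = set(test_fold)
    train = [i for i in range(79) if i not in held_out]
    # ...train both classifiers on `train`, score them on `test_fold`...

# Hypothetical per-fold accuracies, for illustration only.
t = paired_t_statistic([0.8, 0.7, 0.9], [0.7, 0.65, 0.75])
```

The resulting statistic is compared against a t distribution with k - 1 degrees of freedom to decide whether the difference between the two methods is significant.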

Author(s):  
Aida Sopia ◽  
Muhammad Syahrizal

Office stationery (ATK) is now a necessity for almost everyone, especially for companies and educational institutions. The need to buy it often arises unexpectedly: staff discover that stationery is out of stock only when they need it, and work at the company or institution is then not completed on time. One way to be more efficient is to apply data mining to predict stationery purchases, in this case at AL IKHWAN Middle School in Tanjung Morawa. The Naive Bayes method is used to analyze the data, recognize patterns, and predict purchases of office stationery at the school. The data required is the previous month's stationery purchase data, used as test data and counted from the date of first purchase until the expiry date of the stationery. The study aims to predict whether a given item of stationery needs to be bought again or can still be used for a long time. Based on the recorded out-of-stock data, a new purchase is considered feasible when more than four types of stationery are in short supply.


2021 ◽  
Vol 5 (1) ◽  
pp. 32
Author(s):  
Hartatik Hartatik

Predicting student graduation status is a problem in its own right for higher education. For universities, especially in the era of Big Data, it is very important to predict the academic behavior of active students, so that the likelihood of students finishing their studies on time is known and preventive steps can be identified when planning programs. One approach is data mining with the Naive Bayes algorithm, one of the methods used to predict student graduation. The researcher applied Naive Bayes using the cumulative grade point average (GPA) as the only parameter, and compared it with a Naive Bayes prediction based on GPA plus social parameters, namely gender and residence status. The study thus starts from an academic parameter and optimizes the model with social parameters attached to each student. In the evaluation, Naive Bayes with GPA alone reached an accuracy of 75%, while the prediction model with social parameters reached 85%, a difference of 10 percentage points.
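A small categorical Naive Bayes classifier shows how GPA and social parameters can be combined in one prediction. This is a sketch under stated assumptions, not the study's implementation: the discretized GPA bands, the Laplace smoothing, the function names, and the six toy records are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            feature_values[j].add(value)
    return class_counts, value_counts, feature_values

def predict(model, row):
    """Return the label with the highest log posterior (Laplace smoothing)."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for j, value in enumerate(row):
            numerator = value_counts[(j, label)][value] + 1
            denominator = count + len(feature_values[j])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy records (hypothetical): (GPA band, gender, residence status)
rows = [
    ("high", "F", "with_family"), ("high", "M", "boarding"),
    ("low", "M", "boarding"), ("low", "F", "boarding"),
    ("high", "F", "boarding"), ("low", "M", "with_family"),
]
labels = ["on_time", "on_time", "late", "late", "on_time", "late"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("high", "M", "with_family")))  # prints "on_time"
```

Dropping the second and third columns from each row gives the GPA-only variant, which is the kind of comparison the abstract reports.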


Author(s):  
Putu Gede Surya Cipta Nugraha ◽  
Gede Rasben Dantes ◽  
Kadek Yota Ernanda Aryanto

At PT. BPR XYZ, problem credit is a critical issue: when many debtors fall behind on payments, the bank's non-performing loan (NPL) ratio rises, and an NPL ratio above 5% indicates that the bank is unhealthy. This study therefore implements data mining methods and measures how accurately they predict creditworthiness at PT. BPR XYZ, so that future credit problems can be reduced. The methods used are C4.5 and Naïve Bayes; both are implemented and their accuracy is compared to see which predicts creditworthiness better. Both are also combined with AdaBoost, with the aim of increasing prediction accuracy. In the comparison, C4.5 is the more accurate method, with an accuracy of 90.00% and a precision of 86.67%, while Naïve Bayes reaches an accuracy of 70.00% and a precision of 79.71%. Adding AdaBoost raises accuracy to 91.54% for C4.5 and to 78.13% for Naïve Bayes. Applying AdaBoost to C4.5 and Naïve Bayes thus improves the accuracy of creditworthiness prediction at PT. BPR XYZ, and the AdaBoost-based C4.5 method can be recommended to PT. BPR XYZ for future creditworthiness prediction.
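The abstract does not say which AdaBoost variant was used. As a hedged illustration, the sketch below shows one reweighting round of the classical discrete AdaBoost for a binary weak learner; the function name and the four-sample example are assumptions.

```python
import math

def adaboost_round(weights, correct):
    """One reweighting round of classical discrete AdaBoost.

    weights: current sample weights, summing to 1
    correct: booleans, True where the weak learner classified correctly
    Assumes the weighted error lies strictly between 0 and 1.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1 - error) / error)  # vote weight of this learner
    # Shrink weights of correct samples, grow weights of misclassified ones.
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Four equally weighted samples, one misclassified (illustrative values).
alpha, new_weights = adaboost_round([0.25] * 4, [True, True, True, False])
```

After the update the misclassified sample carries half of the total weight, so the next weak learner (for example another C4.5 tree) is pushed to get it right.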


2020 ◽  
Vol 6 (1) ◽  
pp. 75
Author(s):  
Mufti Ari Bianto ◽  
Kusrini Kusrini ◽  
Sudarmawan Sudarmawan

Heart attack is one of the deadliest diseases recorded in the world, with new cases of heart disease reported at 43.32% and deaths at 12.91%. In 2013 the number of people with heart disease in Indonesia was 61,682. The number of sufferers generally continues to rise because of a lack of knowledge or information about heart disease, so a system is needed that provides information and early classification of the disease for anyone who wants to learn about the condition or the early symptoms of a heart attack. Naïve Bayes is a method that classifies based on probabilities derived from previous data; besides being simple, it also classifies well. The testing mechanism divides 303 records into 5 subsets, which are validated with 5-fold cross-validation. The final result of this study is a classification system using the Naïve Bayes method that produces an average accuracy of 90.61%, a precision of 87.44%, and a recall of 87.95%. Keywords: classification, heart disease, naïve Bayes classifier
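The three reported measures can be computed from the confusion counts of each fold and then averaged over the 5 folds. The sketch below is a minimal illustration; the function name and the eight example labels are assumptions and are not data from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # guard: no positive predictions
    recall = tp / (tp + fn) if tp + fn else 0.0     # guard: no positive cases
    return accuracy, precision, recall

# Hypothetical labels for one fold (1 = heart disease, 0 = no heart disease).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))  # prints (0.75, 0.75, 0.75)
```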


2021 ◽  
Vol 14 (1) ◽  
pp. 60
Author(s):  
Ngurah Agus Sanjaya ER ◽  
I Gusti Agung Gede Arya Kadyanan

Udatari is the first traditional dance platform in Indonesia; it provides information about traditional events, dance tutorials, dance groups, and dance attributes. Tight competition in the startup world requires Udatari, as a new startup, to manage its application users optimally, and knowing which users are loyal helps a startup choose the right marketing strategy. In this study, clustering is done with the K-Means method, which groups the data so that items within one group share the same characteristics. The model used for clustering is RFM: recency, frequency, and monetary. The purpose of the clustering is to segment users by Customer Lifetime Value (CLV). Classification is then done with the Naïve Bayes method, which predicts future outcomes from past experience; its purpose is to assign new users to the segments obtained from the clustering. The optimum value of k for K-Means, tested with the Silhouette Index, is 3 clusters, with the largest CLV in the second cluster. For Naïve Bayes, the average accuracy is 97.44%, with per-class accuracy of 92.31% for cluster 0 (the first cluster), 100% for the second cluster, and 100% for the third cluster. Keywords: K-Means, Naïve Bayes, Loyalty, Segmentation, RFM
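The RFM features that feed the clustering can be derived from a transaction log as sketched below. The record layout (user id, date, amount), the function name, and the sample transactions are assumptions for illustration; the abstract does not describe the platform's actual data schema.

```python
from datetime import date

def rfm_features(transactions, today):
    """Compute (recency in days, frequency, monetary total) per user.

    transactions: iterable of (user_id, transaction_date, amount)
    """
    summary = {}
    for user, day, amount in transactions:
        last, frequency, monetary = summary.get(user, (day, 0, 0.0))
        summary[user] = (max(last, day), frequency + 1, monetary + amount)
    return {
        user: ((today - last).days, frequency, monetary)
        for user, (last, frequency, monetary) in summary.items()
    }

# Hypothetical transactions.
transactions = [
    ("u1", date(2021, 1, 5), 50.0),
    ("u1", date(2021, 1, 20), 30.0),
    ("u2", date(2021, 1, 10), 100.0),
]
print(rfm_features(transactions, today=date(2021, 1, 31)))
# prints {'u1': (11, 2, 80.0), 'u2': (21, 1, 100.0)}
```

Because the three features are on different scales, they would normally be rescaled before being passed to K-Means.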


2017 ◽  
Vol 8 (3) ◽  
pp. 146
Author(s):  
BUDI RAMADHANI

A problem that often arises at leasing companies is the large number of customers who have difficulty paying their installments. A system is therefore needed that can classify consumers into the current group, the substandard group, and the non-current group with respect to paying motorcycle installments, so that the leasing company can address the problem early. A leasing company necessarily holds a very large amount of data, and many do not realize that processing this data can yield information such as a classification of the consumers who will join the company. Applying data mining techniques is expected to provide useful information for classifying consumers as current, substandard, or non-current in their payments. The research steps include collecting data and testing the Naive Bayes algorithm. The attributes in the dataset are Customer, Employment, Number of Children, Status Houses, region, and installment. The study aims to evaluate Naive Bayes classification based on Particle Swarm Optimization (PSO) for smooth credit in motorcycle leasing. In experiments classifying payments as current or non-current, Naïve Bayes alone achieved a highest accuracy of 96.43%, while Naive Bayes combined with Particle Swarm Optimization achieved 96.88%. Keywords: Current and Non Current, Naive Bayes Method Based PSO


2018 ◽  
Vol 5 (2) ◽  
pp. 60-67 ◽  
Author(s):  
Dwi Yulianto ◽  
Retno Nugroho Whidhiasih ◽  
Maimunah Maimunah

Bananas are a commodity that contributes greatly to both national and international fruit production figures. The government, through the National Standardization Agency, sets standards to maintain the quality of bananas. The purpose of this study is to classify the maturity stages of Ambon bananas based on the color index using the Naïve Bayes method, in accordance with SNI 7422:2009. Naive Bayes is used in the classification process by comparing the probability values generated from the predictor variables of each model to determine the maturity stage of an Ambon banana. The data used is primary data consisting of 105 images of Ambon bananas. Of three models built from different sets of predictor variables, two achieved the same highest average accuracy of 90.48%: the second model, with 9 variables (r, g, b, v, *a, *b, entropy, energy, and homogeneity), and the third model, with 7 variables (r, g, b, v, *a, entropy, and homogeneity). Keywords: banana maturity, classification, image processing

