Educational Data Mining in Graduation Rate and Grade Predictions Utilizing Hybrid Decision Tree and Naïve Bayes Classifier

Author(s):  
La Ode Mohamad Zulfiqar ◽  
Nurul Renaningtias ◽  
M. Yoka Fathoni
2020 ◽  
Vol 4 (1) ◽  
pp. 95-101 ◽  
Author(s):  
Edi Sutoyo ◽  
Ahmad Almaarif

Student quality is reflected in academic achievement, which is evidence of the effort students put in. Academic achievement is evaluated at the end of each semester to determine the learning outcomes that have been reached. A student who fails to meet the academic criteria required to continue their studies is at risk of not graduating on time or even of dropping out (DO). The number of students in higher education who graduate late or drop out can be reduced by detecting at-risk students early in their education and by supporting them with policies that steer them toward completing their studies. In addition, if the time a student needs to complete their studies can be predicted, students can be handled more effectively. Data mining is one technique that can be used to make such predictions. In this study, the Naive Bayes Classifier (NBC) algorithm is used to predict student graduation at Telkom University. The dataset was obtained from the Information Systems Directorate (SISFO) of Telkom University and contains 4000 instances. The results show that NBC was successfully applied to predict student graduation, with an accuracy of 73.725%, precision of 0.742, recall of 0.736, and F-measure of 0.735.
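The four evaluation figures quoted above can all be derived from a confusion matrix. The following is a minimal sketch, not the authors' procedure: the abstract does not say how precision, recall, and F-measure were averaged across classes, so support-weighted averaging is an assumption here, and the counts in `confusion` are hypothetical.

```python
def weighted_metrics(confusion):
    """Return (accuracy, precision, recall, f_measure).

    confusion[i][j] is the number of instances whose true class is i
    and whose predicted class is j. Per-class scores are averaged with
    weights proportional to class support (an assumption, see lead-in).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[c][c] for c in range(n_classes)) / total

    precision = recall = f_measure = 0.0
    for c in range(n_classes):
        tp = confusion[c][c]
        support = sum(confusion[c])  # instances truly in class c
        predicted = sum(confusion[r][c] for r in range(n_classes))
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f_measure += weight * f
    return accuracy, precision, recall, f_measure


# Hypothetical counts for two classes (e.g. on time / not on time).
confusion = [[80, 20],
             [30, 70]]
acc, prec, rec, f1 = weighted_metrics(confusion)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} F={f1:.3f}")
```

With support-weighted averaging, recall always equals accuracy, which is one way to check which averaging scheme a tool applied.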


Author(s):  
M. Khairul Anam ◽  
Bunga Nanti Pikir ◽  
Muhammad Bambang Firdaus

The Pekanbaru city government has adopted technology in its administration, but the implementation still draws public complaints: the command center public service is known to only part of the public, and the CCTV installed on traffic signal equipment (Alat Pemberi Isyarat Lalu Lintas, APILL) does not yet work properly. Other uses of technology by the Pekanbaru government can be seen in its official web portals. Twitter, meanwhile, is a source of varied netizen comments, providing data that the public expresses through tweets posted to the timeline. Sentiment analysis was carried out to gauge netizens' opinions of the Pekanbaru government, classified as positive, negative, or neutral. The dataset consists of 150 tweets, which were analyzed to turn them into information. The analysis used three data mining methods: the Naïve Bayes Classifier, K-Nearest Neighbor (KNN), and a decision tree. The three approaches were used to categorize netizen comments on the use of technology after sentiment analysis and to compare their accuracy. The resulting accuracies varied: 100% for Naïve Bayes, 98.25% for KNN, and 62.28% for the decision tree.


2021 ◽  
Vol 10 (2) ◽  
pp. 128-131
Author(s):  
Eka Sabna

The abundant student data held by universities can be put to full use as needed and processed into useful information, so that relationships among data attributes can be analyzed; the expected output is student performance as it relates to learning outcomes (grade point average, IPK). The study follows the CRISP-DM methodology, which consists of six phases. Two classification methods are used, the Naïve Bayes Classifier and the C4.5 decision tree algorithm, to compare which one better predicts student performance. Data mining was implemented in RapidMiner: the two classification models, C4.5 and NBC, were built and then evaluated on a dataset containing training data and test data. The best accuracy of the Naive Bayes Classifier (NBC) model was 80%, while the best accuracy of the C4.5 model was 60%. Keywords: C4.5, NBC, IPK, performance, students


2019 ◽  
Vol 1 (1) ◽  
pp. 14-28
Author(s):  
Ahmad Haidar Mirza

Data mining is a process that uses statistical techniques, mathematics, artificial intelligence, and machine learning to extract and identify useful information and related knowledge from large databases. It finds new patterns by filtering large amounts of data, using pattern recognition technology similar to statistical and mathematical techniques. The patterns found can provide useful information for generating economic benefits, effectiveness, and efficiency. The Naive Bayes Classifier algorithm is one data mining method that can support effective and efficient promotion strategies. Here it is used to predict study interest based on the calculations performed. The data are new student registration records from 2014 to 2016 at Bina Darma University. The result of this study is a new model that is expected to provide important information to assist the Marketing Team of Bina Darma University Palembang in making policy and implementing an appropriate marketing strategy. The results are expected to support promotion strategies that improve the effectiveness and efficiency of promotion and increase the number of new students who register.


2017 ◽  
Vol 9 (2) ◽  
pp. 37
Author(s):  
Jaka Aulia Pratama ◽  
Zulhanif Zulhanif ◽  
Yadi Suprijadi

PT. JKL is a main dealer for brand T and handles three types of motorcycle products in West Java: Sport, CUB, and Scooter (automatic transmission). The company records buyers of brand T motorcycles in the Customer Database (CDB). The CDB collected from 2011 to 2013 yields information on consumer characteristics, which is necessary for market planning. Consumers are classified into two groups: Repeated Order and New Customer. The data mining classification method used in the study is the Naïve Bayes Classifier. Classification is done by calculating the conditional probability of each class and choosing the class with the greatest probability. The classification accuracy is 83% and the classification error is 17%.
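The decision rule described above, computing a conditional probability per class and choosing the largest, can be sketched for categorical attributes as follows. This is an illustrative sketch only: the attribute values and training rows are hypothetical, and the Laplace (add-one) smoothing is an assumption that the abstract does not mention.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)              # attribute index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def predict(model, row):
    """Return the class with the greatest score and all class scores."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(row):
            # P(value | class) with add-one smoothing (assumption)
            score *= (feature_counts[(i, label)][value] + 1) / (count + len(values[i]))
        scores[label] = score
    return max(scores, key=scores.get), scores


# Hypothetical records: (motorcycle type, age group)
rows = [("Scooter", "young"), ("Scooter", "young"), ("Sport", "adult"),
        ("CUB", "adult"), ("Sport", "adult"), ("Scooter", "adult")]
labels = ["New Customer", "New Customer", "Repeated Order",
          "Repeated Order", "Repeated Order", "New Customer"]

model = train(rows, labels)
label, scores = predict(model, ("Sport", "adult"))
print(label, scores)
```

The scores are unnormalized, since dividing by the evidence term does not change which class has the largest value.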


2020 ◽  
Vol 17 (1) ◽  
pp. 37-42
Author(s):  
Yuris Alkhalifi ◽  
Ainun Zumarniansyah ◽  
Rian Ardianto ◽  
Nila Hardi ◽  
Annisa Elfina Augustia

Non-Cash Food Assistance, or Bantuan Pangan Non-Tunai (BPNT), is government food assistance given to Beneficiary Families (KPM) every month through an electronic account that can be used only to buy food at the Electronic Shop Mutual Assistance Joint Business Group Hope Family Program (e-Warong KUBE PKH) or from food traders working with Bank Himbara. The distribution of BPNT still poses problems for village officials, in particular those of Desa Wanasari, who must decide which households are eligible to receive it (poor) and which are not (not poor). Data mining is one way to support these decisions. This study compares two algorithms, the Naive Bayes Classifier and the C4.5 decision tree. The sample consists of 200 head-of-household records, split for validation into 90% training data and 10% test data. The proposed models were built in RapidMiner and evaluated with a confusion matrix to find which of the two methods is more accurate. The results show an accuracy of 98.89% for the Naive Bayes Classifier and 95.00% for the C4.5 decision tree. The study concludes that the Naive Bayes Classifier is the more accurate algorithm, by a margin of 3.89 percentage points.
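The 90%/10% partition of the 200 records can be sketched as below. The abstract does not say whether the data were shuffled or stratified, so the seeded shuffle is an assumption, and `households` is a hypothetical stand-in for the real records.

```python
import random


def holdout_split(records, train_fraction=0.9, seed=42):
    """Shuffle a copy of the records and split it into train and test parts."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


households = list(range(200))  # placeholder IDs for the 200 records
train_set, test_set = holdout_split(households)
print(len(train_set), len(test_set))

# Accuracy gap between the two reported results, in percentage points.
gap = round(98.89 - 95.00, 2)
print(gap)
```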


2017 ◽  
Vol 5 (8) ◽  
pp. 260-266
Author(s):  
Subhankar Manna ◽  
Malathi G.

The healthcare industry collects a huge amount of unclassified data every day. Effective diagnosis and decision making require discovering hidden patterns in that data. One such dataset concerns a group of metabolic diseases that vary greatly in their range of attributes. The objective of this paper is to classify a diabetes dataset using techniques such as Naive Bayes, ID3, and k-means. The secondary objective is to study the performance of the classification algorithms used in this work. We propose to implement the classification algorithms using R packages. This work uses a dataset imported from the UCI Machine Learning Repository, the Diabetes 130-US hospitals for years 1999-2008 Data Set. Motivation/Background: Naïve Bayes is a probabilistic classifier based on Bayes' theorem. It offers a useful perspective for understanding many algorithms, and it assumes that variables are independent of each other. In this paper, the Bayesian algorithm shows high accuracy when applied to the diabetes dataset. We also construct a decision tree from the diabetes dataset: an attribute is selected at each internal node of the tree-like graph, each branch represents an outcome of the test, and each leaf node holds a class label. The technique separates observations into branches to construct the tree, splitting it recursively in a process called recursive partitioning. Decision trees are widely used in many areas because they work well across dataset distributions. For example, the ID3 decision tree algorithm yields a result indicating whether or not a patient has diabetes. Method: We use Naïve Bayes for probabilistic classification and ID3 for the decision tree. Results: The dataset is a diabetes dataset with 623 rows and 18 columns, including Races, Gender, Take_metformin, Take_repaglinide, Insulin, Body_mass_index, and Self_reported_health.
The Naive Bayes Classifier algorithm is used to obtain the probability of having diabetes or not. Diabetes is the class attribute of the dataset, with two values, “Yes” and “No”, alongside personal information about the patient such as Races, Gender, Take_metformin, Take_repaglinide, Insulin, Body_mass_index, and Self_reported_health. The output gives the probability of each attribute value under “Yes” and under “No”. For example, for Gender, Female has 0.4964 for “No” and 0.5581 for “Yes”, and Male has 0.5035 for “No” and 0.4418 for “Yes”. Conclusions: Two algorithms were implemented in this paper, the Naive Bayes Classifier and ID3. The Naive Bayes Classifier was used to predict the probability of having diabetes, and ID3 was used to generate a decision tree.
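The attribute selection step behind ID3's recursive partitioning is usually based on information gain. The sketch below illustrates that criterion in Python rather than the R packages the paper uses; the column names come from the abstract, but the four rows and their values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    remainder = 0.0
    for value in set(row[attribute] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attribute] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute ID3 would test at the current node."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical records; the class label is Diabetes = "Yes" / "No".
rows = [
    {"Gender": "Female", "Insulin": "Yes"},
    {"Gender": "Male", "Insulin": "Yes"},
    {"Gender": "Female", "Insulin": "No"},
    {"Gender": "Male", "Insulin": "No"},
]
labels = ["Yes", "Yes", "No", "No"]

chosen = best_attribute(rows, labels, ["Gender", "Insulin"])
print(chosen)
```

A full ID3 implementation would call `best_attribute` recursively on each resulting subset until the labels in a subset are pure or no attributes remain.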


2016 ◽  
Vol 8 (1) ◽  
Author(s):  
Linda Jayanti ◽  
Steven R. Sentinuwo ◽  
Oktavian A. Lantang ◽  
Agustinus Jacobus

Abstract - Facebook lets its users interact with people they know as well as people they do not know, which can open opportunities for cybercrime such as kidnapping, human trafficking, and even murder. The IOM notes that victims of human trafficking in Indonesia number from 74,616 up to 1 million per year, and that many of these crimes are carried out with Facebook as the medium. The volume of text data (statuses) on Facebook pages is very large. Using data processing techniques from data mining, particularly text mining, the authors identify text data (Facebook statuses) that indicate trafficking activity by applying a classification technique based on the naïve Bayes classifier (NBC). Keywords: facebook, trafficking, data mining, text mining, classification, naïve bayes classifier.

