Data Mining Implementation Using Naïve Bayes Algorithm and Decision Tree J48 In Determining Concentration Selection

The computerization of society has substantially improved the ability to generate and collect data from a variety of sources, and large amounts of data now flood almost every aspect of people's lives. AMIK HASS Bandung has an Informatics Management Study Program with three areas of concentration that students select in the fourth semester: Computerized Accounting, Computer Administration, and Multimedia. Concentration selection should be determined precisely from past data, so the academic section needs a pattern or rule for predicting it. In this work, the data mining techniques used were Naive Bayes and Decision Tree J48, run in WEKA. The data set comprised 111 records, evaluated in percentage-split mode: 75% served as training data to build the models and 25% as test data for evaluating both models. Naive Bayes achieved the highest accuracy, 71.4%, with 20 of the 28 test instances correctly classified. Decision Tree J48 had a lower accuracy of 64.3%, with 18 of the 28 test instances correctly classified. Decision Tree J48 produced 4 patterns or rules for determining concentration selection, which the academic section can use to assist students in choosing a concentration.
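
The reported figures can be checked with a few lines of arithmetic. The sketch below is only an illustration in Python, not the WEKA procedure itself; it assumes the test set size is 25% of 111 rounded to the nearest integer, and the helper name `accuracy_percent` is hypothetical.

```python
# Figures taken from the abstract above.
total_records = 111
test_fraction = 0.25

# 25% of 111 is 27.75; rounding gives the 28 test instances reported.
# (Assumption: nearest-integer rounding; WEKA's own rule may differ.)
test_instances = round(total_records * test_fraction)
train_instances = total_records - test_instances

def accuracy_percent(correct: int, total: int) -> float:
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100 * correct / total, 1)

naive_bayes_acc = accuracy_percent(20, test_instances)
j48_acc = accuracy_percent(18, test_instances)

print(test_instances, train_instances)  # 28 83
print(naive_bayes_acc, j48_acc)         # 71.4 64.3
```

Both accuracies are computed over the 28 held-out instances, so the gap between the two models is two correctly classified records.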


IMPLEMENTATION OF DATA MINING TO PREDICT ONLINE GO-JEK DRIVER ORDERS USING THE NAIVE BAYES METHOD (CASE STUDY: PT. GO-JEK INDONESIA)

KOMIK (Konferensi Nasional Teknologi Informasi dan Komputer) ◽

10.30865/komik.v2i1.972 ◽

2018 ◽

Vol 2 (1) ◽

Author(s):

Delisman Laia ◽

Efori Buulolo ◽

Matias Julyus Fika Sirait

Keyword(s):

Data Mining ◽

Naive Bayes ◽

Naïve Bayes ◽

Training Data ◽

Transportation Industry ◽

Data Set ◽

Data Mining Algorithms ◽

Taxi Service ◽

Bayes Algorithm ◽

Using Data

PT. Go-Jek Indonesia is a service company. Go-Jek online is a technology-based motorcycle taxi service that leads the transportation industry revolution. This study uses data mining algorithms to predict the level of orders for online Go-Jek drivers, helping PT. Go-Jek Indonesia identify busy and quiet periods. The proposed method is Naive Bayes, an algorithm that classifies data into given classes. The purpose of the study is to examine the prediction patterns of each attribute in the data set using the Naive Bayes algorithm and to test the model built from the training data against the testing data to see whether the resulting pattern is good. The prediction is based on previous driver order data, recorded by day and time over one month. The Naive Bayes algorithm is used to predict the online Go-Jek driver orders expected each day, broken down by time of day: morning, afternoon, and evening. The results are intended to make it easier for the company to analyze driver booking data when setting policies for both drivers and customers. Keywords: Go-jek Driver, Data Mining, Naive Bayes
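
To make the classification step concrete, here is a minimal sketch of a categorical Naive Bayes classifier over day and time attributes. It is not the authors' implementation: the records are invented toy data, the attribute values and class labels ("busy", "quiet") are assumptions, and add-one (Laplace) smoothing is a choice made here so that unseen attribute values do not zero out a class.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (day type, time of day) -> order level.
# These rows are invented for illustration; they are not the study's data.
records = [
    (("weekday", "morning"), "busy"),
    (("weekday", "afternoon"), "quiet"),
    (("weekday", "evening"), "busy"),
    (("weekend", "morning"), "quiet"),
    (("weekend", "afternoon"), "quiet"),
    (("weekend", "evening"), "busy"),
]

def train(rows):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def predict(model, features):
    """Return the class with the highest Laplace-smoothed posterior score."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(features):
            # P(value | class) with add-one smoothing
            score *= (value_counts[(i, label)][value] + 1) / (count + len(values_seen[i]))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(records)
print(predict(model, ("weekday", "evening")))    # busy
print(predict(model, ("weekend", "afternoon")))  # quiet
```

The "naive" part is the product inside `predict`: each attribute contributes its own conditional probability, as if day and time were independent given the class.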


The Comparison of Data Mining Methods Using C4.5 Algorithm and Naive Bayes in Predicting Heart Disease

Tech-E ◽

10.31253/te.v4i2.543 ◽

2021 ◽

Vol 4 (2) ◽

pp. 44

Author(s):

Rino Rino

Keyword(s):

Data Mining ◽

Heart Disease ◽

Naive Bayes ◽

Naïve Bayes ◽

Data Set ◽

A Value ◽

C4.5 Algorithm ◽

Calculation Results ◽

Mining Methods ◽

Bayes Algorithm

Heart disease is a condition in which fatty deposits in the coronary arteries change the function and shape of the arteries so that blood flow to the heart is obstructed. Data mining methods can predict this disease; the C4.5 algorithm and Naive Bayes are among the methods often used in research. The data set in this research was obtained from the UCI Machine Learning Repository and has 3546 records and 13 attributes. The Naïve Bayes algorithm has the higher accuracy, 81.40%, compared with 79.07% for the C4.5 algorithm. Based on the calculation results, Naïve Bayes can be considered a very good classifier because its value falls between 0.709 and 1.00. Because Naïve Bayes has a higher accuracy than C4.5, the researchers decided to use the Naïve Bayes algorithm for predicting heart disease.


COMPARATIVE ANALYSIS OF THE NAIVE BAYES AND DECISION TREE ALGORITHMS FOR CLASSIFYING BLOOD TRANSFUSION DATA

Jurnal Ilmiah Teknologi Infomasi Terapan ◽

10.33197/jitter.vol5.iss1.2018.251 ◽

2019 ◽

Vol 5 (1) ◽

pp. 38-44

Author(s):

Rini Indrayani

Keyword(s):

Data Mining ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Data Set

Blood donation is the process of drawing blood from a donor who has been declared eligible on the basis of a range of factors. Illnesses, age, body weight, blood pressure, hemoglobin level, and the interval between donations are the aspects considered during eligibility screening. Because this screening is so important, various studies of donor eligibility have applied data mining classification with a variety of methods. The challenge for these studies is finding the most appropriate method, one with high accuracy and precision. This study uses a data set of 748 blood donation records processed using the classification method Na


Prediction of the Infant Birth Rate in Tridaya Sakti Village Using the Naive Bayes Algorithm

Journal of Students' Research in Computer Science ◽

10.31599/jsrcs.v1i2.423 ◽

2020 ◽

Vol 1 (2) ◽

pp. 77-88

Author(s):

Nur Isnaini Parihah ◽

Sari Hartini ◽

Juarni Siregar

Keyword(s):

Data Mining ◽

Population Growth ◽

Naive Bayes ◽

Large Population ◽

Naïve Bayes ◽

Training Data ◽

Birth Rates ◽

Testing Data ◽

Bayes Algorithm ◽

Infant Birth

The birth rate affects population growth, and a large population is a burden for development. According to Malthus's theory, large population growth brings not welfare but poverty if the population is not well controlled. The number of births in Tridaya Sakti Village increases every year. Data mining with the Naive Bayes algorithm can therefore help predict infant birth rates in Tridaya Sakti Village. The aim is to determine the infant birth rate for the coming year by examining the prediction patterns of each variable and testing the model built from the training data against the testing data. It is hoped that the Naive Bayes algorithm can help Tridaya Sakti Village calculate infant birth rates and regulate population growth in the coming years. The results, calculated from the collected data using the Naive Bayes algorithm, provide information that can serve as a reference for the number of births. Data processing becomes more effective and efficient in performance and time, and predictions of the number of births become more accurate. Keywords: Naive Bayes, Birth of a Baby, Prediction


APPLICATION OF PREDICTION TIME OF GRADUATION USING THE NAÏVE BAYES ALGORITHM WITH THE PYTHON PROGRAM

Journal of Industrial Engineering Management ◽

10.33536/jiem.v0i0.773 ◽

2021 ◽

pp. 38-43

Author(s):

Priskila Christine Rahayu ◽

Eric Jobiliong ◽

Antonny Antonny

Keyword(s):

Data Mining ◽

Naive Bayes ◽

Quality Standard ◽

Study Data ◽

Naïve Bayes ◽

Application Development ◽

Study Program ◽

Prediction Time ◽

Student Graduation ◽

Bayes Algorithm

Accreditation is a process for ensuring the quality of a university and its study programs. Several factors determine the quality standard of accreditation, one of which is time to graduation. However, there is currently no tool for predicting student graduation time early. This study therefore aims to create one. The data mining method used is the Naïve Bayes algorithm, with data processing and application development carried out in Python. The data mining process uses three years of historical data, and the trial uses data on active second- and third-year students. There are 5 types of patterns, with accuracy values of 81%, 87%, 92%, 92%, and 95%.


Tweet Netizen Prediction Using Random Forest, Decision Tree, Naïve Bayes, And Ensemble Algorithm (Case Study The Governor Of DKI Jakarta)

SinkrOn ◽

10.33395/sinkron.v5i1.10565 ◽

2020 ◽

Vol 5 (1) ◽

pp. 9-20

Author(s):

Antonius Yadi Kuntoro

Keyword(s):

Social Media ◽

Random Forest ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Training Data ◽

Testing Data ◽

Negative Comments ◽

Ensemble Algorithm ◽

Bayes Algorithm

Abstract: The current Governor of DKI Jakarta, although elected in 2017, remains an interesting subject of discussion and comment. Comments come directly from the media or through social media. Twitter has become one of the social media platforms most often used to comment on the elected governor, who can even become a trending topic there. The netizens who comment vary: some always tweet criticism, some comment positively, and some only retweet. This research predicts whether active netizens tend toward positive or negative comments. The algorithms used are Decision Tree, Naïve Bayes, Random Forest, and Ensemble. The Twitter data must be preprocessed before being processed in Rapidminer. Four trials were conducted in Rapidminer, each dividing the data into testing data and training data: 10% testing to 90% training, 20% testing to 80% training, 30% testing to 70% training, and 35% testing to 65% training. The average accuracy is 93.15% for the Decision Tree algorithm, 91.55% for Naïve Bayes, 93.41% for Random Forest, and 93.42% for the Ensemble algorithm. Keywords: Decision Tree, Naïve Bayes, Random Forest, Set, Twitter.
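
The four testing-to-training ratios describe a repeated holdout split. The sketch below shows one way such splits could be produced in plain Python; it is not Rapidminer's implementation, the function name `holdout_split` and the placeholder `tweets` list are hypothetical, and the fixed seed is an assumption made here for repeatability.

```python
import random

def holdout_split(items, test_fraction, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical placeholder data standing in for preprocessed tweets.
tweets = [f"tweet_{i}" for i in range(200)]

# The four test fractions named in the abstract.
for test_fraction in (0.10, 0.20, 0.30, 0.35):
    train, test = holdout_split(tweets, test_fraction)
    print(f"{test_fraction:.0%} test -> {len(train)} train / {len(test)} test")
```

Averaging a model's accuracy over several such splits, as the abstract does, makes the comparison less dependent on any single partition of the data.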


Data Mining Application in Predicting Bank Loan Defaulters

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.d2037.029420 ◽

2020 ◽

Vol 9 (4) ◽

pp. 2733-2744

Keyword(s):

Data Mining ◽

Decision Tree ◽

Model Building ◽

Naive Bayes ◽

Large Data ◽

Naïve Bayes ◽

Bank Loan ◽

Classification Model ◽

Data Set ◽

Data Mining Application

Data mining is a key tool for discovering knowledge from large data sets. Nowadays, most organizations use this technology to manage their data. This paper focuses on risk management in the banking sector, specifically on detecting bank loan defaulters: data mining is applied to examine the patterns of different attributes that contribute to detecting and predicting defaulters, thus preventing bad loans. The process can be carried out without changing the current systems or data. It helps distinguish borrowers who repay loans promptly from those who do not and so avoids wrong loan allotment. A classification model is implemented to find interesting patterns among customer attributes. A total of 20461 sample records from 3 consecutive years were drawn at random from the bank database by the database administrator to build and test the model. The experiments used decision tree and Naïve Bayes classification models in the Weka 3.7 tool. The modeling methodology was CRISP-DM (Cross-Industry Standard Process for Data Mining), which involves business understanding, data understanding, data preparation, model building, evaluation, and deployment. Eight experiments were performed with the J48 decision tree implementation, and two experiments with different parameters were run for Naïve Bayes. Finally, the models were evaluated and analyzed, and the best solution for predicting defaulters was identified.


IMPLEMENTATION OF NAIVE BAYES CLASSIFICATION IN PREDICTING STUDENT LENGTH OF STUDY (CASE STUDY: UNIVERSITAS DHYANA PURA)

SINTECH (Science and Information Technology) Journal ◽

10.31598/sintechjournal.v4i2.964 ◽

2021 ◽

Vol 4 (2) ◽

pp. 202-209

Author(s):

Kelvin Hennry Loudry Malelak ◽

I Made Dwi Ardiada ◽

Gerson Feoh

Keyword(s):

Data Mining ◽

Undergraduate Students ◽

Naive Bayes ◽

Naïve Bayes ◽

Training Data ◽

Accuracy Rate ◽

Testing Data ◽

Study Programs ◽

Bayes Algorithm ◽

Academic Year

Under normal conditions, undergraduate students at a university can complete their studies in 4 years, or 8 semesters. In practice, many students take more than 4 years. In the 2015/2016 academic year, 744 people were accepted as students. Of these, 405 completed their studies in about 4 years, 39 completed them in 5 years, and 300 did not continue their studies. In response to this problem, this study implements a classification that can help Dhyana Pura University predict the length of study for students currently enrolled in its various study programs. The classification method used to predict students' study period is the Naive Bayes algorithm, applied to graduation data with the Java-based Rapid Miner tool. With the data divided into 968 training records and 193 testing records, Naive Bayes achieved an accuracy rate of 100%, with parameters reported as very good.
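
The cohort and data set figures can be tallied directly. This is a small arithmetic sketch using only the numbers in the abstract; the variable names are hypothetical, and the abstract does not say how the 1161 classified records relate to the 744-student cohort.

```python
# Cohort figures from the abstract above (2015/2016 intake).
accepted = 744
finished_about_4_years = 405
finished_5_years = 39
did_not_continue = 300

# The three outcome groups account for the whole cohort.
assert finished_about_4_years + finished_5_years + did_not_continue == accepted

share_on_time = round(100 * finished_about_4_years / accepted, 1)
print(share_on_time)  # 54.4

# Size of the data set used for classification and its test share.
training_records, testing_records = 968, 193
total_records = training_records + testing_records
test_share = round(100 * testing_records / total_records, 1)
print(total_records, test_share)  # 1161 16.6
```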


Application of Data Mining with Classification Methods for Promotion of New Student Admissions at Muhammadiyah University of Sidoarjo Using Web-Based Naïve Bayes Algorithm

Procedia of Engineering and Life Science ◽

10.21070/pels.v1i2.1062 ◽

2021 ◽

Vol 1 (2) ◽

Author(s):

Vianti Widyasari ◽

Arief Senja Fitrani

Keyword(s):

Data Mining ◽

Naive Bayes ◽

Naïve Bayes ◽

Training Data ◽

Classification Method ◽

New School ◽

School Year ◽

New Students ◽

Bayes Algorithm ◽

New Student

The University of Muhammadiyah Sidoarjo (UMSIDA) is one of Indonesia's leading and innovative private universities, developing IPTEKS (science, technology, and the arts) based on Islamic values for community welfare. UMSIDA is long established and admits a large number of students each year. At the start of each new academic year it holds new student admissions (PMB), which can be completed at pmb.umsida.ac.id. This research aims to create a data mining application that uses a classification method with the Naïve Bayes algorithm; the classification method is used to measure the accuracy level. The promotion of new student admissions at UMSIDA can be predicted using the Naïve Bayes algorithm with 7 predefined variables. The data set of 2601 records, with offline and online predictors, is divided into two parts: 70% training data (2000 records) and 30% testing data (601 records).
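
A quick check of the split, using only the counts in the abstract, suggests the stated percentages are approximate. The sketch below is illustrative arithmetic; the variable names are hypothetical and the nearest-integer rounding for the nominal split is an assumption.

```python
total = 2601
train_count, test_count = 2000, 601  # counts stated in the abstract

# The two parts account for the full data set.
assert train_count + test_count == total

train_share = round(100 * train_count / total, 1)
test_share = round(100 * test_count / total, 1)
print(train_share, test_share)  # 76.9 23.1

# For comparison, an exact 70/30 split of 2601 records:
nominal_train = round(0.70 * total)
print(nominal_train, total - nominal_train)  # 1821 780
```

The stated counts correspond to roughly a 77/23 split, so either the percentages or the record counts in the abstract appear to be rounded or misreported.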


Prediction of Student Graduation and Dropout Using a Multilevel Approach in Higher Education

SIMADA (Jurnal Sistem Informasi & Manajemen Basis Data) ◽

10.30873/simada.v3i2.2359 ◽

2021 ◽

Vol 3 (2) ◽

pp. 140-148

Author(s):

Hermanto Hermanto

Keyword(s):

Data Mining ◽

Decision Tree ◽

Nearest Neighbor ◽

Naive Bayes ◽

Naïve Bayes ◽

Drop Out ◽

Training Data ◽

K Nearest Neighbor ◽

Quality Of Higher Education

Currently, college failure, on-time graduation, and the factors that cause them remain an interesting research topic (C. Marquez-Vera, C. Romero and S. Ventura, 2011). This study compares three data mining classification algorithms, Naive Bayes, Decision Tree, and K-Nearest Neighbor, for predicting graduation and dropout risk, with the aims of improving the quality of higher education and identifying the most accurate algorithm for graduation and dropout prediction. The best algorithm is the decision tree, with a best accuracy of 99.15% at a training data ratio of 30%. Keywords: Data Mining; Naive Bayes Algorithm; Decision Tree; K-Nearest Neighbor; Predict Graduation; Drop Out.
