A Comparison Between Backpropagation, Holt-Winter, and Polynomial Regression Methods in Forecasting Dog Bites Cases in Bali

Author(s):  
Gede Eridya Bayu ◽  
I Ketut Gede Darma Putra ◽  
Ni Kadek Dwi Rusjayanthi

Rabies is a zoonotic disease that is usually transmitted to humans through animal bites. It can cause severe damage to the central nervous system and is generally fatal. Dog bites are considered the leading cause of rabies transmission in Bali. Preventive action by the government is expected to curb the rising number of dog bite cases so that rabies does not spread quickly and cause casualties. Data mining is an attempt to extract knowledge from a set of data. In this study, data mining is used to forecast the number of dog bite cases in Bali. Forecasting predicts what will happen in the future by fitting relevant past data to a mathematical model. The data mining methods used to forecast dog bite cases are backpropagation, Holt-Winters, and polynomial regression. The forecasts are intended to help the government anticipate dog bite cases in the coming period and prepare appropriate countermeasures. Forecasting uses monthly bite case data for Bali province over five years, from 2015 to 2019. The data is divided into four datasets, one per attribute: the number of dog bite cases, the number of vaccinations, the number of male deaths, and the number of female deaths. Each of the four datasets is split into 80% training data and 20% testing data. The results show that backpropagation is better at predicting the dog bite case data, with an average MAPE of 11.59%, compared with 16.05% for Holt-Winters and 19.91% for polynomial regression. Keywords: Dog Bites, Rabies, Forecasting, Backpropagation, Holt-Winter, Polynomial Regression
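The 80/20 split and the MAPE figures quoted in this abstract can be illustrated with a short sketch. It assumes a chronological split (the abstract does not say how the split was made) and the usual definition of MAPE as the mean of |actual - predicted| / |actual|, expressed in percent. The function names and sample values are hypothetical, and skipping zero-valued actuals is an assumption made here to avoid division by zero, not something the paper states.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into (train, test) without shuffling."""
    cut = int(round(len(series) * train_fraction))
    return series[:cut], series[cut:]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: months whose actual value is zero are skipped, because the
    percentage error is undefined there.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("no non-zero actual values to evaluate")
    return 100.0 * sum(terms) / len(terms)


# Five years of monthly observations give 60 points: 48 for training, 12 for testing.
months = list(range(60))
train, test = chronological_split(months)

# Hypothetical actual and forecast values, each off by 10%.
error = mape([100, 200], [90, 220])
```

With 60 monthly points, the split yields 48 training and 12 testing observations, and the hypothetical forecasts above are each off by 10%.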

2021 ◽  
Vol 6 (2) ◽  
pp. 14-19
Author(s):  
Dinita Rahmalia ◽  
Mohammad Syaiful Pradana ◽  
Teguh Herlambang

Smartphones are sold at a wide range of prices. The price of a smartphone is affected by attributes such as weight, internal storage, memory (RAM), rear camera, front camera, and brand. Two methods are used to classify smartphones into price classes: Learning Vector Quantization (LVQ) and Backpropagation (BP). The two methods differ in how they learn. In each iteration, LVQ classifies the price range of a smartphone using the Euclidean distance between its weight vectors and the data, whereas BP uses gradient descent on the error between target and output. In multi-output classification, one object may have multiple outputs. Based on simulation results, BP gives better accuracy and error rate than LVQ on both training data and testing data.
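The role of Euclidean distance in LVQ can be sketched as follows. This is a minimal illustration of the basic LVQ1 update rule, not the authors' implementation: the prototype layout, the learning rate, and the two-feature examples are hypothetical.

```python
import math


def nearest_prototype(x, prototypes):
    """Return the index of the prototype closest to x in Euclidean distance."""
    return min(range(len(prototypes)), key=lambda i: math.dist(x, prototypes[i][0]))


def lvq1_step(x, label, prototypes, lr=0.1):
    """Apply one LVQ1 update in place and return the winning prototype index.

    The winner moves toward x if its class matches the label, away otherwise.
    """
    i = nearest_prototype(x, prototypes)
    weights, proto_label = prototypes[i]
    sign = 1.0 if proto_label == label else -1.0
    updated = [w + sign * lr * (xj - w) for w, xj in zip(weights, x)]
    prototypes[i] = (updated, proto_label)
    return i


# Hypothetical prototypes: (weight vector, price class)
prototypes = [([0.0, 0.0], "low"), ([10.0, 10.0], "high")]

first = lvq1_step([1.0, 1.0], "low", prototypes, lr=0.5)   # matching class: attract
second = lvq1_step([9.0, 9.0], "low", prototypes, lr=0.5)  # mismatched class: repel
```

Backpropagation, by contrast, adjusts all network weights by following the gradient of the output error, which is why the two methods can behave differently on the same data.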


2020 ◽  
Vol 7 (3) ◽  
pp. 443
Author(s):  
Azahari Azahari ◽  
Yulindawati Yulindawati ◽  
Dewi Rosita ◽  
Syamsuddin Mallala

Graduation predictions are needed by higher education management to set preventive policies for the early prevention of drop-out cases. The length of each student's study period can be influenced by various factors. Using the data mining algorithms naive Bayes and neural network, student graduation at STMIK Widya Cipta Dharma (WiCiDa) Samarinda can be predicted. The attributes used are age at admission, classification of the city of origin of the senior high school, father's occupation, study program, class, number of siblings, and cumulative grade point average (GPA). Samples of students who graduated or dropped out between 2011 and 2019 were used as training data and testing data, while the 2015–2018 cohorts were used as target data whose study periods are to be predicted. Of 3229 students in total, 1769 were used as training data, 321 as testing data, and 1139 as target data. All data were taken from undergraduate (strata 1) students and exclude diploma (D3) and transfer students. On the testing data, the accuracy obtained was only 57.63%. The results show many weaknesses in the naive Bayes predictions, whose accuracy is not very high, whereas the prediction accuracy of the neural network is 72.58%, making this alternative method the better one. Evaluation and analysis were carried out to see where the study-period predictions were right and where they were wrong.


2020 ◽  
Vol 1 (2) ◽  
pp. 77-88
Author(s):  
Nur Isnaini Parihah ◽  
Sari Hartini ◽  
Juarni Siregar

The birth rate is a factor that can affect population growth. A large population is a burden on development. According to Malthus's theory, rapid population growth brings not welfare but poverty if the population is not well controlled. The number of births in Tridaya Sakti Village increases every year. Data mining with the Naive Bayes algorithm can therefore help predict the infant birth rate in Tridaya Sakti Village. The aim is to determine the infant birth rate for the coming year by examining the prediction patterns of each variable and testing the training data against the testing data. It is hoped that the Naive Bayes algorithm can help Tridaya Sakti Village calculate infant birth rates and regulate population growth in the coming years. The results, computed from the collected data using the Naive Bayes algorithm, provide information that can serve as a reference for the number of births. Data processing becomes more effective and efficient in performance and time, and predictions of the number of births become more accurate. Keywords: Naive Bayes, Birth of a Baby, Prediction


2016 ◽  
Vol 31 (2) ◽  
pp. 495-513 ◽  
Author(s):  
Ruixin Yang

Abstract In hopes of better understanding the rapid intensification (RI) of tropical cyclones, the classification technique as a data mining process is used in this mining experiment. The mining results are expected to increase accurate forecasting abilities for RI through exhaustive data distillation. In this work, the Statistical Hurricane Intensity Prediction Scheme (SHIPS) database for the Atlantic basin during the period 1982–2009 is used as the data source and the Waikato Environment for Knowledge Analysis (WEKA) software is used for various classifier implementations. As in most classification applications, accuracies in model building with training data may be high. However, accuracies with testing data usually deteriorate. Various special steps are carried out in an effort to improve the accuracy. These steps include setting the cost parameters for overcoming the unbalanced RI samples, temporal averages of variable values for more accurate environmental estimation, feature filtering for irrelevant feature removal, and subset feature selections. The best performance measures of the training results are above 90% for probability of detection (POD) with 10%–20% false alarm ratios (FARs) for cases of RI within 24 h. However, the performance on the testing data is not as good. The reported RI forecasting accuracies in this work are lower than the goals set by NOAA in their Hurricane Forecast Improvement Project. Nevertheless, this work sheds light on the future direction of RI investigations using data mining techniques. Many more studies are needed before we can fully understand the potential and/or limitations of data mining techniques in RI investigations.
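The probability of detection and false alarm ratio quoted above are commonly computed from counts of hits, misses, and false alarms. The sketch below assumes those standard forecast-verification definitions, since the abstract does not spell out its formulas, and the counts are hypothetical.

```python
def probability_of_detection(hits, misses):
    """POD: fraction of observed events that were forecast."""
    return hits / (hits + misses)


def false_alarm_ratio(hits, false_alarms):
    """FAR: fraction of forecast events that did not occur."""
    return false_alarms / (hits + false_alarms)


# Hypothetical counts for RI forecasts within 24 h
hits, misses, false_alarms = 90, 10, 15

pod = probability_of_detection(hits, misses)
far = false_alarm_ratio(hits, false_alarms)
```

Because RI cases are rare relative to non-RI cases, both measures ignore correct negatives, which is one reason they are preferred over plain accuracy for unbalanced samples.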


2012 ◽  
Vol 2012 ◽  
pp. 1-13 ◽  
Author(s):  
Jussi Turkka ◽  
Fedor Chernogorov ◽  
Kimmo Brigatti ◽  
Tapani Ristaniemi ◽  
Jukka Lempiäinen

A data-mining framework for analyzing a cellular network drive testing database is described in this paper. The presented method is designed to detect sleeping base stations, network outage, and change of the dominance areas in a cognitive and self-organizing manner. The essence of the method is to find similarities between periodical network measurements and previously known outage data. For this purpose, diffusion maps dimensionality reduction and nearest neighbor data classification methods are utilized. The method is cognitive because it requires training data for the outage detection. In addition, the method is autonomous because it uses minimization of drive testing (MDT) functionality to gather the training and testing data. The motivation for classifying MDT measurement reports into periodical, handover, and outage categories is to detect areas where periodical reports start to become similar to the outage samples. Moreover, these areas are associated with estimated dominance areas to detect sleeping base stations. In the studied verification case, measurement classification results in an increase in the number of samples which can be used for detection of performance degradations and, consequently, makes the outage detection faster and more reliable.


2021 ◽  
Vol 5 (2) ◽  
pp. 335-341
Author(s):  
I Made Yudha Arya Dala ◽  
I Ketut Gede Darma Putra ◽  
Putu Wira Buana

Dengue has been known to the people of Indonesia since 1779. There are two types of Aedes mosquito, namely Aedes aegypti and Aedes albopictus. Aedes aegypti is a mosquito that carries the dengue virus. Dengue fever cases in Bali province tend to increase from year to year, especially when approaching the rainy season. Preventive action by the government is needed to limit the spread of the dengue virus and the resulting casualties. Data mining attempts to extract knowledge from historical data by finding regular patterns and relationships in a set of data. In this study, data mining is used to predict the number of dengue cases in Bali province. Prediction uses several variables in the database to estimate future values that are not currently known, based on patterns in the data set. This forecasting aims to assist the government in anticipating dengue fever cases in the coming period and preparing appropriate prevention efforts. Forecasting of dengue fever cases is carried out using three methods: backpropagation, Gaussian, and support-vector machine. The data comprise 528 samples from 2008 to 2018. The results show that backpropagation is better at predicting dengue fever cases, with a MAPE of 0.025, while the Gaussian method has a MAPE of 0.035 and the support-vector machine has a MAPE of 0.060.


2020 ◽  
Vol 8 (2) ◽  
pp. 181
Author(s):  
Ni Wayan Wiantari ◽  
I Wayan Supriana

Case-Based Reasoning (CBR) is a reasoning method that uses old knowledge to solve new problems. CBR provides a solution to a new case by looking for the old cases that are closest to it. One problem suited to CBR is cesarean section, because several factors affect whether a cesarean section is possible, so not every patient can undergo the surgery. These factors serve as the features of the system: age, number of pregnancies, time of delivery, blood pressure, and heart status. In this study, a system was built to determine whether a patient can have a cesarean section, using the CBR method with similarity calculated using Naive Bayes. The percentage correlation value of each feature is obtained with SPSS, because each feature affects the result differently. The cesarean section data comprise 80 records, divided into 70% training data (56) and 30% testing data (24). Each new case is compared with the old cases in the database, and the similarity criteria are then calculated with the given formula. Of the 24 testing records, 5 produced results that did not match the original data and 19 produced results that matched. The accuracy of the cesarean section prediction with the CBR method using Naive Bayes is therefore 79%.
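The split sizes and the accuracy reported in this abstract follow from simple arithmetic, sketched below. The helper names are hypothetical, and the code only reproduces the counts stated in the abstract rather than the CBR system itself.

```python
def split_counts(total, train_fraction):
    """Return (train, test) record counts for a given training fraction."""
    train = round(total * train_fraction)
    return train, total - train


def accuracy_percent(correct, total):
    """Share of test records classified correctly, in percent."""
    return 100.0 * correct / total


train_n, test_n = split_counts(80, 0.7)          # 80 records, 70% for training
accuracy = accuracy_percent(test_n - 5, test_n)  # 5 of the test records mismatched
```

The 19 matching results out of 24 give roughly 79.2%, which the abstract rounds to 79%.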


2021 ◽  
Vol 4 (2) ◽  
pp. 202-209
Author(s):  
Kelvin Hennry Loudry Malelak ◽  
I Made Dwi Ardiada ◽  
Gerson Feoh

Under normal conditions, undergraduate students at a university can complete their studies in 4 years, or 8 semesters. In practice, many students take more than 4 years. In the 2015/2016 academic year, 744 people were accepted as students. Of these, 405 completed their studies in about 4 years, 39 completed their studies in 5 years, and 300 did not continue their studies. To address this problem, this study implements a classification that can help Dhyana Pura University predict the length of study for students currently enrolled in its various study programs. The method used for this classification is the Naive Bayes algorithm, and the graduation data is classified with the Java-based RapidMiner tool. The data mining implementation, with the data divided into 968 training records and 193 testing records, obtained an accuracy rate of 100% with Naive Bayes and also shows very good parameters.


2021 ◽  
Vol 2 (2) ◽  
pp. 97-106
Author(s):  
Lydia Yohana Lumban Gaol ◽  
M. Safii ◽  
Dedi Suhendro

Graduation is an important element in the accreditation assessment of an institution or university. It is important to obtain predictions of student graduation in the Information Systems Study Program at STIKOM Tunas Bangsa Pematangsiantar, so that students who cannot graduate on time can be identified earlier. Data mining can be applied to predict student graduation, and classification is a method often used for this purpose. This research uses the C4.5 algorithm. The training data comes from alumni of the Information Systems Study Program at STIKOM Tunas Bangsa, and data on final-year students of the same program is used as testing data. The results are expected to provide predictions of on-time student graduation and to inform the university's decisions on future improvements.


2021 ◽  
Vol 2 (2) ◽  
pp. 74-81
Author(s):  
Muhammad Ridho Matondang ◽  
Muhammad Ridwan Lubis ◽  
Heru Satria Tambunan

Demand for natural resources keeps increasing, including for marine and coastal resources. The current condition of capture fisheries in Indonesia is not yet optimal, as indicated by the very slow increase in the volume of capture fisheries production. The purpose of this study is to classify data in order to predict the average increase in capture fisheries volume using data mining techniques. Data mining techniques are applied to determine the data patterns of the capture fisheries dataset, so that the classification results can be used to evaluate the factors that affect the volume of capture fisheries. The classification algorithm used is C4.5. The classification results were tested with RapidMiner. The level of performance is indicated by the accuracy value, which is obtained by testing the classification on training data and testing data. Comparing accuracy values between the algorithms used shows which algorithm is best for classifying capture fisheries data.

