PENGGUNAAN DATA MINING DALAM HIT RATE IMPORTASI JALUR MERAH DENGAN MODEL DECISION TREE

ABSTRACT: Customs and Excise faces a major challenge in increasing the hit rate of red line imports by 40%, in accordance with the Blueprint for the 2014-2025 Ministry of Finance Institutional Transformation Program and international benchmarks. This qualitative study aims to determine how data mining can be applied to the risk engine, based on import data, people's experience, and research results from customs institutions in other countries. The data mining approach combines the CRISP-DM process, a classification method, and a decision tree model, applied to red line import data from the Customs and Excise Main Service Office (KPU BC) Type A Tanjung Priok for the period September – December 2019 and January 2020.

The results show that the use of data mining can increase the hit rate of red line imports. The most relevant attribute for classifying the data is the sending country, which forms the root node of the tree, while the import duty tariff attribute contributes no information to the classification. This research is expected to give KPU BC Type A Tanjung Priok a new perspective for improving the targeting risk engine and the routing risk engine of Customs and Excise.

Keywords: CRISP-DM, data mining, decision tree, hit rate, red line import.
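As a rough illustration of the metric this abstract targets, the sketch below computes a hit rate under an assumed definition that the abstract itself does not state: the share of red line inspections that uncover a discrepancy. The function name and the sample outcomes are hypothetical and are not taken from the study.

```python
def hit_rate(inspection_outcomes):
    """Share of inspected declarations where a discrepancy was found.

    Assumed definition; the abstract does not spell out the formula.
    """
    if not inspection_outcomes:
        return 0.0
    hits = sum(1 for found in inspection_outcomes if found)
    return hits / len(inspection_outcomes)


# Invented outcomes: True means the physical inspection found a discrepancy.
outcomes = [True, False, False, False]
rate = hit_rate(outcomes)
print(f"hit rate: {rate:.0%}")
```

Under this definition, a better targeting model raises the hit rate by routing fewer compliant declarations to the red line.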


Fault Diagnosis of Automobile ECUs with Data Mining Technologies

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.40-41.156 ◽ 2010 ◽ Vol 40-41 ◽ pp. 156-161 ◽ Cited By ~ 1

Author(s): Yang Li ◽ Yan Qiang Li ◽ Zhi Xue Wang

Keyword(s): Data Mining ◽ Fault Diagnosis ◽ Decision Tree ◽ Data Stream ◽ Electronic Control Unit ◽ Rapid Development ◽ Reliability And Validity ◽ Control Unit ◽ Use Of Data ◽ The Cost

With the rapid development of automotive ECUs (Electronic Control Units), fault diagnosis is becoming increasingly complicated, and the link between a fault and its symptoms is becoming less obvious. To improve maintenance quality and efficiency, the paper proposes a fault diagnosis approach based on data mining technologies. The approach first extracts fault symptom vectors by processing the data stream, then builds a diagnosis decision tree with the ID3 decision tree algorithm, and finally stores the rules linking faults to their related symptoms in a historical fault database that serves as the foundation for fault diagnosis. The database also provides a basis for judging trends toward future faults. To verify the approach, an example of diagnosing faults in an entertainment ECU is shown. The test results confirm the reliability and validity of the diagnostic method and show that it reduces the cost of ECU diagnosis.
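ID3 chooses the attribute to split on at each node by information gain, the reduction in class entropy achieved by the split. The sketch below shows that selection step only, not a full tree builder. The symptom attributes, fault labels, and records are invented for illustration and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical symptom vectors for an entertainment ECU.
records = [
    {"no_audio": "yes", "display_dark": "no", "fault": "amplifier"},
    {"no_audio": "yes", "display_dark": "no", "fault": "amplifier"},
    {"no_audio": "no", "display_dark": "yes", "fault": "power_supply"},
    {"no_audio": "yes", "display_dark": "yes", "fault": "power_supply"},
]

# ID3 places the attribute with the highest information gain at the node.
best = max(["no_audio", "display_dark"],
           key=lambda a: information_gain(records, a, "fault"))
print(best)
```

In this toy data the `display_dark` symptom separates the two faults perfectly, so it has the higher gain and would become the root of the diagnosis tree.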


Usefulness of a decision tree model for the analysis of adverse drug reactions: Evaluation of a risk prediction model of vancomycin-associated nephrotoxicity constructed using a data mining procedure

Journal of Evaluation in Clinical Practice ◽ 10.1111/jep.12767 ◽ 2017 ◽ Vol 23 (6) ◽ pp. 1240-1246 ◽ Cited By ~ 10

Author(s): Shungo Imai ◽ Takehiro Yamada ◽ Kumiko Kasashi ◽ Masaki Kobayashi ◽ Ken Iseki

Keyword(s): Data Mining ◽ Decision Tree ◽ Prediction Model ◽ Risk Prediction ◽ Adverse Drug Reactions ◽ Risk Prediction Model ◽ Decision Tree Model ◽ Drug Reactions ◽ Tree Model ◽ Mining Procedure


PENERAPAN DATA MINING MENGGUNAKAN ALGORITMA C4.5 TEHADAP PENGARUH PENJUALAN KOPI PADA PT. JPW INDONESIA

Jurnal Sistem Informasi dan Informatika (Simika) ◽ 10.47080/simika.v3i1.836 ◽ 2020 ◽ Vol 3 (1) ◽ pp. 40-54

Author(s): Ikong Ifongki

Keyword(s): Data Mining ◽ Decision Tree ◽ Decision Rules ◽ Large Data ◽ Added Value ◽ Data Set ◽ Use Of Data ◽ Decision Tree Classification ◽ C4.5 Algorithm

Data mining is a series of processes for exploring the added value of a data set in the form of knowledge that could not be identified manually. Data mining techniques are expected to surface knowledge previously hidden in the data warehouse so that it becomes valuable information. The C4.5 algorithm is a widely used decision tree classification algorithm because it has key advantages over other algorithms: it produces decision trees that are easy to interpret, achieves an acceptable level of accuracy, is efficient in handling discrete attributes, and can handle both discrete and numeric attributes. Like other decision tree techniques, C4.5 outputs a decision tree, a structure that divides a large data set into smaller sets of records by applying a series of decision rules, so that with each division the members of the resulting sets become more similar to one another. This case study examines the factors affecting coffee sales by processing 106 records drawn from 1087 coffee sales records at PT. JPW Indonesia. The sampled data are calculated manually using Microsoft Excel and with RapidMiner software. The results of the C4.5 calculation show that the Quantity and Price attributes strongly affect coffee sales, so that sales at PT. JPW Indonesia are still often unstable.
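C4.5 ranks candidate attributes by gain ratio, which divides the information gain of a split by the split information (the entropy of the attribute's own value distribution). A minimal sketch of that criterion follows. The rows, attribute values, and class labels are invented and are not the PT. JPW Indonesia data, so the numbers say nothing about the study's findings.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """C4.5 split criterion: information gain divided by split information."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attribute] for row in rows])
    # An attribute with a single value carries no split information.
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical, already discretised sales records.
sales = [
    {"quantity": "high", "price": "low", "sales": "stable"},
    {"quantity": "high", "price": "high", "sales": "stable"},
    {"quantity": "low", "price": "low", "sales": "unstable"},
    {"quantity": "low", "price": "high", "sales": "unstable"},
]

ratios = {a: gain_ratio(sales, a, "sales") for a in ("quantity", "price")}
print(ratios)
```

Dividing by the split information penalises attributes with many distinct values, which plain information gain tends to favour.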


Application of Data Mining for Decision Tree Model of Multi-variety Discrete Production and Manufacture

2010 Third International Symposium on Intelligent Information Technology and Security Informatics ◽ 10.1109/iitsi.2010.13 ◽ 2010 ◽ Cited By ~ 1

Author(s): Jianfang Sun

Keyword(s): Data Mining ◽ Decision Tree ◽ Decision Tree Model ◽ Tree Model


Applying Data Mining Techniques on Continuous Sensed Data for Daily Living Activity Recognition

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.738-739.191 ◽ 2015 ◽ Vol 738-739 ◽ pp. 191-196

Author(s): Yun Jie Li ◽ Hui Song

Keyword(s): Neural Network ◽ Data Mining ◽ Decision Tree ◽ Daily Living ◽ Tree Model ◽ Data Set ◽ Data Mining Techniques ◽ Daily Living Activity ◽ The Neural Network ◽ Better Than

In this paper, several data mining techniques are discussed and analyzed with the aim of recognizing human daily activities from a continuously sensed data set. The decision tree, Naïve Bayes, and Neural Network techniques were successfully applied to the data set. The paper also proposes combining the Neural Network with the Decision Tree; the results show that the combination works much better than either the typical Neural Network or the typical Decision Tree model.


Data mining algorithm for development of a predictive model for mitigating loan risk in Nigerian banks

Journal of Applied Sciences and Environmental Management ◽ 10.4314/jasem.v25i9.11 ◽ 2021 ◽ Vol 25 (9) ◽ pp. 1613-1616

Author(s): O.B. Alaba ◽ E.O. Taiwo ◽ O.A. Abass

Keyword(s): Data Mining ◽ Decision Tree ◽ Predictive Model ◽ Risk Model ◽ Data Mining Algorithm ◽ Tree Model ◽ Mining Algorithm ◽ Loan Risk ◽ Structured Questionnaire ◽ J48 Decision Tree

This paper focuses on developing a data mining algorithm for a predictive loan risk model for Nigerian banks. The model classifies and predicts the risk of granting loans to customers as either a good or a bad loan, using the J48 decision tree, BayesNet, and Naïve Bayes algorithms on data covering a period of ten (10) years (2010-2019) collected with a structured questionnaire. The predictive model was formulated and simulated using the Waikato Environment for Knowledge Analysis (WEKA) software. The three algorithms were compared on accuracy and error rate in predicting loan risk. The study found the J48 decision tree model to be the most efficient of the three models.
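Comparing classifiers on accuracy and error rate, as this abstract describes, reduces to counting correct predictions per model. The sketch below shows the arithmetic with neutral placeholder names (`model_a`, `model_b`, `model_c`) and invented labels. It does not reproduce the WEKA experiment or its results.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Invented ground truth and predictions for three hypothetical models.
actual = ["good", "bad", "good", "good", "bad", "good", "bad", "good"]
predictions = {
    "model_a": ["good", "bad", "good", "good", "bad", "good", "good", "good"],
    "model_b": ["good", "good", "good", "good", "bad", "good", "good", "good"],
    "model_c": ["bad", "bad", "good", "good", "bad", "bad", "bad", "bad"],
}

scores = {name: accuracy(actual, preds) for name, preds in predictions.items()}
error_rates = {name: 1 - acc for name, acc in scores.items()}

# The preferred model has the highest accuracy, i.e. the lowest error rate.
best = max(scores, key=scores.get)
print(best, scores[best], error_rates[best])
```

Because error rate is simply one minus accuracy, the two metrics always rank the models identically.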


Predicting Paediatric Malaria Occurrence Using Classification Algorithm in Data Mining

Journal of Advances in Mathematics and Computer Science ◽ 10.9734/jamcs/2019/v31i430118 ◽ 2019 ◽ pp. 1-10 ◽ Cited By ~ 1

Author(s): T. C. Olayinka ◽ S. C. Chiemeke

Keyword(s): Data Mining ◽ Decision Tree ◽ Prediction Method ◽ Developed Countries ◽ Classification Algorithms ◽ Data Sets ◽ Tree Model ◽ Healthcare Data ◽ Paediatric Malaria ◽ The Developed Countries

This paper gives a current overview of the application of data mining techniques to a haematological and biochemical dataset to predict the occurrence of malaria in children aged zero (0) to five (5) years. Malaria has been eradicated from the developed countries but still negatively affects a large part of the world. A larger percentage of malaria cases is estimated to affect young children in sub-Saharan Africa. Reducing mortality from paediatric malaria requires an efficient and effective prediction method. In healthcare, data mining is one of the most vital and motivating areas of research; it aims to find meaningful information in huge data sets and provides an efficient analytical approach for detecting unknown and valuable information in healthcare data. In this study, a model was built to predict the occurrence of malaria in children aged zero (0) to five (5) years using decision tree classification algorithms in the WEKA workbench. The classification algorithms used were LMT, REPTree, Hoeffding tree, and J48. The J48 algorithm was used to build the decision tree model because it had higher accuracy with the least error margin.


Penerapan Algoritma C4.5 Dalam Memprediksi Ketersediaan Uang Pada Mesin ATM

JURNAL MEDIA INFORMATIKA BUDIDARMA ◽ 10.30865/mib.v5i2.2933 ◽ 2021 ◽ Vol 5 (2) ◽ pp. 556

Author(s): Firman Syahputra ◽ Hartono Hartono ◽ Rika Rosnelly

Keyword(s): Data Mining ◽ Decision Tree ◽ Travel Time ◽ Tree Model ◽ Optimal Service ◽ C4.5 Algorithm ◽ Cash Transaction ◽ AUC Value ◽ Using Data ◽ Balance Variable

This study evaluates the availability of money in ATMs using data mining. Data mining with the C4.5 algorithm is used to predict cash demand, or total cash withdrawals, at ATMs, and so to determine ATM cash needs from cash transaction data. The forecast is intended to help the monitoring department decide how much money must be allocated to each ATM. The results are expected to assist the ATM management unit in optimizing and monitoring the availability of money at each ATM so that it can provide optimal service to customers. The C4.5 algorithm builds a decision tree, which in turn generates new knowledge. The test results matched the data on the availability of money at the ATMs. When the C4.5 method is applied, the availability of money at an ATM depends on the travel time to the ATM location and on the remaining balance in the machine. In the resulting decision tree, the balance variable is the root, travel time is the branch at Level 1 with the values fast, medium, and long, and the bank is the branch at the last level (Level 2). The C4.5 algorithm was then tested using K-Fold Cross Validation with fold = 10, giving an accuracy of 85%, a precision of 80%, and a recall of 66.67%. The AUC (Area Under Curve) value is 0.833; the closer the AUC value is to 1, the better the accuracy level.
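Accuracy, precision, and recall all derive from the four cells of a binary confusion matrix. The sketch below shows the formulas. The counts passed in are hypothetical: the abstract does not report a confusion matrix, and these values were chosen only because they happen to produce percentages of the same size as the reported ones.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all cases
        "precision": tp / (tp + fp),     # predicted positives that are right
        "recall": tp / (tp + fn),        # actual positives that are found
    }


# Hypothetical counts, NOT the study's data.
metrics = classification_metrics(tp=4, fp=1, fn=2, tn=13)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

A recall well below precision, as in these figures, means the classifier misses a sizeable share of the positive cases even though most of its positive predictions are correct.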


Predicting Students Performance Based on Their Academic Profile

مجلة جامعة فلسطين التقنية للأبحاث ◽ 10.53671/pturj.v8i2.91 ◽ 2020 ◽ Vol 8 (2) ◽ pp. 23-39

Author(s): Hadi Khalilia ◽ Thaer Sammar ◽ Yazeed Sleet

Keyword(s): Data Mining ◽ Decision Tree ◽ High Speed ◽ Information Gain ◽ Educational Data Mining ◽ Computer Engineering ◽ Use Of Data ◽ Important Field ◽ Index Measure ◽ The University

Data mining is an important field that has been widely used in different domains, one of which is Educational Data Mining. In this study, we apply machine learning models to data obtained from Palestine Technical University-Kadoorie (PTUK) in Tulkarm for students in the department of computer engineering and applied computing. Students in both fields study the same major courses, C++ and Java, so we focused on these courses to predict students' performance. The goal of our study is to predict students' performance, measured by GPA in the major. Many techniques are used in the educational data mining field. We applied three models that are commonly used in the field to the obtained data: the decision tree with the information gain measure, the decision tree with the Gini index measure, and the naive Bayes model. We chose these models because they are efficient and fast at data classification and prediction. The results suggest that the decision tree with the information gain measure outperforms the other models, with an accuracy of 0.66. We took a deeper look at the key features used to train our models, namely the students' branch of study at school, their field of study at the university, and whether or not they have a scholarship. These features influence the prediction. For example, the accuracy of the decision tree with the information gain measure increases to 0.71 when it is applied to the subset of students who studied in the scientific branch at high school. This study is important for both the students and the higher management of PTUK, since the university will be able to make predictions about students' performance. In the experiments carried out, the model's predictions were in line with actual expectations.
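The two decision tree variants compared in this abstract differ only in the impurity measure used to score a split: entropy (for information gain) or the Gini index. The sketch below scores one candidate split under both measures. The class labels and the split are invented, not PTUK data.

```python
import math
from collections import Counter


def class_shares(labels):
    """Relative frequency of each class in a list of labels."""
    total = len(labels)
    return [n / total for n in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits, the impurity behind information gain."""
    return -sum(p * math.log2(p) for p in class_shares(labels))


def gini(labels):
    """Gini index: chance that two random draws belong to different classes."""
    return 1.0 - sum(p * p for p in class_shares(labels))


def impurity_decrease(parent, children, impurity):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    weighted = sum(len(c) / len(parent) * impurity(c) for c in children)
    return impurity(parent) - weighted


# Hypothetical GPA classes before and after a split on some feature.
parent = ["high"] * 4 + ["low"] * 4
children = [["high", "high", "high", "low"], ["high", "low", "low", "low"]]

info_gain = impurity_decrease(parent, children, entropy)
gini_gain = impurity_decrease(parent, children, gini)
print(round(info_gain, 4), round(gini_gain, 4))
```

The two measures usually rank candidate splits similarly, so differences in final accuracy between the two tree variants tend to be modest.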


An Educational Data Mining Application by Using Multiple Intelligences

Examining Multiple Intelligences and Digital Technologies for Enhanced Learning Opportunities - Advances in Educational Technologies and Instructional Design ◽ 10.4018/978-1-7998-0249-5.ch005 ◽ 2020 ◽ pp. 93-110

Author(s): Esra Aksoy ◽ Serkan Narli ◽ Mehmet Akif Aksoy

Keyword(s): Data Mining ◽ Decision Tree ◽ Learning Styles ◽ Multiple Intelligences ◽ Gifted Students ◽ Personality Types ◽ Decision Tree Model ◽ Tree Model ◽ Educational Domain ◽ Using Data

The aim of this chapter is to illustrate both the uses of data mining methods and the way these methods can be applied in education by using students' multiple intelligences. Data mining is a data analysis methodology that has been successfully used in different areas, including the educational domain. In this context, an application of educational data mining (EDM) is illustrated using multiple intelligences and some other variables (e.g., learning styles and personality types). The decision tree model was implemented using students' learning styles, multiple intelligences, and personality types to identify gifted students. The sample consisted of 735 middle school students. The constructed decision tree model, with 70% validity, revealed that examining mathematically gifted students using data mining techniques may be possible if specific characteristics are included.
