C4.5 Versus Other Decision Trees: A Review

2015 ◽  
Vol 4 (3) ◽  
pp. 173-182
Author(s):  
Salih Özsoy ◽  
Gökhan Gümüş ◽  
Savriddin KHALILOV

This study introduces data mining, one of the more recent technologies in information systems, and discusses classification, a data mining method, together with classification algorithms. The C4.5 decision tree algorithm was applied to the Labor Relations dataset from http://archive.ics.uci.edu/ml/datasets.html. Finally, C4.5 was compared with several other decision tree algorithms and proved to be one of the successful classifiers.

2012 ◽  
Vol 457-458 ◽  
pp. 754-757
Author(s):  
Hong Yan Zhao

Decision tree technology, the main technique for classification and prediction in data mining, infers classification rules in the form of a decision tree from a set of unordered, irregular examples. Building on the concept of the decision tree, the C4.5 algorithm, and decision tree construction, the C4.5 decision tree algorithm was applied to the analysis of students' scores with the aim of improving teaching quality.


Decision tree algorithms, being accurate and comprehensible classifiers, have been among the most widely used classifiers in data mining and machine learning. However, like many other classification algorithms, decision tree algorithms focus on extracting patterns with high generality, and in the process they ignore some rare but useful and interesting patterns that may exist in small disjuncts of data. Such extraordinary patterns with low support and high confidence capture very specific but exceptional behavior present in data. This paper proposes a novel Enhanced Decision Tree Algorithm for Discovering Intra and Inter-class Exceptions (EDTADE). Intra-class exceptions cover objects of unique interest within a class, whereas inter-class exceptions capture rare conditions that force us to shift the class of a few unusual objects. For instance, whales and bats are examples of intra-class exceptions, since they have unique characteristics within the class of mammals. Further, most birds are flying creatures, but rare birds such as the penguin and the ostrich fall into the category of non-flying birds. Here, the penguin and the ostrich are inter-class exceptions. In fact, without knowing about such exceptional patterns, our knowledge about a domain is incomplete. We have enhanced the decision tree algorithm by defining a framework for capturing intra and inter-class exceptions at the leaf nodes of a decision tree. The proposed algorithm (EDTADE) is applied to many datasets from the UCI Machine Learning Repository. The results show that EDTADE has been successful in discovering many intra and inter-class exceptions. Decision trees augmented with intra and inter-class exceptions are more accurate, more comprehensible, and more interesting, since they provide additional knowledge in the form of exceptional patterns that deviate from the general rules discovered for classification.
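The notions of support and confidence used in this abstract can be made concrete with a small calculation. The sketch below is not the EDTADE algorithm; it only computes the two measures for a general rule and for a candidate exception among the records that reach one hypothetical leaf. The record layout, the `rule_stats` helper, and the toy bird data are assumptions made for illustration.

```python
def rule_stats(records, condition, target_class):
    """Return (support, confidence) of the rule `condition -> target_class`.

    support    = share of all records that satisfy the condition AND have the class
    confidence = share of the records satisfying the condition that have the class
    """
    covered = [r for r in records if condition(r)]
    hits = [r for r in covered if r["class"] == target_class]
    support = len(hits) / len(records) if records else 0.0
    confidence = len(hits) / len(covered) if covered else 0.0
    return support, confidence


# Toy leaf contents (invented): 18 flying birds and 2 penguins.
records = (
    [{"species": "sparrow", "class": "flies"}] * 18
    + [{"species": "penguin", "class": "does_not_fly"}] * 2
)

# General rule at the leaf: "bird -> flies".
general = rule_stats(records, lambda r: True, "flies")

# Candidate exception: "bird and penguin -> does_not_fly".
exception = rule_stats(
    records, lambda r: r["species"] == "penguin", "does_not_fly"
)

print("general   support=%.2f confidence=%.2f" % general)
print("exception support=%.2f confidence=%.2f" % exception)
```

On this toy data the exception rule covers few records (low support) but is always right when it applies (high confidence), which is the kind of pattern the abstract describes.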


2013 ◽  
Vol 397-400 ◽  
pp. 2296-2300
Author(s):  
Fei Shuai ◽  
Jun Quan Li

Currently, there are complex relationships between the assets of information security products. Given this characteristic, we propose a new asset recognition algorithm (ART) based on an improvement of the C4.5 decision tree algorithm, and analyze the computational complexity and space complexity of the proposed algorithm. Finally, we demonstrate through an application example that our algorithm is more precise than the C4.5 algorithm in asset recognition; the result verifies the applicability of our algorithm.

Keywords: decision tree, information security product, asset recognition, C4.5


2013 ◽  
Vol 380-384 ◽  
pp. 1469-1472
Author(s):  
Gui Jun Shan

Partition methods for real-valued data play an extremely important role in decision tree algorithms in data mining and machine learning, because decision tree algorithms require that the values of attributes be discrete. In this paper, we propose a novel partition method for real-valued data in decision trees using a statistical criterion. The method constructs a statistical criterion to find accurate merging intervals. In addition, we present a heuristic partition algorithm that achieves a desired partition result with the aim of improving the performance of decision tree algorithms. Empirical experiments on UCI real data show that the new algorithm generates a better partition scheme, one that yields higher classification accuracy for the C4.5 decision tree than existing algorithms do.
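For context, a common baseline for partitioning a real-valued attribute is an entropy-based binary split: sort the values and pick the cut point with the highest information gain. The sketch below shows that baseline only; it is not the statistical merging criterion proposed in the paper. Placing the threshold at the midpoint between neighbouring values is a simplifying assumption, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, information_gain) for the best split `value <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    Returns (None, 0.0) if no split improves on the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy(labels)
    best_t, best_gain = None, 0.0
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cut point between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_t, best_gain = (lo + hi) / 2, gain
    return best_t, best_gain


# Toy attribute (invented): small values belong to class "a", large ones to "b".
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, gain = best_threshold(values, labels)
print(threshold, gain)
```

A multi-interval partition can be obtained by applying the same search recursively to each side of the cut, with a stopping rule of one's choice.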


2014 ◽  
Vol 926-930 ◽  
pp. 703-707
Author(s):  
Hu Yong

To address the problem of analyzing student results, a data mining model for student result data is presented. The decision tree method is a very effective classification method in data mining. Based on the characteristics of student result data, the C4.5 decision tree algorithm was adopted. C4.5 is an improvement on ID3, the core decision tree algorithm; it is simple to construct, relatively fast, and easy to implement. After the decision attributes are selected, the mining results show that the algorithm can correctly classify student result data and yield some valuable conclusions that support decision analysis.
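One well-known difference between ID3 and C4.5 is the split criterion: ID3 ranks attributes by information gain, while C4.5 uses the gain ratio, which divides the gain by the entropy of the attribute's own value distribution (the split information). The sketch below illustrates that difference on invented data; the function names and the toy attributes are assumptions, and the code is not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_and_ratio(attr_values, labels):
    """Return (information_gain, gain_ratio) of splitting `labels` by `attr_values`."""
    n = len(labels)
    groups = defaultdict(list)
    for value, label in zip(attr_values, labels):
        groups[value].append(label)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute values themselves.
    split_info = entropy(attr_values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


labels = ["pass", "pass", "fail", "fail"]
student_id = [1, 2, 3, 4]                      # unique per record
attended = ["yes", "yes", "no", "no"]          # two values, separates the classes

id_gain, id_ratio = gain_and_ratio(student_id, labels)
att_gain, att_ratio = gain_and_ratio(attended, labels)
print("student_id: gain=%.2f ratio=%.2f" % (id_gain, id_ratio))
print("attended:   gain=%.2f ratio=%.2f" % (att_gain, att_ratio))
```

Both attributes reach the same information gain here, but the gain ratio penalizes the identifier-like attribute with many distinct values, so a gain-ratio criterion prefers `attended`.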


2020 ◽  
Vol 27 (3) ◽  
pp. 29-43
Author(s):  
Sihem Oujdi ◽  
Hafida Belbachir ◽  
Faouzi Boufares

Using data mining techniques on spatial data is more complex than on classical data. To be able to extract useful patterns, spatial data mining algorithms must deal with the representation of data as a stack of thematic layers and consider, in addition to the object of interest itself, its neighbors linked through implicit spatial relations. Classification by decision trees, combined with visualization tools, provides a convenient decision support tool for spatial data analysis. The purpose of this paper is to provide and evaluate an alternative spatial classification algorithm that supports the thematic-layered data organization. The algorithm, named S-C4.5, adapts the C4.5 decision tree algorithm to spatial data; it is inspired by the SCART and spatial ID3 algorithms and adopts the Spatial Join Index. Our work concerns both data organization and the adaptation of the algorithm. Decision tree construction was tested on a traffic accident dataset and benchmarked on both computation time and memory consumption under different experimental settings: studying a phenomenon in relation to a single other phenomenon and then to multiple other phenomena, with one or more spatial relations. The different approaches show a trade-off between memory usage and computation time.


2019 ◽  
Vol 281 ◽  
pp. 05003
Author(s):  
Reem Razzaq Abdul Hussein ◽  
Dr.Muayad Sadik Croock ◽  
Dr Salih Mahdi Al-Qaraawi

Data mining methods, which can be optimized in different ways, are applied in crime detection. In this work, a decision tree algorithm is used for classification, and its structure is optimized with a smart search method. The method is applied to two datasets of criminals, one from Iraq and one from India. The goal of the proposed method is to identify criminals using a mining method based on smart search. This contribution yields better results than traditional mining methods by controlling the size of the tree through decreasing the leaf size.


2019 ◽  
Vol 7 (2) ◽  
Author(s):  
Dyah Wulandari ◽  
Nur Lutfiyana ◽  
Heny Sumarno

Abstract - Credit is the provision of money or equivalent claims, based on a loan agreement or arrangement between a bank and another party, which requires the borrower to repay the debt after a certain period together with interest, compensation, or profit sharing. The credit customer data available at BSM KCP Kemang Pratama still shows Non Performing Financing (NPF), or bad credit. When assessing a credit application, an analyst sometimes performs an inaccurate analysis, so some customers turn out to be less able to make their payments, which results in bad credit. The researchers therefore conducted an analysis using the C4.5 decision tree algorithm and the Rapid Miner application to determine creditworthiness. According to the analysis of credit customer data with the C4.5 decision tree algorithm, the determination of customer eligibility for credit is very effective, producing in Rapid Miner 5.3 an accuracy of 80%, a precision of 100%, and a recall of 0%, so that risk can be minimized.

Keywords: Credit, C4.5 Algorithm, Rapid Miner, Value Accuracy


2019 ◽  
Vol 5 (1) ◽  
pp. 75-86
Author(s):  
Farid Fadli ◽  
Belsana Butar Butar

Abstract: According to a 2004 WHO report, Indonesia was the country with the highest number of dengue fever sufferers and the highest death rate from the disease. If the disease is not handled properly, delayed treatment can be fatal. In this study, the authors used the decision tree method with the C4.5 algorithm to process patient data and predict, with the help of Rapidminer software, whether patients had dengue fever based on existing indications. The results of the data processing were evaluated and validated with a confusion matrix and the AUC: the C4.5 algorithm achieved an accuracy of 72% and an AUC of 0.758, which falls in the fair classification category.

Keywords: Algorithm C4.5, Decision Tree, Data Mining
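The evaluation measures named in this abstract can be computed from predictions directly. The following is a minimal sketch, assuming binary labels and a classifier that outputs a score per patient; it is not Rapidminer's implementation, and the helper names and toy numbers are invented. The AUC is computed here as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def auc(y_true, scores, positive=1):
    """Pairwise-ranking AUC; requires at least one positive and one negative case."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): 1 = has the condition, 0 = does not.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
area = auc(y_true, scores)
print(counts, acc, round(area, 3))
```

Accuracy depends on the chosen cut-off, whereas the AUC summarizes ranking quality across all cut-offs, which is why the two are often reported together.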


2021 ◽  
Vol 2 (4) ◽  
pp. 181-189
Author(s):  
Yuda Irawan

Based on data from UDD PMI Kampar Regency, prospective donors must meet a number of requirements to become blood donors. So far, blood donor selection has been done manually to determine whether a potential donor may donate blood. Meanwhile, the current information system has not yet extracted further knowledge from the large amount of stored data. There is a need for organizational consolidation and continuous evaluation of PMI's performance in dealing with social and humanitarian problems. A data mining application that uses classification with the Decision Tree C4.5 Algorithm to predict whether someone is eligible to donate blood can work from continuous or critical variables such as age, body weight, hemoglobin (HB) level, and blood pressure (systolic and diastolic). The data entering the information system is processed with the Decision Tree C4.5 Algorithm, which yields detailed results and valid, more accurate values. With this data mining application, the eligibility of potential blood donors can be classified based on age, body weight, hemoglobin, and blood pressure. Hemoglobin, with the highest gain value (0.861212618), is the variable that most strongly determines blood donation success.

