CUDT: A CUDA Based Decision Tree Algorithm

The decision tree is one of the best-known classification methods in data mining. Much research has focused on improving decision tree performance. However, those algorithms were developed for and run on traditional distributed systems. Without new technology, latency cannot be improved when processing the huge volumes of data generated by ubiquitous sensing nodes. To reduce data processing latency in large-scale data mining, in this paper we design and implement a new parallelized decision tree algorithm on CUDA (Compute Unified Device Architecture), a GPGPU solution provided by NVIDIA. In the proposed system, the CPU is responsible for flow control while the GPU is responsible for computation. We conducted many experiments to evaluate the performance of CUDT and compared it with traditional CPU versions. The results show that CUDT is 5 to 55 times faster than Weka-j48 and achieves an 18-fold speedup over SPRINT on large data sets.

An Analysis of Factors Influencing Foreign Language Self-Efficacy Based on C5.0 Decision Tree Algorithm in Data Mining

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.538.460 ◽ 2014 ◽ Vol 538 ◽ pp. 460-464

Author(s): Xue Li

Keyword(s): Data Mining ◽ Decision Tree ◽ Foreign Language ◽ Language Learning ◽ Learning Strategies ◽ Self Efficacy ◽ Foreign Language Learning ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ C5.0 Decision Tree

Drawing on the interconnection and mutual permeation among disciplines, the author attempts to apply information science to cognitive linguistics to provide a new perspective for the study of foreign languages. Using data mining, this paper analyzes the correlation between self-efficacy and four factors (anxiety, learning strategies, motivation, and learners' past achievement) and explores the extent to which these factors affect self-efficacy in language learning. The study uses the C5.0 decision tree algorithm in SPSS Clementine to analyze the data. The results indicate that increased anxiety is bound to weaken learners' motivation over time, and that such learners evidently have low self-efficacy. Employing strategies is very important in foreign language learning: neglecting learning strategies may result in unplanned learning with unsatisfactory achievement despite greater effort, and self-efficacy in foreign language learning may weaken accordingly. Learners' past achievement is a reference dimension in measuring self-efficacy, with a weaker influence.

The Application of Improved Decision Tree Algorithm in Data Mining of Employment Rate: Evidence from China

2009 First International Workshop on Database Technology and Applications ◽ 10.1109/dbta.2009.72 ◽ 2009 ◽ Cited By ~ 1

Author(s): Yuxiang Shao ◽ Qing Chen ◽ Weiming Yin

Keyword(s): Data Mining ◽ Decision Tree ◽ Employment Rate ◽ Decision Tree Algorithm ◽ Tree Algorithm

Data Mining for Detecting E-learning Courses Anomalies: An Application of Decision Tree Algorithm

International Journal on Advanced Science Engineering and Information Technology ◽ 10.18517/ijaseit.8.3.2756 ◽ 2018 ◽ Vol 8 (3) ◽ pp. 980 ◽ Cited By ~ 2

Author(s): Fatiha Elghibari ◽ Rachid Elouahbi ◽ Fatima El Khoukhi

Keyword(s): Data Mining ◽ Decision Tree ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ E Learning

Soil Data Analysis and Crop Yield Prediction in Data Mining using R – Programming

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.c8683.019320 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 1857-1860

Keyword(s): Data Mining ◽ Data Analysis ◽ Decision Tree ◽ Crop Yield ◽ Climatic Condition ◽ Research Work ◽ Yield Prediction ◽ Decision Tree Algorithm ◽ Data Set ◽ R Programming

Data mining is a good choice in the emerging research field of soil data analysis. Crop yield prediction is an important issue in crop selection. Previously, crop prediction relied on the farmer's experience with a particular type of field and crop, based on factors such as soil type, climatic conditions, season, weather, rainfall, and irrigation facilities. Data mining techniques are a better choice for predicting the crop. Soil analysis plays an important role in the agricultural field, and soil fertility prediction is one of the most important factors in agriculture. This research work predicts crop yield using a decision tree algorithm. The aim of the research is to assess the accuracy of the prediction and to find the crop yield using a decision tree; the C4.5 algorithm is used to predict crop yield using R programming and also to find the range of magnesium in the collected soil data set. This prediction will be very useful to farmers in predicting crop yield for cultivation.

Application of Data Mining Using the C4.5 Algorithm to Factors Influencing Coffee Sales at PT. JPW Indonesia

Jurnal Sistem Informasi dan Informatika (Simika) ◽ 10.47080/simika.v3i1.836 ◽ 2020 ◽ Vol 3 (1) ◽ pp. 40-54

Author(s): Ikong Ifongki

Keyword(s): Data Mining ◽ Decision Tree ◽ Decision Rules ◽ Large Data ◽ Added Value ◽ Data Set ◽ Use Of Data ◽ Decision Tree Classification ◽ C4.5 Algorithm

Data mining is a series of processes for extracting added value from a data set in the form of knowledge that could not be discovered manually. Data mining techniques are expected to surface knowledge previously hidden in the data warehouse, turning it into valuable information. The C4.5 algorithm is a widely used decision tree classification algorithm because it has major advantages over other algorithms: it produces decision trees that are easy to interpret, achieves an acceptable level of accuracy, handles discrete attributes efficiently, and can handle both discrete and numeric attributes. The output of the C4.5 algorithm is a decision tree. As in other classification techniques, a decision tree is a structure that divides a large data set into smaller sets of records by applying a series of decision rules, so that with each division the members of the resulting sets become more similar to one another. This case study examines the factors affecting coffee sales by processing 106 records sampled from 1,087 coffee sales records at PT. JPW Indonesia. The sampled data are processed both manually in Microsoft Excel and with RapidMiner software. The results of the C4.5 calculation show that the Quantity and Price attributes greatly affect coffee sales, so sales at PT. JPW Indonesia are still often unstable.
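The abstract does not spell out how C4.5 chooses its decision rules. As a minimal sketch, assuming the standard gain ratio criterion that C4.5 is known for and only discrete attributes, the choice of a split attribute can be illustrated as follows. The toy records and the attribute names `price`, `quantity`, and `sales` are hypothetical and are not taken from the paper's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Gain ratio of splitting `rows` (a list of dicts) on the discrete attribute `attr`."""
    total = len(rows)
    # Group the target labels by the value each row has for `attr`.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr], []).append(row[target])
    # Information gain = entropy before the split - weighted entropy after it.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attr] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records; names and values are made up for illustration.
rows = [
    {"price": "low", "quantity": "high", "sales": "up"},
    {"price": "low", "quantity": "low", "sales": "down"},
    {"price": "high", "quantity": "high", "sales": "up"},
    {"price": "high", "quantity": "low", "sales": "down"},
]

# The attribute with the highest gain ratio becomes the decision rule at this node.
best = max(["price", "quantity"], key=lambda a: gain_ratio(rows, a, "sales"))
```

In this toy data `quantity` separates the classes perfectly while `price` carries no information, so `quantity` would be chosen for the root split. A full C4.5 implementation would also handle numeric attributes through thresholds, missing values, and pruning, none of which is shown here.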

Using data mining to predict emergency department length of stay greater than 4 hours: Derivation and single‐site validation of a decision tree algorithm

Emergency Medicine Australasia ◽ 10.1111/1742-6723.13421 ◽ 2019 ◽ Vol 32 (3) ◽ pp. 416-421 ◽ Cited By ~ 2

Author(s): Md Anisur Rahman ◽ Bridget Honan ◽ Thomas Glanville ◽ Peter Hough ◽ Katie Walker

Keyword(s): Data Mining ◽ Emergency Department ◽ Length Of Stay ◽ Decision Tree ◽ Single Site ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ Using Data

The Analysis and Application of the C4.5 Algorithm in Decision Tree Technology

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.457-458.754 ◽ 2012 ◽ Vol 457-458 ◽ pp. 754-757

Author(s): Hong Yan Zhao

Keyword(s): Data Mining ◽ Decision Tree ◽ Teaching Quality ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ C4.5 Algorithm ◽ C4.5 Decision Tree ◽ Result Analysis

Decision tree technology, a principal technique for classification and prediction in data mining, infers classification rules in decision tree form from a set of unordered, irregular examples. Building on the concept of the decision tree, the C4.5 algorithm, and decision tree construction, the C4.5 decision tree algorithm is applied to the analysis of students' scores with the aim of improving teaching quality.

A data mining approach to optimize pellets manufacturing process based on a decision tree algorithm

European Journal of Pharmaceutical Sciences ◽ 10.1016/j.ejps.2015.03.013 ◽ 2015 ◽ Vol 73 ◽ pp. 44-48 ◽ Cited By ~ 22

Author(s): Joanna Ronowicz ◽ Markus Thommes ◽ Peter Kleinebudde ◽ Jerzy Krysiński

Keyword(s): Data Mining ◽ Decision Tree ◽ Manufacturing Process ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ Data Mining Approach

A Decision Tree Algorithm for Distributed Data Mining: Towards Network Intrusion Detection

Computational Science and Its Applications – ICCSA 2004 - Lecture Notes in Computer Science ◽ 10.1007/978-3-540-24768-5_22 ◽ 2004 ◽ pp. 206-212 ◽ Cited By ~ 7

Author(s): Sung Baik ◽ Jerzy Bala

Keyword(s): Data Mining ◽ Intrusion Detection ◽ Decision Tree ◽ Distributed Data Mining ◽ Network Intrusion Detection ◽ Distributed Data ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ Network Intrusion

The Method of Attribute Reduction Based on Decision Tree

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.230-232.1303 ◽ 2011 ◽ Vol 230-232 ◽ pp. 1303-1307 ◽ Cited By ~ 2

Author(s): Fa Chao Li ◽ Hong Ze Yin ◽ Fei Guan

Keyword(s): Data Mining ◽ Decision Tree ◽ Reduction Method ◽ Attribute Reduction ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ The Core ◽ Basic Ideas ◽ Specific Implementation ◽ Implementation Steps

This paper addresses database refinement in the data mining process. Based on an analysis of the features and disadvantages of the decision tree algorithm and the substantive characteristics of data mining, we propose the concept of the core sample set and prove its invariance. On this basis, we build an attribute reduction method based on the decision tree algorithm, give specific implementation steps, and analyze the characteristics and efficiency of the method on a specific instance. Results show that the decision-tree-based attribute reduction method has good operability and interpretability. The method realizes attribute reduction of an information system simply, and its basic ideas are fully suited to attribute reduction problems in uncertain environments.
