Automatic Design of Decision-Tree Algorithms with Evolutionary Algorithms

2013 ◽  
Vol 21 (4) ◽  
pp. 659-684 ◽  
Author(s):  
Rodrigo C. Barros ◽  
Márcio P. Basgalupp ◽  
André C. P. L. F. de Carvalho ◽  
Alex A. Freitas

This study reports the empirical analysis of a hyper-heuristic evolutionary algorithm that is capable of automatically designing top-down decision-tree induction algorithms. Top-down decision-tree algorithms are of great importance, considering their ability to provide an intuitive and accurate knowledge representation for classification problems. The automatic design of these algorithms seems timely, given the large literature accumulated over more than 40 years of research in the manual design of decision-tree induction algorithms. The proposed hyper-heuristic evolutionary algorithm, HEAD-DT, is extensively tested using 20 public UCI datasets and 10 microarray gene expression datasets. The algorithms automatically designed by HEAD-DT are compared with traditional decision-tree induction algorithms, such as C4.5 and CART. Experimental results show that HEAD-DT is capable of generating algorithms which are significantly more accurate than C4.5 and CART.
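The hyper-heuristic idea can be illustrated with a toy evolutionary loop over discrete design components. The component lists, genome layout, and fitness function below are illustrative assumptions, not HEAD-DT's actual building blocks; in the real system, fitness would be the validation accuracy of the decision-tree algorithm assembled from the genome:

```python
import random

# Illustrative design space of decision-tree components; these lists are
# assumptions, not HEAD-DT's actual building blocks.
SPLIT_CRITERIA = ["gini", "entropy", "gain_ratio"]
MIN_SAMPLES = [1, 2, 5, 10]            # stopping rule: minimum samples to split
PRUNING = [None, "reduced_error", "pessimistic"]
SPACE = {"criterion": SPLIT_CRITERIA, "min_samples": MIN_SAMPLES, "pruning": PRUNING}

def random_genome(rng):
    """One candidate decision-tree algorithm, encoded as design choices."""
    return {key: rng.choice(options) for key, options in SPACE.items()}

def mutate(genome, rng):
    """Resample one randomly chosen design component."""
    child = dict(genome)
    key = rng.choice(list(SPACE))
    child[key] = rng.choice(SPACE[key])
    return child

def evolve(fitness, generations=30, pop_size=20, seed=0):
    """Elitist loop: keep the better half, refill with mutants of survivors."""
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        pop = survivors + [mutate(rng.choice(survivors), rng) for _ in survivors]
    return max(pop, key=fitness)

# Stand-in fitness: HEAD-DT would train the assembled algorithm and score its
# validation accuracy; a fixed preference keeps the sketch self-contained.
def toy_fitness(genome):
    return ((genome["criterion"] == "gain_ratio")
            + (genome["pruning"] == "reduced_error")
            + 1.0 / genome["min_samples"])

best = evolve(toy_fitness)
```

The point of the sketch is the level at which the search operates: individuals are *algorithm designs*, not trees, which is what distinguishes a hyper-heuristic from ordinary evolutionary tree induction.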

Author(s):  
Ferdinand Bollwein ◽  
Stephan Westphal

Univariate decision tree induction methods for multiclass classification problems, such as CART, C4.5 and ID3, continue to be very popular in machine learning due to their major benefit of being easy to interpret. However, as these trees consider only a single attribute per node, they often grow quite large, which lowers their explanatory value. Oblique decision tree building algorithms, which divide the feature space by multidimensional hyperplanes, often produce much smaller trees, but the individual splits are hard to interpret. Moreover, the effort of finding optimal oblique splits is so high that heuristics have to be applied to determine locally optimal solutions. In this work, we introduce an effective branch and bound procedure to determine globally optimal bivariate oblique splits for concave impurity measures. Decision trees based on these bivariate oblique splits remain fairly interpretable due to the restriction to two attributes per split. The resulting trees are significantly smaller and more accurate than their univariate counterparts, owing to their ability to adapt better to the underlying data and to capture interactions of attribute pairs. Moreover, our evaluation shows that our algorithm even outperforms algorithms based on heuristically obtained multivariate oblique splits, despite the fact that we focus on two attributes only.
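As an illustration of the objective being optimized, the following brute-force sketch scores bivariate oblique splits by weighted Gini impurity, generating candidate hyperplanes from pairs of training points on each attribute pair. This is only a stand-in: the paper's branch and bound procedure finds the globally optimal split efficiently, and the candidate-generation scheme here is an assumption:

```python
import numpy as np
from itertools import combinations

def gini(y):
    """Gini impurity of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def weighted_gini(y_left, y_right):
    """Size-weighted impurity of the two sides of a split."""
    n = len(y_left) + len(y_right)
    return (len(y_left) * gini(y_left) + len(y_right) * gini(y_right)) / n

def best_bivariate_split(X, y):
    """Brute-force search over oblique splits on attribute pairs.
    Candidate lines pass through pairs of training points projected onto
    two attributes; the split is w . x <= t with w normal to that line."""
    n, d = X.shape
    best = (np.inf, None)
    for i, j in combinations(range(d), 2):
        P = X[:, [i, j]]
        for a in range(n):
            for b in range(a + 1, n):
                direction = P[b] - P[a]
                if np.allclose(direction, 0):
                    continue
                w = np.array([-direction[1], direction[0]])  # normal to the line
                t = w @ P[a]
                mask = P @ w <= t
                if mask.all() or not mask.any():
                    continue  # degenerate split: everything on one side
                score = weighted_gini(y[mask], y[~mask])
                if score < best[0]:
                    best = (score, (i, j, w, t))
    return best
```

Restricting the search to two attributes per split is what keeps the resulting trees interpretable: each internal node can be drawn as a single line in a 2-D scatter plot of the chosen attribute pair.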


Author(s):  
PRAMOD PATIL ◽  
ALKA LONDHE ◽  
PARAG KULKARNI

Most decision tree algorithms rely on impurity measures to evaluate the goodness of hyperplanes at each node while learning a decision tree in a top-down fashion. These impurity measures are not differentiable with respect to the hyperplane parameters, so algorithms for decision tree learning based on them must resort to search techniques for finding the best hyperplane at each node. Moreover, these impurity measures do not properly capture the geometric structure of the data. Motivated by this, a two-class algorithm for learning oblique decision trees is proposed in this paper. The algorithm evaluates hyperplanes in such a way that the (linear) geometric structure of the data is taken into account. At each node of the decision tree, the algorithm finds a clustering hyperplane for each of the two classes. The clustering hyperplanes are obtained by solving a generalized eigenvalue problem. The data is then split based on an angle bisector of these hyperplanes, and the left and right sub-trees of the node are learned recursively. Since, in general, there are two angle bisectors, the better one is selected according to an impurity measure, the Gini index. The algorithm thus combines linear tendencies in the data with the purity of nodes to find better decision trees, leading to small trees and better performance.
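A rough sketch of the two main ingredients, assuming a proximal-plane-style formulation (the paper's exact objective may differ): each class's clustering hyperplane comes from a generalized eigenvalue problem on the augmented data, and the node's split is one of the two angle bisectors of those planes:

```python
import numpy as np

def clustering_hyperplanes(X_pos, X_neg):
    """For each class, find a hyperplane w . x = b that is close to that
    class and far from the other by minimizing ||A w||^2 / ||B w||^2 over
    the augmented data A = [X_pos, -1], B = [X_neg, -1].  The minimizer is
    the eigenvector of the smallest generalized eigenvalue of (A'A, B'B)."""
    A = np.hstack([X_pos, -np.ones((len(X_pos), 1))])
    B = np.hstack([X_neg, -np.ones((len(X_neg), 1))])
    G, H = A.T @ A, B.T @ B
    reg = 1e-6 * np.eye(G.shape[0])  # regularize so the problem is well posed
    # Plane for the positive class: minimize ||A w|| relative to ||B w||.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(H + reg, G))
    w1 = np.real(eigvecs[:, np.argmin(np.abs(eigvals))])
    # Plane for the negative class: swap the roles of the two classes.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(G + reg, H))
    w2 = np.real(eigvecs[:, np.argmin(np.abs(eigvals))])
    return w1, w2

def angle_bisectors(w1, w2):
    """The two bisectors of the clustering hyperplanes; the tree keeps
    whichever yields the lower Gini impurity after splitting on it."""
    u, v = w1 / np.linalg.norm(w1), w2 / np.linalg.norm(w2)
    return u + v, u - v
```

On two classes lying along crossing lines, the recovered planes align with those lines and the bisectors fall between them, which is exactly the geometric structure an axis-parallel impurity search would miss.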


Author(s):  
Marek Kretowski ◽  
Marcin Czajkowski

Decision trees represent one of the main predictive techniques in knowledge discovery. This chapter describes evolutionary induced trees, which are emerging alternatives to the greedy top-down solutions. Most typical tree-based systems search only for locally optimal decisions at each node and do not guarantee the optimal solution. Applying evolutionary algorithms to the problem of decision tree induction allows searching for the structure of the tree, the tests in internal nodes, and the regression functions in the leaves (for model trees) at the same time. As a result, such globally induced decision trees are able to avoid local optima and usually lead to better prediction than their greedy counterparts.
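A minimal sketch of the global approach: entire trees are the individuals, so structure and node tests evolve together instead of being fixed one greedy decision at a time. The representation, mutation operators, and parameters below are illustrative assumptions:

```python
import random

# A tree is either a class label (leaf) or a tuple (feature, threshold,
# left_subtree, right_subtree).  Whole trees are the evolving individuals.

def predict(tree, x):
    """Route a sample down the tree to a leaf label."""
    while isinstance(tree, tuple):
        feat, thr, left, right = tree
        tree = left if x[feat] <= thr else right
    return tree

def accuracy(tree, X, y):
    return sum(predict(tree, xi) == yi for xi, yi in zip(X, y)) / len(y)

def random_tree(rng, n_features, labels, depth=2):
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(labels)
    return (rng.randrange(n_features), rng.uniform(0, 1),
            random_tree(rng, n_features, labels, depth - 1),
            random_tree(rng, n_features, labels, depth - 1))

def mutate_tree(tree, rng, n_features, labels, depth=2):
    """Mutate structure (replace a subtree) or a node test (new threshold)."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, n_features, labels, depth)
    feat, thr, left, right = tree
    if rng.random() < 0.5:
        return (feat, rng.uniform(0, 1), left, right)
    if rng.random() < 0.5:
        return (feat, thr, mutate_tree(left, rng, n_features, labels, depth - 1), right)
    return (feat, thr, left, mutate_tree(right, rng, n_features, labels, depth - 1))

def evolve_trees(X, y, generations=60, pop=30, seed=1):
    """Elitist evolutionary loop with training accuracy as fitness."""
    rng = random.Random(seed)
    labels, n_features = sorted(set(y)), len(X[0])
    trees = [random_tree(rng, n_features, labels) for _ in range(pop)]
    for _ in range(generations):
        trees.sort(key=lambda t: accuracy(t, X, y), reverse=True)
        elite = trees[: pop // 2]
        trees = elite + [mutate_tree(rng.choice(elite), rng, n_features, labels)
                         for _ in range(pop - len(elite))]
    return max(trees, key=lambda t: accuracy(t, X, y))
```

Because fitness is computed on the whole tree, a mutation that worsens one node but enables a better test elsewhere can still survive, which is how the global search escapes the local optima that trap greedy induction.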




2012 ◽  
Vol 13 (1) ◽  
Author(s):  
Rodrigo C Barros ◽  
Ana T Winck ◽  
Karina S Machado ◽  
Márcio P Basgalupp ◽  
André CPLF de Carvalho ◽  
...  
