Learning decision trees from decision rules: A method and initial results from a comparative study

1993 ◽ Vol 2 (3) ◽ pp. 279–304
Author(s): I. F. Imam, R. S. Michalski

2013 ◽ Vol 40 (15) ◽ pp. 6047–6054
Author(s): Joaquín Abellán, Griselda López, Juan de Oña

Author(s): Marek Kretowski, Marek Grzes

Decision trees are, besides decision rules, one of the most popular forms of knowledge representation in the Knowledge Discovery in Databases process (Fayyad, Piatetsky-Shapiro, Smyth & Uthurusamy, 1996), and implementations of the classical decision tree induction algorithms are included in the majority of data mining systems. The hierarchical structure of a tree-based classifier, in which appropriate tests from consecutive nodes are applied in sequence, closely resembles the human way of decision making. This makes decision trees natural and easy to understand, even for an inexperienced analyst. The popularity of the decision tree approach can also be explained by its ease of application, fast classification and, what may be most important, its effectiveness. Two main types of decision trees can be distinguished by the type of tests in non-terminal nodes: univariate and multivariate decision trees. In the first group, a single attribute is used in each test. For a continuous-valued feature, an inequality test with binary outcomes is usually applied, and for a nominal attribute, mutually exclusive groups of attribute values are associated with the outcomes. As a good representative of univariate inducers, the well-known C4.5 system developed by Quinlan (1993) should be mentioned. In univariate trees, a split is equivalent to partitioning the feature space with an axis-parallel hyper-plane. If the decision boundaries of a particular dataset are not axis-parallel, using such tests may lead to an overcomplicated classifier. This situation is known as the “staircase effect”. The problem can be mitigated by applying more sophisticated multivariate tests, in which more than one feature can be taken into account. The most common form of such a test is an oblique split, which is based on a linear combination of features (a hyper-plane).
A decision tree that applies only oblique tests is often called oblique or linear, whereas heterogeneous trees with univariate, linear and other multivariate (e.g., instance-based) tests can be called mixed decision trees (Llora & Wilson, 2004). It should be emphasized that the computational complexity of multivariate induction is generally significantly higher than that of univariate induction. CART (Breiman, Friedman, Olshen & Stone, 1984) and OC1 (Murthy, Kasif & Salzberg, 1994) are well-known examples of multivariate systems.
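The contrast between an axis-parallel test and an oblique test can be sketched in a few lines of Python. This is an illustrative example with invented data, not code from any of the cited systems; the diagonal boundary x1 + x2 = 1 is chosen precisely because no single axis-parallel split can reproduce it, which is what produces the staircase effect described above.

```python
# Illustrative only: a univariate (axis-parallel) test vs. an oblique
# (linear-combination) test on a diagonal class boundary x1 + x2 = 1.

def univariate_test(x, feature=0, threshold=0.5):
    """Axis-parallel split: compare one attribute against a threshold."""
    return x[feature] <= threshold

def oblique_test(x, weights=(1.0, 1.0), threshold=1.0):
    """Oblique split: threshold a linear combination of attributes."""
    return sum(w * xi for w, xi in zip(weights, x)) <= threshold

points = [(0.2, 0.3), (0.9, 0.8), (0.6, 0.2), (0.3, 0.9)]
labels = [x1 + x2 <= 1.0 for x1, x2 in points]  # true diagonal boundary

# The oblique test reproduces the diagonal boundary exactly...
assert all(oblique_test(p) == y for p, y in zip(points, labels))
# ...while this single axis-parallel test still errs on some points.
print(sum(univariate_test(p) != y for p, y in zip(points, labels)))  # → 2
```

A univariate tree would need a cascade of such axis-parallel splits to approximate the diagonal, whereas one oblique split suffices.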


2008 ◽ pp. 2978–2992
Author(s): Jianting Zhang, Wieguo Liu, Le Gruenwald

Decision trees (DTs) have been widely used for the training and classification of remotely sensed image data, owing to their capability to generate human-interpretable decision rules and their relatively fast speed in training and classification. This chapter proposes a successive decision tree (SDT) approach, in which the samples in the ill-classified branches of a previously built decision tree are used to construct a successive decision tree. The decision trees are chained together through pointers and used jointly for classification. SDT aims at constructing more interpretable decision trees while attempting to improve classification accuracy. The proposed approach is applied to two real remotely sensed image datasets and evaluated in terms of the classification accuracy and interpretability of the resulting decision rules.
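The core idea — samples falling into an impure branch are handed to a further tree reached by a pointer — can be sketched in pure Python. This is a hypothetical, drastically simplified analogy, not the chapter's algorithm: each "tree" here is a single-split stump, and an impure branch simply links to a successive stump trained only on the samples that fell into it.

```python
# Hedged sketch: each "tree" is one threshold split; impure branches
# point (via dict references) to successive stumps trained on just the
# samples that reached them. Names and data are invented for illustration.

def majority(labels):
    return max(set(labels), key=labels.count)

def fit_sdt(samples, depth=0, max_chain=3):
    """Fit a stump; chain impure branches to successive stumps."""
    ys = [y for _, y in samples]
    if len(set(ys)) == 1 or depth >= max_chain:
        return {"leaf": majority(ys)}
    best = None  # exhaustive search for the best single-feature threshold
    for f in range(len(samples[0][0])):
        for t in sorted({x[f] for x, _ in samples}):
            left = [(x, y) for x, y in samples if x[f] <= t]
            right = [(x, y) for x, y in samples if x[f] > t]
            if not left or not right:
                continue
            errs = sum(y != majority([b for _, b in side])
                       for side in (left, right) for _, y in side)
            if best is None or errs < best[0]:
                best = (errs, f, t, left, right)
    if best is None:
        return {"leaf": majority(ys)}
    _, f, t, left, right = best
    # branches that remain impure chain to successive stumps
    return {"feature": f, "thr": t,
            "le": fit_sdt(left, depth + 1, max_chain),
            "gt": fit_sdt(right, depth + 1, max_chain)}

def predict(node, x):
    """Follow the chain of pointers until a leaf is reached."""
    while "leaf" not in node:
        node = node["le"] if x[node["feature"]] <= node["thr"] else node["gt"]
    return node["leaf"]

samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # XOR labels
tree = fit_sdt(samples)
print([predict(tree, x) for x, _ in samples])  # → [0, 1, 1, 0]
```

Because each stump is tiny, every link in the chain stays individually readable — a loose analogue of the interpretability argument made for SDT.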


Author(s): Malcolm J. Beynon, Paul Jones

This chapter considers the soft computing approach called fuzzy decision trees (FDTs), a form of classification analysis. Carrying out decision tree analysis in a fuzzy environment brings further interpretability and readability to the constructed ‘if … then …’ decision rules. Two sets of FDT analyses are presented: the first, on a small example data set, offers a tutorial on the rudiments of one FDT technique; the second investigates an e-learning database and elucidates the relationship between the weekly online activity of students and their final mark on a university course module. Emphasis throughout the chapter is on the visualization of results, including the fuzzification of students' weekly online activity levels and overall performance.
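The fuzzification step mentioned above can be illustrated with a minimal sketch. Everything here is invented for the example — the linguistic terms, the triangular membership boundaries, and the rule base are assumptions, not the chapter's actual technique or data.

```python
# Illustrative sketch only: fuzzify an activity score into linguistic
# terms and fire simple fuzzy 'if ... then ...' rules. All term boundaries
# and rule consequents are hypothetical.

def triangular(x, a, b, c):
    """Triangular membership function peaking at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(activity):
    """Map a weekly activity score in [0, 100] to membership degrees."""
    return {"low":    triangular(activity, -1, 0, 50),
            "medium": triangular(activity, 0, 50, 100),
            "high":   triangular(activity, 50, 100, 101)}

# Hypothetical rule base: if activity is <term> then outcome is <class>.
rules = [("low", "fail"), ("medium", "pass"), ("high", "distinction")]

def classify(activity):
    """Fire each rule with its antecedent degree; return the strongest."""
    degrees = fuzzify(activity)
    return max(rules, key=lambda r: degrees[r[0]])[1]

print(classify(20))  # → fail (membership in "low" dominates at 20)
print(classify(80))  # → distinction
```

Unlike a crisp split, a score such as 20 belongs partly to "low" (0.6) and partly to "medium" (0.4), which is what gives the resulting rules their graded readability.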

