Learning Optimal Decision Trees Using Caching Branch-and-Bound Search

Gaël Aglin; Siegfried Nijssen; Pierre Schaus

doi:10.1609/aaai.v34i04.5711

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5711 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3146-3153

Author(s):

Gaël Aglin ◽

Siegfried Nijssen ◽

Pierre Schaus

Keyword(s):

Decision Tree ◽

Decision Trees ◽

Branch And Bound ◽

Search Space ◽

Mixed Integer ◽

Optimal Decision ◽

New Approach ◽

Branch And Bound Search ◽

New Type ◽

Formal Requirements

Several recent publications have studied the use of Mixed Integer Programming (MIP) for finding an optimal decision tree, that is, the best decision tree under formal requirements on accuracy, fairness or interpretability of the predictive model. These publications used MIP to deal with the hard computational challenge of finding such trees. In this paper, we introduce a new efficient algorithm, DL8.5, for finding optimal decision trees, based on the use of itemset mining techniques. We show that this new approach outperforms earlier approaches by several orders of magnitude, for both numerical and discrete data, and is generic as well. The key idea underlying this new approach is the use of a cache of itemsets in combination with branch-and-bound search; this new type of cache also stores results for parts of the search space that have been traversed partially.
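To make the idea of an itemset cache combined with branch-and-bound concrete, here is a minimal Python sketch. It is not the DL8.5 implementation: it assumes binary features and binary labels, minimizes training misclassifications under a depth limit, and caches only fully solved subproblems, whereas the abstract says the real cache also keeps results for partially traversed parts of the search space. The names `optimal_tree_error`, `solve` and `leaf_error` are hypothetical.

```python
def optimal_tree_error(rows, labels, max_depth):
    """Minimum number of misclassified rows over all trees of depth <= max_depth.

    Simplified sketch (an assumption, not DL8.5 itself): rows are tuples of
    0/1 features and labels are 0/1.
    """
    n_features = len(rows[0])
    cache = {}  # itemset (frozenset of (feature, value) tests) -> optimal error

    def leaf_error(idx):
        # Error of predicting the majority class for the rows in idx.
        ones = sum(labels[i] for i in idx)
        return min(ones, len(idx) - ones)

    def solve(itemset, idx):
        # The same itemset is reached by any ordering of its tests, so one
        # cache entry serves every path that leads to it.
        if itemset in cache:
            return cache[itemset]
        best = leaf_error(idx)               # upper bound: stop and predict here
        depth_left = max_depth - len(itemset)
        if depth_left > 0 and best > 0:
            used = {f for f, _ in itemset}
            for f in range(n_features):
                if f in used:
                    continue
                left = tuple(i for i in idx if rows[i][f] == 0)
                right = tuple(i for i in idx if rows[i][f] == 1)
                if not left or not right:
                    continue                 # split does not separate anything
                err = solve(itemset | {(f, 0)}, left)
                if err >= best:
                    continue                 # bound: this split cannot improve
                err += solve(itemset | {(f, 1)}, right)
                if err < best:
                    best = err
                    if best == 0:
                        break                # a perfect subtree cannot be beaten
        cache[itemset] = best
        return best

    return solve(frozenset(), tuple(range(len(rows))))
```

On XOR-style data such as `rows = [(0, 0), (0, 1), (1, 0), (1, 1)]` with `labels = [0, 1, 1, 0]`, a depth limit of 1 leaves two errors while a depth limit of 2 reaches zero. Keying the cache on the set of tests rather than on the path is what lets different branch orderings share work.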

Optimal Decision Trees on Simplicial Complexes

The Electronic Journal of Combinatorics ◽

10.37236/1900 ◽

2005 ◽

Vol 12 (1) ◽

Cited By ~ 1

Author(s):

Jakob Jonsson

Keyword(s):

Decision Tree ◽

Decision Trees ◽

Simplicial Complex ◽

Elementary Theory ◽

Simplicial Complexes ◽

Optimal Decision ◽

Property A ◽

Recursive Definition ◽

Topological Combinatorics ◽

Definition Of

We consider topological aspects of decision trees on simplicial complexes, concentrating on how to use decision trees as a tool in topological combinatorics. By Robin Forman's discrete Morse theory, the number of evasive faces of a given dimension $i$ with respect to a decision tree on a simplicial complex is greater than or equal to the $i$th reduced Betti number (over any field) of the complex. Under certain favorable circumstances, a simplicial complex admits an "optimal" decision tree such that equality holds for each $i$; we may hence read off the homology directly from the tree. We provide a recursive definition of the class of semi-nonevasive simplicial complexes with this property. A certain generalization turns out to yield the class of semi-collapsible simplicial complexes that admit an optimal discrete Morse function in the analogous sense. In addition, we develop some elementary theory about semi-nonevasive and semi-collapsible complexes. Finally, we provide explicit optimal decision trees for several well-known simplicial complexes.
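The bound in this abstract can be written compactly. The notation is an assumption introduced here, not taken from the paper: $e_i(T)$ is the number of evasive faces of dimension $i$ with respect to a decision tree $T$ on a simplicial complex $\Delta$, and $\tilde{\beta}_i(\Delta; \mathbb{F})$ is the $i$th reduced Betti number of $\Delta$ over a field $\mathbb{F}$.

```latex
e_i(T) \;\ge\; \tilde{\beta}_i(\Delta; \mathbb{F}) \quad \text{for every } i \text{ and every field } \mathbb{F},
\qquad
T \text{ is optimal} \iff e_i(T) = \tilde{\beta}_i(\Delta; \mathbb{F}) \text{ for every } i .
```

When the equality case holds, counting the evasive faces of each dimension gives the reduced Betti numbers directly, which is the sense in which the homology can be read off the tree.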

Learning Optimal Decision Trees using Constraint Programming (Extended Abstract)

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/662 ◽

2020 ◽

Cited By ~ 2

Author(s):

Hélène Verhaeghe ◽

Siegfried Nijssen ◽

Gilles Pesant ◽

Claude-Guy Quimper ◽

Pierre Schaus

Keyword(s):

Decision Trees ◽

Constraint Programming ◽

Greedy Algorithms ◽

Building Blocks ◽

Optimal Decision ◽

Programming Approach ◽

Learning Problem ◽

New Approach ◽

Good Classification ◽

Additional Constraints

Decision trees are among the most popular classification models in machine learning. Traditionally, they are learned using greedy algorithms. However, such algorithms have their disadvantages: it is difficult to limit the size of the decision trees while maintaining a good classification accuracy, and it is hard to impose additional constraints on the models that are learned. For these reasons, there has been a recent interest in exact and flexible algorithms for learning decision trees. In this paper, we introduce a new approach to learn decision trees using constraint programming. Compared to earlier approaches, we show that our approach obtains better performance, while still being sufficiently flexible to allow for the inclusion of constraints. Our approach builds on three key building blocks: (1) the use of AND/OR search, (2) the use of caching, (3) the use of the CoverSize global constraint proposed recently for the problem of itemset mining. This allows our constraint programming approach to deal in a much more efficient way with the decompositions in the learning problem.

Learning Optimal Decision Trees with SAT

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/189 ◽

2018 ◽

Cited By ~ 8

Author(s):

Nina Narodytska ◽

Alexey Ignatiev ◽

Filipe Pereira ◽

Joao Marques-Silva

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Decision Trees ◽

Practical Interest ◽

Training Data ◽

Fundamental Importance ◽

Optimal Decision ◽

Past Work ◽

Computational Problem ◽

Natural Mapping

Explanations of machine learning (ML) predictions are of fundamental importance in different settings. Moreover, explanations should be succinct, to enable easy understanding by humans. Decision trees are a frequently used approach for developing explainable ML models, motivated by the natural mapping between decision tree paths and rules. Clearly, smaller trees correlate well with smaller rules, and so one challenge is to devise solutions for computing smallest-size decision trees given training data. Although simple to formulate, the computation of smallest-size decision trees turns out to be an extremely challenging computational problem, for which no practical solutions are known. This paper develops a SAT-based model for computing smallest-size decision trees given training data. In sharp contrast with past work, the proposed SAT model is shown to scale for publicly available datasets of practical interest.

A Building Block Approach to Genetic Programming for Rule Discovery

Data Mining ◽

10.4018/978-1-930708-25-9.ch009 ◽

2011 ◽

pp. 174-190 ◽

Cited By ~ 1

Author(s):

Andries P. Engelbrecht ◽

L. Schoeman ◽

Sonja Rouwhorst

Keyword(s):

Decision Tree ◽

Genetic Programming ◽

Decision Trees ◽

Building Block ◽

Evolutionary Process ◽

Building Blocks ◽

Experimental Results ◽

Rule Discovery ◽

New Approach ◽

Block Approach

Genetic programming has recently been used successfully to extract knowledge in the form of IF-THEN rules. For these genetic programming approaches to knowledge extraction from data, individuals represent decision trees. The main objective of the evolutionary process is therefore to evolve the best decision tree, or classifier, to describe the data. Rules are then extracted, after convergence, from the best individual. The current genetic programming approaches to evolving decision trees are computationally complex, since individuals are initialized to complete decision trees. This chapter discusses a new approach to genetic programming for rule extraction, namely the building block approach. This approach starts with individuals consisting of only one building block, and adds new building blocks during the evolutionary process when the simplicity of the individuals cannot account for the complexity in the underlying data. Experimental results are presented and compared with those of C4.5 and CN2. The chapter shows that the building block approach achieves very good accuracies compared to those of C4.5 and CN2. It is also shown that the building block approach extracts substantially fewer rules.

Action Detection by Fusing Hierarchically Filtered Motion with Spatiotemporal Interest Point Features

Human Behavior Recognition Technologies ◽

10.4018/978-1-4666-3682-8.ch012 ◽

2013 ◽

pp. 249-267

Author(s):

YingLi Tian ◽

Liangliang Cao ◽

Zicheng Liu ◽

Zhengyou Zhang

Keyword(s):

Feature Extraction ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Single Type ◽

Interest Point ◽

New Approach ◽

Action Detection ◽

Branch And Bound Search ◽

New Type ◽

Cluttered Background

This chapter addresses the problem of action detection from cluttered videos. In recent years, many feature extraction schemes have been designed to describe various aspects of actions. However, due to the difficulty of action detection, e.g., the cluttered background and potential occlusions, a single type of feature cannot effectively solve the action detection problems in cluttered videos. In this chapter, the authors propose a new type of feature, Hierarchically Filtered Motion (HFM), and further investigate the fusion of HFM with Spatiotemporal Interest Point (STIP) features for action detection from cluttered videos. In order to effectively and efficiently detect actions, they propose a new approach that combines Gaussian Mixture Models (GMMs) with Branch-and-Bound search to locate actions of interest in cluttered videos. The proposed new HFM features and action detection method have been evaluated on the classical KTH dataset and the challenging MSR Action Dataset II, which consists of crowded videos with moving people or vehicles in the background. Experimental results demonstrate that the proposed method significantly outperforms existing techniques, especially for action detection in crowded videos.

A New Approach Based on Bat Algorithm for Inducing Optimal Decision Trees Classifiers

Information Systems and Technologies to Support Learning - Smart Innovation, Systems and Technologies ◽

10.1007/978-3-030-03577-8_69 ◽

2018 ◽

pp. 631-640 ◽

Cited By ~ 1

Author(s):

Ikram Bida ◽

Saliha Aouat

Keyword(s):

Decision Trees ◽

Bat Algorithm ◽

Optimal Decision ◽

New Approach

Automated Development of Clinical Strategies Using Multistage Decision Analysis

Methods of Information in Medicine ◽

10.1055/s-0038-1635469 ◽

1986 ◽

Vol 25 (04) ◽

pp. 207-214 ◽

Cited By ~ 3

Author(s):

P. Glasziou

Keyword(s):

Decision Tree ◽

Decision Analysis ◽

Decision Trees ◽

Optimal Strategy ◽

Cholestatic Jaundice ◽

Clinical Strategies ◽

Simultaneous Study

Summary: The development of investigative strategies by decision analysis has been achieved by explicitly drawing the decision tree, either by hand or on computer. This paper discusses the feasibility of automatically generating and analysing decision trees from a description of the investigations and the treatment problem. The investigation of cholestatic jaundice is used to illustrate the technique. Methods to decrease the number of calculations required are presented. It is shown that this method makes practical the simultaneous study of at least half a dozen investigations. However, some new problems arise due to the possible complexity of the resulting optimal strategy. If protocol errors and delays due to testing are considered, simpler strategies become desirable. Generation and assessment of these simpler strategies are discussed with examples.

Adaptive Random Decision Tree: A New Approach for Data Mining with Privacy Preserving

International Journal of Innovative Research in Computer and Communication Engineering ◽

10.15680/ijircce.2015.0307004 ◽

2015 ◽

Vol 03 (07) ◽

pp. 6378-6384

Author(s):

Hemlata B. Deorukhakar ◽

Prof. Pradnya Kasture

Keyword(s):

Data Mining ◽

Decision Tree ◽

Privacy Preserving ◽

New Approach

A New Approach for Cardio Vascular Disease Prediction Using Decision Tree

International Journal of Psychosocial Rehabilitation ◽

10.37200/ijpr/v24i5/pr2020796 ◽

2020 ◽

Vol 24 (5) ◽

pp. 7944-7952

Author(s):

Dr. Saravanabhavan C.

Keyword(s):

Decision Tree ◽

Vascular Disease ◽

Disease Prediction ◽

New Approach ◽

Cardio Vascular Disease ◽

Cardio Vascular

A new approach to select the best subset of predictors in linear regression modelling: bi-objective mixed integer linear programming

ANZIAM Journal ◽

10.21914/anziamj.v61i0.12784 ◽

2019 ◽

Vol 61 ◽

pp. 64

Author(s):

Hadi Charkhgard ◽

Ali Eshragh

Keyword(s):

Linear Programming ◽

Linear Regression ◽

Integer Linear Programming ◽

Mixed Integer Linear Programming ◽

Mixed Integer ◽

New Approach ◽

Regression Modelling
