Optimal Decision Trees on Simplicial Complexes

10.37236/1900 ◽  
2005 ◽  
Vol 12 (1) ◽  
Author(s):  
Jakob Jonsson

We consider topological aspects of decision trees on simplicial complexes, concentrating on how to use decision trees as a tool in topological combinatorics. By Robin Forman's discrete Morse theory, the number of evasive faces of a given dimension $i$ with respect to a decision tree on a simplicial complex is greater than or equal to the $i$th reduced Betti number (over any field) of the complex. Under certain favorable circumstances, a simplicial complex admits an "optimal" decision tree such that equality holds for each $i$; we may hence read off the homology directly from the tree. We provide a recursive definition of the class of semi-nonevasive simplicial complexes with this property. A certain generalization turns out to yield the class of semi-collapsible simplicial complexes that admit an optimal discrete Morse function in the analogous sense. In addition, we develop some elementary theory about semi-nonevasive and semi-collapsible complexes. Finally, we provide explicit optimal decision trees for several well-known simplicial complexes.
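
In symbols (a restatement of the claim above; the notation $e_i$ for the number of evasive $i$-dimensional faces is introduced here, not taken from the paper):

```latex
% Morse-theoretic bound: for a decision tree on a simplicial complex \Delta
% and any field \mathbb{F},
e_i(\Delta) \;\ge\; \tilde{\beta}_i(\Delta;\mathbb{F}) \qquad \text{for all } i,
% and the tree is optimal precisely when equality holds in every dimension:
e_i(\Delta) \;=\; \tilde{\beta}_i(\Delta;\mathbb{F}) \qquad \text{for all } i.
```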

10.37236/1245 ◽  
1996 ◽  
Vol 3 (1) ◽  
Author(s):  
Art M. Duval

Björner and Wachs generalized the definition of shellability by dropping the assumption of purity; they also introduced the $h$-triangle, a doubly-indexed generalization of the $h$-vector which is combinatorially significant for nonpure shellable complexes. Stanley subsequently defined a nonpure simplicial complex to be sequentially Cohen-Macaulay if it satisfies algebraic conditions that generalize the Cohen-Macaulay conditions for pure complexes, so that a nonpure shellable complex is sequentially Cohen-Macaulay. We show that algebraic shifting preserves the $h$-triangle of a simplicial complex $K$ if and only if $K$ is sequentially Cohen-Macaulay. This generalizes a result of Kalai's for the pure case. Immediate consequences include that nonpure shellable complexes and sequentially Cohen-Macaulay complexes have the same set of possible $h$-triangles.
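
Stated compactly (with $\Delta(K)$ for the algebraic shifting of $K$ and $h_{i,j}$ for the entries of the $h$-triangle; this notation is introduced here):

```latex
% Duval's equivalence, restated:
h_{i,j}(\Delta(K)) = h_{i,j}(K) \ \text{for all } i, j
\quad\Longleftrightarrow\quad
K \ \text{is sequentially Cohen--Macaulay}.
```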


Author(s):  
Nina Narodytska ◽  
Alexey Ignatiev ◽  
Filipe Pereira ◽  
Joao Marques-Silva

Explanations of machine learning (ML) predictions are of fundamental importance in many settings. Moreover, explanations should be succinct, to enable easy understanding by humans. Decision trees represent an often-used approach for developing explainable ML models, motivated by the natural mapping between decision tree paths and rules. Since smaller trees correlate well with smaller rules, one challenge is to devise solutions for computing smallest-size decision trees given training data. Although simple to formulate, computing smallest-size decision trees turns out to be an extremely challenging computational problem, for which no practical solutions are known. This paper develops a SAT-based model for computing smallest-size decision trees given training data. In sharp contrast with past work, the proposed SAT model is shown to scale to publicly available datasets of practical interest.
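
The paper's SAT encoding is not reproduced here; as a stand-in, the following brute-force sketch illustrates the optimization problem itself (minimize the number of test nodes subject to perfect consistency with the training data) on binary features. All names and the toy data are ours.

```python
from functools import lru_cache

# Toy training data: binary feature vectors with 0/1 labels (an XOR-like concept).
DATA = (((0, 0, 1), 0), ((0, 1, 1), 1), ((1, 0, 0), 1), ((1, 1, 0), 0))
N_FEATURES = 3
INF = float("inf")

@lru_cache(maxsize=None)
def smallest_tree_size(samples, used=frozenset()):
    """Minimum number of internal (test) nodes of a decision tree that
    perfectly classifies `samples`; INF if no consistent tree exists."""
    labels = {lab for _, lab in samples}
    if len(labels) <= 1:              # pure (or empty) subset -> a single leaf
        return 0
    best = INF
    for f in range(N_FEATURES):
        if f in used:
            continue
        left = tuple(s for s in samples if s[0][f] == 0)
        right = tuple(s for s in samples if s[0][f] == 1)
        if not left or not right:     # this feature does not split the subset
            continue
        best = min(best, 1 + smallest_tree_size(left, used | {f})
                           + smallest_tree_size(right, used | {f}))
    return best

print(smallest_tree_size(DATA))       # -> 3 (a root test plus one test per side)
```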


2020 ◽  
Vol 34 (04) ◽  
pp. 3146-3153
Author(s):  
Gaël Aglin ◽  
Siegfried Nijssen ◽  
Pierre Schaus

Several recent publications have studied the use of Mixed Integer Programming (MIP) for finding an optimal decision tree, that is, the best decision tree under formal requirements on accuracy, fairness or interpretability of the predictive model. These publications used MIP to deal with the hard computational challenge of finding such trees. In this paper, we introduce a new efficient algorithm, DL8.5, for finding optimal decision trees, based on the use of itemset mining techniques. We show that this new approach outperforms earlier approaches by several orders of magnitude, for both numerical and discrete data, and is generic as well. The key idea underlying this new approach is the use of a cache of itemsets in combination with branch-and-bound search; this new type of cache also stores results for parts of the search space that have been traversed only partially.
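
A minimal sketch of the key idea as stated: a recursive search over itemsets (conjunctions of attribute-value tests), with a cache keyed by the itemset so that different orderings of the same tests share one result. The paper's bounds and its caching of partially traversed subtrees are omitted; names and data are ours.

```python
# Itemset-cached search for a depth-limited, error-minimizing decision tree,
# in the spirit of DL8.5 (simplified).
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
N_FEATURES = 2
cache = {}  # itemset (frozenset of (feature, value) tests) -> (error, tree)

def cover(itemset):
    return [s for s in DATA if all(s[0][f] == v for f, v in itemset)]

def leaf_error(samples):
    counts = {}
    for _, lab in samples:
        counts[lab] = counts.get(lab, 0) + 1
    return len(samples) - max(counts.values(), default=0)

def best_tree(itemset, depth):
    if itemset in cache:                 # reuse results across permuted paths
        return cache[itemset]
    samples = cover(itemset)
    best = (leaf_error(samples), "leaf") # baseline: stop here with a leaf
    if depth > 0:
        for f in range(N_FEATURES):
            if any(t[0] == f for t in itemset):
                continue
            e0, t0 = best_tree(itemset | {(f, 0)}, depth - 1)
            e1, t1 = best_tree(itemset | {(f, 1)}, depth - 1)
            if e0 + e1 < best[0]:
                best = (e0 + e1, (f, t0, t1))
    cache[itemset] = best
    return best

print(best_tree(frozenset(), depth=2))   # -> zero training error with a depth-2 tree
```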


Entropy ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. 1682
Author(s):  
Wojciech Wieczorek ◽  
Jan Kozak ◽  
Łukasz Strąk ◽  
Arkadiusz Nowakowski

A new two-stage method for the construction of a decision tree is developed. The first stage is based on the definition of a minimum query set, which is the smallest set of attribute-value pairs for which any two objects can be distinguished. To obtain this set, an appropriate linear programming model is proposed. The queries from this set are the building blocks of the second stage, in which we try to find an optimal decision tree using a genetic algorithm. In a series of experiments, we show that for some databases, our approach should be considered an alternative to classical methods (CART, C4.5) and other heuristic approaches in terms of classification quality.
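
The abstract does not spell out the model; one natural reading is a set-cover-style integer program with a binary variable per attribute-value query and, for every pair of objects, a constraint that some chosen query distinguishes them. A sketch with PuLP, under that assumption:

```python
from itertools import combinations
import pulp

# Toy objects: categorical attribute vectors encoded as small ints.
OBJECTS = [(0, 1, 0), (0, 0, 1), (1, 1, 1), (1, 0, 0)]
QUERIES = sorted({(a, obj[a]) for obj in OBJECTS for a in range(len(obj))})

prob = pulp.LpProblem("minimum_query_set", pulp.LpMinimize)
use = {q: pulp.LpVariable(f"use_{q[0]}_{q[1]}", cat="Binary") for q in QUERIES}
prob += pulp.lpSum(use.values())  # objective: as few queries as possible

# A query (a, v) distinguishes x and y iff exactly one of them has value v on a.
for x, y in combinations(OBJECTS, 2):
    distinguishing = [q for q in QUERIES
                      if (x[q[0]] == q[1]) != (y[q[0]] == q[1])]
    prob += pulp.lpSum(use[q] for q in distinguishing) >= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([q for q in QUERIES if use[q].value() == 1])  # a minimum query set
```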


1986 ◽  
Vol 25 (04) ◽  
pp. 207-214 ◽  
Author(s):  
P. Glasziou

Summary: The development of investigative strategies by decision analysis has been achieved by explicitly drawing the decision tree, either by hand or on computer. This paper discusses the feasibility of automatically generating and analysing decision trees from a description of the investigations and the treatment problem. The investigation of cholestatic jaundice is used to illustrate the technique. Methods to decrease the number of calculations required are presented. It is shown that this method makes practical the simultaneous study of at least half a dozen investigations. However, some new problems arise due to the possible complexity of the resulting optimal strategy. If protocol errors and delays due to testing are considered, simpler strategies become desirable. Generation and assessment of these simpler strategies are discussed with examples.
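
The calculation at the heart of analysing such trees is the standard decision-analytic fold-back: average over chance nodes, maximize over decision nodes. A generic sketch (not the paper's algorithm; the structure and numbers are illustrative):

```python
# Fold-back (rollback) of a decision tree: chance nodes average over outcome
# probabilities, decision nodes take the best available action.
def rollback(node):
    kind = node["type"]
    if kind == "leaf":
        return node["utility"]
    if kind == "chance":
        return sum(p * rollback(child) for p, child in node["branches"])
    if kind == "decision":
        return max(rollback(child) for _, child in node["options"])
    raise ValueError(f"unknown node type: {kind}")

# Illustrative fragment: treat now, or run a test first and act on its result.
tree = {"type": "decision", "options": [
    ("treat_now", {"type": "chance", "branches": [
        (0.6, {"type": "leaf", "utility": 0.8}),
        (0.4, {"type": "leaf", "utility": 0.3})]}),
    ("test_first", {"type": "chance", "branches": [
        (0.5, {"type": "leaf", "utility": 0.9}),
        (0.5, {"type": "leaf", "utility": 0.5})]}),
]}
print(rollback(tree))  # ~0.7: testing first is the better strategy here
```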


2021 ◽  
Vol 11 (15) ◽  
pp. 6728
Author(s):  
Muhammad Asfand Hafeez ◽  
Muhammad Rashid ◽  
Hassan Tariq ◽  
Zain Ul Abideen ◽  
Saud S. Alotaibi ◽  
...  

Classification and regression are the major applications of machine learning algorithms, which are widely used to solve problems in numerous domains of engineering and computer science. Different classifiers based on the optimization of decision trees have been proposed; however, they are still evolving over time. This paper presents a novel and robust classifier based on decision tree and tabu search algorithms. With the aim of improving performance, our proposed algorithm constructs multiple decision trees while employing a tabu search algorithm to consistently monitor the leaf and decision nodes in the corresponding decision trees. Additionally, the tabu search algorithm is responsible for balancing the entropy of the corresponding decision trees. For training the model, we used the clinical data of COVID-19 patients to predict whether a patient is suffering from the disease. The experimental results were obtained using our proposed classifier, implemented with the scikit-learn library in Python. An extensive performance comparison is presented using Big O notation and statistical analysis against conventional supervised machine learning algorithms. Moreover, a performance comparison with optimized state-of-the-art classifiers is also presented. The achieved accuracy of 98%, an execution time of 55.6 ms, and an area under the receiver operating characteristic curve (AUROC) of 0.95 reveal that the proposed classifier algorithm is suitable for large datasets.
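
The abstract gives few algorithmic details, so the following is a generic tabu-search skeleton of the kind such a classifier could employ, applied to a toy balance objective as a stand-in for balancing entropy across decision nodes. It is our illustration, not the authors' algorithm.

```python
import random

def tabu_search(initial, cost, neighbors, iters=100, tabu_len=10):
    """Generic tabu search: always move to the best non-tabu neighbor,
    keeping a short memory of visited solutions to escape local minima."""
    current = best = initial
    tabu = [initial]
    for _ in range(iters):
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break
        current = min(candidates, key=cost)
        tabu.append(current)
        if len(tabu) > tabu_len:
            tabu.pop(0)                  # expire the oldest tabu entry
        if cost(current) < cost(best):
            best = current
    return best

# Toy use: find a bit-vector with balanced ones (a stand-in objective).
def flip_neighbors(bits):
    return [bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(len(bits))]

random.seed(0)
start = tuple(random.randint(0, 1) for _ in range(12))
result = tabu_search(start, cost=lambda b: abs(sum(b) - len(b) // 2),
                     neighbors=flip_neighbors)
print(result, sum(result))              # 6 ones out of 12 -> cost 0
```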


2021 ◽  
Vol 54 (1) ◽  
pp. 1-38
Author(s):  
Víctor Adrián Sosa Hernández ◽  
Raúl Monroy ◽  
Miguel Angel Medina-Pérez ◽  
Octavio Loyola-González ◽  
Francisco Herrera

Experts from different domains have resorted to machine learning techniques to produce explainable models that support decision-making. Among existing techniques, decision trees have been useful in many application domains for classification, since they can express decisions in a language that is closer to that of the experts. Many researchers have attempted to create better decision tree models by improving the components of the induction algorithm. One of the main components that has been studied and improved is the evaluation measure for candidate splits. In this article, we first give a tutorial that explains decision tree induction. Then, we present an experimental framework to assess the performance of 21 evaluation measures that produce different C4.5 variants, considering 110 databases, two performance measures, and 10×10-fold cross-validation. Furthermore, we compare and rank the evaluation measures using a Bayesian statistical analysis. From our experimental results, we present the first two performance rankings of C4.5 variants in the literature. Moreover, we organize the evaluation measures into two groups according to their performance. Finally, we introduce meta-models that automatically determine the group of evaluation measures from which to produce a C4.5 variant for a new database, and we point out some further opportunities for decision tree models.
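
As a concrete instance of such an evaluation measure, C4.5's own criterion is the gain ratio: information gain normalized by the intrinsic information of the split. A small self-contained sketch:

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def gain_ratio(samples, feature):
    """samples: list of (feature_dict, label). Returns C4.5's gain ratio
    for a categorical split on `feature`."""
    labels = [lab for _, lab in samples]
    partitions = {}
    for feats, lab in samples:
        partitions.setdefault(feats[feature], []).append(lab)
    n = len(samples)
    split_entropy = sum(len(p) / n * entropy(p) for p in partitions.values())
    gain = entropy(labels) - split_entropy          # information gain
    intrinsic = entropy([feats[feature] for feats, _ in samples])
    return gain / intrinsic if intrinsic > 0 else 0.0

data = [({"outlook": "sunny"}, "no"), ({"outlook": "sunny"}, "no"),
        ({"outlook": "overcast"}, "yes"), ({"outlook": "rain"}, "yes")]
print(gain_ratio(data, "outlook"))  # ~0.667: gain 1.0 over intrinsic info 1.5
```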


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2849
Author(s):  
Sungbum Jun

Due to recent advances in the industrial Internet of Things (IoT) in manufacturing, the vast amount of data from sensors has triggered the need to leverage such big data for fault detection. In particular, interpretable machine learning techniques, such as tree-based algorithms, have drawn attention for implementing reliable manufacturing systems and identifying the root causes of faults. However, tree-based models still trade off accuracy against interpretability. In order to improve the tree's performance while maintaining its interpretability, an evolutionary algorithm for the discretization of multiple attributes, called Decision tree Improved by Multiple sPLits with Evolutionary algorithm for Discretization (DIMPLED), is proposed. The experimental results with two real-world datasets from sensors showed that the decision tree improved by DIMPLED outperformed the single-decision-tree models (C4.5 and CART) that are widely used in practice, and proved competitive with ensemble methods, which combine multiple decision trees. Even though the ensemble methods could produce slightly better performance, the proposed DIMPLED has a more interpretable structure while maintaining an appropriate performance level.
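
DIMPLED's internals are not given in the abstract; the sketch below shows the general shape of an evolutionary search over cut points for discretizing a numeric attribute, scored by a simple bin-purity objective. It is our illustration, not the published algorithm.

```python
import random

VALUES = [1.0, 2.1, 2.9, 4.2, 5.5, 6.1, 7.8, 9.0]
LABELS = [0, 0, 0, 1, 1, 1, 0, 0]

def fitness(cuts):
    """Majority-label count summed over the bins the cuts induce
    (higher = purer discretization)."""
    bins = {}
    for v, lab in zip(VALUES, LABELS):
        b = sum(v > c for c in cuts)          # index of the bin v falls into
        bins.setdefault(b, []).append(lab)
    return sum(max(labs.count(0), labs.count(1)) for labs in bins.values())

def mutate(cuts):
    i = random.randrange(len(cuts))
    return tuple(sorted(c + random.gauss(0, 0.5) if j == i else c
                        for j, c in enumerate(cuts)))

random.seed(1)
pop = [tuple(sorted(random.uniform(1, 9) for _ in range(2))) for _ in range(20)]
for _ in range(50):                           # evolve: keep the fittest, mutate
    pop = sorted(pop, key=fitness, reverse=True)[:10]
    pop += [mutate(random.choice(pop)) for _ in range(10)]
best = max(pop, key=fitness)
print(best, fitness(best))                    # two cuts isolating the middle block
```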


2014 ◽  
Vol 6 (4) ◽  
pp. 346 ◽  
Author(s):  
Swathi Jamjala Narayanan ◽  
Rajen B. Bhatt ◽  
Ilango Paramasivam ◽  
M. Khalid ◽  
B.K. Tripathy

2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Xiangkui Jiang ◽  
Chang-an Wu ◽  
Huaping Guo

A forest is an ensemble with decision trees as members. This paper proposes a novel strategy for pruning a forest to enhance ensemble generalization ability and reduce ensemble size. Unlike conventional ensemble pruning approaches, the proposed method evaluates the importance of branches of trees with respect to the whole ensemble, using a newly proposed metric called importance gain. The importance of a branch is defined by considering ensemble accuracy and the diversity of ensemble members, and thus the metric reasonably evaluates how much improvement in ensemble accuracy can be achieved when a branch is pruned. Our experiments show that the proposed method can significantly reduce ensemble size and improve ensemble accuracy, regardless of whether ensembles are constructed by an algorithm such as bagging or obtained by an ensemble selection algorithm, and regardless of whether each decision tree is pruned or unpruned.
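
The abstract describes the mechanism at a high level; the sketch below illustrates the evaluation step it implies: tentatively collapse a branch into a leaf and measure the change in ensemble accuracy (the diversity term is omitted; all names are ours).

```python
# Tiny dict-based trees; the ensemble predicts by majority vote.
def predict(node, x):
    while "feature" in node:
        node = node["left"] if x[node["feature"]] == 0 else node["right"]
    return node["label"]

def ensemble_accuracy(trees, data):
    hits = 0
    for x, y in data:
        votes = [predict(t, x) for t in trees]
        hits += (max(set(votes), key=votes.count) == y)
    return hits / len(data)

def importance_gain(trees, branch, data, majority_label):
    """Accuracy change if `branch` (a subtree of some member of `trees`) is
    replaced by a majority-label leaf: positive -> pruning helps the ensemble."""
    before = ensemble_accuracy(trees, data)
    saved = dict(branch)
    branch.clear()
    branch["label"] = majority_label      # collapse the branch into a leaf
    after = ensemble_accuracy(trees, data)
    branch.clear()
    branch.update(saved)                  # restore the original branch
    return after - before

t1 = {"feature": 0,
      "left": {"label": 0},
      "right": {"feature": 1, "left": {"label": 1}, "right": {"label": 0}}}
t2, t3 = {"label": 0}, {"label": 1}       # trivial stump members
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 1)]
# Collapsing t1's noisy right branch into a majority leaf fixes (1, 1):
print(importance_gain([t1, t2, t3], t1["right"], data, majority_label=1))  # 0.25
```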

