Differential Evolution and Perceptron Decision Trees for Classification Tasks

Author(s):  
R. A. Lopes ◽  
A. R. R. Freitas ◽  
R. C. Pedrosa Silva ◽  
Frederico Gadelha Guimarães
Author(s):  
Gaël Aglin ◽  
Siegfried Nijssen ◽  
Pierre Schaus

Decision Trees (DTs) are widely used Machine Learning (ML) models with a broad range of applications. The interest in these models has increased even further in the context of Explainable AI (XAI), as decision trees of limited depth are very interpretable models. However, traditional algorithms for learning DTs are heuristic in nature; they may produce trees that are of suboptimal quality under depth constraints. We introduce PyDL8.5, a Python library to infer depth-constrained Optimal Decision Trees (ODTs). PyDL8.5 provides an interface for DL8.5, an efficient algorithm for inferring depth-constrained ODTs. The library provides an easy-to-use scikit-learn compatible interface. It can be used not only for classification tasks, but also for regression, clustering, and other tasks. We introduce an interface that allows users to easily implement these other learning tasks. We provide a number of examples of how to use this library.
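The gap between heuristic and optimal depth-constrained trees mentioned in this abstract can be illustrated with a small sketch. The code below is not the DL8.5 algorithm and does not use the PyDL8.5 API; it is a brute-force search over binary features, written only to show that a greedy split choice can be suboptimal under a depth limit. The function names (`leaf_error`, `split`, `optimal_error`, `greedy_error`) and the toy dataset are assumptions made for this illustration.

```python
from collections import Counter

def leaf_error(labels):
    """Training errors made by a leaf that predicts its majority class."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]

def split(rows, labels, f):
    """Partition (rows, labels) on binary feature f into value-0 and value-1 parts."""
    left = [(r, y) for r, y in zip(rows, labels) if r[f] == 0]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] == 1]
    unzip = lambda part: ([r for r, _ in part], [y for _, y in part])
    return unzip(left), unzip(right)

def optimal_error(rows, labels, depth):
    """Minimum training error over all trees of depth <= `depth` (exhaustive)."""
    best = leaf_error(labels)
    if depth == 0 or best == 0:
        return best
    for f in range(len(rows[0])):
        (lr, ll), (rr, rl) = split(rows, labels, f)
        if not ll or not rl:
            continue  # feature does not separate anything here
        err = optimal_error(lr, ll, depth - 1) + optimal_error(rr, rl, depth - 1)
        best = min(best, err)
    return best

def greedy_error(rows, labels, depth):
    """Training error of a tree grown by always taking the locally best split."""
    base = leaf_error(labels)
    if depth == 0 or base == 0:
        return base
    best_f, best_e = None, None
    for f in range(len(rows[0])):
        (_, ll), (_, rl) = split(rows, labels, f)
        if not ll or not rl:
            continue
        e = leaf_error(ll) + leaf_error(rl)  # error right after this one split
        if best_e is None or e < best_e:
            best_f, best_e = f, e
    if best_f is None:
        return base
    (lr, ll), (rr, rl) = split(rows, labels, best_f)
    return greedy_error(lr, ll, depth - 1) + greedy_error(rr, rl, depth - 1)

# Toy data: the label is x0 XOR x1; x2 is a noisy copy of the label.
ROWS = [(0, 0, 0), (0, 0, 0), (0, 1, 1), (0, 1, 1),
        (1, 0, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
LABELS = [0, 0, 1, 1, 1, 1, 0, 0]

print("optimal, depth 2:", optimal_error(ROWS, LABELS, 2))
print("greedy,  depth 2:", greedy_error(ROWS, LABELS, 2))
```

On this toy data the greedy procedure splits first on the noisy feature `x2`, because neither `x0` nor `x1` reduces the error on its own, and it cannot recover within depth 2. The exhaustive search finds the tree that tests `x0` and then `x1`. Exhaustive search is exponential in depth and is practical only for tiny inputs, which is why efficient dedicated algorithms for this problem are of interest.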


1997 ◽  
Vol 12 (01) ◽  
pp. 1-40 ◽  
Author(s):  
LEONARD A. BRESLOW ◽  
DAVID W. AHA

Induced decision trees are an extensively researched solution to classification tasks. For many practical tasks, the trees produced by tree-generation algorithms are not comprehensible to users due to their size and complexity. Although many tree induction algorithms have been shown to produce simpler, more comprehensible trees (or data structures derived from trees) with good classification accuracy, tree simplification has usually been of secondary concern relative to accuracy, and no attempt has been made to survey the literature from the perspective of simplification. We present a framework that organizes the approaches to tree simplification and summarize and critique the approaches within this framework. The purpose of this survey is to provide researchers and practitioners with a concise overview of tree-simplification approaches and insight into their relative capabilities. In our final discussion, we briefly describe some empirical findings and discuss the application of tree induction algorithms to case retrieval in case-based reasoning systems.


2012 ◽  
pp. 414-427 ◽  
Author(s):  
Marco Vannucci ◽  
Valentina Colla ◽  
Silvia Cateni ◽  
Mirko Sgarbi

This chapter presents a survey on the problem of classification tasks in unbalanced datasets. The effect of the imbalance of the distribution of target classes in databases is analyzed with respect to the performance of standard classifiers such as decision trees and support vector machines, and the main approaches to improve the generally unsatisfactory results obtained by such methods are described. Finally, two typical applications coming from real-world frameworks are introduced, and the uses of the techniques employed for the related classification tasks are shown in practice.


Author(s):  
Maxime De Bois ◽  
Mounîm A. El Yacoubi ◽  
Mehdi Ammi

The adoption of deep learning in healthcare is hindered by its “black box” nature. In this paper, we explore the RETAIN architecture for the task of glucose forecasting for diabetic people. By using a two-level attention mechanism, the recurrent-neural-network-based RETAIN model is interpretable. We evaluate the RETAIN model on the type-2 IDIAB and the type-1 OhioT1DM datasets by comparing its statistical and clinical performances against two deep models and three models based on decision trees. We show that the RETAIN model offers a very good compromise between accuracy and interpretability, being almost as accurate as the LSTM and FCN models while remaining interpretable. We show the usefulness of its interpretable nature by analyzing the contribution of each variable to the final prediction. This analysis revealed that signal values older than 1 h are not used by the RETAIN model for the prediction of glucose 30 min ahead. Also, we show how the RETAIN model changes its behavior upon the arrival of an event such as a carbohydrate intake or an insulin infusion. In particular, it showed that the patient’s state before the event is particularly important for the prediction. Overall, the RETAIN model, thanks to its interpretability, seems to be a very promising model for regression or classification tasks in healthcare.


2013 ◽  
Vol 12 ◽  
pp. CIN.S10356 ◽  
Author(s):  
Benjamin Ulfenborg ◽  
Karin Klinga-Levan ◽  
Björn Olsson

We present a novel machine learning approach for the classification of cancer samples using expression data. We refer to the method as “decision trunks,” since it is loosely based on decision trees, but contains several modifications designed to achieve an algorithm that: (1) produces smaller and more easily interpretable classifiers than decision trees; (2) is more robust in varying application scenarios; and (3) achieves higher classification accuracy. The decision trunk algorithm has been implemented and tested on 26 classification tasks, covering a wide range of cancer forms, experimental methods, and classification scenarios. This comprehensive evaluation indicates that the proposed algorithm performs at least as well as the current state of the art algorithms in terms of accuracy, while producing classifiers that include on average only 2–3 markers. We suggest that the resulting decision trunks have clear advantages over other classifiers due to their transparency, interpretability, and their correspondence with human decision-making and clinical testing practices.


2020 ◽  
Vol 198 ◽  
pp. 105922 ◽  
Author(s):  
Jianjian Yan ◽  
Zhongnan Zhang ◽  
Kunhui Lin ◽  
Fan Yang ◽  
Xiongbiao Luo
