PyDL8.5: a Library for Learning Optimal Decision Trees

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/750 ◽

2020 ◽

Author(s):

Gaël Aglin ◽

Siegfried Nijssen ◽

Pierre Schaus

Keyword(s):

Machine Learning ◽

Decision Trees ◽

Efficient Algorithm ◽

Optimal Decision ◽

Learning Tasks ◽

Explainable AI ◽

Classification Tasks ◽

Interpretable Models ◽

Limited Depth

Decision Trees (DTs) are widely used Machine Learning (ML) models with a broad range of applications. Interest in these models has increased even further in the context of Explainable AI (XAI), as decision trees of limited depth are highly interpretable. However, traditional algorithms for learning DTs are heuristic in nature; they may produce trees of suboptimal quality under depth constraints. We introduce PyDL8.5, a Python library to infer depth-constrained Optimal Decision Trees (ODTs). PyDL8.5 provides an interface for DL8.5, an efficient algorithm for inferring depth-constrained ODTs. The library provides an easy-to-use scikit-learn compatible interface. It can be used not only for classification tasks, but also for regression, clustering, and other tasks. We introduce an interface that allows users to easily implement these other learning tasks. We provide a number of examples of how to use this library.
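To make the gap between heuristic and optimal depth-constrained trees concrete, here is a minimal sketch of an exhaustive search over binary features. It is not the DL8.5 algorithm and does not use the PyDL8.5 API; the names `majority_and_error`, `best_tree` and `predict` are hypothetical helpers introduced only for illustration. On XOR-labelled data no single split reduces the training error, so a gain-driven greedy learner has no reason to split, while the exhaustive search finds a perfect tree of depth 2.

```python
from collections import Counter


def majority_and_error(labels):
    """Return (majority label, number of misclassified rows) for a leaf."""
    if not labels:
        return None, 0
    label, count = Counter(labels).most_common(1)[0]
    return label, len(labels) - count


def best_tree(rows, labels, depth):
    """Exhaustively find a tree of at most `depth` splits with minimal training error.

    Rows are tuples of 0/1 features. Returns (error, tree), where tree is
    ("leaf", label) or ("split", feature_index, left_subtree, right_subtree).
    This is a brute-force illustration, not DL8.5 (no caching, no bounds).
    """
    label, error = majority_and_error(labels)
    best = (error, ("leaf", label))
    if depth == 0 or error == 0:
        return best
    for f in range(len(rows[0])):
        left = [i for i, r in enumerate(rows) if r[f] == 0]
        right = [i for i, r in enumerate(rows) if r[f] == 1]
        if not left or not right:
            continue  # this feature does not separate the rows
        l_err, l_tree = best_tree(
            [rows[i] for i in left], [labels[i] for i in left], depth - 1
        )
        r_err, r_tree = best_tree(
            [rows[i] for i in right], [labels[i] for i in right], depth - 1
        )
        if l_err + r_err < best[0]:
            best = (l_err + r_err, ("split", f, l_tree, r_tree))
    return best


def predict(tree, row):
    """Follow the splits until a leaf is reached and return its label."""
    while tree[0] == "split":
        _, f, left, right = tree
        tree = right if row[f] == 1 else left
    return tree[1]


# XOR: the label is 1 exactly when the two features differ.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

print(best_tree(rows, labels, 1)[0])  # 2 errors: no single split helps
print(best_tree(rows, labels, 2)[0])  # 0 errors: depth 2 separates XOR
```

The search is exponential in the depth, which is why the abstract stresses an efficient algorithm for this problem.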

mAML: an automated machine learning pipeline with a microbiome repository for human disease classification

10.1101/2020.02.11.943316 ◽

2020 ◽

Author(s):

Fenglong Yang ◽

Quan Zou

Keyword(s):

Machine Learning ◽

Human Disease ◽

High Performance ◽

Model Building ◽

Disease Classification ◽

Benchmark Datasets ◽

Automated Machine Learning ◽

Classification Tasks ◽

Interpretable Models ◽

Multi Class Classification

Due to concerted efforts to use microbial features to improve disease prediction, automated machine learning (AutoML) systems designed to remove the tedium of performing ML tasks manually are in great demand. Here we developed mAML, an ML model-building pipeline, which can automatically and rapidly generate optimized and interpretable models for personalized microbial classification tasks in a reproducible way. The pipeline is deployed on a web-based platform, and the server is user-friendly, flexible, and designed to scale according to specific requirements. The pipeline exhibits high performance on 13 benchmark datasets including both binary and multi-class classification tasks. In addition, to facilitate the application of mAML and expand the human disease-related microbiome learning repository, we developed the GMrepo ML repository (GMrepo Microbiome Learning repository) from the GMrepo database. The repository comprises 120 microbial classification tasks for 85 human-disease phenotypes, covering 12,429 metagenomic samples and 38,643 amplicon samples. The mAML pipeline and the GMrepo ML repository are expected to be important resources for research in microbiology and algorithm development.

Database URL: http://39.100.246.211:8050/Home

Learning Optimal Decision Trees with SAT

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/189 ◽

2018 ◽

Cited By ~ 8

Author(s):

Nina Narodytska ◽

Alexey Ignatiev ◽

Filipe Pereira ◽

Joao Marques-Silva

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Decision Trees ◽

Practical Interest ◽

Training Data ◽

Fundamental Importance ◽

Optimal Decision ◽

Past Work ◽

Computational Problem ◽

Natural Mapping

Explanations of machine learning (ML) predictions are of fundamental importance in different settings. Moreover, explanations should be succinct, to enable easy understanding by humans. Decision trees represent an often used approach for developing explainable ML models, motivated by the natural mapping between decision tree paths and rules. Clearly, smaller trees correlate well with smaller rules, and so one challenge is to devise solutions for computing smallest size decision trees given training data. Although simple to formulate, the computation of smallest size decision trees turns out to be an extremely challenging computational problem, for which no practical solutions are known. This paper develops a SAT-based model for computing smallest-size decision trees given training data. In sharp contrast with past work, the proposed SAT model is shown to scale for publicly available datasets of practical interest.

Measuring directed triadic closure with closure coefficients

Network Science ◽

10.1017/nws.2020.20 ◽

2020 ◽

Vol 8 (4) ◽

pp. 551-573 ◽

Cited By ~ 1

Author(s):

Hao Yin ◽

Austin R. Benson ◽

Johan Ugander

Keyword(s):

Machine Learning ◽

Recent Work ◽

Real World ◽

Directed Graphs ◽

Configuration Model ◽

Undirected Graphs ◽

Learning Tasks ◽

Triadic Closure ◽

Interpretable Models

Recent work studying triadic closure in undirected graphs has drawn attention to the distinction between measures that focus on the “center” node of a wedge (i.e., length-2 path) versus measures that focus on the “initiator,” a distinction with considerable consequences. Existing measures in directed graphs, meanwhile, have all been center-focused. In this work, we propose a family of eight directed closure coefficients that measure the frequency of triadic closure in directed graphs from the perspective of the node initiating closure. The eight coefficients correspond to different labeled wedges, where the initiator and center nodes are labeled, and we observe dramatic empirical variation in these coefficients on real-world networks, even in cases when the induced directed triangles are isomorphic. To understand this phenomenon, we examine the theoretical behavior of our closure coefficients under a directed configuration model. Our analysis illustrates an underlying connection between the closure coefficients and moments of the joint in- and out-degree distributions of the network, offering an explanation of the observed asymmetries. We also use our directed closure coefficients as predictors in two machine learning tasks. We find interpretable models with AUC scores above 0.92 in class-balanced binary prediction, substantially outperforming models that use traditional center-focused measures.
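As a rough illustration of an initiator-focused measure, the sketch below computes, for one node u, the fraction of directed wedges u → v → w that are closed by an edge u → w. This is only one plausible variant chosen for illustration; the abstract does not spell out the eight definitions, so the wedge type, the closing-edge direction, and the function name `initiator_closure` are all assumptions rather than the authors' formulation.

```python
from collections import defaultdict


def initiator_closure(edges, u):
    """Fraction of wedges u -> v -> w (w != u) closed by a direct edge u -> w.

    `edges` is an iterable of (source, target) pairs without self-loops.
    Returns 0.0 when u initiates no wedge. Illustrative variant only.
    """
    out = defaultdict(set)
    for src, dst in edges:
        out[src].add(dst)

    wedges = 0
    closed = 0
    for v in out.get(u, ()):
        for w in out.get(v, ()):
            if w == u:
                continue  # a path back to u is not a wedge with a distinct tail
            wedges += 1
            if w in out[u]:
                closed += 1
    return closed / wedges if wedges else 0.0


# Node "a" initiates two wedges: a->b->c (closed by a->c) and a->b->d (open).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("b", "d")]
print(initiator_closure(edges, "a"))  # 0.5
```

A center-focused measure would instead count wedges in which u is the middle node, which is the distinction the abstract draws.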

mAML: an automated machine learning pipeline with a microbiome repository for human disease classification

Database ◽

10.1093/database/baaa050 ◽

2020 ◽

Vol 2020 ◽

Author(s):

Fenglong Yang ◽

Quan Zou

Keyword(s):

Machine Learning ◽

Human Disease ◽

High Performance ◽

Model Building ◽

Disease Classification ◽

Benchmark Datasets ◽

Automated Machine Learning ◽

Classification Tasks ◽

Interpretable Models ◽

Multi Class Classification

Due to concerted efforts to use microbial features to improve disease prediction, automated machine learning (AutoML) systems that aim to remove the tedium of performing ML tasks manually are in great demand. Here we developed mAML, an ML model-building pipeline, which can automatically and rapidly generate optimized and interpretable models for personalized microbiome-based classification tasks in a reproducible way. The pipeline is deployed on a web-based platform, and the server is user-friendly, flexible, and designed to scale according to specific requirements. The pipeline exhibits high performance on 13 benchmark datasets including both binary and multi-class classification tasks. In addition, to facilitate the application of mAML and expand the human disease-related microbiome learning repository, we developed the GMrepo ML repository (GMrepo Microbiome Learning repository) from the GMrepo database. The repository comprises 120 microbiome-based classification tasks for 85 human-disease phenotypes, covering 12,429 metagenomic samples and 38,643 amplicon samples. The mAML pipeline and the GMrepo ML repository are expected to be important resources for research in microbiology and algorithm development.

Database URL: http://lab.malab.cn/soft/mAML

Automatic Modeling of Logic Device Performance Based on Machine Learning and Explainable AI

2020 International Conference on Simulation of Semiconductor Processes and Devices (SISPAD) ◽

10.23919/sispad49475.2020.9241681 ◽

2020 ◽

Author(s):

Seungju Kim ◽

Kwangseok Lee ◽

Hyeon-Kyun Noh ◽

Youngkyu Shin ◽

Kyu-Baik Chang ◽

...

Keyword(s):

Machine Learning ◽

Device Performance ◽

Logic Device ◽

Automatic Modeling ◽

Explainable AI

Understanding Machine Learning for Diversified Portfolio Construction by Explainable AI

SSRN Electronic Journal ◽

10.2139/ssrn.3528616 ◽

2020 ◽

Author(s):

Markus Jaeger ◽

Stephan Krügel ◽

Dimitri Marinelli ◽

Jochen Papenbrock ◽

Peter Schwendner

Keyword(s):

Machine Learning ◽

Portfolio Construction ◽

Explainable AI ◽

Diversified Portfolio

An Introduction to Machine Learning for Panel Data: Decision Trees, Random Forests, and Other Dendrological Methods

SSRN Electronic Journal ◽

10.2139/ssrn.3717879 ◽

2020 ◽

Author(s):

James Ming Chen

Keyword(s):

Machine Learning ◽

Panel Data ◽

Decision Trees ◽

Random Forests

Clinical applications of artificial intelligence and machine learning in cancer diagnosis: looking into the future

Cancer Cell International ◽

10.1186/s12935-021-01981-1 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Muhammad Javed Iqbal ◽

Zeeshan Javed ◽

Haleema Sadia ◽

Ijaz A. Qureshi ◽

Asma Irshad ◽

...

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Cancer Diagnosis ◽

Disease Risk ◽

Clinical Applications ◽

Optimal Decision ◽

Great Promise ◽

Time Range ◽

Base System ◽

The Future

Artificial intelligence (AI) is the use of mathematical algorithms to mimic human cognitive abilities and to address difficult healthcare challenges, including complex biological abnormalities like cancer. The exponential growth of AI in the last decade has shown it to be a potential platform for optimal decision-making by super-intelligence, where the human mind is limited in its ability to process huge amounts of data in a narrow time range. Cancer is a complex and multifaceted disorder with thousands of genetic and epigenetic variations. AI-based algorithms hold great promise for identifying these genetic mutations and aberrant protein interactions at a very early stage. Modern biomedical research is also focused on bringing AI technology to the clinic safely and ethically. AI-based assistance to pathologists and physicians could be a great leap forward in the prediction of disease risk, diagnosis, prognosis, and treatment. Clinical applications of AI and Machine Learning (ML) in cancer diagnosis and treatment are the future of medical guidance, enabling faster mapping of a new treatment for every individual. By using an AI-based system approach, researchers can collaborate in real time and share knowledge digitally to potentially heal millions. In this review, we present game-changing technology for the future of the clinic by connecting biology with artificial intelligence, and explain how AI-based assistance helps oncologists deliver precise treatment.

Evaluating Nonlinear Decision Trees for Binary Classification Tasks with Other Existing Methods

2020 IEEE Symposium Series on Computational Intelligence (SSCI) ◽

10.1109/ssci47803.2020.9308505 ◽

2020 ◽

Author(s):

Yashesh Dhebar ◽

Sparsh Gupta ◽

Kalyanmoy Deb

Keyword(s):

Decision Trees ◽

Binary Classification ◽

Classification Tasks

Selection of Suitable Machine Learning Algorithms for Classification Tasks in Reverse Logistics

Procedia CIRP ◽

10.1016/j.procir.2021.01.086 ◽

2021 ◽

Vol 96 ◽

pp. 272-277

Author(s):

Hannah Lickert ◽

Aleksandra Wewer ◽

Sören Dittmann ◽

Pinar Bilge ◽

Franz Dietrich

Keyword(s):

Machine Learning ◽

Reverse Logistics ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Classification Tasks ◽

Selection Of