On the Identifiability of Hierarchical Decision Models

2021 ◽  
Author(s):  
Roman Bresson ◽  
Johanne Cohen ◽  
Eyke Hüllermeier ◽  
Christophe Labreuche ◽  
Michèle Sebag

Interpretability is a desirable property for machine learning and decision models, particularly in the context of safety-critical applications. Another highly desirable property of the sought model is to be unique, or identifiable, in the considered class of models: the fact that the same functional dependency can be represented by a number of syntactically different models adversely affects model interpretability and prevents the expert from easily checking the models' validity. This paper focuses on Choquet integral (CI) models and their hierarchical extensions (HCI). HCIs aim to support expert decision making by gradually aggregating preferences based on criteria; they are widely used in multi-criteria decision aiding and are receiving interest from the machine learning community, as they preserve the high readability of CIs while scaling efficiently with the number of criteria. The main contribution is to establish the identifiability property of HCIs under mild conditions: two HCIs implementing the same aggregation function on the criteria space necessarily have the same hierarchical structure and aggregation parameters. The identifiability property holds even when the marginal utility functions are learned from the data. This makes the class of HCI models a highly appropriate choice in domains where model interpretability and reliability are of primary concern.

2017 ◽  
Author(s):  
Benjamin Sanchez-Lengeling ◽  
Carlos Outeiral ◽  
Gabriel L. Guimaraes ◽  
Alan Aspuru-Guzik

Molecular discovery seeks to generate chemical species tailored to very specific needs. In this paper, we present ORGANIC, a framework based on Objective-Reinforced Generative Adversarial Networks (ORGAN), capable of producing a distribution over molecular space that matches a certain set of desirable metrics. This methodology combines two successful techniques from the machine learning community: a Generative Adversarial Network (GAN), to create non-repetitive, sensible molecular species, and Reinforcement Learning (RL), to bias this generative distribution towards certain attributes. We explore several applications, from optimization of random physicochemical properties to candidates for drug discovery and organic photovoltaic material design.


2018 ◽  
Vol 18 (3-4) ◽  
pp. 623-637 ◽  
Author(s):  
ARINDAM MITRA ◽  
CHITTA BARAL

Over the years, the Artificial Intelligence (AI) community has produced several datasets which have given machine learning algorithms the opportunity to learn various skills across various domains. However, a subclass of these machine learning algorithms that aims at learning logic programs, namely the Inductive Logic Programming algorithms, has often failed at the task due to the vastness of these datasets. This has impacted the usability of knowledge representation and reasoning techniques in the development of AI systems. In this research, we try to address this scalability issue for the algorithms that learn answer set programs. We present a sound and complete algorithm which takes the input in a slightly different manner and performs an efficient and more user-controlled search for a solution. We show via experiments that our algorithm can learn from two popular datasets from the machine learning community, namely bAbI (a question answering dataset) and MNIST (a dataset for handwritten digit recognition), which to the best of our knowledge was not previously possible. The system is publicly available at https://goo.gl/KdWAcV.


Author(s):  
Roman Bresson ◽  
Johanne Cohen ◽  
Eyke Hüllermeier ◽  
Christophe Labreuche ◽  
Michèle Sebag

Multi-Criteria Decision Making (MCDM) aims at modelling expert preferences and assisting decision makers in identifying the options that best accommodate expert criteria. One such MCDM model, the Choquet integral, is widely used in real-world applications due to its ability to capture interactions between criteria while retaining interpretability. Aimed at better scalability and modularity, hierarchical Choquet integrals involve intermediate aggregations of the interacting criteria, at the cost of a more complex elicitation. The paper presents a machine learning-based approach for the automatic identification of hierarchical MCDM models, composed of 2-additive Choquet integral aggregators and of marginal utility functions on the raw features, from data reflecting expert preferences. The proposed NEUR-HCI framework relies on a specific neural architecture, enforcing by design the Choquet model constraints and supporting its end-to-end training. The empirical validation of NEUR-HCI on real-world and artificial benchmarks demonstrates the merits of the approach compared to state-of-the-art baselines.
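
As a rough illustration of how a Choquet integral captures interactions between criteria, the following Python sketch evaluates a discrete Choquet integral for a hypothetical two-criterion capacity. The criterion names, the capacity values, and the function name are assumptions made for illustration only; they are not taken from the paper or from the NEUR-HCI framework.

```python
def choquet_integral(utilities, capacity):
    """Discrete Choquet integral.

    utilities: dict mapping each criterion to its utility (assumed in [0, 1]).
    capacity:  dict mapping frozensets of criteria to weights; it is assumed
               to be monotone, with weight 0 for the empty set and 1 for the
               full set of criteria.
    """
    # Visit criteria from the lowest utility to the highest.
    order = sorted(utilities, key=utilities.get)
    total = 0.0
    previous = 0.0
    for i, criterion in enumerate(order):
        # Coalition of criteria whose utility is at least the current one.
        coalition = frozenset(order[i:])
        total += (utilities[criterion] - previous) * capacity[coalition]
        previous = utilities[criterion]
    return total


# Hypothetical capacity: each criterion alone counts for little (0.3),
# but satisfying both together counts fully (complementary criteria).
capacity = {
    frozenset(): 0.0,
    frozenset({"price"}): 0.3,
    frozenset({"quality"}): 0.3,
    frozenset({"price", "quality"}): 1.0,
}

balanced = choquet_integral({"price": 0.5, "quality": 0.5}, capacity)
unbalanced = choquet_integral({"price": 0.2, "quality": 0.8}, capacity)
print(balanced, unbalanced)
```

Under this assumed capacity, the balanced option scores 0.5 while the unbalanced option with the same mean scores about 0.38, reflecting the complementarity between the two criteria that the capacity encodes.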


Sensors ◽  
2019 ◽  
Vol 19 (9) ◽  
pp. 1988 ◽  
Author(s):  
Lourdes Martínez-Villaseñor ◽  
Hiram Ponce ◽  
Jorge Brieva ◽  
Ernesto Moya-Albor ◽  
José Núñez-Martínez ◽  
...  

Falls, especially in elderly persons, are an important health problem worldwide. Reliable fall detection systems can mitigate the negative consequences of falls. Among the important challenges and issues reported in the literature is the difficulty of fairly comparing fall detection systems and machine learning techniques for detection. In this paper, we present the UP-Fall Detection Dataset. The dataset comprises raw and feature sets retrieved from 17 healthy young individuals without any impairment who performed 11 activities and falls, with three attempts each. The dataset also summarizes more than 850 GB of information from wearable sensors, ambient sensors and vision devices. Two experimental use cases are shown. The aim of our dataset is to help the human activity recognition and machine learning research communities to fairly compare their fall detection solutions. It also provides many experimental possibilities for the signal recognition, vision, and machine learning communities.


2012 ◽  
Vol 10 (10) ◽  
pp. 547
Author(s):  
Mei Zhang ◽  
Gregory Johnson ◽  
Jia Wang

A takeover success prediction model aims at predicting the probability that a takeover attempt will succeed by using publicly available information at the time of the announcement. We perform a thorough study using machine learning techniques to predict takeover success. Specifically, we model takeover success prediction as a binary classification problem, which has been widely studied in the machine learning community. Motivated by recent advances in machine learning, we empirically evaluate and analyze many state-of-the-art classifiers, including logistic regression, artificial neural networks, support vector machines with different kernels, decision trees, random forests, and AdaBoost. The experiments validate the effectiveness of applying machine learning in takeover success prediction, and we found that the support vector machine with a linear kernel and AdaBoost with stump weak classifiers perform the best for the task. The result is consistent with the general observations of these two approaches.


Digital technology has been changing fast in recent years, and with this change the number of data systems, sources, and formats has also increased exponentially. As a result, the process of extracting data from these multiple source systems and transforming it to suit various analytics processes is rapidly gaining importance. For Big Data, the transformation process is quite challenging, as data generation is continuous. In this paper, we extract data from various heterogeneous sources on the web and try to transform it into a form which is widely used in data warehousing, so that it caters to the analytical needs of the machine learning community.


2021 ◽  
Author(s):  
Arash Keshavarzi Arshadi ◽  
Milad Salem ◽  
Arash Firouzbakht ◽  
Jiann Shiun Yuan

Deep learning’s automatic feature extraction has been a revolutionary addition to computational drug discovery, infusing both the capabilities of learning abstract features and discovering complex molecular patterns via learning from molecular data. Since biological and chemical knowledge is necessary for overcoming the challenges of data curation, balancing, training, and evaluation, it is important for databases to contain meaningful information regarding the exact target and disease of each bioassay. Existing depositories such as PubChem or ChEMBL offer the screening data of millions of molecules against a variety of cells and targets; however, their bioassays contain complex biological information which can hinder their usage by the machine learning community. In this work, a comprehensive disease and target-based dataset is collected from PubChem in order to facilitate and accelerate molecular machine learning for better drug discovery. MolData is one of the largest efforts to date for democratizing molecular machine learning, with roughly 170 million drug screening results from 1.4 million unique molecules assigned to specific diseases and targets. It also provides 30 unique categories of targets and diseases. Correlation analysis of the MolData bioassays unveils valuable information for drug repurposing for multiple diseases including cancer, metabolic disorders, and infectious diseases. Finally, we provide a benchmark of more than 30 models trained on each category using multitask learning. MolData aims to pave the way for computational drug discovery and accelerate the advancement of molecular artificial intelligence in a practical manner. The MolData benchmark data is available at https://github.com/Transilico/MolData as well as within the supplementary materials.

