scholarly journals Human-Centric Justification of Machine Learning Predictions

Author(s):  
Or Biran ◽  
Kathleen McKeown

Human decision makers in many domains can make use of predictions made by machine learning models in their decision making process, but the usability of these predictions is limited if the human is unable to justify his or her trust in the prediction. We propose a novel approach to producing justifications that is geared towards users without machine learning expertise, focusing on domain knowledge and on human reasoning, and utilizing natural language generation. Through a task-based experiment, we show that our approach significantly helps humans to correctly decide whether or not predictions are accurate, and significantly increases their satisfaction with the justification.

Author(s):  
Maxat Kulmanov ◽  
Fatima Zohra Smaili ◽  
Xin Gao ◽  
Robert Hoehndorf

Ontologies have long been employed in the life sciences to formally represent and reason over domain knowledge, and they are employed in almost every major biological database. Recently, ontologies are increasingly being used to provide background knowledge in similarity-based analysis and machine learning models. The methods employed to combine ontologies and machine learning are still novel and actively being developed. We provide an overview over the methods that use ontologies to compute similarity and incorporate them in machine learning methods; in particular, we outline how semantic similarity measures and ontology embeddings can exploit the background knowledge in biomedical ontologies, and how ontologies can provide constraints that improve machine learning models. The methods and experiments we describe are available as a set of executable notebooks, and we also provide a set of slides and additional resources at https://github.com/bio-ontology-research-group/machine-learning-with-ontologies.Key pointsOntologies provide background knowledge that can be exploited in machine learning models.Ontology embeddings are structure-preserving maps from ontologies into vector spaces and provide an important method for utilizing ontologies in machine learning. Embeddings can preserve different structures in ontologies, including their graph structures, syntactic regularities, or their model-theoretic semantics.Axioms in ontologies, in particular those involving negation, can be used as constraints in optimization and machine learning to reduce the search space.


Author(s):  
Maxat Kulmanov ◽  
Fatima Zohra Smaili ◽  
Xin Gao ◽  
Robert Hoehndorf

Abstract Ontologies have long been employed in the life sciences to formally represent and reason over domain knowledge and they are employed in almost every major biological database. Recently, ontologies are increasingly being used to provide background knowledge in similarity-based analysis and machine learning models. The methods employed to combine ontologies and machine learning are still novel and actively being developed. We provide an overview over the methods that use ontologies to compute similarity and incorporate them in machine learning methods; in particular, we outline how semantic similarity measures and ontology embeddings can exploit the background knowledge in ontologies and how ontologies can provide constraints that improve machine learning models. The methods and experiments we describe are available as a set of executable notebooks, and we also provide a set of slides and additional resources at https://github.com/bio-ontology-research-group/machine-learning-with-ontologies.


Processes ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 224 ◽  
Author(s):  
Sami Sader ◽  
István Husti ◽  
Miklós Daróczi

In this paper, multiclass classification is used to develop a novel approach to enhance failure mode and effects analysis and the generation of risk priority number. This is done by developing four machine learning models using auto machine learning. Failure mode and effects analysis is a technique that is used in industry to identify possible failures that may occur and the effects of these failures on the system. Meanwhile, risk priority number is a numeric value that is calculated by multiplying three associated parameters namely severity, occurrence and detectability. The value of risk priority number determines the next actions to be made. A dataset that includes a one-year registry of 1532 failures with their description, severity, occurrence, and detectability is used to develop four models to predict the values of severity, occurrence, and detectability. Meanwhile, the resulted models are evaluated using 10% of the dataset. Evaluation results show that the proposed models have high accuracy whereas the average value of precision, recall, and F1 score are in the range of 86.6–93.2%, 67.9–87.9%, 0.892–0.765% respectively. The proposed work helps in carrying out failure mode and effects analysis in a more efficient way as compared to the conventional techniques.


Author(s):  
Ramazan Esmeli ◽  
Mohamed Bader-El-Den ◽  
Hassana Abdullahi

AbstractPurchase prediction has an important role for decision-makers in e-commerce to improve consumer experience, provide personalised recommendations and increase revenue. Many works investigated purchase prediction for session logs by analysing users’ behaviour to predict purchase intention after a session has ended. In most cases, e-shoppers prefer to be anonymous while browsing the websites and after a session has ended, identifying users and offering discounts can be challenging. Therefore, after a session ends, predicting purchase intention may not be useful for the e-commerce strategists. In this work, we propose and develop an early purchase prediction framework using advanced machine learning models to investigate how early purchase intention in an ongoing session can be predicted. Since users could be anonymous, this could help to give real-time offers and discounts before the session ends. We use dynamically created session features after each interaction in a session, and propose a utility scoring method to evaluate how early machine learning models can predict the probability of purchase intention. The proposed framework is validated with a real-world dataset. Computational experiments show machine learning models can identify purchase intention early with good performance in terms of Area Under Curve (AUC) score which shows success rate of machine learning models on early purchase prediction.


2020 ◽  
Vol 32 ◽  
pp. 03003
Author(s):  
Bhushan Deore ◽  
Aditya Kyatham ◽  
Shubham Narkhede

The following paper provides a novel approach for Network Intrusion Detection System using Machine Learning and Deep Learning. This approach uses two MLP (Multi-Layer Perceptron) models one having 3 layers and other having 6 layers. Random Forest is also used for classification. These models are ensembled in such a way that the final accuracy is boosted and also the testing time is reduced. Researchers have implemented various ways for the ensemble of multiple models but we are using contradiction management concept to ensemble machine learning models. Contradiction Management concept means if two machine learning models are contradicting in their decisions (in our case 3-layer MLP and Random Forest), then the third model’s (6-layer MLP) decision is considered whose accuracy is higher than the previous models. The third model is only used for testing when the previous two models contradict in their decision because the testing time of third model is higher than the two previous models as the third model has complex architecture. This approach increased the final accuracy as ensemble of multiple models is done and also testing time has reduced. The novelty of this paper is the choice and the combination of the models for the purpose of Network security.


Author(s):  
Alexey Ignatiev ◽  
Nina Narodytska ◽  
Joao Marques-Silva

The growing range of applications of Machine Learning (ML) in a multitude of settings motivates the ability of computing small explanations for predictions made. Small explanations are generally accepted as easier for human decision makers to understand. Most earlier work on computing explanations is based on heuristic approaches, providing no guarantees of quality, in terms of how close such solutions are from cardinality- or subset-minimal explanations. This paper develops a constraint-agnostic solution for computing explanations for any ML model. The proposed solution exploits abductive reasoning, and imposes the requirement that the ML model can be represented as sets of constraints using some target constraint reasoning system for which the decision problem can be answered with some oracle. The experimental results, obtained on well-known datasets, validate the scalability of the proposed approach as well as the quality of the computed solutions.


2018 ◽  
Vol 10 (1) ◽  
Author(s):  
Wang-Chi Cheung ◽  
Weiwen Zhang ◽  
Yong Liu ◽  
Feng Yang ◽  
Rick-Siow-Mong Goh

Recent studies have revealed the success of data-driven machine health monitoring, which motivates the use of machine learning models in machine health prognostic tasks. While the machine learning approach to health monitoring is gaining importance, the construction of machine learning models is often impeded by the difficulty in choosing the underlying hyper-parameter configuration (HP-config), which governs the construction of the machine learning model. While an effective choice of HP-config can be achieved with human effort, such an effort is often time consuming and requires domain knowledge. In this paper, we consider the use of Bayesian optimization algorithms, which automate an effective choice of HP-config by solving the associated hyperparameter optimization problem. Numerical experiments on the data from PHM 2016 Data Challenge demonstrate the salience of the proposed automatic framework, and exhibit improvement over default HP-configs in standard machine learning packages or chosen by a human agent.


Sign in / Sign up

Export Citation Format

Share Document