A Survey on the Explainability of Supervised Machine Learning

Journal of Artificial Intelligence Research ◽

10.1613/jair.1.12228 ◽

2021 ◽

Vol 70 ◽

pp. 245-317

Author(s):

Nadia Burkart ◽

Marco F. Huber

Keyword(s):

Machine Learning ◽

Decision Making ◽

State Of The Art ◽

High Accuracy ◽

Supervised Machine Learning ◽

Future Directions ◽

Survey Paper ◽

Black Boxes ◽

Highly Sensitive

Predictions obtained by, e.g., artificial neural networks are highly accurate, but humans often perceive the models as black boxes. Insights into their decision making are mostly opaque to humans. Understanding the decision making is of paramount importance, particularly in highly sensitive areas such as healthcare or finance. The decision making behind the black boxes needs to become more transparent, accountable, and understandable for humans. This survey paper provides essential definitions and an overview of the different principles and methodologies of explainable Supervised Machine Learning (SML). We conduct a state-of-the-art survey that reviews past and recent explainable SML approaches and classifies them according to the introduced definitions. Finally, we illustrate principles by means of an explanatory case study and discuss important future directions.

A Machine Vision Approach for Bioreactor Foam Sensing

SLAS TECHNOLOGY Translating Life Sciences Innovation ◽

10.1177/24726303211008861 ◽

2021 ◽

pp. 247263032110088

Author(s):

Jonas Austerjost ◽

Robert Söldner ◽

Christoffer Edlund ◽

Johan Trygg ◽

David Pollard ◽

...

Keyword(s):

Machine Learning ◽

Machine Vision ◽

State Of The Art ◽

Low Cost ◽

High Accuracy ◽

Consumer Electronics ◽

Learning System ◽

Automotive Applications ◽

Fine Grained

Machine vision is a powerful technology that has become increasingly popular and accurate during the last decade due to rapid advances in the field of machine learning. The majority of machine vision applications are currently found in consumer electronics, automotive applications, and quality control, yet the potential for bioprocessing applications is tremendous. For instance, detecting and controlling foam emergence is important for all upstream bioprocesses, but the lack of robust foam sensing often leads to batch failures from foam-outs or overaddition of antifoam agents. Here, we report a new low-cost, flexible, and reliable foam sensor concept for bioreactor applications. The concept applies convolutional neural networks (CNNs), a state-of-the-art machine learning system for image processing. The implemented method shows high accuracy for both binary foam detection (foam/no foam) and fine-grained classification of foam levels.

A Survey on Bias and Fairness in Machine Learning

ACM Computing Surveys ◽

10.1145/3457607 ◽

2021 ◽

Vol 54 (6) ◽

pp. 1-35

Author(s):

Ninareh Mehrabi ◽

Fred Morstatter ◽

Nripsuta Saxena ◽

Kristina Lerman ◽

Aram Galstyan

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Deep Learning ◽

Real World ◽

State Of The Art ◽

Future Directions ◽

Discriminatory Behavior ◽

Real World Applications ◽

Near Future ◽

Different Sources

With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in the design and engineering of such systems. AI systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that these decisions do not reflect discriminatory behavior toward certain groups or populations. More recently, some work has been developed in traditional machine learning and deep learning that addresses such challenges in different subdomains. With the commercialization of these systems, researchers are becoming more aware of the biases that these applications can contain and are attempting to address them. In this survey, we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems. In addition, we examined different domains and subdomains in AI, showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and the ways they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We hope that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.

Supervised learning for the detection of negation and of its scope in French and Brazilian Portuguese biomedical corpora

Natural Language Engineering ◽

10.1017/s1351324920000352 ◽

2020 ◽

pp. 1-21 ◽

Cited By ~ 2

Author(s):

Clément Dalloux ◽

Vincent Claveau ◽

Natalia Grabar ◽

Lucas Emanuel Silva Oliveira ◽

Claudia Maria Cabral Moro ◽

...

Keyword(s):

Machine Learning ◽

Information Extraction ◽

State Of The Art ◽

Automatic Detection ◽

Brazilian Portuguese ◽

Supervised Machine Learning ◽

Biomedical Domain ◽

Learning Approaches ◽

Cross Domain ◽

Automatic Methods

Automatic detection of negated content is often a prerequisite in information extraction systems in various domains. In the biomedical domain especially, this task is important because negation plays an important role. In this work, two main contributions are proposed. First, we work with languages which have been poorly addressed up to now: Brazilian Portuguese and French. We therefore developed new corpora for these two languages, manually annotated to mark up the negation cues and their scope. Second, we propose automatic methods based on supervised machine learning approaches for the detection of negation cues and of their scopes. The methods prove to be robust in both languages (Brazilian Portuguese and French) and in cross-domain (general and biomedical languages) contexts. The approach is also validated on English data from the state of the art: it yields very good results and outperforms other existing approaches. In addition, the application is accessible and usable online. We expect that these contributions (new annotated corpora, an application accessible online, and cross-domain robustness) will improve the reproducibility of the results and the robustness of NLP applications.

Semi-supervised machine learning approaches for predicting the chronology of archaeological sites: A case study of temples from medieval Angkor, Cambodia

PLoS ONE ◽

10.1371/journal.pone.0205649 ◽

2018 ◽

Vol 13 (11) ◽

pp. e0205649 ◽

Cited By ~ 4

Author(s):

Sarah Klassen ◽

Jonathan Weed ◽

Damian Evans

Keyword(s):

Machine Learning ◽

Supervised Machine Learning ◽

Archaeological Sites ◽

Learning Approaches

Machine learning in critical care: state-of-the-art and a sepsis case study

BioMedical Engineering OnLine ◽

10.1186/s12938-018-0569-2 ◽

2018 ◽

Vol 17 (S1) ◽

Cited By ~ 11

Author(s):

Alfredo Vellido ◽

Vicent Ribas ◽

Carles Morales ◽

Adolfo Ruiz Sanmartín ◽

Juan Carlos Ruiz Rodríguez

Keyword(s):

Machine Learning ◽

Critical Care ◽

State Of The Art

Discovering Business Processes from Email Logs using fastText and Process Mining

10.36227/techrxiv.12283835 ◽

2020 ◽

Author(s):

Yaghoub Rashnavadi ◽

Sina Behzadifard ◽

Reza Farzadnia ◽

Sina Zamani

Keyword(s):

Machine Learning ◽

Business Processes ◽

Oil And Gas ◽

Process Mining ◽

The Body ◽

Process Models ◽

Supervised Machine Learning ◽

Implicit Information ◽

Oil And Gas Sector

Communication has never been more accessible than today. With the help of instant messengers and email services, millions of people can transfer information with ease, and this trend has affected organizations as well. Billions of organizational emails are sent or received daily, and their main goal is to facilitate the daily operation of organizations. Behind this vast corpus of human-generated content, there is much implicit information that can be mined and used to improve or optimize the organizations' operations. Business processes are one such area of implicit knowledge that can be discovered from an organization's email logs, as most communication takes place in emails. The purpose of this research is to propose an approach for discovering process models from an email log. In this approach, we combine two tools: supervised machine learning and process mining. Using a supervised machine learning classifier, fastText, we assign the body text of each email to its related activity. The generated log is then mined with process mining techniques to find process models. We illustrate the approach with a case study of a company from the oil and gas sector.
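
The mining step can be illustrated with a minimal sketch. It assumes the classifier has already produced an event log of (thread id, timestamp, activity label) tuples, and it counts directly-follows relations, a common starting point for process discovery. The tuple layout, the function name `directly_follows`, and the sample activities are hypothetical and are not taken from the paper; the authors' actual mining technique is not stated in the abstract.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often activity a is immediately followed by activity b in the same case."""
    traces = defaultdict(list)
    # Order events by case and then by timestamp so each trace is chronological.
    for case_id, _timestamp, activity in sorted(event_log, key=lambda e: (e[0], e[1])):
        traces[case_id].append(activity)
    counts = Counter()
    for activities in traces.values():
        # Pair every activity with its successor inside the trace.
        counts.update(zip(activities, activities[1:]))
    return counts


# Hypothetical log: (email thread id, timestamp, activity label assigned by a classifier).
log = [
    ("t1", 1, "request quote"),
    ("t1", 2, "send quote"),
    ("t1", 3, "approve order"),
    ("t2", 1, "request quote"),
    ("t2", 2, "send quote"),
    ("t2", 3, "reject quote"),
]
dfg = directly_follows(log)
```

The resulting counts form the edges of a directly-follows graph, which dedicated process mining tools can turn into a process model.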

Decision making via semi-supervised machine learning techniques

10.12681/eadd/38100 ◽

2016 ◽

Author(s):

Ευτύχιος Πρωτοπαπαδάκης

Keyword(s):

Machine Learning ◽

Decision Making ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Learning Techniques

The term semi-supervised learning refers to a broad family of machine learning techniques that use unlabeled data to extract additional useful information. Semi-supervision addresses problems related to processing and exploiting large volumes of data, together with any costs associated with them (e.g., processing time, human error). The ultimate goal is the safe derivation of conclusions, rules, or recommendations. Decision-making models that use semi-supervised learning techniques have various advantages. First, they need only a small amount of labeled data for their initialization. Subsequently, new data that arrive are exploited and modify the model accordingly. As a result, we obtain a continuously evolving decision-making model with the least possible effort. Techniques that adapt easily and cheaply are particularly well suited to controlling systems whose mode of operation changes frequently. Indicative application areas of flexible semi-supervised decision-support systems include production line monitoring, maritime border surveillance, elderly care, financial risk assessment, inspection for structural defects, and the preservation of cultural heritage.

An integrated machine learning: Utility theory framework for real-time predictive maintenance in pumping systems

Proceedings of the Institution of Mechanical Engineers Part B Journal of Engineering Manufacture ◽

10.1177/0954405420970517 ◽

2020 ◽

pp. 095440542097051

Author(s):

Raghad M Khorsheed ◽

Omer Faruk Beyca

Keyword(s):

Machine Learning ◽

Decision Making ◽

Utility Theory ◽

Predictive Maintenance ◽

Manufacturing Operations ◽

Pumping Stations ◽

Bearing Failures ◽

Temperature Levels ◽

The Right

Bearings are the most widely used mechanical parts in rotating machinery under high load and high rotational speeds. Because they operate continuously under such harsh conditions, wear and failure are imminent. Developing defects give rise to even higher vibration and temperature levels. In general, mechanical defects in a machine cause high vibration levels. Therefore, bearing fault identification and early detection enable the maintenance team to repair the problem before it triggers catastrophic failure in the bearing. Machine downtime is thus avoided or minimized. This paper explores the use of Machine Learning (ML) integrated with decision-making techniques to predict possible bearing failures and improve the overall manufacturing operations by applying the correct maintenance actions at the right time. The accuracy of the Predictive Maintenance (PdM) module has been tested on real industrial production datasets. The paper proposes an effective PdM methodology using different ML algorithms to detect failures before they happen and reduce pump downtime. The performance of the tested ML algorithms is evaluated using five performance indicators: accuracy, precision, F-score, recall, and area under the curve (AUC). Experimental results revealed that all tested ML algorithms are successful and effective. Furthermore, decision making with utility theory has been employed to exploit the probability of failures and thus help to perform the appropriate maintenance interventions. This provides a logical framework for decision-makers to identify the optimum action with the maximum expected benefit. As a case study, the model is applied to forwarding pumping stations belonging to the Sewerage Treatment Company (STC), one of the largest sewage stations in Qatar.
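
The utility-theory step can be sketched as choosing the action with the maximum expected utility given a predicted failure probability. The example below is a minimal illustration under stated assumptions: the action names and utility values are invented for demonstration and do not come from the paper, and `best_action` is a hypothetical helper.

```python
def best_action(p_failure, utilities):
    """Return the action with the highest expected utility and that utility.

    utilities maps each action to (utility if the bearing fails, utility if it does not).
    """
    expected = {
        action: p_failure * u_fail + (1.0 - p_failure) * u_ok
        for action, (u_fail, u_ok) in utilities.items()
    }
    action = max(expected, key=expected.get)
    return action, expected[action]


# Hypothetical utilities (negative values are costs); not taken from the paper.
UTILITIES = {
    "do_nothing": (-100.0, 0.0),        # costly breakdown if the bearing fails
    "inspect": (-60.0, -5.0),           # inspection reduces, but does not remove, the loss
    "replace_bearing": (-30.0, -30.0),  # fixed cost regardless of outcome
}

low_risk, _ = best_action(0.05, UTILITIES)
medium_risk, _ = best_action(0.30, UTILITIES)
high_risk, _ = best_action(0.90, UTILITIES)
```

With these illustrative numbers, the recommended action shifts from doing nothing, to inspecting, to replacing the bearing as the predicted failure probability rises.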

Machine learning for digital try-on: Challenges and progress

Computational Visual Media ◽

10.1007/s41095-020-0189-1 ◽

2020 ◽

Author(s):

Junbang Liang ◽

Ming C. Lin

Keyword(s):

Machine Learning ◽

State Of The Art ◽

Economic Benefits ◽

The Body ◽

Future Directions ◽

Recent Advances ◽

Practical Constraints

Digital try-on systems for e-commerce have the potential to change people's lives and provide notable economic benefits. However, their development is limited by practical constraints, such as accurate sizing of the body and realism of demonstrations. We enumerate three open challenges remaining for a complete and easy-to-use try-on system that recent advances in machine learning make increasingly tractable. For each, we describe the problem, introduce state-of-the-art approaches, and provide future directions.

Towards a Multi-Layered Phishing Detection

Sensors ◽

10.3390/s20164540 ◽

2020 ◽

Vol 20 (16) ◽

pp. 4540

Author(s):

Kieran Rendall ◽

Antonia Nisioti ◽

Alexios Mylonas

Keyword(s):

Machine Learning ◽

State Of The Art ◽

Detection System ◽

Single Layer ◽

Supervised Machine Learning ◽

Data Driven ◽

Feature Sets ◽

Phishing Attacks ◽

Production Environments ◽

Phishing Detection

Phishing is one of the most common threats that users face while browsing the web. In the current threat landscape, a targeted phishing attack (i.e., spear phishing) often constitutes the first action of a threat actor during an intrusion campaign. To tackle this threat, many data-driven approaches have been proposed, which mostly rely on the use of supervised machine learning under a single-layer approach. However, such approaches are resource-demanding and, thus, their deployment in production environments is infeasible. Moreover, most previous works utilise a feature set that can be easily tampered with by adversaries. In this paper, we investigate the use of a multi-layered detection framework in which a potential phishing domain is classified multiple times by models using different feature sets. In our work, an additional classification takes place only when the initial one scores below a predefined confidence level, which is set by the system owner. We demonstrate our approach by implementing a two-layered detection system, which uses supervised machine learning to identify phishing attacks. We evaluate our system with a dataset consisting of active phishing attacks and find that its performance is comparable to the state of the art.
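
The two-layer idea described above can be sketched as a simple cascade: the second classifier runs only when the first one is not confident enough. This is a minimal illustration, not the authors' implementation; the two toy scoring functions, the example domains, and the default threshold of 0.9 are all hypothetical stand-ins for trained models and an owner-chosen confidence level.

```python
def classify_domain(domain, first_layer, second_layer, confidence_threshold=0.9):
    """Classify a domain, invoking the second layer only on low first-layer confidence.

    Each layer is a callable returning (label, confidence).
    Returns (label, confidence, number of the layer that decided).
    """
    label, confidence = first_layer(domain)
    if confidence >= confidence_threshold:
        return label, confidence, 1
    label, confidence = second_layer(domain)
    return label, confidence, 2


def lexical_layer(domain):
    """Toy stand-in for a cheap model based on features of the domain string."""
    suspicious = sum(token in domain for token in ("login", "verify", "secure"))
    if suspicious >= 2:
        return "phishing", 0.95
    if suspicious == 0:
        return "benign", 0.92
    return "phishing", 0.60  # a single suspicious token: low confidence


def content_layer(domain):
    """Toy stand-in for a costlier model using a different feature set."""
    known = {"secure-bank.example": ("benign", 0.97)}
    return known.get(domain, ("phishing", 0.80))


confident = classify_domain("login-verify.example", lexical_layer, content_layer)
escalated = classify_domain("secure-bank.example", lexical_layer, content_layer)
```

Only the second call reaches the costlier layer, which is the resource saving the cascade is meant to provide.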
