The Conditional Entropy Bottleneck

Much of the field of Machine Learning exhibits a prominent set of failure modes, including vulnerability to adversarial examples, poor out-of-distribution (OoD) detection, miscalibration, and willingness to memorize random labelings of datasets. We characterize these as failures of robust generalization, which extends the traditional measure of generalization as accuracy or related metrics on a held-out set. We hypothesize that these failures to robustly generalize are due to the learning systems retaining too much information about the training data. To test this hypothesis, we propose the Minimum Necessary Information (MNI) criterion for evaluating the quality of a model. In order to train models that perform well with respect to the MNI criterion, we present a new objective function, the Conditional Entropy Bottleneck (CEB), which is closely related to the Information Bottleneck (IB). We experimentally test our hypothesis by comparing the performance of CEB models with deterministic models and Variational Information Bottleneck (VIB) models on a variety of different datasets and robustness challenges. We find strong empirical evidence supporting our hypothesis that MNI models improve on these problems of robust generalization.

Download Full-text

Zero-Shot Feature Selection via Transferring Supervised Knowledge

International Journal of Data Warehousing and Mining ◽

10.4018/ijdwm.2021040101 ◽

2021 ◽

Vol 17 (2) ◽

pp. 1-20

Author(s):

Zheng Wang ◽

Qiao Wang ◽

Tingzhang Zhao ◽

Chaokun Wang ◽

Xiaojun Ye

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Dimensionality Reduction ◽

Real World ◽

Rapid Growth ◽

Learning Systems ◽

Training Data ◽

Effective Technique ◽

Supervised Methods ◽

Real World Datasets

Feature selection, an effective technique for dimensionality reduction, plays an important role in many machine learning systems. Supervised knowledge can significantly improve the performance. However, faced with the rapid growth of newly emerging concepts, existing supervised methods might easily suffer from the scarcity and validity of labeled data for training. In this paper, the authors study the problem of zero-shot feature selection (i.e., building a feature selection model that generalizes well to “unseen” concepts with limited training data of “seen” concepts). Specifically, they adopt class-semantic descriptions (i.e., attributes) as supervision for feature selection, so as to utilize the supervised knowledge transferred from the seen concepts. For more reliable discriminative features, they further propose the center-characteristic loss which encourages the selected features to capture the central characteristics of seen concepts. Extensive experiments conducted on various real-world datasets demonstrate the effectiveness of the method.

Download Full-text

Alignment for Advanced Machine Learning Systems

Ethics of Artificial Intelligence ◽

10.1093/oso/9780190905033.003.0013 ◽

2020 ◽

pp. 342-382

Author(s):

Jessica Taylor ◽

Eliezer Yudkowsky ◽

Patrick LaVictoire ◽

Andrew Critch

Keyword(s):

Machine Learning ◽

Side Effects ◽

Objective Function ◽

Unintended Consequences ◽

Learning Systems ◽

Large Impact ◽

Future Research ◽

Objective Functions ◽

Research Areas ◽

The Right

This chapter surveys eight research areas organized around one question: As learning systems become increasingly intelligent and autonomous, what design principles can best ensure that their behavior is aligned with the interests of the operators? The chapter focuses on two major technical obstacles to AI alignment: the challenge of specifying the right kind of objective functions and the challenge of designing AI systems that avoid unintended consequences and undesirable behavior even in cases where the objective function does not line up perfectly with the intentions of the designers. The questions surveyed include the following: How can we train reinforcement learners to take actions that are more amenable to meaningful assessment by intelligent overseers? What kinds of objective functions incentivize a system to “not have an overly large impact” or “not have many side effects”? The chapter discusses these questions, related work, and potential directions for future research, with the goal of highlighting relevant research topics in machine learning that appear tractable today.

Download Full-text

Improving the Quality of Positive Datasets for the Establishment of Machine Learning Models for pre-microRNA Detection

Journal of Integrative Bioinformatics ◽

10.1515/jib-2017-0032 ◽

2017 ◽

Vol 14 (2) ◽

Author(s):

Müşerref Duygu Saçar Demirci ◽

Jens Allmer

Keyword(s):

Machine Learning ◽

Training Data ◽

Quality Data ◽

Virus Infections ◽

High Quality ◽

Disease States ◽

Positive Data ◽

Mirna Detection ◽

Post Transcriptional Regulation

AbstractMicroRNAs (miRNAs) are involved in the post-transcriptional regulation of protein abundance and thus have a great impact on the resulting phenotype. It is, therefore, no wonder that they have been implicated in many diseases ranging from virus infections to cancer. This impact on the phenotype leads to a great interest in establishing the miRNAs of an organism. Experimental methods are complicated which led to the development of computational methods for pre-miRNA detection. Such methods generally employ machine learning to establish models for the discrimination between miRNAs and other sequences. Positive training data for model establishment, for the most part, stems from miRBase, the miRNA registry. The quality of the entries in miRBase has been questioned, though. This unknown quality led to the development of filtering strategies in attempts to produce high quality positive datasets which can lead to a scarcity of positive data. To analyze the quality of filtered data we developed a machine learning model and found it is well able to establish data quality based on intrinsic measures. Additionally, we analyzed which features describing pre-miRNAs could discriminate between low and high quality data. Both models are applicable to data from miRBase and can be used for establishing high quality positive data. This will facilitate the development of better miRNA detection tools which will make the prediction of miRNAs in disease states more accurate. Finally, we applied both models to all miRBase data and provide the list of high quality hairpins.

Download Full-text

Human-Like Sketch Object Recognition via Analogical Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33011336 ◽

2019 ◽

Vol 33 ◽

pp. 1336-1343

Author(s):

Kezhen Chen ◽

Irina Rabkina ◽

Matthew D. McLure ◽

Kenneth D. Forbus

Keyword(s):

Computer Vision ◽

Deep Learning ◽

Object Recognition ◽

Image Recognition ◽

Visual Representations ◽

Learning Systems ◽

Training Data ◽

Adversarial Examples ◽

Analogical Learning

Deep learning systems can perform well on some image recognition tasks. However, they have serious limitations, including requiring far more training data than humans do and being fooled by adversarial examples. By contrast, analogical learning over relational representations tends to be far more data-efficient, requiring only human-like amounts of training data. This paper introduces an approach that combines automatically constructed qualitative visual representations with analogical learning to tackle a hard computer vision problem, object recognition from sketches. Results from the MNIST dataset and a novel dataset, the Coloring Book Objects dataset, are provided. Comparison to existing approaches indicates that analogical generalization can be used to identify sketched objects from these datasets with several orders of magnitude fewer examples than deep learning systems require.

Download Full-text

Machine Learning Interpretability: A Survey on Methods and Metrics

Electronics ◽

10.3390/electronics8080832 ◽

2019 ◽

Vol 8 (8) ◽

pp. 832 ◽

Cited By ~ 36

Author(s):

Diogo V. Carvalho ◽

Eduardo M. Pereira ◽

Jaime S. Cardoso

Keyword(s):

Machine Learning ◽

Social Impact ◽

Research Field ◽

Learning Systems ◽

Future Directions ◽

The Past ◽

Current State ◽

Black Boxes ◽

Interpretable Models

Machine learning systems are becoming increasingly ubiquitous. These systems’s adoption has been expanding, accelerating the shift towards a more algorithmic society, meaning that algorithmically informed decisions have greater potential for significant social impact. However, most of these accurate decision support systems remain complex black boxes, meaning their internal logic and inner workings are hidden to the user and even experts cannot fully understand the rationale behind their predictions. Moreover, new regulations and highly regulated domains have made the audit and verifiability of decisions mandatory, increasing the demand for the ability to question, understand, and trust machine learning systems, for which interpretability is indispensable. The research community has recognized this interpretability problem and focused on developing both interpretable models and explanation methods over the past few years. However, the emergence of these methods shows there is no consensus on how to assess the explanation quality. Which are the most suitable metrics to assess the quality of an explanation? The aim of this article is to provide a review of the current state of the research field on machine learning interpretability while focusing on the societal impact and on the developed methods and metrics. Furthermore, a complete literature review is presented in order to identify future directions of work on this field.

Download Full-text

Key Concepts in AI Safety: Robustness and Adversarial Examples

10.51593/20190041 ◽

2021 ◽

Author(s):

Tim Rudner ◽

Helen Toner

Keyword(s):

Machine Learning ◽

Learning Systems ◽

Safety Issues ◽

Key Concepts ◽

Adversarial Examples ◽

Learning Research ◽

Modern Machine ◽

Ai Safety

This paper is the second installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. The first paper in the series, “Key Concepts in AI Safety: An Overview,” described three categories of AI safety issues: problems of robustness, assurance, and specification. This paper introduces adversarial examples, a major challenge to robustness in modern machine learning systems.

Download Full-text

Machine Learning Techniques for Hypoglycemia Prediction: Trends and Challenges

Sensors ◽

10.3390/s21020546 ◽

2021 ◽

Vol 21 (2) ◽

pp. 546

Author(s):

Omer Mujahid ◽

Ivan Contreras ◽

Josep Vehi

Keyword(s):

Machine Learning ◽

Diabetic Patient ◽

Life Quality ◽

Training Data ◽

Machine Learning Techniques ◽

Diabetic Patients ◽

Learning Approaches ◽

Prediction Horizon ◽

Learning Techniques

(1) Background: the use of machine learning techniques for the purpose of anticipating hypoglycemia has increased considerably in the past few years. Hypoglycemia is the drop in blood glucose below critical levels in diabetic patients. This may cause loss of cognitive ability, seizures, and in extreme cases, death. In almost half of all the severe cases, hypoglycemia arrives unannounced and is essentially asymptomatic. The inability of a diabetic patient to anticipate and intervene the occurrence of a hypoglycemic event often results in crisis. Hence, the prediction of hypoglycemia is a vital step in improving the life quality of a diabetic patient. The objective of this paper is to review work performed in the domain of hypoglycemia prediction by using machine learning and also to explore the latest trends and challenges that the researchers face in this area; (2) Methods: literature obtained from PubMed and Google Scholar was reviewed. Manuscripts from the last five years were searched for this purpose. A total of 903 papers were initially selected of which 57 papers were eventually shortlisted for detailed review; (3) Results: a thorough dissection of the shortlisted manuscripts provided an interesting split between the works based on two categories: hypoglycemia prediction and hypoglycemia detection. The entire review was carried out keeping this categorical distinction in perspective while providing a thorough overview of the machine learning approaches used to anticipate hypoglycemia, the type of training data, and the prediction horizon.

Download Full-text

Data-Centric Explanations: Explaining Training Data of Machine Learning Systems to Promote Transparency

Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems ◽

10.1145/3411764.3445736 ◽

2021 ◽

Author(s):

Ariful Islam Anik ◽

Andrea Bunt

Keyword(s):

Machine Learning ◽

Learning Systems ◽

Training Data

Download Full-text

Design of a Fracture Detection System based on Deep Program in a Convolutional Neural Network

Webology ◽

10.14704/web/v18i2/web18336 ◽

2021 ◽

Vol 18 (2) ◽

pp. 509-518

Author(s):

Payman Hussein Hussan ◽

Syefy Mohammed Mangj Al-Razoky ◽

Hasanain Mohammed Manji Al-Rzoky

Keyword(s):

Neural Network ◽

Machine Learning ◽

Bone Fractures ◽

Detection System ◽

Training Data ◽

Learning Models ◽

Fracture Detection ◽

Data Set ◽

Final Fracture

This paper presents an efficient method for finding fractures in bones. For this purpose, the pre-processing set includes increasing the quality of images, removing additional objects, removing noise and rotating images. The input images then enter the machine learning phase to detect the final fracture. At this stage, a Convolutional Neural Networks is created by Genetic Programming (GP). In this way, learning models are implemented in the form of GP programs. And evolve during the evolution of this program. Then finally the best program for classifying incoming images is selected. The data set in this work is divided into training and test friends who have nothing in common. The ratio of training data to test is equal to 80 to 20. Finally, experimental results show good results for the proposed method for bone fractures.

Download Full-text

Rationale Discovery and Explainable AI

10.3233/faia210341 ◽

2021 ◽

Author(s):

Cor Steging ◽

Silja Renooij ◽

Bart Verheij

Keyword(s):

Machine Learning ◽

Feature Detection ◽

State Of The Art ◽

High Accuracy ◽

Learning Systems ◽

Training Data ◽

Relevant Feature ◽

Explainable Ai ◽

The Right ◽

The Impact

The justification of an algorithm’s outcomes is important in many domains, and in particular in the law. However, previous research has shown that machine learning systems can make the right decisions for the wrong reasons: despite high accuracies, not all of the conditions that define the domain of the training data are learned. In this study, we investigate what the system does learn, using state-of-the-art explainable AI techniques. With the use of SHAP and LIME, we are able to show which features impact the decision making process and how the impact changes with different distributions of the training data. However, our results also show that even high accuracy and good relevant feature detection are no guarantee for a sound rationale. Hence these state-of-the-art explainable AI techniques cannot be used to fully expose unsound rationales, further advocating the need for a separate method for rationale evaluation.

Download Full-text