Key Concepts in AI Safety: Robustness and Adversarial Examples

2021 ◽  
Author(s):  
Tim Rudner ◽  
Helen Toner

This paper is the second installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. The first paper in the series, “Key Concepts in AI Safety: An Overview,” described three categories of AI safety issues: problems of robustness, assurance, and specification. This paper introduces adversarial examples, a major challenge to robustness in modern machine learning systems.
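
As a concrete illustration of the concept (an aside for orientation, not drawn from the paper itself), the fast gradient sign method of Goodfellow et al. perturbs an input in the direction that most increases the classifier's loss. A minimal PyTorch sketch, assuming `model` is a differentiable classifier and `x` a batch of images with pixel values in [0, 1]:

    import torch
    import torch.nn.functional as F

    def fgsm_example(model, x, y, epsilon=0.03):
        """Construct adversarial examples via the fast gradient sign method."""
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)  # classification loss at the clean input
        loss.backward()                      # gradient of the loss w.r.t. the pixels
        # Step of size epsilon in the sign of the gradient, then clamp back to
        # the valid pixel range; the perturbation is visually small but can
        # flip the model's prediction.
        return (x + epsilon * x.grad.sign()).clamp(0.0, 1.0).detach()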

2021 ◽  
Author(s):  
Tim G. J. Rudner ◽  
Helen Toner

This paper is the fourth installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. The first paper in the series, “Key Concepts in AI Safety: An Overview,” outlined three categories of AI safety issues—problems of robustness, assurance, and specification—and the subsequent two papers described problems of robustness and assurance, respectively. This paper introduces specification as a key element in designing modern machine learning systems that operate as intended.


2021 ◽  
Author(s):  
Tim Rudner ◽  
Helen Toner

This paper is the third installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. The first paper in the series, “Key Concepts in AI Safety: An Overview,” described three categories of AI safety issues: problems of robustness, assurance, and specification. This paper introduces interpretability as a means to enable assurance in modern machine learning systems.


2021 ◽  
Author(s):  
Tim Rudner ◽  
Helen Toner

This paper is the first installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. In it, the authors introduce three categories of AI safety issues: problems of robustness, assurance, and specification. Other papers in this series elaborate on these and further key concepts.


2021 ◽  
Author(s):  
Bernhard Mehlig

This modern and self-contained book offers a clear and accessible introduction to the important topic of machine learning with neural networks. In addition to describing the mathematical principles of the topic, and its historical evolution, strong connections are drawn with underlying methods from statistical physics and current applications within science and engineering. Closely based around a well-established undergraduate course, this pedagogical text provides a solid understanding of the key aspects of modern machine learning with artificial neural networks, for students in physics, mathematics, and engineering. Numerous exercises expand and reinforce key concepts within the book and allow students to hone their programming skills. Frequent references to current research develop a detailed perspective on the state-of-the-art in machine learning research.


2021 ◽  
Author(s):  
Zachary Arnold ◽  
Helen Toner

As modern machine learning systems become more widely used, the potential costs of malfunctions grow. This policy brief describes how trends we already see today—both in newly deployed artificial intelligence systems and in older technologies—show how damaging the AI accidents of the future could be. It describes a wide range of hypothetical but realistic scenarios to illustrate the risks of AI accidents and offers concrete policy suggestions to reduce these risks.


Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 999 ◽  
Author(s):  
Ian Fischer

Much of the field of Machine Learning exhibits a prominent set of failure modes, including vulnerability to adversarial examples, poor out-of-distribution (OoD) detection, miscalibration, and willingness to memorize random labelings of datasets. We characterize these as failures of robust generalization, which extends the traditional measure of generalization as accuracy or related metrics on a held-out set. We hypothesize that these failures to robustly generalize are due to the learning systems retaining too much information about the training data. To test this hypothesis, we propose the Minimum Necessary Information (MNI) criterion for evaluating the quality of a model. In order to train models that perform well with respect to the MNI criterion, we present a new objective function, the Conditional Entropy Bottleneck (CEB), which is closely related to the Information Bottleneck (IB). We experimentally test our hypothesis by comparing the performance of CEB models with deterministic models and Variational Information Bottleneck (VIB) models on a variety of different datasets and robustness challenges. We find strong empirical evidence supporting our hypothesis that MNI models improve on these problems of robust generalization.
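
For orientation, the objectives can be sketched as follows, using notation common in the information bottleneck literature (the paper's exact parameterization may differ). Writing Z for the learned representation, with Markov chain Z ← X ↔ Y:

    % Information Bottleneck (IB): compress X while preserving information about Y
    \min_{p(z \mid x)} \; I(X;Z) - \beta \, I(Y;Z)

    % Conditional Entropy Bottleneck (CEB): penalize only the residual
    % information Z carries about X beyond what is needed to predict Y
    \min_{p(z \mid x)} \; I(X;Z \mid Y) - \gamma \, I(Y;Z)

    % Minimum Necessary Information (MNI) point: the representation captures
    % exactly the label-relevant information and nothing more
    I(X;Z) = I(Y;Z) = I(X;Y)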


2020 ◽  
Vol 1 ◽  
pp. 6 ◽  
Author(s):  
Petru Hlihor ◽  
Riccardo Volpi ◽  
Luigi Malagò

Adversarial examples represent a serious problem affecting the security of machine learning systems. In this paper we focus on a defense mechanism based on reconstructing images with an autoencoder before classification. We experiment with several types of autoencoders and evaluate the impact of strategies such as injecting noise into the input during training and into the latent space at inference time. We test the models on adversarial examples generated with the Carlini-Wagner attack, in a white-box scenario and on the stacked system composed of the autoencoder and the classifier.
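
A minimal sketch of such a stacked system (PyTorch; the class and parameter names are hypothetical, and the autoencoder and classifier are assumed to be trained beforehand):

    import torch
    import torch.nn as nn

    class ReconstructionDefense(nn.Module):
        """Reconstruct the input with an autoencoder before classifying it,
        optionally injecting Gaussian noise in the latent space at inference."""

        def __init__(self, encoder: nn.Module, decoder: nn.Module,
                     classifier: nn.Module, latent_noise_std: float = 0.0):
            super().__init__()
            self.encoder = encoder
            self.decoder = decoder
            self.classifier = classifier
            self.latent_noise_std = latent_noise_std

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            z = self.encoder(x)                    # map the image to a latent code
            if self.latent_noise_std > 0:          # latent-noise injection at inference
                z = z + self.latent_noise_std * torch.randn_like(z)
            x_hat = self.decoder(z)                # reconstructed ("cleaned") image
            return self.classifier(x_hat)          # classify the reconstruction

The intent of the design is that the autoencoder projects an adversarially perturbed input back toward the data manifold before the classifier sees it.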


2015 ◽  
Vol 1 (1) ◽  
pp. 12 ◽  
Author(s):  
Alan F. Blackwell

<div class="page" title="Page 1"><div class="layoutArea"><div class="column"><p><span>Classic theories of user interaction have been framed in relation to symbolic models of planning and problem solving, responding in part to the cognitive theories associated with AI research. However, the behavior of modern machine-learning systems is determined by statistical models of the world rather than explicit symbolic descriptions. Users increasingly interact with the world and with others in ways that are mediated by such models. This paper explores the way in which this new generation of technology raises fresh challenges for the critical evaluation of interactive systems. It closes with some proposed measures for the design of inference-based systems that are more open to humane design and use. </span></p></div></div></div>


2018 ◽  
Vol 12 ◽  
pp. 85-98 ◽  
Author(s):  
Bojan Kostadinov ◽  
Mile Jovanov ◽  
Emil Stankov

Data collection and machine learning are changing the world. Whether it is medicine, sports or education, companies and institutions are investing a lot of time and money in systems that gather, process and analyse data. Likewise, to improve competitiveness, many countries are changing their educational policies to support STEM disciplines. It is therefore important to put effort into using various data sources to help students succeed in STEM. In this paper, we present a platform that can analyse students' activity on various contest and e-learning systems, combine and process the data, and then present it in ways that are easy to understand. This, in turn, enables teachers and organizers to recognize talented and hardworking students, identify issues, and motivate students to practice and work on the areas where they are weaker.

