Exploring Fault Parameter Space Using Reinforcement Learning-based Fault Injection

Author(s): Mehrdad Moradi, Bentley James Oakes, Mustafa Saraoglu, Andrey Morozov, Klaus Janschek, ...


Author(s): Hui Xu, Chong Zhang, Jiaxing Wang, Deqiang Ouyang, Yu Zheng, ...

Efficient exploration is a major challenge in Reinforcement Learning (RL) and has been studied extensively. However, for a new task, existing methods explore either by taking actions that maximize task-agnostic objectives (such as information gain) or by applying a simple dithering strategy (such as noise injection), which may not be effective enough. In this paper, we investigate whether previous learning experiences can be leveraged to guide exploration in a new task. To this end, we propose a novel Exploration with Structured Noise in Parameter Space (ESNPS) approach. ESNPS uses meta-learning and directly applies the meta-policy parameters, which encode prior knowledge, as structured noise to perturb the base model for effective exploration in new tasks. Experimental results on four groups of tasks (cheetah velocity, cheetah direction, ant velocity and ant direction) demonstrate the superiority of ESNPS over a number of competitive baselines.
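As a rough illustration of the idea, the sketch below perturbs a base policy's parameters with noise whose direction is biased by meta-policy parameters learned on related tasks. This is a minimal sketch, not the authors' implementation; the function name `perturb`, the variable names, and the additive mixing scheme are all assumptions.

```python
# Hedged sketch of exploration via structured noise in parameter space,
# loosely following the ESNPS idea described above. All names are
# illustrative assumptions, not the authors' API.
import numpy as np

def perturb(base_params, meta_params, scale=0.1, rng=None):
    """Perturb the base policy with noise structured by meta-policy parameters.

    Instead of purely isotropic Gaussian dithering, the noise direction is
    biased toward the meta-parameters, which encode prior knowledge from
    earlier tasks.
    """
    rng = rng or np.random.default_rng()
    unstructured = rng.standard_normal(base_params.shape)  # plain dithering
    structured = meta_params - base_params                 # prior-informed direction
    return base_params + scale * (structured + unstructured)

# Usage: explore a new task by acting with a perturbed copy of the policy.
base = np.zeros(8)          # toy base policy parameters
meta = np.full(8, 0.5)      # parameters of a meta-policy from related tasks
exploring_params = perturb(base, meta)
```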


Author(s): Freek Stulp, Olivier Sigaud

Abstract  Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. Owing to current trends involving searching in parameter space (rather than action space) and using reward-weighted averaging (rather than gradient estimation), reinforcement learning algorithms for policy improvement, e.g. PoWER and PI², ...
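The trend this abstract describes, reward-weighted averaging over parameter-space samples, can be sketched generically as below. This is the standard exponentiated-return weighting common to PoWER/PI²-style updates, not either paper's exact derivation; all names and the toy utility are illustrative.

```python
# Sketch of reward-weighted averaging for policy improvement: sample
# parameter perturbations, weight each rollout by an exponentiated return,
# and average the perturbations. No gradient estimation is needed.
import numpy as np

def reward_weighted_update(theta, rollout, n_samples=20,
                           sigma=0.1, temperature=1.0):
    rng = np.random.default_rng()
    eps = sigma * rng.standard_normal((n_samples, theta.size))  # parameter noise
    returns = np.array([rollout(theta + e) for e in eps])       # utility per sample
    # Softmax weights: high-return samples dominate the average.
    w = np.exp((returns - returns.max()) / temperature)
    w /= w.sum()
    return theta + w @ eps                                      # weighted average of noise

# Usage on a toy utility: maximize -||theta - target||^2.
target = np.array([1.0, -2.0, 0.5])
theta = np.zeros(3)
for _ in range(100):
    theta = reward_weighted_update(theta, lambda t: -np.sum((t - target) ** 2))
```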


Author(s): Thomas Rückstieß, Frank Sehnke, Tom Schaul, Daan Wierstra, Yi Sun, ...

Abstract  This paper discusses parameter-based exploration methods for reinforcement learning. Parameter-based methods perturb parameters of a general function approximator directly, rather than adding noise to the resulting actions. Parameter-based exploration unifies reinforcement learning and black-box optimization, and has several advantages over action perturbation. We review two recent parameter-exploring algorithms: Natural Evolution Strategies and Policy Gradients with Parameter-Based Exploration. Both outperform state-of-the-art algorithms in several complex high-dimensional tasks commonly found in robot control. Furthermore, we describe how a novel exploration method, State-Dependent Exploration, can modify existing algorithms to mimic exploration in parameter space.
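A minimal sketch of the parameter-based exploration scheme this abstract reviews, in the spirit of Policy Gradients with Parameter-Based Exploration: a whole parameter vector is sampled once per episode from N(mu, sigma²), and mu and sigma are updated from episode returns using a baseline. Function names and the toy return function are assumptions, not the paper's code.

```python
# PGPE-style sketch: explore by sampling whole parameter vectors once per
# episode, rather than adding noise to each action. Updates follow the
# standard PGPE estimator with a mean-return baseline.
import numpy as np

def pgpe_step(mu, sigma, episode_return, n_samples=10, lr=0.05):
    rng = np.random.default_rng()
    diff = sigma * rng.standard_normal((n_samples, mu.size))  # theta - mu
    rewards = np.array([episode_return(mu + d) for d in diff])
    adv = rewards - rewards.mean()                            # baseline cuts variance
    grad_mu = (adv[:, None] * diff).mean(axis=0)
    grad_sigma = (adv[:, None] * ((diff**2 - sigma**2) / sigma)).mean(axis=0)
    return mu + lr * grad_mu, np.maximum(sigma + lr * grad_sigma, 1e-3)

# Usage on a toy deterministic return: maximize -||theta||^2.
mu, sigma = np.ones(4), np.ones(4)
f = lambda t: -np.sum(t**2)
for _ in range(200):
    mu, sigma = pgpe_step(mu, sigma, f)
```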


2012, Vol 12 (3), pp. 39-52
Author(s): Houman Dallali, Petar Kormushev, Zhibin Li, Darwin Caldwell

Abstract  In ZMP trajectory generation using simple models, a considerable amount of trial and error is often involved in obtaining locally stable gaits by manually tuning the gait parameters. In this paper, a 15-degree-of-freedom dynamic model of a compliant humanoid robot is combined with reinforcement learning to perform a global search in the parameter space and produce stable gaits. It is shown that, for a given speed, the learning algorithm finds multiple sets of parameters, namely step sizes and lateral sways, that lead to stable walking. The resulting set of gaits can be studied further in terms of parameter sensitivity, and additional optimization criteria can be included to narrow down the walking trajectories chosen for the humanoid robot.
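The flavor of this global search over gait parameters can be sketched as below. The sketch uses plain random search as a stand-in for the paper's reinforcement learning, and `simulate_gait` is a hypothetical placeholder for the authors' 15-DoF compliant humanoid simulation; names, ranges, and the scoring rule are all assumptions.

```python
# Hedged sketch: score (step_size, lateral_sway) pairs with a walking
# simulation and keep every pair whose gait is stable at the target speed.
# Random search stands in for the paper's RL-based global search.
import numpy as np

def simulate_gait(step_size, lateral_sway, speed):
    """Return a stability score; placeholder for the dynamic simulation."""
    return 1.0 - abs(step_size * 2.5 - speed) - abs(lateral_sway - 0.04) * 5.0

def search_stable_gaits(speed, n_iters=500, threshold=0.8):
    rng = np.random.default_rng()
    stable = []
    for _ in range(n_iters):
        step = rng.uniform(0.05, 0.40)   # step size in metres (assumed range)
        sway = rng.uniform(0.00, 0.10)   # lateral sway in metres (assumed range)
        if simulate_gait(step, sway, speed) >= threshold:
            stable.append((step, sway))  # multiple stable solutions expected
    return stable

gaits = search_stable_gaits(speed=0.5)
```

Keeping every parameter set above the stability threshold, rather than only the best one, mirrors the abstract's point that multiple gaits per speed can then be compared on sensitivity or further optimization criteria.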


Decision, 2016, Vol 3 (2), pp. 115-131
Author(s): Helen Steingroever, Ruud Wetzels, Eric-Jan Wagenmakers
