Evolutionary Optimization of Neural Networks for Reinforcement Learning Algorithms

Author(s):  
H. Braun ◽  
T. Ragg
2012 ◽  
Vol 24 (2) ◽  
pp. 330-339 ◽  
Author(s):  
Kazuaki Yamada ◽  

This paper proposes a new reinforcement learning algorithm that uses neural networks and CMAC to learn a mapping function between the high-dimensional sensors and the motors of an autonomous robot. Conventional reinforcement learning algorithms require a lot of memory because they use lookup tables to describe high-dimensional mapping functions. Researchers have therefore tried to develop reinforcement learning algorithms that can learn high-dimensional mapping functions. We apply the proposed method to an autonomous robot navigation problem and a multi-link robot arm reaching problem, and we evaluate the effectiveness of the method.
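
To illustrate why a CMAC (cerebellar model articulation controller) needs less memory than a dense lookup table, here is a minimal tile-coding sketch. It is a generic CMAC written for illustration, not the algorithm proposed in the paper: the class name `CMAC`, its parameters, and the assumption that inputs lie in [0, 1) are all hypothetical.

```python
import math


class CMAC:
    """Generic tile-coding (CMAC) approximator for inputs in [0, 1)^d (illustrative only)."""

    def __init__(self, n_tilings=8, tiles_per_dim=4, lr=0.1):
        self.n_tilings = n_tilings
        self.tiles_per_dim = tiles_per_dim
        self.lr = lr
        # One sparse weight table per tiling: only visited tiles consume memory.
        self.weights = [dict() for _ in range(n_tilings)]

    def _active_tiles(self, x):
        """Return the single active tile index in each tiling for input x."""
        tiles = []
        for t in range(self.n_tilings):
            # Each tiling is shifted by a fraction of one tile width.
            offset = t / (self.n_tilings * self.tiles_per_dim)
            tiles.append(tuple(math.floor((xi + offset) * self.tiles_per_dim) for xi in x))
        return tiles

    def predict(self, x):
        """The output is the sum of the weights of the active tiles."""
        return sum(w.get(i, 0.0) for w, i in zip(self.weights, self._active_tiles(x)))

    def update(self, x, target):
        """Spread the prediction error equally over the active tiles."""
        tiles = self._active_tiles(x)
        error = target - sum(w.get(i, 0.0) for w, i in zip(self.weights, tiles))
        step = self.lr * error / self.n_tilings
        for w, i in zip(self.weights, tiles):
            w[i] = w.get(i, 0.0) + step
```

After training on one input, only one weight per tiling has been allocated, whereas a dense table would reserve an entry for every grid cell up front.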


Author(s):  
Kazuaki Yamada ◽  

Reinforcement learning approaches are attracting attention as a technique for constructing, by trial and error, a mapping function between the sensors and motors of an autonomous mobile robot. Conventional reinforcement learning approaches use a look-up table to express the mapping function between grid state and grid action spaces, and the grid size can severely degrade the learning performance of these algorithms. To avoid this, researchers have proposed reinforcement learning algorithms that use neural networks to express the mapping function between a continuous state space and actions. A designer, however, must set the number of middle neurons and the initial values of the weight parameters appropriately to improve the approximation accuracy of the neural networks. This paper proposes a new method that automatically sets the number of middle neurons and the initial values of the weight parameters based on the dimensionality of the sensor space. The feasibility of the proposed method is demonstrated on an autonomous mobile robot navigation problem and evaluated by comparing it with two types of Q-learning: Q-learning using RBF networks and Q-learning using neural networks whose parameters are set by a designer.
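
The look-up-table baseline described above can be sketched as follows. This is the standard tabular Q-learning update applied to a gridded sensor space, not the paper's method; the helper names (`discretize`, `q_update`, `table_size`), the [0, 1] sensor range, and the hyperparameter values are assumptions made for illustration.

```python
from collections import defaultdict


def discretize(observation, bins, low=0.0, high=1.0):
    """Map each continuous sensor reading to a grid cell index (assumed range [low, high])."""
    cell = []
    for value in observation:
        ratio = (value - low) / (high - low)
        cell.append(min(bins - 1, max(0, int(ratio * bins))))
    return tuple(cell)


def q_update(q_table, state, action, reward, next_state, n_actions, alpha=0.1, gamma=0.9):
    """One standard Q-learning step on a look-up table keyed by (state, action)."""
    best_next = max(q_table[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


def table_size(bins, n_sensors, n_actions):
    """Number of entries in a dense table: it grows exponentially with the sensor count."""
    return bins ** n_sensors * n_actions


q_table = defaultdict(float)
state = discretize((0.05, 0.95), bins=10)
next_state = discretize((0.15, 0.95), bins=10)
q_update(q_table, state, action=0, reward=1.0, next_state=next_state, n_actions=2)
```

With 10 bins per sensor, 8 sensors, and 4 actions, `table_size(10, 8, 4)` already gives 400,000,000 entries, which is the kind of growth that motivates replacing the table with a function approximator.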


2021 ◽  
Vol 2 (1) ◽  
pp. 1-25
Author(s):  
Yongsen Ma ◽  
Sheheryar Arshad ◽  
Swetha Muniraju ◽  
Eric Torkildson ◽  
Enrico Rantala ◽  
...  

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of CSI data. The state machine learns temporal dependency information from past classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. The proposed design achieves 97% average accuracy when the testing devices and persons are not seen during training. It is also evaluated on two public datasets, with accuracies of 80% and 83%. The proposed design needs very little human effort for ground truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.


2021 ◽  
Vol 11 (11) ◽  
pp. 4948
Author(s):  
Lorenzo Canese ◽  
Gian Carlo Cardarilli ◽  
Luca Di Nunzio ◽  
Rocco Fazzolari ◽  
Daniele Giardino ◽  
...  

In this review, we present an analysis of the most widely used multi-agent reinforcement learning algorithms. Starting with the single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account in their extension to multi-agent scenarios. The analyzed algorithms are grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their related mathematical models. For each algorithm, we describe the possible application fields and point out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications, namely nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performance of the considered methods.


2021 ◽  
Vol 298 ◽  
pp. 117164
Author(s):  
Marco Biemann ◽  
Fabian Scheller ◽  
Xiufeng Liu ◽  
Lizhen Huang

Algorithms ◽  
2021 ◽  
Vol 14 (8) ◽  
pp. 226
Author(s):  
Wenzel Pilar von Pilchau ◽  
Anthony Stein ◽  
Jörg Hähner

State-of-the-art deep reinforcement learning algorithms such as DQN and DDPG use a replay buffer, a concept known as Experience Replay. By default, the buffer contains only the experiences that have been gathered over the runtime. We propose a method called Interpolated Experience Replay that uses stored (real) transitions to create synthetic ones to assist the learner. In this first approach to this field, we limit ourselves to discrete and non-deterministic environments and use a simple equally weighted average of the reward in combination with observed follow-up states. We demonstrate a significantly improved overall mean performance compared with a DQN using vanilla Experience Replay on the discrete and non-deterministic FrozenLake8x8-v0 environment.
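
One possible reading of the interpolation step described above is sketched below: for a given state-action pair, the rewards observed so far are averaged with equal weights, and one synthetic transition is emitted for each follow-up state that has actually been observed. The class `InterpolatedReplaySketch` and its methods are hypothetical names, and the authors' implementation may differ in how transitions are stored, sampled, and mixed with real ones.

```python
from collections import defaultdict


class InterpolatedReplaySketch:
    """Illustrative buffer that derives synthetic transitions from real ones."""

    def __init__(self):
        self.real = []                        # real transitions, as in vanilla replay
        self.rewards = defaultdict(list)      # (state, action) -> observed rewards
        self.next_states = defaultdict(set)   # (state, action) -> observed follow-up states

    def store(self, state, action, reward, next_state):
        """Record a real transition and update the per-pair statistics."""
        self.real.append((state, action, reward, next_state))
        self.rewards[(state, action)].append(reward)
        self.next_states[(state, action)].add(next_state)

    def synthetic(self, state, action):
        """Build synthetic transitions: averaged reward paired with each seen follow-up state."""
        key = (state, action)
        if key not in self.rewards:
            return []  # nothing observed yet for this pair
        mean_reward = sum(self.rewards[key]) / len(self.rewards[key])
        return [(state, action, mean_reward, s_next) for s_next in sorted(self.next_states[key])]


buffer = InterpolatedReplaySketch()
buffer.store(0, 1, 0.0, 1)   # same action from the same state ...
buffer.store(0, 1, 1.0, 2)   # ... can lead to different outcomes when the environment is stochastic
transitions = buffer.synthetic(0, 1)
```

In this sketch both synthetic transitions carry the averaged reward 0.5, one for each follow-up state seen so far.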

