A hierarchical reinforcement learning algorithm based on heuristic reward function

Author(s):  
Qicui Yan ◽  
Quan Liu ◽  
Daojing Hu

2016 ◽  
Vol 10 (1) ◽  
pp. 69-79 ◽  
Author(s):  
Juan Yan ◽  
Huibin Yang

Self-balancing control is the basis for applications of two-wheeled robots. To improve the self-balancing ability of two-wheeled robots, we propose a hierarchical reinforcement learning algorithm for controlling their balance. After describing the subgoals of hierarchical reinforcement learning, we extract features for the subgoals, define a feature-value vector and its corresponding weight vector, and propose a reward function augmented with a subgoal reward function. Finally, we give a hierarchical reinforcement learning algorithm for finding the optimal strategy. Simulation experiments show that the proposed algorithm converges faster than the traditional reinforcement learning algorithm, so the robot reaches self-balance quickly.
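
The shaping idea described in this abstract can be read as pseudocode. Below is a minimal Python sketch, assuming a discretized tilt-angle state space and an epsilon-greedy Q-learning agent; the feature map `phi`, the weight vector `w`, and all constants are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

N_STATES, N_ACTIONS = 100, 3       # discretized tilt-angle states; torque actions
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
BALANCED = 50                      # index of the upright (balanced) state

Q = np.zeros((N_STATES, N_ACTIONS))
w = np.array([1.0, 0.5])           # weight vector paired with the feature vector

def phi(state, next_state):
    """Feature-value vector for the subgoal: progress toward balance,
    and whether the robot is now close to balanced (assumed features)."""
    progress = abs(state - BALANCED) - abs(next_state - BALANCED)
    near = 1.0 if abs(next_state - BALANCED) < 5 else 0.0
    return np.array([progress, near])

def update(state, action, env_reward, next_state):
    # Reward = environment reward + subgoal reward (weighted feature sum).
    r = env_reward + float(w @ phi(state, next_state))
    Q[state, action] += ALPHA * (r + GAMMA * Q[next_state].max() - Q[state, action])

def act(state, rng):
    # Epsilon-greedy action selection over the learned Q-values.
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(Q[state].argmax())
```

The subgoal bonus gives the agent dense feedback between the sparse "fell over / stayed up" signals, which is what drives the faster convergence the abstract reports.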


2014 ◽  
Vol 2014 ◽  
pp. 1-6 ◽
Author(s):  
Yuchen Fu ◽  
Quan Liu ◽  
Xionghong Ling ◽  
Zhiming Cui

Reinforcement learning (RL) is an interactive learning method whose main characteristics are "trial and error" and "delayed reward." A hierarchical reinforcement learning method based on action subrewards is proposed to address the "curse of dimensionality," in which the state space grows exponentially with the number of features, leading to slow convergence. The method greatly reduces the state space and chooses actions purposefully and efficiently, so as to optimize the reward function and speed up convergence. Applying it to online learning in the game of Tetris, the experimental results show that convergence speed is markedly improved by the new method, which combines a hierarchical reinforcement learning algorithm with action subrewards. The "curse of dimensionality" is also alleviated to a certain extent by the hierarchical method. Performance under different parameters is compared and analyzed as well.
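
To make the action-subreward idea concrete, here is a short Python sketch. The action set, the subreward values, and the board-quality features are assumptions made for the example, not the authors' actual design.

```python
ACTIONS = ["left", "right", "rotate", "drop"]
SUBREWARD = {"left": -0.01, "right": -0.01, "rotate": -0.02, "drop": 0.05}

def shaped_reward(lines_cleared, action, new_holes, height_increase):
    """Environment reward plus per-action subreward plus board-quality terms."""
    return (1.0 * lines_cleared          # sparse environment reward
            + SUBREWARD[action]          # per-action subreward
            - 0.5 * new_holes            # penalize burying empty cells
            - 0.1 * height_increase)     # penalize raising the stack

# The shaped reward then drops into an ordinary Q-learning update:
#   Q[s][a] += alpha * (shaped_reward(...) + gamma * max(Q[s_next]) - Q[s][a])
```

Because each primitive action carries its own small reward, the agent receives feedback on every move rather than only when a line clears, which is what accelerates convergence in the large Tetris state space.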


2017 ◽  
Vol 27 (07) ◽  
pp. 1750012 ◽  
Author(s):  
Vivek Nagaraj ◽  
Andrew Lamperski ◽  
Theoden I Netoff

Neuromodulation technologies such as vagus nerve stimulation and deep brain stimulation have shown some efficacy in controlling seizures in medically intractable patients. However, inherent patient-to-patient variability of seizure disorders leads to a wide range of therapeutic efficacy. A patient-specific approach to determining stimulation parameters may increase therapeutic efficacy while minimizing stimulation energy and side effects. This paper presents a reinforcement learning algorithm that optimizes stimulation frequency for controlling seizures with minimum stimulation energy. We apply our method to a computational model called the Epileptor, which simulates inter-ictal and ictal local field potential data. In order to apply reinforcement learning to the Epileptor, we introduce a specialized reward function and state-space discretization. With the reward function and discretization fixed, we test the effectiveness of the temporal difference reinforcement learning algorithm TD(0). For periodic pulsatile stimulation, we derive a relation that describes, for any stimulation frequency, the minimal pulse amplitude required to suppress seizures. The TD(0) algorithm quickly identifies parameters that control seizures. Additionally, our results show that the TD(0) algorithm refines the stimulation frequency to minimize stimulation energy, thereby converging reliably to optimal parameters. An advantage of TD(0) is that it is adaptive, so the parameters necessary to control seizures can change over time. We show that the algorithm converges on the optimal solution in simulation with both slow and fast inter-seizure intervals.
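
As an illustration of the approach, here is a minimal TD(0)-style sketch in Python. The tabular value over discretized LFP states and candidate frequencies, the frequency grid, and the energy weight are all assumptions made for the example; the Epileptor dynamics and the paper's actual discretization are not reproduced here.

```python
import numpy as np

FREQS_HZ = np.array([5, 10, 20, 50, 100, 130])   # assumed candidate frequencies
N_STATES = 10                                    # assumed discretized LFP states
ALPHA, GAMMA, LAMBDA_E = 0.05, 0.9, 1e-3         # step size, discount, energy weight

V = np.zeros((N_STATES, len(FREQS_HZ)))          # value of stimulating at f in state s

def reward(seizing, f_idx):
    # Penalize ongoing seizure activity plus stimulation energy, which grows
    # with pulse rate at a fixed pulse amplitude (assumed energy model).
    return (-10.0 if seizing else 0.0) - LAMBDA_E * FREQS_HZ[f_idx]

def td0_update(s, f_idx, seizing, s_next):
    # TD(0) update toward the one-step bootstrapped return.
    r = reward(seizing, f_idx)
    V[s, f_idx] += ALPHA * (r + GAMMA * V[s_next, f_idx] - V[s, f_idx])

def choose_frequency(s, rng, eps=0.1):
    # Epsilon-greedy choice of stimulation frequency for the current state.
    if rng.random() < eps:
        return int(rng.integers(len(FREQS_HZ)))
    return int(V[s].argmax())
```

The energy term in the reward is what pushes the learned policy toward the lowest frequency that still suppresses seizures, mirroring the trade-off the abstract describes.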

