Policy Gradient Reinforcement Learning Method for Backward Motion Control of Tractor-Trailer Mobile Robot

This article describes the use of the Deep Deterministic Policy Gradient network, a deep reinforcement learning algorithm, for mobile robot navigation. The neural network structure has as inputs laser range findings, angular and linear velocities of the robot, and position and orientation of the mobile robot with respect to a goal position. The outputs of the network will be the angular and linear velocities used as control signals for the robot. The experiments demonstrated that deep reinforcement learning’s techniques that uses continuous actions, are efficient for decision-making in a mobile robot. Nevertheless, the design of the reward functions constitutes an important issue in the performance of deep reinforcement learning algorithms. In order to show the performance of the Deep Reinforcement Learning algorithm, we have applied successfully the proposed architecture in simulated environments and in experiments with a real robot.

Download Full-text

A Method of Personalized Driving Decision for Smart Car Based on Deep Reinforcement Learning

Information ◽

10.3390/info11060295 ◽

2020 ◽

Vol 11 (6) ◽

pp. 295 ◽

Cited By ~ 1

Author(s):

Xinpeng Wang ◽

Chaozhong Wu ◽

Jie Xue ◽

Zhijun Chen

Keyword(s):

Reinforcement Learning ◽

Decision Model ◽

Gradient Algorithm ◽

Learning Goals ◽

Learning Method ◽

Automatic Driving ◽

Proposed Model ◽

Policy Gradient ◽

Self Learning ◽

Better Than

To date, automatic driving technology has become a hotspot in academia. It is necessary to provide a personalization of automatic driving decision for each passenger. The purpose of this paper is to propose a self-learning method for personalized driving decisions. First, collect and analyze driving data from different drivers to set learning goals. Then, Deep Deterministic Policy Gradient algorithm is utilized to design a driving decision system. Furthermore, personalized factors are introduced for some observed parameters to build a personalized driving decision model. Finally, compare the proposed method with classic Deep Reinforcement Learning algorithms. The results show that the performance of the personalized driving decision model is better than the classic algorithms, and it is similar to the manual driving situation. Therefore, the proposed model can effectively learn the human-like personalized driving decisions of different drivers for structured road. Based on this model, the smart car can accomplish personalized driving.

Download Full-text

Motion Control of Non-Holonomic Constrained Mobile Robot Using Deep Reinforcement Learning

2019 IEEE 4th International Conference on Advanced Robotics and Mechatronics (ICARM) ◽

10.1109/icarm.2019.8834284 ◽

2019 ◽

Author(s):

Rui Gao ◽

Xueshan Gao ◽

Peng Liang ◽

Feng Han ◽

Bingqing Lan ◽

...

Keyword(s):

Reinforcement Learning ◽

Mobile Robot ◽

Motion Control

Download Full-text

Reward-Free Reinforcement Learning Algorithm Using Prediction Network

Fuzzy Systems and Data Mining VI - Frontiers in Artificial Intelligence and Applications ◽

10.3233/faia200744 ◽

2020 ◽

Author(s):

Zhen Yu ◽

Yimin Feng ◽

Lijun Liu

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Value Functions ◽

Learning Method ◽

Reward Function ◽

Network Training ◽

Learning Tasks ◽

Reward Value ◽

Policy Gradient ◽

Reward Functions

In general reinforcement learning tasks, the formulation of reward functions is a very important step in reinforcement learning. The reward function is not easy to formulate in a large number of systems. The network training effect is sensitive to the reward function, and different reward value functions will get different results. For a class of systems that meet specific conditions, the traditional reinforcement learning method is improved. A state quantity function is designed to replace the reward function, which is more efficient than the traditional reward function. At the same time, the predictive network link is designed so that the network can learn the value of the general state by using the special state. The overall structure of the network will be improved based on the Deep Deterministic Policy Gradient (DDPG) algorithm. Finally, the algorithm was successfully applied in the environment of FrozenLake, and achieved good performance. The experiment proves the effectiveness of the algorithm and realizes rewardless reinforcement learning in a class of systems.

Download Full-text