Experimental Study on Behavior Acquisition of Mobile Robot by Deep Q-Network

Author(s):
Hikaru Sasaki
Tadashi Horiuchi
Satoru Kato
...

Deep Q-Network (DQN) is one of the best-known deep reinforcement learning methods. DQN approximates the action-value function with a convolutional neural network (CNN) and updates it by Q-learning. In this study, we applied DQN to robot behavior learning in a simulation environment. We constructed a simulation environment for a two-wheeled mobile robot using the robot simulation software Webots. The mobile robot acquired useful behaviors, such as avoiding walls and following a center line, by learning from high-dimensional visual information supplied as input. We propose a method that reuses the best target network obtained so far whenever learning performance suddenly drops. Moreover, we incorporate the Profit Sharing method into DQN to accelerate learning. Simulation experiments confirmed that our method is effective.
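The "reuse the best target network so far" idea can be sketched in a few lines: snapshot the target parameters whenever evaluation improves, and roll back to that snapshot when performance suddenly drops. This is a minimal illustrative sketch with a toy quadratic objective standing in for episode returns; the thresholds, sync interval, and `evaluate` function are assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def evaluate(params):
    # Stand-in for an episode rollout; here just a noisy score.
    return -np.sum(params ** 2) + rng.normal(scale=0.1)

params = rng.normal(size=4)          # online network weights
target = params.copy()               # target network weights
best_target = target.copy()
best_score = evaluate(params)

for step in range(200):
    params -= 0.05 * 2 * params      # gradient step on the toy objective
    if step % 10 == 0:
        target = params.copy()       # periodic target sync, as in DQN
        score = evaluate(params)
        if score > best_score:
            best_score = score
            best_target = target.copy()      # remember best target so far
        elif score < best_score - 1.0:
            target = best_target.copy()      # sudden drop: reuse best target

print(round(float(np.sum(params ** 2)), 3))  # 0.0
```

The rollback threshold (here a drop of 1.0 below the best score) is the tunable part: too tight and normal learning noise triggers rollbacks, too loose and a collapse goes uncorrected.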

2015
Vol 789-790
pp. 717-722
Author(s):
Ebrahim Mattar
K. Al Mutib
M. AlSulaiman
Hedjar Ramdane

Learning the navigation environment is essential for a mobile robot. We describe research outcomes on mapping and intelligence for the KSU-IMR mobile robot, aimed at navigation and robot behavior learning. Map learning and intelligence were based on hybrid paradigms and AI functionalities: an ANN-PCA scheme for dimensionality reduction and a Neuro-Fuzzy architecture.
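The PCA half of the ANN-PCA pipeline reduces high-dimensional sensor data to a few principal directions before further learning. A minimal NumPy sketch of that step (the data here is random placeholder input; the ANN and Neuro-Fuzzy stages are not shown):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))            # 200 sensor readings, 10-D each
Xc = X - X.mean(axis=0)                   # center the data
cov = Xc.T @ Xc / (len(Xc) - 1)           # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]         # sort descending by variance
W = eigvecs[:, order[:3]]                 # top-3 principal directions
Z = Xc @ W                                # reduced 3-D representation

print(Z.shape)  # (200, 3)
```

Keeping the top-k eigenvectors retains the directions of greatest variance, which is what makes the reduced representation a reasonable input for the downstream network.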


2012
Vol 51 (9)
pp. 40-46
Author(s):
Pradipta K. Das
S. C. Mandhata
H. S. Behera
S. N. Patro

2013
Vol 2013
pp. 1-9
Author(s):
Yan Li
Lijie Yu
Siran Tao
Kuanmin Chen

To improve the efficiency of traffic signal control at an isolated intersection under oversaturated conditions, a multi-objective optimization algorithm for traffic signal control is proposed. Maximum throughput and minimum average queue ratio are selected as the optimization objectives under oversaturated conditions. A simulation environment built on VISSIM SCAPI was used to evaluate convergence and optimization results under various settings and traffic conditions; it is written in C++/CLI to connect the VISSIM simulation software with the proposed algorithm. The simulation results indicate that the signal timing plan generated by the proposed algorithm manages traffic flow at an oversaturated intersection more efficiently than the commonly used signal timing optimization software Synchro. The update frequency applied in the simulation environment was 120 s, which meets the requirements for updating signal timing plans in the field. Thus, the proposed algorithm is capable of searching the Pareto front of the multi-objective problem under both normal and oversaturated conditions.
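The Pareto-front search the abstract describes reduces to a dominance test over candidate timing plans scored on the two objectives (maximize throughput, minimize average queue ratio). A hedged sketch of that filter; the candidate plan values are made up for illustration:

```python
def dominates(a, b):
    # a dominates b if it is at least as good on both objectives
    # (higher throughput, lower queue ratio) and strictly better on one.
    return (a[0] >= b[0] and a[1] <= b[1]) and (a[0] > b[0] or a[1] < b[1])

def pareto_front(plans):
    # Keep only plans that no other plan dominates.
    return [p for p in plans
            if not any(dominates(q, p) for q in plans if q is not p)]

plans = [  # (throughput in veh/h, average queue ratio)
    (1800, 0.90), (1750, 0.70), (1600, 0.55),
    (1500, 0.60), (1400, 0.50),
]
print(pareto_front(plans))
# (1500, 0.60) drops out: (1600, 0.55) beats it on both objectives.
```

A full optimizer would generate and mutate candidate plans between filtering passes; this shows only the selection criterion that defines the front.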


Author(s):
Huashuai Zhang
Tingmei Wang
Haiwei Shen

The resource optimization of ultra-dense networks (UDNs) is critical to meeting users' huge demand for wireless data traffic, but mainstream optimization algorithms suffer from problems such as poor optimization results and high computing load. This paper puts forward a wireless resource allocation algorithm based on deep reinforcement learning (DRL), which aims to maximize the total throughput of the entire network and casts the resource allocation problem as a deep Q-learning process. To allocate resources in UDNs effectively, the DRL algorithm was introduced to improve the allocation efficiency of wireless resources; the authors adopted the resource allocation strategy of the deep Q-network (DQN), employing experience replay and a target network to overcome the instability and divergence caused by correlation with previous network states and to curb overestimation of the Q value. Simulation results show that the proposed algorithm maximizes the total network throughput while making the network more energy-efficient and stable. It is therefore meaningful to introduce DRL into research on UDN resource allocation.
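Experience replay, one of the two stabilizers the abstract cites alongside the target network, amounts to a bounded buffer of past transitions sampled uniformly at training time. A minimal sketch (the transition contents are placeholders, not the paper's actual state encoding):

```python
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        # deque with maxlen silently evicts the oldest transition
        # once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between
        # consecutive transitions that destabilizes Q-learning updates.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(50):
    buf.push(t, t % 4, 1.0, t + 1, False)   # toy transitions
batch = buf.sample(8)
print(len(buf), len(batch))  # 50 8
```

In a full DQN training loop, each environment step pushes one transition and each learning step samples a batch to compute targets against the frozen target network.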


2016
Vol 16 (4)
pp. 113-125
Author(s):
Jianxian Cai
Xiaogang Ruan
Pengxuan Li

Abstract An autonomous path-planning strategy based on the Skinner operant conditioning principle and the reinforcement learning principle is developed in this paper. Its core components are a tendency cell and a cognitive learning cell, which simulate bionic orientation and asymptotic learning ability. The cognitive learning cell is designed on the basis of a Boltzmann machine and an improved Q-learning algorithm, and executes the operant action learning function to approximate the operative part of the robot system. The tendency cell adjusts network weights using information entropy to evaluate the effect of operant actions. Simulation experiments on a mobile robot showed that the designed strategy enables the robot to perform autonomous navigation path planning: the robot learns to select actions autonomously according to the bionic orientation, with a fast convergence rate and high adaptability.
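The Boltzmann (softmax) action selection underlying the cognitive learning cell can be illustrated with tabular Q-learning on a toy chain task; the grid size, reward, and temperature schedule below are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(2)

def boltzmann_select(q_values, temperature):
    # Softmax over Q-values: high temperature explores,
    # low temperature exploits.
    prefs = np.asarray(q_values) / temperature
    prefs -= prefs.max()                      # numerical stability
    probs = np.exp(prefs) / np.exp(prefs).sum()
    return rng.choice(len(q_values), p=probs)

Q = np.zeros((5, 2))                          # 5 states, 2 actions (left/right)
alpha, gamma = 0.5, 0.9
for episode in range(200):
    T = max(0.1, 0.98 ** episode)             # anneal the temperature
    s = 0
    while s < 4:                              # state 4 is the goal
        a = boltzmann_select(Q[s], T)
        s2 = s + 1 if a == 1 else max(0, s - 1)
        r = 1.0 if s2 == 4 else 0.0
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(int(np.argmax(Q[0])))                   # greedy action at the start state
```

Annealing the temperature mirrors the asymptotic learning behavior described above: early episodes sample actions broadly, while later episodes settle on the learned preference.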

