Behavior Acquisition of an Autonomous Robot by Reinforcement Learning Based on Globally Coupled Chaotic System. 2nd Report. Learning Navigational Behaviors in Dynamic Environment.

1997 ◽  
Vol 63 (615) ◽  
pp. 3977-3983 ◽ 
Author(s):  
Yoichiro NAKAMURA ◽  
Kazuhiro OHKURA ◽  
Kanji UEDA

2020 ◽ 
Vol 12 (20) ◽  
pp. 8718 ◽  
Author(s):  
Seunghoon Lee ◽  
Yongju Cho ◽  
Young Hoon Lee

In the injection mold industry, it is important for manufacturers to meet the delivery dates of the products their customers order. Mold products are diverse, and each product follows a different manufacturing process, so mold manufacturing is a complex and dynamic environment. To meet customer delivery dates, production scheduling must remain sustainable and intelligent even in such a complicated and dynamic setting. To address this, this paper proposes deep reinforcement learning (RL) for injection mold production scheduling. A mathematical model of the mold scheduling problem is first presented, and a Markov decision process framework is formulated for RL. A deep Q-network, an RL algorithm, is then employed to learn a scheduling policy that minimizes the total weighted tardiness. Experimental results demonstrate that the proposed deep RL method outperforms the dispatching rules used as baselines for minimizing the total weighted tardiness.
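As a rough illustration of the kind of approach the abstract describes, the following is a minimal sketch of a deep Q-network that dispatches jobs on a single machine to reduce total weighted tardiness. The state encoding, network sizes, toy job data, and training loop are illustrative assumptions, not the authors' model or implementation.

# Minimal deep Q-network sketch for single-machine scheduling with a
# total-weighted-tardiness objective (illustrative, not the paper's model).
import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

# Toy jobs: (processing time, due date, weight). Action = which job to start next.
jobs = [(3.0, 4.0, 1.0), (2.0, 2.0, 2.0), (4.0, 9.0, 1.5)]
n_actions = len(jobs)
state_dim = 3 * n_actions  # per-job features, zeroed once the job is scheduled

def encode(remaining):
    feats = []
    for i in range(n_actions):
        feats += list(jobs[i]) if i in remaining else [0.0, 0.0, 0.0]
    return torch.tensor(feats)

qnet = QNet(state_dim, n_actions)
target = QNet(state_dim, n_actions)
target.load_state_dict(qnet.state_dict())
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
gamma, eps = 0.99, 0.1

for episode in range(200):
    remaining, clock = set(range(n_actions)), 0.0
    while remaining:
        s = encode(remaining)
        if random.random() < eps:                          # epsilon-greedy exploration
            a = random.choice(sorted(remaining))
        else:
            q = qnet(s)
            a = max(remaining, key=lambda i: q[i].item())  # mask already-scheduled jobs
        p, d, w = jobs[a]
        clock += p
        reward = -w * max(0.0, clock - d)                  # negative tardiness increment
        remaining = remaining - {a}
        with torch.no_grad():                              # one-step TD target
            if remaining:
                y = reward + gamma * max(target(encode(remaining))[i].item()
                                         for i in remaining)
            else:
                y = reward
        loss = (qnet(s)[a] - y) ** 2
        opt.zero_grad(); loss.backward(); opt.step()
    if episode % 20 == 0:                                  # periodic target-network sync
        target.load_state_dict(qnet.state_dict())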


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3837 ◽  
Author(s):  
Junjie Zeng ◽  
Rusheng Ju ◽  
Long Qin ◽  
Yue Hu ◽  
Quanjun Yin ◽  
...  

In this paper, we propose a novel Deep Reinforcement Learning (DRL) algorithm that can navigate non-holonomic robots with continuous control in an unknown dynamic environment with moving obstacles. We call the approach MK-A3C (Memory and Knowledge-based Asynchronous Advantage Actor-Critic) for short. As its first component, MK-A3C builds a GRU-based memory neural network to enhance the robot's capability for temporal reasoning; without it, robots tend to behave irrationally in the face of incomplete and noisy observations of complex environments. In addition, the memory endowed by MK-A3C lets the robot avoid local-minimum traps by implicitly estimating the environmental model. Secondly, MK-A3C combines a domain-knowledge-based reward function with a transfer-learning-based training task architecture, which addresses the non-convergence of policies caused by sparse rewards. With these improvements, MK-A3C can efficiently navigate robots in unknown dynamic environments and satisfy kinetic constraints while handling moving obstacles. Simulation experiments show that, compared with existing methods, MK-A3C achieves successful robot navigation in unknown and challenging environments by outputting continuous acceleration commands.
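The following is a minimal sketch of a GRU-based actor-critic network of the kind the abstract describes: a recurrent memory over observation sequences, a Gaussian policy head that outputs continuous acceleration commands, and a value head for the critic. The observation layout, layer sizes, and action bounds are illustrative assumptions, not the MK-A3C implementation.

# Sketch of a recurrent actor-critic for continuous acceleration control
# (illustrative assumptions, not the authors' code).
import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    def __init__(self, obs_dim, hidden_dim=128, action_dim=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)  # temporal memory
        self.mu = nn.Linear(hidden_dim, action_dim)       # mean acceleration command
        self.log_std = nn.Parameter(torch.zeros(action_dim))
        self.value = nn.Linear(hidden_dim, 1)             # state value for the critic

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, time, obs_dim); h0 is the GRU hidden state carried
        # across steps so the policy can reason over partial, noisy observations.
        z = self.encoder(obs_seq)
        out, h = self.gru(z, h0)
        mu = torch.tanh(self.mu(out))                     # bounded accelerations
        std = self.log_std.exp().expand_as(mu)
        return torch.distributions.Normal(mu, std), self.value(out), h

# Example rollout step: sample a continuous action while carrying the GRU state.
net = RecurrentActorCritic(obs_dim=24)
obs = torch.randn(1, 1, 24)                # e.g. range readings + goal-relative pose
dist, value, h = net(obs)
action = dist.sample()                     # (1, 1, 2) linear/angular acceleration
log_prob = dist.log_prob(action).sum(-1)   # used in the A3C policy-gradient loss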

