Collision-free path planning for welding manipulator via hybrid algorithm of deep reinforcement learning and inverse kinematics

Author(s):  
Jie Zhong ◽  
Tao Wang ◽  
Lianglun Cheng

Abstract: In actual welding scenarios, an effective path planner is needed to find a collision-free path in the configuration space for a welding manipulator surrounded by obstacles. However, the sampling-based planner, the state-of-the-art method, is only probabilistically complete, and its computational complexity is sensitive to the state dimension. In this paper, we propose a path planner for welding manipulators based on deep reinforcement learning for solving path planning problems in high-dimensional continuous state and action spaces. Compared with the sampling-based method, it is more robust and less sensitive to the state dimension. Specifically, to improve learning efficiency, we introduce an inverse kinematics module to provide prior knowledge, and we design a gain module to avoid locally optimal policies; both are integrated into the training algorithm. To evaluate the proposed planning algorithm along multiple dimensions, we conducted multiple sets of path planning experiments for welding manipulators. The results show that our method not only improves convergence performance but is also superior in optimality and robustness of planning compared with most other planning algorithms.
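As a rough illustration of what an inverse kinematics computation provides, the sketch below solves the closed-form inverse kinematics of a planar two-link arm. It is not the authors' module: the abstract does not describe the manipulator model, so the two-link geometry, the function names `two_link_ik` and `two_link_fk`, and the link lengths are all assumptions chosen for brevity. The point is only that an IK solution maps a Cartesian target to joint angles, which is the kind of information that could serve as prior knowledge for a learning-based planner.

```python
import math


def two_link_ik(x, y, l1, l2):
    """Joint angles (q1, q2) with q2 >= 0 that place the tip at (x, y).

    Returns None when the target is outside the reachable workspace.
    Hypothetical planar two-link arm, used only for illustration.
    """
    d2 = x * x + y * y
    # Law of cosines gives the cosine of the elbow angle.
    c2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        return None  # target too far away or too close to the base
    q2 = math.acos(c2)
    # Shoulder angle: direction to the target minus the offset caused by link 2.
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2


def two_link_fk(q1, q2, l1, l2):
    """Forward kinematics of the same arm, used to check an IK solution."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y
```

Feeding the result of `two_link_ik` back through `two_link_fk` should reproduce the target up to floating-point error, which is a convenient sanity check for any IK routine.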

Sensors ◽  
2020 ◽  
Vol 20 (19) ◽  
pp. 5493
Author(s):  
Junli Gao ◽  
Weijie Ye ◽  
Jing Guo ◽  
Zhongjuan Li

This paper proposes a novel incremental training mode to address the problem of Deep Reinforcement Learning (DRL) based path planning for a mobile robot. First, we evaluate the related graph search algorithms and Reinforcement Learning (RL) algorithms in a lightweight 2D environment. Then, we design the DRL-based algorithm, including the observation states, reward function, network structure, and parameter optimization, in a 2D environment to avoid the time-consuming work of designing it in a 3D environment. We transfer the designed algorithm to a simple 3D environment for retraining to obtain the converged network parameters, including the weights and biases of the deep neural network (DNN). Using these parameters as initial values, we continue to train the model in a complex 3D environment. To improve the generalization of the model across different scenes, we propose combining the DRL algorithm Twin Delayed Deep Deterministic policy gradients (TD3) with the traditional global path planning algorithm Probabilistic Roadmap (PRM) as a novel path planner (PRM+TD3). Experimental results show that the incremental training mode can notably improve development efficiency. Moreover, the PRM+TD3 path planner can effectively improve the generalization of the model.


Author(s):  
Hrishikesh Dey ◽  
Rithika Ranadive ◽  
Abhishek Chaudhari

A path planning algorithm integrated with a velocity profile generation-based navigation system is one of the most important aspects of an autonomous driving system. In this paper, a real-time path planning solution to obtain a feasible and collision-free trajectory is proposed for navigating an autonomous car on a virtual highway. This is achieved by designing the navigation algorithm to incorporate a path planner for finding the optimal path and a velocity planning algorithm for ensuring safe and comfortable motion along the obtained path. The navigation algorithm was validated on the Unity 3D Highway-Simulated Environment for practical driving while maintaining velocity and acceleration constraints. The autonomous vehicle drives at the maximum specified velocity until interrupted by vehicular traffic; the path planner then, based on the various constraints provided by the simulator using µWebSockets, decides either to decelerate the vehicle or to shift to a safer lane. Subsequently, spline-based trajectory generation for this path results in continuous and smooth trajectories. The velocity planner employs an analytical method based on a trapezoidal velocity profile to generate velocities for the vehicle traveling along the precomputed path. To provide smooth control, an S-like trapezoidal profile is considered that uses a cubic spline to generate velocities for the ramp-up and ramp-down portions of the curve. The acceleration and velocity constraints, which are derived from road limitations and physical systems, are explicitly considered. Depending upon these constraints and higher module requirements (e.g., maintaining velocity and stopping), an appropriate segment of the velocity profile is deployed. The motion profiles for all the use-cases are generated and verified graphically.
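The S-like trapezoidal profile described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the parameters `v_max`, `t_ramp`, and `t_cruise`, and the choice of the cubic 3u² − 2u³ for the ramps are assumptions. That cubic has zero slope at both ends, so the acceleration is continuous where each ramp meets the cruise segment, which is the property that makes the profile smoother than a plain trapezoid.

```python
def smooth_ramp(u):
    """Cubic blend 3u^2 - 2u^3 on [0, 1]; slope is zero at u = 0 and u = 1."""
    u = min(max(u, 0.0), 1.0)
    return u * u * (3.0 - 2.0 * u)


def s_trapezoid_velocity(t, v_max, t_ramp, t_cruise):
    """Velocity at time t: cubic ramp-up, constant cruise, cubic ramp-down.

    All parameter names are hypothetical; units are whatever the caller uses
    consistently (e.g. seconds and metres per second).
    """
    t_total = 2.0 * t_ramp + t_cruise
    if t <= 0.0 or t >= t_total:
        return 0.0
    if t < t_ramp:
        return v_max * smooth_ramp(t / t_ramp)
    if t <= t_ramp + t_cruise:
        return v_max
    return v_max * smooth_ramp((t_total - t) / t_ramp)


def peak_acceleration(v_max, t_ramp):
    """Largest acceleration magnitude on a ramp.

    The derivative of 3u^2 - 2u^3 is 6u(1 - u), which peaks at 1.5 for u = 0.5.
    """
    return 1.5 * v_max / t_ramp


def min_ramp_time(v_max, a_max):
    """Shortest ramp duration that keeps the acceleration within a_max."""
    return 1.5 * v_max / a_max
```

Under these assumptions, an acceleration limit translates directly into a minimum ramp duration through `min_ramp_time`, which is one simple way to respect the velocity and acceleration constraints mentioned in the abstract.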


Sensors ◽  
2020 ◽  
Vol 20 (11) ◽  
pp. 3039
Author(s):  
Bao Chau Phan ◽  
Ying-Chih Lai ◽  
Chin E. Lin

Renewable energy systems have been widely considered as a response to global environmental protection concerns. The photovoltaic (PV) system converts solar power into electricity and significantly reduces fossil fuel consumption and the environmental pollution it causes. Besides the introduction of new solar cell materials to improve energy conversion efficiency, maximum power point tracking (MPPT) algorithms have been developed to ensure the efficient operation of PV systems at the maximum power point (MPP) under various weather conditions. The integration of reinforcement learning and deep learning, named deep reinforcement learning (DRL), is proposed in this paper as a future tool for optimization control problems. Following the success of DRL in several fields, the deep Q network (DQN) and deep deterministic policy gradient (DDPG) are proposed to harvest the MPP in PV systems, especially under a partial shading condition (PSC). Unlike the reinforcement learning (RL)-based method, which operates only with discrete state and action spaces, the methods adopted in this paper handle continuous state spaces. In this study, DQN solves the problem with discrete action spaces, while DDPG handles continuous action spaces. The proposed methods are simulated in MATLAB/Simulink for feasibility analysis. Further tests under various input conditions, with comparisons to the classical perturb and observe (P&O) MPPT method, are carried out for validation. Based on the simulation results in this study, the proposed methods perform efficiently, showing their potential for further applications.
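For context, the classical perturb and observe (P&O) baseline mentioned above can be sketched in a few lines. This is a generic textbook-style sketch, not code from the paper: the function names, the fixed step size, and the single-peak `toy_power` curve are assumptions made for illustration, and the toy curve is not a PV cell model. The rule is to keep perturbing the operating voltage in the same direction while power rises, and to reverse direction when power falls.

```python
def po_step(v, p, v_prev, p_prev, step):
    """One P&O update: return the next voltage reference."""
    dv = v - v_prev
    dp = p - p_prev
    if dp == 0:
        return v  # no change in power: hold the operating point
    # Power moved in the same direction as voltage: raise the voltage.
    # Otherwise the peak lies at a lower voltage: lower it.
    if (dp > 0) == (dv > 0):
        return v + step
    return v - step


def toy_power(v):
    """Hypothetical single-peak power curve with its maximum at v = 30."""
    return max(0.0, 900.0 - (v - 30.0) ** 2)


def track(power_fn, v_start, step=0.5, iterations=200):
    """Run P&O from v_start and return the final voltage reference."""
    v_prev, p_prev = v_start, power_fn(v_start)
    v = v_start + step  # initial perturbation
    for _ in range(iterations):
        p = power_fn(v)
        v_next = po_step(v, p, v_prev, p_prev, step)
        v_prev, p_prev, v = v, p, v_next
    return v
```

On this single-peak curve, `track(toy_power, 10.0)` settles into a small oscillation around the maximum. Because the rule only compares neighbouring operating points, it climbs whichever peak is nearest, so on a curve with several peaks, as can occur under partial shading, it may settle on a local maximum.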


2021 ◽  
Vol 188 ◽  
pp. 106350
Author(s):  
Guichao Lin ◽  
Lixue Zhu ◽  
Jinhui Li ◽  
Xiangjun Zou ◽  
Yunchao Tang
