The Use of Reinforcement Learning in the Task of Moving Objects with the Robotic Arm

Author(s): Ermek E. Aitygulov
2020, Vol. 1576, pp. 012057
Author(s): Chenyang Li, Yiheng Zhang, Tianwei Wang, Xinxiu Xu, Qifan Wang, et al.

Sensors, 2019, Vol. 19 (18), pp. 3837
Author(s): Junjie Zeng, Rusheng Ju, Long Qin, Yue Hu, Quanjun Yin, et al.

In this paper, we propose a novel Deep Reinforcement Learning (DRL) algorithm that can navigate non-holonomic robots with continuous control in an unknown dynamic environment with moving obstacles. We call the approach MK-A3C (Memory and Knowledge-based Asynchronous Advantage Actor-Critic) for short. As its first component, MK-A3C builds a GRU-based memory neural network to enhance the robot's capability for temporal reasoning. Robots without such memory tend to behave irrationally in the face of incomplete and noisy estimations of complex environments, whereas robots endowed with it by MK-A3C can avoid local-minimum traps by estimating the environmental model. Secondly, MK-A3C combines a domain-knowledge-based reward function with a transfer-learning-based training task architecture, which addresses the policy non-convergence problem caused by sparse rewards. Together, these improvements allow MK-A3C to navigate robots efficiently in unknown dynamic environments and to satisfy kinetic constraints while handling moving objects. Simulation experiments show that, compared with existing methods, MK-A3C achieves successful robotic navigation in unknown and challenging environments by outputting continuous acceleration commands.
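For readers who want a concrete picture of the memory component, the following is a minimal sketch of a GRU-based actor-critic head for continuous control, in the spirit of MK-A3C's first component. The network sizes, observation layout, and two-dimensional acceleration action are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions: obs_dim=24, 2-D acceleration action, hidden size 128).
import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    def __init__(self, obs_dim: int, action_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)   # temporal memory
        self.mu = nn.Linear(hidden_dim, action_dim)                   # mean of Gaussian policy
        self.log_std = nn.Parameter(torch.zeros(action_dim))          # state-independent std
        self.value = nn.Linear(hidden_dim, 1)                         # critic head

    def forward(self, obs_seq, h=None):
        # obs_seq: (batch, time, obs_dim); h: previous GRU hidden state
        x = self.encoder(obs_seq)
        x, h = self.gru(x, h)
        dist = torch.distributions.Normal(torch.tanh(self.mu(x)), self.log_std.exp())
        return dist, self.value(x), h

# Example rollout step: sample a continuous acceleration command.
net = RecurrentActorCritic(obs_dim=24, action_dim=2)
obs = torch.randn(1, 1, 24)            # one observation at one time step
dist, value, h = net(obs)
action = dist.sample()                 # e.g. linear and angular acceleration
```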


2021, Vol. 4 (1)
Author(s): Iason Batzianoulis, Fumiaki Iwane, Shupeng Wei, Carolina Gaspar Pinto Ramos Correia, Ricardo Chavarriaga, et al.

Abstract Robotic assistance via motorized robotic arm manipulators can be of great value to individuals with upper-limb motor disabilities. Brain-computer interfaces (BCI) offer an intuitive means to control such assistive robotic manipulators. However, BCI performance may vary due to the non-stationary nature of electroencephalogram (EEG) signals; hence, it cannot be used safely to control tasks where errors may be detrimental to the user. Avoiding obstacles is one such task. Since robotics already offers many obstacle-avoidance techniques, we propose to give the robot control over avoiding obstacles and to leave the choice of how it does so to the user as a matter of personal preference, as some users may be more daring while others are more careful. We enable users to train the robot controller to adapt the way it approaches obstacles, relying on a BCI that detects error-related potentials (ErrP), which indicate that the robot's current strategy does not meet the user's expectations. Gaussian process-based inverse reinforcement learning, in combination with the ErrP-BCI, infers the user's preference and updates the obstacle-avoidance controller so as to generate personalized robot trajectories. We validate the approach in experiments with thirteen able-bodied subjects using a robotic arm that picks up, places, and avoids real-life objects. Results show that the algorithm can learn the user's preference and rapidly adapt the robot's behavior using fewer than five demonstrations, which need not be optimal.
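To illustrate the preference-inference step, the sketch below fits a Gaussian process to trajectory feedback: each demonstrated trajectory is summarized by a single hand-picked feature (minimum obstacle clearance), and ErrP detections serve as negative feedback. The feature, labels, and kernel are illustrative assumptions and greatly simplify the paper's Gaussian process-based inverse reinforcement learning.

```python
# Minimal sketch (assumptions: one clearance feature per trajectory, +-1 ErrP labels, RBF kernel).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Demonstrated trajectories summarized by minimum obstacle clearance (meters);
# score is -1 when the ErrP classifier flagged the behavior as unwanted, +1 otherwise.
clearances = np.array([[0.02], [0.05], [0.10], [0.20], [0.35]])
errp_feedback = np.array([-1.0, -1.0, 1.0, 1.0, -1.0])   # too close and too wide both rejected

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.1) + WhiteKernel(1e-2),
                              normalize_y=True)
gp.fit(clearances, errp_feedback)

# The controller then picks the clearance with the highest inferred preference.
candidates = np.linspace(0.01, 0.4, 100).reshape(-1, 1)
pref_mean, pref_std = gp.predict(candidates, return_std=True)
best_clearance = candidates[np.argmax(pref_mean)][0]
print(f"personalized obstacle clearance: {best_clearance:.2f} m")
```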


2020, Vol. 10 (16), pp. 5574
Author(s): Ithan Moreira, Javier Rivas, Francisco Cruz, Richard Dazeley, Angel Ayala, et al.

Robots are extending their presence in domestic environments every day, and it is increasingly common to see them carrying out tasks in home scenarios. In the future, robots are expected to perform ever more complex tasks and, therefore, to be able to acquire experience from different sources as quickly as possible. A plausible approach to address this issue is interactive feedback, where a trainer advises a learner on which actions should be taken from specific states to speed up the learning process. Moreover, deep reinforcement learning has recently been widely used in robotics to learn about the environment and acquire new skills autonomously. However, an open issue when using deep reinforcement learning is the excessive time needed to learn a task from raw input images. In this work, we propose a deep reinforcement learning approach with interactive feedback to learn a domestic task in a Human–Robot scenario. We compare three different learning methods using a simulated robotic arm for the task of organizing different objects: (i) deep reinforcement learning (DeepRL); (ii) interactive deep reinforcement learning using a previously trained artificial agent as an advisor (agent–IDeepRL); and (iii) interactive deep reinforcement learning using a human advisor (human–IDeepRL). We demonstrate that the interactive approaches provide advantages for the learning process. The obtained results show that a learner agent, using either agent–IDeepRL or human–IDeepRL, completes the given task earlier and makes fewer mistakes than the autonomous DeepRL approach.
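The core of the interactive-feedback idea can be sketched as an action-selection rule in which, with some probability, the learner's exploratory choice is replaced by an advisor's suggestion. The environment interface, advice probability, and advisor policy below are illustrative assumptions rather than the paper's exact setup.

```python
# Minimal sketch (assumptions: tabular Q-values, fixed advice probability, toy advisor).
import random
from collections import defaultdict

def choose_action(q, state, actions, epsilon, advisor=None, advice_prob=0.3):
    """Epsilon-greedy action selection with optional interactive advice."""
    if advisor is not None and random.random() < advice_prob:
        return advisor(state)                        # trainer (agent or human) advises
    if random.random() < epsilon:
        return random.choice(actions)                # autonomous exploration
    return max(actions, key=lambda a: q[(state, a)])  # greedy exploitation

# Tiny usage example with a hypothetical advisor that always suggests action 0.
q = defaultdict(float)
actions = [0, 1, 2]
action = choose_action(q, state="start", actions=actions, epsilon=0.1,
                       advisor=lambda s: 0)
```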


Mechatronics, 2021, Vol. 78, pp. 102630
Author(s): Aolei Yang, Yanling Chen, Wasif Naeem, Minrui Fei, Ling Chen

2021
Author(s): Ming-Fei Chen, Han-Hsien Tsai, Wen-Tse Hsiao

Abstract This study developed a robotic arm self-learning system based on virtual modeling and reinforcement learning. Using a model of the robotic arm, information about obstacles in the environment, the arm's initial coordinates, and the target position, the system automatically generates a set of rotation angles that position the robotic arm so that it avoids all obstacles and reaches the target. The developed program was divided into three parts. The first part involves robotic arm simulation and collision detection: images of a six-axis robotic arm and obstacles were input to the Visualization Toolkit (VTK) library to visualize the arm's movements and surrounding environment, and an oriented bounding box (OBB) algorithm was then used to determine whether collisions had occurred. The second part concerns machine-learning-based route planning: TensorFlow was used to establish a deep deterministic policy gradient (DDPG) model, and reinforcement learning was employed to respond to environmental variables. Different reward functions were designed, tested, and discussed, and the program's practicality was verified through actual machine operation. Finally, the experiments demonstrated that applying reinforcement learning to route planning for a robotic arm is feasible; this application enabled automatic route planning and achieved a positioning error of less than 10 mm from the target.
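As one plausible illustration of the reward-function design space the study explores, the sketch below rewards progress toward the target and penalizes collisions reported by the bounding-box check. The weights and the 10 mm success threshold are assumptions inspired by the reported positioning error, not the study's actual reward functions.

```python
# Minimal sketch (assumptions: positions in meters, fixed +-100 terminal rewards, dense distance shaping).
import numpy as np

def reward(end_effector_pos, target_pos, collided: bool) -> float:
    distance = np.linalg.norm(np.asarray(end_effector_pos) - np.asarray(target_pos))
    if collided:
        return -100.0          # strong penalty when the OBB test reports contact
    if distance < 0.010:       # within 10 mm of the target: success
        return 100.0
    return -distance           # dense shaping: closer is better

# Example: 25 mm from the target, no collision.
print(reward([0.500, 0.200, 0.300], [0.520, 0.215, 0.300], collided=False))
```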


Author(s): Hsuan-Kung Yang, Po-Han Chiang, Min-Fong Hong, Chun-Yi Lee

In this paper, we focus on a prediction-based novelty estimation strategy built on the deep reinforcement learning (DRL) framework, and present a flow-based intrinsic curiosity module (FICM) that exploits the prediction errors from optical flow estimation as exploration bonuses. We propose the concept of leveraging motion features captured between consecutive observations to evaluate the novelty of observations in an environment. FICM encourages a DRL agent to explore observations with unfamiliar motion features, and requires only two consecutive frames to obtain sufficient information when estimating novelty. We evaluate our method and compare it with a number of existing methods on multiple benchmark environments, including Atari games, Super Mario Bros., and ViZDoom. We demonstrate that FICM is favorable for tasks and environments featuring moving objects, which allow FICM to exploit the motion features between consecutive observations. We further provide an ablative analysis of FICM's encoding efficiency and discuss its applicable domains. Our code and demo videos are available online.
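The intrinsic-bonus idea can be sketched as follows: a small predictor estimates optical flow between two consecutive frames, the first frame is warped by that flow, and the reconstruction error serves as the exploration bonus. The tiny network, warping scheme, and scaling below are illustrative assumptions, not FICM's actual architecture.

```python
# Minimal sketch (assumptions: grayscale 84x84 frames, a two-layer flow predictor, MSE bonus).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFlowPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 2, 3, padding=1),          # 2-channel flow field (dx, dy)
        )

    def forward(self, frame_t, frame_tp1):
        return self.net(torch.cat([frame_t, frame_tp1], dim=1))

def warp(frame, flow):
    """Warp a (N,1,H,W) frame by a (N,2,H,W) flow field using grid_sample."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    grid = base + flow.permute(0, 2, 3, 1)
    return F.grid_sample(frame, grid, align_corners=True)

# Example: the prediction (reconstruction) error becomes the exploration bonus.
predictor = TinyFlowPredictor()
frame_t, frame_tp1 = torch.rand(1, 1, 84, 84), torch.rand(1, 1, 84, 84)
flow = predictor(frame_t, frame_tp1)
intrinsic_bonus = F.mse_loss(warp(frame_t, flow), frame_tp1).item()
```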

