Reinforcement learning for motion control of humanoid robots

Author(s): S. Iida, M. Kanoh, S. Kato, H. Itoh

Robotica, 2005, Vol 24 (2), pp. 257-268
Author(s): Hun-ok Lim, Sang-ho Hyon, Samuel A. Setiawan, Atsuo Takanishi

Our goal is to develop biped humanoid robots capable of working stably in human living and working spaces, with a focus on their physical construction and motion control. At the first stage, we developed a human-like biped robot, WABIAN (WAseda BIped humANoid), which has thirty-five mechanical degrees of freedom; its height is 1.66 [m] and its weight 107.4 [kg]. In this paper, a moment compensation method for stability is described, based on the motion of the robot's head, legs, and arms. A follow-walking method based on a pattern-switching technique is also proposed. By combining both methods, the biped robot can perform dynamic stepping and walk forward and backward continuously while someone pushes or pulls its hand. Human-follow walking experiments are conducted using WABIAN, and the effectiveness of the methods is verified.
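The moment-compensation idea can be sketched with a simple moment balance: the trunk (head/arms) is accelerated so that its inertial moment about the desired zero-moment point (ZMP) cancels the moment produced by the planned leg motion. The function name, the single point-mass trunk model, and all numbers below are illustrative assumptions, not WABIAN's actual controller.

```python
def compensating_trunk_accel(leg_moment, trunk_mass, trunk_height, g=9.81,
                             trunk_x=0.0, zmp_x=0.0):
    """Horizontal trunk acceleration (m/s^2) that cancels `leg_moment` (N*m)
    about the desired ZMP, using a single point-mass trunk model.

    Sagittal-plane moment balance about the ZMP (hypothetical model):
        leg_moment + m*g*(trunk_x - zmp_x) - m*accel*trunk_height = 0
    """
    gravity_moment = trunk_mass * g * (trunk_x - zmp_x)
    return (leg_moment + gravity_moment) / (trunk_mass * trunk_height)

# Example: a 30 kg trunk at 1.0 m height cancelling a 15 N*m leg moment
accel = compensating_trunk_accel(15.0, trunk_mass=30.0, trunk_height=1.0)
print(round(accel, 3))  # prints 0.5
```

With the trunk positioned directly above the ZMP, the gravity term vanishes and the required acceleration is simply the leg moment divided by the trunk's mass-height product.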


2021, Vol 33 (1), pp. 129-156
Author(s): Masami Iwamoto, Daichi Kato

This letter proposes a new idea to improve learning efficiency in reinforcement learning (RL) with the actor-critic method used as a muscle controller for posture stabilization of the human arm. Actor-critic RL (ACRL) is used in simulations to realize posture control in humans or robots through muscle tension control. However, it incurs very high computational costs to acquire a good muscle control policy for desirable postures. For efficient ACRL, we focused on embodiment, which is thought to enable efficient control in the research fields of artificial intelligence and robotics. According to the neurophysiology of motion control derived from experimental studies on animals and humans, the pedunculopontine tegmental nucleus (PPTn) induces muscle tone suppression, and the midbrain locomotor region (MLR) induces muscle tone promotion. The PPTn and MLR modulate the activation levels of mutually antagonizing muscles, such as flexors and extensors, in the process through which control signals are transmitted from the substantia nigra pars reticulata to the brain stem. We therefore hypothesized that the PPTn and MLR control muscle tone, that is, the maximum activation levels of mutually antagonizing muscles, using a different sigmoidal function for each muscle. We then introduced antagonism function models (AFMs) of the PPTn and MLR for individual muscles, incorporating this hypothesis into the process of determining each muscle's activation level from the output of the actor in ACRL. ACRL with AFMs, representing the embodiment of muscle tone, achieved posture stabilization in five joint motions of the right arm of an adult male under gravity at predetermined target angles at an earlier stage of learning than the methods without AFMs. These results suggest that introducing the embodiment of muscle tone can enhance learning efficiency in posture stabilization for humans or humanoid robots.
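The AFM idea described above can be sketched as follows: two opposing sigmoids of a modulating signal set the maximum activation (muscle tone) of a flexor-extensor pair, and the raw actor drives are scaled by these caps. The gain, the use of the actor output itself as the modulator, and the function names are illustrative assumptions, not the authors' exact parameterization.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def afm_activations(actor_output, flexor_drive, extensor_drive, gain=4.0):
    """Scale raw actor drives by sigmoidal muscle-tone caps.

    actor_output in [-1, 1]: positive favors the flexor, negative the extensor.
    Each muscle's cap (its maximum activation level) rises as the modulating
    signal favors it, mimicking MLR-like promotion and PPTn-like suppression.
    """
    flexor_cap = sigmoid(gain * actor_output)     # promotion of the flexor
    extensor_cap = sigmoid(-gain * actor_output)  # suppression of the extensor
    return flexor_cap * flexor_drive, extensor_cap * extensor_drive

# Example: an actor output of 0.5 favors the flexor over the extensor
flexor_act, extensor_act = afm_activations(0.5, flexor_drive=1.0, extensor_drive=1.0)
```

Because the two caps are mirrored sigmoids, raising one antagonist's tone necessarily lowers the other's, which is the antagonism the models are meant to capture.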


Sensors, 2019, Vol 19 (21), pp. 4794
Author(s): Alejandro Rodriguez-Ramos, Adrian Alvarez-Fernandez, Hriday Bavle, Pascual Campoy, Jonathan P. How

Deep learning and reinforcement learning techniques increasingly require large sets of real data to achieve stable convergence and generalization, in contexts such as image recognition, object detection, and motion control. The research community still lacks robust approaches for overcoming the scarcity of extensive real-world data by means of realistic synthetic information and domain-adaptation techniques. In this work, synthetic-learning strategies are used for the vision-based autonomous following of a noncooperative multirotor. The complete maneuver was learned from synthetic images and high-dimensional low-level continuous robot states, using deep learning for object detection and reinforcement learning for motion control. A novel motion-control strategy for object following is introduced in which the camera gimbal movement is coupled with the multirotor motion during following. The framework was extensively validated in both simulated and real-flight scenarios, confirming that a vision-based task trained on synthetic data can be deployed in real flight, with the system following a multirotor at up to 1.3 m/s in simulation and 0.3 m/s in real flights.
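The coupling between gimbal and body motion can be sketched with a simple proportional scheme: the gimbal keeps the detected target centered in the image, while the body yaws to unwind the gimbal angle and moves forward to close the range gap. The gains, signal names, and proportional structure are illustrative assumptions, not the learned policy from the paper.

```python
def following_commands(pixel_err_x, gimbal_angle, depth_err,
                       k_gimbal=0.002, k_yaw=1.0, k_fwd=0.5):
    """Return (gimbal_rate, yaw_rate, forward_vel) commands for target following.

    pixel_err_x : horizontal offset of the target from image center (px)
    gimbal_angle: current gimbal pan angle (rad); the body yaws to unwind it
    depth_err   : distance to target minus the desired standoff distance (m)
    """
    gimbal_rate = -k_gimbal * pixel_err_x  # re-center the target in the image
    yaw_rate = k_yaw * gimbal_angle        # rotate the body toward the gimbal
    forward_vel = k_fwd * depth_err        # close the range gap
    return gimbal_rate, yaw_rate, forward_vel

# Example: target 100 px right of center, gimbal panned 0.2 rad, 1 m too far
g_rate, y_rate, fwd = following_commands(100.0, 0.2, 1.0)
```

The key design point is that the gimbal reacts to fast image-space errors while the slower body motion follows the gimbal, so the target stays in view even during aggressive maneuvers.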

