Learning quadrupedal locomotion over challenging terrain

2020 ◽  
Vol 5 (47) ◽  
pp. eabc5986 ◽  
Author(s):  
Joonho Lee ◽  
Jemin Hwangbo ◽  
Lorenz Wellhausen ◽  
Vladlen Koltun ◽  
Marco Hutter

Legged locomotion can extend the operational domain of robots to some of the most challenging environments on Earth. However, conventional controllers for legged locomotion are based on elaborate state machines that explicitly trigger the execution of motion primitives and reflexes. These designs have increased in complexity but fallen short of the generality and robustness of animal locomotion. Here, we present a robust controller for blind quadrupedal locomotion in challenging natural environments. Our approach incorporates proprioceptive feedback in locomotion control and demonstrates zero-shot generalization from simulation to natural environments. The controller is trained by reinforcement learning in simulation and is driven by a neural network policy that acts on a stream of proprioceptive signals. It retains its robustness under conditions that were never encountered during training: deformable terrains such as mud and snow, dynamic footholds such as rubble, and overground impediments such as thick vegetation and gushing water. The presented work indicates that robust locomotion in natural environments can be achieved by training in simple domains.
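A minimal sketch of the kind of proprioceptive policy described above, assuming a simple feed-forward network over a stacked history window of joint states, base orientation, and previous actions; the dimensions and architecture are illustrative assumptions, not the authors' implementation:

```python
# Sketch only: a feed-forward policy mapping a window of proprioceptive measurements
# to joint position targets for a low-level PD controller. All sizes are assumptions.
import torch
import torch.nn as nn

N_JOINTS = 12                                    # assumed: 3 actuated joints per leg
PROPRIO_DIM = 2 * N_JOINTS + 4 + N_JOINTS        # joint pos/vel + orientation quat + last action
HISTORY = 30                                     # assumed length of the proprioceptive history

class BlindLocomotionPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HISTORY * PROPRIO_DIM, 256), nn.ELU(),
            nn.Linear(256, 128), nn.ELU(),
            nn.Linear(128, N_JOINTS),            # joint position targets
        )

    def forward(self, proprio_history: torch.Tensor) -> torch.Tensor:
        # proprio_history: (batch, HISTORY, PROPRIO_DIM) stream of proprioceptive signals
        return self.net(proprio_history.flatten(start_dim=1))

if __name__ == "__main__":
    policy = BlindLocomotionPolicy()
    obs = torch.randn(1, HISTORY, PROPRIO_DIM)   # stand-in for real sensor readings
    print(policy(obs).shape)                     # -> torch.Size([1, 12])
```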

2015 ◽  
Vol 7 (3) ◽  
Author(s):  
Elena Garcia ◽  
Juan C. Arevalo ◽  
Manuel Cestari ◽  
Daniel Sanz-Merodio

The legged locomotion system of biological quadrupeds has proven to be the most efficient in natural, complex terrain. In particular, horses' legs have evolved to provide speed, endurance, and strength superior to any other animal of equal size. Quadruped robots, emulating their biological counterparts, could become the best choice for field missions in complex or natural environments; however, they must offer optimal performance in terms of mobility, payload, and endurance. The design of the leg mechanism is of paramount importance in achieving the targeted performance, and in designing a leg mechanism able to provide the robot with such agile capabilities, nature is the best source of inspiration. In this work, key principles underlying the power capabilities of horse legs have been extracted and translated into a biomimetic leg concept. A real prototype has then been designed following the proposed biomimetic concept. A key element of the biomimetic concept is the multifunctionality of the natural musculotendinous system, which has been mimicked by combining series elastic actuation and passive elements. This work provides an assessment of the benefits that bio-inspired solutions can provide over purely engineering approaches. The experimental evaluation of the bio-inspired prototype shows improved performance compared with a leg design based purely on engineering principles.
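The combination of series elastic actuation and passive elements can be illustrated with a lumped-parameter joint torque model; the sketch below is a hypothetical illustration with assumed stiffness values, not the prototype's actual parameters:

```python
# Illustrative sketch only: total joint torque from a series elastic actuator (motor
# drives the joint through a spring) plus a passive parallel elastic element, echoing
# the musculotendinous multifunctionality described above. Constants are hypothetical.

def joint_torque(theta_motor, theta_joint, theta_rest,
                 k_series=250.0, k_parallel=40.0):
    """Return total joint torque [N·m].

    theta_motor  : motor-side position after the gearbox [rad]
    theta_joint  : joint (link-side) position [rad]
    theta_rest   : rest angle of the passive parallel spring [rad]
    k_series     : series spring stiffness [N·m/rad] (assumed)
    k_parallel   : passive parallel stiffness [N·m/rad] (assumed)
    """
    tau_series = k_series * (theta_motor - theta_joint)     # elastic actuation path
    tau_passive = k_parallel * (theta_rest - theta_joint)   # energy-storing passive path
    return tau_series + tau_passive


if __name__ == "__main__":
    # Example call with arbitrary angles [rad].
    print(joint_torque(theta_motor=0.5, theta_joint=0.6, theta_rest=0.9))
```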


2021 ◽  
Vol 2 (1) ◽  
pp. 1-25
Author(s):  
Yongsen Ma ◽  
Sheheryar Arshad ◽  
Swetha Muniraju ◽  
Eric Torkildson ◽  
Enrico Rantala ◽  
...  

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of the CSI data. The state machine learns temporal dependency information from the history of classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. It achieves 97% average accuracy when the testing devices and persons are not seen during training, and it is also evaluated on two public datasets, achieving accuracies of 80% and 83%. The design requires very little human effort for ground-truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.
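As a rough illustration of the recognition-algorithm component, the sketch below shows a 2D CNN classifying a window of CSI amplitudes; the input shape, channel layout, and class count are assumptions, and the network is far simpler than the searched architecture described above:

```python
# Sketch only: a 2D CNN over a CSI window arranged as (antenna pairs, subcarriers, time).
# Shapes and class count are illustrative assumptions, not the authors' configuration.
import torch
import torch.nn as nn

N_ANTENNA_PAIRS = 3      # assumed Tx-Rx antenna pairs, used as input channels
N_SUBCARRIERS = 30       # assumed OFDM subcarriers reported per pair
N_SAMPLES = 100          # assumed CSI packets per classification window
N_CLASSES = 6            # assumed number of activities

class CSIActivityCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(N_ANTENNA_PAIRS, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),             # pool away subcarrier/time extent
        )
        self.classifier = nn.Linear(32, N_CLASSES)

    def forward(self, csi: torch.Tensor) -> torch.Tensor:
        # csi: (batch, N_ANTENNA_PAIRS, N_SUBCARRIERS, N_SAMPLES) amplitude tensor
        return self.classifier(self.features(csi).flatten(1))

if __name__ == "__main__":
    model = CSIActivityCNN()
    window = torch.randn(1, N_ANTENNA_PAIRS, N_SUBCARRIERS, N_SAMPLES)
    print(model(window).shape)                   # -> torch.Size([1, 6])
```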


2021 ◽  
Vol 11 (7) ◽  
pp. 3257
Author(s):  
Chen-Huan Pi ◽  
Wei-Yuan Ye ◽  
Stone Cheng

This paper presents a novel reinforcement learning control strategy with disturbance compensation for quadrotor positioning under external disturbances. The proposed scheme applies a trained neural-network-based reinforcement learning agent to control the quadrotor, with its output mapped directly to the four actuators in an end-to-end manner, and it constructs a disturbance observer to estimate the external forces exerted on the three axes of the quadrotor, such as wind gusts in an outdoor environment. By introducing a disturbance compensator into the neural network control agent, tracking accuracy and robustness were significantly increased in indoor and outdoor experiments. The experimental results indicate that the proposed control strategy is highly robust to external disturbances: compensation improved control accuracy and reduced positioning error by 75%. To the best of our knowledge, this study is the first to achieve quadrotor positioning control through low-level reinforcement learning by using a global positioning system in an outdoor environment.
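A disturbance observer of the general kind described above can be sketched as a per-axis, first-order filter on the residual between measured acceleration and commanded thrust; the mass, gain, and formulation below are illustrative assumptions rather than the paper's exact observer:

```python
# Illustrative sketch only: estimate the external force on each translational axis by
# low-pass filtering the mismatch between the rigid-body model and the measurement.
import numpy as np

class DisturbanceObserver:
    def __init__(self, mass: float, gain: float = 5.0):
        self.mass = mass                 # quadrotor mass [kg] (assumed known)
        self.gain = gain                 # observer bandwidth [1/s] (assumed)
        self.f_ext_hat = np.zeros(3)     # running external-force estimate [N]

    def update(self, accel_meas, thrust_world, dt):
        """accel_meas: measured acceleration in the world frame, gravity removed [m/s^2]
        thrust_world: commanded thrust resolved in the world frame [N]
        dt: time step [s]"""
        residual = self.mass * np.asarray(accel_meas) - np.asarray(thrust_world)
        # First-order filter drives the estimate toward the model residual.
        self.f_ext_hat += self.gain * (residual - self.f_ext_hat) * dt
        return self.f_ext_hat

if __name__ == "__main__":
    obs = DisturbanceObserver(mass=1.2)
    # Constant 0.6 N gust along x: the estimate converges toward [0.6, 0, 0].
    for _ in range(1000):
        est = obs.update(accel_meas=[0.5, 0.0, 0.0],
                         thrust_world=[0.0, 0.0, 0.0], dt=0.01)
    print(est)
```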


Author(s):  
Anil S. Baslamisli ◽  
Partha Das ◽  
Hoang-An Le ◽  
Sezer Karaoglu ◽  
Theo Gevers

In general, intrinsic image decomposition algorithms interpret shading as one unified component that includes all photometric effects. As shading transitions are generally smoother than reflectance (albedo) changes, these methods may fail to distinguish strong photometric effects from reflectance variations. Therefore, in this paper, we propose to decompose the shading component into direct (illumination) and indirect (ambient light and shadows) shading subcomponents. The aim is to distinguish strong photometric effects from reflectance variations. An end-to-end deep convolutional neural network (ShadingNet) is proposed that operates in a fine-to-coarse manner with a specialized fusion and refinement unit exploiting the fine-grained shading model. It is designed to learn specific reflectance cues separated from specific photometric effects to analyze the disentanglement capability. A large-scale dataset of scene-level synthetic images of outdoor natural environments is provided with fine-grained intrinsic image ground truths. Large-scale experiments show that our approach using fine-grained shading decompositions outperforms state-of-the-art algorithms utilizing unified shading on the NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS, and SRD datasets.
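The fine-grained decomposition implies an image-formation model in which reflectance modulates the sum of direct and indirect shading; the sketch below illustrates that composition and a simple reconstruction loss, as an assumption consistent with the text rather than ShadingNet itself:

```python
# Sketch only: compose an image from reflectance and fine-grained shading terms, and
# measure how well a predicted decomposition reconstructs the observed image.
import numpy as np

def compose_image(albedo: np.ndarray, shading_direct: np.ndarray,
                  shading_indirect: np.ndarray) -> np.ndarray:
    """All inputs are H x W x 3 arrays in [0, 1]; the shading terms are added
    before modulating the reflectance."""
    return np.clip(albedo * (shading_direct + shading_indirect), 0.0, 1.0)

def reconstruction_loss(image, albedo, shading_direct, shading_indirect):
    """Mean squared error between the observed image and its re-composition."""
    recon = compose_image(albedo, shading_direct, shading_indirect)
    return float(np.mean((image - recon) ** 2))

if __name__ == "__main__":
    h, w = 4, 4
    albedo = np.full((h, w, 3), 0.8)
    direct = np.full((h, w, 3), 0.5)      # e.g. sunlight
    indirect = np.full((h, w, 3), 0.2)    # e.g. ambient light in shadowed regions
    img = compose_image(albedo, direct, indirect)
    print(reconstruction_loss(img, albedo, direct, indirect))  # -> 0.0
```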


2021 ◽  
Vol 36 ◽  
Author(s):  
Sergio Valcarcel Macua ◽  
Ian Davies ◽  
Aleksi Tukiainen ◽  
Enrique Munoz de Cote

We propose a fully distributed actor-critic architecture, named diffusion-distributed actor-critic (Diff-DAC), with application to multitask reinforcement learning (MRL). During the learning process, agents communicate their value and policy parameters to their neighbours, diffusing the information across a network of agents with no need for a central station. Each agent can only access data from its local task but aims to learn a common policy that performs well for the whole set of tasks. The architecture is scalable, since the computational and communication cost per agent depends on the number of neighbours rather than the overall number of agents. We derive Diff-DAC from duality theory and provide novel insights into the actor-critic framework, showing that it is actually an instance of the dual-ascent method. We prove almost-sure convergence of Diff-DAC to a common policy under general assumptions that hold even for deep neural network approximations. Under more restrictive assumptions, we also prove that this common policy is a stationary point of an approximation of the original problem. Numerical results on multitask extensions of common continuous-control benchmarks demonstrate that Diff-DAC stabilises learning and has a regularising effect that induces higher performance and better generalisation properties than previous architectures.
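The diffusion mechanism can be sketched as an adapt-then-combine update, in which each agent takes a local gradient step on its own task and then averages parameters with its neighbours through a doubly stochastic combination matrix; this is a generic illustration of the principle, not the exact Diff-DAC recursion:

```python
# Sketch only: one adapt-then-combine (diffusion) round over a network of agents.
import numpy as np

def diffusion_round(params, grads, combination_matrix, step_size=0.01):
    """params: (n_agents, n_params) current actor (or critic) parameters
    grads: (n_agents, n_params) local gradients from each agent's own task
    combination_matrix: (n_agents, n_agents) doubly stochastic weights; entry (i, j)
    is nonzero only if j is a neighbour of i (or i itself)."""
    adapted = params - step_size * grads          # adapt: local learning step
    return combination_matrix @ adapted           # combine: diffuse across neighbours

if __name__ == "__main__":
    # Three agents on a line graph 0 - 1 - 2 with Metropolis weights (assumed).
    C = np.array([[2/3, 1/3, 0.0],
                  [1/3, 1/3, 1/3],
                  [0.0, 1/3, 2/3]])
    theta = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
    grads = np.zeros_like(theta)                  # no local learning, diffusion only
    for _ in range(50):
        theta = diffusion_round(theta, grads, C)
    print(theta)   # rows converge toward a common parameter vector
```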

