Accelerating the Deep Reinforcement Learning with Neural Network Compression

Author(s):  
Hongjie Zhang ◽  
Zhuocheng He ◽  
Jing Li
2018 ◽  
Vol 2018 (2) ◽  
pp. 153-1-153-5
Author(s):  
Chirag Agarwal ◽  
Mehdi Sharifzadeh ◽  
Dan Schonfeld
2021 ◽  
Vol 2 (1) ◽  
pp. 1-25
Author(s):  
Yongsen Ma ◽  
Sheheryar Arshad ◽  
Swetha Muniraju ◽  
Eric Torkildson ◽  
Enrico Rantala ◽  
...  

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of CSI data. The state machine learns temporal dependency information from past classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. It achieves 97% average accuracy when the test devices and persons are not seen during training, and reaches 80% and 83% accuracy on two public datasets. The design requires very little human effort for ground-truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.
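A minimal sketch of how the first two networks of such a design could fit together, assuming PyTorch; the layer sizes, input shapes, and the class names `RecognitionCNN` and `StateMachineCNN` are illustrative assumptions, not the authors' published architecture:

```python
import torch
import torch.nn as nn

class RecognitionCNN(nn.Module):
    """2D CNN that maps a CSI 'image' (subcarriers x time) to class logits."""
    def __init__(self, num_classes: int, in_channels: int = 1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # collapse to one feature vector
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, csi: torch.Tensor) -> torch.Tensor:
        # csi: (batch, in_channels, subcarriers, time)
        return self.classifier(self.features(csi).flatten(1))

class StateMachineCNN(nn.Module):
    """1D CNN that smooths a window of past classification results."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(num_classes, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, history_probs: torch.Tensor) -> torch.Tensor:
        # history_probs: (batch, num_classes, history_length)
        return self.net(history_probs)
```

In this reading, the recognition network classifies each CSI window independently, while the state machine re-scores its output against a short history of predictions to enforce temporal consistency.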


2021 ◽  
Vol 11 (7) ◽  
pp. 3257
Author(s):  
Chen-Huan Pi ◽  
Wei-Yuan Ye ◽  
Stone Cheng

In this paper, a novel control strategy is presented for reinforcement learning with disturbance compensation to solve the problem of quadrotor positioning under external disturbance. The proposed scheme applies a trained neural-network-based reinforcement learning agent to control the quadrotor, with its output mapped directly to the four actuators in an end-to-end manner. The scheme constructs a disturbance observer to estimate the external forces exerted on the three axes of the quadrotor, such as wind gusts in an outdoor environment. By introducing a disturbance compensator into the neural network control agent, tracking accuracy and robustness are significantly increased in both indoor and outdoor experiments. The experimental results indicate that the proposed control strategy is highly robust to external disturbances: compensation improved control accuracy and reduced positioning error by 75%. To the best of our knowledge, this study is the first to achieve quadrotor positioning control through low-level reinforcement learning using a global positioning system in an outdoor environment.
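A minimal sketch of the disturbance-observer idea under a point-mass quadrotor model, assuming the force estimate is appended to the agent's observation; the filter constant, the `DisturbanceObserver` class, and the usage lines are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

class DisturbanceObserver:
    """Estimates the external force on each axis by comparing measured
    acceleration with the nominal model's prediction, then low-pass
    filtering the residual."""
    def __init__(self, mass: float, alpha: float = 0.05):
        self.mass = mass          # vehicle mass [kg]
        self.alpha = alpha        # low-pass filter coefficient in (0, 1]
        self.f_ext = np.zeros(3)  # estimated external force [N]

    def update(self, accel_meas: np.ndarray, thrust_world: np.ndarray) -> np.ndarray:
        g = np.array([0.0, 0.0, -9.81])
        # Newton: m*a = m*g + F_thrust + F_ext, so the residual is F_ext.
        residual = self.mass * (accel_meas - g) - thrust_world
        self.f_ext = (1 - self.alpha) * self.f_ext + self.alpha * residual
        return self.f_ext

# Hypothetical usage: feed the estimate to the policy so the trained
# agent can compensate when mapping to the four actuators.
# obs = np.concatenate([state, observer.update(accel, thrust)])
# motor_cmds = policy(obs)
```

The low-pass filter trades responsiveness for noise rejection: a larger `alpha` tracks gusts faster but passes more accelerometer noise into the estimate.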


2021 ◽  
Vol 36 ◽  
Author(s):  
Sergio Valcarcel Macua ◽  
Ian Davies ◽  
Aleksi Tukiainen ◽  
Enrique Munoz de Cote

We propose a fully distributed actor-critic architecture, named Diffusion-Distributed-Actor-Critic (Diff-DAC), with application to multitask reinforcement learning (MRL). During the learning process, agents communicate their value and policy parameters to their neighbours, diffusing the information across a network of agents with no need for a central station. Each agent can only access data from its local task but aims to learn a common policy that performs well on the whole set of tasks. The architecture is scalable, since the computational and communication cost per agent depends on the number of neighbours rather than on the overall number of agents. We derive Diff-DAC from duality theory and provide novel insights into the actor-critic framework, showing that it is actually an instance of the dual-ascent method. We prove almost-sure convergence of Diff-DAC to a common policy under general assumptions that hold even for deep neural network approximations. Under more restrictive assumptions, we also prove that this common policy is a stationary point of an approximation of the original problem. Numerical results on multitask extensions of common continuous control benchmarks demonstrate that Diff-DAC stabilises learning and has a regularising effect that induces higher performance and better generalisation than previous architectures.
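A minimal sketch of the adapt-then-combine diffusion step that such an architecture relies on, assuming NumPy parameter vectors and a row-stochastic combination matrix `W`; the function name `diffusion_round` and the abstracted local gradient step are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

def diffusion_round(params: np.ndarray, W: np.ndarray,
                    local_grads: np.ndarray, lr: float = 1e-3) -> np.ndarray:
    """params: (n_agents, dim) actor/critic parameters, one row per agent.
    W: (n_agents, n_agents) row-stochastic weights; W[i, j] > 0 only if
    agent j is a neighbour of agent i (including itself).
    local_grads: (n_agents, dim) gradients from each agent's local task."""
    adapted = params - lr * local_grads  # adapt: local actor-critic step
    return W @ adapted                   # combine: average with neighbours

# Example: 4 agents on a ring, each mixing with itself and two neighbours.
n_agents, dim = 4, 10
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    for j in (i - 1, i, i + 1):
        W[i, j % n_agents] = 1.0 / 3.0

params = np.random.randn(n_agents, dim)
grads = np.random.randn(n_agents, dim)
params = diffusion_round(params, W, grads)
```

Because each row of `W` touches only an agent's neighbourhood, the per-agent cost scales with its degree rather than with the total number of agents, which is the scalability property the abstract claims.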

