Obtaining Human Experience for Intelligent Dredger Control: A Reinforcement Learning Approach

2019, Vol 9 (9), pp. 1769
Author(s): Changyun Wei, Fusheng Ni, Xiujing Chen

This work presents a reinforcement learning approach for intelligent decision-making of a Cutter Suction Dredger (CSD), which is a special type of vessel for deepening harbors, constructing ports or navigational channels, and reclaiming landfills. Currently, CSDs are usually controlled by human operators, and the production rate is mainly determined by the so-called cutting process (i.e., cutting the underwater soil into fragments). Long-term manual operation is likely to cause driving fatigue, resulting in operational accidents and inefficiencies. To reduce the labor intensity of the operator, we seek an intelligent controller that can manipulate the cutting process in place of human operators. To this end, our proposed reinforcement learning approach consists of two parts. In the first part, we employ a neural network model to construct a virtual environment based on historical dredging data. In the second part, we develop a reinforcement learning model that can learn the optimal control policy by interacting with the virtual environment to obtain human experience. The results show that the proposed learning approach can successfully imitate the dredging behavior of an experienced human operator. Moreover, the learning approach can outperform the operator by responding more quickly to changes in uncertain environments.
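
The abstract gives the two-part structure but not the concrete models, so here is a minimal sketch of that structure under our own assumptions: a hand-coded `virtual_env_step` stands in for the paper's neural-network environment fit on historical data, and a tabular Q-learning agent (rather than the paper's unspecified learner) trains entirely inside it. The soil states, swing-speed actions, and production-rate reward are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the paper's two parts. The "virtual environment" here is
# a hand-coded transition/reward function; in the paper it is a neural
# network fit on historical dredging logs. State: coarse soil-hardness level.
# Action: discrete swing-speed set-point. Reward: a toy production rate.
N_STATES, N_ACTIONS = 5, 3

def virtual_env_step(state, action):
    next_state = int(np.clip(state + rng.integers(-1, 2), 0, N_STATES - 1))
    reward = (action + 1) / (1.0 + abs(state - action))  # peaks when speed suits soil
    return next_state, reward

# Part two: a Q-learning agent trained entirely inside the virtual
# environment (the deep model in the paper plays the same role).
q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.1, 0.95, 0.1
state = 2
for _ in range(20000):
    action = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(q[state].argmax())
    state_next, reward = virtual_env_step(state, action)
    q[state, action] += alpha * (reward + gamma * q[state_next].max() - q[state, action])
    state = state_next

print("greedy swing speed per soil state:", q.argmax(axis=1))
```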

2020, Vol 12 (1), pp. 8
Author(s): Ibrahim Ahmed, Marcos Quiñones-Grueiro, Gautam Biswas

Faults are endemic to all systems. Adaptive fault-tolerant control accepts degraded performance under faults in exchange for continued operation. In systems with abrupt faults and strict time constraints, it is imperative for control to adapt quickly to system changes. We present a meta-reinforcement learning approach that rapidly adapts the control policy. The approach builds upon model-agnostic meta-learning (MAML). The controller maintains a complement of prior policies learned under system faults. This "library" is evaluated on the system after a new fault to initialize the new policy. This contrasts with MAML, where the controller samples new policies from a distribution of similar systems at each update step to arrive at the new policy. Our approach improves the sample efficiency of the reinforcement learning process. We evaluate it on a model of fuel tanks under abrupt faults.
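
As a rough illustration of the library idea, assuming a toy quadratic return in place of a real closed-loop rollout, the sketch below scores each prior policy on the newly faulted system, warm-starts from the best one, and then adapts for a few gradient steps. Every name and dynamic here is hypothetical, not the paper's benchmark.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each library entry is a policy parameter vector learned under a past fault;
# the quadratic "return" below is a stand-in for rolling the policy out on
# the faulted plant. Higher is better; the optimum sits at the fault params.
def episode_return(policy, fault):
    return -float(np.sum((policy - fault) ** 2))

library = [rng.normal(size=4) for _ in range(5)]   # policies from past faults
new_fault = rng.normal(size=4)                     # unseen abrupt fault

# Step 1: evaluate the whole library on the new fault, warm-start from best.
scores = [episode_return(p, new_fault) for p in library]
theta = library[int(np.argmax(scores))].copy()
init_score = episode_return(theta, new_fault)

# Step 2: few-shot adaptation from the warm start (analytic gradient of the
# toy return; a real controller would use policy-gradient estimates instead).
lr = 0.1
for _ in range(20):
    theta += lr * (-2.0 * (theta - new_fault))

print("return before/after adaptation:", init_score, episode_return(theta, new_fault))
```

Because the warm start already sits near a good policy for the new fault, the adaptation loop needs far fewer samples than learning from scratch, which is the sample-efficiency argument the abstract makes.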


Electronics, 2020, Vol 9 (10), pp. 1668
Author(s): Yuxiang Sun, Bo Yuan, Tao Zhang, Bojian Tang, Wanwen Zheng, ...

The reinforcement learning problem of complex action control in a multi-player wargame has been a hot research topic in recent years. In this paper, a game system based on turn-based confrontation is designed and implemented with state-of-the-art deep reinforcement learning models. Specifically, we first design a Q-learning algorithm to achieve intelligent decision-making, based on the DQN (Deep Q-Network), to model complex game behaviors. Then, a priori-knowledge-based algorithm, PK-DQN (Prior Knowledge-Deep Q-Network), is introduced to improve the DQN algorithm, accelerating its convergence and improving its stability. The experiments validate the PK-DQN algorithm and show that its performance surpasses that of the conventional DQN algorithm. Furthermore, the PK-DQN algorithm proves effective at defeating high-level rule-based opponents, which provides promising results for the exploration of smart chess and intelligent game deduction.
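
The abstract does not say how the prior knowledge enters PK-DQN, so the sketch below shows one plausible mechanism only: seeding the replay buffer with transitions from a rule-based player before the usual DQN loop begins, so early updates are driven by sensible moves. The tabular Q array stands in for the deep network, and the one-dimensional "board" is purely illustrative.

```python
import random
from collections import deque
import numpy as np

rng = np.random.default_rng(2)

N_STATES, N_ACTIONS = 8, 4               # toy board: reach cell 0 to score

def env_step(s, a):
    s2 = (s + a + 1) % N_STATES          # action a advances a+1 cells
    return s2, (1.0 if s2 == 0 else 0.0)

def rule_based_action(s):                # prior knowledge: head straight for 0
    return min(N_ACTIONS - 1, (N_STATES - s - 1) % N_STATES)

buffer = deque(maxlen=5000)
s = 3
for _ in range(500):                     # seed replay with rule-based play
    a = rule_based_action(s)
    s2, r = env_step(s, a)
    buffer.append((s, a, r, s2))
    s = s2

q = np.zeros((N_STATES, N_ACTIONS))      # stands in for the deep Q-network
alpha, gamma, eps = 0.1, 0.9, 0.1
for _ in range(5000):                    # ordinary DQN-style loop thereafter
    a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(q[s].argmax())
    s2, r = env_step(s, a)
    buffer.append((s, a, r, s2))
    s = s2
    bs, ba, br, bs2 = random.choice(buffer)          # "minibatch" of size one
    q[bs, ba] += alpha * (br + gamma * q[bs2].max() - q[bs, ba])

print("greedy action per cell:", q.argmax(axis=1))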


Sensors, 2021, Vol 21 (3), pp. 1019
Author(s): Shengluo Yang, Zhigang Xu, Junyi Wang

Dynamic scheduling problems have been receiving increasing attention in recent years due to their practical implications. To realize real-time, intelligent decision-making in dynamic scheduling, we studied the dynamic permutation flowshop scheduling problem (PFSP) with new job arrivals using deep reinforcement learning (DRL). A system architecture for solving the dynamic PFSP using DRL is proposed, and a mathematical model to minimize total tardiness cost is established. Additionally, an intelligent scheduling system based on DRL is modeled, with its state features, actions, and reward designed. Moreover, the advantage actor-critic (A2C) algorithm is adapted to train the scheduling agent. The learning curve indicates that the scheduling agent learned to generate better solutions efficiently during training. Extensive experiments were carried out to compare the A2C-based scheduling agent with every single action, other DRL algorithms, and meta-heuristics. The results show that the A2C-based scheduling agent performs well in terms of solution quality, CPU time, and generalization. Notably, the trained agent generates a scheduling action in only 2.16 ms on average, which is almost instantaneous and suitable for real-time scheduling. Our work can help to build a self-learning, real-time-optimizing, intelligent scheduling system.
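
A minimal sketch of the design the abstract describes: actions are dispatching rules, the state is a coarse queue feature, and the reward is negative tardiness. The flowshop is collapsed to a single machine and the update is a Monte-Carlo flavor of advantage actor-critic rather than the paper's exact A2C setup; all names and numbers are our own.

```python
import numpy as np

rng = np.random.default_rng(3)

RULES = ["SPT", "LPT", "FIFO"]           # actions: candidate dispatching rules

def run_episode(logits, value, alpha_pi=0.05, alpha_v=0.1, gamma=0.99):
    # Toy single-machine stand-in for the flowshop: jobs are (proc, due_date).
    jobs = [(int(rng.integers(1, 10)), int(rng.integers(5, 40))) for _ in range(8)]
    t, traj = 0, []
    while jobs:
        state = min(len(jobs) - 1, 4)                 # coarse queue-length bucket
        probs = np.exp(logits[state]); probs /= probs.sum()
        a = int(rng.choice(len(RULES), p=probs))
        if RULES[a] == "SPT":
            jobs.sort(key=lambda j: j[0])
        elif RULES[a] == "LPT":
            jobs.sort(key=lambda j: -j[0])
        proc, due = jobs.pop(0)                       # FIFO: pop without sorting
        t += proc
        traj.append((state, a, -max(0, t - due)))     # reward: negative tardiness
    g = 0.0                                           # Monte-Carlo A2C-style update
    for state, a, r in reversed(traj):
        g = r + gamma * g
        adv = g - value[state]                        # advantage vs. learned baseline
        value[state] += alpha_v * adv
        probs = np.exp(logits[state]); probs /= probs.sum()
        grad = -probs; grad[a] += 1.0                 # grad of log-softmax policy
        logits[state] += alpha_pi * adv * grad

logits, value = np.zeros((5, len(RULES))), np.zeros(5)
for _ in range(2000):
    run_episode(logits, value)
print("preferred rule per queue bucket:", [RULES[i] for i in logits.argmax(axis=1)])
```

Acting by choosing among dispatching rules, as here, is what makes millisecond-scale decisions possible: a forward pass picks a rule, and the rule orders the queue.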


Author(s): Shuangxia Bai, Shaomei Song, Shiyang Liang, Jianmei Wang, Bo Li, ...

Aiming at intelligent decision-making for UAVs based on situational information in air combat, a novel maneuvering decision method based on deep reinforcement learning is proposed in this paper. The autonomous maneuvering model of the UAV is formulated as a Markov Decision Process. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm are used to train the model, and the experimental results of the two algorithms are analyzed and compared. The simulation results show that, compared with the DDPG algorithm, the TD3 algorithm has stronger decision-making performance and faster convergence, and is more suitable for solving air combat problems. The proposed algorithm enables UAVs to autonomously make maneuvering decisions based on situational information such as position, speed, and relative azimuth, adjusting their actions to approach and successfully strike the enemy, and provides a new method for intelligent maneuvering decisions in air combat.
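
For readers comparing the two algorithms, the sketch below isolates the three standard ingredients TD3 adds over DDPG (target policy smoothing, clipped double-Q targets, delayed actor updates), using toy callables in place of real actor/critic networks over the situation vector; the networks and numbers are placeholders, not the paper's models.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy target networks as plain callables; real ones would be neural networks
# over the situation vector (position, speed, relative azimuth, ...).
pi_target = lambda s: 0.5 * s                        # target actor
q1_target = lambda s, a: -(s - a) ** 2               # twin target critics
q2_target = lambda s, a: -1.1 * (s - a) ** 2

def td3_target(s2, r, gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0):
    # (1) Target policy smoothing: clipped noise on the target action.
    noise = float(np.clip(rng.normal(0.0, noise_std), -noise_clip, noise_clip))
    a2 = float(np.clip(pi_target(s2) + noise, -act_limit, act_limit))
    # (2) Clipped double-Q: the min of twin critics curbs value
    # overestimation, the main source of DDPG's instability.
    return r + gamma * min(q1_target(s2, a2), q2_target(s2, a2))

# (3) Delayed policy updates: critics update every step, the actor less often.
POLICY_DELAY = 2
for step in range(6):
    y = td3_target(s2=0.3, r=1.0)    # regression target for the critic update
    if step % POLICY_DELAY == 0:
        pass                         # actor and target-network updates go here

print("sample TD3 target:", round(td3_target(s2=0.3, r=1.0), 3))
```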


Author(s): Bailin Song, Hua Xu, Lei Jiang, Ning Rao

To solve the problem of intelligent anti-jamming decision-making in battlefield communication, this paper designs a communication anti-jamming decision-making method based on deep reinforcement learning. Introducing experience replay and a PHC-based dynamic epsilon mechanism under the framework of the DQN algorithm, a dynamic epsilon-DQN decision-making method is proposed. The algorithm selects the value of epsilon according to the state of the decision network, improving the convergence speed and decision success rate. During decision-making, the jamming signals on all communication frequencies are detected and fed into the algorithm as jamming-discriminant information, so that jamming can be effectively avoided even without prior jamming information. The experimental results show that the proposed method adapts to various communication models, makes decisions quickly, and achieves an average success rate of more than 95% after convergence, a clear advantage over existing decision-making methods.
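
The abstract does not give the epsilon update rule, so the following is only a guess at a PHC-flavored scheme: nudge epsilon down while recent anti-jamming decisions succeed and up while they fail. The success-rate threshold, step size, and window length are all assumptions.

```python
from collections import deque

class DynamicEpsilon:
    """PHC-flavored epsilon schedule driven by recent decision outcomes
    (an assumed mechanism, not the paper's published rule)."""

    def __init__(self, eps=1.0, eps_min=0.02, eps_max=1.0, delta=0.01, window=100):
        self.eps, self.eps_min, self.eps_max = eps, eps_min, eps_max
        self.delta = delta                    # small hill-climbing step size
        self.recent = deque(maxlen=window)    # 1 = jamming avoided, 0 = jammed

    def update(self, success):
        self.recent.append(1 if success else 0)
        rate = sum(self.recent) / len(self.recent)
        # Exploit more while decisions mostly succeed; explore more while the
        # link is being jammed often. The 0.9 threshold is an assumption.
        step = -self.delta if rate > 0.9 else self.delta
        self.eps = min(self.eps_max, max(self.eps_min, self.eps + step))
        return self.eps

sched = DynamicEpsilon()
for outcome in [1, 1, 0, 1, 1, 1, 1, 1, 1, 1]:
    eps = sched.update(outcome)
print("epsilon after warm-up:", round(eps, 3))
```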

