Batch Reinforcement Learning of Feasible Trajectories in a Ship Maneuvering Simulator

2018 ◽  
Author(s):  
José Amendola ◽  
Eduardo A. Tannuri ◽  
Fabio G. Cozman ◽  
Anna H. Reali

Ship control in port channels is a challenging problem that has resisted automated solutions. In this paper we focus on reinforcement learning of the control signals that steer ships through their maneuvers. The learning process uses fitted Q iteration together with a Ship Maneuvering Simulator. Domain knowledge is used to develop a compact state-space model; we show how this model and the learning process lead to ship maneuvering under difficult conditions.
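As a concrete illustration of the batch-learning step named above, the following is a minimal sketch of fitted Q iteration over a fixed set of simulator transitions. The regressor choice, the transitions layout, and the discount factor are assumptions for illustration; the paper's compact state-space model and simulator interface are not reproduced.

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def fitted_q_iteration(transitions, actions, gamma=0.99, n_iterations=50):
    # transitions: list of (state, action, reward, next_state) tuples, with
    # states as 1-D feature vectors and actions drawn from a finite numeric set
    S = np.array([t[0] for t in transitions])
    A = np.array([[t[1]] for t in transitions])
    R = np.array([t[2] for t in transitions])
    S_next = np.array([t[3] for t in transitions])
    X = np.hstack([S, A])                       # regressor input: (state, action)
    model = ExtraTreesRegressor(n_estimators=50)
    model.fit(X, R)                             # Q_1(s, a) ~ immediate reward
    for _ in range(n_iterations - 1):
        # Bellman target: r + gamma * max over a' of Q_k(s', a')
        q_next = np.column_stack([
            model.predict(np.hstack([S_next, np.full((len(S_next), 1), a)]))
            for a in actions])
        model.fit(X, R + gamma * q_next.max(axis=1))
    return model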

Author(s):  
Rameesha Thayale Veedu ◽  
Parameswaran Krishnankutty

Ship maneuvering performance is usually predicted in calm-water conditions, which provide valuable information about a ship's turning ability and directional stability in the early design stages. Simulating maneuvers in waves is more realistic, since a ship usually sails through waves, so it is important to study the effect of waves on its turning ability. This paper presents maneuvering simulations for a container ship in the presence of regular waves, based on a unified state-space model for ship maneuvering. Standard maneuvers such as the turning circle and the zigzag are simulated in the head-sea condition and compared with the corresponding calm-water maneuvers. The present study shows that waves significantly affect the maneuvering characteristics of the ship and hence cannot be neglected.
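The paper's unified state-space model is not reproduced here, but the shape of a zigzag simulation can be sketched with a first-order Nomoto yaw model as a calm-water stand-in; the gain, time constant, and rudder angle below are assumed values, not the container ship's identified parameters.

import math

K, T = 0.2, 10.0                 # Nomoto gain [1/s] and time constant [s] (assumed)
delta_max = math.radians(10.0)   # 10-degree rudder for a 10/10 zigzag
psi_switch = math.radians(10.0)  # heading check angle that triggers rudder reversal
dt, t_end = 0.1, 600.0

psi, r, delta = 0.0, 0.0, delta_max
t, log = 0.0, []
while t < t_end:
    r += dt * (K * delta - r) / T    # first-order yaw dynamics
    psi += dt * r                    # heading
    # reverse the rudder when the heading crosses the check angle
    if delta > 0 and psi >= psi_switch:
        delta = -delta_max
    elif delta < 0 and psi <= -psi_switch:
        delta = delta_max
    log.append((t, math.degrees(psi), math.degrees(delta)))
    t += dt
# `log` holds the heading/rudder history from which overshoot angles are read off.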


2019 ◽  
Vol 63 (7) ◽  
pp. 995-1003
Author(s):  
Z Xu ◽  
L Cao ◽  
X Chen

Abstract Simple and efficient exploration remains a core challenge in deep reinforcement learning. While many exploration methods can be applied to high-dimensional tasks, they require exploration parameters to be adjusted manually according to domain knowledge. This paper proposes a novel method that automatically balances exploration and exploitation, and that combines on-policy and off-policy update targets through dynamic weighting based on the value difference. The proposed method does not directly alter the probability of a selected action; instead, it uses the value difference produced during learning to adjust the update target and thereby guide the direction of the agent's learning. We demonstrate the method on the CartPole-v1, MountainCar-v0, and LunarLander-v2 classic control tasks from the OpenAI Gym. Empirical results show that by integrating on-policy and off-policy update targets dynamically, the method achieves better performance and stability than the exclusive use of either update target.
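A tabular sketch of the core idea follows: the on-policy (SARSA) and off-policy (Q-learning) bootstrap values are blended with a weight driven by their disagreement. The specific weighting rule below is an illustrative stand-in, not the paper's exact formula.

import numpy as np

def blended_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, eps=1e-8):
    v_max = Q[s_next].max()      # off-policy (Q-learning) bootstrap value
    v_pi = Q[s_next, a_next]     # on-policy (SARSA) bootstrap value
    # assumed rule: the greedy target gets more weight the more the two
    # value estimates disagree; when they agree, the target is pure SARSA
    beta = abs(v_max - v_pi) / (abs(v_max) + abs(v_pi) + eps)
    target = r + gamma * (beta * v_max + (1.0 - beta) * v_pi)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q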


2005 ◽  
Vol 15 (09) ◽  
pp. 2717-2746 ◽  
Author(s):  
THOR I. FOSSEN

This article presents a unified state-space model for ship maneuvering, station-keeping, and control in a seaway. The frequency-dependent potential and viscous damping terms, which in classic theory result in a convolution integral not suited for real-time simulation, are compactly represented using a state-space formulation. The separation of the vessel model into a low-frequency model (represented by zero-frequency added mass and damping) and a wave-frequency model (represented by motion transfer functions or RAOs), which is commonly used for simulation, is hence made superfluous.
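The computational point of the abstract, replacing the fluid-memory convolution integral with a small linear filter that can be stepped in real time, can be sketched as follows; the matrices are placeholders for illustration, not identified from hydrodynamic data.

import numpy as np

# assumed 2nd-order state-space approximation of the retardation kernel for
# one degree of freedom: xdot = Ar x + Br nu, mu = Cr x, where nu is the
# velocity input and mu approximates the convolution term
Ar = np.array([[0.0, 1.0], [-4.0, -1.2]])
Br = np.array([[0.0], [1.0]])
Cr = np.array([[2.5, 0.0]])

def fluid_memory_step(x, nu, dt):
    # one forward-Euler step of the memory filter; returns the updated filter
    # state and the current approximation of the convolution (memory) force
    x = x + dt * (Ar @ x + Br * nu)
    mu = float(Cr @ x)
    return x, mu

# usage: x = np.zeros((2, 1)), then call fluid_memory_step(x, nu, dt) each
# simulation step instead of evaluating the convolution integral over history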


Robotics ◽  
2013 ◽  
pp. 248-273
Author(s):  
Eduardo F. Morales ◽  
Julio H. Zaragoza

This chapter introduces an approach to reinforcement learning based on a relational representation that: (i) can be applied over large search spaces, (ii) can incorporate domain knowledge, and (iii) can reuse policies learned on different but similar problems. The underlying idea is to represent states as sets of first-order relations, to define actions in terms of those relations, and to learn policies over this generalized representation. It is shown how this representation can produce powerful abstractions, and that policies learned over it can be applied directly, without any further learning, to other problems characterized by the same set of relations. To accelerate the learning process, we present an extension in which traces of the tasks to be learned are provided by the user. These traces are used to select only a small subset of the possible actions, speeding up the convergence of the learning algorithms. The effectiveness of the approach is tested on a flight simulator and on a mobile robot.
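The mechanics can be sketched in tabular form: concrete observations are mapped to sets of first-order relations, the Q-function is indexed by (relational state, action), and user traces prune the action set. The predicates and action names below are hypothetical flight-domain stand-ins, not the chapter's vocabulary.

import random
from collections import defaultdict

def relational_state(obs):
    # map a concrete observation to a frozenset of relations (assumed predicates)
    rels = set()
    rels.add(("altitude", "low" if obs["alt"] < 100 else "ok"))
    rels.add(("heading_to_goal", "aligned" if abs(obs["hdg_err"]) < 5 else "off"))
    return frozenset(rels)

Q = defaultdict(float)
ACTIONS = ["turn_toward_goal", "climb", "hold"]

def choose_action(rstate, trace_actions=None, epsilon=0.1):
    # restrict the choice to actions seen in user-provided traces when
    # available, mirroring the chapter's use of traces to prune exploration
    candidates = trace_actions or ACTIONS
    if random.random() < epsilon:
        return random.choice(candidates)
    return max(candidates, key=lambda a: Q[(rstate, a)])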


1998 ◽  
Vol 37 (12) ◽  
pp. 149-156 ◽  
Author(s):  
Carl-Fredrik Lindberg

This paper contains two contributions. First, it is shown in a simulation study using the IAWQ model that a linear, time-invariant, multivariable state-space model can be used to predict the ammonium and nitrate concentrations in the last aerated zone of a pre-denitrifying activated sludge process. Second, using the estimated linear model, a multivariable linear quadratic (LQ) controller is designed and used to control the ammonium and nitrate concentrations.
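A minimal sketch of the second contribution's control design, computing a discrete-time LQ state-feedback gain u = -L x for an identified linear model x[k+1] = A x[k] + B u[k]; the matrices and weights below are illustrative placeholders, not the identified IAWQ-based model.

import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[0.9, 0.1], [0.0, 0.8]])   # assumed 2-state identified model
B = np.array([[0.0], [0.5]])
Qw = np.diag([10.0, 1.0])                # state (concentration error) weighting
Rw = np.array([[0.1]])                   # control effort weighting

P = solve_discrete_are(A, B, Qw, Rw)     # discrete algebraic Riccati solution
L = np.linalg.solve(Rw + B.T @ P @ B, B.T @ P @ A)   # optimal feedback gain
# closed-loop dynamics: x[k+1] = (A - B @ L) @ x[k]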

