Data-Based Approximate Policy Iteration for Optimal Course-Keeping Control of Marine Surface Vessels

Author(s):  
Yuming Bai ◽  
Yifan Liu ◽  
Qihe Shan ◽  
Tieshan Li ◽  
Yuzhen Lu
Author(s):  
Daxue Liu ◽  
Jun Wu ◽  
Xin Xu

Multi-agent reinforcement learning (MARL) provides a useful and flexible framework for multi-agent coordination in uncertain dynamic environments. However, the generalization ability and scalability of algorithms to large problem sizes, already problematic in single-agent RL, are an even more formidable obstacle in MARL applications. In this paper, a new MARL method based on ordinal action selection and approximate policy iteration, called OAPI (Ordinal Approximate Policy Iteration), is presented to address the scalability issue of MARL algorithms in common-interest Markov Games. In OAPI, an ordinal action selection and learning strategy is integrated with distributed approximate policy iteration, not only to simplify the policy space and eliminate conflicts in multi-agent coordination, but also to approximate near-optimal policies for Markov Games with large state spaces. Based on the policy space simplified by ordinal action selection, the OAPI algorithm implements distributed approximate policy iteration using online least-squares policy iteration (LSPI). This results in multi-agent coordination with good convergence properties and reduced computational complexity. The simulation results of a coordinated multi-robot navigation task illustrate the feasibility and effectiveness of the proposed approach.
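
The abstract names least-squares policy iteration (LSPI) as the building block for the policy-iteration step. As a rough illustration of that building block only, the sketch below runs plain single-agent, batch LSPI on a hypothetical three-state chain with one-hot features. It is not the OAPI algorithm: the ordinal action selection, the distributed multi-agent setting, and the online variant are all omitted, and every name in it (`step`, `phi`, `lstdq`, `lspi`, the chain itself, the discount factor) is an assumption made for the example.

```python
# Minimal single-agent LSPI sketch on a toy chain (illustrative assumptions only).
GAMMA = 0.9                    # assumed discount factor
N_STATES, N_ACTIONS = 3, 2     # actions: 0 = left, 1 = right


def step(s, a):
    """Deterministic chain: reward 1 whenever the next state is the last one."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)


def phi(s, a):
    """One-hot feature vector over (state, action) pairs."""
    f = [0.0] * (N_STATES * N_ACTIONS)
    f[s * N_ACTIONS + a] = 1.0
    return f


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def lstdq(samples, policy):
    """Policy evaluation: solve A w = b with
    A = sum phi(s,a) (phi(s,a) - gamma * phi(s', pi(s')))^T, b = sum phi(s,a) r."""
    k = N_STATES * N_ACTIONS
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, a, r, s2 in samples:
        f, f2 = phi(s, a), phi(s2, policy[s2])
        for i in range(k):
            b[i] += f[i] * r
            for j in range(k):
                A[i][j] += f[i] * (f[j] - GAMMA * f2[j])
    return solve(A, b)


def greedy(w):
    """Policy improvement: pick the action with the largest estimated Q-value."""
    return [max(range(N_ACTIONS), key=lambda a: w[s * N_ACTIONS + a])
            for s in range(N_STATES)]


def lspi(samples, max_iter=20):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * N_STATES          # start from "always left"
    w = lstdq(samples, policy)
    for _ in range(max_iter):
        new_policy = greedy(w)
        if new_policy == policy:
            break
        policy = new_policy
        w = lstdq(samples, policy)
    return policy, w


# One sample per (state, action) pair, so the one-hot model is fully covered.
samples = []
for s in range(N_STATES):
    for a in range(N_ACTIONS):
        s2, r = step(s, a)
        samples.append((s, a, r, s2))

policy, w = lspi(samples)
print(policy, [round(q, 3) for q in w])
```

On this toy chain the loop settles on moving right in every state. With one-hot features the evaluation step is exact; with a compact feature set, as would be needed for large state spaces, the same loop only approximates the value function.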


2020 ◽  
Vol 10 (5) ◽  
pp. 1686
Author(s):  
Yung-Yue Chen ◽  
Chun-Yen Lee ◽  
Shao-Han Tseng ◽  
Wei-Min Hu

For energy conservation, nonlinear optimal control law design for marine surface vessels has become a crucial ocean technology for the current ship industry. A well-controlled marine surface vessel with optimal properties must possess accurate tracking capability to accomplish sailing missions. To achieve this design target, a closed-form nonlinear optimal control law for the trajectory- and waypoint-tracking problem of autonomous marine surface vessels (AUSVs) is presented in this investigation. The proposed approach, based on the optimal control concept, can be effectively applied to generate control commands for marine surface vessels operating in sailing scenarios where ocean environmental disturbances are random and unpredictable. In general, it is difficult to directly obtain a closed-form solution to this optimal tracking problem. Fortunately, with an adequate choice of state-variable transformation, the nonlinear optimal tracking problem of autonomous marine surface vessels can be converted into a solvable nonlinear time-varying differential equation. The resulting closed-form solution also has an easy-to-implement control structure suited to energy-saving purposes.


2008 ◽  
Vol 72 (3) ◽  
pp. 157-171 ◽  
Author(s):  
Christos Dimitrakakis ◽  
Michail G. Lagoudakis
