A Control Algorithm for Sea–Air Cooperative Observation Tasks Based on a Data-Driven Algorithm

2021, Vol 9 (11), pp. 1189
Author(s): Kai Hu, Xu Chen, Qingfeng Xia, Junlan Jin, Liguo Weng

There is tremendous demand for marine environmental observation, which calls for a multi-agent cooperative observation algorithm to guide Unmanned Surface Vehicles (USVs) and Unmanned Aerial Vehicles (UAVs) in observing isotherm data of a mesoscale vortex. The task comprises two steps: first, the USVs search out the isotherm, navigate independently along it, and collect marine data; second, a UAV takes off and, in a single round trip, cooperates with the USVs so that it reads the observation data collected by each USV. In this paper, aiming at the first problem of USVs following an isotherm in an unknown environment, a data-driven Deep Deterministic Policy Gradient (DDPG) control algorithm is designed that allows USVs to navigate independently along isotherms in unknown environments. In addition, a hybrid cooperative control algorithm based on a multi-agent DDPG is adopted to solve the second problem, enabling the USVs and the UAV to complete the data-reading task with the shortest UAV flight distance. The experimental simulation results show that the trained system can complete this task with good stability and accuracy.
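As a hedged illustration of the kind of actor-critic pair a DDPG isotherm-following controller might use (this is not the authors' implementation; the state/action dimensions, network widths, and actuation limit below are assumptions):

# Minimal sketch of a DDPG actor-critic pair for isotherm following.
# State/action sizes and the action limit are illustrative assumptions.
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, state_dim=4, action_dim=1, max_action=1.0):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh())
        self.max_action = max_action  # e.g. heading-rate limit of the USV

    def forward(self, state):
        return self.max_action * self.net(state)

class Critic(nn.Module):
    def __init__(self, state_dim=4, action_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def soft_update(target, source, tau=0.005):
    # Polyak averaging of target-network weights, as in standard DDPG.
    for t, s in zip(target.parameters(), source.parameters()):
        t.data.copy_(tau * s.data + (1 - tau) * t.data)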

2021, Vol 11 (17), pp. 8146
Author(s): Kai Hu, Lang Tian, Chenghang Weng, Liguo Weng, Qiang Zang, ...

In environments where manual work cannot be carried out, snake manipulators are used instead to raise the level of automation and ensure personal safety. However, the structure of the snake manipulator is diverse, which makes it difficult to establish an environmental model of the control system, and an ideal control effect is hard to obtain with traditional manipulator control methods. In view of this, this paper proposes a data-driven snake manipulator control algorithm. After collecting data, the algorithm exploits the strong learning and decision-making ability of the Deep Deterministic Policy Gradient (DDPG) to learn from these system data. A data-driven controller based on DDPG was trained to solve the manipulator control problem when the environment model of the control system is uncertain or even unknown. Simulation experiment data show that the control algorithm retains good stability and accuracy under model uncertainty.
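For readers unfamiliar with how such a data-driven controller is trained from logged system data, a minimal sketch of a single DDPG update step follows; the network objects, replay-batch layout, and discount factor are assumed for illustration and are not taken from the paper:

# One DDPG update from a batch of logged (s, a, r, s', done) transitions.
# actor/critic and their target copies (actor_t/critic_t) are assumed to exist.
import torch
import torch.nn.functional as F

def ddpg_update(batch, actor, critic, actor_t, critic_t,
                actor_opt, critic_opt, gamma=0.99):
    state, action, reward, next_state, done = batch  # tensors from a replay buffer

    # Critic: regress Q(s, a) toward the one-step TD target.
    with torch.no_grad():
        target_q = reward + gamma * (1 - done) * critic_t(next_state, actor_t(next_state))
    critic_loss = F.mse_loss(critic(state, action), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: ascend the critic's estimate of the actor's own action.
    actor_loss = -critic(state, actor(state)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    return critic_loss.item(), actor_loss.item()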


Energies, 2019, Vol 12 (7), pp. 1402
Author(s): Haibo Zhang, Xiaoming Liu, Honghai Ji, Zhongsheng Hou, Lingling Fan

Data-driven intelligent transportation systems (D2ITSs) have drawn significant attention lately. This work investigates a novel multi-agent-based data-driven distributed adaptive cooperative control (MA-DD-DACC) method for balancing multi-direction queuing strength with a changeable cycle in urban traffic signal timing. Compared with conventional signal control strategies, the proposed MA-DD-DACC method, combined with an online parameter learning law, can be applied to traffic signal control in a distributed manner by using only the collected I/O traffic queue-length data and the network topology of the multi-direction signal controllers at a single intersection. A Lyapunov-based stability analysis shows that the proposed approach guarantees uniform ultimate boundedness of the distributed consensus coordination errors of the queuing strength. Numerical and experimental comparison simulations are performed on a VISSIM-VB-MATLAB joint simulation platform to verify the effectiveness of the proposed approach.
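A simplified sketch of the general idea of distributed, data-driven queue balancing from I/O data only is given below; the consensus error, the online sensitivity estimate, the gains, and the phase-time bounds are illustrative assumptions and do not reproduce the MA-DD-DACC law or its learning law from the paper:

# Toy distributed balance step for queueing strength at one intersection.
# All gains, bounds, and the sensitivity model are illustrative assumptions.
import numpy as np

def balance_step(green, queue, phi, A, rho=0.3):
    # green: green-time allocation per direction (n,)
    # queue: measured queue lengths (n,)
    # phi:   online estimate of d(queue)/d(green) per direction (n,)
    # A:     adjacency matrix of the signal-controller communication graph (n, n)
    deg = A.sum(axis=1)
    # Consensus error: how far each direction's queue is from its neighbours'.
    consensus_err = deg * queue - A @ queue
    # Data-driven correction: shift green time where the estimated sensitivity
    # says it will reduce the local imbalance.
    green = green - rho * phi * consensus_err / (1e-6 + phi**2)
    return np.clip(green, 5.0, 60.0)  # keep phase times within plausible bounds

def update_phi(phi, d_queue, d_green, eta=0.5):
    # Projection-style online update of the sensitivity estimate from I/O data.
    return phi + eta * d_green * (d_queue - phi * d_green) / (1e-6 + d_green**2)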


2022, pp. 1-20
Author(s): D. Xu, G. Chen

Abstract In this paper, we explore Multi-Agent Reinforcement Learning (MARL) methods for unmanned aerial vehicle (UAV) clusters. Since current UAV clusters are still at the program-control stage, fully autonomous and intelligent cooperative combat has not yet been realised. To enable a UAV cluster to plan autonomously in a changing environment and cooperate to achieve the combat goal, we propose a new MARL framework. It adopts a policy of centralised training with decentralised execution and uses an Actor-Critic network to select the execution action and then make the corresponding evaluation. The new algorithm makes three key improvements on the basis of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. The first is an improved learning framework, which makes the calculated Q value more accurate. The second is an added collision-avoidance setting, which increases the operational safety factor. The third is an adjusted reward mechanism, which effectively improves the cluster's cooperative ability. The improved MADDPG algorithm is then tested on two conventional combat missions. The simulation results show that the learning efficiency is clearly improved and the operational safety factor is further increased compared with the previous algorithm.
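To illustrate the centralised-training/decentralised-execution split and the collision-avoidance shaping that the abstract refers to, a minimal MADDPG-style sketch follows; the number of agents, observation/action dimensions, and penalty weight are assumptions rather than the paper's settings:

# Centralised critic (training only) and local actor (execution), MADDPG style.
# Dimensions and the collision penalty below are illustrative assumptions.
import torch
import torch.nn as nn

class CentralCritic(nn.Module):
    # Sees every agent's observation and action, but only during training.
    def __init__(self, n_agents=3, obs_dim=6, act_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_agents * (obs_dim + act_dim), 128), nn.ReLU(),
            nn.Linear(128, 1))

    def forward(self, all_obs, all_act):
        return self.net(torch.cat([all_obs, all_act], dim=-1))

class LocalActor(nn.Module):
    # Executes using only the agent's own observation (decentralised execution).
    def __init__(self, obs_dim=6, act_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, act_dim), nn.Tanh())

    def forward(self, obs):
        return self.net(obs)

def collision_penalty(positions, safe_dist=1.0, weight=10.0):
    # Reward-shaping term that discourages UAVs from closing within safe_dist.
    dists = torch.cdist(positions, positions)
    mask = ~torch.eye(len(positions), dtype=torch.bool)
    too_close = (dists[mask] < safe_dist).float()
    return -weight * too_close.sum() / 2  # each pair appears twice in the mask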

