Leader-Following Multi-Agent Coordination Control Accompanied With Hierarchical Q(λ)-Learning for Pursuit

2021 ◽  
Vol 2 ◽  
Author(s):  
Zhe-Yang Zhu ◽  
Cheng-Lin Liu

In this paper, we investigate a pursuit problem with multiple pursuers and a single evader in a two-dimensional grid space with obstacles. In contrast to previous studies, this paper addresses a pursuit problem in which only some of the pursuers can directly access the evader’s position. To this end, it proposes a hierarchical Q(λ)-learning algorithm with an improved reward function, and simulation results indicate that the proposed method outperforms Q-learning.
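
As a rough illustration of the learning rule this abstract builds on, the sketch below shows one episode of tabular Watkins's Q(λ) with eligibility traces. The paper's hierarchical decomposition and improved reward are not reproduced here; the grid-world interface (env.reset()/env.step() returning integer states) and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of tabular Watkins's Q(lambda); env is an assumed grid-world
# interface with integer states: env.reset() -> s, env.step(a) -> (s', r, done).
import numpy as np

def epsilon_greedy(Q, s, epsilon):
    """Pick a random action with probability epsilon, else the greedy one."""
    if np.random.rand() < epsilon:
        return np.random.randint(Q.shape[1])
    return int(np.argmax(Q[s]))

def q_lambda_episode(env, Q, alpha=0.1, gamma=0.95, lam=0.8, epsilon=0.1):
    """One episode of Watkins's Q(lambda) with accumulating eligibility traces."""
    E = np.zeros_like(Q)                       # eligibility traces
    s = env.reset()
    a = epsilon_greedy(Q, s, epsilon)
    done = False
    while not done:
        s_next, r, done = env.step(a)
        a_next = epsilon_greedy(Q, s_next, epsilon)
        a_star = int(np.argmax(Q[s_next]))     # greedy action in the next state
        delta = r + gamma * Q[s_next, a_star] * (not done) - Q[s, a]
        E[s, a] += 1.0                         # mark the visited state-action pair
        Q += alpha * delta * E                 # propagate the TD error along traces
        if a_next == a_star:
            E *= gamma * lam                   # decay traces after a greedy step
        else:
            E[:] = 0.0                         # cut traces after an exploratory step
        s, a = s_next, a_next
    return Q
```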

2017 ◽  
Vol 40 (5) ◽  
pp. 1529-1537 ◽  
Author(s):  
Muhammad Iqbal ◽  
John Leth ◽  
Trung D Ngo

In this paper, we solve the leader-following consensus problem using a hierarchical nearly cyclic pursuit (HNCP) strategy for multi-agent systems. We extend the nearly cyclic pursuit strategy and the two-layer HNCP to the generalized L-layer HNCP that enables the agents to rendezvous at a point dictated by a beacon. We prove that the convergence rate of the generalized L-layer HNCP for the leader-following consensus problem is faster than that of the nearly cyclic pursuit. Simulation results demonstrate the effectiveness of the proposed method.
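
For intuition, the following is a minimal single-layer sketch of nearly cyclic pursuit: each agent chases its cyclic successor, and one designated agent is additionally attracted to a stationary beacon, which pulls the group's rendezvous point toward the beacon. The L-layer hierarchical extension analysed in the paper is not reproduced; the gains and the Euler step are illustrative assumptions.

```python
# Minimal sketch of single-layer nearly cyclic pursuit toward a beacon.
import numpy as np

def nearly_cyclic_pursuit(x0, beacon, k=1.0, kb=1.0, dt=0.01, steps=5000):
    """x0: (n, 2) initial positions; beacon: (2,) rendezvous target."""
    x = np.asarray(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(steps):
        dx = k * (np.roll(x, -1, axis=0) - x)   # each agent chases its cyclic successor
        dx[0] += kb * (beacon - x[0])           # agent 0 is also attracted to the beacon
        x = x + dt * dx
        traj.append(x.copy())
    return np.array(traj)

# Example: four agents starting at random positions converge toward the origin beacon.
trajectory = nearly_cyclic_pursuit(np.random.randn(4, 2), np.zeros(2))
print(trajectory[-1])   # all rows approach [0, 0]
```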


Author(s):  
Mohamed A. Aref ◽  
Sudharman K. Jayaweera

This article presents a design of a wideband autonomous cognitive radio (WACR) for anti-jamming and interference avoidance. The proposed system model allows multiple WACRs to operate simultaneously over the same spectrum range, producing a multi-agent environment. The objective of each radio is to predict and evade a dynamic jammer signal as well as to avoid the transmissions of other WACRs. The proposed cognitive framework consists of two operations: sensing and transmission. Each operation is driven by its own Q-learning-based algorithm, but both experience the same RF environment. The simulation results indicate that the proposed cognitive anti-jamming technique has low computational complexity and significantly outperforms a non-cognitive sub-band selection policy, while being sufficiently robust against the impact of sensing errors.
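
As a hedged sketch of the transmission side only, the snippet below casts sub-band selection as tabular Q-learning in which the state is the sub-band the jammer occupied in the previous slot and the reward penalises colliding with it. The article's sensing operation, WACR-specific state design and reward shaping are not reproduced, and all names and parameters are illustrative assumptions.

```python
# Minimal sketch of sub-band selection against a jammer as tabular Q-learning.
import numpy as np

def train_antijam_policy(jammer_band, n_bands=8, slots=20000,
                         alpha=0.1, gamma=0.9, epsilon=0.1):
    """jammer_band(t) -> index of the sub-band jammed at slot t (assumed oracle)."""
    Q = np.zeros((n_bands, n_bands))          # Q[state, action]
    s = jammer_band(0)                        # state: last observed jammer band
    for t in range(1, slots):
        a = (np.random.randint(n_bands) if np.random.rand() < epsilon
             else int(np.argmax(Q[s])))
        jam = jammer_band(t)
        r = 1.0 if a != jam else -1.0         # colliding with the jammer is penalised
        s_next = jam
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next
    return Q

# Example: learn to dodge a jammer that sweeps the sub-bands one by one.
Q = train_antijam_policy(lambda t: t % 8)
```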


2020 ◽  
Vol 73 (4) ◽  
pp. 874-891
Author(s):  
Wenjie Zhao ◽  
Zhou Fang ◽  
Zuqiang Yang

A distributed four-dimensional (4D) trajectory generation method based on multi-agent Q-learning is presented for multiple unmanned aerial vehicles (UAVs). Based on this method, each vehicle can intelligently generate collision-free 4D trajectories for time-constrained cooperative flight tasks. For a single UAV, the 4D trajectory is generated by the bionic improved tau gravity guidance strategy, which can synchronously guide the position and velocity to the desired values at the arrival time. Furthermore, to optimise the trajectory parameters, the continuous state and action wire-fitting neural network Q (WFNNQ) learning method is applied. For multi-UAV applications, the learning is organised by the win-or-learn-fast policy hill climbing (WoLF-PHC) algorithm. Dynamic simulation results show that the proposed method can efficiently provide 4D trajectories for the multi-UAV system in challenging simultaneous-arrival tasks, and that the fully trained method can be used in similar trajectory generation scenarios.
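
Of the components named above, only the WoLF-PHC update lends itself to a compact sketch: it switches between a small and a large policy step size depending on whether the agent is currently "winning". The continuous-state WFNNQ approximation and the tau gravity guidance parameterisation are not reproduced, and the tabular state/action sets and all step sizes below are illustrative assumptions.

```python
# Minimal tabular sketch of the WoLF-PHC (win-or-learn-fast policy hill climbing) update.
import numpy as np

class WoLFPHC:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04):
        self.Q = np.zeros((n_states, n_actions))
        self.pi = np.full((n_states, n_actions), 1.0 / n_actions)      # current policy
        self.pi_avg = np.full((n_states, n_actions), 1.0 / n_actions)  # average policy
        self.counts = np.zeros(n_states)
        self.alpha, self.gamma = alpha, gamma
        self.d_win, self.d_lose = delta_win, delta_lose

    def act(self, s):
        return int(np.random.choice(len(self.pi[s]), p=self.pi[s]))

    def update(self, s, a, r, s_next):
        # 1. ordinary Q-learning update
        self.Q[s, a] += self.alpha * (r + self.gamma * np.max(self.Q[s_next])
                                      - self.Q[s, a])
        # 2. update the running average policy
        self.counts[s] += 1
        self.pi_avg[s] += (self.pi[s] - self.pi_avg[s]) / self.counts[s]
        # 3. "win or learn fast": small step when winning, large step when losing
        winning = self.pi[s] @ self.Q[s] > self.pi_avg[s] @ self.Q[s]
        delta = self.d_win if winning else self.d_lose
        # 4. hill-climb toward the greedy action, then project back to the simplex
        best = int(np.argmax(self.Q[s]))
        self.pi[s] -= delta / (len(self.pi[s]) - 1)
        self.pi[s, best] += delta + delta / (len(self.pi[s]) - 1)
        self.pi[s] = np.clip(self.pi[s], 0.0, None)
        self.pi[s] /= self.pi[s].sum()
```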


Transport ◽  
2014 ◽  
Vol 29 (3) ◽  
pp. 296-306 ◽  
Author(s):  
Min Yang ◽  
Dounan Tang ◽  
Haoyang Ding ◽  
Wei Wang ◽  
Tianming Luo ◽  
...  

Staggered working hours have the potential to alleviate excessive demands on urban transport networks during the morning and afternoon peak hours and to influence the travel behavior of individuals by affecting their activity schedules and reducing their commuting times. This study proposes a multi-agent-based Q-learning algorithm for evaluating the influence of staggered work hours by simulating travelers’ time and location choices in their activity patterns. Interactions among multiple travelers were also considered. Various types of agents were identified based on real activity–travel data for a mid-sized city in China. Reward functions based on time and location information were constructed using Origin–Destination (OD) survey data to simulate individuals’ temporal and spatial choices simultaneously. Interactions among individuals were then described by introducing a road impedance function to formulate a dynamic environment in which one traveler’s decisions influence the decisions of other travelers. Lastly, by applying the Q-learning algorithm, individuals’ activity–travel patterns under staggered working hours were simulated. Based on the simulation results, the effects of staggered working hours were evaluated on both a macroscopic level, at which the space–time distribution of the traffic volume in the network was determined, and a microscopic level, at which the timing of individuals’ leisure activities and their daily household commuting costs were determined. Based on the simulation results and experimental tests, an optimal scheme for staggering working hours was developed.
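
The abstract does not specify the form of the road impedance or reward functions; the sketch below merely illustrates how a congestion-dependent reward can couple the travelers' departure-time choices, using a BPR-style impedance curve and entirely hypothetical parameters.

```python
# Minimal sketch: a congestion-dependent reward couples the agents, because the
# travel time on a link grows with how many travelers chose the same departure slot.
def link_travel_time(flow, free_flow_time=10.0, capacity=200.0, a=0.15, b=4.0):
    """BPR-style impedance: travel time increases with the flow/capacity ratio."""
    return free_flow_time * (1.0 + a * (flow / capacity) ** b)

def departure_reward(chosen_slot, slot_flows, preferred_slot, schedule_penalty=2.0):
    """Reward = -(congested travel time) - penalty for deviating from the
    preferred departure slot; slot_flows[t] counts agents departing at slot t."""
    delay = link_travel_time(slot_flows[chosen_slot])
    return -(delay + schedule_penalty * abs(chosen_slot - preferred_slot))

# Example: heavy flow in slot 2 makes departing then less attractive than shifting.
flows = {0: 50, 1: 120, 2: 300, 3: 80}
print(departure_reward(2, flows, preferred_slot=2))
print(departure_reward(3, flows, preferred_slot=2))
```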


2021 ◽  
Vol 40 (1) ◽  
pp. 205-219
Author(s):  
Yanbin Zheng ◽  
Wenxin Fan ◽  
Mengyun Han

The multi-agent collaborative hunting problem is a typical problem in multi-agent coordination and collaboration research. To address the multi-agent hunting problem in which the evader has learning ability, a collaborative hunting method based on game theory and Q-learning is proposed. First, a cooperative hunting team is formed and a game model of cooperative hunting is built. Second, by learning the evader’s strategy choices, a trajectory of the evader’s finite T-step cumulative reward is established and incorporated into the hunters’ strategy set. Finally, a Nash equilibrium is obtained by solving the cooperative hunting game, and each hunter executes its equilibrium strategy to complete the hunting task. A C# simulation experiment shows that, under the same conditions, this method can effectively solve the hunting problem for a single learning evader in an environment with obstacles, and comparative analysis of the experimental data shows that it is more efficient than other methods.
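
The paper's game model is not detailed in the abstract; as a simplified stand-in for the equilibrium-solving step, the sketch below computes a mixed Nash equilibrium of a two-player zero-sum matrix game by linear programming, with a hypothetical hunter-versus-evader payoff matrix.

```python
# Minimal sketch: maximin mixed strategy of a zero-sum matrix game via linear programming.
import numpy as np
from scipy.optimize import linprog

def zero_sum_equilibrium(A):
    """Row player's maximin mixed strategy for payoff matrix A (row player maximises)."""
    m, n = A.shape
    # variables: x_1..x_m (strategy) and v (game value); objective: maximise v
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # for every column j of the opponent:  v - sum_i A[i, j] * x_i <= 0
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # strategy probabilities sum to one
    A_eq = np.zeros((1, m + 1))
    A_eq[0, :m] = 1.0
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Example: hunter (block left / block right) vs evader (go left / go right).
strategy, value = zero_sum_equilibrium(np.array([[1.0, -1.0], [-1.0, 1.0]]))
print(strategy, value)   # roughly [0.5, 0.5] with game value 0.0
```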

