Multi-UAV Cooperative Task Assignment Based on Half Random Q-Learning

Symmetry, 2021, Vol 13 (12), pp. 2417
Author(s): Pengxing Zhu, Xi Fang

Unmanned aerial vehicle (UAV) clusters usually face complex environments, heterogeneous combat subjects, and realistic interference factors during mission assignment. To reduce resource consumption and improve the task execution rate, a reasonable task allocation plan is essential. This paper therefore constructs a heterogeneous UAV multitask assignment model based on several realistic constraints and proposes an improved half-random Q-learning (HR Q-learning) algorithm. The algorithm builds on Q-learning from reinforcement learning. By changing how Q-learning selects the next action during random exploration, it reduces the probability of obtaining an invalid action in the random case and improves exploration efficiency, which increases the chance of obtaining a better assignment scheme. This also ensures symmetry and synergy in the distribution process of the drones. Simulation experiments show that, compared with the Q-learning algorithm and other heuristic algorithms, the HR Q-learning algorithm improves task execution performance: it makes task assignment more rational, increases the value of gains by 12.12% (equivalent to saving an average of one drone per mission), and achieves a higher task execution success rate. This improvement is a meaningful step for UAV task assignment.
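The abstract does not spell out how HR Q-learning changes action selection, so the following is only a minimal sketch of one plausible reading: standard epsilon-greedy Q-learning in which the random exploration branch samples from the currently valid actions rather than from the full action set. The function names, the `valid_actions` list, and the parameter values are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

def choose_action(q, state, valid_actions, epsilon, rng):
    """Epsilon-greedy choice; the random branch samples only valid actions.

    Assumption: restricting exploration to valid actions is our reading of
    "half-random" selection, not the paper's confirmed rule.
    """
    if not valid_actions:
        raise ValueError("no valid action available in this state")
    if rng.random() < epsilon:
        # Exploration: never draws an action that is invalid in this state.
        return rng.choice(valid_actions)
    # Exploitation: best known valid action under the current Q-table.
    return max(valid_actions, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, next_valid,
             alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update over the valid next actions."""
    best_next = max((q[(next_state, a)] for a in next_valid), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])

# Hypothetical usage: tasks 1 and 3 are the only ones still assignable.
q = defaultdict(float)
rng = random.Random(0)
action = choose_action(q, "s0", [1, 3], epsilon=0.2, rng=rng)
q_update(q, "s0", action, reward=1.0, next_state="s1", next_valid=[3])
```

In plain epsilon-greedy exploration, a uniformly random draw over all actions can land on an assignment that violates a constraint and wastes the step; masking the draw is one simple way to avoid that.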

Entropy, 2021, Vol 23 (6), pp. 737
Author(s): Fengjie Sun, Xianchang Wang, Rui Zhang

An unmanned aerial vehicle (UAV) can greatly reduce manpower in agricultural plant protection tasks such as watering, sowing, and pesticide spraying. It is essential to develop a Decision-making Support System (DSS) that helps UAVs choose the correct action in each state according to a policy. In an unknown environment, hand-formulated rules for choosing actions are not applicable, and obtaining the optimal policy through reinforcement learning is a feasible solution. However, experiments show that existing reinforcement learning algorithms cannot obtain the optimal policy for a UAV in the agricultural plant protection environment. In this work we propose an improved Q-learning algorithm based on similar state matching, and we prove theoretically that, in the agricultural plant protection environment, a UAV following the policy learned by the proposed algorithm has a greater probability of choosing the optimal action than one following the policy learned by classic Q-learning. The proposed algorithm is implemented and tested on evenly distributed datasets based on real UAV parameters and real farm information. The performance evaluation of the algorithm is discussed in detail. Experimental results show that the proposed algorithm can efficiently learn the optimal policy for UAVs in the agricultural plant protection environment.
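The abstract does not define "similar state matching", so the sketch below only illustrates the general idea under a loud assumption: when the agent meets a state it has never visited, it borrows the Q-values of the closest visited state. The distance measure, the state encoding as numeric tuples, and all names are hypothetical, not the authors' method.

```python
def nearest_known_state(state, q_table):
    """Return the visited state closest to `state`, or None if none exist.

    Assumption: states are numeric tuples compared by squared Euclidean
    distance; the paper may use a different similarity measure.
    """
    if not q_table:
        return None
    return min(
        q_table,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s, state)),
    )

def greedy_action(state, q_table, actions):
    """Greedy action, borrowing Q-values from a similar state if unseen."""
    source = state if state in q_table else nearest_known_state(state, q_table)
    if source is None:
        return actions[0]  # nothing learned yet; arbitrary default
    values = q_table[source]
    return max(actions, key=lambda a: values.get(a, 0.0))

# Hypothetical Q-table keyed by (x, y) grid cells of a field.
q_table = {
    (0, 0): {"spray": 1.0, "move": 0.2},
    (5, 5): {"spray": 0.1, "move": 0.9},
}
unseen_choice = greedy_action((1, 0), q_table, ["spray", "move"])
```

Here the unseen cell `(1, 0)` is closest to `(0, 0)`, so the agent reuses that cell's values instead of acting arbitrarily.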


2020
Author(s): Lídia Rocha, Kelen Vivaldini

Unmanned Aerial Vehicles (UAVs) have been increasingly employed in missions with a pre-defined path. Over the years, UAVs have become necessary in complex environments, where traditional algorithms demand high computational cost and execution time. Meta-heuristic algorithms are used to solve this problem. Meta-heuristics are generic algorithms that solve problems without having to describe each step to the result, searching for the best possible answer in an acceptable computational time. The simulations were run in Python, and a statistical analysis based on execution time and path length was carried out comparing Particle Swarm Optimization (PSO), Grey Wolf Optimization (GWO), and Glowworm Swarm Optimization (GSO). Although GWO returned paths in a shorter time, PSO showed better performance, with similar execution time and shorter path length. However, the reliability of the algorithms depends on the size of the environment: PSO is less reliable in large environments, while GWO maintains the same reliability.
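For readers unfamiliar with PSO, the following is a minimal sketch of the textbook global-best variant applied to a toy objective. It is not the authors' path-planning code: the objective function, the bounds, and the coefficient values `w`, `c1`, and `c2` are illustrative assumptions.

```python
import random

def pso(objective, dim, bounds, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best particle swarm optimization (minimization)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position overall

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)

best, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

In a path-planning setting, each particle would instead encode a candidate path (for example, a sequence of waypoints) and the objective would score its length and feasibility; that encoding is left out here.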


Author(s): Shaurya Shriyam, Satyandra K. Gupta

Most complex missions consist of spatially separated tasks that have to be completed by teams of mobile robots. The main challenges in planning such missions are forming effective coalitions among the available robots and assigning them to tasks in such a way that the expected mission completion time is minimized. Our model allows task execution by a fraction of the assigned team even when the rest of the team has not yet arrived at the task location. We also allow tasks to be interrupted and robots of assigned teams to be rescheduled from an unfinished task to another task. We describe five different heuristic algorithms to compute schedules for all robots assigned to the mission. We compare them and analyze the computational performance of the best-performing strategy. We also show how to handle uncertainty that may arise during traveling or task execution and then study the effect of varying uncertainty on the minimization of mission completion time.


2009, Vol 3 (1), pp. 16-26
Author(s): Peng-Yeng Yin, Benjamin B.M. Shao, Yung-Pin Cheng, Chung-Chao Yeh

We consider the assignment of program tasks to processors in distributed computing systems such that system cost is minimized and resource constraints are satisfied. Several formulations for this task assignment problem (TAP) have been proposed in the literature. Most of these TAP formulations, however, are NP-complete, and thus finding exact solutions is computationally intractable. Recently, some approximation methods such as simulated annealing have been proposed, and simulation results have shown the potential of metaheuristics for solving the TAP. To better understand the strengths and weaknesses of various metaheuristics applied to the TAP, we first propose two alternative metaheuristics, one using a genetic algorithm and the other a reinforcement learning algorithm, together with their implementation details. Extensive computational evidence for the two heuristic algorithms, measured against simulated annealing, is presented, compared, and discussed. Based on these experimental results, a hybrid strategy employing both metaheuristics is then proposed to solve the TAP more effectively and efficiently.
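As a concrete picture of the simulated annealing baseline mentioned above, here is a minimal sketch on a simplified TAP. The cost model (per-task execution cost plus a penalty for exceeding processor capacity), the cooling schedule, and every name and parameter value are assumptions for illustration; the abstract does not give the authors' exact formulation.

```python
import math
import random

def assignment_cost(assign, exec_cost, load, capacity, penalty=1000.0):
    """Execution cost plus a penalty per unit of capacity overflow."""
    total = sum(exec_cost[t][p] for t, p in enumerate(assign))
    used = [0.0] * len(capacity)
    for t, p in enumerate(assign):
        used[p] += load[t]
    overflow = sum(max(0.0, u - c) for u, c in zip(used, capacity))
    return total + penalty * overflow

def anneal(exec_cost, load, capacity, steps=5000, t0=10.0,
           cooling=0.999, seed=0):
    """Simulated annealing over task-to-processor assignments."""
    rng = random.Random(seed)
    n_tasks, n_procs = len(exec_cost), len(capacity)
    current = [rng.randrange(n_procs) for _ in range(n_tasks)]
    cur_cost = assignment_cost(current, exec_cost, load, capacity)
    best, best_cost = current[:], cur_cost
    temp = t0
    for _ in range(steps):
        # Neighbor move: reassign one randomly chosen task.
        t = rng.randrange(n_tasks)
        old = current[t]
        current[t] = rng.randrange(n_procs)
        new_cost = assignment_cost(current, exec_cost, load, capacity)
        delta = new_cost - cur_cost
        # Accept improvements always, worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        else:
            current[t] = old  # reject the move
        temp *= cooling       # geometric cooling
    return best, best_cost

# Hypothetical instance: 4 tasks, 2 processors.
exec_cost = [[4, 9], [7, 3], [5, 5], [8, 2]]   # exec_cost[task][processor]
load = [2, 2, 1, 3]                            # resource demand per task
capacity = [4, 5]                              # resource limit per processor
best, best_cost = anneal(exec_cost, load, capacity)
```

Handling the resource constraint through a penalty term keeps every assignment representable, so the neighbor move stays trivial; a stricter variant would reject infeasible moves outright.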


2009, Vol 28 (12), pp. 3268-3270
Author(s): Chao WANG, Jing GUO, Zhen-qiang BAO
