decentralized execution
Recently Published Documents

TOTAL DOCUMENTS: 37 (FIVE YEARS: 15)
H-INDEX: 5 (FIVE YEARS: 1)

Actuators, 2021, Vol. 10 (10), pp. 268
Author(s): Dongyu Fan, Haikuo Shen, Lijing Dong

In many existing multi-agent reinforcement learning tasks, each agent observes all the other agents from its own perspective, and training is centralized: the critic of each agent can access the policies of all the agents. This scheme is limiting because, in practical applications, an agent can only obtain information from the agents within its communication range. Therefore, in this paper, a multi-agent distributed deep deterministic policy gradient (MAD3PG) approach with decentralized actors and distributed critics is presented to realize multi-agent distributed tracking. The distinguishing feature of the proposed framework is its distributed training with decentralized execution, in which each critic takes only the agent's own policy and its neighbors' policies into account. Experiments were conducted on distributed tracking tasks in multi-agent particle environments, where N agents (N = 3, N = 5) track a target agent under partial observation. The results show that the proposed method achieves a higher reward with a shorter training time than other methods, including MADDPG, DDPG, PPO, and DQN, leading to more efficient and effective multi-agent tracking.
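
The neighbor-restricted critic described above can be sketched as follows (a minimal illustration, not the authors' released code; `obs_dim`, `act_dim`, and `max_neighbors` are assumed names, and out-of-range agents are zero-padded):

```python
# Minimal sketch of a MAD3PG-style distributed critic: it scores the agent's
# own observation-action pair together with those of its neighbors only,
# instead of all N agents as in a MADDPG centralized critic.
import torch
import torch.nn as nn

class DistributedCritic(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, max_neighbors: int):
        super().__init__()
        # Input covers the agent itself plus at most `max_neighbors` peers.
        in_dim = (max_neighbors + 1) * (obs_dim + act_dim)
        self.q = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, own_oa: torch.Tensor, neighbor_oa: torch.Tensor):
        # own_oa:      (batch, obs_dim + act_dim)
        # neighbor_oa: (batch, max_neighbors, obs_dim + act_dim),
        #              zero-padded for agents outside communication range.
        flat = neighbor_oa.flatten(start_dim=1)
        return self.q(torch.cat([own_oa, flat], dim=-1))
```

A MADDPG-style critic would instead concatenate the observation-action pairs of all N agents, which is exactly what the communication-range constraint rules out.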


Information, 2021, Vol. 12 (9), pp. 343
Author(s): Chunyang Hu, Jingchen Li, Haobin Shi, Bin Ning, Qiong Gu

Researchers have used reinforcement learning to learn offloading strategies for multi-access edge computing systems. However, large-scale systems are ill-suited to conventional reinforcement learning because of their huge state and offloading-action spaces. For this reason, this work introduces the centralized training and decentralized execution mechanism, designing a decentralized reinforcement learning model for multi-access edge computing systems. Considering a cloud server and several edge servers, we separate training from execution in the reinforcement learning model: execution happens on the edge devices, with no communication required between edge servers, while training occurs at the cloud, which keeps transmission latency low. The developed method uses a deep deterministic policy gradient algorithm to optimize offloading strategies. Simulated experiments show that our method learns an offloading strategy for each edge device efficiently.
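
The training/execution split might be organized roughly as below (a schematic, runnable sketch under our reading of the abstract; `EdgeDevice`, `CloudTrainer`, the random stub actor, and the dict-based weights are illustrative placeholders for the paper's DDPG networks):

```python
# Schematic of centralized training / decentralized execution for offloading:
# edge devices act locally and never talk to each other; the cloud gathers
# transitions, runs the (DDPG) updates, and broadcasts fresh actor weights.
import random

class EdgeDevice:
    """Runs a local copy of the actor; never contacts other edge servers."""
    def __init__(self, weights):
        self.weights = weights                 # pushed down from the cloud

    def act(self, state):
        # Decentralized execution from local state only (queue length,
        # channel gain, ...); a real actor network replaces this stub.
        return random.choice(["local", "offload"])

class CloudTrainer:
    """Centralized training: collects transitions, performs the updates."""
    def __init__(self):
        self.replay = []

    def collect(self, transition):
        self.replay.append(transition)         # uplinked by edge devices

    def updated_weights(self):
        # Stand-in for sampling a minibatch and taking a gradient step.
        return {"version": len(self.replay)}

cloud = CloudTrainer()
edges = [EdgeDevice(cloud.updated_weights()) for _ in range(3)]
for step in range(5):
    for i, edge in enumerate(edges):
        action = edge.act(state={"queue": step})
        cloud.collect((i, step, action))       # uplink transition to cloud
    for edge in edges:                         # broadcast fresh actor weights
        edge.weights = cloud.updated_weights()
```

Only transitions and weights cross the edge-cloud link; no edge server ever needs to communicate with another.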


2021
Author(s): Zhenhui Ye

In this paper, we aim to design a deep reinforcement learning (DRL) based control solution that navigates a swarm of unmanned aerial vehicles (UAVs) to fly around an unexplored target area while providing optimal communication coverage for ground mobile users. In contrast to existing DRL-based solutions, which mainly solve the problem with global observation and centralized training, a practical and efficient Decentralized Training and Decentralized Execution (DTDE) framework is desirable to train and deploy each UAV in a distributed manner. To this end, we propose a novel DRL approach named Deep Recurrent Graph Network (DRGN), which uses a Graph Attention Network-based Flying Ad-hoc Network (GAT-FANET) for inter-UAV communication and a Gated Recurrent Unit (GRU) to record historical information. We conducted extensive experiments to determine an appropriate structure for GAT-FANET and to examine the performance of DRGN. The simulation results show that the proposed model outperforms four state-of-the-art DRL-based approaches and four heuristic baselines, and demonstrate the scalability, transferability, robustness, and interpretability of DRGN.
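
The two named components might combine roughly as follows (a minimal sketch; the single attention head and layer sizes are our assumptions, not the paper's exact GAT-FANET architecture):

```python
# Minimal per-UAV cell: GAT-style attention aggregates messages received over
# the FANET from neighbors, and a GRU carries historical information forward.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DRGNCell(nn.Module):
    def __init__(self, obs_dim: int, hid: int = 64):
        super().__init__()
        self.embed = nn.Linear(obs_dim, hid)
        self.attn = nn.Linear(2 * hid, 1)     # GAT-style pairwise scoring
        self.gru = nn.GRUCell(hid, hid)       # records historical information

    def forward(self, obs, neighbor_h, h_prev):
        # obs:        (obs_dim,)  this UAV's local observation
        # neighbor_h: (k, hid)    embeddings received from k neighbors
        # h_prev:     (hid,)      recurrent state from the previous step
        e = torch.tanh(self.embed(obs))                         # (hid,)
        pairs = torch.cat([e.expand_as(neighbor_h), neighbor_h], dim=-1)
        alpha = F.softmax(self.attn(pairs).squeeze(-1), dim=0)  # (k,)
        msg = (alpha.unsqueeze(-1) * neighbor_h).sum(dim=0)     # aggregated
        h = self.gru((e + msg).unsqueeze(0), h_prev.unsqueeze(0))
        return h.squeeze(0)   # feeds the policy head and the next time step
```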


Author(s): Lei Fang, Zhongguo Yang, Shenghui Qin, Mingzhu Zhang, Sikandar Ali, ...

Author(s): Kegong Diao

This paper shares a vision that sustainable water supply requires resilient water infrastructures, presumed to operate in a centralized control and decentralized execution (CCDE) mode with multiscale resilience. CCDE should be planned around the multiscale structure of water infrastructures, in which systems are divided into a number of hierarchically organized subsystems. CCDE allows all subsystems to execute independently under normal conditions, yet coordinates subsystems at different scales to mitigate disturbances during failure events, i.e., multiscale resilience. This vision is discussed in detail for water distribution systems (WDSs). Specifically, the conceptual design of multiscale CCDE is described, and progress on understanding multiscale structures in WDSs is summarized through a literature review. Furthermore, several theories consistent with the multiscale CCDE concept are discussed, including decomposition theorems, fractal theory, control theory, and complex network theory. The next step in the vision is to identify the optimal multiscale structure for CCDE based on the best trade-off among the different goals of WDS analysis and management. This process needs support not only from innovative modelling tools, extensive datasets, and theories, but also from inspiring exemplar systems, e.g., natural systems.
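
As a toy illustration of one cited theory, complex network theory, a WDS can be modelled as a graph and partitioned into candidate subsystems by modularity-based community detection (the six-node network below is hypothetical, not from the paper):

```python
# Nodes are junctions, edges are pipes; community detection exposes a
# candidate multiscale structure whose parts could execute independently
# under normal operation (the decentralized-execution half of CCDE).
import networkx as nx
from networkx.algorithms import community

wds = nx.Graph([("A", "B"), ("B", "C"), ("A", "C"),   # district 1
                ("D", "E"), ("E", "F"), ("D", "F"),   # district 2
                ("C", "D")])                          # inter-district main

subsystems = community.greedy_modularity_communities(wds)
for i, sub in enumerate(subsystems):
    print(f"subsystem {i}: {sorted(sub)}")
```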


2021, Vol. 11 (7), pp. 2895
Author(s): Ahmed Elfakharany, Zool Hilmi Ismail

In this paper, we present a novel deep reinforcement learning (DRL) based method that performs multi-robot task allocation (MRTA) and navigation in an end-to-end fashion. The policy operates in a decentralized manner, mapping raw sensor measurements to the robot's steering commands without constructing a map of the environment. We also present a new metric, the Task Allocation Index (TAI), which measures how well a method that performs MRTA and navigation end-to-end handles the task allocation itself. The policy was trained in a simulated Gazebo environment under the centralized learning and decentralized execution paradigm, and was evaluated both quantitatively and visually. The simulation results, with the method deployed on multiple TurtleBot3 robots, show its effectiveness.
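
The decentralized, map-free policy interface described above might look roughly like this (a hedged sketch; the 360-beam scan, goal vector, and two-layer network are our assumptions, not the paper's architecture):

```python
# Each robot runs its own copy of this policy at execution time; raw sensor
# readings map directly to steering commands with no map-building step. The
# shared critic used during centralized training is omitted here.
import torch
import torch.nn as nn

class End2EndPolicy(nn.Module):
    def __init__(self, scan_dim: int = 360, goal_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(scan_dim + goal_dim, 256), nn.ReLU(),
            nn.Linear(256, 2),     # (linear velocity, angular velocity)
            nn.Tanh(),             # bounded steering commands
        )

    def forward(self, scan: torch.Tensor, goal: torch.Tensor):
        # scan: (batch, scan_dim) raw range readings
        # goal: (batch, goal_dim) relative position of the allocated task
        return self.net(torch.cat([scan, goal], dim=-1))
```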


Sensors, 2020, Vol. 20 (16), pp. 4546
Author(s): Weiwei Zhao, Hairong Chu, Xikui Miao, Lihong Guo, Honghai Shen, ...

Multiple unmanned aerial vehicle (UAV) collaboration has great potential. To increase the intelligence and environmental adaptability of multi-UAV control, we study the application of deep reinforcement learning algorithms to multi-UAV cooperative control. To address the non-stationary environment caused by changing agent strategies during multi-agent reinforcement learning, the paper presents an improved multi-agent reinforcement learning algorithm, the multi-agent joint proximal policy optimization (MAJPPO) algorithm, with centralized learning and decentralized execution. The algorithm uses moving-window averaging to give each agent a centralized state-value function, so that the agents can collaborate more effectively. The improved algorithm enhances collaboration and increases the total reward obtained by the multi-agent system. To evaluate its performance, we use MAJPPO to perform multi-UAV formation flight and traversal of multi-obstacle environments. To limit control complexity, we use a six-degree-of-freedom, 12-state UAV dynamics model with an attitude control loop. The experimental results show that MAJPPO achieves better performance and better environmental adaptability.
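
The moving-window averaging step might be as simple as the following (our reading of the abstract, not the authors' code; the window size of 5 is illustrative):

```python
# Smoothing shared value estimates over a fixed window gives each agent a
# more stationary "centralized" state value, damping the non-stationarity
# caused by the other agents' changing policies.
from collections import deque

class WindowedValue:
    def __init__(self, window: int = 5):
        self.values = deque(maxlen=window)

    def update(self, joint_value: float) -> float:
        # `joint_value` would come from evaluating the value networks on the
        # joint state at the current step.
        self.values.append(joint_value)
        return sum(self.values) / len(self.values)
```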


Author(s): Peixi Peng, Junliang Xing, Lili Cao

This paper aims to learn multi-agent cooperation in which each agent performs its actions in a decentralized way. In this setting, learning decentralized policies is very challenging when the rewards are global and sparse. Learning from demonstrations (LfD) has recently provided a promising way to handle this challenge; however, in many practical tasks, the available demonstrations are often sub-optimal. To learn better policies from these sub-optimal demonstrations, this paper follows a centralized learning and decentralized execution framework and proposes a novel hybrid learning method based on multi-agent actor-critic. First, the expert trajectory returns generated from demonstration actions are used to pre-train the centralized critic network. Then, multi-agent decisions are made by best response dynamics based on the critic and are used to train the decentralized actor networks. Finally, the demonstrations are updated by the actor networks, and the critic and actor networks are learned jointly by running the two steps above alternately. We evaluate the proposed approach on a real-time strategy combat game. Experimental results show that the approach outperforms many competing demonstration-based methods.
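
The best response dynamics step can be illustrated on a toy problem (a runnable stand-in, far simpler than the paper's real-time strategy game; the 4x4 random table plays the role of the pre-trained centralized critic):

```python
# With a centralized critic Q over joint actions, each agent repeatedly
# switches to the action that maximizes Q given the other agent's current
# choice; the resulting joint action is what trains the decentralized actors.
import numpy as np

rng = np.random.default_rng(0)
Q = rng.random((4, 4))              # stand-in centralized critic: Q[a1, a2]

a1, a2 = 0, 0                       # initial joint action
for _ in range(10):                 # best response dynamics
    a1 = int(np.argmax(Q[:, a2]))   # agent 1 best-responds to agent 2
    a2 = int(np.argmax(Q[a1, :]))   # agent 2 best-responds to agent 1
print("joint action after best-response dynamics:", (a1, a2))
```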

