Exploring communication protocols and centralized critics in multi-agent deep learning

Tackling multi-agent environments where each agent has only a limited, local observation of the global state is a non-trivial task that often requires hand-tuned solutions. A team of agents coordinating in such scenarios must handle the complex underlying environment while each agent has only partial knowledge of it. Deep reinforcement learning has been shown to achieve super-human performance in single-agent environments, and has since been adapted to the multi-agent paradigm. This paper proposes A3C3, a multi-agent deep learning algorithm in which agents are evaluated by a centralized referee during the learning phase but remain independent of each other during execution. The referee's neural network is augmented with a permutation-invariant architecture to increase its scalability to large teams. A3C3 also allows agents to learn communication protocols through which they share relevant information with their team members, allowing them to overcome their limited knowledge and achieve coordination. A3C3 and its permutation-invariant augmentation are evaluated in multiple multi-agent test-beds, which include partially-observable scenarios, swarm environments, and complex 3D soccer simulations.
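The abstract does not say which permutation-invariant architecture the referee uses. As a minimal sketch, assuming a shared per-agent embedding followed by mean pooling (one common way to obtain permutation invariance, not necessarily the authors' choice), the input to a centralized critic could be built as below. The names `embed` and `pooled_critic_input`, the single tanh layer, and the dimensions are all hypothetical.

```python
import math
import random

def embed(obs, weights):
    """Shared embedding (assumed): one tanh layer applied to a single agent's observation."""
    return [math.tanh(sum(o * w for o, w in zip(obs, row))) for row in weights]

def pooled_critic_input(team_obs, weights):
    """Mean-pool the per-agent embeddings so that agent order does not matter."""
    embedded = [embed(obs, weights) for obs in team_obs]
    n = len(embedded)
    # math.fsum returns an exactly rounded sum, so the result does not
    # depend on the order in which agents are listed.
    return [math.fsum(dim) / n for dim in zip(*embedded)]

random.seed(0)
OBS_DIM, EMBED_DIM = 4, 3  # illustrative sizes only
weights = [[random.gauss(0, 1) for _ in range(OBS_DIM)] for _ in range(EMBED_DIM)]
team = [[random.gauss(0, 1) for _ in range(OBS_DIM)] for _ in range(5)]

pooled = pooled_critic_input(team, weights)
pooled_reversed = pooled_critic_input(team[::-1], weights)
pooled_small_team = pooled_critic_input(team[:2], weights)
```

Because the pooled vector has a fixed length regardless of how many agents contribute to it, the same critic network can in principle be reused as the team grows, which is the scalability argument the abstract makes.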

Multi Agent Deep Learning with Cooperative Communication

Journal of Artificial Intelligence and Soft Computing Research ◽

10.2478/jaiscr-2020-0013 ◽

2020 ◽

Vol 10 (3) ◽

pp. 189-207

Author(s):

David Simões ◽

Nuno Lau ◽

Luís Paulo Reis

Keyword(s):

Deep Learning ◽

Cooperative Communication ◽

State Of The Art ◽

Communication Protocols ◽

Relevant Information ◽

Distributed Execution ◽

Multi Agent ◽

Multiple Environments ◽

Partially Observable ◽

Multi Agents

We consider the problem of multiple agents cooperating in a partially-observable environment. Agents must learn to coordinate and share relevant information to solve the tasks successfully. This article describes Asynchronous Advantage Actor-Critic with Communication (A3C2), an end-to-end differentiable approach where agents learn policies and communication protocols simultaneously. A3C2 uses a centralized learning, distributed execution paradigm, and supports independent agents, dynamic team sizes, partially-observable environments, and noisy communications. We compare A3C2 against other state-of-the-art proposals and show that it outperforms them in multiple environments.

Comparison of Naive Bayes, Back Propagation, And Deep Learning algorithm to Measure the Performance Using Datasets

i-manager’s Journal on Software Engineering ◽

10.26634/jse.11.2.13443 ◽

2016 ◽

Vol 11 (2) ◽

pp. 1

Author(s):

SHARMILA J.

Keyword(s):

Deep Learning ◽

Naive Bayes ◽

Learning Algorithm ◽

Back Propagation ◽

Naïve Bayes ◽

Deep Learning Algorithm

Development and Validation of a Deep Learning Algorithm to Evaluate Endoscopic Disease Activity of Ulcerative Colitis.

Case Medical Research ◽

10.31525/ct1-nct03973437 ◽

2019 ◽

Keyword(s):

Ulcerative Colitis ◽

Deep Learning ◽

Disease Activity ◽

Learning Algorithm ◽

Deep Learning Algorithm ◽

Development And Validation

Review for "Smart grid security enhancement by detection and classification of non‐technical losses employing deep learning algorithm"

10.1002/2050-7038.12521/v1/review4 ◽

2020 ◽

Keyword(s):

Deep Learning ◽

Smart Grid ◽

Learning Algorithm ◽

Grid Security ◽

Security Enhancement ◽

Smart Grid Security ◽

Deep Learning Algorithm

Review for "Smart grid security enhancement by detection and classification of non‐technical losses employing deep learning algorithm"

10.1002/2050-7038.12521/v1/review2 ◽

2020 ◽

Keyword(s):

Deep Learning ◽

Smart Grid ◽

Learning Algorithm ◽

Grid Security ◽

Security Enhancement ◽

Smart Grid Security ◽

Deep Learning Algorithm

Faculty Opinions recommendation of Classification of the clinical images for benign and malignant cutaneous tumors using a deep learning algorithm.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.732648119.793554280 ◽

2018 ◽

Author(s):

Christine Ko

Keyword(s):

Deep Learning ◽

Learning Algorithm ◽

Clinical Images ◽

Deep Learning Algorithm

Barcode Detection and Classification using SSD (Single Shot Multibox Detector) Deep Learning Algorithm

SSRN Electronic Journal ◽

10.2139/ssrn.3568499 ◽

2020 ◽

Author(s):

Akshata Kolekar ◽

Vipul Dalal

Keyword(s):

Deep Learning ◽

Learning Algorithm ◽

Single Shot ◽

Deep Learning Algorithm

Study on Radar Echo-Filling in an Occlusion Area by a Deep Learning Algorithm

Remote Sensing ◽

10.3390/rs13091779 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1779

Author(s):

Xiaoyan Yin ◽

Zhiqun Hu ◽

Jiafeng Zheng ◽

Boyong Li ◽

Yuanyuan Zuo

Keyword(s):

Deep Learning ◽

Loss Function ◽

Learning Algorithm ◽

Weather Radar ◽

Loss Functions ◽

Training Dataset ◽

Echo Intensity ◽

Common Mean ◽

Deep Learning Algorithm ◽

Radar Beam

Radar beam blockage is an important error source that affects the quality of weather radar data. An echo-filling network (EFnet) based on a deep learning algorithm is proposed to correct the echo intensity in the occlusion area of the Nanjing S-band new-generation weather radar (CINRAD/SA). The training dataset pairs labels, which are the echo intensity at the 0.5° elevation in the unblocked area, with input features, which are the intensities in a cube of multiple elevations and gates corresponding to the location of the bottom labels. Two loss functions are applied to compile the network: one is the common mean square error (MSE), and the other is a self-defined loss function that increases the weight of strong echoes. Considering that the radar beam broadens with distance and height, the 0.5° elevation scan is divided into six range bands of 25 km each, and a separate model is trained for each band. The models are evaluated by three indicators: explained variance (EVar), mean absolute error (MAE), and correlation coefficient (CC). Two cases are presented to compare the echo-filling model under the different loss functions. The results suggest that EFnet can effectively correct the echo reflectivity and improve the data quality in the occlusion area, and that results for strong echoes are better when the self-defined loss function is used.
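The abstract does not give the form of the self-defined loss, only that it increases the weight of strong echoes. The sketch below is therefore an illustration under stated assumptions: a step weighting in which samples at or above a hypothetical threshold count several times more than the rest, alongside standard definitions of the three evaluation indicators. The threshold (35.0), the weight (5.0), and the sample values are made up for the example and are not taken from the paper.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Population variance."""
    m = mean(xs)
    return mean([(x - m) ** 2 for x in xs])

def weighted_mse(pred, target, threshold=35.0, strong_weight=5.0):
    """Weighted MSE: targets >= threshold (assumed cut-off) get strong_weight."""
    weights = [strong_weight if t >= threshold else 1.0 for t in target]
    total = sum(w * (p - t) ** 2 for w, p, t in zip(weights, pred, target))
    return total / sum(weights)

def mae(pred, target):
    """Mean absolute error."""
    return mean([abs(p - t) for p, t in zip(pred, target)])

def explained_variance(pred, target):
    """EVar = 1 - Var(target - pred) / Var(target)."""
    residual = [t - p for p, t in zip(pred, target)]
    return 1.0 - variance(residual) / variance(target)

def correlation(pred, target):
    """Pearson correlation coefficient between prediction and target."""
    mp, mt = mean(pred), mean(target)
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, target))
    norm = math.sqrt(sum((p - mp) ** 2 for p in pred) * sum((t - mt) ** 2 for t in target))
    return cov / norm

# Made-up echo intensities; the two strongest targets are under-predicted.
target = [10.0, 20.0, 40.0, 50.0]
pred = [12.0, 18.0, 36.0, 46.0]

plain = weighted_mse(pred, target, strong_weight=1.0)  # reduces to ordinary MSE
weighted = weighted_mse(pred, target)
```

With this weighting, errors on strong echoes dominate the loss, so a model that under-predicts them is penalized more heavily than it would be under plain MSE.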
