Generating individual intrinsic reward for cooperative multiagent reinforcement learning

Multiagent reinforcement learning holds considerable promise for cooperative multiagent tasks. Unfortunately, in cooperative tasks the single global reward shared by all agents may lead to the lazy agent problem. To cope with this problem, we propose an algorithm for generating individual intrinsic rewards. It introduces an intrinsic reward encoder that generates an individual intrinsic reward for each agent, and it uses hypernetworks as the decoder to help estimate the individual action values of decomposition methods from the generated rewards. Experimental results on the StarCraft II micromanagement benchmark show that the proposed algorithm can increase learning efficiency and improve policy performance.


Towards High-Level Intrinsic Exploration in Reinforcement Learning

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/733 ◽

2020 ◽

Author(s):

Nicolas Bougie ◽

Ryutaro Ichise

Keyword(s):

Reinforcement Learning ◽

Time Horizon ◽

State Of The Art ◽

Experimental Results ◽

Prior Work ◽

Extrinsic Rewards ◽

Intrinsic Reward ◽

Long Time ◽

End To End ◽

High Level

Deep reinforcement learning (DRL) methods traditionally struggle with tasks where environment rewards are sparse or delayed, so exploration remains one of the key challenges of DRL. Instead of relying solely on extrinsic rewards, many state-of-the-art methods use intrinsic curiosity as an exploration signal. While these methods hold promise of better local exploration, discovering global exploration strategies is beyond their reach. We propose a novel end-to-end intrinsic reward formulation that introduces high-level exploration in reinforcement learning. Our curiosity signal is driven by a fast reward that deals with local exploration and a slow reward that incentivizes long-time-horizon exploration strategies. We formulate curiosity as the error in an agent’s ability to reconstruct the observations given their contexts. Experimental results show that this high-level exploration enables our agents to outperform prior work in several Atari games.
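As a rough illustration of the idea in this abstract, the sketch below treats curiosity as a reconstruction error and adds a weighted mix of a fast and a slow error term to the extrinsic reward. The abstract does not give the actual formulation, so the mean-squared-error measure, the linear mix, and the names `reconstruction_error`, `shaped_reward`, `fast_weight`, `slow_weight`, and `scale` are all assumptions made for illustration only.

```python
from typing import Sequence


def reconstruction_error(observation: Sequence[float],
                         reconstruction: Sequence[float]) -> float:
    """Mean squared error between an observation and its reconstruction.

    Assumption: MSE stands in for whatever error measure the paper uses.
    """
    if len(observation) != len(reconstruction):
        raise ValueError("observation and reconstruction must have the same length")
    if len(observation) == 0:
        raise ValueError("observation must not be empty")
    squared = ((o - r) ** 2 for o, r in zip(observation, reconstruction))
    return sum(squared) / len(observation)


def shaped_reward(extrinsic: float,
                  fast_error: float,
                  slow_error: float,
                  fast_weight: float = 0.5,
                  slow_weight: float = 0.5,
                  scale: float = 0.1) -> float:
    """Extrinsic reward plus a scaled mix of fast and slow curiosity terms.

    Assumption: the linear mix and all weights are hypothetical; the
    abstract only says that a fast and a slow reward drive the signal.
    """
    intrinsic = fast_weight * fast_error + slow_weight * slow_error
    return extrinsic + scale * intrinsic
```

With these hypothetical defaults, an observation that is poorly reconstructed yields a larger bonus, which is the usual way a curiosity term steers an agent toward unfamiliar states.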


FMRQ—A Multiagent Reinforcement Learning Algorithm for Fully Cooperative Tasks

IEEE Transactions on Cybernetics ◽

10.1109/tcyb.2016.2544866 ◽

2017 ◽

Vol 47 (6) ◽

pp. 1367-1379 ◽

Cited By ~ 27

Author(s):

Zhen Zhang ◽

Dongbin Zhao ◽

Junwei Gao ◽

Dongqing Wang ◽

Yujie Dai

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Cooperative Tasks ◽

Multiagent Reinforcement Learning ◽

Reinforcement Learning Algorithm


Learning Automata-Based Multiagent Reinforcement Learning for Optimization of Cooperative Tasks

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2020.3025711 ◽

2020 ◽

pp. 1-14

Author(s):

Zhen Zhang ◽

Dongqing Wang ◽

Junwei Gao

Keyword(s):

Reinforcement Learning ◽

Learning Automata ◽

Cooperative Tasks ◽

Multiagent Reinforcement Learning


Opponent portrait for multiagent reinforcement learning in competitive environment

International Journal of Intelligent Systems ◽

10.1002/int.22594 ◽

2021 ◽

Author(s):

Yuxi Ma ◽

Meng Shen ◽

Yuhang Zhao ◽

Zhao Li ◽

Xiaoyao Tong ◽

...

Keyword(s):

Reinforcement Learning ◽

Competitive Environment ◽

Multiagent Reinforcement Learning


Experience Sharing Based Memetic Transfer Learning for Multiagent Reinforcement Learning

Memetic Computing ◽

10.1007/s12293-021-00339-4 ◽

2021 ◽

Author(s):

Tonghao Wang ◽

Xingguang Peng ◽

Yaochu Jin ◽

Demin Xu

Keyword(s):

Reinforcement Learning ◽

Transfer Learning ◽

Multiagent Reinforcement Learning


Adaptive dynamic programming and deep reinforcement learning for the control of an unmanned surface vehicle: Experimental results

Control Engineering Practice ◽

10.1016/j.conengprac.2021.104807 ◽

2021 ◽

Vol 111 ◽

pp. 104807

Author(s):

Alejandro Gonzalez-Garcia ◽

David Barragan-Alcantar ◽

Ivana Collado-Gonzalez ◽

Leonardo Garrido

Keyword(s):

Dynamic Programming ◽

Reinforcement Learning ◽

Adaptive Dynamic Programming ◽

Experimental Results ◽

Unmanned Surface Vehicle ◽

Adaptive Dynamic


EVA 2.0: Emotional and rational multimodal argumentation between virtual agents

it - Information Technology ◽

10.1515/itit-2020-0050 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Niklas Rach ◽

Klaus Weber ◽

Yuchi Yang ◽

Stefan Ultes ◽

Elisabeth André ◽

...

Keyword(s):

Reinforcement Learning ◽

User Feedback ◽

Virtual Agents ◽

Multi Agent System ◽

Dialogue Game ◽

Multi Agent ◽

The Individual ◽

Minimal Bias ◽

Intuitive Interface ◽

Emotional Level

Persuasive argumentation depends on multiple aspects, which include not only the content of the individual arguments but also the way they are presented. The presentation of arguments is crucial, particularly in the context of dialogical argumentation. However, the effects of different discussion styles on the listener are hard to isolate in human dialogues. In order to demonstrate and investigate various styles of argumentation, we propose a multi-agent system in which different aspects of persuasion can be modelled and investigated separately. Our system utilizes argument structures extracted from text-based reviews for which a minimal bias of the user can be assumed. The persuasive dialogue is modelled as a dialogue game for argumentation, motivated by the objective of enabling both natural and flexible interactions between the agents. In order to support a comparison of factual and affective persuasion approaches, we implemented two fundamentally different strategies for both agents. The logical policy utilizes deep reinforcement learning in a multi-agent setup to optimize the strategy with respect to the game formalism and the available arguments. In contrast, the emotional policy selects the next move in compliance with an agent emotion that is adapted to user feedback in order to persuade on an emotional level. The resulting interaction is presented to the user via virtual avatars and can be rated through an intuitive interface.


HESR innovation in analysis that supports decision making and action

European Journal of Public Health ◽

10.1093/eurpub/ckaa165.1239 ◽

2020 ◽

Vol 30 (Supplement_5) ◽

Author(s):

B Barr

Keyword(s):

Decision Making ◽

Health Equity ◽

Social Protection ◽

Well Being ◽

Status Report ◽

Policy Performance ◽

European Health ◽

The Individual ◽

Health And Well Being ◽

Inform Decision Making

The European Health Equity Status Report makes innovative use of microdata, at the level of the individual, to decompose the relative contributions of five essential underlying conditions to inequities in health and well-being. These essential conditions comprise: (1) health services; (2) income security and social protection; (3) living conditions; (4) social and human capital; and (5) employment and working conditions. Combining microdata from more than twenty sources, the work of HESRi has also produced disaggregated indicators in health, well-being, and each of the five essential conditions. In conjunction with indicators of policy performance and investment, the HESRi Health Equity Dataset of over 100 indicators is the first of its kind: a resource for monitoring and analysing inequities across the essential conditions and policies, to inform decision making and action to reduce gaps in health and well-being.


Devanagari Text Detection From Natural Scene Images

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2020070104 ◽

2020 ◽

Vol 10 (3) ◽

pp. 44-59

Author(s):

Sankirti Sandeep Shiravale ◽

R. Jayadevan ◽

Sanjeev S. Sannakki

Keyword(s):

Edge Detection ◽

Image Understanding ◽

Text Detection ◽

Experimental Results ◽

Combined Approach ◽

Natural Scene ◽

Light Conditions ◽

The Individual ◽

Natural Scene Images ◽

Better Than

Text present in camera-captured scene images is semantically rich and can be used for image understanding. Automatic detection, extraction, and recognition of text are crucial in image understanding applications. Text detection from natural scene images is a tedious task due to complex backgrounds, uneven light conditions, and multi-coloured and multi-sized fonts. Two techniques, namely ‘edge detection’ and ‘colour-based clustering’, are combined in this paper to detect text in scene images. Region properties are used to eliminate falsely generated annotations. A dataset of 1250 images is created and used for experimentation. Experimental results show that the combined approach performs better than the individual approaches.


Simulation of Compressor Performance Deterioration due to Erosion

Volume 3: Coal, Biomass and Alternative Fuels; Combustion and Fuels; Oil and Gas Applications ◽

10.1115/89-gt-182 ◽

1989 ◽

Cited By ~ 1

Author(s):

W. Tabakoff ◽

A. N. Lakshminarasimha ◽

M. Pasin

Keyword(s):

Fault Model ◽

Experimental Results ◽

Performance Tests ◽

Individual Stage ◽

One Stage ◽

Before And After ◽

Performance Deterioration ◽

Compressor Performance ◽

Multistage Compressor ◽

The Individual

Experimental results obtained from cascade and one-stage compressor performance tests before and after erosion were used to test a fault model that represents erosion. This model was implemented in a stage stacking program developed to demonstrate the effect of erosion in a multistage compressor. The effect of individual stage erosion on the overall compressor performance is also demonstrated.
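The stage stacking idea mentioned in this abstract can be sketched in a few lines: the overall pressure ratio of a multistage compressor is the product of its stage pressure ratios, so degrading one stage changes the overall figure. The erosion model below, which removes a fixed fraction of one stage's pressure rise, is a hypothetical stand-in; the abstract does not describe the paper's fault model, and the names `overall_pressure_ratio`, `erode_stage`, and `loss_fraction` are illustrative assumptions.

```python
from math import prod
from typing import List, Sequence


def overall_pressure_ratio(stage_pressure_ratios: Sequence[float]) -> float:
    """Stack stages: the overall ratio is the product of the stage ratios."""
    return prod(stage_pressure_ratios)


def erode_stage(stage_pressure_ratios: Sequence[float],
                stage_index: int,
                loss_fraction: float) -> List[float]:
    """Return a copy in which one stage loses part of its pressure rise.

    Assumption: erosion is modelled as removing `loss_fraction` of the
    stage's pressure rise (ratio - 1). This is a hypothetical fault model,
    not the one used in the paper.
    """
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must lie in [0, 1]")
    eroded = list(stage_pressure_ratios)
    rise = eroded[stage_index] - 1.0
    eroded[stage_index] = 1.0 + rise * (1.0 - loss_fraction)
    return eroded


if __name__ == "__main__":
    clean = [1.2, 1.2, 1.2]  # made-up stage pressure ratios
    damaged = erode_stage(clean, stage_index=0, loss_fraction=0.5)
    print(overall_pressure_ratio(clean), overall_pressure_ratio(damaged))
```

Calling `erode_stage` on each stage index in turn and comparing the stacked results gives a simple view of how sensitive the overall ratio is to damage in an individual stage.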
