Coordination Model with Reinforcement Learning for Ensuring Reliable On-Demand Services in Collective Adaptive Systems

SAMoD: Shared Autonomous Mobility-on-Demand using Decentralized Reinforcement Learning

2018 21st International Conference on Intelligent Transportation Systems (ITSC) ◽

10.1109/itsc.2018.8569608 ◽

2018 ◽

Cited By ~ 9

Author(s):

Maxime Gueriau ◽

Ivana Dusparic

Keyword(s):

Reinforcement Learning ◽

On Demand ◽

Autonomous Mobility

Download Full-text

On-Demand Channel Bonding in Heterogeneous WLANs: A Multi-Agent Deep Reinforcement Learning Approach

Sensors ◽

10.3390/s20102789 ◽

2020 ◽

Vol 20 (10) ◽

pp. 2789 ◽

Cited By ~ 1

Author(s):

Hang Qi ◽

Hao Huang ◽

Zhiqun Hu ◽

Xiangming Wen ◽

Zhaoming Lu

Keyword(s):

Reinforcement Learning ◽

Transmission Rate ◽

Single Agent ◽

Time Of Day ◽

Action Space ◽

Traffic Load ◽

Traffic Demand ◽

Channel Bonding ◽

On Demand ◽

Multi Agent

In order to meet the ever-increasing traffic demand of Wireless Local Area Networks (WLANs), channel bonding is introduced in IEEE 802.11 standards. Although channel bonding effectively increases the transmission rate, the wider channel reduces the number of non-overlapping channels and is more susceptible to interference. Meanwhile, the traffic load differs from one access point (AP) to another and changes significantly depending on the time of day. Therefore, the primary channel and channel bonding bandwidth should be carefully selected to meet traffic demand and guarantee the performance gain. In this paper, we proposed an On-Demand Channel Bonding (O-DCB) algorithm based on Deep Reinforcement Learning (DRL) for heterogeneous WLANs to reduce transmission delay, where the APs have different channel bonding capabilities. In this problem, the state space is continuous and the action space is discrete. However, the size of action space increases exponentially with the number of APs by using single-agent DRL, which severely affects the learning rate. To accelerate learning, Multi-Agent Deep Deterministic Policy Gradient (MADDPG) is used to train O-DCB. Real traffic traces collected from a campus WLAN are used to train and test O-DCB. Simulation results reveal that the proposed algorithm has good convergence and lower delay than other algorithms.

Download Full-text

Analysis and solution of a predator–protector–prey multi-robot system by a high-level reinforcement learning architecture and the adaptive systems theory

Robotics and Autonomous Systems ◽

10.1016/j.robot.2010.08.005 ◽

2010 ◽

Vol 58 (12) ◽

pp. 1266-1272 ◽

Cited By ~ 3

Author(s):

José Antonio Martín H. ◽

Javier de Lope ◽

Darío Maravall

Keyword(s):

Reinforcement Learning ◽

Systems Theory ◽

Adaptive Systems ◽

Robot System ◽

High Level ◽

Multi Robot

Download Full-text

Robot hand-eye cooperation based on improved inverse reinforcement learning

Industrial Robot the international journal of robotics research and application ◽

10.1108/ir-09-2021-0208 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Ning Yu ◽

Lin Nan ◽

Tao Ku

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Visual Information ◽

Industrial Robots ◽

Robot Hand ◽

Inverse Reinforcement Learning ◽

Generative Adversarial Network ◽

Content Type ◽

Coordination Model ◽

Proposed Model

Purpose How to make accurate action decisions based on visual information is one of the important research directions of industrial robots. The purpose of this paper is to design a highly optimized hand-eye coordination model of the robot to improve the robots’ on-site decision-making ability. Design/methodology/approach The combination of inverse reinforcement learning (IRL) algorithm and generative adversarial network can effectively reduce the dependence on expert samples and robots can obtain the decision-making performance that the degree of optimization is not lower than or even higher than that of expert samples. Findings The performance of the proposed model is verified in the simulation environment and real scene. By monitoring the reward distribution of the reward function and the trajectory of the robot, the proposed model is compared with other existing methods. The experimental results show that the proposed model has better decision-making performance in the case of less expert data. Originality/value A robot hand-eye cooperation model based on improved IRL is proposed and verified. Empirical investigations on real experiments reveal that overall, the proposed approach tends to improve the real efficiency by more than 10% when compared to alternative hand-eye cooperation methods.

Download Full-text

A Two-Layer Approach to Developing Self-Adaptive Multi-Agent Systems in Open Environment

International Journal of Agent Technologies and Systems ◽

10.4018/ijats.2014010104 ◽

2014 ◽

Vol 6 (1) ◽

pp. 65-85 ◽

Cited By ~ 2

Author(s):

Xinjun Mao ◽

Menggao Dong ◽

Haibin Zhu

Keyword(s):

Reinforcement Learning ◽

Adaptive Systems ◽

Multi Agent Systems ◽

Uncertain Environments ◽

Implementation Framework ◽

Fine Grain ◽

Adaptation Mechanisms ◽

Multi Agent ◽

Self Adaptation ◽

Self Adaptive

Development of self-adaptive systems situated in open and uncertain environments is a great challenge in the community of software engineering due to the unpredictability of environment changes and the variety of self-adaptation manners. Explicit specification of expected changes and various self-adaptations at design-time, an approach often adopted by developers, seems ineffective. This paper presents an agent-based approach that combines two-layer self-adaptation mechanisms and reinforcement learning together to support the development and running of self-adaptive systems. The approach takes self-adaptive systems as multi-agent organizations and enables the agent itself to make decisions on self-adaptation by learning at run-time and at different levels. The proposed self-adaptation mechanisms that are based on organization metaphors enable self-adaptation at two layers: fine-grain behavior level and coarse-grain organization level. Corresponding reinforcement learning algorithms on self-adaptation are designed and integrated with the two-layer self-adaptation mechanisms. This paper further details developmental technologies, based on the above approach, in establishing self-adaptive systems, including extended software architecture for self-adaptation, an implementation framework, and a development process. A case study and experiment evaluations are conducted to illustrate the effectiveness of the proposed approach.

Download Full-text

Spatio-Temporal Capsule-based Reinforcement Learning for Mobility-on-Demand Coordination

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2020.2992565 ◽

2020 ◽

pp. 1-1

Author(s):

Suining He ◽

Kang G. Shin

Keyword(s):

Reinforcement Learning ◽

On Demand ◽

Spatio Temporal

Download Full-text

Spatio-Temporal Capsule-based Reinforcement Learning for Mobility-on-Demand Network Coordination

The World Wide Web Conference on - WWW '19 ◽

10.1145/3308558.3313401 ◽

2019 ◽

Cited By ~ 6

Author(s):

Suining He ◽

Kang G. Shin

Keyword(s):

Reinforcement Learning ◽

On Demand ◽

Network Coordination ◽

Spatio Temporal

Download Full-text

Multi-Agent Reinforcement Learning for Autonomous On Demand Vehicles

2019 IEEE Intelligent Vehicles Symposium (IV) ◽

10.1109/ivs.2019.8813876 ◽

2019 ◽

Author(s):

Ali Boyali ◽

Naohisa Hashimoto ◽

Vijay John ◽

Tankut Acarman

Keyword(s):

Reinforcement Learning ◽

On Demand ◽

Multi Agent

Download Full-text

Analyzing Strength-Based Classifier System from Reinforcement Learning Perspective

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2009.p0631 ◽

2009 ◽

Vol 13 (6) ◽

pp. 631-639

Author(s):

Atsushi Wada ◽

◽

Keiki Takadama ◽

◽

Keyword(s):

Reinforcement Learning ◽

Adaptive Systems ◽

Classifier Systems ◽

Q Learning ◽

State Action ◽

Classifier System ◽

Learning Classifier ◽

Value Estimation ◽

On Line ◽

On Line Learning

Learning Classifier Systems (LCSs) are rule-based adaptive systems that have both Reinforcement Learning (RL) and rule-discovery mechanisms for effective and practical on-line learning. With the aim of establishing a common theoretical basis between LCSs and RL algorithms to share each field's findings, a detailed analysis was performed to compare the learning processes of these two approaches. Based on our previous work on deriving an equivalence between the Zeroth-level Classifier System (ZCS) and Q-learning with Function Approximation (FA), this paper extends the analysis to the influence of actually applying the conditions for this equivalence. Comparative experiments have revealed interesting implications: (1) ZCS's original parameter, the deduction rate, plays a role in stabilizing the action selection, but (2) from the Reinforcement Learning perspective, such a process inhibits the ability to accurately estimate values for the entire state-action space, thus limiting the performance of ZCS in problems requiring accurate value estimation.

Download Full-text

Building autonomic systems using collaborative reinforcement learning

The Knowledge Engineering Review ◽

10.1017/s0269888906000956 ◽

2006 ◽

Vol 21 (3) ◽

pp. 231-238 ◽

Cited By ~ 12

Author(s):

JIM DOWLING ◽

RAYMOND CUNNINGHAM ◽

EOIN CURRAN ◽

VINNY CAHILL

Keyword(s):

Reinforcement Learning ◽

Ad Hoc ◽

Optimization Problems ◽

System Optimization ◽

Multi Agent Systems ◽

Learning Agents ◽

Coordination Model ◽

Routing Performance ◽

Multi Agent ◽

Unpredictable Environment

This paper presents Collaborative Reinforcement Learning (CRL), a coordination model for online system optimization in decentralized multi-agent systems. In CRL system optimization problems are represented as a set of discrete optimization problems, each of whose solution cost is minimized by model-based reinforcement learning agents collaborating on their solution. CRL systems can be built to provide autonomic behaviours such as optimizing system performance in an unpredictable environment and adaptation to partial failures. We evaluate CRL using an ad hoc routing protocol that optimizes system routing performance in an unpredictable network environment.

Download Full-text