State Action Separable Reinforcement Learning

2021 ◽

Vol 21 (4) ◽

pp. 1-22

Author(s):

Safa Otoum ◽

Burak Kantarci ◽

Hussein Mouftah

Keyword(s):

Reinforcement Learning ◽

Intrusion Detection ◽

Comparative Study ◽

Performance Metrics ◽

Action Learning ◽

Smart Devices ◽

Critical Infrastructures ◽

State Action ◽

Detection Techniques ◽

Depth Analysis

Volunteer computing uses Internet-connected devices (laptops, PCs, smart devices, etc.), in which their owners volunteer them as storage and computing power resources, has become an essential mechanism for resource management in numerous applications. The growth of the volume and variety of data traffic on the Internet leads to concerns on the robustness of cyberphysical systems especially for critical infrastructures. Therefore, the implementation of an efficient Intrusion Detection System for gathering such sensory data has gained vital importance. In this article, we present a comparative study of Artificial Intelligence (AI)-driven intrusion detection systems for wirelessly connected sensors that track crucial applications. Specifically, we present an in-depth analysis of the use of machine learning, deep learning and reinforcement learning solutions to recognise intrusive behavior in the collected traffic. We evaluate the proposed mechanisms by using KDD’99 as real attack dataset in our simulations. Results present the performance metrics for three different IDSs, namely the Adaptively Supervised and Clustered Hybrid IDS (ASCH-IDS), Restricted Boltzmann Machine-based Clustered IDS (RBC-IDS), and Q-learning based IDS (Q-IDS), to detect malicious behaviors. We also present the performance of different reinforcement learning techniques such as State-Action-Reward-State-Action Learning (SARSA) and the Temporal Difference learning (TD). Through simulations, we show that Q-IDS performs with detection rate while SARSA-IDS and TD-IDS perform at the order of .

Download Full-text

Cloud Load Balancing and Reinforcement Learning

Advances in Business Information Systems and Analytics - Cloud Computing Technologies for Green Enterprises ◽

10.4018/978-1-5225-3038-1.ch011 ◽

2018 ◽

pp. 266-291

Author(s):

Abdelghafour Harraz ◽

Mostapha Zbakh

Keyword(s):

Artificial Intelligence ◽

Reinforcement Learning ◽

Load Balancing ◽

Decision Process ◽

Cloud System ◽

Human Intervention ◽

Q Learning ◽

State Action ◽

Learning Techniques ◽

Markov Decision

Artificial Intelligence allows to create engines that are able to explore, learn environments and therefore create policies that permit to control them in real time with no human intervention. It can be applied, through its Reinforcement Learning techniques component, using frameworks such as temporal differences, State-Action-Reward-State-Action (SARSA), Q Learning to name a few, to systems that are be perceived as a Markov Decision Process, this opens door in front of applying Reinforcement Learning to Cloud Load Balancing to be able to dispatch load dynamically to a given Cloud System. The authors will describe different techniques that can used to implement a Reinforcement Learning based engine in a cloud system.

Download Full-text

Deep Reinforcement Learning by Balancing Offline Monte Carlo and Online Temporal Difference Use Based on Environment Experiences

Symmetry ◽

10.3390/sym12101685 ◽

2020 ◽

Vol 12 (10) ◽

pp. 1685 ◽

Cited By ~ 1

Author(s):

Chayoung Kim

Keyword(s):

Monte Carlo ◽

Reinforcement Learning ◽

Real Time ◽

Temporal Difference ◽

Q Learning ◽

State Action ◽

Proposed Model ◽

Reward Functions ◽

And Performance ◽

The Internet Of Things

Owing to the complexity involved in training an agent in a real-time environment, e.g., using the Internet of Things (IoT), reinforcement learning (RL) using a deep neural network, i.e., deep reinforcement learning (DRL) has been widely adopted on an online basis without prior knowledge and complicated reward functions. DRL can handle a symmetrical balance between bias and variance—this indicates that the RL agents are competently trained in real-world applications. The approach of the proposed model considers the combinations of basic RL algorithms with online and offline use based on the empirical balances of bias–variance. Therefore, we exploited the balance between the offline Monte Carlo (MC) technique and online temporal difference (TD) with on-policy (state-action–reward-state-action, Sarsa) and an off-policy (Q-learning) in terms of a DRL. The proposed balance of MC (offline) and TD (online) use, which is simple and applicable without a well-designed reward, is suitable for real-time online learning. We demonstrated that, for a simple control task, the balance between online and offline use without an on- and off-policy shows satisfactory results. However, in complex tasks, the results clearly indicate the effectiveness of the combined method in improving the convergence speed and performance in a deep Q-network.

Download Full-text

Autonomous control of real snake-like robot using reinforcement learning; Abstraction of state-action space using properties of real world

2007 3rd International Conference on Intelligent Sensors, Sensor Networks and Information ◽

10.1109/issnip.2007.4496875 ◽

2007 ◽

Cited By ~ 5

Author(s):

Kazuyuki Ito ◽

Yoshitaka Fukumori ◽

Akihiro Takayama

Keyword(s):

Reinforcement Learning ◽

Real World ◽

Autonomous Control ◽

Action Space ◽

State Action

Download Full-text

Autonomous control of a snake-like robot using reinforcement learning -Discussion of the role of the mechanical body in abstraction of state-action space-

2008 34th Annual Conference of IEEE Industrial Electronics ◽

10.1109/iecon.2008.4758190 ◽

2008 ◽

Author(s):

A. Takayama ◽

K. Ito ◽

T. Minamino

Keyword(s):

Reinforcement Learning ◽

Autonomous Control ◽

Action Space ◽

State Action

Download Full-text

Online Tuning of a PID Controller with a Fuzzy Reinforcement Learning MAS for Flow Rate Control of a Desalination Unit

Electronics ◽

10.3390/electronics8020231 ◽

2019 ◽

Vol 8 (2) ◽

pp. 231 ◽

Cited By ~ 2

Author(s):

Panagiotis Kofinas ◽

Anastasios I. Dounis

Keyword(s):

Reinforcement Learning ◽

Flow Rate ◽

Pid Controller ◽

Hybrid Control ◽

Q Learning ◽

State Action ◽

Continuous State ◽

Multi Agent ◽

Flow Rate Control ◽

Online Tuning

This paper proposes a hybrid Zeigler-Nichols (Z-N) fuzzy reinforcement learning MAS (Multi-Agent System) approach for online tuning of a Proportional Integral Derivative (PID) controller in order to control the flow rate of a desalination unit. The PID gains are set by the Z-N method and then are adapted online through the fuzzy Q-learning MAS. The fuzzy Q-learning is introduced in each agent in order to confront with the continuous state-action space. The global state of the MAS is defined by the value of the error and the derivative of error. The MAS consists of three agents and the output signal of each agent defines the percentage change of each gain. The increment or the reduction of each gain can be in the range of 0% to 100% of its initial value. The simulation results highlight the performance of the suggested hybrid control strategy through comparison with the conventional PID controller tuned by Z-N.

Download Full-text

Analyzing Strength-Based Classifier System from Reinforcement Learning Perspective

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2009.p0631 ◽

2009 ◽

Vol 13 (6) ◽

pp. 631-639

Author(s):

Atsushi Wada ◽

◽

Keiki Takadama ◽

◽

Keyword(s):

Reinforcement Learning ◽

Adaptive Systems ◽

Classifier Systems ◽

Q Learning ◽

State Action ◽

Classifier System ◽

Learning Classifier ◽

Value Estimation ◽

On Line ◽

On Line Learning

Learning Classifier Systems (LCSs) are rule-based adaptive systems that have both Reinforcement Learning (RL) and rule-discovery mechanisms for effective and practical on-line learning. With the aim of establishing a common theoretical basis between LCSs and RL algorithms to share each field's findings, a detailed analysis was performed to compare the learning processes of these two approaches. Based on our previous work on deriving an equivalence between the Zeroth-level Classifier System (ZCS) and Q-learning with Function Approximation (FA), this paper extends the analysis to the influence of actually applying the conditions for this equivalence. Comparative experiments have revealed interesting implications: (1) ZCS's original parameter, the deduction rate, plays a role in stabilizing the action selection, but (2) from the Reinforcement Learning perspective, such a process inhibits the ability to accurately estimate values for the entire state-action space, thus limiting the performance of ZCS in problems requiring accurate value estimation.

Download Full-text

Abstraction of state-action space by utilizing properties of the body and the environment - Application to an autonomous snake-like robot controlled by reinforcement learning -

Journal of Japan Society for Fuzzy Theory and Intelligent Informatics ◽

10.3156/jsoft.21.402 ◽

2009 ◽

Vol 21 (3) ◽

pp. 402-410 ◽

Cited By ~ 1

Author(s):

Kazuyuki ITO ◽

Akihiro TAKAYAMA

Keyword(s):

Reinforcement Learning ◽

The Body ◽

Action Space ◽

State Action

Download Full-text

Reinforcement learning in multi-dimensional state-action space using random rectangular coarse coding and Gibbs sampling

2007 IEEE/RSJ International Conference on Intelligent Robots and Systems ◽

10.1109/iros.2007.4399401 ◽

2007 ◽

Cited By ~ 10

Author(s):

Hajime Kimura

Keyword(s):

Reinforcement Learning ◽

Gibbs Sampling ◽

Action Space ◽

State Action ◽

Coarse Coding

Download Full-text

QUANTUM COMPUTATION FOR ACTION SELECTION USING REINFORCEMENT LEARNING

International Journal of Quantum Information ◽

10.1142/s0219749906002419 ◽

2006 ◽

Vol 04 (06) ◽

pp. 1071-1083 ◽

Cited By ~ 13

Author(s):

C. L. CHEN ◽

D. Y. DONG ◽

Z. H. CHEN

Keyword(s):

Decision Making ◽

Quantum Theory ◽

Reinforcement Learning ◽

Quantum Computation ◽

Action Selection ◽

Selection Method ◽

Superposition State ◽

Exploration And Exploitation ◽

Quantum Superposition ◽

State Action

This paper proposes a novel action selection method based on quantum computation and reinforcement learning (RL). Inspired by the advantages of quantum computation, the state/action in a RL system is represented with quantum superposition state. The probability of action eigenvalue is denoted by probability amplitude, which is updated according to rewards. And the action selection is carried out by observing quantum state according to collapse postulate of quantum measurement. The results of simulated experiments show that quantum computation can be effectively used to action selection and decision making through speeding up learning. This method also makes a good tradeoff between exploration and exploitation for RL using probability characteristics of quantum theory.

Download Full-text

State Action Separable Reinforcement Learning

A Comparative Study of AI-Based Intrusion Detection Techniques in Critical Infrastructures

Cloud Load Balancing and Reinforcement Learning

Deep Reinforcement Learning by Balancing Offline Monte Carlo and Online Temporal Difference Use Based on Environment Experiences

Autonomous control of real snake-like robot using reinforcement learning; Abstraction of state-action space using properties of real world

Autonomous control of a snake-like robot using reinforcement learning -Discussion of the role of the mechanical body in abstraction of state-action space-

Online Tuning of a PID Controller with a Fuzzy Reinforcement Learning MAS for Flow Rate Control of a Desalination Unit

Analyzing Strength-Based Classifier System from Reinforcement Learning Perspective

Abstraction of state-action space by utilizing properties of the body and the environment - Application to an autonomous snake-like robot controlled by reinforcement learning -

Reinforcement learning in multi-dimensional state-action space using random rectangular coarse coding and Gibbs sampling

QUANTUM COMPUTATION FOR ACTION SELECTION USING REINFORCEMENT LEARNING

Export Citation Format