Training Multiagent Systems by Q-Learning: Approaches and Empirical Results

2014 ◽  
Vol 31 (3) ◽  
pp. 498-512 ◽  
Author(s):  
Jose Manuel Lopez-Guede ◽  
Borja Fernandez-Gauna ◽  
Manuel Graña ◽  
Ekaitz Zulueta
2018 ◽  
Author(s):  
Stefan Niculae

Penetration testing is the practice of performing a simulated attack on a computer system in order to reveal its vulnerabilities. The most common approach is for a security expert to gather information and then plan and execute the attack manually. This manual method cannot meet the speed and frequency required for efficient, large-scale security solutions development. To address this, we formalize penetration testing as a security game between an attacker who tries to compromise a network and a defending adversary actively protecting it. We compare multiple algorithms for finding the attacker’s strategy, from fixed-strategy to Reinforcement Learning, namely Q-Learning (QL), Extended Classifier Systems (XCS) and Deep Q-Networks (DQN). The attacker’s strength is measured in terms of speed and stealthiness, in the specific environment used in our simulations. The results show that QL surpasses human performance, XCS yields worse than human performance but is more stable, and the slow convergence of DQN keeps it from achieving exceptional performance. In addition, we find that all of these Machine Learning approaches outperform fixed-strategy attackers.
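
For readers unfamiliar with Q-Learning, the tabular update rule it relies on can be sketched as follows. This is a minimal illustration on a hypothetical five-state chain, not the security game or attacker model described in the abstract above; the function name `q_learning`, the reward values, and all hyperparameters are assumptions chosen only for the example.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy chain (hypothetical environment).

    States are 0..n_states-1; the agent starts at 0 and the episode ends
    when it reaches the last state. Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    # One row per state, one column per action, initialised to zero.
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy selection: explore with probability epsilon.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max(range(2), key=lambda a: q[state][a])

            # Toy dynamics: clamp the move to the ends of the chain.
            next_state = min(max(state + moves[action], 0), goal)
            # Assumed rewards: +1 on reaching the goal, small cost per step.
            reward = 1.0 if next_state == goal else -0.01

            # Q-learning update: bootstrap from the best next-state value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    return q


if __name__ == "__main__":
    for s, row in enumerate(q_learning()):
        print(s, [round(v, 3) for v in row])
```

After training, the greedy action in every non-terminal state of this toy chain is "move right", which is the shortest path to the goal.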


2020 ◽  
Vol 0 (0) ◽  
pp. 0-0
Author(s):  
Seyed Hossein Jafarpour Rezaei ◽  
Mohammad Ali Rastegar

2021 ◽  
Author(s):  
Masoud Geravanchizadeh ◽  
Hossein Roushan

The cocktail party phenomenon describes the ability of the human brain to focus auditory attention on a particular stimulus while ignoring other acoustic events. Selective auditory attention detection (SAAD) is an important issue in the development of brain-computer interface systems and cocktail party processors. This paper proposes a new dynamic attention detection system to process the temporal evolution of the input signal. In the proposed dynamic system, after preprocessing of the input signals, the probabilistic state space of the system is formed. Then, in the learning stage, different dynamic learning methods, including a recurrent neural network (RNN) and reinforcement learning (Markov decision process (MDP) and deep Q-learning), are applied to make the final decision as to the attended speech. Among the different dynamic learning approaches, the evaluation results show that the deep Q-learning approach (MDP+RNN) provides the highest classification accuracy (94.2%) with the least detection delay. The proposed SAAD system is advantageous in that attention is detected dynamically for sequential inputs. The system also has the potential to be used in scenarios where the listener's attention might switch over time in the presence of various acoustic events.

