Autonomous Penetration Testing Based on Improved Deep Q-Network

2021 ◽  
Vol 11 (19) ◽  
pp. 8823
Author(s):  
Shicheng Zhou ◽  
Jingju Liu ◽  
Dongdong Hou ◽  
Xiaofeng Zhong ◽  
Yue Zhang

Penetration testing is an effective way to test and evaluate cybersecurity by simulating a cyberattack. However, traditional methods rely heavily on domain expert knowledge, which incurs prohibitive labor and time costs. Autonomous penetration testing is a more efficient and intelligent way to solve this problem. In this paper, we model penetration testing as a Markov decision process and use reinforcement learning for autonomous penetration testing in large-scale networks. We propose an improved deep Q-network (DQN) named NDSPI-DQN to address the sparse reward and large action space problems in large-scale scenarios. First, we integrate five extensions to DQN (noisy nets, soft Q-learning, dueling architectures, prioritized experience replay, and an intrinsic curiosity model) to improve exploration efficiency. Second, we decouple the action and split the estimators of the neural network to calculate the two elements of the action separately, so as to reduce the action space. Finally, the performance of the algorithms is investigated in a range of scenarios. The experimental results demonstrate that our methods have better convergence and scaling performance.
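
The effect of decoupling the action can be illustrated with a minimal sketch. The abstract does not say which two elements the action is split into, so the names below (a target index and an operation index) and the example sizes are assumptions for illustration only, not the authors' NDSPI-DQN implementation.

```python
def output_sizes(num_targets, num_operations):
    """Compare the number of Q-value outputs for a joint head vs. two split heads.

    Assumption: an action is a (target, operation) pair; the abstract only
    says that the action has two elements.
    """
    joint = num_targets * num_operations      # one output per full action
    decoupled = num_targets + num_operations  # one output per element value
    return joint, decoupled


def select_action(q_targets, q_operations):
    """Pick each action element greedily from its own estimator's Q-values."""
    target = max(range(len(q_targets)), key=q_targets.__getitem__)
    operation = max(range(len(q_operations)), key=q_operations.__getitem__)
    return target, operation
```

With 50 hypothetical targets and 20 hypothetical operations, a joint head would need 1000 outputs while the two split heads need 70 in total, which is the kind of reduction the abstract refers to.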

Author(s):  
Haokun Chen ◽  
Xinyi Dai ◽  
Han Cai ◽  
Weinan Zhang ◽  
Xuejian Wang ◽  
...  

Reinforcement learning (RL) has recently been introduced to interactive recommender systems (IRS) because it learns from dynamic interactions and plans for long-run performance. However, because an IRS typically has thousands of items to recommend (i.e., thousands of actions), most existing RL-based methods fail to handle such a large discrete action space and thus become inefficient. Existing work that tackles the large discrete action space with the deep deterministic policy gradient framework suffers from an inconsistency between the continuous action representation (the output of the actor network) and the real discrete action. To avoid this inconsistency and achieve high efficiency and recommendation effectiveness, we propose a Tree-structured Policy Gradient Recommendation (TPGR) framework, in which a balanced hierarchical clustering tree is built over the items and picking an item is formulated as seeking a path from the root to a certain leaf of the tree. Extensive experiments on carefully designed environments based on two real-world datasets demonstrate that our model provides superior recommendation performance and significant efficiency improvement over state-of-the-art methods.
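
The idea of picking an item by following a root-to-leaf path can be sketched as follows. This is only an illustration of the indexing, assuming the items are already placed at the leaves of a balanced tree and numbered left to right; the clustering step and the policy networks described in the abstract are not shown, and all function names are hypothetical.

```python
def tree_depth(num_items, branching):
    """Smallest depth at which a balanced tree has at least num_items leaves."""
    depth = 1
    capacity = branching
    while capacity < num_items:
        capacity *= branching
        depth += 1
    return depth


def item_to_path(item, branching, depth):
    """Child index chosen at each level, from the root down to the item's leaf."""
    path = []
    for _ in range(depth):
        path.append(item % branching)
        item //= branching
    return path[::-1]


def path_to_item(path, branching):
    """Inverse of item_to_path: follow the choices from the root to a leaf."""
    item = 0
    for choice in path:
        item = item * branching + choice
    return item
```

Under these assumptions, 1000 items with a branching factor of 10 need only three 10-way choices per recommendation instead of a single 1000-way choice.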


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Lincan Li ◽  
Chiew Foong Kwong ◽  
Qianyu Liu ◽  
Jing Wang

This paper proposes a deep reinforcement learning (DRL)-based cache content update policy for cache-enabled networks to improve the cache hit ratio and reduce the average latency. In contrast to existing policies, this work considers a more practical cache scenario in which content requests vary by both time and location. Given the constraint of limited cache capacity, the dynamic content update problem is modeled as a Markov decision process (MDP), and the deep Q-learning network (DQN) algorithm is used to solve it. Specifically, the neural network is optimised to approximate the Q value, with training data drawn from the experience replay memory. The DQN agent derives the optimal policy for the cache decision. Simulation results show that, compared with existing policies, the proposed policy improves the cache hit ratio by 56%–64% and reduces the average latency by 56%–59%.
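
The two DQN ingredients named in the abstract, the experience replay memory and the Q-value target, can be sketched in a few lines. This is a generic illustration rather than the authors' implementation: the class and function names, the buffer capacity, and the discount factor are all assumptions.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement from the stored transitions.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.9):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a cache setting, a state might encode the current cache contents and recent requests and an action might name the item to replace, but that encoding is not specified in the abstract and is left open here.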


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1579
Author(s):  
Dongqi Wang ◽  
Qinghua Meng ◽  
Dongming Chen ◽  
Hupo Zhang ◽  
Lisheng Xu

Automatic detection of arrhythmia is of great significance for the early prevention and diagnosis of cardiovascular disease. Traditional feature engineering methods based on expert knowledge lack multidimensional, multi-view information abstraction and data representation ability, so traditional pattern recognition approaches to arrhythmia detection cannot achieve satisfactory results. Recently, with the rise of deep learning, automatic feature extraction from ECG data based on deep neural networks has been widely discussed. To exploit the complementary strengths of different schemes, we propose an arrhythmia detection method based on the multi-resolution representation (MRR) of ECG signals. The method uses four different up-to-date deep neural networks as four channel models for learning ECG vector representations. The deep-learning-based representations, together with hand-crafted ECG features, form the MRR, which is the input to the downstream classification strategy. Experimental results on multi-label classification of a large ECG dataset confirm that the F1 score of the proposed method is 0.9238, which is 1.31%, 0.62%, 1.18% and 0.6% higher than that of each channel model. From an architectural perspective, the proposed method is highly scalable and can serve as an example for arrhythmia recognition.
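
One simple way to picture how the MRR is assembled is shown below. The abstract does not state how the channel representations and the hand-crafted features are combined, so plain concatenation is an assumption made for illustration, and the function name is hypothetical.

```python
def build_mrr(channel_embeddings, handcrafted_features):
    """Join per-channel vectors and hand-crafted features into one flat vector.

    Assumption: the representation is formed by concatenation, in channel order,
    followed by the hand-crafted features.
    """
    mrr = []
    for embedding in channel_embeddings:
        mrr.extend(embedding)
    mrr.extend(handcrafted_features)
    return mrr
```

The resulting vector would then be passed to whatever downstream classifier is used; that classifier is outside the scope of this sketch.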


2021 ◽  
Author(s):  
Miguel Dasilva ◽  
Christian Brandt ◽  
Marc Alwin Gieselmann ◽  
Claudia Distler ◽  
Alexander Thiele

Top-down attention, controlled by frontal cortical areas, is a key component of cognitive operations. How different neurotransmitters and neuromodulators flexibly change the cellular and network interactions with attention demands remains poorly understood. While acetylcholine and dopamine are critically involved, glutamatergic receptors have been proposed to play important roles. To understand their contribution to attentional signals, we investigated how ionotropic glutamatergic receptors in the frontal eye field (FEF) of male macaques contribute to neuronal excitability and attentional control signals in different cell types. Broad-spiking and narrow-spiking cells both required N-methyl-D-aspartic acid and α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor activation for normal excitability, thereby affecting ongoing or stimulus-driven activity. However, attentional control signals were not dependent on either glutamatergic receptor type in broad- or narrow-spiking cells. A further subdivision of cell types into different functional types using cluster analysis based on spike waveforms and spiking characteristics did not change the conclusions. This can be explained by a model where local blockade of specific ionotropic receptors is compensated by cell embedding in large-scale networks. It sets the glutamatergic system apart from the cholinergic system in FEF and demonstrates that a reduction in excitability is not sufficient to induce a reduction in attentional control signals.


2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Siddharth Arora ◽  
Alexandra Brintrup

The relationship between a firm and its supply chain has been well studied; however, the association between the position of firms in complex supply chain networks and their performance has not been adequately investigated. This is primarily due to insufficient availability of empirical data on large-scale networks. To address this gap in the literature, we investigate the relationship between embeddedness patterns of individual firms in a supply network and their performance using empirical data from the automotive industry. In this study, we devise three measures that characterize the embeddedness of individual firms in a supply network, namely centrality, tier position, and triads. Our findings caution us that centrality impacts individual performance through a diminishing returns relationship. The second measure, tier position, allows us to investigate the concept of tiers in supply networks because we find that as networks emerge, the boundaries between tiers become unclear. Performance of suppliers degrades as they move away from the focal firm (i.e., Toyota). The final measure, triads, investigates the effect of buying and selling to firms that supply the same customer, portraying the level of competition and cooperation in a supplier’s network. We find that increased coopetition (i.e., cooperative competition) is a performance enhancer; however, excessive complexity resulting from being involved in both upstream and downstream coopetition results in diminishing performance. These original insights help understand the drivers of firm performance from a network perspective and provide a basis for further research.
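
Two of the three measures can be illustrated on a toy network. The abstract does not give formal definitions, so the sketch below makes assumptions: centrality is taken to be normalized degree centrality, and a triad is counted only for the selling direction, whenever a firm sells to a partner and both supply the same customer. The function names and the example edges are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Share of the other firms each firm is directly linked to.

    edges: iterable of (supplier, buyer) pairs; direction is ignored here.
    """
    neighbors = defaultdict(set)
    for supplier, buyer in edges:
        neighbors[supplier].add(buyer)
        neighbors[buyer].add(supplier)
    n = len(neighbors)
    return {firm: len(adj) / (n - 1) for firm, adj in neighbors.items()}


def count_triads(edges, firm):
    """Count (partner, customer) pairs where firm sells to partner and both
    firm and partner sell to the same customer."""
    customers = defaultdict(set)
    for supplier, buyer in edges:
        customers[supplier].add(buyer)
    count = 0
    for partner in customers[firm]:
        count += len(customers[firm] & customers[partner])
    return count
```

For the toy edges `[("A", "B"), ("A", "Toyota"), ("B", "Toyota")]`, firm A is linked to both other firms and takes part in one such triad, since A sells to B and both supply Toyota.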


2009 ◽  
Vol 10 (1) ◽  
pp. 19 ◽  
Author(s):  
Tatsunori B Hashimoto ◽  
Masao Nagasaki ◽  
Kaname Kojima ◽  
Satoru Miyano
