reinforcement learning model
Recently Published Documents

TOTAL DOCUMENTS: 168 (five years: 86)
H-INDEX: 17 (five years: 4)

Symmetry, 2022, Vol. 14 (1), pp. 161
Author(s): Hyojoon Han, Hyukho Kim, Yangwoo Kim

The complexity of network intrusion detection systems (IDSs) is increasing due to continuous growth in network traffic, the variety of attacks, and the ever-changing network environment. In addition, network traffic is asymmetric: attack data are scarce, yet so complex that they are difficult to detect. Many studies have used feature engineering to improve intrusion detection performance. These approaches work well within a fixed dataset environment but struggle to cope with a changing network environment. This paper proposes an intrusion detection hyperparameter control system (IDHCS) that controls and trains a deep neural network (DNN) feature extractor and a k-means clustering module as a reinforcement learning model based on proximal policy optimization (PPO). The IDHCS directs the DNN feature extractor to extract the features most valuable in the current network environment and identifies intrusions through k-means clustering. Through iterative learning with the PPO-based reinforcement learning model, the system automatically optimizes its performance for the network environment in which it is deployed. Experiments evaluated system performance on the CICIDS2017 and UNSW-NB15 datasets, yielding F1-scores of 0.96552 on CICIDS2017 and 0.94268 on UNSW-NB15. A further experiment merged the two datasets to build a larger and more complex test environment with more diverse attack types and more complex patterns. On the merged dataset the system achieved an F1-score of 0.93567, 97% to 99% of its performance on CICIDS2017 and UNSW-NB15 individually. The results show that the proposed IDHCS improves IDS performance by continuously learning new types of attacks and managing intrusion detection features regardless of changes in the network environment.
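As a rough illustration of the control loop described above, the sketch below clusters synthetic traffic with k-means over an agent-selected feature subset and scores the clustering against labels. A random-search policy stands in for the paper's PPO agent, and the data, feature count, and reward are all illustrative assumptions.

```python
# Minimal sketch of an IDHCS-style control loop, assuming synthetic data and a
# random-search stand-in for the PPO policy used in the paper.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Synthetic "network traffic": 500 flows, 20 candidate features, binary labels.
X = rng.normal(size=(500, 20))
y = (X[:, 3] + X[:, 7] > 1.0).astype(int)  # attacks depend on features 3 and 7
X[:, 3] += y * 2.0                         # make the signal separable

def evaluate(feature_mask):
    """Cluster with k-means on the selected features and score against labels."""
    if feature_mask.sum() == 0:
        return 0.0
    labels = KMeans(n_clusters=2, n_init=10,
                    random_state=0).fit_predict(X[:, feature_mask])
    # Clusters are unordered; take the better of the two label assignments.
    return max(f1_score(y, labels), f1_score(y, 1 - labels))

best_mask, best_f1 = None, -1.0
for step in range(50):              # a PPO agent would update its policy here;
    mask = rng.random(20) < 0.5     # we sample feature masks at random instead
    score = evaluate(mask)
    if score > best_f1:
        best_mask, best_f1 = mask, score

print(f"best F1={best_f1:.3f} using features {np.flatnonzero(best_mask)}")
```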


2021, Vol. 11 (23), pp. 11208
Author(s): Wen Wen, Yuyu Yuan, Jincui Yang

Reinforcement learning has been applied to trading various types of financial assets, such as stocks, futures, and cryptocurrencies. Options, as a distinct kind of derivative, have their own characteristics: many option contracts exist for a single underlying asset, each with different price behavior, and the validity period of an option contract is relatively short. To apply reinforcement learning to options trading, we propose the options trading reinforcement learning (OTRL) framework. We train the reinforcement learning model on the options' underlying asset data, using candle data at different time intervals. A protective closing strategy is added to the model to prevent unbearable losses. Our experiments show that the most stable algorithm for obtaining high returns is proximal policy optimization (PPO) with the protective closing strategy. The deep Q network (DQN) can exceed the buy-and-hold strategy in options trading, as can soft actor critic (SAC). The effectiveness of the OTRL framework is thus verified.
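The protective closing strategy is essentially a stop-loss rule layered over the agent's actions. Below is a minimal sketch of such a rule; the threshold value, position fields, and price series are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a protective closing rule, assuming a fixed percentage
# loss threshold; field names and the 30% level are hypothetical.
from dataclasses import dataclass

@dataclass
class Position:
    entry_price: float
    size: int  # contracts held; 0 means flat

def protective_close(position: Position, mark_price: float,
                     max_loss_pct: float = 0.3) -> bool:
    """Return True if the unrealized loss breaches the protective threshold."""
    if position.size == 0:
        return False
    pnl_pct = (mark_price - position.entry_price) / position.entry_price
    return pnl_pct <= -max_loss_pct

# The agent's chosen action is overridden whenever the rule fires:
pos = Position(entry_price=10.0, size=1)
for price in [9.5, 8.0, 6.5]:
    if protective_close(pos, price):
        pos = Position(entry_price=0.0, size=0)  # force-close, ignore the agent
        print(f"protective close triggered at {price}")
```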


Author(s): Quan Zhang, Qian Du, Guohua Liu

Abstract
Objective. Alzheimer's disease (AD) is a common disease of the elderly with unknown etiology, and its burden is growing with the aging of the population and the trend toward earlier onset. Current AI methods based on individual information or magnetic resonance imaging (MRI) can achieve good diagnostic sensitivity and specificity, but they still face challenges of interpretability and clinical feasibility. In this study, we propose an interpretable multimodal deep reinforcement learning model for inferring pathological features and diagnosing Alzheimer's disease.
Approach. First, for better clinical feasibility, the compressed-sensing MRI image is reconstructed by an interpretable deep reinforcement learning model. Then, the reconstructed MRI is input into a fully convolutional neural network to generate a pixel-level disease probability risk map (DPM) of the whole brain for Alzheimer's disease. Finally, the DPM of important brain regions and the individual information are input into an attention-based deep neural network to obtain the diagnosis and analyze biomarkers. A total of 1,349 multi-center samples were used to construct and test the model.
Main Results. The model achieved areas under the curve (AUC) of 99.6% ± 0.2 on ADNI, 97.9% ± 0.2 on AIBL, and 96.1% ± 0.3 on NACC. It also provides an effective analysis of multimodal pathology, predicting imaging biomarkers on MRI and the weight of each item of individual information.
Significance. The proposed deep reinforcement learning model not only diagnoses AD accurately but also analyzes potential biomarkers. It builds a bridge between clinical practice and artificial intelligence diagnosis and offers a viewpoint on the interpretability of artificial intelligence technology.
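A skeleton of the three-stage pipeline might look as follows. The layer shapes, region pooling, and attention head are placeholders rather than the authors' architecture, and the RL-based reconstructor is reduced to a plain convolutional stand-in.

```python
# Skeleton of the three-stage pipeline (reconstruct -> risk map -> diagnose),
# with illustrative module shapes; none of this is the authors' architecture.
import torch
import torch.nn as nn

class ReconNet(nn.Module):      # stand-in for the RL-based CS-MRI reconstructor
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(8, 1, 3, padding=1))
    def forward(self, x):
        return self.net(x)

class DPMNet(nn.Module):        # fully convolutional net -> pixel-wise risk map
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 1), nn.Sigmoid())
    def forward(self, x):
        return self.net(x)

class Diagnoser(nn.Module):     # attention over region risks + individual info
    def __init__(self, n_regions=8, n_clinical=4):
        super().__init__()
        self.attn = nn.Linear(n_regions, n_regions)
        self.head = nn.Linear(n_regions + n_clinical, 1)
    def forward(self, region_risk, clinical):
        w = torch.softmax(self.attn(region_risk), dim=-1)
        return torch.sigmoid(self.head(torch.cat([w * region_risk, clinical], -1)))

undersampled = torch.randn(1, 1, 64, 64)         # toy compressed-sensing input
dpm = DPMNet()(ReconNet()(undersampled))         # whole-brain risk map
region_risk = dpm.mean(dim=(2, 3)).repeat(1, 8)  # toy per-region pooling
prob = Diagnoser()(region_risk, torch.randn(1, 4))
print(f"AD probability: {prob.item():.3f}")
```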


2021, Vol. 2113 (1), pp. 012030
Author(s): Jing Li, Yanyang Liu, Xianguo Qing, Kai Xiao, Ying Zhang, ...

Abstract The nuclear reactor control system plays a crucial role in the operation of nuclear power plants. The coordinated control of reactor power and steam generator water level has become one of the most important control problems in these systems. In this paper, we formulate a mathematical model of the coordinated control system, transform it into a reinforcement learning problem, and develop a deep reinforcement learning control algorithm based on the deep deterministic policy gradient (DDPG) to solve it. Simulation experiments show that the proposed algorithm achieves remarkable control performance.
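For reference, a single DDPG update step looks roughly like the sketch below. The state and action sizes, network widths, and random transition batch are placeholders standing in for the reactor simulator and replay buffer, which the abstract does not specify.

```python
# Minimal single-update sketch of DDPG (critic TD step, actor step, Polyak
# target updates), with toy sizes and a random batch in place of real data.
import copy
import torch
import torch.nn as nn

S, A, GAMMA, TAU = 4, 2, 0.99, 0.005
actor = nn.Sequential(nn.Linear(S, 32), nn.ReLU(), nn.Linear(32, A), nn.Tanh())
critic = nn.Sequential(nn.Linear(S + A, 32), nn.ReLU(), nn.Linear(32, 1))
actor_t, critic_t = copy.deepcopy(actor), copy.deepcopy(critic)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

# One gradient step on a random transition batch (s, a, r, s').
s, a = torch.randn(64, S), torch.rand(64, A) * 2 - 1
r, s2 = torch.randn(64, 1), torch.randn(64, S)

with torch.no_grad():                      # TD target from the target networks
    y = r + GAMMA * critic_t(torch.cat([s2, actor_t(s2)], dim=1))
critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), y)
opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

for p, pt in zip(actor.parameters(), actor_t.parameters()):  # Polyak averaging
    pt.data.mul_(1 - TAU).add_(TAU * p.data)
for p, pt in zip(critic.parameters(), critic_t.parameters()):
    pt.data.mul_(1 - TAU).add_(TAU * p.data)
print(f"critic loss {critic_loss.item():.3f}, actor loss {actor_loss.item():.3f}")
```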


2021, Vol. 15
Author(s): Lei Xiao, Todd F. Roberts

Basal ganglia (BG) circuits integrate sensory and motor-related information from the cortex, thalamus, and midbrain to guide learning and production of motor sequences. Birdsong, like speech, is composed of precisely sequenced vocal elements. Learning song sequences during development relies on Area X, a vocalization-related region in the medial striatum of the songbird BG. Area X receives inputs from cortical-like pallial song circuits and midbrain dopaminergic circuits and sends projections to the thalamus. It has recently been shown that thalamic circuits also send substantial projections back to Area X. Here, we outline a gated reinforcement learning model for how Area X may use signals conveyed by thalamostriatal inputs to direct song learning. Integrating conceptual advances from the recent mammalian and songbird literature, we hypothesize that thalamostriatal pathways convey signals linked to song syllable onsets and offsets and influence striatal circuit plasticity via regulation of cholinergic interneurons (ChIs). We suggest that syllable-sequence-associated vocal-motor information from the thalamus drives precisely timed pauses in ChI activity in Area X. When integrated with concurrent corticostriatal and dopaminergic input, this circuit helps regulate plasticity at medium spiny neurons (MSNs) and the learning of syllable sequences. We discuss new approaches that can be applied to test core ideas of this model and how the associated insights may provide a framework for understanding the function of BG circuits in learning motor sequences.
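As a toy illustration of the hypothesized gating, the sketch below applies a three-factor plasticity rule in which a corticostriatal weight changes only when presynaptic activity and a dopamine signal coincide with a ChI pause. All variables and constants are illustrative, not fitted to any data.

```python
# Toy numerical sketch of the gated three-factor rule the model hypothesizes:
# weight change requires presynaptic (cortical) activity, a dopamine signal,
# and a pause in cholinergic interneuron (ChI) firing. Values are illustrative.
import numpy as np

rng = np.random.default_rng(1)
T, LR = 100, 0.1
w = 0.5                                 # one corticostriatal synapse onto an MSN

pre = rng.random(T) < 0.3               # presynaptic cortical spikes
dopamine = rng.normal(0.0, 1.0, T)      # reward-prediction-error-like signal
chi_pause = np.zeros(T, dtype=bool)
chi_pause[40:45] = True                 # ChI pause at a syllable boundary

for t in range(T):
    gate = 1.0 if chi_pause[t] else 0.0    # plasticity permitted only in the pause
    w += LR * gate * pre[t] * dopamine[t]  # three-factor update
print(f"final weight: {w:.3f}")
```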

