Goal-based Target Network in Deep Q-Network with Hindsight Experience Replay

2021 ◽  
Vol 19 (7) ◽  
pp. 27-33
Author(s):  
Chayoung Kim
Author(s):  
Saleh R. Mousa ◽  
Sherif Ishak ◽  
Ragab M. Mousa ◽  
Julius Codjoe ◽  
Mohammed Elhenawy

Eco-approach and departure is a complex control problem wherein a driver’s actions are guided over a period of time or distance so as to optimize fuel consumption. Reinforcement learning (RL) is a machine learning paradigm that mimics human learning behavior, in which an agent attempts to solve a given control problem by interacting with the environment and developing an optimal policy. Unlike the methods implemented in previous studies for solving the eco-driving problem, RL does not require prior knowledge of the environment to be learned and processed. This paper develops a deep reinforcement learning (DRL) agent for solving the eco-approach and departure problem in the vicinity of signalized intersections to minimize fuel consumption. The DRL algorithm combines RL with a deep neural network. Novel strategies such as varying actions, prioritized experience replay, target network, and double learning were implemented to overcome the expected instabilities during the training process. The results revealed the significance of the DRL algorithm in reducing fuel consumption. Interestingly, the DRL algorithm successfully learned the environment and guided vehicles through the intersection without red-light-running violations. On average, the DRL provided fuel savings of about 13.02% with no red-light-running violations.
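Prioritized experience replay, one of the stabilizing strategies named in this abstract, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the commonly used proportional variant, in which transition i is sampled with probability p_i^alpha / sum_k p_k^alpha and importance weights are normalized by the largest weight in the batch. The class name, the default values of alpha, beta and eps, and the list-based storage are all illustrative assumptions.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, hypothetical names)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha        # how strongly priorities skew sampling (assumed default)
        self.eps = eps            # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0              # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority so that
        # they are likely to be replayed at least once.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Priority is the magnitude of the temporal-difference error.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, the returned weights would scale each sample's loss, and `update_priorities` would be called with the new TD errors after each gradient step. A sum-tree is the usual choice for large buffers; the linear scan here is only for readability.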


Algorithms ◽  
2021 ◽  
Vol 14 (8) ◽  
pp. 226
Author(s):  
Wenzel Pilar von Pilchau ◽  
Anthony Stein ◽  
Jörg Hähner

State-of-the-art deep reinforcement learning algorithms such as DQN and DDPG use a replay buffer concept called Experience Replay. By default, the buffer contains only the experiences gathered over the runtime. We propose a method called Interpolated Experience Replay that uses stored (real) transitions to create synthetic ones to assist the learner. In this first approach to the field, we limit ourselves to discrete and non-deterministic environments and use a simple equally weighted average of the reward in combination with observed follow-up states. We demonstrate a significantly improved overall mean average in comparison to a DQN with vanilla Experience Replay on the discrete and non-deterministic FrozenLake8x8-v0 environment.
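The idea of building synthetic transitions from stored real ones can be sketched as follows. This is one possible reading of the abstract, not the authors' code: for each state-action pair it keeps every observed reward and every observed follow-up state, then emits one synthetic transition per follow-up state carrying the equally weighted average reward. The class and method names, and the choice to keep the last observed `done` flag per follow-up state, are assumptions made for illustration.

```python
from collections import defaultdict


class InterpolatedTransitions:
    """Builds synthetic transitions from real ones (illustrative sketch)."""

    def __init__(self):
        # (state, action) -> list of every reward observed for that pair
        self.rewards = defaultdict(list)
        # (state, action) -> {next_state: done flag}, in order of first observation
        self.next_states = defaultdict(dict)

    def store(self, s, a, r, s_next, done):
        """Record one real transition."""
        self.rewards[(s, a)].append(r)
        self.next_states[(s, a)][s_next] = done

    def synthesize(self, s, a):
        """Return one synthetic transition per observed follow-up state."""
        rs = self.rewards.get((s, a))
        if not rs:
            return []  # nothing observed yet for this state-action pair
        avg_r = sum(rs) / len(rs)  # equally weighted average reward
        return [
            (s, a, avg_r, s_next, done)
            for s_next, done in self.next_states[(s, a)].items()
        ]
```

For example, after storing three real transitions from state 0 with action 1 that reached state 1 twice (reward 0) and state 8 once (reward 1), `synthesize(0, 1)` yields two synthetic transitions, each with reward 1/3. These could then be mixed into the learner's minibatches alongside real samples.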


2021 ◽  
Vol 54 (3-4) ◽  
pp. 417-428
Author(s):  
Yanyan Dai ◽  
KiDong Lee ◽  
SukGyu Lee

For real applications, rotary inverted pendulum systems are a well-known basic model in nonlinear control. Without a deep understanding of control, it is difficult to control a rotary inverted pendulum platform using classic control engineering models, as shown in section 2.1. This paper therefore controls the platform without classic control theory, by training and testing a reinforcement learning algorithm. Reinforcement learning (RL) has seen many recent achievements, but there is a lack of research on quickly testing high-frequency RL algorithms in a real hardware environment. In this paper, we propose a real-time Hardware-in-the-loop (HIL) control system to train and test the deep reinforcement learning algorithm from simulation to real hardware implementation. The agent is implemented with the Double Deep Q-Network (DDQN) with prioritized experience replay, which does not require a deep understanding of classical control engineering. For the real experiment, we define 21 actions to swing up the rotary inverted pendulum and balance it smoothly. Compared with the Deep Q-Network (DQN), the DDQN with prioritized experience replay reduces the overestimation of Q values and decreases the training time. Finally, this paper presents experimental results comparing classic control theory and different reinforcement learning algorithms.
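The difference between the DQN and DDQN bootstrap targets, which underlies the overestimation claim in this abstract, can be shown in a few lines. This is a generic sketch of the standard formulas, not the paper's code; the function names, the plain-list Q-value vectors, and the default discount factor are assumptions.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for one next state with three actions.
online_q = [1.0, 2.0, 0.5]
target_q = [1.5, 1.0, 3.0]
y_dqn = dqn_target(0.0, target_q, gamma=0.9)
y_ddqn = double_dqn_target(0.0, online_q, target_q, gamma=0.9)
```

With these numbers the DQN target is 0.9 × 3.0 = 2.7, while the DDQN target is 0.9 × 1.0 = 0.9, because the online network prefers action 1 and the target network rates that action lower. For the same target-network values the DDQN target can never exceed the DQN target, which is why decoupling selection from evaluation tends to dampen overestimation.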


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Chaohai Kang ◽  
Chuiting Rong ◽  
Weijian Ren ◽  
Fengcai Huo ◽  
Pengyun Liu

2020 ◽  
Vol 15 (12) ◽  
pp. 1934578X2097762
Author(s):  
Zongchao Hong ◽  
Maolin Hong ◽  
Bo Liu ◽  
Ying Zhang ◽  
Yanfang Yang ◽  
...  

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2), is often accompanied by injury to pulmonary function and pulmonary fibrosis. Feiluoning (FLN) is a new Chinese medicine prescription available for treating convalescent patients recovering from severe and critical COVID-19. FLN also has a positive effect on pulmonary function injury and pulmonary fibrosis. We explored the potential mechanism of FLN’s effect on the convalescent treatment of COVID-19. According to the pharmacodynamic activity parameters, we screened the active chemical constituents of FLN using the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform. The Uniprot database was used to query the corresponding target genes, and Cytoscape 3.6.1 was used to construct a herb-compound-target network. Protein interaction analysis, target gene function enrichment analysis, and signal pathway analysis were performed using the STRING, DAVID, and Kyoto Encyclopedia of Genes and Genomes pathway databases. Molecular docking was used to predict the binding capacity of the core compounds with COVID-19 3CL hydrolase and angiotensin-converting enzyme 2 (ACE2). The herb-compound-target network was successfully constructed and key targets identified, including prostaglandin G/H synthase 2, estrogen receptor 1, heat shock protein HSP 90, and androgen receptor. The major affected pathways were pathways in cancer, pancreatic cancer, non-small cell lung cancer, and toll-like receptor signaling. The core compounds of FLN, including quercetin, luteolin, kaempferol, and stigmasterol, could strongly bind to COVID-19 3CL hydrolase, and other compounds, including 7-O-methylisomucronulatol and medicocarpin, could strongly bind to ACE2. Thus, it is predicted that FLN has the characteristics of a multicomponent, multitarget, and multichannel overall control compound. FLN’s mechanism of action in the treatment of COVID-19 may be associated with the regulation of inflammation and immune-related signaling pathways, and with its ability to bind COVID-19 3CL hydrolase.


PLoS ONE ◽  
2011 ◽  
Vol 6 (2) ◽  
pp. e16999 ◽  
Author(s):  
Ichigaku Takigawa ◽  
Koji Tsuda ◽  
Hiroshi Mamitsuka

2015 ◽  
Vol 112 (23) ◽  
pp. 7327-7332 ◽  
Author(s):  
Tomasz Kurcon ◽  
Zhongyin Liu ◽  
Anika V. Paradkar ◽  
Christopher A. Vaiana ◽  
Sujeethraj Koppolu ◽  
...  

Glycosylation, the most abundant posttranslational modification, holds an unprecedented capacity for altering biological function. Our ability to harness glycosylation as a means to control biological systems is hampered by our inability to pinpoint the specific glycans and corresponding biosynthetic enzymes underlying a biological process. Herein we identify glycosylation enzymes acting as regulatory elements within a pathway using microRNA (miRNA) as a proxy. Leveraging the target network of the miRNA-200 family (miR-200f), regulators of epithelial-to-mesenchymal transition (EMT), we pinpoint genes encoding multiple promesenchymal glycosylation enzymes (glycogenes). We focus on three enzymes, beta-1,3-glucosyltransferase (B3GLCT), beta-galactoside alpha-2,3-sialyltransferase 5 (ST3GAL5), and (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 5 (ST6GALNAC5), encoding glycans that are difficult to analyze by traditional methods. Silencing these glycogenes phenocopied the effect of miR-200f, inducing mesenchymal-to-epithelial transition. In addition, all three are up-regulated in TGF-β–induced EMT, suggesting tight integration within the EMT-signaling network. Our work indicates that miRNA can act as a relatively simple proxy to decrypt which glycogenes, including those encoding difficult-to-analyze structures (e.g., proteoglycans, glycolipids), are functionally important in a biological pathway, setting the stage for the rapid identification of glycosylation enzymes driving disease states.


Author(s):  
Yunshu Du ◽  
Garrett Warnell ◽  
Assefaw Gebremedhin ◽  
Peter Stone ◽  
Matthew E. Taylor

2014 ◽  
Vol 989-994 ◽  
pp. 1464-1468
Author(s):  
Yang Tao ◽  
Kun Zhou

In the next-generation heterogeneous wireless network environment, to meet the network requirements of diverse services, we propose a vertical handoff decision algorithm based on QoS evaluation that refines the handover unit down to the individual service. The proposed algorithm considers service needs, network conditions, user preferences, and other factors, and combines the Analytic Hierarchy Process (AHP) with a cost function to choose the target network that best meets the requirements of the services. Compared with vertical handoff decisions based on RSS, simulation results show that the proposed method takes full account of the different QoS requirements of various service types when choosing the appropriate network, without causing performance degradation.
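How AHP weights and a cost function might be combined to pick a target network can be sketched as below. The abstract does not give the criteria, the comparison matrix, or the form of the cost function, so everything here is an assumption: the three criteria, the pairwise judgments, the candidate networks and their normalized costs are hypothetical, the AHP weights use the row geometric-mean approximation, and the cost is a simple weighted sum where lower is better.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights via the row geometric mean."""
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def select_network(candidates, weights):
    """Pick the candidate with the lowest weighted cost.

    candidates maps a network name to per-criterion costs normalized
    to [0, 1], where lower is better for every criterion.
    """
    def cost(name):
        return sum(w * c for w, c in zip(weights, candidates[name]))
    return min(candidates, key=cost)


# Hypothetical pairwise judgments for one service type over three
# criteria: delay, bandwidth shortfall, monetary price.
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical normalized costs for two candidate networks.
candidates = {
    "wlan": [0.2, 0.3, 0.1],
    "cellular": [0.6, 0.5, 0.8],
}
best = select_network(candidates, weights)
```

Making the decision per service, as the abstract describes, would amount to using a different pairwise matrix for each service type, so that a delay-sensitive service and a price-sensitive one can end up on different networks.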

