Goal-based Target Network in Deep Q-Network with Hindsight Experience Replay

Eco-approach and departure is a complex control problem wherein a driver’s actions are guided over a period of time or distance so as to optimize fuel consumption. Reinforcement learning (RL) is a machine learning paradigm that mimics human learning behavior, in which an agent attempts to solve a given control problem by interacting with the environment and developing an optimal policy. Unlike the methods implemented in previous studies for solving the eco-driving problem, RL does not require prior knowledge of the environment. This paper develops a deep reinforcement learning (DRL) agent that solves the eco-approach and departure problem in the vicinity of signalized intersections to minimize fuel consumption. The DRL algorithm uses a deep neural network as the function approximator for RL. Strategies such as varying actions, prioritized experience replay, a target network, and double learning were implemented to overcome the expected instabilities during the training process. The results showed that the DRL algorithm reduces fuel consumption. The DRL algorithm also learned the environment successfully and guided vehicles through the intersection without red-light-running violations. On average, the DRL provided fuel savings of about 13.02% with no red-light-running violations.
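
The abstract names prioritized experience replay as one of its stabilizing strategies but does not say which variant was used. As a minimal sketch, assuming the common proportional variant, the function below samples buffer indices with probability proportional to a power of each transition's priority and returns importance-sampling weights. The names `sample_prioritized`, `alpha`, and `beta` and their default values are illustrative assumptions, not taken from the paper.

```python
import random


def sample_prioritized(priorities, batch_size, alpha=0.6, beta=0.4, rng=None):
    """Sample indices with probability proportional to priority ** alpha.

    Assumed proportional variant; alpha and beta defaults are hypothetical.
    Returns (indices, weights), where weights are importance-sampling
    corrections normalised by the largest weight in the batch.
    """
    rng = rng or random.Random()
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]

    # Draw with replacement according to the priority distribution.
    indices = rng.choices(range(len(priorities)), weights=probs, k=batch_size)

    # Importance-sampling weights offset the bias of non-uniform sampling.
    n = len(priorities)
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_w = max(weights)
    return indices, [w / max_w for w in weights]
```

With equal priorities the sampling is uniform and every weight is 1; a transition with priority 0 is never drawn, which is why implementations usually add a small positive constant to each priority.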

Application of Stream Splitting Moving Target Network Defenses

2018 International Conference on Computational Science and Computational Intelligence (CSCI) ◽

10.1109/csci46756.2018.00032 ◽

2018 ◽

Author(s):

Joshua Klosterman ◽

Jacob L. Williams ◽

Michael C. Shlanta ◽

Daniel W. Burwitz

Keyword(s):

Moving Target ◽

Target Network ◽

Stream Splitting

Synthetic Experiences for Accelerating DQN Performance in Discrete Non-Deterministic Environments

Algorithms ◽

10.3390/a14080226 ◽

2021 ◽

Vol 14 (8) ◽

pp. 226

Author(s):

Wenzel Pilar von Pilchau ◽

Anthony Stein ◽

Jörg Hähner

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Learning Algorithms ◽

Weighted Average ◽

Up States ◽

Experience Replay

State-of-the-art Deep Reinforcement Learning algorithms such as DQN and DDPG use a replay buffer known as Experience Replay. By default, the buffer contains only the experiences gathered during runtime. We propose a method called Interpolated Experience Replay that uses stored (real) transitions to create synthetic ones to assist the learner. In this first approach to the field, we limit ourselves to discrete and non-deterministic environments and use a simple equally weighted average of the reward in combination with observed follow-up states. On the discrete and non-deterministic FrozenLake8x8-v0 environment, we demonstrate a significantly improved overall mean compared with a DQN using vanilla Experience Replay.
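
The idea of building synthetic transitions from stored ones can be sketched as follows. This is only an illustration of what the abstract describes (an equally weighted reward average combined with every observed follow-up state); the class name `InterpolatedReplaySketch` and its methods are hypothetical, and the paper's actual buffer may differ in detail.

```python
from collections import defaultdict


class InterpolatedReplaySketch:
    """Hypothetical helper: derives synthetic transitions from real ones."""

    def __init__(self):
        # (state, action) -> list of rewards observed for that pair
        self.rewards = defaultdict(list)
        # (state, action) -> set of observed (next_state, done) outcomes
        self.outcomes = defaultdict(set)

    def add(self, state, action, reward, next_state, done):
        """Record one real transition."""
        key = (state, action)
        self.rewards[key].append(reward)
        self.outcomes[key].add((next_state, done))

    def synthetic(self, state, action):
        """Return one synthetic transition per observed follow-up state,
        each carrying the equally weighted average of observed rewards."""
        key = (state, action)
        observed = self.rewards.get(key)
        if not observed:
            return []
        avg_reward = sum(observed) / len(observed)
        return [
            (state, action, avg_reward, next_state, done)
            for next_state, done in sorted(self.outcomes[key])
        ]
```

For example, after storing `(0, 1, 0.0, 1, False)` and `(0, 1, 1.0, 2, True)`, calling `synthetic(0, 1)` yields two transitions, both with reward 0.5, one for each follow-up state seen so far.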

A real-time HIL control system on rotary inverted pendulum hardware platform based on double deep Q-network

Measurement and Control ◽

10.1177/00202940211000380 ◽

2021 ◽

Vol 54 (3-4) ◽

pp. 417-428

Author(s):

Yanyan Dai ◽

KiDong Lee ◽

SukGyu Lee

Keyword(s):

Control System ◽

Reinforcement Learning ◽

Inverted Pendulum ◽

Learning Algorithm ◽

Deep Understanding ◽

Control Engineering ◽

Experience Replay ◽

Real Hardware ◽

Rotary Inverted Pendulum ◽

Reinforcement Learning Algorithm

In real applications, the rotary inverted pendulum is known as a basic model for nonlinear control systems. Without a deep understanding of control, it is difficult to control a rotary inverted pendulum platform using classic control engineering models, as shown in section 2.1. Therefore, instead of relying on classic control theory, this paper controls the platform by training and testing a reinforcement learning algorithm. Reinforcement learning (RL) has produced many recent achievements, but there is a lack of research on quickly testing high-frequency RL algorithms in a real hardware environment. In this paper, we propose a real-time Hardware-in-the-loop (HIL) control system to train and test the deep reinforcement learning algorithm from simulation to real hardware implementation. The agent is implemented with the Double Deep Q-Network (DDQN) with prioritized experience replay, which does not require a deep understanding of classical control engineering. For the real experiment, we define 21 actions to swing up the rotary inverted pendulum and balance it smoothly. Compared with Deep Q-Network (DQN), the DDQN with prioritized experience replay removes the overestimation of the Q value and decreases the training time. Finally, this paper presents the experimental results, with comparisons between classic control theory and different reinforcement learning algorithms.
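
The overestimation claim comes down to how the bootstrap target is computed. As a minimal sketch (not the authors' implementation), the functions below contrast the standard DQN target with the Double DQN target for a single transition. The function names, the list-based Q-value inputs, and the `gamma` default are assumptions made for illustration.

```python
def dqn_target(reward, done, q_target_next, gamma=0.99):
    """Standard DQN: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN: the online network selects the next action,
    and the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]
```

Because the action chosen by the online network is not necessarily the one the target network rates highest, the Double DQN target is never larger than the DQN target for the same inputs, which is the mechanism behind the reduced Q-value overestimation.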

Deep Deterministic Policy Gradient Based on Double Network Prioritized Experience Replay

IEEE Access ◽

10.1109/access.2021.3074535 ◽

2021 ◽

pp. 1-1

Author(s):

Chaohai Kang ◽

Chuiting Rong ◽

Weijian Ren ◽

Fengcai Huo ◽

Pengyun Liu

Keyword(s):

Double Network ◽

Policy Gradient ◽

Experience Replay ◽

Gradient Based

Preliminary Analysis of the Therapeutic Mechanism of Feiluoning in Convalescent Patients With COVID-19

Natural Product Communications ◽

10.1177/1934578x20977620 ◽

2020 ◽

Vol 15 (12) ◽

pp. 1934578X2097762

Author(s):

Zongchao Hong ◽

Maolin Hong ◽

Bo Liu ◽

Ying Zhang ◽

Yanfang Yang ◽

...

Keyword(s):

Chinese Medicine ◽

Pulmonary Fibrosis ◽

Pulmonary Function ◽

Target Genes ◽

Interaction Analysis ◽

Binding Capacity ◽

Chemical Constituents ◽

The Core ◽

Target Network ◽

Compound Target

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2), is often accompanied by injury to pulmonary function and pulmonary fibrosis. Feiluoning (FLN) is a new Chinese medicine prescription available for treating patients convalescing from severe and critical COVID-19. FLN also has a positive effect on pulmonary function injury and pulmonary fibrosis. We explored the potential mechanism of FLN’s effect on the convalescent treatment of COVID-19. According to the pharmacodynamic activity parameters, we screened the active chemical constituents of FLN using the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform. The Uniprot database was used to query the corresponding target genes, and Cytoscape 3.6.1 was used to construct a herb-compound-target network. Protein interaction analysis, target gene function enrichment analysis, and signal pathway analysis were performed using the STRING, DAVID, and Kyoto Encyclopedia of Genes and Genomes pathway databases. Molecular docking was used to predict the binding capacity of the core compounds with COVID-19 3CL hydrolase and angiotensin-converting enzyme 2 (ACE2). The herb-compound-target network was successfully constructed and key targets identified, including prostaglandin G/H synthase 2, estrogen receptor 1, heat shock protein HSP 90, and androgen receptor. The major affected metabolic pathways were pathways in cancer, pancreatic cancer, nonsmall cell lung cancer, and toll-like receptor signaling. The core compounds of FLN, including quercetin, luteolin, kaempferol, and stigmasterol, could strongly bind to COVID-19 3CL hydrolase, and other compounds, including 7-O-methylisomucronulatol and medicocarpin, could strongly bind to ACE2. Thus, it is predicted that FLN has the characteristics of a multicomponent, multitarget, and multichannel overall control compound. FLN’s mechanism of action in the treatment of COVID-19 may be associated with the regulation of inflammation and immune-related signaling pathways, and with its influence on the binding ability of COVID-19 3CL hydrolase.

Mining Significant Substructure Pairs for Interpreting Polypharmacology in Drug-Target Network

PLoS ONE ◽

10.1371/journal.pone.0016999 ◽

2011 ◽

Vol 6 (2) ◽

pp. e16999 ◽

Cited By ~ 27

Author(s):

Ichigaku Takigawa ◽

Koji Tsuda ◽

Hiroshi Mamitsuka

Keyword(s):

Drug Target ◽

Target Network

miRNA proxy approach reveals hidden functions of glycosylation

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1502076112 ◽

2015 ◽

Vol 112 (23) ◽

pp. 7327-7332 ◽

Cited By ~ 21

Author(s):

Tomasz Kurcon ◽

Zhongyin Liu ◽

Anika V. Paradkar ◽

Christopher A. Vaiana ◽

Sujeethraj Koppolu ◽

...

Keyword(s):

Posttranslational Modification ◽

Epithelial To Mesenchymal Transition ◽

Biological Function ◽

Regulatory Elements ◽

Rapid Identification ◽

Mesenchymal Transition ◽

Disease States ◽

Mesenchymal To Epithelial Transition ◽

Genes Encoding ◽

Target Network

Glycosylation, the most abundant posttranslational modification, holds an unprecedented capacity for altering biological function. Our ability to harness glycosylation as a means to control biological systems is hampered by our inability to pinpoint the specific glycans and corresponding biosynthetic enzymes underlying a biological process. Herein we identify glycosylation enzymes acting as regulatory elements within a pathway using microRNA (miRNA) as a proxy. Leveraging the target network of the miRNA-200 family (miR-200f), regulators of epithelial-to-mesenchymal transition (EMT), we pinpoint genes encoding multiple promesenchymal glycosylation enzymes (glycogenes). We focus on three enzymes, beta-1,3-glucosyltransferase (B3GLCT), beta-galactoside alpha-2,3-sialyltransferase 5 (ST3GAL5), and (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 5 (ST6GALNAC5), encoding glycans that are difficult to analyze by traditional methods. Silencing these glycogenes phenocopied the effect of miR-200f, inducing mesenchymal-to-epithelial transition. In addition, all three are up-regulated in TGF-β–induced EMT, suggesting tight integration within the EMT-signaling network. Our work indicates that miRNA can act as a relatively simple proxy to decrypt which glycogenes, including those encoding difficult-to-analyze structures (e.g., proteoglycans, glycolipids), are functionally important in a biological pathway, setting the stage for the rapid identification of glycosylation enzymes driving disease states.

Lucid dreaming for experience replay: refreshing past states with the current policy

Neural Computing and Applications ◽

10.1007/s00521-021-06104-5 ◽

2021 ◽

Author(s):

Yunshu Du ◽

Garrett Warnell ◽

Assefaw Gebremedhin ◽

Peter Stone ◽

Matthew E. Taylor

Keyword(s):

Current Policy ◽

Lucid Dreaming ◽

Experience Replay

A Vertical Handoff Decision Algorithm Based on QoS Evaluation

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.989-994.1464 ◽

2014 ◽

Vol 989-994 ◽

pp. 1464-1468

Author(s):

Yang Tao ◽

Kun Zhou

Keyword(s):

User Preferences ◽

Vertical Handoff ◽

Heterogeneous Wireless Network ◽

Analytic Hierarchy ◽

Full Account ◽

Decision Algorithm ◽

Target Network ◽

Simulation Results ◽

Qos Evaluation ◽

Hierarchy Process

In the next generation of heterogeneous wireless network environments, to meet the network requirements of diverse services, we propose a vertical handoff decision algorithm based on QoS evaluation that refines the handover unit to the level of individual services. The proposed algorithm considers the needs of the services, network conditions, user preferences, and other factors, and combines the Analytic Hierarchy Process (AHP) with a cost function to choose the target network that best meets the requirements of the services. Compared with vertical handoff decisions based on RSS, simulation results show that the proposed method takes full account of the different QoS requirements of various service types when choosing the appropriate network and does not cause performance degradation.
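
Combining AHP with a cost function can be illustrated with a short sketch. The abstract does not give the paper's criteria, comparison matrices, or cost function, so everything below is assumed for illustration: AHP weights are approximated with the row geometric-mean method, the cost is a plain weighted sum of normalized per-criterion costs (lower is better), and all function names and the candidate data in the usage note are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights from a pairwise comparison matrix
    using the row geometric-mean method (an approximation of the
    principal-eigenvector method)."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def network_cost(weights, normalized_costs):
    """Weighted-sum cost; each entry of normalized_costs is assumed to lie
    in [0, 1], with lower values being better for the service."""
    return sum(w * c for w, c in zip(weights, normalized_costs))


def choose_target_network(weights, candidates):
    """Return the name of the candidate network with the lowest cost."""
    return min(candidates, key=lambda name: network_cost(weights, candidates[name]))
```

As a usage example with made-up numbers, the consistent matrix `[[1, 2, 4], [0.5, 1, 2], [0.25, 0.5, 1]]` gives weights of 4/7, 2/7, and 1/7; with candidates `{"wlan": [0.2, 0.5, 0.9], "lte": [0.6, 0.3, 0.1]}`, `choose_target_network` selects `"wlan"`. A per-service decision, as the abstract describes, would use a different comparison matrix for each service type.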
