Policy Gradient Method based Energy Efficient Task Scheduling in Mobile Edge Blockchain

Author(s):  
Yin Yufeng ◽  
Wu Wenjun ◽  
Dong Junyu ◽  
Gao Yang ◽  
Sun Yang ◽  
...

Nanophotonics ◽
2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Sean Hooten ◽  
Raymond G. Beausoleil ◽  
Thomas Van Vaerenbergh

Abstract We present a proof-of-concept technique for the inverse design of electromagnetic devices motivated by the policy gradient method in reinforcement learning, named PHORCED (PHotonic Optimization using REINFORCE Criteria for Enhanced Design). This technique uses a probabilistic generative neural network interfaced with an electromagnetic solver to assist in the design of photonic devices, such as grating couplers. We show that PHORCED obtains better performing grating coupler designs than local gradient-based inverse design via the adjoint method, while potentially providing faster convergence over competing state-of-the-art generative methods. As a further example of the benefits of this method, we implement transfer learning with PHORCED, demonstrating that a neural network trained to optimize 8° grating couplers can then be re-trained on grating couplers with alternate scattering angles while requiring >10× fewer simulations than control cases.
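
As a rough illustration of the policy-gradient idea behind PHORCED, the sketch below optimizes a vector of design parameters with a REINFORCE update. A Gaussian distribution stands in for the probabilistic generative neural network, and a toy quadratic figure of merit stands in for the electromagnetic solver; all names and numbers are illustrative assumptions, not details from the paper.

```python
# Minimal REINFORCE sketch in the spirit of PHORCED (illustrative only).
# A Gaussian over design parameters replaces the generative network, and
# figure_of_merit() is a toy stand-in for an electromagnetic solver call.
import numpy as np

rng = np.random.default_rng(0)

def figure_of_merit(x):
    # Stand-in for, e.g., a grating-coupler efficiency evaluation.
    return -np.sum((x - 1.5) ** 2, axis=-1)

dim, sigma, lr, batch = 8, 0.3, 0.05, 32
mu = np.zeros(dim)  # mean of the Gaussian "design policy"

for step in range(200):
    x = mu + sigma * rng.standard_normal((batch, dim))  # sample candidate designs
    r = figure_of_merit(x)                              # "simulate" each design
    b = r.mean()                                        # baseline for variance reduction
    # REINFORCE: grad_mu E[R] = E[(R - b) * grad_mu log N(x; mu, sigma^2 I)]
    #                         = E[(R - b) * (x - mu)] / sigma^2
    mu += lr * ((r - b)[:, None] * (x - mu)).mean(axis=0) / sigma**2

print("figure of merit at the learned mean design:", figure_of_merit(mu))
```

Transfer learning as described in the abstract would amount to re-using the trained distribution (here, mu) as the starting point for a new scattering angle, i.e. a shifted figure of merit.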


2006 ◽  
Vol 54 (11) ◽  
pp. 911-920 ◽  
Author(s):  
Takamitsu Matsubara ◽  
Jun Morimoto ◽  
Jun Nakanishi ◽  
Masa-aki Sato ◽  
Kenji Doya

2021 ◽  
Vol 104 ◽  
pp. 104398
Author(s):  
Andrija Petrović ◽  
Mladen Nikolić ◽  
Miloš Jovanović ◽  
Miloš Bijanić ◽  
Boris Delibašić

2020 ◽  
pp. 107754632093014
Author(s):  
Xue-She Wang ◽  
James D Turner ◽  
Brian P Mann

This study describes an approach for attractor selection (or multistability control) in nonlinear dynamical systems with constrained actuation. Attractor selection is achieved using two deep reinforcement learning methods: (1) the cross-entropy method and (2) the deep deterministic policy gradient method. The framework and algorithms for applying these control methods are presented. Experiments were performed on a Duffing oscillator, a classic nonlinear dynamical system with multiple attractors. Both methods achieve attractor selection under various control constraints, and their success rates are nearly identical, but the deep deterministic policy gradient method learns faster, exhibits lower performance variance, and yields smoother control. This study demonstrates that both reinforcement learning approaches can achieve attractor selection under actuation constraints.
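
To make the first of the two methods concrete, here is a hedged sketch of the cross-entropy method selecting the right-hand attractor of a twin-well Duffing oscillator under a hard bound on the forcing amplitude. The oscillator parameters, the piecewise-constant control parameterization, and the cost function are illustrative choices, not the paper's exact experimental setup.

```python
# Cross-entropy method for attractor selection in a twin-well Duffing
# oscillator, x'' + delta*x' + alpha*x + beta*x**3 = F(t), with |F| <= f_max.
# All constants are illustrative; the paper's setup differs in detail.
import numpy as np

rng = np.random.default_rng(1)
delta, alpha, beta = 0.3, -1.0, 1.0   # damping, linear, cubic stiffness
target = 1.0                          # stable equilibria sit at x = -1 and x = +1
f_max, horizon, dt, substeps = 0.4, 20, 0.01, 50  # constrained actuation

def rollout(forces):
    # Start at rest in the left well and apply a piecewise-constant force.
    x, v = -1.0, 0.0
    for f in forces:
        for _ in range(substeps):
            v += (f - delta * v - alpha * x - beta * x**3) * dt
            x += v * dt
    # Let transients decay unforced, then score distance to the target attractor.
    for _ in range(2000):
        v += (-delta * v - alpha * x - beta * x**3) * dt
        x += v * dt
    return (x - target) ** 2 + v**2

mu, sigma = np.zeros(horizon), np.full(horizon, f_max)
for it in range(30):
    pop = np.clip(mu + sigma * rng.standard_normal((64, horizon)), -f_max, f_max)
    costs = np.array([rollout(seq) for seq in pop])
    elite = pop[np.argsort(costs)[:8]]            # keep the 8 best force sequences
    mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-3

print("cost of the learned forcing sequence:", rollout(mu))
```

A deep deterministic policy gradient agent would instead learn a feedback policy F(x, v) rather than an open-loop force sequence, which plausibly underlies the smoother control the abstract reports.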

