Actor-Critic–Like Stochastic Adaptive Search for Continuous Simulation Optimization

2021 ◽  
Author(s):  
Qi Zhang ◽  
Jiaqiao Hu

Many systems arising in applications from engineering design, manufacturing, and healthcare require the use of simulation optimization (SO) techniques to improve their performance. In “Actor-Critic–Like Stochastic Adaptive Search for Continuous Simulation Optimization,” Q. Zhang and J. Hu propose a randomized approach that integrates ideas from actor-critic reinforcement learning within a class of adaptive search algorithms for solving SO problems. The approach fully retains the previous simulation data and incorporates them into an approximation architecture to exploit knowledge of the objective function in searching for improved solutions. The authors provide a finite-time analysis for the method when only a single simulation observation is collected at each iteration. The method works well on a diverse set of benchmark problems and has the potential to yield good performance for complex problems using expensive simulation experiments for performance evaluation.
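The general recipe described here — retain every simulation observation, fit an approximation architecture (a "critic") to the accumulated data, and use it to steer a sampling distribution (an "actor") toward better solutions — can be illustrated with a minimal sketch. This is not the authors' algorithm: the quadratic surrogate, the toy objective, and the update rules below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_objective(x):
    # toy 1-D objective with simulation noise (illustrative stand-in)
    return -(x - 1.5) ** 2 + rng.normal(scale=0.1)

# Retain every simulation observation and fit a simple quadratic
# surrogate ("critic") to guide where the sampler ("actor") searches.
xs, ys = [], []
mean, std = 0.0, 2.0              # sampling-distribution parameters
for it in range(200):
    x = rng.normal(mean, std)     # actor: propose a candidate solution
    y = noisy_objective(x)        # one simulation observation per iteration
    xs.append(x); ys.append(y)
    # critic: least-squares quadratic fit over ALL retained data
    coeffs = np.polyfit(xs, ys, deg=2) if len(xs) >= 3 else None
    if coeffs is not None and coeffs[0] < 0:
        # move the sampler toward the surrogate's maximizer
        x_star = -coeffs[1] / (2 * coeffs[0])
        mean += 0.2 * (x_star - mean)
        std = max(0.95 * std, 0.05)   # slowly concentrate the search

print(round(mean, 2))  # the sampler's mean should approach the optimum near 1.5
```

The key feature mirrored from the abstract is that the surrogate is refit on the full history of simulation data at every iteration, with a single new observation collected each time.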

2018 ◽  
Vol 66 (6) ◽  
pp. 1713-1727 ◽  
Author(s):  
Seksan Kiatsupaibul ◽  
Robert L. Smith ◽  
Zelda B. Zabinsky

2006 ◽  
Vol 16 (07) ◽  
pp. 2081-2091 ◽  
Author(s):  
GEORGE D. MAGOULAS ◽  
ARISTOKLIS ANASTASIADIS

This paper explores the use of the nonextensive q-distribution in the context of adaptive stochastic searching. The proposed approach consists of generating the "probability" of moving from one point of the search space to another through a probability distribution characterized by the q entropic index of the nonextensive entropy. The potential benefits of this technique are investigated by incorporating it in two different adaptive search algorithmic models to create new modifications of the diffusion method and the particle swarm optimizer. The performance of the modified search algorithms is evaluated in a number of nonlinear optimization and neural network training benchmark problems.
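The core device here is the Tsallis q-exponential, which generalizes the Boltzmann acceptance rule of classical stochastic search and recovers it as q → 1. A minimal sketch of such a q-parameterized acceptance probability (illustrative only; the paper's specific distributions and parameterization may differ):

```python
import math

def q_exp(x, q):
    """Tsallis q-exponential: reduces to exp(x) as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def accept_prob(delta, T, q):
    # probability of moving to a point whose cost is `delta` higher,
    # at "temperature" T; delta <= 0 (an improvement) is always accepted
    return min(1.0, q_exp(-delta / T, q))

# q > 1 gives a heavier tail than the classical Metropolis rule, so
# uphill moves stay more likely and the search escapes local minima.
print(accept_prob(1.0, 1.0, 1.0))   # classical: exp(-1) ~ 0.368
print(accept_prob(1.0, 1.0, 1.5))   # q = 1.5: heavier tail, ~ 0.444
```

Plugging such a rule into a diffusion-type search or into a particle swarm's stochastic components is the kind of modification the paper evaluates.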


Author(s):  
Carlos D. Paternina-Arboleda ◽  
Jairo R. Montoya-Torres ◽  
Aldo Fabregas-Ariza

2018 ◽  
Vol 176 ◽  
pp. 01034
Author(s):  
Chengxin Li ◽  
Jing Peng ◽  
Lv Zhicheng ◽  
Mengli Wang ◽  
Gang Ou

In GPS positioning, the linear least squares algorithm and the Kalman filter are widely used but still have shortcomings. This paper proposes applying the extreme learning machine (ELM) to this problem, breaking through the limitations of traditional positioning methods based on explicit mathematical models. Two simulation experiments on ELM in the GPS positioning process are presented, the second supplementing the first. Each consists of three carefully designed phases: simulation data generation, network training, and network prediction. The feasibility of the extreme learning machine is verified through experimental simulation, and a more accurate positioning result is obtained.
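The ELM training procedure itself is simple: hidden-layer weights are drawn at random and frozen, and only the output weights are fitted, in a single least-squares solve. A minimal sketch on a toy regression problem (a stand-in for the GPS positioning mapping; the data, network size, and activation here are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy regression target standing in for the positioning mapping
X = rng.uniform(-1, 1, size=(500, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]

# Extreme learning machine: random, fixed hidden layer; only the
# output weights are trained, via one least-squares solve.
n_hidden = 200
W = rng.normal(size=(2, n_hidden))            # input weights (never trained)
b = rng.normal(size=n_hidden)                 # hidden biases (never trained)
H = np.tanh(X @ W + b)                        # hidden-layer activations
beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weights

pred = H @ beta
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
print("train RMSE:", round(rmse, 4))
```

Because no iterative backpropagation is involved, training reduces to one matrix factorization, which is what makes ELM attractive when many positioning fixes must be produced quickly.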


Author(s):  
M. Jalali Varnamkhasti

Premature convergence is an essential problem in genetic algorithms, and it is strongly related to the loss of genetic diversity in the population. In this study, a new sexual selection mechanism that uses the mate chromosome during selection is proposed, and a technique is then introduced that selects and controls the genetic operators through a fuzzy logic controller. Computational experiments are conducted on the proposed techniques, and the results are compared with other operators as well as heuristic and local search algorithms commonly used for solving benchmark problems published in the literature.
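The underlying idea — monitor population diversity and let a controller adapt the genetic operators to counter premature convergence — can be sketched with a crude rule-based schedule standing in for the paper's fuzzy logic controller and sexual selection mechanism (the objective, operators, and all parameters below are illustrative assumptions):

```python
import random
random.seed(3)

def fitness(bits):   # toy objective: OneMax (maximize the number of 1s)
    return sum(bits)

def diversity(pop):  # fraction of loci on which the population disagrees
    n = len(pop[0])
    return sum(len({ind[i] for ind in pop}) > 1 for i in range(n)) / n

def mutation_rate(div):
    # crude rule-based schedule: low diversity -> high mutation,
    # high diversity -> low mutation (fights premature convergence)
    if div < 0.2:
        return 0.10
    if div < 0.5:
        return 0.05
    return 0.01

N, L = 30, 40
pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(N)]
for gen in range(150):
    pm = mutation_rate(diversity(pop))          # controller picks the rate
    pop.sort(key=fitness, reverse=True)
    elite = pop[: N // 2]                       # elitist survivor selection
    children = []
    while len(elite) + len(children) < N:
        a, b = random.sample(elite, 2)
        cut = random.randrange(1, L)
        child = a[:cut] + b[cut:]               # one-point crossover
        child = [g ^ (random.random() < pm) for g in child]  # bit-flip mutation
        children.append(child)
    pop = elite + children

print(max(fitness(ind) for ind in pop))  # best fitness found (max possible: 40)
```

A fuzzy controller would replace the three hard thresholds with overlapping membership functions and a rule base, but the feedback loop from diversity to operator settings is the same.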


2021 ◽  
Vol 2113 (1) ◽  
pp. 012030
Author(s):  
Jing Li ◽  
Yanyang Liu ◽  
Xianguo Qing ◽  
Kai Xiao ◽  
Ying Zhang ◽  
...  

Abstract The nuclear reactor control system plays a crucial role in the operation of nuclear power plants. The coordinated control of reactor power and steam generator water level has become one of the most important control problems in these systems. In this paper, we propose a mathematical model of the coordinated control system, transform it into a reinforcement learning problem, and develop a deep reinforcement learning control algorithm, the deep deterministic policy gradient (DDPG) algorithm, to solve it. Simulation experiments show that the proposed algorithm achieves remarkable control performance.
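DDPG couples a deterministic actor, updated along the critic's action gradient, with slowly tracking (Polyak-averaged) target copies of both networks. The sketch below strips this to those two core updates on a one-dimensional toy problem with a known critic; the full algorithm also learns the critic from Bellman targets and draws minibatches from a replay buffer (everything here is an illustrative assumption, not the paper's controller):

```python
# Toy critic: Q(s, a) = -(a - 0.7*s)**2, so the best action at state s
# is a = 0.7*s. In DDPG proper, Q would itself be a learned network.
def critic_grad_a(s, a):
    return -2.0 * (a - 0.7 * s)          # dQ/da of the toy critic

theta, theta_target = 0.0, 0.0           # actor weight and its target copy
for _ in range(500):
    s = 1.0                              # fixed state for the sketch
    a = theta * s                        # deterministic policy a = theta * s
    # deterministic policy gradient ascent: dQ/da * da/dtheta
    theta += 0.01 * critic_grad_a(s, a) * s
    # Polyak-averaged ("soft") target update, as in DDPG
    theta_target = 0.99 * theta_target + 0.01 * theta

print(round(theta, 2))  # the actor converges toward the critic's optimum, 0.7
```

The soft target update is what stabilizes learning when, unlike here, the critic is bootstrapped from its own target network.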


2014 ◽  
Vol 41 (10) ◽  
pp. 4939-4949 ◽  
Author(s):  
João Paulo Queiroz dos Santos ◽  
Jorge Dantas de Melo ◽  
Adrião Dória Duarte Neto ◽  
Daniel Aloise

2022 ◽  
Vol 6 (1) ◽  
pp. 1-25
Author(s):  
Fang-Chieh Chou ◽  
Alben Rome Bagabaldo ◽  
Alexandre M. Bayen

This study focuses on the comprehensive investigation of stop-and-go waves appearing in closed-circuit ring road traffic, evaluating various longitudinal dynamical models for vehicles. It is known that the behavior of human-driven vehicles, with other traffic elements such as density held constant, can induce stop-and-go waves that do not dissipate on the circuit ring road. Stop-and-go waves can be dissipated by adding automated vehicles (AVs) to the ring. Thorough investigations of the performance of AV longitudinal control algorithms were carried out in Flow, an integrated platform for reinforcement learning on traffic control. Ten AV algorithms presented in the literature are evaluated. For each AV algorithm, experiments are carried out by varying the distribution and penetration rate of AVs. Two distributions of AVs are studied. In the first scenario, AVs are placed consecutively, with penetration rates varied from 1 AV (5%) to all AVs (100%). In the second scenario, AVs are spaced evenly, with human-driven vehicles between any two AVs, and penetration rates are varied from 2 AVs (10%) to 11 AVs (50%). Each configuration is simulated over multiple runs (10 runs) to average out randomness in the results. From more than 3,000 simulation experiments, we investigated how the AV algorithms perform under varying distributions and penetration rates, with each algorithm's parameters held fixed across all scenarios. Time to stabilize, maximum headway, vehicle miles traveled, and fuel economy are used to evaluate performance. Using these metrics, we find that the traffic condition improvement is not strongly dependent on the distribution for most of the AV controllers, particularly when no cooperation among AVs is considered. Traffic conditions generally improve with a higher AV penetration rate, with only one of the AV algorithms showing a contrary trend.
Among all AV algorithms in this study, the reinforcement learning controller shows the most consistent improvement under all distributions and penetration rates.
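Two of the evaluation metrics named above, time to stabilize and maximum headway, are straightforward to compute from simulated trajectories. The sketch below assumes simple working definitions (a speed-equalization window for stability, and the largest inter-vehicle gap on the ring), which may differ from the study's exact definitions:

```python
def time_to_stabilize(speeds, dt=0.1, tol=0.5, window=50):
    """First time after which all vehicle speeds stay within `tol` m/s
    of each other for `window` consecutive steps (assumed definition).
    `speeds` is a list of per-timestep lists of vehicle speeds."""
    for t in range(len(speeds) - window):
        if all(max(step) - min(step) <= tol for step in speeds[t:t + window]):
            return t * dt
    return None  # never stabilized within the trace

def max_headway(positions, circumference):
    """Largest gap between consecutive vehicles on the ring (assumed
    definition: positions are arc-length coordinates along the loop)."""
    order = sorted(p % circumference for p in positions)
    gaps = [(order[(i + 1) % len(order)] - order[i]) % circumference
            for i in range(len(order))]
    return max(gaps)

# toy trace: two vehicles whose speeds equalize after step 30
speeds = [[5.0, 2.0]] * 30 + [[4.0, 4.1]] * 100
print(time_to_stabilize(speeds))        # stabilizes 3.0 s into the toy trace
print(max_headway([0, 50, 120], 260))   # largest gap wraps from 120 back to 0: 140
```

Vehicle miles traveled and fuel economy would similarly be accumulated per vehicle over the run; averaging each metric over the 10 runs gives the per-configuration scores compared in the study.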

