Route Optimization of Construction Machine by Deep Reinforcement Learning

2019 ◽  
Vol 139 (4) ◽  
pp. 401-408
Author(s):  
Shunya Tanabe ◽  
Zeyuan Sun ◽  
Masayuki Nakatani ◽  
Yutaka Uchimura
CICTP 2020 ◽  
2020 ◽  
Author(s):  
Ange Wang ◽  
Liqun Peng ◽  
Zhengtao Qin ◽  
Hongzhi Guan ◽  
Chuhao Nie

2021 ◽  
Author(s):  
Subrata Bhowmik

Abstract Optimal route selection for a subsea pipeline is a critical task in the pipeline design process, and the selected route can significantly affect the overall project cost. It is therefore necessary to design routes that are both economical and safe. On-bottom stability (OBS) and fixed obstacles such as existing crossings and free spans are the main factors that affect route selection. This article proposes a novel hybrid optimization method based on a machine learning algorithm for designing an optimal pipeline route. The proposed method uses a Reinforcement Learning (RL) algorithm, a particular type of machine learning, to train a system that optimizes the route selection of subsea pipelines. The route optimization tool evaluates each candidate route by incorporating on-bottom stability criteria based on the DNVGL-ST-109 standard and other constraints such as minimum pipeline route length, static obstacles, pipeline crossings, and free-span section length. The cost function in the optimization method simultaneously handles the minimization of route length and the cost of mitigating procedures. The Genetic Algorithm (GA), a well-established multi-objective optimization method, is used as a reference against which to compare the routes obtained by the proposed RL-based method. Three case studies of optimal route selection are performed using the RL approach, with the OBS criteria included in its cost function, and compared with the GA. The RL method saves up to 20% of pipeline length for a complex problem with 15 crossings and 31 free spans.
The RL optimization method provides optimal routes while considering different aspects of the design and the costs of the various measures used to stabilize a pipeline (mattresses, trenching, burying, concrete coating, or even employing a heavier pipe with additional steel wall thickness). The OBS criteria significantly influence the best route, indicating that the tool can reduce the pipeline's design time and minimize its installation and operational costs. Conventionally, pipeline route optimization is performed manually, with only the minimum route length and static obstacles considered in finding an optimum route. Engineering is then performed to fulfill the criteria of this route, an approach that may not lead to an optimized engineering cost. The proposed Reinforcement Learning method for route optimization is a hybrid, faster, and more cost-efficient approach. It reduces the pipeline's installation and operational costs by up to 20% compared with the conventional route selection process.
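The abstract does not give implementation details, but the core idea of evaluating candidate routes with a cost function that combines route length and per-obstacle mitigation cost can be sketched with tabular Q-learning on a grid. Everything below is a hypothetical toy: the 6x6 seabed grid, the mitigation costs, and the reward weights are all assumed for illustration, not taken from the paper.

```python
import random

# Hypothetical 6x6 seabed grid: 0 = clear, values > 0 are mitigation
# costs (e.g. a crossing or free-span section that must be engineered).
GRID = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 5, 0, 0, 0],
    [0, 0, 5, 0, 3, 0],
    [0, 0, 0, 0, 3, 0],
    [0, 3, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (5, 5)
ACTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # E, S, W, N

def step(state, action):
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < 6 and 0 <= c < 6):
        return state, -10.0              # penalise leaving the corridor
    # Each cell costs 1 unit of length plus any mitigation cost.
    reward = -1.0 - GRID[r][c]
    if (r, c) == GOAL:
        reward += 100.0
    return (r, c), reward

def train(episodes=3000, alpha=0.5, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning over the grid; hyperparameters are arbitrary."""
    random.seed(seed)
    Q = {}
    for _ in range(episodes):
        s = START
        for _ in range(100):
            if random.random() < eps:
                a = random.randrange(4)
            else:
                a = max(range(4), key=lambda i: Q.get((s, i), 0.0))
            s2, r = step(s, ACTIONS[a])
            best_next = max(Q.get((s2, i), 0.0) for i in range(4))
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (
                r + gamma * best_next - Q.get((s, a), 0.0))
            s = s2
            if s == GOAL:
                break
    return Q

def greedy_route(Q):
    """Read the learned route off the Q-table greedily."""
    s, route = START, [START]
    while s != GOAL and len(route) < 50:
        a = max(range(4), key=lambda i: Q.get((s, i), 0.0))
        s, _ = step(s, ACTIONS[a])
        route.append(s)
    return route
```

Because the reward folds mitigation cost into the same scale as length, the learned route trades a slightly longer path against avoiding costly crossings, which is the trade-off the paper's cost function formalises.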


2021 ◽  
Vol 12 (6) ◽  
pp. 1-21
Author(s):  
Pengzhan Guo ◽  
Keli Xiao ◽  
Zeyang Ye ◽  
Wei Zhu

Vehicle mobility optimization in urban areas is a long-standing problem in smart city and spatial data analysis. Given the complex urban scenario and unpredictable social events, our work focuses on developing a mobile sequential recommendation system to maximize the profitability of vehicle service providers (e.g., taxi drivers). In particular, we treat the dynamic route optimization problem as a long-term sequential decision-making task. A reinforcement-learning framework is proposed to tackle this problem, by integrating a self-check mechanism and a deep neural network for customer pick-up point monitoring. To account for unexpected situations (e.g., the COVID-19 outbreak), our method is designed to handle related environment changes with a self-adaptive parameter determination mechanism. Based on the yellow taxi data in New York City and vicinity before and after the COVID-19 outbreak, we have conducted comprehensive experiments to evaluate the effectiveness of our method. The results show consistently excellent performance, from hourly to weekly measures, supporting the superiority of our method over state-of-the-art methods (with more than 98% improvement in profitability for taxi drivers).
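The sequential decision framing above can be illustrated in miniature: choose a sequence of pick-up points to maximise expected profit (pickup probability times fare, minus cruising cost). The point names, probabilities, fares, and costs below are all hypothetical, and brute-force enumeration stands in for the paper's learned policy, which is what makes long horizons tractable.

```python
import itertools

# Hypothetical pick-up points: (probability of finding a passenger,
# expected fare if one is found).
PICKUP = {
    "airport":  (0.9, 35.0),
    "downtown": (0.6, 12.0),
    "suburb":   (0.3, 18.0),
}
# Assumed symmetric cruising costs between points.
COST = {("airport", "downtown"): 6.0, ("airport", "suburb"): 9.0,
        ("downtown", "suburb"): 4.0}

def travel(a, b):
    return 0.0 if a == b else COST.get((a, b), COST.get((b, a)))

def best_sequence(start, horizon):
    """Brute-force the profit-maximising sequence of pick-up points
    over a short horizon (an RL policy replaces this search when the
    horizon and the number of points grow)."""
    best = (float("-inf"), None)
    for seq in itertools.product(PICKUP, repeat=horizon):
        profit, here = 0.0, start
        for p in seq:
            prob, fare = PICKUP[p]
            profit += prob * fare - travel(here, p)
            here = p
        best = max(best, (profit, seq))
    return best
```

With these toy numbers the airport's expected value dominates its extra travel cost, so the planner recommends cruising there and staying; changing the probabilities (as a COVID-style demand shift would) changes the recommended sequence, which is the situation the paper's self-adaptive mechanism is designed to detect.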


Author(s):  
Tobias Jacobs ◽  
Francesco Alesiani ◽  
Gulcin Ermis

Application of deep learning to NP-hard combinatorial optimization problems is an emerging research trend, and a number of interesting approaches have been published over the last few years. In this work we address robust optimization, which is a more complex variant where a max-min problem is to be solved. We obtain robust solutions by solving the inner minimization problem exactly and apply Reinforcement Learning to learn a heuristic for the outer problem. The minimization term in the inner objective represents an obstacle to existing RL-based approaches, as its value depends on the full solution in a non-linear manner and cannot be evaluated for partial solutions constructed by the agent over the course of each episode. We overcome this obstacle by defining the reward in terms of the one-step advantage over a baseline policy whose role can be played by any fast heuristic for the given problem. The agent is trained to maximize the total advantage, which, as we show, is equivalent to the original objective. We validate our approach by solving min-max versions of standard benchmarks for the Capacitated Vehicle Routing and the Traveling Salesperson Problem, where our agents obtain near-optimal solutions and improve upon the baselines.
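The advantage-based reward described above has a simple telescoping structure that can be shown concretely. The sketch below illustrates it on a plain (non-robust) TSP with a nearest-neighbour completion heuristic as the baseline policy; the city coordinates and the agent's decisions are hypothetical, and the min-max inner problem is omitted for brevity.

```python
import math

# Toy TSP instance (hypothetical coordinates); tours start and end at city 0.
CITIES = [(0, 0), (4, 0), (4, 3), (1, 3), (0, 1)]

def dist(a, b):
    return math.dist(CITIES[a], CITIES[b])

def tour_cost(tour):
    return sum(dist(tour[i], tour[i + 1]) for i in range(len(tour) - 1))

def baseline_complete(partial):
    """Fast baseline policy: finish the partial tour greedily
    (nearest neighbour), then return to the start."""
    tour = list(partial)
    remaining = set(range(len(CITIES))) - set(tour)
    while remaining:
        nxt = min(remaining, key=lambda c: dist(tour[-1], c))
        tour.append(nxt)
        remaining.remove(nxt)
    tour.append(tour[0])
    return tour

def advantage_reward(partial_before, partial_after):
    """One-step advantage: how much the agent's latest choice improved
    on what the baseline would have achieved from the previous state."""
    return (tour_cost(baseline_complete(partial_before))
            - tour_cost(baseline_complete(partial_after)))

# One episode: the agent constructs a tour city by city (choices are
# hypothetical), collecting the one-step advantage at each step.
episode = [0, 4, 3, 2, 1]
rewards = [advantage_reward(episode[:t], episode[:t + 1])
           for t in range(1, len(episode))]

# The rewards telescope: total advantage == baseline cost - final cost,
# so maximising total reward is equivalent to minimising the final cost,
# even though no single partial solution can be evaluated on its own.
final_cost = tour_cost(episode + [episode[0]])
base_cost = tour_cost(baseline_complete([0]))
assert abs(sum(rewards) - (base_cost - final_cost)) < 1e-9
```

The key point is that each reward is computable mid-episode (both terms are full tours produced by the fast baseline), sidestepping the obstacle that the true objective is only defined on complete solutions.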


Author(s):  
Masayuki Nakatani ◽  
Zeyuan Sun ◽  
Yutaka Uchimura
