Hybrid deep reinforcement learning based eco-driving for low-level connected and automated vehicles along signalized corridors

2021 ◽  
Vol 124 ◽  
pp. 102980
Author(s):  
Qiangqiang Guo ◽  
Ohay Angah ◽  
Zhijun Liu ◽  
Xuegang (Jeff) Ban
2021 ◽  
pp. 318-329
Author(s):  
Nikodem Pankiewicz ◽  
Tomasz Wrona ◽  
Wojciech Turlej ◽  
Mateusz Orłowski

2010 ◽  
Vol 1 (1) ◽  
pp. 39-59 ◽  
Author(s):  
Ender Özcan ◽  
Mustafa Misir ◽  
Gabriela Ochoa ◽  
Edmund K. Burke

Hyper-heuristics can be identified as methodologies that search the space generated by a finite set of low level heuristics for solving search problems. An iterative hyper-heuristic framework can be thought of as requiring a single candidate solution and multiple perturbation low level heuristics. An initially generated complete solution goes through two successive processes (heuristic selection and move acceptance) until a set of termination criteria is satisfied. A motivating goal of hyper-heuristic research is to create automated techniques that are applicable to a wide range of problems with different characteristics. Some previous studies show that different combinations of heuristic selection and move acceptance as hyper-heuristic components might yield different performances. This study investigates whether learning heuristic selection can improve the performance of a great deluge based hyper-heuristic using an examination timetabling problem as a case study.
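The loop described above (heuristic selection followed by great-deluge move acceptance, repeated until termination) can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the score-based selection rule and the linear decay of the water level are assumptions chosen for simplicity.

```python
import random

def great_deluge_hyperheuristic(initial, cost, heuristics, iterations=1000, decay=None):
    """Minimal iterative hyper-heuristic: learned heuristic selection
    plus great-deluge move acceptance (illustrative sketch)."""
    current = initial
    current_cost = cost(current)
    level = current_cost                      # "water level" starts at the initial cost
    if decay is None:
        decay = current_cost / iterations     # linear decay of the level toward zero
    scores = [1.0] * len(heuristics)          # simple reinforcement scores per heuristic
    for _ in range(iterations):
        # roulette-wheel selection weighted by learned scores
        i = random.choices(range(len(heuristics)), weights=scores)[0]
        candidate = heuristics[i](current)
        c = cost(candidate)
        # great deluge: accept improving moves, or any move at/below the level
        if c <= current_cost or c <= level:
            current, current_cost = candidate, c
            scores[i] += 1.0                  # reward the chosen heuristic
        else:
            scores[i] = max(0.1, scores[i] - 0.1)  # penalize it
        level -= decay
    return current, current_cost
```

With a toy objective such as minimizing x², the two perturbation heuristics x+1 and x-1 play the role of the low-level heuristic set.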


2017 ◽  
Vol 2017 ◽  
pp. 1-17 ◽  
Author(s):  
S. C. Calvert ◽  
W. J. Schakel ◽  
J. W. C. van Lint

With low-level vehicle automation already available, there is a need to estimate its effects on traffic flow, especially where these could be negative. A long, gradual transition will occur from manual driving to automated driving, during which many as-yet unknown traffic flow dynamics will be present. These effects have the potential to increasingly aid or cripple current road networks. In this contribution, we investigate these effects using an empirically calibrated and validated simulation experiment, backed up with findings from the literature. We found that low-level automated vehicles in mixed traffic will initially have a small negative effect on traffic flow and road capacity. The experiment further showed that any improvement in traffic flow will only be seen at penetration rates above 70%. Also, the capacity drop appeared to be slightly higher in the presence of low-level automated vehicles. The experiment further investigated the effect of bottleneck severity and truck shares on traffic flow. Improvements to current traffic models are recommended and should include greater detail and a deeper understanding of driver-vehicle interaction, both in conventional and in mixed traffic flow. Further research into behavioural shifts in driving is also recommended, given the limited data and knowledge of these dynamics.


2019 ◽  
Author(s):  
Charles Findling ◽  
Nicolas Chopin ◽  
Etienne Koechlin

Everyday life features uncertain and ever-changing situations. In such environments, optimal adaptive behavior requires higher-order inferential capabilities to grasp the volatility of external contingencies. These capabilities, however, involve complex and rapidly intractable computations, so that we poorly understand how humans develop efficient adaptive behaviors in such environments. Here we demonstrate this counterintuitive result: simple, low-level inferential processes involving imprecise computations conforming to the psychophysical Weber law actually lead to near-optimal adaptive behavior, regardless of the environment's volatility. Using volatile experimental settings, we further show that such imprecise, low-level inferential processes accounted for observed human adaptive performance, unlike optimal adaptive models involving higher-order inferential capabilities, their biologically more plausible algorithmic approximations, and non-inferential adaptive models such as reinforcement learning. Thus, minimal inferential capabilities may have evolved along with imprecise neural computations as contributing to near-optimal adaptive behavior in real-life environments, while leading humans to make suboptimal choices in canonical decision-making tasks.
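The core idea of an imprecise low-level inferential process can be illustrated with a single noisy update rule. This is a hypothetical sketch, not the authors' model: the delta-rule form, the learning rate, and the Weber fraction are all assumptions; the Weber-law-like signature is simply that computation noise scales with the magnitude of the update itself.

```python
import random

def weber_noisy_update(estimate, observation, lr=0.1, weber=0.2):
    """One imprecise delta-rule update: additive noise whose standard
    deviation is proportional to the size of the update (a Weber-law-like
    signature). Illustrative sketch only."""
    delta = lr * (observation - estimate)
    noise = random.gauss(0.0, weber * abs(delta))  # noise magnitude scales with |update|
    return estimate + delta + noise
```

Because the noise shrinks as the update shrinks, repeated updates toward a stable observation still converge close to it.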


2021 ◽  
Vol 5 (4) ◽  
pp. 1-24
Author(s):  
Siddharth Mysore ◽  
Bassel Mabsout ◽  
Kate Saenko ◽  
Renato Mancuso

We focus on the problem of reliably training Reinforcement Learning (RL) models (agents) for stable low-level control in embedded systems and test our methods on a high-performance, custom-built quadrotor platform. A common but often under-studied problem in developing RL agents for continuous control is that the control policies developed are not always smooth. This lack of smoothness can be a major problem when learning controllers, as it can result in control instability and hardware failure. Issues of noisy control are further accentuated when training RL agents in simulation, because simulators are ultimately imperfect representations of reality (the so-called reality gap). To combat issues of instability in RL agents, we propose a systematic framework, REinforcement-based transferable Agents through Learning (RE+AL), for designing simulated training environments that preserve the quality of trained agents when transferred to real platforms. RE+AL is an evolution of the Neuroflight infrastructure detailed in technical reports prepared by members of our research group. Neuroflight is a state-of-the-art framework for training RL agents for low-level attitude control. RE+AL improves and completes Neuroflight by solving a number of important limitations that hindered the deployment of Neuroflight to real hardware. We benchmark RE+AL on the NF1 racing quadrotor developed as part of Neuroflight. We demonstrate that RE+AL significantly mitigates the previously observed smoothness issues in RL agents. Additionally, RE+AL is shown to consistently train agents that are flight-capable and with minimal degradation in controller quality upon transfer. RE+AL agents also learn to perform better than a tuned PID controller, with lower tracking error, smoother control, and reduced power consumption.
To the best of our knowledge, RE+AL agents are the first RL-based controllers trained in simulation to outperform a well-tuned PID controller on a real-world controls problem that is solvable with classical control.
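The smoothness problem discussed above can be made concrete with a simple spectral proxy: a noisy control policy puts more of its action energy into high frequencies, which on real hardware translates into actuator wear and instability. This is a hypothetical metric for illustration, not the paper's definition; the "high-frequency fraction" cutoff at half the spectrum is an assumption.

```python
import numpy as np

def action_smoothness(actions):
    """Fraction of action spectral energy in the upper half of the
    spectrum -- a simple proxy for (lack of) control smoothness.
    Hypothetical metric, chosen for illustration; lower is smoother."""
    actions = np.asarray(actions, dtype=float)
    spectrum = np.abs(np.fft.rfft(actions - actions.mean()))  # drop DC component
    half = len(spectrum) // 2
    total = spectrum.sum()
    return float(spectrum[half:].sum() / total) if total > 0 else 0.0
```

A clean low-frequency command sequence scores near zero, while the same sequence corrupted by noisy policy outputs scores markedly higher.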


Author(s):  
Qiuyuan Huang ◽  
Zhe Gan ◽  
Asli Celikyilmaz ◽  
Dapeng Wu ◽  
Jianfeng Wang ◽  
...  

We propose a hierarchically structured reinforcement learning approach to address the challenges of planning for generating coherent multi-sentence stories for the visual storytelling task. Within our framework, the task of generating a story given a sequence of images is divided across a two-level hierarchical decoder. The high-level decoder constructs a plan by generating a semantic concept (i.e., topic) for each image in sequence. The low-level decoder generates a sentence for each image using a semantic compositional network, which effectively grounds the sentence generation conditioned on the topic. The two decoders are jointly trained end-to-end using reinforcement learning. We evaluate our model on the visual storytelling (VIST) dataset. Empirical results from both automatic and human evaluations demonstrate that the proposed hierarchically structured reinforced training achieves significantly better performance compared to a strong flat deep reinforcement learning baseline.
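The two-level decoding scheme described above can be sketched as a simple loop: the high-level planner emits a topic per image (conditioned on the topics so far), and the low-level decoder realizes each topic as a sentence. This is a structural sketch only; `topic_model` and `sentence_model` are hypothetical placeholders standing in for the trained high- and low-level decoders.

```python
def hierarchical_story(images, topic_model, sentence_model):
    """Two-level decoding sketch: plan a topic per image (high level),
    then generate a sentence grounded in that topic (low level)."""
    story, planned_topics = [], []
    for image in images:
        topic = topic_model(image, planned_topics)   # high-level: plan the next topic
        sentence = sentence_model(image, topic)      # low-level: realize it as text
        planned_topics.append(topic)
        story.append(sentence)
    return " ".join(story)
```

In the actual model both levels are trained jointly end-to-end with reinforcement learning rather than called as fixed functions; the sketch only shows the control flow.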


