Deep Reinforcement Learning Based Dynamic Route Planning for Minimizing Travel Time

Reinforcement learning (RL) has shown great potential for motorway ramp control, especially under congestion caused by incidents. However, existing applications, which are limited to single-agent tasks and based on Q-learning, have inherent drawbacks when dealing with coordinated ramp control problems. To solve these problems, a Dyna-Q based multiagent reinforcement learning (MARL) system named Dyna-MARL has been developed in this paper. Dyna-Q is an extension of Q-learning that combines model-free and model-based methods to obtain the benefits of both. The performance of Dyna-MARL is tested on a simulated motorway segment in the UK with real traffic data collected during AM peak hours. Compared with isolated RL and noncontrolled situations, the test results show that Dyna-MARL achieves superior performance in improving traffic operation: it increases total throughput and reduces total travel time and CO2 emissions. Moreover, with a suitable coordination strategy, Dyna-MARL can maintain a highly equitable motorway system by balancing the travel time of road users from different on-ramps.
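To illustrate how Dyna-Q mixes model-free and model-based updates, here is a minimal single-agent, tabular sketch in Python. It is not the Dyna-MARL system from the abstract: the chain environment, the function names (`dyna_q`, `chain_step`) and all hyperparameter values are assumptions chosen only to keep the example small.

```python
import random
from collections import defaultdict

def dyna_q(step, start, actions, episodes=50, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0, max_steps=200):
    """Tabular Dyna-Q: Q-learning on real steps plus replay from a learned model."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q[(state, action)], initialised to 0
    model = {}              # (state, action) -> (reward, next_state, done)

    def greedy(s):
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning backup, used for real and simulated experience alike.
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s = start
        for _ in range(max_steps):
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2, done = step(s, a)
            update(s, a, r, s2, done)        # model-free part: learn from the real step
            model[(s, a)] = (r, s2, done)    # model learning (deterministic world assumed)
            for _ in range(planning_steps):  # model-based part: replay remembered steps
                ps, pa = rng.choice(list(model))
                update(ps, pa, *model[(ps, pa)])
            if done:
                break
            s = s2
    return q

# Hypothetical toy environment: a chain of cells 0..5, reward 1 on reaching cell 5.
def chain_step(s, a):
    s2 = min(max(s + a, 0), 5)
    done = s2 == 5
    return (1.0 if done else 0.0), s2, done

q = dyna_q(chain_step, start=0, actions=[-1, 1])
policy = [max([-1, 1], key=lambda a: q[(s, a)]) for s in range(5)]
```

The planning loop is what separates Dyna-Q from plain Q-learning: each real transition is followed by several backups drawn from the learned model, so value estimates spread through the state space with fewer real interactions.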

Real-time urban regional route planning model for connected vehicles based on V2X communication

Journal of Transport and Land Use, 2020, Vol 13 (1), pp. 517-538. DOI: 10.5198/jtlu.2020.1598. Cited by ~1.

Author(s): Pangwei Wang, Hui Deng, Juan Zhang, Mingfang Zhang

Keyword(s): Land Use, Travel Time, Real Time, Land Use Planning, Route Planning, Urban Traffic, Urban Transport, Connected Vehicles, Planning Model, Optimal Route

Advances in connected vehicle technology have presented opportunities and challenges for smart urban transport and land use. To improve the capacity of urban transport and optimize land-use planning, a novel real-time regional route planning model based on vehicle-to-everything (V2X) communication is presented in this paper. First, taking into account the traffic signal timing and phase information collected via V2X, road section resistance values are calculated dynamically from real-time vehicular driving data. Second, according to the topology of the current regional road network, all predicted routes are listed using the Dijkstra algorithm. Third, the predicted travel time of each alternative route is calculated, and the route with the least predicted travel time is selected as the optimal route. Finally, we design a test scenario with different traffic saturation levels and collect 150 sets of data to analyze the feasibility of the proposed method. The numerical results show that the average travel times calculated for the proposed optimal route are 8.97 seconds, 12.54 seconds, and 21.85 seconds, which are much shorter than those of traditional navigation routes. The proposed model can be further applied to the whole urban traffic network and contribute to greater transport and land-use efficiency in the future.
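The route selection step can be sketched with Dijkstra's algorithm over time-weighted road sections. This is only an illustration under stated assumptions: the abstract does not give the resistance formula, so `link_time` simply adds a hypothetical signal wait to a hypothetical drive time, and the four-node network and its numbers are invented for the example.

```python
import heapq

def link_time(drive_s, signal_wait_s):
    """Assumed road section resistance: drive time plus expected signal wait (seconds)."""
    return drive_s + signal_wait_s

def fastest_route(graph, origin, destination):
    """Dijkstra's algorithm over predicted section travel times."""
    best = {origin: 0.0}   # least known travel time to each node
    prev = {}              # predecessor on the best known route
    heap = [(0.0, origin)]
    while heap:
        t, node = heapq.heappop(heap)
        if node == destination:
            break
        if t > best.get(node, float("inf")):
            continue  # stale heap entry, a faster route to this node was already found
        for nxt, cost in graph.get(node, {}).items():
            nt = t + cost
            if nt < best.get(nxt, float("inf")):
                best[nxt] = nt
                prev[nxt] = node
                heapq.heappush(heap, (nt, nxt))
    if destination not in best:
        return None, float("inf")
    path = [destination]
    while path[-1] != origin:
        path.append(prev[path[-1]])
    return path[::-1], best[destination]

# Hypothetical network: A -> B -> D is shorter to drive but has longer signal waits.
network = {
    "A": {"B": link_time(20, 15), "C": link_time(30, 0)},
    "B": {"D": link_time(10, 5)},
    "C": {"D": link_time(15, 0)},
    "D": {},
}
route, seconds = fastest_route(network, "A", "D")
```

In a real-time setting the section weights would be recomputed from fresh V2X data before each query, so the selected route can change as signal phases and traffic conditions change.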
