Intelligent Buses in a Loop Service: Emergence of No-Boarding and Holding Strategies

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-18 ◽  
Author(s):  
Vee-Liem Saw ◽  
Luca Vismara ◽  
Lock Yue Chew

We study how N intelligent buses serving a loop of M bus stops learn a no-boarding strategy and a holding strategy by reinforcement learning. The no-boarding and holding strategies emerge from the actions of stay or leave when a bus is at a bus stop and everyone who wishes to alight has done so. A reward that encourages the buses to strive towards a staggered phase difference amongst them whilst picking up passengers allows the reinforcement learning process to converge to an optimal Q-table within a reasonable amount of simulation time. It is remarkable that this emergent behaviour of intelligent buses turns out to minimise the average waiting time of commuters, in various setups where buses move with the same speed or different speeds, during busy as well as lull periods. Cooperative actions are also observed, e.g., the buses learn to unbunch.
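The tabular learning rule behind such stay/leave decisions can be sketched as follows. This is a minimal toy, not the paper's setup: the state is an assumed coarse bucket of the phase difference to the bus ahead, the dynamics are invented one-step drifts, and the reward simply peaks at the staggered target.

```python
import random

# Toy tabular Q-learning sketch of a bus's stay/leave decision.
ACTIONS = ("stay", "leave")
N_BUCKETS = 8            # assumed discretisation of the phase difference
TARGET = N_BUCKETS // 2  # staggered configuration: half a loop apart

def reward(bucket):
    # Encourage a staggered phase difference (peak reward at TARGET).
    return -abs(bucket - TARGET)

def step(bucket, action):
    # Invented dynamics: leaving tends to close the gap, staying to widen it.
    drift = -1 if action == "leave" else 1
    return max(0, min(N_BUCKETS - 1, bucket + drift))

def train(episodes=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_BUCKETS) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(N_BUCKETS)
        for _ in range(20):
            a = rng.choice(ACTIONS) if rng.random() < eps else \
                max(ACTIONS, key=lambda x: q[(s, x)])
            s2 = step(s, a)
            # Standard Q-learning update toward reward plus discounted value.
            q[(s, a)] += alpha * (reward(s2)
                                  + gamma * max(q[(s2, b)] for b in ACTIONS)
                                  - q[(s, a)])
            s = s2
    return q
```

Even this toy converges to the intuitive policy: a bus too close behind its leader learns to hold (stay), and one too far learns to leave promptly.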

2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Duowei Li ◽  
Jianping Wu ◽  
Ming Xu ◽  
Ziheng Wang ◽  
Kezhen Hu

Controlling traffic signals to alleviate increasing traffic pressure has long received public attention. However, existing systems and methodologies for controlling traffic signals are insufficient for addressing the problem. To this end, we build a truly adaptive traffic signal control model in a traffic microsimulator, i.e., “Simulation of Urban Mobility” (SUMO), using modern deep reinforcement learning. The model is based on a deep Q-network algorithm that precisely represents the elements associated with the problem: agents, environments, and actions. The real-time state of traffic, including the number of vehicles and the average speed, at one or more intersections is used as input to the model. To reduce the average waiting time, the agents provide an optimal traffic signal phase and duration to be implemented in both single-intersection and multi-intersection cases. Cooperation between agents enables the model to improve overall performance in a large road network. By testing with data sets for three different traffic conditions, we show that the proposed model outperforms other methods (e.g., the Q-learning method, the longest-queue-first method, and the Webster fixed-timing control method) in all cases. The proposed model reduces both the average waiting time and travel time, and it becomes more advantageous as the traffic environment becomes more complex.
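The agent-side bookkeeping such a DQN controller needs can be sketched as below. The phase names, candidate durations, and state encoding are illustrative assumptions, and a plug-in `q_fn` stands in for the deep Q-network to keep the sketch short.

```python
import itertools
import random
from collections import deque

# Assumed action space: a signal phase paired with a green duration.
PHASES = ("NS_green", "EW_green")
DURATIONS = (10, 20, 30)  # candidate green times in seconds (assumed)
ACTIONS = list(itertools.product(PHASES, DURATIONS))

def encode_state(veh_counts, avg_speeds):
    # Real-time state: per-approach vehicle counts and average speeds.
    return tuple(veh_counts) + tuple(round(v, 1) for v in avg_speeds)

class ReplayBuffer:
    # Experience replay, a standard component of DQN training.
    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)
    def push(self, s, a, r, s2):
        self.buf.append((s, a, r, s2))
    def sample(self, k, rng=random):
        return rng.sample(list(self.buf), k)

def epsilon_greedy(q_fn, state, eps, rng=random):
    # q_fn(state, action) stands in for the deep Q-network's output.
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_fn(state, a))
```

In a full implementation, `q_fn` would be a neural network trained on minibatches drawn from the replay buffer, and the chosen (phase, duration) pair would be applied to the simulated intersection via SUMO's control interface.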


Compiler ◽  
2013 ◽  
Vol 2 (1) ◽  
Author(s):  
Dwi Prasetiyo ◽  
Anton Setiawan Honggowibowo ◽  
Yuliani Indrianingsih

The increasing number of passengers at Trans Jogja bus stops means the existing capacity can no longer accommodate them comfortably. Recurring problems include bus delays, which lengthen passenger waiting times, and a build-up of passengers at stops. As a result, stops can fill to capacity, leaving prospective passengers waiting outside the bus stop. Forecasting is a very important element in decision-making. This study uses stationary and trend forecasting, because the data show no significant change over time apart from swelling in certain periods and returning to normal in others. Time-series methods for forecasting the number of passengers at Trans Jogja stops are applied using exponential smoothing and least-squares calculations. From these calculations, the MAD (Mean Absolute Deviation), or least-squares error, is computed for each method; exponential smoothing yields the forecast with the smaller error. Forecasting is better when it contains fewer errors.
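The two forecasting methods and the MAD score used to compare them can be sketched as follows; the smoothing constant and any series values are illustrative, not the study's data.

```python
def exp_smoothing(series, alpha=0.3):
    # One-step-ahead forecasts: F[t+1] = alpha*X[t] + (1-alpha)*F[t].
    f = [series[0]]
    for x in series[:-1]:
        f.append(alpha * x + (1 - alpha) * f[-1])
    return f

def least_squares(series):
    # Fit the trend line X[t] = a + b*t and return the fitted values.
    n = len(series)
    t = list(range(n))
    tbar = sum(t) / n
    xbar = sum(series) / n
    b = sum((ti - tbar) * (xi - xbar) for ti, xi in zip(t, series)) / \
        sum((ti - tbar) ** 2 for ti in t)
    a = xbar - b * tbar
    return [a + b * ti for ti in t]

def mad(actual, forecast):
    # Mean Absolute Deviation: the error measure used to pick the method.
    return sum(abs(x - f) for x, f in zip(actual, forecast)) / len(actual)
```

Whichever method yields the smaller MAD on the historical series is the one selected for forecasting.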


2021 ◽  
Vol 13 (04) ◽  
pp. 01-19
Author(s):  
Chantakarn Pholpol ◽  
Teerapat Sanguankotchakorn

In recent years, a new wireless network called the vehicular ad-hoc network (VANET) has become a popular research topic. VANET allows vehicles to communicate with each other and with roadside units by sharing information such as vehicle velocity, location, and direction. In general, when many vehicles are likely to use a common route to the same destination, that route can become congested and should be avoided. It would be better if vehicles could accurately predict traffic congestion and then avoid it. Therefore, this work proposes deep reinforcement learning in VANET to enhance the ability to predict traffic congestion on the roads. Furthermore, different types of neural networks, namely the Convolutional Neural Network (CNN), Multilayer Perceptron (MLP), and Long Short-Term Memory (LSTM), are investigated and compared within this deep reinforcement learning model to discover the most effective one. The proposed method is tested by simulation. Traffic scenarios are created using the traffic simulator Simulation of Urban Mobility (SUMO) before being integrated with the deep reinforcement learning model. The simulation procedures, as well as the programming used, are described in detail. The performance of the proposed method is evaluated using two metrics: the average travelling time delay and the average waiting time delay of vehicles. According to the simulation results, both metrics gradually improve over multiple runs, since the proposed method receives feedback from the environment. In addition, the results without and with three different deep learning algorithms (CNN, MLP, and LSTM) are compared. The deep reinforcement learning model works effectively when traffic density is neither too high nor too low.
It can be concluded that the effective algorithms for traffic congestion prediction, in descending order, are MLP, CNN, and LSTM.
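The two evaluation metrics reported above can be sketched as simple per-vehicle averages. The field names are assumptions; SUMO exposes comparable per-vehicle trip statistics in its trip output.

```python
def avg_travel_delay(trips):
    # trips: dicts with actual and free-flow travel times (seconds, assumed keys).
    return sum(t["travel_time"] - t["free_flow_time"] for t in trips) / len(trips)

def avg_waiting_delay(trips):
    # Waiting time: total time a vehicle spends at (near) standstill during its trip.
    return sum(t["waiting_time"] for t in trips) / len(trips)
```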


Author(s):  
Hu Zhao ◽  
Shumin Feng ◽  
Yusheng Ci

Sudden passenger demand at a bus stop can lead to numerous passengers gathering at the stop, which can affect bus system operation. Bus system operators often deal with this problem by adopting peer-to-peer service, where empty buses are added to the fleet and dispatched directly to the stop where passengers are gathered (PG-stop). However, with this strategy, passengers at the PG-stop have a long waiting time to board a bus. Thus, this paper proposes a novel mathematical programming model to reduce passenger waiting time at a bus stop. A more complete stop-skipping model that includes four cases of passenger waiting time at bus stops is proposed. The stop-skipping decision and fleet size are modeled as a dynamic program to obtain the optimal strategy that minimizes passenger waiting time, and the optimization model is solved with an improved ant colony algorithm. The proposed strategy was implemented on a bus line in Harbin, China. The results show that, during the evacuation, the stop-skipping strategy not only reduced the total waiting time for passengers but also decreased the proportion of passengers with a long waiting time (>6 min) at the stops. Compared with the habitual and peer-to-peer service strategies, the total waiting time is reduced by 31% and 23%, respectively. Additionally, the proportion of passengers with longer waiting times dropped to 43.19% under the stop-skipping strategy, compared with 72.68% under the habitual strategy and 47.5% under the peer-to-peer service strategy.
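The objective such an optimizer evaluates for each candidate skip pattern can be sketched with a toy waiting-time model. The rates, headway, and waiting assumptions below are illustrative stand-ins, not the paper's four-case formulation or its ant colony solver.

```python
def total_waiting(arrival_rates, headway, skip):
    # Toy model: passengers at a served stop wait ~headway/2 on average;
    # at a skipped stop they wait an extra full headway for the next bus.
    total = 0.0
    for rate, skipped in zip(arrival_rates, skip):
        wait = headway / 2 + (headway if skipped else 0)
        total += rate * headway * wait  # passengers arriving per headway
    return total
```

A search procedure (dynamic programming, ant colony, or otherwise) would then look for the skip vector minimizing this total, subject to constraints such as fleet size.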


2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or distribution mismatch, when the learner’s goal deviates from the demonstrated behaviors. Besides, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the Mujoco domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks in terms of exploration efficiency and average scores.
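The two selection mechanisms described, querying the demonstrator only in uncertain regions and prioritizing goals by expert-policy disagreement, can be sketched as below. The disagreement scores and uncertainty threshold are illustrative inputs; in the paper they would be derived from the expert's and policy's behaviors.

```python
import random

def sample_goal(goals, disagreement, rng=random):
    # Roulette-wheel selection: goals with larger expert/policy disagreement
    # are sampled proportionally more often.
    total = sum(disagreement[g] for g in goals)
    r = rng.random() * total
    acc = 0.0
    for g in goals:
        acc += disagreement[g]
        if r < acc:
            return g
    return goals[-1]

def should_query_expert(uncertainty, threshold=0.5):
    # Active demonstrations: ask the demonstrator only where the learner
    # is uncertain (threshold is an assumed tuning parameter).
    return uncertainty > threshold
```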


Author(s):  
Chao Wang ◽  
Weijie Chen ◽  
Yueru Xu ◽  
Zhirui Ye

Bus stop capacity is a critical factor influencing bus service quality and line capacity. This paper proposes a bus capacity estimation method incorporating diffusion approximation and queuing theory for individual bus stops. A concurrent queuing system between public transportation vehicles and passengers can be used to describe the scenario of a bus stop. For most queuing systems, the explicit distributions of basic characteristics (e.g., waiting time, queue length, and busy period) are difficult to obtain. Therefore, the diffusion approximation method was introduced to deal with this theoretical gap: a continuous diffusion process was applied to approximate the discrete queuing process. The proposed model was validated using relevant data from seven bus stops. As a comparison, two common methods, the Highway Capacity Manual (HCM) formula and the M/M/S queuing model (i.e., Poisson arrivals, exponentially distributed bus service time, and S berths), were used to estimate the capacity of the bus stop. The mean absolute percentage error (MAPE) of the diffusion approximation method is 7.12%, while the MAPEs of the HCM method and M/M/S queuing model are 16.53% and 10.23%, respectively. Therefore, the proposed model is more accurate and reliable than the others. In addition, the influences of traffic intensity, bus arrival rate, coefficient of variation of bus arrival headway, service time, coefficient of variation of service time, and the number of bus berths on the capacity of bus stops are explored by sensitivity analyses.
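One of the comparison baselines, the HCM-style loading-area capacity formula, can be sketched as follows. To the best of our understanding it takes the form B = 3600·(g/C) / (t_c + t_d·(g/C) + Z·c_v·t_d) buses per hour; the parameter values in the test are assumed, and readers should consult the manual for the authoritative form.

```python
def hcm_loading_area_capacity(g_over_c, t_c, t_d, c_v, z):
    # g_over_c: effective green ratio at the stop's controlling signal
    # t_c:      clearance time between successive buses (s)
    # t_d:      mean dwell time (s); c_v: its coefficient of variation
    # z:        standard normal value for the design failure rate
    return 3600 * g_over_c / (t_c + t_d * g_over_c + z * c_v * t_d)
```

The diffusion-approximation and M/M/S approaches replace this closed form with queueing-based estimates, which is where the reported MAPE differences come from.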


2015 ◽  
Vol 25 (3) ◽  
pp. 471-482 ◽  
Author(s):  
Bartłomiej Śnieżyński

In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied to strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of various complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge may be analyzed by humans, which allows tracking of the learning process.
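The classification-as-strategy idea can be sketched minimally: log (state, action) pairs from episodes that ended well, even when the reward arrives only at the end, and classify new states to pick actions. The 1-nearest-neighbour classifier and the toy episodes are illustrative choices, not the paper's farmer-pest setup.

```python
def train_examples(episodes):
    # episodes: list of ([(state, action), ...], episode_return) pairs.
    # Keep only examples from episodes whose (possibly delayed) return
    # was positive, labelling each state with the action taken there.
    return [(s, a) for states_actions, ret in episodes if ret > 0
            for s, a in states_actions]

def classify(examples, state):
    # 1-NN by squared Euclidean distance over numeric state vectors:
    # act as the nearest successful example did.
    def dist(s):
        return sum((x - y) ** 2 for x, y in zip(s, state))
    return min(examples, key=lambda ex: dist(ex[0]))[1]
```

A human can inspect the retained examples directly, which is one way the learned knowledge stays analyzable, as the abstract emphasizes.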

