Intelligent Buses in a Loop Service: Emergence of No-Boarding and Holding Strategies

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-18 ◽  
Author(s):  
Vee-Liem Saw ◽  
Luca Vismara ◽  
Lock Yue Chew

We study how N intelligent buses serving a loop of M bus stops learn a no-boarding strategy and a holding strategy by reinforcement learning. The no-boarding and holding strategies emerge from the actions of stay or leave when a bus is at a bus stop and everyone who wishes to alight has done so. A reward that encourages the buses to strive towards a staggered phase difference amongst them whilst picking up passengers allows the reinforcement learning process to converge to an optimal Q-table within a reasonable amount of simulation time. It is remarkable that this emergent behaviour of intelligent buses turns out to minimise the average waiting time of commuters, in various setups where buses move with the same speed or different speeds, during busy as well as lull periods. Cooperative actions are also observed, e.g., the buses learn to unbunch.
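The tabular learning rule behind such stay/leave decisions can be sketched as follows. This is a minimal toy, not the paper's setup: the state is an assumed coarse bucket of the phase difference to the bus ahead, the dynamics are invented one-step drifts, and the reward simply peaks at the staggered target.

```python
import random

# Toy tabular Q-learning sketch of a bus's stay/leave decision.
ACTIONS = ("stay", "leave")
N_BUCKETS = 8            # assumed discretisation of the phase difference
TARGET = N_BUCKETS // 2  # staggered configuration: half a loop apart

def reward(bucket):
    # Encourage a staggered phase difference (peak reward at TARGET).
    return -abs(bucket - TARGET)

def step(bucket, action):
    # Invented dynamics: leaving tends to close the gap, staying to widen it.
    drift = -1 if action == "leave" else 1
    return max(0, min(N_BUCKETS - 1, bucket + drift))

def train(episodes=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_BUCKETS) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(N_BUCKETS)
        for _ in range(20):
            a = rng.choice(ACTIONS) if rng.random() < eps else \
                max(ACTIONS, key=lambda x: q[(s, x)])
            s2 = step(s, a)
            # Standard Q-learning update toward reward plus discounted value.
            q[(s, a)] += alpha * (reward(s2)
                                  + gamma * max(q[(s2, b)] for b in ACTIONS)
                                  - q[(s, a)])
            s = s2
    return q
```

Even this toy converges to the intuitive policy: a bus too close behind its leader learns to hold (stay), and one too far learns to leave promptly.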

2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Duowei Li ◽  
Jianping Wu ◽  
Ming Xu ◽  
Ziheng Wang ◽  
Kezhen Hu

Controlling traffic signals to alleviate increasing traffic pressure has long received public attention. However, existing systems and methodologies for controlling traffic signals are insufficient for addressing the problem. To this end, we build a truly adaptive traffic signal control model in a traffic microsimulator, i.e., “Simulation of Urban Mobility” (SUMO), using modern deep reinforcement learning. The model is based on a deep Q-network algorithm that precisely represents the elements associated with the problem: agents, environments, and actions. The real-time state of traffic, including the number of vehicles and the average speed, at one or more intersections is used as input to the model. To reduce the average waiting time, the agents provide an optimal traffic signal phase and duration to be implemented in both single-intersection and multi-intersection cases. Cooperation between agents enables the model to improve overall performance in a large road network. By testing with data sets for three different traffic conditions, we show that the proposed model outperforms other methods (e.g., the Q-learning method, the longest-queue-first method, and the Webster fixed-timing control method) in all cases. The proposed model reduces both the average waiting time and travel time, and it becomes more advantageous as the traffic environment becomes more complex.
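The agent-side bookkeeping such a DQN controller needs can be sketched as below. The phase names, candidate durations, and state encoding are illustrative assumptions, and a plug-in `q_fn` stands in for the deep Q-network to keep the sketch short.

```python
import itertools
import random
from collections import deque

# Assumed action space: a signal phase paired with a green duration.
PHASES = ("NS_green", "EW_green")
DURATIONS = (10, 20, 30)  # candidate green times in seconds (assumed)
ACTIONS = list(itertools.product(PHASES, DURATIONS))

def encode_state(veh_counts, avg_speeds):
    # Real-time state: per-approach vehicle counts and average speeds.
    return tuple(veh_counts) + tuple(round(v, 1) for v in avg_speeds)

class ReplayBuffer:
    # Experience replay, a standard component of DQN training.
    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)
    def push(self, s, a, r, s2):
        self.buf.append((s, a, r, s2))
    def sample(self, k, rng=random):
        return rng.sample(list(self.buf), k)

def epsilon_greedy(q_fn, state, eps, rng=random):
    # q_fn(state, action) stands in for the deep Q-network's output.
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_fn(state, a))
```

In a full implementation, `q_fn` would be a neural network trained on minibatches drawn from the replay buffer, and the chosen (phase, duration) pair would be applied to the simulated intersection via SUMO's control interface.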


Compiler ◽  
2013 ◽  
Vol 2 (1) ◽  
Author(s):  
Dwi Prasetiyo ◽  
Anton Setiawan Honggowibowo ◽  
Yuliani Indrianingsih

The increasing number of passengers at Trans Jogja bus stops means the existing capacity can no longer accommodate them comfortably. Recurring problems include bus delays, which lengthen passenger waiting times, and a build-up of passengers at stops. As a result, stops can fill to capacity, leaving prospective passengers waiting outside the bus stop. Forecasting is a very important element in decision-making. This study uses stationary and trend forecasting, because the data show no significant change over time apart from swelling in certain periods and returning to normal in others. Time-series methods for forecasting the number of passengers at Trans Jogja stops are applied using exponential smoothing and least-squares calculations. From these calculations, the MAD (Mean Absolute Deviation), or least-squares error, is computed for each method; exponential smoothing yields the forecast with the smaller error. Forecasting is better when it contains fewer errors.
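The two forecasting methods and the MAD score used to compare them can be sketched as follows; the smoothing constant and any series values are illustrative, not the study's data.

```python
def exp_smoothing(series, alpha=0.3):
    # One-step-ahead forecasts: F[t+1] = alpha*X[t] + (1-alpha)*F[t].
    f = [series[0]]
    for x in series[:-1]:
        f.append(alpha * x + (1 - alpha) * f[-1])
    return f

def least_squares(series):
    # Fit the trend line X[t] = a + b*t and return the fitted values.
    n = len(series)
    t = list(range(n))
    tbar = sum(t) / n
    xbar = sum(series) / n
    b = sum((ti - tbar) * (xi - xbar) for ti, xi in zip(t, series)) / \
        sum((ti - tbar) ** 2 for ti in t)
    a = xbar - b * tbar
    return [a + b * ti for ti in t]

def mad(actual, forecast):
    # Mean Absolute Deviation: the error measure used to pick the method.
    return sum(abs(x - f) for x, f in zip(actual, forecast)) / len(actual)
```

Whichever method yields the smaller MAD on the historical series is the one selected for forecasting.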


2021 ◽  
Vol 13 (04) ◽  
pp. 01-19
Author(s):  
Chantakarn Pholpol ◽  
Teerapat Sanguankotchakorn

In recent years, a new wireless network called the vehicular ad-hoc network (VANET) has become a popular research topic. VANET allows vehicles to communicate with each other and with roadside units by sharing information such as vehicle velocity, location, and direction. In general, when many vehicles are likely to use a common route to the same destination, that route can become congested and should be avoided. It would be better if vehicles could accurately predict traffic congestion and then avoid it. Therefore, this work proposes deep reinforcement learning in VANET to enhance the ability to predict traffic congestion on the roads. Furthermore, different types of neural networks, namely the Convolutional Neural Network (CNN), Multilayer Perceptron (MLP), and Long Short-Term Memory (LSTM), are investigated and compared within this deep reinforcement learning model to discover the most effective one. The proposed method is tested by simulation. Traffic scenarios are created using the traffic simulator Simulation of Urban Mobility (SUMO) before being integrated with the deep reinforcement learning model. The simulation procedures, as well as the programming used, are described in detail. The performance of the proposed method is evaluated using two metrics: the average travelling time delay and the average waiting time delay of vehicles. According to the simulation results, both metrics gradually improve over multiple runs, since the proposed method receives feedback from the environment. In addition, the results without and with three different deep learning algorithms (CNN, MLP, and LSTM) are compared. The deep reinforcement learning model works effectively when traffic density is neither too high nor too low.
It can be concluded that the effective algorithms for traffic congestion prediction, in descending order, are MLP, CNN, and LSTM.
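The two evaluation metrics reported above can be sketched as simple per-vehicle averages. The field names are assumptions; SUMO exposes comparable per-vehicle trip statistics in its trip output.

```python
def avg_travel_delay(trips):
    # trips: dicts with actual and free-flow travel times (seconds, assumed keys).
    return sum(t["travel_time"] - t["free_flow_time"] for t in trips) / len(trips)

def avg_waiting_delay(trips):
    # Waiting time: total time a vehicle spends at (near) standstill during its trip.
    return sum(t["waiting_time"] for t in trips) / len(trips)
```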


Author(s):  
Hu Zhao ◽  
Shumin Feng ◽  
Yusheng Ci

Sudden passenger demand at a bus stop can lead to numerous passengers gathering at the stop, which can affect bus system operation. Bus system operators often deal with this problem by adopting peer-to-peer service, where empty buses are added to the fleet and dispatched directly to the stop where passengers are gathered (PG-stop). However, with this strategy, passengers at the PG-stop have a long waiting time to board a bus. Thus, this paper proposes a novel mathematical programming model to reduce passenger waiting time at a bus stop. A more complete stop-skipping model that includes four cases of passenger waiting time at bus stops is proposed. The stop-skipping decision and fleet size are modeled as a dynamic program to obtain the optimal strategy that minimizes passenger waiting time, and the optimization model is solved with an improved ant colony algorithm. The proposed strategy was implemented on a bus line in Harbin, China. The results show that, during the evacuation, the stop-skipping strategy not only reduced the total waiting time for passengers but also decreased the proportion of passengers with a long waiting time (>6 min) at the stops. Compared with the habitual and peer-to-peer service strategies, the total waiting time is reduced by 31% and 23%, respectively. Additionally, the proportion of passengers with longer waiting times dropped to 43.19% under the stop-skipping strategy, compared with 72.68% under the habitual strategy and 47.5% under the peer-to-peer service strategy.
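The objective such an optimizer evaluates for each candidate skip pattern can be sketched with a toy waiting-time model. The rates, headway, and waiting assumptions below are illustrative stand-ins, not the paper's four-case formulation or its ant colony solver.

```python
def total_waiting(arrival_rates, headway, skip):
    # Toy model: passengers at a served stop wait ~headway/2 on average;
    # at a skipped stop they wait an extra full headway for the next bus.
    total = 0.0
    for rate, skipped in zip(arrival_rates, skip):
        wait = headway / 2 + (headway if skipped else 0)
        total += rate * headway * wait  # passengers arriving per headway
    return total
```

A search procedure (dynamic programming, ant colony, or otherwise) would then look for the skip vector minimizing this total, subject to constraints such as fleet size.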


2021 ◽  
Vol 35 (2) ◽  
Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or distribution mismatch, when the learner’s goal deviates from the demonstrated behaviors. Besides, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the Mujoco domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks in terms of exploration efficiency and average scores.
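The two selection mechanisms described, querying the demonstrator only in uncertain regions and prioritizing goals by expert-policy disagreement, can be sketched as below. The disagreement scores and uncertainty threshold are illustrative inputs; in the paper they would be derived from the expert's and policy's behaviors.

```python
import random

def sample_goal(goals, disagreement, rng=random):
    # Roulette-wheel selection: goals with larger expert/policy disagreement
    # are sampled proportionally more often.
    total = sum(disagreement[g] for g in goals)
    r = rng.random() * total
    acc = 0.0
    for g in goals:
        acc += disagreement[g]
        if r < acc:
            return g
    return goals[-1]

def should_query_expert(uncertainty, threshold=0.5):
    # Active demonstrations: ask the demonstrator only where the learner
    # is uncertain (threshold is an assumed tuning parameter).
    return uncertainty > threshold
```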


Author(s):  
Chao Wang ◽  
Weijie Chen ◽  
Yueru Xu ◽  
Zhirui Ye

Bus stop capacity is a critical factor influencing bus service quality and line capacity. This paper proposes a bus capacity estimation method incorporating diffusion approximation and queuing theory for individual bus stops. A concurrent queuing system between public transportation vehicles and passengers can be used to describe the scenario of a bus stop. For most queuing systems, the explicit distributions of basic characteristics (e.g., waiting time, queue length, and busy period) are difficult to obtain. Therefore, the diffusion approximation method was introduced to deal with this theoretical gap: a continuous diffusion process was applied to approximate the discrete queuing process. The proposed model was validated using relevant data from seven bus stops. As a comparison, two common methods, the Highway Capacity Manual (HCM) formula and the M/M/S queuing model (i.e., Poisson arrivals, exponentially distributed bus service time, and S berths), were used to estimate the capacity of the bus stop. The mean absolute percentage error (MAPE) of the diffusion approximation method is 7.12%, while the MAPEs of the HCM method and M/M/S queuing model are 16.53% and 10.23%, respectively. Therefore, the proposed model is more accurate and reliable than the others. In addition, the influences of traffic intensity, bus arrival rate, coefficient of variation of bus arrival headway, service time, coefficient of variation of service time, and the number of bus berths on the capacity of bus stops are explored by sensitivity analyses.
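One of the comparison baselines, the HCM-style loading-area capacity formula, can be sketched as follows. To the best of our understanding it takes the form B = 3600·(g/C) / (t_c + t_d·(g/C) + Z·c_v·t_d) buses per hour; the parameter values in the test are assumed, and readers should consult the manual for the authoritative form.

```python
def hcm_loading_area_capacity(g_over_c, t_c, t_d, c_v, z):
    # g_over_c: effective green ratio at the stop's controlling signal
    # t_c:      clearance time between successive buses (s)
    # t_d:      mean dwell time (s); c_v: its coefficient of variation
    # z:        standard normal value for the design failure rate
    return 3600 * g_over_c / (t_c + t_d * g_over_c + z * c_v * t_d)
```

The diffusion-approximation and M/M/S approaches replace this closed form with queueing-based estimates, which is where the reported MAPE differences come from.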


2015 ◽  
Vol 25 (3) ◽  
pp. 471-482 ◽  
Author(s):  
Bartłomiej Śnieżyński

In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied to strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of various complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge may be analyzed by humans, which allows tracking of the learning process.
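The classification-as-strategy idea can be sketched minimally: log (state, action) pairs from episodes that ended well, even when the reward arrives only at the end, and classify new states to pick actions. The 1-nearest-neighbour classifier and the toy episodes are illustrative choices, not the paper's farmer-pest setup.

```python
def train_examples(episodes):
    # episodes: list of ([(state, action), ...], episode_return) pairs.
    # Keep only examples from episodes whose (possibly delayed) return
    # was positive, labelling each state with the action taken there.
    return [(s, a) for states_actions, ret in episodes if ret > 0
            for s, a in states_actions]

def classify(examples, state):
    # 1-NN by squared Euclidean distance over numeric state vectors:
    # act as the nearest successful example did.
    def dist(s):
        return sum((x - y) ** 2 for x, y in zip(s, state))
    return min(examples, key=lambda ex: dist(ex[0]))[1]
```

A human can inspect the retained examples directly, which is one way the learned knowledge stays analyzable, as the abstract emphasizes.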

