Demand-responsive rebalancing zone generation for reinforcement learning-based on-demand mobility

2021 ◽  
Vol 34 (1) ◽  
pp. 73-88
Author(s):  
Alberto Castagna ◽  
Maxime Guériau ◽  
Giuseppe Vizzari ◽  
Ivana Dusparic

Enabling Ride-sharing (RS) in Mobility-on-Demand (MoD) systems allows a reduction in vehicle fleet size while preserving the level of service. This, however, requires an efficient vehicle-to-request assignment and a vehicle rebalancing strategy that counteracts the uneven geographical spread of demand by relocating unoccupied vehicles to areas of higher demand. Existing research into rebalancing generally divides the coverage area into predefined geographical zones. This division is done statically, at design time, impeding adaptivity to evolving demand patterns. To enable more accurate dynamic rebalancing, this paper proposes a Dynamic Demand-Responsive Rebalancer (D2R2) for RS systems. D2R2 uses the Expectation-Maximization (EM) technique to recalculate zones at each decision step based on current demand. We integrate D2R2 with a deep reinforcement learning multi-agent MoD system consisting of 200 vehicles serving 10,000 trips from the New York taxi dataset. Results show a fairer workload division across the fleet when compared to static predefined equiprobable zones.
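
The paper's core idea is to recompute zones with EM at every decision step rather than fixing them at design time. A minimal sketch of that step, assuming scikit-learn's GaussianMixture as the EM implementation (the function names, zone count, and demand format are illustrative, not from the paper):

```python
# Sketch: recompute rebalancing zones from current demand via EM.
import numpy as np
from sklearn.mixture import GaussianMixture

def recompute_zones(request_origins: np.ndarray, n_zones: int = 8) -> GaussianMixture:
    """Fit a Gaussian mixture to current request origins (lat/lon pairs);
    each fitted component acts as one demand-responsive zone."""
    gmm = GaussianMixture(n_components=n_zones, covariance_type="full",
                          random_state=0)
    gmm.fit(request_origins)
    return gmm

def assign_vehicle_to_zone(gmm: GaussianMixture, vehicle_pos: np.ndarray) -> int:
    """Map an idle vehicle to the most likely zone under current demand."""
    return int(gmm.predict(vehicle_pos.reshape(1, -1))[0])

# Toy usage: 500 synthetic request origins around Manhattan, one idle vehicle.
rng = np.random.default_rng(0)
origins = rng.normal(loc=[40.75, -73.98], scale=0.02, size=(500, 2))
zones = recompute_zones(origins)
print(assign_vehicle_to_zone(zones, np.array([40.76, -73.97])))
```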

Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2789 ◽  
Author(s):  
Hang Qi ◽  
Hao Huang ◽  
Zhiqun Hu ◽  
Xiangming Wen ◽  
Zhaoming Lu

In order to meet the ever-increasing traffic demand of Wireless Local Area Networks (WLANs), channel bonding is introduced in the IEEE 802.11 standards. Although channel bonding effectively increases the transmission rate, the wider channel reduces the number of non-overlapping channels and is more susceptible to interference. Meanwhile, the traffic load differs from one access point (AP) to another and changes significantly depending on the time of day. Therefore, the primary channel and channel bonding bandwidth should be carefully selected to meet traffic demand and guarantee the performance gain. In this paper, we propose an On-Demand Channel Bonding (O-DCB) algorithm based on Deep Reinforcement Learning (DRL) for heterogeneous WLANs, in which the APs have different channel bonding capabilities, to reduce transmission delay. In this problem, the state space is continuous and the action space is discrete. However, with single-agent DRL the size of the action space increases exponentially with the number of APs, which severely slows learning. To accelerate learning, Multi-Agent Deep Deterministic Policy Gradient (MADDPG) is used to train O-DCB. Real traffic traces collected from a campus WLAN are used to train and test O-DCB. Simulation results reveal that the proposed algorithm converges well and achieves lower delay than the baseline algorithms.
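
The scaling argument in the abstract is easy to make concrete: a centralized (single-agent) controller must pick one joint action over all APs, while MADDPG gives each AP its own actor. A back-of-the-envelope sketch, with illustrative channel and bandwidth sets that are not taken from the paper:

```python
# Sketch: per-AP vs. joint action-space size for channel bonding.
from itertools import product

primary_channels = [36, 40, 44, 48]   # illustrative 5 GHz channels
bond_widths_mhz = [20, 40, 80]        # illustrative per-AP capabilities

per_ap_actions = list(product(primary_channels, bond_widths_mhz))
print(len(per_ap_actions))            # 12 actions per AP

for n_aps in (2, 4, 8):
    # Single-agent DRL: one policy over 12 ** n_aps joint actions.
    # MADDPG: n_aps actors, each choosing from only 12 actions.
    print(n_aps, len(per_ap_actions) ** n_aps)
```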


Author(s):  
Akkhachai Phuphanin ◽  
Wipawee Usaha

Coverage control is crucial for the deployment of wireless sensor networks (WSNs). However, most coverage control schemes are based on single-objective optimization, such as coverage area only, and do not consider other, contradicting objectives such as energy consumption, the number of working nodes, and wasteful overlapping areas. This paper proposes a Multi-Objective Optimization (MOO) coverage control scheme called Scalarized Q Multi-Objective Reinforcement Learning (SQMORL). The two objectives are to maximize area coverage and to minimize the overlapping area so as to reduce energy consumption. Performance evaluation is conducted in both simulation and multi-agent lighting control testbed experiments. Simulation results show that SQMORL can obtain more efficient area coverage with fewer working nodes than other existing schemes. The hardware testbed results show that the SQMORL algorithm can find the optimal policy with good accuracy across repeated runs.
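
SQMORL's name suggests linear scalarization of a vector-valued reward inside a standard Q-learning update. A minimal tabular sketch of that pattern, assuming fixed objective weights and a two-element reward vector (coverage gain, negated overlap area); all names and hyperparameters are illustrative, not the paper's:

```python
# Sketch: linearly scalarized Q-learning over two objectives.
import numpy as np
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9
W = np.array([0.7, 0.3])  # assumed weights: coverage vs. overlap

# One Q-vector per (state, action): Q[s][a] is a 2-element array.
Q = defaultdict(lambda: defaultdict(lambda: np.zeros(2)))

def scalarized_value(state, action) -> float:
    return float(W @ Q[state][action])

def update(state, action, rewards, next_state, next_actions):
    """rewards = (coverage_gain, -overlap_area) observed this step."""
    best_next = max(next_actions, key=lambda a: scalarized_value(next_state, a))
    target = np.asarray(rewards, dtype=float) + GAMMA * Q[next_state][best_next]
    Q[state][action] += ALPHA * (target - Q[state][action])
```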


Author(s):  
Ali Boyali ◽  
Naohisa Hashimoto ◽  
Vijay John ◽  
Tankut Acarman

Author(s):  
Jiajie Dai ◽  
Qianyu Zhu ◽  
Nan Jiang ◽  
Wuyang Wang

The shared autonomous mobility-on-demand (AMoD) system is a promising business model that provides a more efficient and affordable urban travel mode. However, to maintain efficient operation of AMoD and address the mismatch between demand and supply, a good rebalancing strategy is required. This paper proposes a reinforcement learning-based rebalancing strategy to minimize passengers' waiting time in a shared AMoD system. The state is defined as the nearby supply and demand information of a vehicle. The action is defined as moving to a nearby area in one of eight directions or staying idle. A 4.6 × 4.4 km² region in Cambridge, Massachusetts, is used as the case study. We trained and tested the rebalancing strategy on two different demand patterns: random and first-mile. Results show the proposed method can reduce passengers' waiting time by 7% for random demand patterns and 10% for first-mile demand patterns.
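
The action space described here (eight compass moves plus staying idle) is concrete enough to sketch. A minimal grid version, where the cell coordinates, grid shape, and clamping behaviour are assumptions for illustration:

```python
# Sketch: the eight-direction-plus-idle rebalancing action space.
import numpy as np

ACTIONS = {
    0: (0, 0),    # stay idle
    1: (0, 1),    # north
    2: (1, 1),    # north-east
    3: (1, 0),    # east
    4: (1, -1),   # south-east
    5: (0, -1),   # south
    6: (-1, -1),  # south-west
    7: (-1, 0),   # west
    8: (-1, 1),   # north-west
}

def step_vehicle(cell, action, grid_shape):
    """Apply a rebalancing move, clamped to the service-area grid."""
    dx, dy = ACTIONS[action]
    x = int(np.clip(cell[0] + dx, 0, grid_shape[0] - 1))
    y = int(np.clip(cell[1] + dy, 0, grid_shape[1] - 1))
    return (x, y)

print(step_vehicle((0, 0), 2, (10, 10)))  # -> (1, 1)
```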


2021 ◽  
Vol 3 ◽  
Author(s):  
Salomón Wollenstein-Betech ◽  
Ioannis Ch. Paschalidis ◽  
Christos G. Cassandras

The emergence of the sharing economy in urban transportation networks has enabled new fast, convenient and accessible mobility services referred to as Mobility-on-Demand systems (e.g., Uber, Lyft, DiDi). These platforms have flourished around the globe in the last decade and face many operational challenges in order to be competitive and provide good quality of service. A crucial step in the effective operation of these systems is to reduce customers' waiting time while properly selecting the optimal fleet size and pricing policy. In this paper, we jointly tackle three operational decisions: (i) fleet size, (ii) pricing, and (iii) rebalancing, in order to maximize the platform's profit or its customers' welfare. To accomplish this, we first devise an optimization framework which gives rise to a static policy. Then, we elaborate and propose dynamic policies that are more responsive to perturbations such as unexpected increases in demand. We test this framework in a simulation environment using three case studies, leveraging traffic flow and taxi data from Eastern Massachusetts, New York City, and Chicago. Our results show that solving the problem jointly could increase profits by between 1% and 50%, depending on the benchmark. Moreover, we observe that the proposed fleet size yields a vehicle utilization of around 75%, compared to a private vehicle utilization of 5%.
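
To see why fleet size and pricing interact, consider a deliberately toy static version of the joint decision: price moves demand, while fleet size caps how much of it can be served. The linear demand model, cost figures, and bounds below are assumptions for illustration, not the paper's network optimization:

```python
# Sketch: toy joint fleet-size/pricing profit maximization.
from scipy.optimize import minimize

BASE_DEMAND, PRICE_ELASTICITY = 1000.0, 60.0    # trips/h, trips per $/trip
COST_PER_VEHICLE, TRIPS_PER_VEHICLE = 8.0, 2.5  # $/h, trips/h per vehicle

def negative_profit(x):
    price, fleet = x
    demand = max(BASE_DEMAND - PRICE_ELASTICITY * price, 0.0)
    served = min(demand, TRIPS_PER_VEHICLE * fleet)  # capacity-limited
    return -(price * served - COST_PER_VEHICLE * fleet)

res = minimize(negative_profit, x0=[8.0, 300.0],
               bounds=[(1.0, 15.0), (10.0, 2000.0)], method="Nelder-Mead")
price, fleet = res.x
print(f"price ~ ${price:.2f}/trip, fleet ~ {fleet:.0f} vehicles")
```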


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Sung-Jung Wang ◽  
S. K. Jason Chang

Autonomous buses are becoming increasingly popular and have been widely developed in many countries. However, autonomous buses must learn to navigate the city efficiently to be integrated into public transport systems. Efficient operation of these buses can be achieved by intelligent agents through reinforcement learning. In this study, we investigate the autonomous bus fleet control problem, which appears noisy to the agents owing to random arrivals and incomplete observation of the environment. We propose a multi-agent reinforcement learning method combined with an advanced policy gradient algorithm for this large-scale dynamic optimization problem. An agent-based simulation platform was developed to model the dynamic system of a fixed stop/station loop route, an autonomous bus fleet, and passengers. This platform was also applied to assess the performance of the proposed algorithm. The experimental results indicate that the developed algorithm outperforms other reinforcement learning methods in the multi-agent domain. The simulation results also reveal that our proposed algorithm outperforms the existing scheduled bus system in terms of bus fleet size and passenger wait times on bus routes with comparatively fewer passengers.
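
As one simple member of the policy-gradient family the study builds on, a REINFORCE-style update for a bus-holding policy can be sketched in a few lines. The two-action space (hold vs. dispatch) and the feature vector are assumptions for illustration, not the paper's formulation:

```python
# Sketch: REINFORCE update for a logistic hold/dispatch policy.
import numpy as np

theta = np.zeros(3)  # weights over [bias, queue_length, headway]
LR = 0.01

def dispatch_probability(features: np.ndarray) -> float:
    """P(dispatch now | state) under a logistic policy."""
    return 1.0 / (1.0 + np.exp(-features @ theta))

def reinforce_update(episode):
    """episode: list of (features, action in {0,1}, return-to-go)."""
    global theta
    for feats, action, g in episode:
        p = dispatch_probability(feats)
        grad_log_pi = (action - p) * feats  # d/dtheta log pi(a|s)
        theta += LR * g * grad_log_pi
```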


2020 ◽  
Vol 10 (15) ◽  
pp. 5335
Author(s):  
Kathleen Keogh ◽  
Liz Sonenberg

We address the challenge of multi-agent system (MAS) design for organisations of agents acting in dynamic and uncertain environments, where runtime flexibility is required to enable improvisation through sharing knowledge and adapting behaviour. We identify behavioural features that correspond to runtime improvisation by agents in a MAS organisation and, from this analysis, describe the OJAzzIC meta-model and an associated design method. We present results from simulation scenarios, varying both problem complexity and the level of organisational support provided in the design, to show that increasing design-time guidance in the organisation specification can enable runtime flexibility for agents and improve performance. Hence, the results demonstrate the usefulness of the constructs captured in the OJAzzIC meta-model.

