Demand-responsive rebalancing zone generation for reinforcement learning-based on-demand mobility

2021 ◽  
Vol 34 (1) ◽  
pp. 73-88
Author(s):  
Alberto Castagna ◽  
Maxime Guériau ◽  
Giuseppe Vizzari ◽  
Ivana Dusparic

Enabling Ride-sharing (RS) in Mobility-on-Demand (MoD) systems allows a reduction in vehicle fleet size while preserving the level of service. This, however, requires an efficient vehicle-to-request assignment and a vehicle rebalancing strategy that counteracts the uneven geographical spread of demand by relocating unoccupied vehicles to areas of higher demand. Existing research into rebalancing generally divides the coverage area into predefined geographical zones. This division is done statically, at design time, impeding adaptivity to evolving demand patterns. To enable more accurate dynamic rebalancing, this paper proposes a Dynamic Demand-Responsive Rebalancer (D2R2) for RS systems. D2R2 uses the Expectation-Maximization (EM) technique to recalculate zones at each decision step based on current demand. We integrate D2R2 with a deep reinforcement learning multi-agent MoD system consisting of 200 vehicles serving 10,000 trips from the New York taxi dataset. Results show a fairer workload division across the fleet when compared to static predefined equiprobable zones.
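
The paper's core idea is to recompute zones with EM at every decision step rather than fixing them at design time. A minimal sketch of that step, assuming scikit-learn's GaussianMixture as the EM implementation (the function names, zone count, and demand format are illustrative, not from the paper):

```python
# Sketch: recompute rebalancing zones from current demand via EM.
import numpy as np
from sklearn.mixture import GaussianMixture

def recompute_zones(request_origins: np.ndarray, n_zones: int = 8) -> GaussianMixture:
    """Fit a Gaussian mixture to current request origins (lat/lon pairs);
    each fitted component acts as one demand-responsive zone."""
    gmm = GaussianMixture(n_components=n_zones, covariance_type="full",
                          random_state=0)
    gmm.fit(request_origins)
    return gmm

def assign_vehicle_to_zone(gmm: GaussianMixture, vehicle_pos: np.ndarray) -> int:
    """Map an idle vehicle to the most likely zone under current demand."""
    return int(gmm.predict(vehicle_pos.reshape(1, -1))[0])

# Toy usage: 500 synthetic request origins around Manhattan, one idle vehicle.
rng = np.random.default_rng(0)
origins = rng.normal(loc=[40.75, -73.98], scale=0.02, size=(500, 2))
zones = recompute_zones(origins)
print(assign_vehicle_to_zone(zones, np.array([40.76, -73.97])))
```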

Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2789 ◽  
Author(s):  
Hang Qi ◽  
Hao Huang ◽  
Zhiqun Hu ◽  
Xiangming Wen ◽  
Zhaoming Lu

In order to meet the ever-increasing traffic demand of Wireless Local Area Networks (WLANs), channel bonding is introduced in the IEEE 802.11 standards. Although channel bonding effectively increases the transmission rate, the wider channel reduces the number of non-overlapping channels and is more susceptible to interference. Meanwhile, the traffic load differs from one access point (AP) to another and changes significantly depending on the time of day. Therefore, the primary channel and channel bonding bandwidth should be carefully selected to meet traffic demand and guarantee the performance gain. In this paper, we propose an On-Demand Channel Bonding (O-DCB) algorithm based on Deep Reinforcement Learning (DRL) for heterogeneous WLANs, in which the APs have different channel bonding capabilities, to reduce transmission delay. In this problem, the state space is continuous and the action space is discrete. However, with single-agent DRL the size of the action space increases exponentially with the number of APs, which severely slows learning. To accelerate learning, Multi-Agent Deep Deterministic Policy Gradient (MADDPG) is used to train O-DCB. Real traffic traces collected from a campus WLAN are used to train and test O-DCB. Simulation results reveal that the proposed algorithm converges well and achieves lower delay than the baseline algorithms.
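
The scaling argument in the abstract is easy to make concrete: a centralized (single-agent) controller must pick one joint action over all APs, while MADDPG gives each AP its own actor. A back-of-the-envelope sketch, with illustrative channel and bandwidth sets that are not taken from the paper:

```python
# Sketch: per-AP vs. joint action-space size for channel bonding.
from itertools import product

primary_channels = [36, 40, 44, 48]   # illustrative 5 GHz channels
bond_widths_mhz = [20, 40, 80]        # illustrative per-AP capabilities

per_ap_actions = list(product(primary_channels, bond_widths_mhz))
print(len(per_ap_actions))            # 12 actions per AP

for n_aps in (2, 4, 8):
    # Single-agent DRL: one policy over 12 ** n_aps joint actions.
    # MADDPG: n_aps actors, each choosing from only 12 actions.
    print(n_aps, len(per_ap_actions) ** n_aps)
```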


Author(s):  
Akkhachai Phuphanin ◽  
Wipawee Usaha

Coverage control is crucial for the deployment of wireless sensor networks (WSNs). However, most coverage control schemes are based on single-objective optimization, such as coverage area only, and do not consider other, contradicting objectives such as energy consumption, the number of working nodes, and wasteful overlapping areas. This paper proposes a Multi-Objective Optimization (MOO) coverage control scheme called Scalarized Q Multi-Objective Reinforcement Learning (SQMORL). The two objectives are to maximize area coverage and to minimize the overlapping area so as to reduce energy consumption. Performance evaluation is conducted in both simulation and multi-agent lighting control testbed experiments. Simulation results show that SQMORL can obtain more efficient area coverage with fewer working nodes than other existing schemes. The hardware testbed results show that the SQMORL algorithm can find the optimal policy with good accuracy across repeated runs.
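
SQMORL's name suggests linear scalarization of a vector-valued reward inside a standard Q-learning update. A minimal tabular sketch of that pattern, assuming fixed objective weights and a two-element reward vector (coverage gain, negated overlap area); all names and hyperparameters are illustrative, not the paper's:

```python
# Sketch: linearly scalarized Q-learning over two objectives.
import numpy as np
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9
W = np.array([0.7, 0.3])  # assumed weights: coverage vs. overlap

# One Q-vector per (state, action): Q[s][a] is a 2-element array.
Q = defaultdict(lambda: defaultdict(lambda: np.zeros(2)))

def scalarized_value(state, action) -> float:
    return float(W @ Q[state][action])

def update(state, action, rewards, next_state, next_actions):
    """rewards = (coverage_gain, -overlap_area) observed this step."""
    best_next = max(next_actions, key=lambda a: scalarized_value(next_state, a))
    target = np.asarray(rewards, dtype=float) + GAMMA * Q[next_state][best_next]
    Q[state][action] += ALPHA * (target - Q[state][action])
```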


Author(s):  
Ali Boyali ◽  
Naohisa Hashimoto ◽  
Vijay John ◽  
Tankut Acarman

Author(s):  
Jiajie Dai ◽  
Qianyu Zhu ◽  
Nan Jiang ◽  
Wuyang Wang

The shared autonomous mobility-on-demand (AMoD) system is a promising business model that provides a more efficient and affordable urban travel mode. However, to maintain efficient operation of AMoD and address the mismatch between demand and supply, a good rebalancing strategy is required. This paper proposes a reinforcement learning-based rebalancing strategy to minimize passengers' waiting time in a shared AMoD system. The state is defined as the nearby supply and demand information of a vehicle. The action is defined as moving to a nearby area in one of eight directions or staying idle. A 4.6 × 4.4 km² region in Cambridge, Massachusetts, is used as the case study. We trained and tested the rebalancing strategy on two different demand patterns: random and first-mile. Results show the proposed method can reduce passengers' waiting time by 7% for random demand patterns and 10% for first-mile demand patterns.
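
The action space described here (eight compass moves plus staying idle) is concrete enough to sketch. A minimal grid version, where the cell coordinates, grid shape, and clamping behaviour are assumptions for illustration:

```python
# Sketch: the eight-direction-plus-idle rebalancing action space.
import numpy as np

ACTIONS = {
    0: (0, 0),    # stay idle
    1: (0, 1),    # north
    2: (1, 1),    # north-east
    3: (1, 0),    # east
    4: (1, -1),   # south-east
    5: (0, -1),   # south
    6: (-1, -1),  # south-west
    7: (-1, 0),   # west
    8: (-1, 1),   # north-west
}

def step_vehicle(cell, action, grid_shape):
    """Apply a rebalancing move, clamped to the service-area grid."""
    dx, dy = ACTIONS[action]
    x = int(np.clip(cell[0] + dx, 0, grid_shape[0] - 1))
    y = int(np.clip(cell[1] + dy, 0, grid_shape[1] - 1))
    return (x, y)

print(step_vehicle((0, 0), 2, (10, 10)))  # -> (1, 1)
```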


2021 ◽  
Vol 3 ◽  
Author(s):  
Salomón Wollenstein-Betech ◽  
Ioannis Ch. Paschalidis ◽  
Christos G. Cassandras

The emergence of the sharing economy in urban transportation networks has enabled new fast, convenient and accessible mobility services referred to as Mobility-on-Demand systems (e.g., Uber, Lyft, DiDi). These platforms have flourished around the globe in the last decade and face many operational challenges in order to be competitive and provide good quality of service. A crucial step in the effective operation of these systems is to reduce customers' waiting time while properly selecting the optimal fleet size and pricing policy. In this paper, we jointly tackle three operational decisions: (i) fleet size, (ii) pricing, and (iii) rebalancing, in order to maximize the platform's profit or its customers' welfare. To accomplish this, we first devise an optimization framework which gives rise to a static policy. Then, we elaborate and propose dynamic policies that are more responsive to perturbations such as unexpected increases in demand. We test this framework in a simulation environment using three case studies, leveraging traffic flow and taxi data from Eastern Massachusetts, New York City, and Chicago. Our results show that solving the problem jointly could increase profits by between 1% and 50%, depending on the benchmark. Moreover, we observe that the proposed fleet size yields a vehicle utilization of around 75%, compared to a private vehicle utilization of 5%.
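
To see why fleet size and pricing interact, consider a deliberately toy static version of the joint decision: price moves demand, while fleet size caps how much of it can be served. The linear demand model, cost figures, and bounds below are assumptions for illustration, not the paper's network optimization:

```python
# Sketch: toy joint fleet-size/pricing profit maximization.
from scipy.optimize import minimize

BASE_DEMAND, PRICE_ELASTICITY = 1000.0, 60.0    # trips/h, trips per $/trip
COST_PER_VEHICLE, TRIPS_PER_VEHICLE = 8.0, 2.5  # $/h, trips/h per vehicle

def negative_profit(x):
    price, fleet = x
    demand = max(BASE_DEMAND - PRICE_ELASTICITY * price, 0.0)
    served = min(demand, TRIPS_PER_VEHICLE * fleet)  # capacity-limited
    return -(price * served - COST_PER_VEHICLE * fleet)

res = minimize(negative_profit, x0=[8.0, 300.0],
               bounds=[(1.0, 15.0), (10.0, 2000.0)], method="Nelder-Mead")
price, fleet = res.x
print(f"price ~ ${price:.2f}/trip, fleet ~ {fleet:.0f} vehicles")
```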


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Sung-Jung Wang ◽  
S. K. Jason Chang

Autonomous buses are becoming increasingly popular and have been widely developed in many countries. However, autonomous buses must learn to navigate the city efficiently to be integrated into public transport systems. Efficient operation of these buses can be achieved by intelligent agents through reinforcement learning. In this study, we investigate the autonomous bus fleet control problem, which appears noisy to the agents owing to random arrivals and incomplete observation of the environment. We propose a multi-agent reinforcement learning method combined with an advanced policy gradient algorithm for this large-scale dynamic optimization problem. An agent-based simulation platform was developed to model the dynamic system of a fixed stop/station loop route, an autonomous bus fleet, and passengers. This platform was also applied to assess the performance of the proposed algorithm. The experimental results indicate that the developed algorithm outperforms other reinforcement learning methods in the multi-agent domain. The simulation results also reveal that our proposed algorithm outperforms the existing scheduled bus system in terms of bus fleet size and passenger wait times on bus routes with comparatively fewer passengers.
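
As one simple member of the policy-gradient family the study builds on, a REINFORCE-style update for a bus-holding policy can be sketched in a few lines. The two-action space (hold vs. dispatch) and the feature vector are assumptions for illustration, not the paper's formulation:

```python
# Sketch: REINFORCE update for a logistic hold/dispatch policy.
import numpy as np

theta = np.zeros(3)  # weights over [bias, queue_length, headway]
LR = 0.01

def dispatch_probability(features: np.ndarray) -> float:
    """P(dispatch now | state) under a logistic policy."""
    return 1.0 / (1.0 + np.exp(-features @ theta))

def reinforce_update(episode):
    """episode: list of (features, action in {0,1}, return-to-go)."""
    global theta
    for feats, action, g in episode:
        p = dispatch_probability(feats)
        grad_log_pi = (action - p) * feats  # d/dtheta log pi(a|s)
        theta += LR * g * grad_log_pi
```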


2020 ◽  
Vol 10 (15) ◽  
pp. 5335
Author(s):  
Kathleen Keogh ◽  
Liz Sonenberg

We address the challenge of multi-agent system (MAS) design for organisations of agents acting in dynamic and uncertain environments, where runtime flexibility is required to enable improvisation through sharing knowledge and adapting behaviour. We identify behavioural features that correspond to runtime improvisation by agents in a MAS organisation and, from this analysis, describe the OJAzzIC meta-model and an associated design method. We present results from simulation scenarios, varying both problem complexity and the level of organisational support provided in the design, to show that increasing design-time guidance in the organisation specification can enable runtime flexibility for agents and improve performance. Hence, the results demonstrate the usefulness of the constructs captured in the OJAzzIC meta-model.

