scholarly journals Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Rui Wang ◽  
Xianghua Gan ◽  
Qing Li ◽  
Xiao Yan

We study a joint pricing and inventory control problem for perishables with positive lead time in a finite horizon periodic-review system. Unlike most studies considering a continuous density function of demand, in our paper the customer demand depends on the price of current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a simultaneously ordering and pricing policy to maximize the expected discounted profit over the planning horizon. When there is no fixed ordering cost involved, we design a deep reinforcement learning algorithm to obtain a near-optimal ordering policy and show that there are some monotonicity properties in the learned policy. We also show that our deep reinforcement learning algorithm achieves a better performance than tabular-based Q-learning algorithms. When a fixed ordering cost is involved, we show that our deep reinforcement learning algorithm is effective and efficient, under which the problem of “curse of dimension” is circumvented.

2018 ◽  
Vol 2018 ◽  
pp. 1-11
Author(s):  
Zhicong Zhang ◽  
Shuai Li ◽  
Xiaohui Yan

We study an online multisource multisink queueing network control problem characterized with self-organizing network structure and self-organizing job routing. We decompose the self-organizing queueing network control problem into a series of interrelated Markov Decision Processes and construct a control decision model for them based on the coupled reinforcement learning (RL) architecture. To maximize the mean time averaged weighted throughput of the jobs through the network, we propose a reinforcement learning algorithm with time averaged reward to deal with the control decision model and obtain a control policy integrating the jobs routing selection strategy and the jobs sequencing strategy. Computational experiments verify the learning ability and the effectiveness of the proposed reinforcement learning algorithm applied in the investigated self-organizing network control problem.


Sign in / Sign up

Export Citation Format

Share Document