Nonparametric Learning Algorithms for Joint Pricing and Inventory Control with Lost Sales and Censored Demand

Author(s):  
Boxiao Chen ◽  
Xiuli Chao ◽  
Cong Shi

We consider a joint pricing and inventory control problem in which the customer’s response to selling price and the demand distribution are not known a priori. Unsatisfied demand is lost and unobserved, and the only available information for decision making is the observed sales data (also known as censored demand). Conventional approaches, such as stochastic approximation, online convex optimization, and continuum-armed bandit algorithms, cannot be employed, because neither the realized values of the profit function nor its derivatives are known. A major challenge of this problem is that the estimated profit function constructed from observed sales data is multimodal in price. We develop a nonparametric spline approximation–based learning algorithm. The algorithm separates the planning horizon into two disjoint phases: an exploration phase and an exploitation phase. During the exploration phase, a spline approximation of the demand-price function is constructed based on sales data, and then the corresponding surrogate optimization problem is solved on a sparse grid to obtain a pair of recommended price and target inventory level. During the exploitation phase, the algorithm implements the recommended strategies. We establish a (nearly) square-root regret rate, which (almost) matches the theoretical lower bound.
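To make the notion of censored demand concrete, the following is a minimal sketch, not the authors' algorithm: it only shows how observed sales are capped at the inventory level, so that averaging sales underestimates mean demand. The function name `censored_sales` and the demand figures are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def censored_sales(demands, inventory_level):
    """Return observed sales per period: demand above the stock level is lost and unobserved."""
    return [min(d, inventory_level) for d in demands]


# Hypothetical true demands; the firm never sees these directly.
demands = [3, 7, 5, 9, 4]
sales = censored_sales(demands, inventory_level=6)
print(sales)  # [3, 6, 5, 6, 4]

# The sample mean of sales is biased downward relative to the mean demand
# whenever at least one period stocks out.
print(mean(sales) < mean(demands))  # True
```

Any estimator built from such data has to account for this downward bias, which is one reason off-the-shelf methods that need exact profit values do not apply directly.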

2020 ◽  
Vol 66 (11) ◽  
pp. 5108-5127 ◽  
Author(s):  
Boxiao Chen ◽  
Xiuli Chao

We consider an inventory control problem with multiple products and stockout substitution. The firm knows neither the primary demand distribution for each product nor the customers’ substitution probabilities between products a priori, and it needs to learn such information from sales data on the fly. One challenge in this problem is that the firm cannot distinguish between primary demand and substitution (overflow) demand from the sales data of any product, and lost sales are not observable. To circumvent these difficulties, we construct learning stages with each stage consisting of a cyclic exploration scheme and a benchmark exploration interval. The benchmark interval allows us to isolate the primary demand information from the sales data, and then this information is used against the sales data from the cyclic exploration intervals to estimate substitution probabilities. Because raising the inventory level helps obtain primary demand information but hinders substitution demand information, inventory decisions have to be carefully balanced to learn them together. We show that our learning algorithm admits a worst-case regret rate that (almost) matches the theoretical lower bound, and numerical experiments demonstrate that the algorithm performs very well. This paper was accepted by J. George Shanthikumar, big data analytics.


2014 ◽  
Vol 28 (4) ◽  
pp. 529-563 ◽  
Author(s):  
Zhan Pang ◽  
Frank Y. Chen

This paper addresses a joint pricing and inventory control problem for a batch production system with random leadtimes. Assume that demand arrives according to a Poisson process with a price-dependent arrival rate. Each replenishment order contains a single batch of a fixed lot size. The replenishment leadtime follows an Erlang distribution, with the number of completed phases recording the delivery state of outstanding orders. The objective is to determine an optimal inventory-pricing policy that maximizes total expected discounted profit or long-run average profit. We first show that when there is at most one order outstanding at any point in time and excess demand is lost, the optimal reorder policy can be characterized by a critical stock level and the optimal pricing decision is decreasing in the inventory level and delivery state. We then extend the analysis to a mixed-Erlang leadtime distribution, which can be used to approximate any random leadtime to any degree of accuracy. We further extend the analysis to allow three outstanding orders, where the optimal reorder point becomes state-dependent: the closer an outstanding order is to its arrival or the more orders are outstanding, the lower the selling price charged and the lower the reorder point chosen. Finally, we address the backlog case and show that the monotone pricing structure may not hold when the optimal reorder point is negative.
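The demand and leadtime model described above can be simulated in a few lines. This is a minimal sketch under stated assumptions, not the paper's model specification: the linear form of `arrival_rate` and all parameter values are hypothetical, and an Erlang leadtime is sampled as a sum of independent exponential phases.

```python
import random


def erlang_lead_time(rng, phases, phase_rate):
    """Sample an Erlang leadtime as the sum of `phases` i.i.d. exponential phases."""
    return sum(rng.expovariate(phase_rate) for _ in range(phases))


def arrival_rate(price, base_rate=10.0, sensitivity=1.5):
    """Hypothetical price-dependent Poisson rate (linear, floored at zero)."""
    return max(base_rate - sensitivity * price, 0.0)


def demand_during(rng, rate, duration):
    """Count Poisson arrivals in [0, duration] by summing exponential interarrival times."""
    if rate <= 0:
        return 0
    count = 0
    t = rng.expovariate(rate)
    while t <= duration:
        count += 1
        t += rng.expovariate(rate)
    return count


rng = random.Random(0)  # fixed seed so the run is reproducible
lead_time = erlang_lead_time(rng, phases=3, phase_rate=2.0)
units_demanded = demand_during(rng, arrival_rate(price=2.0), lead_time)
print(lead_time, units_demanded)
```

Tracking how many of the exponential phases have completed is what gives the "delivery state" mentioned in the abstract; the sketch collapses that into a single sampled duration.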


2013 ◽  
Vol 2013 ◽  
pp. 1-7 ◽  
Author(s):  
Maryam Ghoreishi ◽  
Alireza Arshsadi khamseh ◽  
Abolfazl Mirzazadeh

This paper studies the effect of inflation and customer returns on joint pricing and inventory control for deteriorating items. We adopt a price- and time-dependent demand function, and customer returns are modeled as a function of both price and demand. Shortages are allowed and partially backlogged. The main objective is to determine the optimal selling price, the optimal replenishment cycles, and the order quantity simultaneously such that the present value of total profit over a finite time horizon is maximized. An algorithm is presented to find the optimal solution. Finally, we solve a numerical example to illustrate the solution procedure and the algorithm.


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Rui Wang ◽  
Xianghua Gan ◽  
Qing Li ◽  
Xiao Yan

We study a joint pricing and inventory control problem for perishables with positive lead time in a finite horizon periodic-review system. Unlike most studies, which consider a continuous density function of demand, in our paper the customer demand depends on the price of the current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a joint ordering and pricing policy that maximizes the expected discounted profit over the planning horizon. When there is no fixed ordering cost involved, we design a deep reinforcement learning algorithm to obtain a near-optimal ordering policy and show that there are some monotonicity properties in the learned policy. We also show that our deep reinforcement learning algorithm achieves better performance than tabular Q-learning algorithms. When a fixed ordering cost is involved, we show that our deep reinforcement learning algorithm is effective and efficient, and that it circumvents the “curse of dimensionality”.


2018 ◽  
Vol 13 (4) ◽  
pp. 1037-1056 ◽  
Author(s):  
Huthaifa AL-Khazraji ◽  
Colin Cole ◽  
William Guo

Purpose: This paper aims to optimise the dynamic performance of production–inventory control systems by minimising the variance ratio between the order rate and the consumption and minimising the integral of absolute error between the actual and the target inventory level, incorporating Pareto optimality into particle swarm optimisation (PSO).
Design/method/approach: The production–inventory control system is modelled and optimised via control theory and simulations. The dynamics of a production–inventory control system are modelled through continuous-time differential equations and Laplace transformations. The simulation design is conducted by using the state–space model of the system. The results of multi-objective particle swarm optimisation (MOPSO) are compared with published results obtained from weighted genetic algorithm (WGA) optimisation.
Findings: The results obtained from the MOPSO optimisation process show performance that is systematically better than that of the WGA in reducing the order variability (bullwhip effect) and improving the inventory responsiveness (customer service level) under the same operational conditions.
Research limitations/implications: This research is limited to optimising the dynamics of a single-product, single-retailer, single-manufacturer process with zero desired inventory level.
Originality/value: PSO is widely used and popular in many industrial applications. This research shows a unique application of PSO in optimising the dynamic performance of production–inventory control systems.
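The two objectives named in this abstract, and the Pareto comparison between candidate solutions, can be sketched as follows. This is an illustrative sketch under assumptions, not the paper's implementation: the integral is approximated by a rectangle rule on uniformly sampled data, and the function names and sample series are hypothetical.

```python
from statistics import pvariance


def variance_ratio(order_rate, consumption):
    """Bullwhip measure: variance of the order rate over variance of consumption."""
    return pvariance(order_rate) / pvariance(consumption)


def integral_absolute_error(actual, target, dt):
    """Approximate the integral of |target - actual| with a rectangle rule of step dt."""
    return sum(abs(target - a) for a in actual) * dt


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives are minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


# Hypothetical sampled trajectories.
orders = [1, 3, 1, 3]
consumption = [1.5, 2.5, 1.5, 2.5]
inventory = [1.0, -1.0, 0.5]

print(variance_ratio(orders, consumption))                       # 4.0
print(integral_absolute_error(inventory, target=0.0, dt=0.5))    # 1.25
print(dominates((1.0, 2.0), (2.0, 2.0)))                         # True
```

A multi-objective optimiser keeps the set of candidates that no other candidate dominates, rather than collapsing both objectives into one weighted score.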


2019 ◽  
Vol 10 (5) ◽  
pp. 1679 ◽  
Author(s):  
Abhishek Kanti Biswas ◽  
Sahidul Islam

Inventory systems have been drawing increasing interest because they deal with decisions that minimize the total average cost or maximize the total average profit. For any firm, the demand for an item depends on population, selling price, frequency of advertisement, and so on. In most models, it is assumed that deterioration of an item in inventory starts from the beginning of its production. In reality, however, many goods maintain their good quality or original condition for some time. A price discount is therefore offered for defective items. Our target is to calculate the optimal total cost and the optimal inventory level for this inventory model in crisp and fuzzy environments. Here the holding cost is taken as constant and no shortages are allowed. The cost parameters are considered as triangular fuzzy numbers, and the signed distance method is applied to defuzzify the model. A numerical example of the optimal solution is given to clarify the model. The effects of changes in different parameters on the optimal total cost are presented, and a sensitivity analysis is given.
JEL Classification: C44, Y80, C61
Mathematics Subject Classification: 90B05
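For readers unfamiliar with the defuzzification step, the signed distance of a triangular fuzzy number (a, b, c) from zero is (a + 2b + c) / 4. The sketch below applies that formula; the function name and the example cost values are hypothetical and are not taken from the paper.

```python
def signed_distance(tfn):
    """Defuzzify a triangular fuzzy number (a, b, c) via signed distance: (a + 2b + c) / 4."""
    a, b, c = tfn
    if not a <= b <= c:
        raise ValueError("a triangular fuzzy number requires a <= b <= c")
    return (a + 2 * b + c) / 4


# Hypothetical fuzzy cost: "about 10", with more uncertainty on the high side.
print(signed_distance((8, 10, 14)))  # 10.5
# A symmetric fuzzy number defuzzifies to its peak.
print(signed_distance((8, 10, 12)))  # 10.0
```

The middle value is weighted twice, so an asymmetric spread shifts the crisp value toward the wider side without letting the endpoints dominate.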


2013 ◽  
Vol 2013 ◽  
pp. 1-7 ◽  
Author(s):  
Wei He

Inventory control is a key factor for reducing supply chain cost and increasing customer satisfaction. However, prediction of inventory level is a challenging task for managers. Although it is one of the most widely used techniques for inventory control, the standard BP (backpropagation) neural network suffers from a low convergence rate and poor prediction accuracy. To address these problems, a new fast convergent BP neural network model for predicting inventory level is developed in this paper. By adding an error offset, this paper derives a new chain propagation rule and a new weight formula. This paper also applies the improved BP neural network model to predict the inventory level of an automotive parts company. The results show that the improved algorithm not only significantly outperforms the standard algorithm but also does better than some other improved BP algorithms in both convergence rate and prediction accuracy.

