Adaptive Multi-objective Reinforcement Learning for Pareto Frontier Approximation: A Case Study of Resource Allocation Network in Massive MIMO

Author(s):  
Ruiqing Chen ◽  
Fanglei Sun ◽  
Liang Chen ◽  
Kai Li ◽  
Liantao Wu ◽  
...  
2021 ◽  
Vol 5 (4) ◽  
pp. 1-24
Author(s):  
Jianguo Chen ◽  
Kenli Li ◽  
Keqin Li ◽  
Philip S. Yu ◽  
Zeng Zeng

As a new generation of Public Bicycle-sharing Systems (PBS), the Dockless PBS (DL-PBS) is an important application of cyber-physical systems and intelligent transportation. Using artificial intelligence to provide efficient bicycle dispatching solutions based on dynamic bicycle rental demand is an essential issue for DL-PBS. In this article, we propose MORL-BD, a dynamic bicycle dispatching algorithm based on multi-objective reinforcement learning, to provide the optimal bicycle dispatching solution for DL-PBS. We model the DL-PBS system from the perspective of cyber-physical systems and use deep learning to predict the layout of bicycle parking spots and the dynamic demand for bicycle dispatching. We define the multi-route bicycle dispatching problem as a multi-objective optimization problem by considering four optimization objectives: dispatching costs, the initial load of each dispatch truck, workload balance among the trucks, and the dynamic balance of bicycle supply and demand. On this basis, the collaborative multi-route bicycle dispatching problem among multiple dispatch trucks is modeled as a multi-agent and multi-objective reinforcement learning model. All dispatch paths between parking spots are defined as the state space, and the reciprocal of dispatching costs is defined as the reward. Each dispatch truck is equipped with an agent to learn the optimal dispatch path in the dynamic DL-PBS network. We create an elite list to store the Pareto optimal solutions of bicycle dispatch paths found in each action, and finally obtain the Pareto frontier. Experimental results on the actual DL-PBS show that, compared with existing methods, MORL-BD finds a higher-quality Pareto frontier in less execution time.
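
The elite list mentioned in the abstract can be illustrated with a generic non-dominated archive. The following is a minimal sketch, not the MORL-BD implementation: the function names `dominates` and `update_elite_list` are hypothetical, each solution is reduced to a tuple of objective values, and every objective is assumed to be minimized.

```python
from typing import List, Tuple

# Hypothetical representation: one tuple of objective values per dispatch
# solution, e.g. (dispatching cost, workload imbalance). All objectives are
# assumed to be minimized in this sketch.
Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def update_elite_list(elite: List[Objectives], candidate: Objectives) -> List[Objectives]:
    """Insert candidate into the elite list, keeping only non-dominated entries."""
    # Reject the candidate if an existing entry dominates or duplicates it.
    if any(dominates(e, candidate) or e == candidate for e in elite):
        return elite
    # Otherwise drop every entry the candidate dominates and append it.
    return [e for e in elite if not dominates(candidate, e)] + [candidate]


elite: List[Objectives] = []
for solution in [(3.0, 4.0), (2.0, 5.0), (4.0, 4.0), (1.0, 6.0), (2.0, 3.0)]:
    elite = update_elite_list(elite, solution)
```

After the loop, `elite` holds only mutually non-dominated tuples, which together form the approximated Pareto frontier for the solutions seen so far.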


Author(s):  
Prateek Mittal ◽  
Kishalay Mitra

A multi-objective optimization case study that maximizes energy generation while minimizing noise propagation is considered here. A novel hybrid methodology, combining a probabilistic variable decomposed multi-objective evolutionary algorithm (VdRBNSGA-II) with the newly developed deterministic gradient-based Pareto frontier construction approach (nD-NNC), is proposed to determine the optimum layout of turbines (numbers and locations) inside a wind farm. In contrast to previous case studies, the proposed approach yields the alternative energy-noise solutions, along with additional information on the corresponding turbine layouts (numbers and locations), on a single Pareto front. As a result, it gives a decision maker ample choice among competing solutions based on the existing standards and guidelines.


Author(s):  
Liang Chen ◽  
Fanglei Sun ◽  
Kai Li ◽  
Ruiqing Chen ◽  
Yang Yang ◽  
...  

2016 ◽  
Vol 57 ◽  
pp. 187-227
Author(s):  
Simone Parisi ◽  
Matteo Pirotta ◽  
Marcello Restelli

Many real-world control applications, from economics to robotics, are characterized by the presence of multiple conflicting objectives. In these problems, the standard concept of optimality is replaced by Pareto-optimality and the goal is to find the Pareto frontier, a set of solutions representing different compromises among the objectives. Despite recent advances in multi-objective optimization, achieving an accurate representation of the Pareto frontier is still an important challenge. In this paper, we propose a reinforcement learning policy gradient approach to learn a continuous approximation of the Pareto frontier in multi-objective Markov Decision Problems (MOMDPs). Unlike previous policy gradient algorithms, where n optimization routines are executed to obtain n solutions, our approach performs a single gradient ascent run, generating at each step an improved continuous approximation of the Pareto frontier. The idea is to optimize the parameters of a function defining a manifold in the policy parameter space, so that the corresponding image in the objective space gets as close as possible to the true Pareto frontier. Besides deriving how to compute and estimate this gradient, we also discuss the non-trivial issue of defining a metric to assess the quality of the candidate Pareto frontiers. Finally, the properties of the proposed approach are empirically evaluated on two problems, a linear-quadratic Gaussian regulator and a water reservoir control task.
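
The manifold idea in this abstract can be written out as a short formulation. This is a sketch under assumed notation, not necessarily the notation of the paper: the symbols $\rho$, $\phi_\rho$, $t$, $T$, $\mathcal{I}$, $\mathcal{L}$ and $\alpha$ are illustrative choices, and $q$ denotes the number of objectives.

```latex
% Assumed notation (illustrative only):
%   \theta : policy parameters,  \mathbf{J}(\theta) : vector of q expected returns
%   \phi_\rho : parametric map from a low-dimensional variable t to policy parameters
\begin{align}
  \theta &= \phi_\rho(t), \quad t \in T
    && \text{(manifold in policy parameter space)} \\
  \mathcal{F}(\rho) &= \{\, \mathbf{J}(\phi_\rho(t)) : t \in T \,\}
    && \text{(its image in objective space)} \\
  \mathcal{L}(\rho) &= \int_{\mathcal{F}(\rho)} \mathcal{I} \,\mathrm{d}\sigma
    && \text{(quality metric accumulated over the candidate frontier)} \\
  \rho &\leftarrow \rho + \alpha \, \nabla_\rho \mathcal{L}(\rho)
    && \text{(single gradient ascent run on the manifold parameters)}
\end{align}
```

Each ascent step changes $\rho$ and therefore moves the whole candidate frontier $\mathcal{F}(\rho)$ at once, in contrast to running one optimization per solution. The indicator $\mathcal{I}$ stands in for the frontier quality metric whose definition the abstract describes as non-trivial.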


Author(s):  
C Lu ◽  
H Z Huang ◽  
J Y H Fuh ◽  
Y S Wong

This paper proposes a multi-objective disassembly planning approach with an ant colony optimization algorithm. The mechanism of ant colony optimization in disassembly planning is discussed, and the objectives to be optimized in disassembly planning are analysed. To allow a more effective search for feasible non-dominated solutions, a multi-objective searching algorithm with uniform design is investigated to guide the ants to search routes along uniformly scattered directions towards the Pareto frontier. Based on this searching algorithm, an ant colony optimization algorithm for disassembly planning is developed. The results of a case study are given to verify the proposed approach.


Author(s):  
Mauro Gamberi ◽  
Marco Bortolini ◽  
Francesco Pilati ◽  
Alberto Regattieri

A multi-objective optimizer Decision Support System (DSS) that minimizes the operating cost, the carbon footprint and the delivery time in the design of multi-modal Distribution Networks (DNs) is presented to overcome the limits of widely adopted methodologies that focus on cost minimization only. The proposed approach simultaneously assesses three independent objective functions, evaluating the network costs, the Carbon Footprint (CO2 emissions) and the shipping time from the producers to the final retailers. The DSS manages multi-modal four-level (three-stage) DNs, best connecting the producers to the final retailers through a set of Distribution Centres (DCs). It allows multiple transport modes and inter-modality options, seeking the most effective DN configuration from the introduced multi-objective perspective. The three optimization criteria can be considered independently or solved simultaneously through the so-called Pareto frontier approach. Finally, the proposed DSS is validated against a case study on the delivery of Italian fresh food to several European retailers.

