Self-Optimizing and Pareto-Optimal Policies in General Environments Based on Bayes-Mixtures

Author(s):  
Marcus Hutter
Author(s):  
Tomohiro Yamaguchi ◽  
Shota Nagahama ◽  
Yoshihiro Ichikawa ◽  
Yoshimichi Honma ◽  
Keiki Takadama

This chapter describes how to solve multi-objective reinforcement learning (MORL) problems in which there are multiple conflicting objectives with unknown weights. Previous model-free MORL methods require a large number of calculations to collect a Pareto optimal set for each V/Q-value vector. In contrast, model-based MORL can reduce this calculation cost relative to model-free MORL. However, the previous model-based MORL method applies only to deterministic environments. To address these problems, this chapter proposes a novel model-based MORL method that uses a reward occurrence probability (ROP) vector with unknown weights. Experimental results are reported for stochastic learning environments with up to 10 states, 3 actions, and 3 reward rules. They show that the proposed method collects all Pareto optimal policies, with a total learning time of about 214 seconds (10 states, 3 actions, 3 rewards). As future research directions, ways to speed up the method and to make use of non-optimal policies are discussed.
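The notion of a Pareto optimal set over per-policy vectors can be illustrated with a minimal sketch. It is not the chapter's algorithm: the candidate vectors, the function names (`dominates`, `pareto_front`, `best_for_weights`), and the use of a plain weighted sum are all assumptions made for illustration. Each tuple stands in for a hypothetical reward occurrence probability vector of one policy, with one entry per reward rule.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(vectors: List[Vector]) -> List[Vector]:
    """Keep only the vectors that no other vector dominates."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]


def best_for_weights(vectors: List[Vector], weights: Sequence[float]) -> Vector:
    """Pick the vector with the largest weighted sum once the weights are known."""
    return max(vectors, key=lambda v: sum(w * x for w, x in zip(weights, v)))


# Hypothetical per-policy vectors (3 reward rules), made up for this example.
candidates = [
    (0.5, 0.1, 0.0),
    (0.2, 0.4, 0.1),
    (0.1, 0.1, 0.0),  # dominated by the first two vectors
    (0.0, 0.0, 0.6),
]

front = pareto_front(candidates)
choice = best_for_weights(front, (1.0, 0.0, 0.0))
```

Collecting the front first means the weights can stay unknown during learning; once they are supplied, selecting a policy reduces to a lookup over the stored vectors. Note that a weighted sum can only select vectors on the convex hull of the front.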


1996 ◽  
Vol 10 (1) ◽  
pp. 57-73 ◽  
Author(s):  
Eugene A. Feinberg ◽  
Dong J. Kim

This paper studies bicriterion optimization of an M/G/1 queue with a server that can be switched on and off. One criterion is the average number of customers in the system, and the other is the average operating cost per unit time. Operating costs consist of switching and running costs. We describe the structure of Pareto optimal policies for the bicriterion problem and solve the problems of optimizing one of these criteria under a constraint on the other.


1971 ◽  
Vol 65 (4) ◽  
pp. 1141-1145 ◽  
Author(s):  
Peter C. Ordeshook

The core of welfare economics consists of the proof that, for certain classes of goods, perfectly competitive markets are efficient in that they provide Pareto optimal allocations of these goods. In this paper, the efficiency of competitive elections is examined. Elections are modeled as two-candidate zero-sum games, and three kinds of equilibria for such games are identified: pure, risky, and mixed strategies. It is shown, however, that regardless of which kind of equilibrium prevails, if candidates adopt equilibrium strategies, an election is efficient in the sense that the candidates advocate Pareto optimal policies. But one caveat to this analysis is that while an election is Pareto optimal, citizens can unanimously prefer markets to elections as a mechanism for selecting future policies.


2009 ◽  
pp. 75-84
Author(s):  
V. Popov

Why have many transition economies succeeded by pursuing policies which are so different from the radical economic liberalization (shock therapy) that is normally credited for the economic success of countries of Central Europe? First, optimal policies are context dependent: they are specific to each stage of development, and what worked in Slovenia cannot be expected to work in Mongolia. Second, even for countries at the same level of development, the reforms that are necessary to stimulate growth are different; they depend on the previous history and on the path chosen. The reduction of government expenditure as a share of GDP did not significantly undermine the institutional capacity of the state in China, but in Russia and other CIS countries it turned out to be ruinous. The art of the policymaker is to create markets without causing government failure, as happened in many CIS countries.


2011 ◽  
pp. 65-87 ◽  
Author(s):  
A. Rubinstein

The article considers some aspects of the patronized goods theory with respect to efficient and inefficient equilibria. The author analyzes specific features of patronized goods as well as their connection with market failures, and conjectures that they are related to the emergence of Pareto-inefficient Nash equilibria. The key problem is the analysis of the opportunities for transforming inefficient Nash equilibrium into Pareto-optimal Nash equilibrium for patronized goods by modifying the institutional environment. The paper analyzes social motivation for institutional modernization and equilibrium conditions in the generalized Wicksell-Lindahl model for patronized goods. The author also considers some applications of patronized goods theory to social policy issues.

