Modeling Object’s Affordances via Reward Functions

Author(s):  
Renan Lima Baima ◽  
Esther Luna Colombini

2020 ◽  
Vol 178 (3-4) ◽  
pp. 1125-1172
Author(s):  
Julio Backhoff-Veraguas ◽  
Daniel Bartl ◽  
Mathias Beiglböck ◽  
Manu Eder

Abstract: A number of researchers have introduced topological structures on the set of laws of stochastic processes. A unifying goal of these authors is to strengthen the usual weak topology in order to adequately capture the temporal structure of stochastic processes. Aldous defines an extended weak topology based on the weak convergence of prediction processes. In the economic literature, Hellwig introduces the information topology to study the stability of equilibrium problems. Bion–Nadal and Talay introduce a version of the Wasserstein distance between the laws of diffusion processes. Pflug and Pichler consider the nested distance (and the weak nested topology) to obtain continuity of stochastic multistage programming problems. These distances can be seen as a symmetrization of Lassalle’s causal transport problem, but there are also further natural ways to derive a topology from causal transport. Our main result is that all of these seemingly independent approaches define the same topology in finite discrete time. Moreover, we show that this ‘weak adapted topology’ is characterized as the coarsest topology that guarantees continuity of optimal stopping problems for continuous bounded reward functions.
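For concreteness, a minimal sketch (notation assumed here, not taken from the abstract) of the nested, or adapted Wasserstein, distance in finite discrete time with N periods, where d is a metric on the state space:

\[
\mathcal{AW}_p(\mu,\nu)^p \;=\; \inf_{\pi \in \mathrm{Cpl}_{\mathrm{bc}}(\mu,\nu)} \int \sum_{t=1}^{N} d(x_t,y_t)^p \, \pi(dx,dy),
\]

where \(\mathrm{Cpl}_{\mathrm{bc}}(\mu,\nu)\) denotes the bicausal couplings of \(\mu\) and \(\nu\): couplings under which, for every \(t\), the conditional law of \(y_{1:t}\) given the whole path \(x_{1:N}\) depends on \(x\) only through \(x_{1:t}\), and symmetrically with the roles of \(x\) and \(y\) exchanged. Requiring causality in one direction only recovers Lassalle’s causal transport problem; the nested distance can be read as its symmetrization, as noted above.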


Author(s):  
Erhan Bayraktar ◽  
Yuchong Zhang

We analyze a mean field tournament: a mean field game in which the agents receive rewards according to the ranking of the terminal value of their projects and are subject to a cost of effort. Using Schrödinger bridges, we are able to calculate the equilibrium explicitly. This allows us to identify the reward functions that would yield a desired equilibrium and to solve several related mechanism design problems. We are also able to identify the effect of reward inequality on the players’ welfare, as well as to calculate the price of anarchy.
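As a rough illustration only (the dynamics, quadratic effort cost, and notation below are assumptions, not the paper’s exact formulation), a representative agent in such a tournament controls the drift of a project value and is paid through the rank of its terminal value:

\[
\sup_{a}\; \mathbb{E}\Big[\, R\big(F(X_T)\big) \;-\; \int_0^T \tfrac{1}{2}\, a_t^2 \, dt \,\Big],
\qquad dX_t = a_t\, dt + \sigma\, dW_t,
\]

where \(F\) is the distribution function of terminal project values across the population (so \(F(X_T)\) is the agent’s rank) and \(R\) is the reward function; in equilibrium, \(F\) must be consistent with the optimal effort of all agents. The mechanism design question is then which choice of \(R\) induces a desired equilibrium.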


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Yafei Li ◽  
Hongfeng Wang ◽  
Li Li ◽  
Yaping Fu

As one of the most effective medical technologies for infertile patients, in vitro fertilization (IVF) has been adopted more and more widely in recent years. However, prolonged waiting for IVF procedures has become a problem of great concern, since this technology is mastered only by large general hospitals. To deal with the insufficiency of IVF service capacity, this paper studies an IVF queuing network in an integrated cloud healthcare system, where the two key medical services, namely egg retrieval and transplantation, are performed in the general hospital, while routine medical tests are assigned to the community hospital. Based on a continuous-time Markov process, a dynamic large-scale server scheduling problem in this complicated service network is modeled, taking into account the different arrival rates of multiple types of patients and the different service capacities of multiple servers, which are defined as the doctors of the general hospital. To solve this model, a reinforcement learning (RL) algorithm is proposed, in which the reward functions are designed around four conflicting subcosts: setup cost, patient waiting cost, penalty cost for unsatisfied patient personal preferences, and the patient’s medical cost. The experimental results show that the optimal service rule for each server’s queue obtained by the RL method is significantly superior to the traditional service rule.
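A minimal sketch of how such a composite reward could be assembled is given below; the function name, weights, and sign convention are illustrative assumptions, not the paper’s implementation.

# Hypothetical composite reward for the server-scheduling RL agent.
# The four subcost terms mirror those named in the abstract; the weights
# and the linear combination are assumptions for illustration only.
def scheduling_reward(setup_cost, waiting_cost, preference_penalty, medical_cost,
                      weights=(1.0, 1.0, 1.0, 1.0)):
    """Return a scalar reward that decreases as any of the four subcosts grows."""
    w_setup, w_wait, w_pref, w_med = weights
    total_cost = (w_setup * setup_cost
                  + w_wait * waiting_cost
                  + w_pref * preference_penalty
                  + w_med * medical_cost)
    # The RL agent maximizes reward, i.e. minimizes the weighted total cost.
    return -total_cost

# Example decision step: a doctor switches queues (setup cost), two patients
# are waiting, one personal preference is violated, and treatment cost accrues.
r = scheduling_reward(setup_cost=5.0, waiting_cost=2 * 1.5,
                      preference_penalty=3.0, medical_cost=10.0)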


2021 ◽  
Author(s):  
Antonio Santos de Sousa ◽  
Rubens Fernandes Nunes ◽  
Creto Augusto Vidal ◽  
Joaquim Bento Cavalcante-Neto ◽  
Danilo Borges da Silva

Author(s):  
Hao Ji ◽  
Yan Jin

Abstract: Self-organizing systems (SOS) are developed to perform complex tasks in unforeseen situations with adaptability. Predefining rules for self-organizing agents can be challenging, especially for tasks with high complexity and changing environments. Our previous work introduced a multiagent reinforcement learning (RL) model as a design approach to solving the rule generation problem of SOS. A deep multiagent RL algorithm was devised to train agents to acquire the task and self-organizing knowledge. However, the simulation was based on one specific task environment. The sensitivity of SOS to reward functions and the systematic evaluation of SOS designed with multiagent RL remain open issues. In this paper, we introduced a rotation reward function to regulate agent behaviors during training and tested different weights of this reward on SOS performance in two case studies: box-pushing and T-shape assembly. Additionally, we proposed three metrics to evaluate the SOS: learning stability, quality of learned knowledge, and scalability. Results show that, depending on the type of task, designers may choose appropriate weights of the rotation reward to obtain the full potential of the agents’ learning capability. Good learning stability and quality of knowledge can be achieved within an optimal range of team sizes. Scaling up to larger team sizes yields better performance than scaling down.
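A hedged sketch of the weighted rotation reward idea follows; the function names, the task-progress term, and the sign conventions are assumptions for illustration, not the authors’ code.

# Hypothetical per-step reward for one self-organizing agent in a box-pushing
# or assembly task: task progress plus a rotation term scaled by a weight w_rot,
# the quantity whose effect on SOS performance is studied in the paper.
def agent_reward(distance_progress, rotation_toward_goal, w_rot=0.5):
    """Combine translational progress with a rotation reward weighted by w_rot."""
    return distance_progress + w_rot * rotation_toward_goal

# Example: the object moved 0.2 units closer to the target and rotated
# 0.1 rad toward the desired orientation during this step.
r = agent_reward(distance_progress=0.2, rotation_toward_goal=0.1, w_rot=0.5)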

