STOCHASTIC RESOURCE ALLOCATION IN MULTIAGENT ENVIRONMENTS: AN APPROACH BASED ON DISTRIBUTED Q-VALUES AND BOUNDED REAL-TIME DYNAMIC PROGRAMMING
This paper addresses the efficient solution of stochastic resource allocation problems in multiagent environments. When the resources are distributed among the agents a priori, but the actions of one agent may influence the reward obtained by at least one other agent, we propose a distributed Q-values approach. This approach coordinates the agents' rewards and thereby reduces the set of states and actions to consider. When, on the other hand, the resources are available to all agents, no distributed Q-values approach is possible; for this case we propose tight lower and upper bounds for existing heuristic search algorithms. Our experimental results demonstrate the efficiency of the distributed Q-values approach in terms of planning time, and of the tight bounds in terms of fast convergence and a reduced number of backups.
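To make the role of lower and upper bounds concrete, the following minimal sketch maintains both bounds on the value of each state in a toy single-agent allocation problem and stops once they meet at the start state. Everything in it is an illustrative assumption rather than the method of this paper: the state encoding `(tasks, a, b)`, the success probabilities in `P_SUCCESS`, the trivial initial bounds, and the use of full sweeps in place of the trial-based state selection that a bounded real-time dynamic programming algorithm would use.

```python
# Hypothetical toy problem: `tasks` targets remain, and the agent holds `a`
# resources of type "a" and `b` of type "b". Using one resource on a task
# completes it with the probability below and yields a reward of 1.
P_SUCCESS = {"a": 0.5, "b": 0.8}  # assumed values, for illustration only


def actions(state):
    """Resource types that can still be used in this state."""
    tasks, a, b = state
    if tasks == 0:
        return []
    return [name for name, count in (("a", a), ("b", b)) if count > 0]


def outcomes(state, action):
    """List of (probability, reward, next_state) for using one resource."""
    tasks, a, b = state
    if action == "a":
        a -= 1
    else:
        b -= 1
    p = P_SUCCESS[action]
    return [(p, 1.0, (tasks - 1, a, b)), (1.0 - p, 0.0, (tasks, a, b))]


def backup(bound, state):
    """One Bellman backup of `state` against the given bound table."""
    acts = actions(state)
    if not acts:
        return 0.0  # terminal: no tasks left or no resources left
    return max(
        sum(p * (r + bound[nxt]) for p, r, nxt in outcomes(state, act))
        for act in acts
    )


def solve(start, eps=1e-6):
    """Back up both bounds until their gap at `start` is at most eps."""
    tasks, a, b = start
    states = [
        (t, i, j)
        for t in range(tasks + 1)
        for i in range(a + 1)
        for j in range(b + 1)
    ]
    # Trivial admissible bounds: rewards are non-negative, and no policy can
    # complete more tasks than there are tasks or resources.
    lower = {s: 0.0 for s in states}
    upper = {s: float(min(s[0], s[1] + s[2])) for s in states}
    sweeps = 0
    while upper[start] - lower[start] > eps:
        for s in states:
            lower[s] = backup(lower, s)
            upper[s] = backup(upper, s)
        sweeps += 1
    return lower[start], upper[start], sweeps


if __name__ == "__main__":
    print(solve((2, 2, 1)))
```

The gap between the two bounds serves as the stopping criterion, so tighter initial bounds leave less work for the backups. In this acyclic toy problem a single sweep already closes the gap; the benefit of tight bounds would show up in larger problems where only part of the state space is visited.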