Distributed Policy Evaluation with Fractional Order Dynamics in Multiagent Reinforcement Learning
The main objective of multiagent reinforcement learning is to find a globally optimal policy. Evaluating the value function is difficult when the state space is high-dimensional. We therefore reformulate the multiagent reinforcement learning problem as a distributed optimization problem with constraints. In this formulation, all agents share the state and action spaces, but each agent observes only its own local reward. We then propose a distributed optimization algorithm with fractional-order dynamics to solve this problem, prove its convergence, and illustrate its effectiveness with a numerical example.
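The setting described above, with a shared state space, private local rewards, and a consensus requirement across agents, can be illustrated with a minimal sketch. The abstract does not give the update rule, so the code below is not the proposed algorithm: it assumes ordinary integer-order gradient tracking in place of the fractional-order dynamics, a tabular value function, and a least-squares Bellman residual as each agent's local objective. The transition matrix, rewards, mixing weights, and all function names are hypothetical.

```python
# Sketch only: consensus-based policy evaluation with gradient tracking.
# Assumption: integer-order dynamics stand in for the fractional-order
# dynamics of the paper, whose update rule the abstract does not state.

GAMMA = 0.5  # discount factor (illustrative value)

# Transition matrix of the fixed policy over 3 shared states (hypothetical).
P = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.2, 0.0, 0.8]]

# Local rewards: row i is known only to agent i (hypothetical values).
LOCAL_REWARDS = [[1.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0],
                 [0.0, 0.0, 3.0],
                 [1.0, 1.0, 1.0]]

# Symmetric, doubly stochastic mixing weights for a ring of 4 agents.
W = [[0.50, 0.25, 0.00, 0.25],
     [0.25, 0.50, 0.25, 0.00],
     [0.00, 0.25, 0.50, 0.25],
     [0.25, 0.00, 0.25, 0.50]]


def bellman_residual(v, r):
    """Return (I - GAMMA * P) v - r for value estimate v and reward r."""
    n = len(v)
    return [v[s] - GAMMA * sum(P[s][t] * v[t] for t in range(n)) - r[s]
            for s in range(n)]


def local_gradient(v, r):
    """Gradient of 0.5 * ||(I - GAMMA * P) v - r||^2 with respect to v."""
    res = bellman_residual(v, r)
    n = len(v)
    # Multiply the residual by the transpose of (I - GAMMA * P).
    return [res[t] - GAMMA * sum(P[s][t] * res[s] for s in range(n))
            for t in range(n)]


def mix(rows):
    """Replace each agent's vector by the W-weighted average over neighbours."""
    n_agents, dim = len(rows), len(rows[0])
    return [[sum(W[i][j] * rows[j][d] for j in range(n_agents))
             for d in range(dim)]
            for i in range(n_agents)]


def distributed_policy_evaluation(step=0.1, iterations=2000):
    """Run gradient tracking; return one value estimate per agent."""
    n_agents, dim = len(LOCAL_REWARDS), len(P)
    values = [[0.0] * dim for _ in range(n_agents)]
    grads = [local_gradient(values[i], LOCAL_REWARDS[i])
             for i in range(n_agents)]
    trackers = [g[:] for g in grads]  # each agent's estimate of the mean gradient
    for _ in range(iterations):
        mixed_values = mix(values)
        # Consensus step on the values, then descent along the tracked gradient.
        values = [[mixed_values[i][d] - step * trackers[i][d]
                   for d in range(dim)]
                  for i in range(n_agents)]
        new_grads = [local_gradient(values[i], LOCAL_REWARDS[i])
                     for i in range(n_agents)]
        mixed_trackers = mix(trackers)
        # Tracker update: mix with neighbours, add the change in local gradient.
        trackers = [[mixed_trackers[i][d] + new_grads[i][d] - grads[i][d]
                     for d in range(dim)]
                    for i in range(n_agents)]
        grads = new_grads
    return values


if __name__ == "__main__":
    for i, v in enumerate(distributed_policy_evaluation()):
        print(f"agent {i}: {[round(x, 4) for x in v]}")
```

In this sketch each agent uses only its own reward row and its neighbours' current estimates, yet the minimizer of the summed local objectives is the value function for the average reward, so the agents' estimates should agree on it. Calling `bellman_residual` on any agent's estimate with the averaged reward gives a quick check that the residual is close to zero.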