Multiagent reinforcement learning with the partly high-dimensional state space

2006 ◽  
Vol 37 (9) ◽  
pp. 22-31 ◽  
Author(s):  
Kazuyuki Fujita ◽  
Hiroshi Matsuo

2021 ◽
Vol 2021 ◽  
pp. 1-7
Author(s):  
Wei Dai ◽  
Wei Wang ◽  
Zhongtian Mao ◽  
Ruwen Jiang ◽  
Fudong Nian ◽  
...  

The main objective of multiagent reinforcement learning is to achieve a globally optimal policy, but evaluating the value function is difficult when the state space is high-dimensional. We therefore reformulate multiagent reinforcement learning as a distributed optimization problem with constraint terms, in which all agents share the state and action spaces but each agent observes only its own local reward. We then propose a distributed optimization algorithm with fractional-order dynamics to solve this problem. Finally, we prove the convergence of the proposed algorithm and illustrate its effectiveness with a numerical example.
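The flavor of this reformulation can be conveyed with a minimal sketch of consensus-based distributed optimization (the simpler integer-order baseline, not the paper's fractional-order dynamics; the quadratic objectives, ring topology, and step size are all illustrative assumptions). Each agent descends its own local objective, standing in for its local reward, while averaging its estimate with neighbors; the estimates jointly approach the minimizer of the global sum.

```python
import numpy as np

# Each agent i minimizes f_i(x) = 0.5*(x - c_i)^2 but only knows its own c_i;
# the global optimum of sum_i f_i is the mean of the c_i.
c = np.array([1.0, 3.0, -2.0, 6.0])   # local targets (stand-ins for local rewards)
x = np.zeros(4)                        # each agent's current estimate

# doubly stochastic mixing matrix for a 4-agent ring topology
W = np.array([[0.5,  0.25, 0.0,  0.25],
              [0.25, 0.5,  0.25, 0.0 ],
              [0.0,  0.25, 0.5,  0.25],
              [0.25, 0.0,  0.25, 0.5 ]])

alpha = 0.1                            # constant step size (assumed)
for _ in range(500):
    grad = x - c                       # local gradient of f_i at x_i
    x = W @ x - alpha * grad           # consensus averaging + local descent

# the average of the estimates converges to mean(c) = 2.0; with a constant
# step size the individual estimates retain a small residual disagreement,
# which a diminishing step size would remove
print(x, x.mean())
```

The mixing step `W @ x` is what lets each agent benefit from rewards it never observes: information about every local objective diffuses around the ring.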


2010 ◽  
Vol 30 (2) ◽  
pp. 192-215 ◽  
Author(s):  
Alexander Shkolnik ◽  
Michael Levashov ◽  
Ian R. Manchester ◽  
Russ Tedrake

A motion planning algorithm is described for bounding over rough terrain with the LittleDog robot. Unlike walking gaits, bounding is highly dynamic and cannot be planned with quasi-steady approximations. LittleDog is modeled as a planar five-link system with a 16-dimensional state space; computing a plan over rough terrain in this high-dimensional state space that respects the kinodynamic constraints due to underactuation and motor limits is extremely challenging. Rapidly-exploring Random Trees (RRTs) are known for fast kinematic path planning in high-dimensional configuration spaces in the presence of obstacles, but their search efficiency degrades rapidly with the addition of challenging dynamics. A computationally tractable planner for bounding was developed by modifying the RRT algorithm to use: (1) motion primitives to reduce the dimensionality of the problem; (2) Reachability Guidance, which dynamically changes the sampling distribution and distance metric to address differential constraints and discontinuous motion-primitive dynamics; and (3) sampling with a Voronoi bias in a lower-dimensional “task space” for bounding. Short trajectories were demonstrated to work on the robot; however, open-loop bounding is inherently unstable. A feedback controller based on transverse linearization was implemented and shown in simulation to stabilize perturbations in the presence of noise and time delays.
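The Voronoi bias that drives RRT exploration can be seen in a bare-bones kinematic sketch (a hedged illustration in an empty 2-D workspace, not the LittleDog planner; the paper's Reachability Guidance replaces exactly this Euclidean nearest-neighbor metric and uniform sampling distribution, which break down for underactuated dynamics). Extending the node nearest to a uniform random sample implicitly favors nodes with large Voronoi regions, i.e. the frontier of the tree.

```python
import numpy as np

np.random.seed(1)
start, goal = np.array([0.05, 0.05]), np.array([0.95, 0.95])
step = 0.05                                  # fixed extension length (assumed)
nodes, parent = [start], {0: None}

for _ in range(2000):
    # 10% goal bias, otherwise uniform sampling over the unit square
    sample = goal if np.random.rand() < 0.1 else np.random.rand(2)
    arr = np.array(nodes)
    i = int(np.argmin(np.linalg.norm(arr - sample, axis=1)))  # Voronoi bias
    direction = sample - nodes[i]
    new = nodes[i] + step * direction / (np.linalg.norm(direction) + 1e-12)
    nodes.append(new)
    parent[len(nodes) - 1] = i
    if np.linalg.norm(new - goal) < step:    # reached the goal region
        break

# walk parent pointers back to the root to recover the path
path, j = [], len(nodes) - 1
while j is not None:
    path.append(nodes[j])
    j = parent[j]
path.reverse()
print(len(path), "waypoints")
```

Swapping the distance metric and sampling distribution in the two marked lines is precisely where Reachability Guidance intervenes, so that "nearest" means "cheapest to reach under the dynamics" rather than "closest in Euclidean distance."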


2011 ◽  
Vol 11 (3&4) ◽  
pp. 313-325
Author(s):  
Warner A. Miller

An increase in the dimension of the state space for quantum key distribution (QKD) can decrease its fidelity requirements while also increasing its bandwidth. A significant obstacle for QKD with qudits ($d \geq 3$) has been an efficient and practical quantum state sorter for photons whose complex fields are modulated in both amplitude and phase. We propose such a sorter based on a multiplexed thick hologram, constructed, e.g., from photo-thermo-refractive (PTR) glass. We validate this approach using coupled-mode theory, with parameters consistent with PTR glass, to simulate a holographic sorter. The model assumes a three-dimensional state space spanned by three tilted plane waves. The utility of such a sorter for broader quantum information processing applications can be substantial.
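The kind of coupled-mode calculation used for validation can be sketched numerically (a minimal illustration with assumed coupling values, not the PTR-glass parameters of the paper). For three coupled plane-wave modes the amplitudes obey i dA/dz = C A with a Hermitian coupling matrix C, so propagation through the hologram is unitary: total power is conserved while being redistributed among the modes.

```python
import numpy as np

kappa = 0.8                                # coupling strength (assumed value)
# nearest-neighbor coupling between three tilted plane-wave modes
C = kappa * np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=complex)

# C is Hermitian, so expm(-i C z) can be built from its real eigenvalues
w, V = np.linalg.eigh(C)

def propagate(A0, z):
    """Evolve mode amplitudes: A(z) = V exp(-i w z) V^dagger A(0)."""
    return V @ (np.exp(-1j * w * z) * (V.conj().T @ A0))

A0 = np.array([1.0, 0.0, 0.0], dtype=complex)   # launch all power in mode 0
A = propagate(A0, z=2.0)                         # propagation length (assumed)
powers = np.abs(A) ** 2
print(powers, powers.sum())   # power redistributes across modes; sum stays 1
```

Because the evolution is unitary, a sorter designed this way can in principle map each input superposition to a distinct output port without loss, which is what makes the approach attractive for qudit QKD.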


2016 ◽  
Vol 11 (3) ◽  
pp. 350-374 ◽  
Author(s):  
Chris Westbury

There is a distinction in scientific explanation between the explanandum, statements describing the empirical phenomenon to be explained, and the explanans, statements describing the evidence that allows one to predict that phenomenon. To avoid tautology, these sets of statements must refer to distinct domains. A scientific explanation of semantics must therefore be grounded in explanans that appeal to entities from non-semantic domains. I consider as examples eight candidate domains (including affect, lexical or sub-word co-occurrence, mental simulation, and associative learning) that could ground semantics. Following Wittgenstein (1954), I propose that adjudicating between these different domains is difficult because of the reification of a word’s ‘meaning’ as an atomistic unit. If we abandon the idea of the meaning of a word as an atomistic unit and instead think of word meaning as a set of dynamic and disparate embodied states unified by a shared label, many apparent problems associated with identifying a meaning’s ‘true’ explanans disappear. Semantics can then be considered as sets of weighted constraints that are individually sufficient for specifying and labeling a subjectively recognizable location in the high-dimensional state space defined by our neural activity.


2011 ◽  
Author(s):  
Robert W. Boyd ◽  
Anand Jha ◽  
Mehul Malik ◽  
Colin O'Sullivan ◽  
Brandon Rodenburg ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Baolai Wang ◽  
Shengang Li ◽  
Xianzhong Gao ◽  
Tao Xie

With the development of unmanned aerial vehicle (UAV) technology, UAV swarm confrontation has attracted many researchers’ attention. However, the situation faced by a UAV swarm has substantial uncertainty and dynamic variability, and the state and action spaces grow exponentially with the number of UAVs, so autonomous decision-making becomes a difficult problem in the confrontation environment. In this paper, a multiagent reinforcement learning method with macro actions and human expertise is proposed for autonomous decision-making of UAVs. In the proposed approach, the UAV swarm is modeled as a large multiagent system (MAS) with each individual UAV as an agent, and the sequential decision-making problem in swarm confrontation is modeled as a Markov decision process. Agents in the proposed method are trained on macro actions, which effectively mitigates the sparse and delayed rewards and the large state and action spaces. The key to the success of this method is the generation of macro actions that allow the high-level policy to find a near-optimal solution; we further leverage human expertise to design a set of good macro actions. Extensive empirical experiments in our constructed swarm confrontation environment show that our method performs better than the other algorithms.
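Why macro actions help under sparse, delayed rewards can be illustrated with a toy example (a hedged sketch, not the paper's swarm environment; the chain task, the macro set, and every hyperparameter are illustrative assumptions). Because each hand-designed macro action covers several primitive steps, the single terminal reward propagates back to the start state in far fewer Bellman updates.

```python
import numpy as np

np.random.seed(2)
n_states = 20                              # chain: start at 0, goal at 19
macros = [-1, +3, +5]                      # hand-designed macro actions
Q = np.zeros((n_states, len(macros)))      # tabular Q over macro actions
alpha, gamma, eps = 0.5, 0.95, 0.2

for episode in range(300):
    s = 0
    while s < n_states - 1:
        # epsilon-greedy macro-action selection
        a = (np.random.randint(len(macros)) if np.random.rand() < eps
             else int(np.argmax(Q[s])))
        s2 = int(np.clip(s + macros[a], 0, n_states - 1))
        r = 1.0 if s2 == n_states - 1 else 0.0   # sparse terminal reward
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        s = s2

# greedy rollout with the learned macro-action policy
s, steps = 0, 0
while s < n_states - 1 and steps < 20:
    s = int(np.clip(s + macros[int(np.argmax(Q[s]))], 0, n_states - 1))
    steps += 1
print("reached goal in", steps, "macro steps")
```

With primitive +1 steps the same reward would have to travel back through 19 states; the macro set shortens the effective horizon, which is the same leverage the paper's human-designed macro actions give the high-level policy.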

