Memory-constrained ML-optimal tree search detection

Finite-horizon lookahead policies are widely used in Reinforcement Learning and demonstrate impressive empirical success. These policies are usually implemented with specific planning methods such as Monte Carlo Tree Search (e.g., in AlphaZero (Silver et al. 2017b)). Referring to the planning problem as tree search, a reasonable practice in these implementations is to back up the value only at the leaves, while the information obtained at the root is not leveraged other than for updating the policy. Here, we question the potency of this approach: the procedure is non-contractive in general, and its convergence is not guaranteed. Our proposed enhancement is simple: use the return from the optimal tree path to back up the values at the descendants of the root. This leads to a γ^h-contracting procedure, where γ is the discount factor and h is the tree depth. To establish our results, we first introduce a notion called multiple-step greedy consistency. We then provide convergence rates for two algorithmic instantiations of the above enhancement in the presence of noise injected into both the tree search stage and the value estimation stage.
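The γ^h contraction mentioned in the abstract can be checked numerically on a toy problem. The sketch below is not the paper's algorithm: it only applies the standard optimal Bellman backup h times on a small random tabular MDP and compares the sup-norm distance between two value estimates before and after. The names `bellman_backup`, `h_step_backup`, `random_mdp`, and the MDP layout (`P`, `R`) are assumptions made for this illustration.

```python
import random

def bellman_backup(v, P, R, gamma):
    """One optimal Bellman backup on a tabular MDP (illustrative layout).

    P[s][a] is a list of (next_state, probability) pairs and R[s][a] is the
    immediate reward for taking action a in state s.
    """
    return [
        max(
            R[s][a] + gamma * sum(p * v[s2] for s2, p in P[s][a])
            for a in range(len(P[s]))
        )
        for s in range(len(P))
    ]

def h_step_backup(v, P, R, gamma, h):
    """Apply the backup h times: a depth-h lookahead with v at the leaves."""
    for _ in range(h):
        v = bellman_backup(v, P, R, gamma)
    return v

def sup_dist(u, v):
    """Sup-norm distance between two value vectors."""
    return max(abs(a - b) for a, b in zip(u, v))

def random_mdp(n_states, n_actions, rng):
    """Build a random MDP with normalized transition probabilities."""
    P, R = [], []
    for _ in range(n_states):
        P_s, R_s = [], []
        for _ in range(n_actions):
            w = [rng.random() for _ in range(n_states)]
            total = sum(w)
            P_s.append([(s2, w[s2] / total) for s2 in range(n_states)])
            R_s.append(rng.random())
        P.append(P_s)
        R.append(R_s)
    return P, R

rng = random.Random(0)
gamma, h = 0.9, 3
P, R = random_mdp(5, 3, rng)
v1 = [rng.uniform(-1, 1) for _ in range(5)]
v2 = [rng.uniform(-1, 1) for _ in range(5)]

before = sup_dist(v1, v2)
after = sup_dist(
    h_step_backup(v1, P, R, gamma, h),
    h_step_backup(v2, P, R, gamma, h),
)
# The distance after h backups should not exceed gamma**h times the original.
print(after, "<=", gamma**h * before)
```

Because a single backup is a γ-contraction in the sup norm, composing it h times gives the γ^h factor; the paper's contribution concerns which values in the tree receive this backup, which the sketch does not model.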

Optimal Tree Search by a Swarm of Mobile Robots

Information and Communication Technology - Advances in Intelligent Systems and Computing ◽ 10.1007/978-981-10-5508-9_17 ◽ 2017 ◽ pp. 179-187

Author(s): Maitry Sinha ◽ Srabani Mukhopadhyaya

Keyword(s): Mobile Robots ◽ Tree Search ◽ Optimal Tree

A Unification of ML-Optimal Tree-Search Decoders

2006 Fortieth Asilomar Conference on Signals, Systems and Computers ◽ 10.1109/acssc.2006.355156 ◽ 2006 ◽ Cited By ~ 7

Author(s): Christoph Studer ◽ Andreas Burg ◽ Wolfgang Fichtner

Keyword(s): Tree Search ◽ Optimal Tree

Hyper-parameter Optimization for Monte Carlo Tree Search using Self-play

Korean Institute of Smart Media ◽ 10.30693/smj.2020.9.4.36 ◽ 2020 ◽ Vol 9 (4) ◽ pp. 36-43

Author(s): Jin-Seon Lee ◽ Il-Seok Oh

Keyword(s): Monte Carlo ◽ Parameter Optimization ◽ Tree Search ◽ Monte Carlo Tree Search

A Theory of Heuristic Information in Game-Tree Search

10.1007/978-3-642-61368-5 ◽ 1988 ◽ Cited By ~ 6

Author(s): Chun-Hung Tzeng

Keyword(s): Tree Search ◽ Game Tree ◽ Heuristic Information ◽ Game Tree Search

Production scheduling in industrial mining complexes with incoming new information using tree search and deep reinforcement learning

Applied Soft Computing ◽ 10.1016/j.asoc.2021.107644 ◽ 2021 ◽ pp. 107644

Author(s): Ashish Kumar ◽ Roussos Dimitrakopoulos

Keyword(s): Reinforcement Learning ◽ Production Scheduling ◽ Tree Search ◽ New Information ◽ Industrial Mining

Incorporation of Potential Fields and Motion Primitives for the Collision Avoidance of Unmanned Aircraft

Applied Sciences ◽ 10.3390/app11073103 ◽ 2021 ◽ Vol 11 (7) ◽ pp. 3103

Author(s): Kyuman Lee ◽ Daegyun Choi ◽ Donghoon Kim

Keyword(s): Collision Avoidance ◽ Search Algorithm ◽ Unmanned Aircraft ◽ Artificial Potential Field ◽ Tree Search ◽ Local Minima ◽ Collision Risk ◽ Motion Primitives ◽ Multiple Uavs ◽ Tree Search Algorithm

Collision avoidance (CA) using the artificial potential field (APF) usually faces several known issues, such as local minima and dynamic infeasibility, so unmanned aerial vehicle (UAV) paths planned with the APF are safe only in certain environments. This research proposes a CA approach that combines the APF with motion primitives (MPs) to tackle these known problems. Since MPs solve for a locally optimal trajectory with respect to the allocated time, the trajectory obtained by the MPs is verified as dynamically feasible. When a collision checker based on the k-d tree search algorithm detects collision risk at sample points extracted from the planned trajectory, re-planned path candidates are generated to avoid the obstacles. After unsafe route candidates are rejected, the APF is applied to select the best route among the remaining safe-path candidates. To validate the proposed approach, we simulated two scenarios: static obstacles with local minima, and dynamic environments with multiple UAVs. The simulation results show that the proposed approach provides smooth, efficient, and dynamically feasible paths compared to the APF.
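The collision-checking step described in this abstract can be illustrated with a small k-d tree. This is a minimal sketch under stated assumptions, not the authors' implementation: obstacles are treated as 2-D points, a sample is flagged when its nearest obstacle lies closer than a hypothetical `safety_radius`, and the helper names `build_kdtree`, `nearest_distance`, and `first_collision` are invented for the example.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively build a k-d tree (as nested dicts) from a list of tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest_distance(node, query, best=math.inf):
    """Return the distance from query to its nearest neighbour in the tree."""
    if node is None:
        return best
    best = min(best, math.dist(node["point"], query))
    axis = node["axis"]
    diff = query[axis] - node["point"][axis]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest_distance(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best:
        best = nearest_distance(far, query, best)
    return best

def first_collision(tree, trajectory_samples, safety_radius):
    """Index of the first sample closer than safety_radius to an obstacle, else None."""
    for i, sample in enumerate(trajectory_samples):
        if nearest_distance(tree, sample) < safety_radius:
            return i
    return None

# Hypothetical obstacle positions and a straight-line trajectory along the x-axis.
obstacles = [(2.0, 2.0), (5.0, 1.0), (8.0, 3.0)]
tree = build_kdtree(obstacles)
trajectory = [(float(x), 0.0) for x in range(10)]
hit = first_collision(tree, trajectory, safety_radius=1.5)
print(hit)  # index of the first risky sample, or None if the path is clear
```

In the pipeline the abstract describes, a detected risk would trigger generation of re-planned path candidates; that step and the APF-based selection are outside this sketch.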

Monte Carlo Tree Search for the Game of Diplomacy

11th Hellenic Conference on Artificial Intelligence ◽ 10.1145/3411408.3411413 ◽ 2020

Author(s): Alexios Theodoridis ◽ Georgios Chalkiadakis

Keyword(s): Monte Carlo ◽ Tree Search ◽ Monte Carlo Tree Search

Comparison of Tabu/2-opt heuristic and optimal tree search method for assignment problems

An optimal tree search method for the manufacturing systems cell formation problem

How to Combine Tree-Search Methods in Reinforcement Learning
