A Learning Based Branch and Bound for Maximum Common Subgraph Related Problems

2020, Vol. 34 (03), pp. 2392-2399
Author(s): Yanli Liu, Chu-Min Li, Hua Jiang, Kun He

The performance of a branch-and-bound (BnB) algorithm for the maximum common subgraph (MCS) problem and its related problems, such as maximum common connected subgraph (MCCS) and induced subgraph isomorphism (SI), crucially depends on the branching heuristic. We propose a branching heuristic inspired by reinforcement learning, with the goal of reaching a tree leaf as early as possible to greatly reduce the search tree size. Experimental results show that the proposed heuristic consistently and significantly improves the current best BnB algorithm for the MCS, MCCS and SI problems. An analysis is carried out to give insight into why and how reinforcement learning is useful in the new branching heuristic.
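A minimal Python sketch of the kind of learned, value-driven branching the abstract describes, under assumptions: each vertex accumulates a reward equal to the bound reduction observed after branching on it, and later branching prefers high-scoring vertices. The class and attribute names are illustrative, not the authors' implementation.

```python
from collections import defaultdict

class LearnedBranching:
    """Hypothetical bandit-style vertex scoring for a BnB branching heuristic."""

    def __init__(self):
        self.score = defaultdict(float)  # accumulated reward per vertex

    def choose(self, candidate_vertices):
        # Branch on the vertex whose past branchings shrank the bound the most,
        # which biases the search toward reaching leaves quickly.
        return max(candidate_vertices, key=lambda v: self.score[v])

    def update(self, vertex, bound_before, bound_after):
        # Reward = reduction of the upper bound caused by this branching step.
        self.score[vertex] += max(0.0, bound_before - bound_after)
```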

Author(s): Daniel Anderson, Gregor Hendel, Pierre Le Bodic, Merlin Viernickel

We propose a simple and general online method to measure search progress within the branch-and-bound algorithm, from which we estimate the size of the remaining search tree. We then show how this information can help solvers algorithmically at runtime by designing a restart strategy for Mixed-Integer Programming (MIP) solvers that decides whether to restart the search based on the current estimate of the number of remaining nodes in the tree. We refer to this type of algorithm as clairvoyant. Our clairvoyant restart strategy outperforms a state-of-the-art solver on a large set of publicly available MIP benchmark instances. It is implemented in the MIP solver SCIP and will be available in future releases.
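A hedged sketch (not SCIP's actual implementation) of a clairvoyant restart rule: extrapolate the total tree size from the fraction of the search space already closed, and restart when the forecasted remaining work dwarfs the work already done. The extrapolation rule and thresholds are assumptions made for illustration.

```python
def estimate_remaining_nodes(nodes_processed, fraction_closed):
    """fraction_closed: share of the tree proven so far, e.g. sum of 2**-depth over pruned leaves."""
    if fraction_closed <= 0.0:
        return float("inf")
    estimated_total = nodes_processed / fraction_closed
    return max(0.0, estimated_total - nodes_processed)

def should_restart(nodes_processed, fraction_closed, factor=50.0, min_nodes=1000):
    # Only restart once enough nodes have been seen to trust the estimate,
    # and only if the forecasted remaining effort is much larger than the
    # effort already spent.
    if nodes_processed < min_nodes:
        return False
    return estimate_remaining_nodes(nodes_processed, fraction_closed) > factor * nodes_processed
```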


Author(s): Ciaran McCreesh, Patrick Prosser, James Trimble

We introduce a new branch-and-bound algorithm for the maximum common subgraph and maximum common connected subgraph problems which is based on vertex labelling and partitioning. Our method in some ways resembles a traditional constraint programming approach, but uses a novel compact domain store and supporting inference algorithms which dramatically reduce the memory and computation requirements during search, and allow better dual viewpoint ordering heuristics to be calculated cheaply. Experiments show a speedup of more than an order of magnitude over the state of the art, and demonstrate that we can operate on much larger graphs without running out of memory.
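A minimal sketch of the label-partition idea the abstract describes, under assumptions: vertices of the two graphs that share the same adjacency history are kept in a common class, the bound is the sum of the smaller side of each class, and matching a new vertex pair splits every class by adjacency. The data layout (pairs of Python lists, adjacency sets) is illustrative only.

```python
def bound(label_classes):
    # Upper bound on how many more vertices can still be matched.
    return sum(min(len(left), len(right)) for left, right in label_classes)

def refine(label_classes, v, w, adj_g, adj_h):
    """Split every class by adjacency to the newly matched pair (v, w)."""
    new_classes = []
    for left, right in label_classes:
        for keep_adjacent in (True, False):
            l = [u for u in left if u != v and (u in adj_g[v]) == keep_adjacent]
            r = [u for u in right if u != w and (u in adj_h[w]) == keep_adjacent]
            if l and r:  # empty halves are dropped, keeping the store compact
                new_classes.append((l, r))
    return new_classes
```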


2014, Vol. 571-572, pp. 105-108
Author(s): Lin Xu

This paper proposes a new framework combining reinforcement learning with a cloud computing digital library. Unified self-learning algorithms, which include reinforcement learning and artificial intelligence among others, have led to many essential advances. Given the current status of highly available models, analysts urgently desire the deployment of write-ahead logging. In this paper we examine how DNS can be applied to the investigation of superblocks, and introduce reinforcement learning to improve the quality of the current cloud computing digital library. The experimental results show that the method works more efficiently.


Author(s): Nicolas Bougie, Ryutaro Ichise

Deep reinforcement learning (DRL) methods traditionally struggle with tasks where environment rewards are sparse or delayed, which entails that exploration remains one of the key challenges of DRL. Instead of solely relying on extrinsic rewards, many state-of-the-art methods use intrinsic curiosity as exploration signal. While they hold promise of better local exploration, discovering global exploration strategies is beyond the reach of current methods. We propose a novel end-to-end intrinsic reward formulation that introduces high-level exploration in reinforcement learning. Our curiosity signal is driven by a fast reward that deals with local exploration and a slow reward that incentivizes long-time horizon exploration strategies. We formulate curiosity as the error in an agent’s ability to reconstruct the observations given their contexts. Experimental results show that this high-level exploration enables our agents to outperform prior work in several Atari games.
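A hedged sketch of how such a two-timescale curiosity bonus could be combined with the extrinsic reward; the reconstruction-error form, coefficient names, and scaling are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

def reconstruction_error(predicted_obs, obs):
    # Curiosity as the error in reconstructing an observation from its context.
    predicted_obs = np.asarray(predicted_obs, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((predicted_obs - obs) ** 2))

def total_reward(extrinsic, fast_error, slow_error,
                 beta_fast=1.0, beta_slow=0.1, eta=0.01):
    # fast_error: from a quickly updated predictor, spikes on locally novel states.
    # slow_error: from a slowly updated predictor, decays only once whole regions
    # of the environment have been revisited, encouraging long-horizon exploration.
    intrinsic = beta_fast * fast_error + beta_slow * slow_error
    return extrinsic + eta * intrinsic
```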


Mathematics, 2020, Vol. 8 (8), pp. 1254
Author(s): Cheng-Hung Chen, Shiou-Yun Jeng, Cheng-Jian Lin

In this study, a fuzzy logic controller with a reinforcement-improved differential search algorithm (FLC_R-IDS) is proposed for solving a mobile robot wall-following control problem. This study uses the reward and punishment mechanisms of reinforcement learning to train the mobile robot wall-following control. The proposed improved differential search algorithm uses parameter adaptation to adjust the control parameters. To improve the exploration of the algorithm, the number of superorganisms is changed when a stopover site is involved. This study uses reinforcement learning to guide the behavior of the robot. When the mobile robot satisfies three reward conditions, it receives a reward of +1. The accumulated reward value is used to evaluate the controller and to inform the next round of controller training. Experimental results show that, compared with the traditional differential search algorithm and the chaos differential search algorithm, the average error value of the proposed FLC_R-IDS in the three experimental environments is reduced by 12.44%, 22.54% and 25.98%, respectively. Finally, the experimental results also show that a real mobile robot using the proposed method can effectively implement wall-following control.
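A minimal sketch of the reward bookkeeping described above, with hypothetical condition names: the robot earns +1 on any step where all three wall-following conditions hold, and the accumulated total is used to score the controller.

```python
def step_reward(distance_ok, heading_ok, collision_free):
    # +1 only when all three (hypothetical) reward conditions are satisfied.
    return 1 if (distance_ok and heading_ok and collision_free) else 0

def evaluate_controller(episode_conditions):
    """episode_conditions: iterable of (distance_ok, heading_ok, collision_free) per time step."""
    return sum(step_reward(*conditions) for conditions in episode_conditions)
```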


Author(s): Tomohiro Yamaguchi, Shota Nagahama, Yoshihiro Ichikawa, Yoshimichi Honma, Keiki Takadama

This chapter describes solving multi-objective reinforcement learning (MORL) problems in which there are multiple conflicting objectives with unknown weights. Previous model-free MORL methods require a large number of calculations to collect a Pareto optimal set for each V/Q-value vector. In contrast, model-based MORL can reduce this calculation cost compared with model-free MORL. However, previous model-based MORL methods apply only to deterministic environments. To address these issues, this chapter proposes a novel model-based MORL method based on a reward occurrence probability (ROP) vector with unknown weights. Experimental results are reported for stochastic learning environments with up to 10 states, 3 actions, and 3 reward rules. They show that the proposed method collects all Pareto optimal policies, taking about 214 seconds of total learning time (10 states, 3 actions, 3 rewards). As future research directions, ways to speed up the method and how to use non-optimal policies are discussed.
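A small sketch of the Pareto filtering step implied above, under assumptions: each policy is summarized by a reward occurrence probability (ROP) vector, and a policy is kept if no other policy's vector dominates it componentwise. The NumPy representation is illustrative.

```python
import numpy as np

def dominates(a, b):
    # a dominates b if it is at least as good in every objective and strictly better in one.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return bool(np.all(a >= b) and np.any(a > b))

def pareto_optimal_set(rop_vectors):
    """rop_vectors: list of per-policy reward occurrence probability vectors."""
    return [v for i, v in enumerate(rop_vectors)
            if not any(dominates(w, v) for j, w in enumerate(rop_vectors) if j != i)]
```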

