A Learning Based Branch and Bound for Maximum Common Subgraph Related Problems

2020, Vol. 34 (03), pp. 2392-2399
Author(s): Yanli Liu, Chu-Min Li, Hua Jiang, Kun He

The performance of a branch-and-bound (BnB) algorithm for the maximum common subgraph (MCS) problem and its related problems, such as maximum common connected subgraph (MCCS) and induced subgraph isomorphism (SI), crucially depends on the branching heuristic. We propose a branching heuristic inspired by reinforcement learning, with the goal of reaching a tree leaf as early as possible to greatly reduce the search tree size. Experimental results show that the proposed heuristic consistently and significantly improves the current best BnB algorithm for the MCS, MCCS and SI problems. An analysis is carried out to give insight into why and how reinforcement learning is useful in the new branching heuristic.
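A minimal Python sketch of the kind of learned, value-driven branching the abstract describes, under assumptions: each vertex accumulates a reward equal to the bound reduction observed after branching on it, and later branching prefers high-scoring vertices. The class and attribute names are illustrative, not the authors' implementation.

```python
from collections import defaultdict

class LearnedBranching:
    """Hypothetical bandit-style vertex scoring for a BnB branching heuristic."""

    def __init__(self):
        self.score = defaultdict(float)  # accumulated reward per vertex

    def choose(self, candidate_vertices):
        # Branch on the vertex whose past branchings shrank the bound the most,
        # which biases the search toward reaching leaves quickly.
        return max(candidate_vertices, key=lambda v: self.score[v])

    def update(self, vertex, bound_before, bound_after):
        # Reward = reduction of the upper bound caused by this branching step.
        self.score[vertex] += max(0.0, bound_before - bound_after)
```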

Author(s): Daniel Anderson, Gregor Hendel, Pierre Le Bodic, Merlin Viernickel

We propose a simple and general online method to measure search progress within the branch-and-bound algorithm, from which we estimate the size of the remaining search tree. We then show how this information can help solvers algorithmically at runtime by designing a restart strategy for Mixed-Integer Programming (MIP) solvers that decides whether to restart the search based on the current estimate of the number of remaining nodes in the tree. We refer to this type of algorithm as clairvoyant. Our clairvoyant restart strategy outperforms a state-of-the-art solver on a large set of publicly available MIP benchmark instances. It is implemented in the MIP solver SCIP and will be available in future releases.
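A hedged sketch (not SCIP's actual implementation) of a clairvoyant restart rule: extrapolate the total tree size from the fraction of the search space already closed, and restart when the forecasted remaining work dwarfs the work already done. The extrapolation rule and thresholds are assumptions made for illustration.

```python
def estimate_remaining_nodes(nodes_processed, fraction_closed):
    """fraction_closed: share of the tree proven so far, e.g. sum of 2**-depth over pruned leaves."""
    if fraction_closed <= 0.0:
        return float("inf")
    estimated_total = nodes_processed / fraction_closed
    return max(0.0, estimated_total - nodes_processed)

def should_restart(nodes_processed, fraction_closed, factor=50.0, min_nodes=1000):
    # Only restart once enough nodes have been seen to trust the estimate,
    # and only if the forecasted remaining effort is much larger than the
    # effort already spent.
    if nodes_processed < min_nodes:
        return False
    return estimate_remaining_nodes(nodes_processed, fraction_closed) > factor * nodes_processed
```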


Author(s): Ciaran McCreesh, Patrick Prosser, James Trimble

We introduce a new branch-and-bound algorithm for the maximum common subgraph and maximum common connected subgraph problems which is based on vertex labelling and partitioning. Our method in some ways resembles a traditional constraint programming approach, but uses a novel compact domain store and supporting inference algorithms which dramatically reduce the memory and computation requirements during search, and allow better dual viewpoint ordering heuristics to be calculated cheaply. Experiments show a speedup of more than an order of magnitude over the state of the art, and demonstrate that we can operate on much larger graphs without running out of memory.
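A minimal sketch of the label-partition idea the abstract describes, under assumptions: vertices of the two graphs that share the same adjacency history are kept in a common class, the bound is the sum of the smaller side of each class, and matching a new vertex pair splits every class by adjacency. The data layout (pairs of Python lists, adjacency sets) is illustrative only.

```python
def bound(label_classes):
    # Upper bound on how many more vertices can still be matched.
    return sum(min(len(left), len(right)) for left, right in label_classes)

def refine(label_classes, v, w, adj_g, adj_h):
    """Split every class by adjacency to the newly matched pair (v, w)."""
    new_classes = []
    for left, right in label_classes:
        for keep_adjacent in (True, False):
            l = [u for u in left if u != v and (u in adj_g[v]) == keep_adjacent]
            r = [u for u in right if u != w and (u in adj_h[w]) == keep_adjacent]
            if l and r:  # empty halves are dropped, keeping the store compact
                new_classes.append((l, r))
    return new_classes
```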


2014, Vol. 571-572, pp. 105-108
Author(s): Lin Xu

This paper proposes a new framework combining reinforcement learning with a cloud computing digital library. Unified self-learning algorithms, which include reinforcement learning and artificial intelligence among others, have led to many essential advances. Given the current status of highly available models, analysts urgently desire the deployment of write-ahead logging. In this paper we examine how DNS can be applied to the investigation of superblocks, and introduce reinforcement learning to improve the quality of the current cloud computing digital library. The experimental results show that the method works more efficiently.


Author(s): Nicolas Bougie, Ryutaro Ichise

Deep reinforcement learning (DRL) methods traditionally struggle with tasks where environment rewards are sparse or delayed, which entails that exploration remains one of the key challenges of DRL. Instead of solely relying on extrinsic rewards, many state-of-the-art methods use intrinsic curiosity as exploration signal. While they hold promise of better local exploration, discovering global exploration strategies is beyond the reach of current methods. We propose a novel end-to-end intrinsic reward formulation that introduces high-level exploration in reinforcement learning. Our curiosity signal is driven by a fast reward that deals with local exploration and a slow reward that incentivizes long-time horizon exploration strategies. We formulate curiosity as the error in an agent’s ability to reconstruct the observations given their contexts. Experimental results show that this high-level exploration enables our agents to outperform prior work in several Atari games.
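A hedged sketch of how such a two-timescale curiosity bonus could be combined with the extrinsic reward; the reconstruction-error form, coefficient names, and scaling are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

def reconstruction_error(predicted_obs, obs):
    # Curiosity as the error in reconstructing an observation from its context.
    predicted_obs = np.asarray(predicted_obs, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((predicted_obs - obs) ** 2))

def total_reward(extrinsic, fast_error, slow_error,
                 beta_fast=1.0, beta_slow=0.1, eta=0.01):
    # fast_error: from a quickly updated predictor, spikes on locally novel states.
    # slow_error: from a slowly updated predictor, decays only once whole regions
    # of the environment have been revisited, encouraging long-horizon exploration.
    intrinsic = beta_fast * fast_error + beta_slow * slow_error
    return extrinsic + eta * intrinsic
```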


Mathematics, 2020, Vol. 8 (8), pp. 1254
Author(s): Cheng-Hung Chen, Shiou-Yun Jeng, Cheng-Jian Lin

In this study, a fuzzy logic controller with a reinforcement-improved differential search algorithm (FLC_R-IDS) is proposed for solving a mobile robot wall-following control problem. This study uses the reward and punishment mechanisms of reinforcement learning to train the mobile robot wall-following control. The proposed improved differential search algorithm uses parameter adaptation to adjust the control parameters. To improve the exploration of the algorithm, the number of superorganisms is changed when a stopover site is involved. This study uses reinforcement learning to guide the behavior of the robot. When the mobile robot satisfies three reward conditions, it receives a reward of +1. The accumulated reward value is used to evaluate the controller and to inform the next round of controller training. Experimental results show that, compared with the traditional differential search algorithm and the chaos differential search algorithm, the average error value of the proposed FLC_R-IDS in the three experimental environments is reduced by 12.44%, 22.54% and 25.98%, respectively. Finally, the experimental results also show that a real mobile robot using the proposed method can effectively implement wall-following control.
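A minimal sketch of the reward bookkeeping described above, with hypothetical condition names: the robot earns +1 on any step where all three wall-following conditions hold, and the accumulated total is used to score the controller.

```python
def step_reward(distance_ok, heading_ok, collision_free):
    # +1 only when all three (hypothetical) reward conditions are satisfied.
    return 1 if (distance_ok and heading_ok and collision_free) else 0

def evaluate_controller(episode_conditions):
    """episode_conditions: iterable of (distance_ok, heading_ok, collision_free) per time step."""
    return sum(step_reward(*conditions) for conditions in episode_conditions)
```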


Author(s): Tomohiro Yamaguchi, Shota Nagahama, Yoshihiro Ichikawa, Yoshimichi Honma, Keiki Takadama

This chapter describes solving multi-objective reinforcement learning (MORL) problems in which there are multiple conflicting objectives with unknown weights. Previous model-free MORL methods require a large number of calculations to collect a Pareto optimal set for each V/Q-value vector. In contrast, model-based MORL can reduce this calculation cost compared with model-free MORL. However, previous model-based MORL methods apply only to deterministic environments. To address these issues, this chapter proposes a novel model-based MORL method based on a reward occurrence probability (ROP) vector with unknown weights. Experimental results are reported for stochastic learning environments with up to 10 states, 3 actions, and 3 reward rules. They show that the proposed method collects all Pareto optimal policies, taking about 214 seconds of total learning time (10 states, 3 actions, 3 rewards). As future research directions, ways to speed up the method and how to use non-optimal policies are discussed.
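A small sketch of the Pareto filtering step implied above, under assumptions: each policy is summarized by a reward occurrence probability (ROP) vector, and a policy is kept if no other policy's vector dominates it componentwise. The NumPy representation is illustrative.

```python
import numpy as np

def dominates(a, b):
    # a dominates b if it is at least as good in every objective and strictly better in one.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return bool(np.all(a >= b) and np.any(a > b))

def pareto_optimal_set(rop_vectors):
    """rop_vectors: list of per-policy reward occurrence probability vectors."""
    return [v for i, v in enumerate(rop_vectors)
            if not any(dominates(w, v) for j, w in enumerate(rop_vectors) if j != i)]
```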

