Multi-Task Gaussian Process Upper Confidence Bound for Hyperparameter Tuning

2021 ◽  
Author(s):  
Bo Shen ◽  
Raghav Gnanasambandam ◽  
Rongxuan Wang ◽  
Zhenyu Kong

In many scientific and engineering applications, Bayesian optimization (BO) is a powerful tool for hyperparameter tuning of machine learning models, materials design and discovery, and similar tasks. BO guides the choice of experiments sequentially to find a good combination of design points in as few experiments as possible. It can be formulated as a problem of optimizing a “black-box” function. Unlike single-task Bayesian optimization, multi-task Bayesian optimization is a general method for efficiently optimizing multiple different but correlated “black-box” functions. Previous multi-task Bayesian optimization algorithms query a point to be evaluated for all tasks in each round of search, which is inefficient: when tasks are correlated, it is not necessary to evaluate every task at a given query point. Therefore, the objective of this work is to develop an algorithm for multi-task Bayesian optimization with automatic task selection, so that only one task evaluation is needed per query round. Specifically, a new algorithm, namely multi-task Gaussian process upper confidence bound (MT-GPUCB), is proposed to achieve this objective. MT-GPUCB is a two-step algorithm: the first step chooses which query point to evaluate, and the second step automatically selects the most informative task to evaluate. Under the bandit setting, a theoretical analysis shows that MT-GPUCB is no-regret under mild conditions. The proposed algorithm is verified experimentally on a range of synthetic functions as well as real-world problems. The results clearly show the advantages of the query strategy for both design points and tasks.
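
As a concrete illustration of the two-step query strategy, here is a minimal sketch in Python. Everything beyond the two-step structure is an assumption made for brevity: independent per-task GPs (fitted with scikit-learn) stand in for the paper's multi-task Gaussian process, the two toy task functions are invented, and beta is a fixed constant rather than the paper's theoretical schedule.

```python
# Minimal sketch of MT-GPUCB's two-step query strategy.
# Assumptions (not from the paper): independent per-task GPs instead of a
# multi-task kernel, toy task functions, fixed beta.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
tasks = [lambda x: -(x - 0.3) ** 2,            # toy task 0
         lambda x: -(x - 0.35) ** 2 + 0.01]    # toy task 1 (correlated)

# Two seed observations per task: lists of (x, y) pairs.
data = [[(float(x), f(float(x))) for x in rng.uniform(0, 1, 2)] for f in tasks]
candidates = np.linspace(0, 1, 201).reshape(-1, 1)
beta = 2.0

for t in range(20):
    mus, sds = [], []
    for X_y in data:
        X = np.array([p[0] for p in X_y]).reshape(-1, 1)
        yv = np.array([p[1] for p in X_y])
        gp = GaussianProcessRegressor(RBF(0.2), optimizer=None,
                                      alpha=1e-6, normalize_y=True).fit(X, yv)
        mu, sd = gp.predict(candidates, return_std=True)
        mus.append(mu); sds.append(sd)
    ucb = np.max([m + beta * s for m, s in zip(mus, sds)], axis=0)
    i = int(np.argmax(ucb))                  # step 1: query point by UCB
    k = int(np.argmax([s[i] for s in sds]))  # step 2: most uncertain task
    x = float(candidates[i, 0])
    data[k].append((x, tasks[k](x)))         # evaluate only the chosen task

print("best observed value per task:", [max(p[1] for p in d) for d in data])
```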


2019 ◽  
Vol 66 ◽  
pp. 151-196 ◽  
Author(s):  
Kirthevasan Kandasamy ◽  
Gautam Dasarathy ◽  
Junier Oliva ◽  
Jeff Schneider ◽  
Barnabás Póczos

In many scientific and engineering applications, we are tasked with the maximisation of an expensive-to-evaluate black-box function f. Traditional settings for this problem assume access only to this single function. However, in many cases, cheap approximations to f may be obtainable. For example, the expensive real-world behaviour of a robot can be approximated by a cheap computer simulation. We can use these approximations to cheaply eliminate regions of low function value and reserve the expensive evaluations of f for a small but promising region, speedily identifying the optimum. We formalise this task as a multi-fidelity bandit problem where the target function and its approximations are sampled from a Gaussian process. We develop MF-GP-UCB, a novel method based on upper confidence bound techniques. Our theoretical analysis demonstrates that it exhibits precisely the above behaviour and achieves better regret bounds than strategies which ignore multi-fidelity information. Empirically, MF-GP-UCB outperforms such naive strategies and other multi-fidelity methods on several synthetic and real experiments.
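
The selection rule at the core of MF-GP-UCB (combine the confidence bounds from all fidelities into one upper bound on f, then query a cheap fidelity only while it remains informative) can be sketched compactly. In the toy sketch below, the target f, its biased cheap approximation, the bias bound zeta, the std threshold gamma, and the fixed GP hyperparameters are all assumptions for illustration, not values from the paper.

```python
# Toy two-fidelity sketch of an MF-GP-UCB-style rule.
# Assumed: toy f and f_low, zeta (bias bound), gamma (std threshold),
# fixed GP hyperparameters.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f     = lambda x: np.sin(3 * x) * x         # expensive target
f_low = lambda x: np.sin(3 * x) * x + 0.1   # cheap, biased approximation
zeta, gamma, beta = 0.15, 0.05, 2.0

rng = np.random.default_rng(1)
X = [rng.uniform(0, 2, (3, 1)), rng.uniform(0, 2, (3, 1))]  # per-fidelity data
Y = [f_low(X[0]).ravel(), f(X[1]).ravel()]
cand = np.linspace(0, 2, 200).reshape(-1, 1)

for t in range(25):
    stats = []
    for m in range(2):
        gp = GaussianProcessRegressor(RBF(0.3), optimizer=None,
                                      alpha=1e-6, normalize_y=True).fit(X[m], Y[m])
        stats.append(gp.predict(cand, return_std=True))
    # Upper bound on f: the low fidelity's bound is corrected by zeta, and
    # the tighter of the two bounds is kept at each candidate.
    ucb = np.minimum(stats[0][0] + beta * stats[0][1] + zeta,
                     stats[1][0] + beta * stats[1][1])
    i = int(np.argmax(ucb))
    # Query the cheap source while it is still informative at that point.
    m = 0 if stats[0][1][i] > gamma else 1
    x = cand[i:i + 1]
    X[m] = np.vstack([X[m], x])
    Y[m] = np.append(Y[m], (f_low if m == 0 else f)(x).ravel())

print("best expensive evaluation:", Y[1].max())
```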


Author(s):  
Chao Qian ◽  
Hang Xiong ◽  
Ke Xue

Bayesian optimization (BO) is a popular approach for expensive black-box optimization, with applications including parameter tuning, experimental design, and robotics. BO usually models the objective function with a Gaussian process (GP) and iteratively samples the next data point by maximizing an acquisition function. In this paper, we propose a new general framework for BO that generates pseudo-points (i.e., data points whose objective values are not evaluated) to improve the GP model. With the classic acquisition function, the upper confidence bound (UCB), we prove that the cumulative regret can be generally upper bounded. Experiments using UCB and other acquisition functions, namely probability of improvement (PI) and expected improvement (EI), on synthetic as well as real-world problems clearly show the advantage of generating pseudo-points.
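
One plausible instantiation of the pseudo-point idea (an assumption on our part, not necessarily the paper's exact construction) is to label unevaluated candidate locations with the current GP posterior mean and refit before maximizing UCB, as in this sketch:

```python
# Hedged sketch of one BO iteration with pseudo-points.
# Assumption: pseudo-points are random locations labelled with the current
# GP posterior mean; the GP is refit on real plus pseudo data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f = lambda x: -(x - 0.6) ** 2          # toy objective
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, (4, 1)); y = f(X).ravel()
cand = np.linspace(0, 1, 200).reshape(-1, 1)

gp = GaussianProcessRegressor(RBF(0.2), optimizer=None, alpha=1e-6).fit(X, y)
pseudo_X = rng.uniform(0, 1, (5, 1))           # unevaluated pseudo-points
pseudo_y = gp.predict(pseudo_X)                # labelled by posterior mean
gp2 = GaussianProcessRegressor(RBF(0.2), optimizer=None, alpha=1e-4).fit(
    np.vstack([X, pseudo_X]), np.concatenate([y, pseudo_y]))

mu, sd = gp2.predict(cand, return_std=True)
x_next = cand[np.argmax(mu + 2.0 * sd)]        # UCB on the refit model
print("next query:", x_next)
```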


Author(s):  
Julian Berk ◽  
Sunil Gupta ◽  
Santu Rana ◽  
Svetha Venkatesh

In order to improve the performance of Bayesian optimisation, we develop a modified Gaussian process upper confidence bound (GP-UCB) acquisition function that samples the exploration-exploitation trade-off parameter from a distribution. We prove that this allows the expected trade-off parameter to be altered to better suit the problem without compromising a bound on the function's Bayesian regret. We also provide results showing that our method achieves better performance than GP-UCB on a range of real-world and synthetic problems.
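
The modification itself is a one-line change to a standard GP-UCB loop: draw the trade-off parameter from a distribution each round. The half-normal sampling distribution, toy objective, and fixed GP hyperparameters below are illustrative assumptions; the paper analyses which distributions preserve the regret bound.

```python
# GP-UCB with a sampled trade-off parameter (minimal sketch).
# Assumed for illustration: half-normal beta distribution, toy objective,
# fixed GP hyperparameters.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f = lambda x: np.sin(5 * x) - x ** 2
rng = np.random.default_rng(3)
X = rng.uniform(0, 1, (3, 1)); y = f(X).ravel()
cand = np.linspace(0, 1, 200).reshape(-1, 1)

for t in range(25):
    gp = GaussianProcessRegressor(RBF(0.2), optimizer=None,
                                  alpha=1e-6, normalize_y=True).fit(X, y)
    mu, sd = gp.predict(cand, return_std=True)
    beta = abs(rng.normal(0.0, 2.0))     # sampled trade-off parameter
    x = cand[np.argmax(mu + beta * sd)].reshape(1, 1)
    X = np.vstack([X, x]); y = np.append(y, f(x).ravel())

print("best value found:", y.max())
```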


Author(s):  
Laurens Bliek ◽  
Sicco Verwer ◽  
Mathijs de Weerdt

When a black-box optimization objective can only be evaluated with costly or noisy measurements, most standard optimization algorithms are unsuited to finding the optimal solution. Specialized algorithms that deal with exactly this situation make use of surrogate models. These models are usually continuous and smooth, which is beneficial for continuous optimization problems but not necessarily for combinatorial problems. However, we show that by choosing the basis functions of the surrogate model in a certain way, the optimal solution of the surrogate model can be guaranteed to be integer. This approach outperforms random search, simulated annealing, and a Bayesian optimization algorithm on the problem of finding robust routes for a noise-perturbed traveling salesman benchmark problem, with performance similar to another Bayesian optimization algorithm, and outperforms all compared algorithms on a convex binary optimization problem with a large number of variables.
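
The integrality guarantee can be seen in one dimension: a surrogate built as a linear combination of ReLU ("hinge") basis functions with breakpoints at the integers is piecewise linear, so its minimum over a bounded box is attained at an integer point. The toy objective and basis layout below are illustrative assumptions, not the authors' implementation:

```python
# Toy 1-D illustration: a piecewise-linear surrogate with integer
# breakpoints has an integer minimizer.  Sketch of the principle only.
import numpy as np

rng = np.random.default_rng(4)
f = lambda x: (x - 7) ** 2 + rng.normal(0, 2, np.shape(x))  # noisy objective
D = 15                                                      # domain {0,...,D}

# Noisy evaluations at random integer points.
X = rng.integers(0, D + 1, 60).astype(float)
y = f(X)

# Design matrix: constant, linear term, and hinges max(0, x - k), k integer.
def design(x):
    hinges = np.maximum(0.0, x[:, None] - np.arange(D)[None, :])
    return np.hstack([np.ones((len(x), 1)), x[:, None], hinges])

w, *_ = np.linalg.lstsq(design(X), y, rcond=None)  # least-squares fit

# The surrogate is piecewise linear with integer breakpoints, so scanning
# the integers finds its exact minimum.
grid = np.arange(D + 1, dtype=float)
x_best = grid[np.argmin(design(grid) @ w)]
print("integer minimizer of the surrogate:", x_best)  # near 7
```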


2012 ◽  
Vol 3 (4) ◽  
pp. 1-19 ◽  
Author(s):  
Marcio K. Crocomo ◽  
Jean P. Martins ◽  
Alexandre C. B. Delbem

Estimation of Distribution Algorithms (EDAs) have proved themselves an efficient alternative to Genetic Algorithms when solving nearly decomposable optimization problems. In general, EDAs substitute genetic operators with probabilistic sampling, enabling better use of the information provided by the population and, consequently, a more efficient search. In this paper, the authors exploit EDAs' probabilistic models from a different point of view: they argue that by looking for substructures in the probabilistic models, it is possible to decompose a black-box optimization problem and solve it in a more straightforward way. Relying on the Building-Block hypothesis and the concept of near decomposability, their decompositional approach is implemented by a two-step method: (1) the current population is modeled by a Bayesian network, which is then decomposed into substructures (communities) using a version of the Fast Newman Algorithm; (2) since the identified communities can be seen as sub-problems, they are solved separately and used to compose a solution for the original problem. The experiments revealed both strengths and limitations of the proposed method, but in some of the tested scenarios the method outperformed the Bayesian Optimization Algorithm, requiring up to 78% fewer fitness evaluations and running 30 times faster.
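
A condensed sketch of the two-step method on a separable toy problem follows. Two stand-ins are assumed for brevity: absolute correlation between variables replaces the Bayesian network as the dependency model, and networkx's greedy modularity routine (Clauset-Newman-Moore) replaces the Fast Newman Algorithm.

```python
# Condensed two-step decomposition sketch on a separable toy problem.
# Assumed stand-ins: correlation graph instead of a Bayesian network,
# greedy modularity instead of the Fast Newman Algorithm.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(5)
n_bits, block = 12, 3
# Royal-road-style fitness: one point per all-ones block (decomposable).
fitness = lambda s: sum(s[i:i + block].all() for i in range(0, n_bits, block))

# Step 0: population biased toward good solutions (stand-in for selection).
pop = rng.integers(0, 2, (300, n_bits))
pop = pop[np.argsort([-fitness(p) for p in pop])][:100]

# Step 1: dependency graph between variables, then community detection.
C = np.nan_to_num(np.abs(np.corrcoef(pop.T)))
G = nx.Graph()
G.add_nodes_from(range(n_bits))
for i in range(n_bits):
    for j in range(i + 1, n_bits):
        if C[i, j] > 0.15:               # assumed dependency threshold
            G.add_edge(i, j, weight=C[i, j])
communities = greedy_modularity_communities(G)

# Step 2: solve each community by enumeration, then compose a solution.
def assign(base, idx, bits):
    s = base.copy()
    for pos, i in enumerate(idx):
        s[i] = (bits >> pos) & 1
    return s

solution = np.zeros(n_bits, dtype=int)
for com in communities:
    idx = sorted(com)
    best = max(range(2 ** len(idx)),
               key=lambda b: fitness(assign(solution, idx, b)))
    solution = assign(solution, idx, best)
print("composed solution:", solution, "fitness:", fitness(solution))
```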


2019 ◽  
Vol 52 (7-8) ◽  
pp. 888-895
Author(s):  
Heping Chen ◽  
Seth Bowels ◽  
Biao Zhang ◽  
Thomas Fuhlbrigge

Proportional–integral–derivative (PID) control systems have been widely used in industrial applications. For complex systems, tuning controller parameters to satisfy process requirements is very challenging. Different methods have been proposed to solve this problem; however, these methods suffer from several difficulties, such as dealing with system complexity, minimizing tuning effort, and balancing different performance indices, including rise time, settling time, steady-state error, and overshoot. In this paper, we develop an automatic controller parameter optimization method based on Gaussian process regression and Bayesian optimization. A non-parametric model is constructed using Gaussian process regression. By combining Gaussian process regression with a Bayesian optimization algorithm, potential candidates can be predicted and used to guide the optimization process. Both experiments and simulations were performed to demonstrate the effectiveness of the proposed method.
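
A hedged sketch of this kind of loop is below. The second-order plant, its Euler discretization, the cost weights, and the gain ranges are all assumptions made for illustration; the paper's plant and performance indices differ.

```python
# GP-regression-based BO loop for PID gains (illustrative sketch).
# Assumed: toy second-order plant, Euler integration, cost weights, ranges.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def step_cost(kp, ki, kd, dt=0.01, T=5.0):
    """Unit step response of y'' + 2y' + y = u under PID control; returns a
    weighted cost of mean tracking error and overshoot (assumed weights)."""
    y = dy = integ = 0.0; e_prev = 1.0; ys = []
    for _ in range(int(T / dt)):
        e = 1.0 - y
        integ += e * dt
        u = kp * e + ki * integ + kd * (e - e_prev) / dt
        e_prev = e
        ddy = u - 2.0 * dy - y               # plant dynamics
        dy += ddy * dt; y += dy * dt; ys.append(y)
    ys = np.array(ys)
    overshoot = max(ys.max() - 1.0, 0.0)
    return np.abs(1.0 - ys).mean() + 5.0 * overshoot

rng = np.random.default_rng(6)
bounds = np.array([[0.1, 20.0], [0.0, 10.0], [0.0, 2.0]])  # Kp, Ki, Kd
X = rng.uniform(bounds[:, 0], bounds[:, 1], (5, 3))
y = np.array([step_cost(*p) for p in X])

for t in range(30):   # BO loop: GP regression + lower confidence bound
    gp = GaussianProcessRegressor(RBF([5.0, 2.0, 0.5]), optimizer=None,
                                  alpha=1e-6, normalize_y=True).fit(X, y)
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], (500, 3))
    mu, sd = gp.predict(cand, return_std=True)
    x = cand[np.argmin(mu - 2.0 * sd)]       # minimize predicted cost
    X = np.vstack([X, x]); y = np.append(y, step_cost(*x))

print("best gains (Kp, Ki, Kd):", X[np.argmin(y)], "cost:", y.min())
```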


Author(s):  
Antonio Candelieri ◽  
Francesco Archetti

Optimizing a black-box, expensive, and multi-extremal function, given multiple approximations, is a challenging task known as multi-information source optimization (MISO), where each source has a different cost and the level of approximation (aka fidelity) of each source can change over the search space. While most current approaches fuse the Gaussian processes (GPs) modelling each source, we propose to use GP sparsification to select only “reliable” function evaluations performed over all the sources. These selected evaluations are used to create an augmented Gaussian process (AGP), whose name reflects the fact that the evaluations on the most expensive source are augmented with the reliable evaluations from less expensive sources. A new acquisition function based on the confidence bound is also proposed, incorporating both the cost of the next source to query and the location-dependent approximation quality of that source. This approximation is estimated through a model discrepancy measure and the prediction uncertainty of the GPs. MISO-AGP and its MISO fused-GP counterpart are compared on two test problems and on hyperparameter optimization of a machine learning classifier on a large dataset.
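
A simplified two-source sketch of the augmentation idea follows. The reliability rule (keep a cheap evaluation only if it falls inside the expensive GP's confidence band) and the cost-aware source choice are plausible readings of the approach, not the paper's exact GP-sparsification procedure; the two sources and their costs are invented.

```python
# Two-source augmented-GP sketch (hedged simplification of MISO-AGP).
# Assumed: toy sources and costs, confidence-band reliability filter,
# heuristic cost-aware source choice.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f_hi = lambda x: np.sin(3 * x).ravel()                                # expensive
f_lo = lambda x: np.sin(3 * x).ravel() + 0.3 * np.cos(9 * x).ravel()  # cheap, biased
cost = {0: 1.0, 1: 10.0}                                              # assumed costs

rng = np.random.default_rng(7)
X_hi = rng.uniform(0, 2, (4, 1)); y_hi = f_hi(X_hi)
X_lo = rng.uniform(0, 2, (12, 1)); y_lo = f_lo(X_lo)

gp_hi = GaussianProcessRegressor(RBF(0.3), optimizer=None,
                                 alpha=1e-6, normalize_y=True).fit(X_hi, y_hi)

# Keep only cheap evaluations consistent with the expensive model.
mu, sd = gp_hi.predict(X_lo, return_std=True)
ok = np.abs(y_lo - mu) <= 2.0 * sd
X_aug = np.vstack([X_hi, X_lo[ok]]); y_aug = np.concatenate([y_hi, y_lo[ok]])

# Augmented GP and a confidence-bound acquisition.
agp = GaussianProcessRegressor(RBF(0.3), optimizer=None,
                               alpha=1e-4, normalize_y=True).fit(X_aug, y_aug)
cand = np.linspace(0, 2, 200).reshape(-1, 1)
mu_a, sd_a = agp.predict(cand, return_std=True)
i = int(np.argmax(mu_a + 2.0 * sd_a))

# Heuristic: query the cheap source only near locations where it agreed
# with the expensive model; otherwise pay for the expensive source.
near_ok = ok.any() and np.abs(cand[i] - X_lo[ok]).min() < 0.2
src = 0 if near_ok else 1
print(f"next query x={cand[i, 0]:.3f} on source {src} (cost {cost[src]})")
```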


Author(s):  
Peter Mitic

A black-box optimization problem is considered in which the function to be optimized can only be expressed in terms of a complicated stochastic algorithm that takes a long time to evaluate. The returned value is required to be sufficiently near a target value, and the data used have a significant noise component. Bayesian optimization with an underlying Gaussian process is used as the optimization method, and its effectiveness is measured by the number of function evaluations required to attain the target. To improve results, a simple modification of the Gaussian process ‘Lower Confidence Bound’ (LCB) acquisition function is proposed: the expression used for the confidence bound is squared in order to better comply with the target requirement. With this modification, results improve markedly compared with random selection methods and other commonly used acquisition functions.
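
One plausible reading of the squared bound (an assumption, not the paper's exact formula) is to score each candidate by the squared distance of its LCB from the target T and minimize, so points whose optimistic value could hit the target are preferred:

```python
# Target-seeking BO with a squared-LCB acquisition (hedged sketch).
# Assumed: toy noisy objective, target T, acquisition (LCB - T)^2.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f = lambda x: np.sin(4 * x).ravel() + 1.2 * x.ravel()  # stochastic-algorithm stand-in
T = 1.5                                                # required target value

rng = np.random.default_rng(8)
X = rng.uniform(0, 2, (3, 1)); y = f(X) + rng.normal(0, 0.05, 3)
cand = np.linspace(0, 2, 200).reshape(-1, 1)

for t in range(30):
    gp = GaussianProcessRegressor(RBF(0.3), optimizer=None,
                                  alpha=1e-3, normalize_y=True).fit(X, y)
    mu, sd = gp.predict(cand, return_std=True)
    acq = (mu - 2.0 * sd - T) ** 2          # squared LCB distance to target
    x = cand[np.argmin(acq)].reshape(1, 1)
    X = np.vstack([X, x]); y = np.append(y, f(x) + rng.normal(0, 0.05))
    if abs(y[-1] - T) < 0.05:               # close enough to the target
        print(f"target reached after {len(y)} evaluations"); break
```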

