Scalable3-BO: Big Data Meets HPC - A Scalable Asynchronous Parallel High-Dimensional Bayesian Optimization Framework on Supercomputers

2021
Author(s):  
Anh Tran

Abstract Bayesian optimization (BO) is a flexible and powerful framework that is suitable for computationally expensive simulation-based applications and guarantees statistical convergence to the global optimum. While it remains one of the most popular optimization methods, its capability is hindered by the size of the data, the dimensionality of the considered problem, and the sequential nature of the optimization. These scalability issues are intertwined and must be tackled simultaneously. In this work, we propose the Scalable3-BO framework, which employs a sparse GP as the underlying surrogate model to cope with Big Data and is equipped with a random embedding to efficiently optimize high-dimensional problems with low effective dimensionality. The Scalable3-BO framework is further equipped with an asynchronous parallelization feature, which fully exploits the computational resources of an HPC platform within a computational budget. As a result, the proposed Scalable3-BO framework is scalable along three independent axes: data size, dimensionality, and computational resources on HPC. The goal of this work is to push the frontiers of BO beyond its well-known scalability issues and minimize the wall-clock waiting time when optimizing high-dimensional, computationally expensive applications. We demonstrate the capability of Scalable3-BO with 1 million data points, 10,000-dimensional problems, and 20 concurrent workers in an HPC environment.
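
To make the random-embedding idea concrete, here is a minimal sketch in Python. It uses scikit-learn's GaussianProcessRegressor as a stand-in for the paper's sparse GP, a toy objective with low effective dimensionality, and a serial loop in place of asynchronous HPC workers; it illustrates the embedding trick only, not the Scalable3-BO implementation.

```python
# Minimal sketch of random-embedding Bayesian optimization (REMBO-style).
# The sparse GP and asynchronous workers of Scalable3-BO are replaced here
# by scikit-learn's exact GP and a serial loop, for illustration only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

D, d = 10_000, 10                 # ambient and effective (embedding) dimensions
rng = np.random.default_rng(0)
A = rng.standard_normal((D, d))   # random embedding matrix

def lift(y):
    """Map a low-dimensional point y into the D-dimensional box [-1, 1]^D."""
    return np.clip(A @ y, -1.0, 1.0)

def objective(x):                 # toy "expensive" black-box function
    return np.sum((x[:5] - 0.2) ** 2)

# Initial design in the low-dimensional space
Y = rng.uniform(-1, 1, size=(5, d))
F = np.array([objective(lift(y)) for y in Y])

gp = GaussianProcessRegressor(normalize_y=True)
for _ in range(20):
    gp.fit(Y, F)
    # Optimize a lower-confidence-bound acquisition by random search in the
    # d-dimensional space (cheap because d << D).
    cand = rng.uniform(-1, 1, size=(1024, d))
    mu, sd = gp.predict(cand, return_std=True)
    y_next = cand[np.argmin(mu - 2.0 * sd)]
    Y = np.vstack([Y, y_next])
    F = np.append(F, objective(lift(y_next)))

print("best value found:", F.min())
```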

2020
Author(s):  
Alexander Jung

We propose networked exponential families for non-parametric machine learning from massive network-structured datasets ("big data over networks"). High-dimensional data points are interpreted as realizations of a random process distributed according to some exponential family. Networked exponential families make it possible to jointly leverage the information contained in high-dimensional data points and their network structure. For data points representing individuals, we obtain perfectly personalized models that enable high-precision medicine or more general recommendation systems. We learn the parameters of networked exponential families using the network Lasso, which implicitly pools (or clusters) the data points according to the intrinsic network structure and a local likelihood function. Our main theoretical result characterizes how the accuracy of the network Lasso depends on the network structure and the information geometry of the node-wise exponential families. The network Lasso can be implemented as highly scalable message passing over the data network. Such message passing is appealing for federated machine learning relying on edge computing. The proposed method is also privacy preserving in the sense that no raw data, but only parameter estimates, are shared among different nodes.
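
As a toy illustration of how the network Lasso pools node-wise estimates, the sketch below solves a scalar instance on a chain graph with Gaussian local likelihoods by plain subgradient descent. The graph, step size, and penalty weight are invented for the example; the paper's message-passing solver and general exponential-family likelihoods are simplified away.

```python
# Minimal sketch of network Lasso on a chain graph with Gaussian node losses,
# solved by (sub)gradient descent. This is total-variation denoising, i.e. the
# scalar special case of the network Lasso objective
#   sum_i 0.5*(w_i - x_i)^2 + lam * sum_{(i,j) in E} |w_i - w_j|.
import numpy as np

n, lam, step = 20, 0.5, 0.05
edges = [(i, i + 1) for i in range(n - 1)]   # chain graph
rng = np.random.default_rng(1)
# Piecewise-constant ground truth observed with noise (one scalar per node)
truth = np.where(np.arange(n) < n // 2, 0.0, 2.0)
obs = truth + 0.3 * rng.standard_normal(n)

w = np.zeros(n)
for _ in range(2000):
    grad = w - obs                            # gradient of 0.5*(w_i - x_i)^2
    for i, j in edges:                        # subgradient of lam*|w_i - w_j|
        s = np.sign(w[i] - w[j])
        grad[i] += lam * s
        grad[j] -= lam * s
    w -= step * grad

print(np.round(w, 2))   # estimates pool into two clusters, tracking `truth`
```

Because the total-variation penalty couples neighbouring nodes, the estimates cluster into the two constant segments of the ground truth, which is exactly the pooling behaviour described above.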


2017
Vol 37 (1)
pp. 137-154
Author(s):
Peter Englert,
Marc Toussaint

We consider the scenario where a robot is given a single demonstration of a manipulation skill and should then use only a few trials on its own to learn to reproduce, optimize, and generalize that same skill. A manipulation skill is generally a high-dimensional policy. To achieve the desired sample efficiency, we need to exploit the inherent structure in this problem. We propose to decompose the problem into analytically known objectives, such as motion smoothness, and black-box objectives, such as trial success or reward, which depend on interaction with the environment. The decomposition allows us to leverage and combine (i) constrained optimization methods to address the analytic objectives, (ii) constrained Bayesian optimization to explore the black-box objectives, and (iii) inverse optimal control methods to eventually extract a generalizable skill representation. The algorithm is evaluated on a synthetic benchmark experiment and compared with state-of-the-art learning methods. We also demonstrate its performance in real-robot experiments with a PR2.
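
A minimal sketch of the decomposition idea, not of the paper's algorithm: an inner trajectory optimization handles the analytic smoothness objective with scipy, while an outer derivative-free loop (plain random search standing in for constrained Bayesian optimization) handles the black-box trial reward. All names, dimensions, and the toy reward are invented for illustration.

```python
# Illustrative sketch of the analytic / black-box decomposition: an inner
# smooth trajectory optimization (motion smoothness, known gradients) nested
# inside an outer derivative-free search over black-box reward parameters.
# The constrained BO and inverse-optimal-control stages of the paper are
# not reproduced; the outer search is plain random sampling for brevity.
import numpy as np
from scipy.optimize import minimize

T = 20                                  # trajectory length

def smoothness(traj):                   # analytic objective: squared velocity
    q = traj.reshape(T, 2)
    return np.sum(np.diff(q, axis=0) ** 2)

def plan(goal):
    """Inner analytic problem: smooth trajectory from the origin to `goal`."""
    def cost(traj):
        q = traj.reshape(T, 2)
        return smoothness(traj) + 10.0 * np.sum((q[-1] - goal) ** 2)
    res = minimize(cost, np.zeros(2 * T), method="L-BFGS-B")
    return res.x.reshape(T, 2)

def trial_reward(goal):                 # black-box objective (simulated trial)
    q = plan(goal)
    return -np.linalg.norm(q[-1] - np.array([1.0, 0.5]))  # hidden target

rng = np.random.default_rng(2)
goals = rng.uniform(-2, 2, size=(30, 2))        # outer black-box search
best = max(goals, key=trial_reward)
print("best goal parameters:", np.round(best, 2))
```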


2017
Author(s):
Safoora Yousefi,
Fatemeh Amrollahi,
Mohamed Amgad,
Coco Dong,
Joshua E. Lewis,
...  

Abstract Translating the vast data generated by genomic platforms into accurate predictions of clinical outcomes is a fundamental challenge in genomic medicine. Many prediction methods face limitations in learning from the high-dimensional profiles generated by these platforms and rely on experts to hand-select a small number of features for training prediction models. In this paper, we demonstrate how deep learning and Bayesian optimization methods that have been remarkably successful in general high-dimensional prediction tasks can be adapted to the problem of predicting cancer outcomes. We perform an extensive comparison of Bayesian-optimized deep survival models and other state-of-the-art machine learning methods for survival analysis, and describe a framework for interpreting deep survival models using a risk backpropagation technique. Finally, we illustrate that deep survival models can successfully transfer information across diseases to improve prognostic accuracy. We provide an open-source software implementation of this framework, called SurvivalNet, that enables automatic training, evaluation, and interpretation of deep survival models.
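
For concreteness, here is a sketch of the negative Cox partial log-likelihood that survival networks of this kind are typically trained on, written in plain NumPy. A linear scoring function stands in for the Bayesian-optimized deep network, the data are synthetic, and tied event times are not handled; this is not the SurvivalNet code.

```python
# Minimal sketch of the negative Cox partial log-likelihood used to train
# deep survival models; risks here come from a linear layer for brevity.
import numpy as np

def cox_loss(risk, time, event):
    """risk: model output per patient; event: 1 = observed, 0 = censored."""
    order = np.argsort(-time)                  # descending survival time
    risk, event = risk[order], event[order]
    log_cumsum = np.logaddexp.accumulate(risk) # log of risk-set sums
    return -np.sum((risk - log_cumsum)[event == 1])

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 20))             # high-dimensional profiles
w = rng.standard_normal(20) * 0.1              # stand-in for a deep network
time = rng.exponential(1.0, 100)
event = rng.integers(0, 2, 100)

print("loss:", cox_loss(X @ w, time, event))
```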


2020 ◽  
Vol 34 (03) ◽  
pp. 2425-2432
Author(s):  
Hung Tran-The,
Sunil Gupta,
Santu Rana,
Svetha Venkatesh

Scaling Bayesian optimisation (BO) to high-dimensional search spaces remains an active and open research problem, particularly when no assumptions are made on the function structure. The main reason is that at each iteration, BO must globally maximise an acquisition function, which is itself a non-convex optimization problem in the original search space. With growing dimensions, the computational budget for this maximisation becomes increasingly inadequate, leading to inaccurate solutions. This inaccuracy adversely affects both the convergence and the efficiency of BO. We propose a novel approach in which the acquisition function only needs to be maximised on a discrete set of low-dimensional subspaces embedded in the original high-dimensional search space. Unlike many recent high-dimensional BO methods, our method is free of any low-dimensional structure assumption on the function. Optimising the acquisition function in low-dimensional subspaces allows our method to obtain accurate solutions within a limited computational budget. We show that, in spite of this convenience, our algorithm remains convergent. In particular, the cumulative regret of our algorithm grows only sub-linearly with the number of iterations. More importantly, as evident from our regret bounds, our algorithm provides a way to trade the convergence rate against the number of subspaces used in the optimisation. Finally, when the number of subspaces is "sufficiently large", our algorithm's cumulative regret is at most O*(√(Tγ_T)), as opposed to O*(√(DTγ_T)) for the GP-UCB of Srinivas et al. (2012), removing a crucial factor of √D, where D is the dimension of the input space. We perform extensive empirical experiments to evaluate our method, showing that its sample efficiency is better than that of existing methods on many optimisation problems involving dimensions up to 5,000.
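
The sketch below illustrates the core computational trick: instead of maximising the acquisition function over the full D-dimensional space, it is maximised over a discrete set of random low-dimensional subspaces. The GP surrogate, LCB acquisition, and random-search inner optimiser are illustrative stand-ins, not the authors' exact algorithm or its regret-optimal configuration.

```python
# Minimal sketch of restricting acquisition maximisation to a discrete set
# of random low-dimensional subspaces of the search space.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

D, d, K = 100, 5, 8        # ambient dim, subspace dim, number of subspaces
rng = np.random.default_rng(4)

def objective(x):
    return np.sum((x[:3] - 0.5) ** 2)   # toy "expensive" function

X = rng.uniform(-1, 1, (10, D))
F = np.array([objective(x) for x in X])
gp = GaussianProcessRegressor(normalize_y=True)

for _ in range(15):
    gp.fit(X, F)
    best_x, best_acq = None, np.inf
    for _ in range(K):                  # search each low-dim subspace
        B = rng.standard_normal((D, d)) # basis of a random subspace
        Z = rng.uniform(-1, 1, (256, d))
        cand = np.clip(Z @ B.T / np.sqrt(d), -1, 1)
        mu, sd = gp.predict(cand, return_std=True)
        acq = mu - 2.0 * sd             # lower confidence bound (minimise)
        i = np.argmin(acq)
        if acq[i] < best_acq:
            best_acq, best_x = acq[i], cand[i]
    X = np.vstack([X, best_x])
    F = np.append(F, objective(best_x))

print("best value:", F.min())
```

Each inner search runs in d dimensions, so its cost does not grow with D; only the number of subspaces K mediates the accuracy/budget trade-off described above.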


Author(s):  
Yogesh Jaluria

Abstract A common occurrence in many practical systems is that the desired result is known or given, but the conditions needed for achieving this result are not known. This situation leads to inverse problems, which are of particular interest in thermal processes. For instance, the temperature cycle to which a component must be subjected in order to obtain desired characteristics in a manufacturing system, such as heat treatment or plastic thermoforming, is prescribed. However, the necessary boundary and initial conditions are not known and must be determined by solving the inverse problem. Similarly, an inverse solution may be needed to complete a given physical problem by determining the unknown boundary conditions. Solutions thus obtained are not unique, and optimization is generally needed to obtain results within a small region of uncertainty. This review paper discusses several inverse problems that arise in a variety of practical processes and presents some of the approaches that may be used to solve them and obtain acceptable and realistic results. Optimization methods that may be used for reducing the error are presented. A few examples are given to illustrate the applicability of these methods and the challenges that must be addressed in solving inverse problems. These examples include the heat treatment process, the unknown wall temperature distribution in a furnace, and transport in a plume or jet, where the strength and location of the heat source are determined from a few selected data points downstream. Optimization of the positioning of the data points is used to minimize the number of samples needed for accurate predictions.
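
As a small worked example of such an inverse problem, the sketch below recovers the strength and location of a heat source from a handful of downstream samples by least-squares fitting, using a toy one-dimensional Gaussian plume as the forward model. The forward model, sensor positions, and noise level are all invented for illustration.

```python
# Illustrative sketch of an inverse source problem: recover the strength and
# location of a heat source from a few downstream samples by least-squares
# fitting. A 1-D Gaussian plume stands in for the real transport model.
import numpy as np
from scipy.optimize import least_squares

def plume(x, strength, x0):
    """Toy forward model: temperature rise downstream of a source at x0."""
    return strength * np.exp(-(x - x0) ** 2 / 2.0)

x_sensors = np.array([1.0, 2.5, 4.0, 5.5])    # selected data points
true_strength, true_x0 = 3.0, 2.0
rng = np.random.default_rng(5)
data = plume(x_sensors, true_strength, true_x0) + 0.05 * rng.standard_normal(4)

res = least_squares(
    lambda p: plume(x_sensors, p[0], p[1]) - data,   # residuals to minimize
    x0=[1.0, 0.0],                                   # initial guess
)
print("estimated strength, location:", np.round(res.x, 2))
```

Consistent with the non-uniqueness noted above, a poor initial guess or too few sensors can drive the fit to a different local solution, which is why optimizing the sensor positions matters.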


2020
Author(s):  
Alberto Bemporad,
Dario Piga

Abstract This paper proposes a method for solving optimization problems in which the decision maker cannot evaluate the objective function, but can only express a preference such as "this is better than that" between two candidate decision vectors. The algorithm described in this paper aims at reaching the global optimizer by iteratively proposing to the decision maker a new comparison to make, based on actively learning a surrogate of the latent (unknown and perhaps unquantifiable) objective function from past sampled decision vectors and pairwise preferences. A radial basis function surrogate is fit via linear or quadratic programming, satisfying, if possible, the preferences expressed by the decision maker on existing samples. The surrogate is used to propose a new sample of the decision vector for comparison with the current best candidate, based on two possible criteria: minimize a combination of the surrogate and an inverse distance weighting function, to balance exploitation of the surrogate against exploration of the decision space, or maximize a function related to the probability that the new candidate will be preferred. Compared to active preference learning based on Bayesian optimization, we show that our approach is competitive in that, within the same number of comparisons, it usually approaches the global optimum more closely and is computationally lighter. Applications of the proposed algorithm to solving a set of benchmark global optimization problems, to multi-objective optimization, and to the optimal tuning of a cost-sensitive neural network classifier for object recognition from images are described in the paper. MATLAB and Python implementations of the algorithms described in the paper are available at http://cse.lab.imtlucca.it/~bemporad/glis.
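
To make the surrogate-from-preferences step concrete, here is a minimal sketch: an RBF surrogate whose coefficients are chosen so that, where possible, the surrogate ranks the preferred sample below the other one by a small margin. A generic hinge-penalty solve via scipy.optimize.minimize stands in for the linear/quadratic program of the paper, and the samples and latent objective are synthetic.

```python
# Minimal sketch of fitting an RBF surrogate to pairwise preferences: the
# surrogate is asked to satisfy f_hat(better) <= f_hat(worse) - margin
# through a hinge penalty.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
X = rng.uniform(-2, 2, (12, 2))                 # sampled decision vectors
latent = lambda x: np.sum(x ** 2)               # hidden objective (unknown)

# Pairwise preferences: (i, j) means sample i was preferred over sample j
prefs = [(i, j) if latent(X[i]) < latent(X[j]) else (j, i)
         for i, j in [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11)]]

Phi = np.exp(-np.linalg.norm(X[:, None] - X[None, :], axis=-1) ** 2)  # RBF

def hinge_loss(c, margin=0.1, reg=1e-3):
    f = Phi @ c                                 # surrogate values at samples
    viol = [max(0.0, f[i] - f[j] + margin) for i, j in prefs]
    return sum(viol) + reg * c @ c

c = minimize(hinge_loss, np.zeros(len(X))).x
f_hat = Phi @ c
print("surrogate ranks best sample as:", np.argmin(f_hat))
```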


2021
pp. 1-59
Author(s):  
George Cheng,
G. Gary Wang,
Yeong-Maw Hwang

Abstract Multi-objective optimization (MOO) problems with computationally expensive constraints are common in real-world engineering design. However, metamodel-based design optimization (MBDO) approaches for MOO are often unsuitable for high-dimensional problems and often do not support expensive constraints. In this work, the Situational Adaptive Kreisselmeier and Steinhauser (SAKS) method was combined with a new multi-objective trust region optimizer (MTRO) strategy to form the SAKS-MTRO method for MOO problems with expensive black-box constraint functions. The SAKS method hybridizes the modeling and aggregation of expensive constraints and adds an adaptive strategy to control the level of hybridization. The MTRO strategy uses a combination of objective decomposition and K-means clustering to handle MOO problems. SAKS-MTRO was benchmarked against four popular multi-objective optimizers and demonstrated superior performance on average. SAKS-MTRO was also applied to the design of a semiconductor substrate and the design of an industrial recessed impeller.
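
For reference, the Kreisselmeier-Steinhauser aggregation at the heart of the SAKS idea can be written in a few lines: it replaces many expensive constraints with a single smooth, conservative envelope. The sketch below is just the standard KS function with a numerically stable shift; the situational-adaptive control of the hybridization level is not shown.

```python
# Sketch of the Kreisselmeier-Steinhauser (KS) function: many constraints
# g_i(x) <= 0 are aggregated into one smooth, conservative constraint,
# reducing the number of expensive metamodels that must be built.
import numpy as np

def ks(g, rho=50.0):
    """Smooth upper bound on max(g); tighter as rho grows."""
    g = np.asarray(g, dtype=float)
    m = g.max()                      # shift for numerical stability
    return m + np.log(np.sum(np.exp(rho * (g - m)))) / rho

g = [-0.2, 0.1, -1.0]                # three constraint values at some design
print(ks(g), ">=", max(g))           # KS(g) upper-bounds the worst violation
```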


2021
Vol 12 (4)
pp. 98-116
Author(s):  
Noureddine Boukhari,
Fatima Debbat,
Nicolas Monmarché,
Mohamed Slimane

Evolution strategies (ES) are a family of powerful stochastic methods for global optimization and have proven more capable of avoiding local optima than many other optimization methods. Many researchers have investigated different versions of the original evolution strategy with good results on a variety of optimization problems. However, the algorithm's convergence rate to the global optimum remains asymptotic. To accelerate convergence, a hybrid approach is proposed that uses the nonlinear simplex method (Nelder-Mead) together with an adaptive scheme to control when the local search is applied, and the authors demonstrate that such a combination yields significantly better convergence. The proposed method has been tested on 15 complex benchmark functions, applied to the bi-objective portfolio optimization problem, and compared with other state-of-the-art techniques. Experimental results show that this hybridization improves performance in terms of both solution quality and convergence strength.
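
A minimal sketch of the hybrid scheme, under simplifying assumptions: a (mu, lambda) evolution strategy with fixed mutation strength, where a Nelder-Mead local search periodically refines the best individual. The paper's adaptive scheme for triggering the local search is reduced here to a fixed schedule, and the Rastrigin function stands in for the benchmark suite.

```python
# Minimal sketch of hybridizing a (mu, lambda) evolution strategy with a
# periodic Nelder-Mead local search on the current best individual.
import numpy as np
from scipy.optimize import minimize

def rastrigin(x):
    return 10 * len(x) + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))

rng = np.random.default_rng(7)
dim, mu, lam, sigma = 5, 5, 20, 0.5
pop = rng.uniform(-5, 5, (mu, dim))

for gen in range(50):
    # lambda offspring by Gaussian mutation of randomly chosen parents
    parents = pop[rng.integers(0, mu, lam)]
    offspring = parents + sigma * rng.standard_normal((lam, dim))
    fitness = np.apply_along_axis(rastrigin, 1, offspring)
    pop = offspring[np.argsort(fitness)[:mu]]     # (mu, lambda) selection
    if gen % 10 == 9:                             # periodic local refinement
        res = minimize(rastrigin, pop[0], method="Nelder-Mead")
        pop[0] = res.x

print("best fitness:", rastrigin(pop[0]))
```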

