Sample-efficient Optimization Using Neural Networks

2021 ◽  
Author(s):  
◽  
Mashall Aryan

<p>The solution to many science and engineering problems involves identifying the minimum or maximum of an unknown continuous function whose evaluation incurs non-negligible costs in resources such as money, time, human attention or computational processing. In such a case, the choice of new points to evaluate is critical. A successful approach has been to choose these points by considering a distribution over plausible surfaces, conditioned on all previous points and their evaluations. In this sequential two-step strategy, known as Bayesian Optimization, a prior is first defined over possible functions and updated to a posterior in the light of available observations. Then, using this posterior, namely the surrogate model, an infill criterion is formed and used to find the next location to sample from. By far the most common prior distribution and infill criterion are the Gaussian Process and Expected Improvement, respectively. The popularity of Gaussian Processes in Bayesian Optimization is partially due to their ability to represent the posterior in closed form. Nevertheless, the Gaussian Process suffers from several shortcomings that directly affect its performance: inference scales poorly with the amount of data, numerical stability degrades with the number of data points, and strong assumptions about the observation model are required, which might not be consistent with reality. These drawbacks encourage us to seek better alternatives. This thesis studies the application of neural networks to enhance Bayesian Optimization. It proposes several Bayesian Optimization methods that use neural networks either as their surrogates or in the infill criterion. This thesis introduces a novel Bayesian Optimization method in which Bayesian Neural Networks are used as the surrogate, reducing the computational complexity of inference in the surrogate from cubic in the number of observations (as in GPs) to linear. Different variations of Bayesian Neural Networks (BNNs) are put into practice and inferred using Monte Carlo sampling. The results show that the Monte Carlo Bayesian Neural Network surrogate performed better than, or at least comparably to, Gaussian Process-based Bayesian Optimization methods on a set of benchmark problems. This work also develops a fast Bayesian Optimization method with an efficient surrogate-building process. This algorithm uses Bayesian Random-Vector Functional Link Networks as the surrogate. In this family of models, inference is performed on only a small subset of the model parameters, while the rest are randomly drawn from a prior. The proposed methods are tested on a set of benchmark continuous functions and hyperparameter optimization problems, and the results show they are competitive with state-of-the-art Bayesian Optimization methods. This study further proposes a novel neural network-based infill criterion, in which locations to sample from are found by minimizing the joint conditional likelihood of the new point and the parameters of a neural network. The results show that in Bayesian Optimization methods with Bayesian Neural Network surrogates, this new infill criterion outperforms Expected Improvement. Finally, this thesis presents order-preserving generative models and uses them in a variational Bayesian context to infer Implicit Variational Bayesian Neural Network (IVBNN) surrogates for a new Bayesian Optimization method. This inference mechanism is more efficient and scalable than Monte Carlo sampling. The results show that IVBNN can outperform Monte Carlo BNNs in Bayesian optimization of the hyperparameters of machine learning models.</p>
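For context on the baseline this thesis improves upon, the standard GP-plus-EI loop can be sketched in a few lines of numpy. This is a generic, illustrative implementation (toy 1-D objective, unit-variance squared-exponential kernel, made-up length scale), not the thesis's BNN method; note the Cholesky factorization, which is the cubic-cost step that motivates neural surrogates.

```python
import numpy as np
from scipy.stats import norm

def rbf_kernel(A, B, length=0.5):
    # Squared-exponential kernel between two sets of 1-D points.
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Closed-form GP posterior mean and std at query points Xs.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)              # the O(n^3) step in the data size
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)     # prior variance is 1 for this kernel
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mu, sigma, best):
    # EI for minimization: E[max(best - f, 0)] under the Gaussian posterior.
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

f = lambda x: np.sin(3 * x) + x ** 2       # toy "expensive" objective
X = np.array([-1.0, 0.0, 1.0]); y = f(X)   # observations so far
Xs = np.linspace(-2, 2, 401)               # candidate grid
mu, sigma = gp_posterior(X, y, Xs)
next_x = Xs[np.argmax(expected_improvement(mu, sigma, y.min()))]
```

The infill step picks `next_x` as the candidate maximizing EI; in a full loop one would evaluate `f(next_x)`, append it to the data, and repeat.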



Author(s):  
N.T. Abdullaev ◽  
U.N. Musevi ◽  
K.S. Pashaeva

Formulation of the problem. This work addresses the use of artificial neural networks for diagnosing functional disorders of the gastrointestinal tract caused by parasites in the body. For the experiment, 24 symptoms (a number that can be extended) and the 9 most common diseases were selected. The agreement of the neural network diagnosis with classical medical diagnosis for a specific disease is shown. The purpose of the work is to compare the neural networks in terms of performance, after describing the methods of preprocessing, symptom extraction and classification of parasitic diseases of the gastrointestinal tract. The experiment was implemented in the NeuroPro 0.25 software environment, and the following optimization methods were chosen for training the network: gradient descent with the ParTan (parallel tangents) modification, conjugate gradients, and BFGS. Results. The results of prediction with a multilayer perceptron using the above optimization methods are presented. To compare the optimization methods, the minimum and maximum network errors were used. This comparison shows that, for the task at hand, the best results were obtained with the conjugate gradients method. Practical significance. The proposed approach eases the work of the physician-experimenter in choosing an optimization method when applying neural networks to the diagnosis of parasitic diseases of the gastrointestinal tract, from the point of view of assessing the network error.
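As an illustration of how training optimizers can be ranked by final network error, the sketch below fits a tiny perceptron with scipy's conjugate-gradient and BFGS routines on synthetic data. It is a toy example, not the NeuroPro 0.25 implementation; the data, network size, and feature names are all made up.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))               # toy "symptom" features
t = (X.sum(axis=1) > 0).astype(float)      # toy binary "diagnosis" labels

def unpack(w):
    # Flat parameter vector -> weights of a 3-5-1 perceptron.
    W1 = w[:15].reshape(3, 5); b1 = w[15:20]
    W2 = w[20:25]; b2 = w[25]
    return W1, b1, W2, b2

def loss(w):
    # Mean squared network error with a tanh hidden layer, sigmoid output.
    W1, b1, W2, b2 = unpack(w)
    h = np.tanh(X @ W1 + b1)
    yhat = 1 / (1 + np.exp(-(h @ W2 + b2)))
    return np.mean((yhat - t) ** 2)

w0 = rng.normal(scale=0.1, size=26)        # same start for a fair comparison
errors = {m: minimize(loss, w0, method=m).fun for m in ("CG", "BFGS")}
best = min(errors, key=errors.get)         # optimizer with the lowest error
```

Ranking by the resulting error values mirrors the comparison the authors make, on a far smaller scale.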


Algorithms ◽  
2021 ◽  
Vol 14 (6) ◽  
pp. 163
Author(s):  
Yaru Li ◽  
Yulai Zhang ◽  
Yongping Cai

The selection of hyper-parameters plays a critical role in prediction tasks based on recurrent neural networks (RNNs). Traditionally, the hyper-parameters of machine learning models are selected through simulations as well as human experience. In recent years, multiple algorithms based on Bayesian optimization (BO) have been developed to determine optimal hyper-parameter values. Most of these methods require gradients to be calculated. In this work, particle swarm optimization (PSO) is used under the BO framework to develop a new method for hyper-parameter optimization. The proposed algorithm (BO-PSO) is free of gradient calculation, and its particles can naturally be optimized in parallel, so the computational cost is effectively reduced, meaning better hyper-parameters can be obtained for the same amount of computation. Experiments are conducted on real-world power load data, where the proposed method outperforms the existing state-of-the-art algorithms, BO with limited-memory BFGS with bounds (BO-L-BFGS-B) and BO with truncated Newton (BO-TNC), in terms of prediction accuracy. The prediction errors of the different models show that BO-PSO is an effective hyper-parameter optimization method.
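A minimal gradient-free particle swarm of the kind BO-PSO relies on can be sketched as follows. This is a generic textbook swarm (inertia 0.7, cognitive/social weights 1.5, all hypothetical choices), shown minimizing a stand-in acquisition surface rather than the authors' actual BO objective.

```python
import numpy as np

def pso_minimize(f, bounds, n_particles=30, iters=100, seed=0):
    # Minimal particle swarm: no gradients, and all particle evaluations
    # per iteration are independent, so they parallelize naturally.
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_f = np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pbest_f)]                     # global best position
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.apply_along_axis(f, 1, x)
        improved = fx < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        g = pbest[np.argmin(pbest_f)]
    return g, pbest_f.min()

# Stand-in acquisition surface with its minimum at (1, -2).
acq = lambda z: (z[0] - 1) ** 2 + (z[1] + 2) ** 2
xbest, fbest = pso_minimize(acq, (np.array([-5.0, -5.0]),
                                  np.array([5.0, 5.0])))
```

In a BO-PSO-style loop, `acq` would be the acquisition function built from the surrogate at each iteration.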


Author(s):  
GERALDO BRAZ JUNIOR ◽  
LEONARDO DE OLIVEIRA MARTINS ◽  
ARISTÓFANES CORREA SILVA ◽  
ANSELMO CARDOSO PAIVA

Female breast cancer is a major cause of death in Western countries. Computer-aided Detection (CAD) systems can help radiologists increase diagnostic accuracy. In this work, we present a comparison between two classifiers applied to separating normal from abnormal breast tissue in mammograms. The purpose of the comparison is to select the best prediction technique to be part of a CAD system. Each region of interest is classified as normal or abnormal by a Support Vector Machine (SVM) and by a Bayesian Neural Network (BNN). SVM is a machine-learning method based on the principle of structural risk minimization, which shows good performance when applied to data outside the training set. A Bayesian Neural Network is a classifier that joins traditional neural network theory and Bayesian inference. We use a set of measures obtained by applying the semivariogram, semimadogram, covariogram, and correlogram functions to characterize breast tissue as normal or abnormal. The results show that SVM presents the best performance for the classification of breast tissue in mammographic images, and the tests indicate that SVM has more generalization power than the BNN classifier. BNN has a sensitivity of 76.19% and a specificity of 79.31%, while SVM presents a sensitivity of 74.07% and a specificity of 98.77%. The test accuracy is 78.70% for BNN and 92.59% for SVM.
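The sensitivity, specificity, and accuracy figures reported above follow directly from confusion-matrix counts. The sketch below computes them for hypothetical counts chosen purely for illustration; they are not the paper's actual test counts.

```python
def clf_metrics(tp, fn, tn, fp):
    # Sensitivity = TP/(TP+FN): fraction of abnormal regions caught.
    # Specificity = TN/(TN+FP): fraction of normal regions correctly passed.
    # Accuracy: fraction of all regions classified correctly.
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical confusion-matrix counts, for illustration only.
sens, spec, acc = clf_metrics(tp=16, fn=5, tn=23, fp=6)
```

The asymmetry the abstract highlights (SVM's high specificity versus BNN's higher sensitivity) is exactly a trade-off between these two ratios.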


Author(s):  
Arunabha Batabyal ◽  
Sugrim Sagar ◽  
Jian Zhang ◽  
Tejesh Dube ◽  
Xuehui Yang ◽  
...  

Abstract A persistent problem in the selective laser sintering process is maintaining the quality of additively manufactured parts, which is affected by various sources of uncertainty. In this work, a two-particle phase-field microstructure model has been analyzed. The two input parameters treated as sources of uncertainty were surface diffusivity and inter-particle distance. The response quantity of interest (QOI) was the size of the neck region that develops between the two particles. Two cases, with equal- and unequal-sized particles, were studied. It was observed that the neck size increased with increasing surface diffusivity and decreased with increasing inter-particle distance, irrespective of particle size. Sensitivity analysis found that inter-particle distance has more influence on the variation in neck size than surface diffusivity. The machine learning algorithm Gaussian Process Regression was used to create a surrogate model of the QOI, and the Bayesian Optimization method was used to find optimal values of the input parameters. For equal-sized particles, optimization using Probability of Improvement gave optimal values of surface diffusivity and inter-particle distance of 23.8268 and 40.0001, respectively, while Expected Improvement as the acquisition function gave 23.9874 and 40.7428. For unequal-sized particles, the optimal design values from Probability of Improvement were 23.9700 and 33.3005, and those from Expected Improvement were 23.9893 and 33.9627. The optimization results from the two acquisition functions are in good agreement.
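The two acquisition functions compared above have simple closed forms under a Gaussian surrogate posterior. The sketch below evaluates both on illustrative posterior means and standard deviations; the grid values and the exploration margin `xi` are assumptions, not the paper's settings.

```python
import numpy as np
from scipy.stats import norm

def probability_of_improvement(mu, sigma, best, xi=0.01):
    # PoI for minimization: P(f < best - xi) under a Gaussian posterior.
    return norm.cdf((best - xi - mu) / sigma)

def expected_improvement(mu, sigma, best, xi=0.01):
    # EI for minimization: E[max(best - xi - f, 0)].
    z = (best - xi - mu) / sigma
    return (best - xi - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# Illustrative surrogate posterior on four candidate designs.
mu = np.array([0.9, 0.5, 0.2, 0.7])        # posterior means of the QOI
sigma = np.array([0.1, 0.3, 0.05, 0.2])    # posterior standard deviations
best = 0.4                                 # best objective value so far
poi_pick = np.argmax(probability_of_improvement(mu, sigma, best))
ei_pick = np.argmax(expected_improvement(mu, sigma, best))
```

Here both criteria select the same candidate, echoing the abstract's observation that the two acquisition functions tend to agree; they differ when a high-variance candidate offers a large possible improvement at low probability.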


Author(s):  
Seyyed Ali Latifi Rostami ◽  
Ali Ghoddosian

In this paper, a robust topology optimization method is presented that is insensitive to uncertainty in geometry and applied load. Geometric uncertainty can be introduced by manufacturing variability, while applied load uncertainty occurs in the magnitude and angle of the force. These uncertainties can be modeled as random fields. A memory-less transformation of random fields is used to model the random variation. The Adaptive Sparse Grid Collocation (ASGC) method, combined with the uncertainty models, provides robust designs by utilizing already developed deterministic solvers. By using the adaptive sparse grid method, the proposed algorithm provides a computationally cheap alternative to previously introduced stochastic optimization methods based on Monte Carlo sampling. Numerical examples, such as a 2D simply supported beam and a cantilever beam as benchmark problems, are used to show the effectiveness and superiority of the ASGC method.


2020 ◽  
pp. 147592172090454 ◽  
Author(s):  
Manuel A Vega ◽  
Michael D Todd

Many physics-based and surrogate models used in structural health monitoring are affected by different sources of uncertainty, such as model approximations and simplifying assumptions. Optimal structural health monitoring and prognostics are only possible with uncertainty quantification that leads to an informed course of action. In this article, a Bayesian neural network using variational inference is applied to learn a damage feature from a high-fidelity finite element model. Bayesian neural networks can learn from small and noisy data sets and are more robust to overfitting than artificial neural networks, which makes them very suitable for applications such as structural health monitoring. In addition, uncertainty estimates obtained from a trained Bayesian neural network model are used to build a cost-informed decision-making process. To demonstrate the applicability of Bayesian neural networks, an example of this approach applied to miter gates is presented, in which a degradation model based on real inspection data is used to simulate the damage evolution.
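Full variational inference for a BNN is beyond a short sketch, but Monte Carlo dropout, a common lightweight approximation to it (Gal and Ghahramani), conveys the core idea: repeated stochastic forward passes yield a predictive mean and spread, which can then feed a cost-informed decision rule. Everything below (the weights, threshold, and risk factor) is illustrative, not the article's model.

```python
import numpy as np

rng = np.random.default_rng(1)
# Fixed weights of a "trained" 4-16-1 network (random here, for illustration).
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)
W2, b2 = rng.normal(size=16), 0.0

def mc_dropout_predict(x, n_samples=500, p_drop=0.2):
    # Keep dropout active at prediction time; each stochastic forward
    # pass approximates a draw from the posterior predictive.
    preds = []
    for _ in range(n_samples):
        mask = rng.random(16) > p_drop
        h = np.tanh(x @ W1 + b1) * mask / (1 - p_drop)
        preds.append(h @ W2 + b2)
    preds = np.array(preds)
    return preds.mean(), preds.std()

x = np.array([0.3, -1.2, 0.5, 0.1])        # a damage-feature input
mu, sigma = mc_dropout_predict(x)
# Cost-informed decision: act when the risk-adjusted estimate crosses a
# limit; threshold and risk factor k are hypothetical.
threshold, k = 1.0, 2.0
inspect = (mu + k * sigma) > threshold
```

The decision rule shows how the uncertainty estimate changes the action: a confident prediction near the threshold triggers inspection later than an uncertain one with the same mean.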


2012 ◽  
Vol 628 ◽  
pp. 324-329
Author(s):  
F. García Fernández ◽  
L. García Esteban ◽  
P. de Palacios ◽  
A. García-Iruela ◽  
R. Cabedo Gallén

Artificial neural networks have become a powerful modeling tool. However, although they produce outputs with very good accuracy, they provide no information about the uncertainty of those outputs or their coverage intervals. This study describes the application of the Monte Carlo method to obtain the output uncertainty and coverage intervals of a particular type of artificial neural network: the multilayer perceptron.
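The Monte Carlo procedure described can be sketched as follows: sample the uncertain input many times, propagate each sample through the trained perceptron, and read the coverage interval off the output quantiles (in the spirit of GUM Supplement 1). The network weights and the input uncertainty here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# A small "trained" multilayer perceptron (weights illustrative).
W1, b1 = rng.normal(size=(2, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=8), 0.0
mlp = lambda x: np.tanh(x @ W1 + b1) @ W2 + b2

def mc_coverage_interval(x, x_std, n=10_000, coverage=0.95):
    # Propagate input uncertainty through the network by Monte Carlo
    # sampling and take the interval from the output quantiles.
    samples = mlp(x + rng.normal(scale=x_std, size=(n, len(x))))
    lo, hi = np.quantile(samples, [(1 - coverage) / 2, (1 + coverage) / 2])
    return samples.mean(), (lo, hi)

mean, (lo, hi) = mc_coverage_interval(np.array([0.2, -0.4]), x_std=0.05)
```

Unlike analytic propagation, this makes no linearity assumption about the network, which is why it suits a nonlinear perceptron.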

