Selecting the Best Quantity and Variety of Surrogates for an Ensemble Model

Mathematics ◽  
2020 ◽  
Vol 8 (10) ◽  
pp. 1721
Author(s):  
Pengcheng Ye ◽  
Guang Pan

Surrogate modeling techniques are widely used to replace computationally expensive black-box functions in engineering. As a combination of individual surrogate models, an ensemble of surrogates is preferred for its robustness. However, selecting the best quantity and variety of surrogates for an ensemble has always been a challenging task. In this work, five popular surrogate modeling techniques, namely polynomial response surface (PRS), radial basis functions (RBF), kriging (KRG), Gaussian process (GP) and linear Shepard (SHEP), are considered as the basic surrogate models, resulting in twenty-six ensemble models built with a previously presented weights selection method. The best ensemble model is sought through comparative studies of prediction accuracy and robustness. By testing eight mathematical problems and two engineering examples, we found that: (1) in general, using as many accurate surrogates as possible to construct ensemble models improves the prediction performance and (2) ensemble models act as insurance rather than offering significant improvements. Moreover, the ensemble of the three surrogates PRS, RBF and KRG is preferred based on the prediction performance. The results provide engineering practitioners with guidance on the choice of the quantity and variety of surrogates for an ensemble.
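
As a rough illustration of how an ensemble combines its member surrogates, the sketch below forms a weighted average of individual predictions, with weights inversely proportional to each surrogate's cross-validation error. This inverse-error rule is an assumption chosen for simplicity; the abstract refers only to "a previously presented weights selection method" and does not specify it. The function names, error values and predictions are hypothetical.

```python
def inverse_error_weights(cv_errors):
    """Weights proportional to 1/error, normalized to sum to 1 (assumed rule)."""
    inverse = [1.0 / e for e in cv_errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def ensemble_predict(predictions, weights):
    """Weighted average of the individual surrogate predictions at one point."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical cross-validation errors for three surrogates (e.g. PRS, RBF, KRG).
cv_errors = [0.5, 0.25, 0.25]
weights = inverse_error_weights(cv_errors)

# Hypothetical predictions of the three surrogates at a single test point.
prediction = ensemble_predict([1.0, 2.0, 3.0], weights)
```

Under this rule a surrogate with twice the error receives half the weight, so a poorly fitting member is damped rather than discarded.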

Author(s):  
Jie Zhang ◽  
Souma Chowdhury ◽  
Achille Messac ◽  
Junqiang Zhang ◽  
Luciano Castillo

This paper explores the effectiveness of the recently developed surrogate modeling method, the Adaptive Hybrid Functions (AHF), through its application to complex engineered systems design. The AHF is a hybrid surrogate modeling method that seeks to exploit the advantages of each component surrogate. In this paper, the AHF integrates three component surrogate models: (i) the Radial Basis Functions (RBF), (ii) the Extended Radial Basis Functions (E-RBF), and (iii) the Kriging model, by characterizing and evaluating the local measure of accuracy of each model. The AHF is applied to model complex engineering systems and an economic system, namely: (i) wind farm design; (ii) product family design (for universal electric motors); (iii) three-pane window design; and (iv) onshore wind farm cost estimation. We use three different sampling techniques to investigate their influence on the quality of the resulting surrogates. These sampling techniques are (i) Latin Hypercube Sampling (LHS), (ii) Sobol’s quasirandom sequence, and (iii) Hammersley Sequence Sampling (HSS). Cross-validation is used to evaluate the accuracy of the resulting surrogate models. As expected, the accuracy of the surrogate model was found to improve as the sample size increased. We also observed that the Sobol and LHS sampling techniques performed better for high-dimensional problems, whereas the HSS sampling technique performed better for low-dimensional problems. Overall, the AHF method was observed to provide acceptable-to-high accuracy in representing complex design systems.
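
Two of the sampling techniques named above can be sketched in a few lines. The code below is a minimal standard-library illustration of a Hammersley point set and a Latin hypercube design on the unit hypercube; it is not the implementation used in the paper, and the function names, point counts and random seed are hypothetical.

```python
import random


def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def hammersley(n_points, primes=(2,)):
    """Hammersley set: first coordinate i/n, remaining ones radical inverses."""
    return [
        [i / n_points] + [radical_inverse(i, p) for p in primes]
        for i in range(n_points)
    ]


def latin_hypercube(n_points, n_dims, rng):
    """Latin hypercube: exactly one point per stratum along every dimension."""
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [list(point) for point in zip(*columns)]


hss_points = hammersley(4)                            # 4 points in 2 dimensions
lhs_points = latin_hypercube(4, 2, random.Random(0))  # seeded for repeatability
```

The Hammersley set is deterministic, whereas the Latin hypercube design changes with the random seed while still covering each stratum once per dimension.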


2019 ◽  
Vol 141 (6) ◽  
Author(s):  
M. Giselle Fernández-Godino ◽  
S. Balachandar ◽  
Raphael T. Haftka

When simulations are expensive and multiple realizations are necessary, as is the case in uncertainty propagation, statistical inference, and optimization, surrogate models can achieve accurate predictions at low computational cost. In this paper, we explore options for improving the accuracy of a surrogate if the modeled phenomenon presents symmetries. These symmetries allow us to obtain free information and, therefore, the possibility of more accurate predictions. We present an analytical example along with a physical example that has parametric symmetries. Although imposing parametric symmetries in surrogate models seems to be a trivial matter, there is no single way to do it and, furthermore, the achieved accuracy might vary. We present four different ways of using symmetry in surrogate models. Three of them are straightforward, but the fourth is original and based on an optimization of the subset of points used. The performance of the options was compared with 100 random designs of experiments (DoEs) where symmetries were not imposed. We found that each of the options to include symmetries performed the best in one or more of the studied cases and, in all cases, the errors obtained imposing symmetries were substantially smaller than the worst cases among the 100. We explore the options for using symmetries in two surrogates that present different challenges and opportunities: Kriging and linear regression. Kriging is often used as a black box; therefore, we consider approaches to include the symmetries without changes in the main code. Linear regression, on the other hand, is often built by the user owing to its simplicity, so we also consider approaches that modify the linear regression basis functions to impose the symmetries.
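
One way to read "modifying the linear regression basis functions to impose the symmetries" is to restrict the basis to functions that already have the symmetry. The sketch below assumes a simple even symmetry, f(x) = f(-x), and fits y = a + b*x**2 by least squares, so the fitted surrogate is symmetric by construction. The symmetry type, the basis, and the sample data are all illustrative assumptions and are not taken from the paper.

```python
def fit_even_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x**2, a basis that is even in x by construction."""
    n = len(xs)
    z = [x * x for x in xs]                 # regress on z = x**2
    z_mean = sum(z) / n
    y_mean = sum(ys) / n
    s_zz = sum((zi - z_mean) ** 2 for zi in z)
    s_zy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, ys))
    b = s_zy / s_zz
    a = y_mean - b * z_mean
    return a, b


# Hypothetical samples taken only on x >= 0 from the even function y = 1 + 2*x**2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x * x for x in xs]
a, b = fit_even_quadratic(xs, ys)


def predict(x):
    """Evaluate the fitted surrogate; symmetric in x because only x**2 appears."""
    return a + b * x * x
```

Because the basis is even, samples on one side of the symmetry plane inform predictions on the other side at no extra simulation cost.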


2012 ◽  
Vol 518-523 ◽  
pp. 1586-1591
Author(s):  
Hao Zhang ◽  
Ze Meng Zhao ◽  
Ahmet Palazoglu ◽  
Wei Sun

Surface ozone in the air boundary layer is one of the most harmful air pollutants. It is produced by photochemical reactions between nitrogen oxides and volatile hydrocarbons and causes great damage to humans and the environment. The prediction of surface ozone levels plays an important role in the control and reduction of air pollutants. As model-driven statistical prediction models, hidden Markov models (HMMs) are rich in mathematical structure and work well in many important applications. Due to the complex structure of HMMs, long observation sequences increase the computational load geometrically. To reduce training time, wavelet decomposition is used to compress the original observations into shorter ones. During the compression step, observation sequences compressed by different wavelet basis functions retain different amounts of information, which may affect the prediction results. In this paper, the ozone prediction performance of HMMs based on different wavelet basis functions is discussed. Shannon entropy is employed to measure how much information content is kept in the new sequence compared to the original one. Data from the Houston metropolitan area, TX, are used in this paper. Results show that the wavelet basis functions used in the data compression step can affect HMM performance significantly. The new sequence with the maximum Shannon entropy generates the best prediction result.
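
The compression and entropy steps can be illustrated with a small sketch. It assumes a single-level Haar decomposition (only one of many possible wavelet bases) and estimates Shannon entropy from the empirical frequency of the values in a sequence; the abstract does not say how the entropy is estimated, so that choice, the function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def haar_approximation(signal):
    """One level of Haar decomposition: keep only the approximation coefficients."""
    return [
        (signal[i] + signal[i + 1]) / math.sqrt(2.0)
        for i in range(0, len(signal) - 1, 2)
    ]


def shannon_entropy(symbols):
    """Shannon entropy in bits of the empirical distribution of the symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical observation sequence (e.g. discretized ozone levels).
observations = [1, 1, 2, 2, 3, 3, 4, 4]
compressed = haar_approximation(observations)         # half as long
entropy_original = shannon_entropy(observations)
entropy_compressed = shannon_entropy([round(v, 6) for v in compressed])
```

Repeating the last two lines for several wavelet bases and keeping the one with the highest entropy would mirror the selection criterion the abstract describes.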


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5332
Author(s):  
Carlos A. Duchanoy ◽  
Hiram Calvo ◽  
Marco A. Moreno-Armendáriz

Surrogate Modeling (SM) is often used to reduce the computational burden of time-consuming system simulations. However, continuous advances in Artificial Intelligence (AI) and the spread of embedded sensors have led to the creation of Digital Twins (DT), Design Mining (DM), and Soft Sensors (SS). These methodologies represent a new challenge for the generation of surrogate models since they require the implementation of elaborate artificial intelligence algorithms while minimizing the number of physical experiments measured. To reduce the assessment of a physical system, several adaptive sequential sampling methodologies have been developed; however, they are limited for the most part to Kriging models and Kriging-model-based Monte Carlo Simulation. In this paper, we integrate a distinct adaptive sampling methodology with an automated machine learning methodology (AutoML) to help in the process of model selection while minimizing system evaluations and maximizing system performance for surrogate models based on artificial intelligence algorithms. In each iteration, this framework uses a grid search algorithm to determine the best candidate models and performs a leave-one-out cross-validation to calculate the performance of each sampled point. A Voronoi diagram is applied to partition the sampling region into local cells, and the Voronoi vertices are considered as new candidate points. The performance of the sample points is used to estimate the accuracy of the model for a set of candidate points to select those that will most improve the model’s accuracy. Then, the number of candidate models is reduced. Finally, the performance of the framework is tested using two examples to demonstrate the applicability of the proposed method.
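
The leave-one-out step described above can be sketched as follows. A one-dimensional nearest-neighbor predictor stands in for the candidate models, which in the paper are chosen by grid search; the predictor, the function names and the sample data are hypothetical, and the Voronoi-based candidate generation is not reproduced here.

```python
def nearest_neighbor_predict(train_x, train_y, x):
    """Stand-in surrogate: return the response of the closest training point."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[closest]


def leave_one_out_errors(xs, ys, predict=nearest_neighbor_predict):
    """Absolute prediction error at each sample when that sample is held out."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        errors.append(abs(predict(train_x, train_y, xs[i]) - ys[i]))
    return errors


# Hypothetical one-dimensional samples of an expensive system.
xs = [0.0, 1.0, 2.0, 4.0]
ys = [0.0, 1.0, 4.0, 16.0]
errors = leave_one_out_errors(xs, ys)

# Index of the sample whose held-out prediction is worst.
worst_index = max(range(len(errors)), key=errors.__getitem__)
```

The sample with the largest held-out error marks a region where the surrogate is least reliable, which is the kind of signal an adaptive sampler can use to place new points.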


Author(s):  
M. R. Brake ◽  
M. J. Starr ◽  
D. J. Segalman

Constrained layer frictional interfaces, such as joints, are prevalent in engineering applications. Because these interfaces are often used in built-up structures, reduced order modeling techniques are utilized for developing simulations of them. One limitation of the existing reduced order modeling techniques, though, is the loss of the local kinematics due to regularization of the frictional interfaces. This paper aims to avoid the use of regularization in the modeling of constrained layer frictional interfaces by utilizing a new technique, the discontinuous basis function method. This method supplements the linear mode shapes of the system with a series of discontinuous basis functions that are used to account for nonlinear forces acting on the system. A symmetric, constrained layer frictional interface is modeled as a continuous system connected to two rigid planes by a series of Iwan elements. This symmetric model is used to test the hypothesis that symmetric problems are not subjected to the range of variability seen in physical structures, which have non-uniform pressure and friction distributions. Insights from solving the symmetric problem are used to consider the case where a non-uniform distribution of friction and pressure exists.


2014 ◽  
Vol 670-671 ◽  
pp. 548-553
Author(s):  
Wen Rui Duan ◽  
Ling Tian

To analyze the performance of a Capacitively Coupled Plasma (CCP) etcher, commercial software such as OPTIMUS can be applied to approximate the etch process model using the Response Surface Method (RSM) or Radial Basis Functions (RBF). The etch process involves multiple parameters, such as the frequencies of the dual Radio Frequency (RF) system and the flow rate and flow ratio of the process gas. When facing this multi-dimensional problem, the algorithms become inefficient, and the optimization process may be trapped in a local minimum or fail to converge because of oscillation. To improve surrogate modeling for the CCP etcher, a self-optimizing RBF (SO-RBF) algorithm is proposed and a process modeling tool is developed. Experiments on a state-of-the-art dual-station CCP etcher show that, based on the global approximation model generated by this algorithm, process parameter optimization can be easily implemented with less error than with OPTIMUS.


Author(s):  
Yong Zhao ◽  
Siyu Ye ◽  
Xianqi Chen ◽  
Yufeng Xia ◽  
Xiaohu Zheng

Polynomial Regression Surface (PRS) is a commonly used surrogate model for its simplicity, good interpretability, and computational efficiency. The performance of PRS is largely dependent on its basis functions. With limited samples, how to correctly select basis functions remains a challenging problem. To improve prediction accuracy, a PRS modeling approach based on multitask optimization and ensemble modeling (PRS-MOEM) is proposed for rational basis function selection with robustness. First, the training set is partitioned into multiple subsets by the cross validation method, and for each subset a sub-model is independently constructed by optimization. To effectively solve these multiple optimization tasks, an improved evolutionary algorithm with transfer migration is developed, which can enhance the optimization efficiency and robustness by useful information exchange between these similar optimization tasks. Second, a novel ensemble method is proposed to integrate the multiple sub-models into the final model. The significance of each basis function is scored according to the error estimation of the sub-models and the occurrence frequency of the basis functions in all the sub-models. Then the basis functions are ranked and selected based on the bias-corrected Akaike’s information criterion. PRS-MOEM can effectively mitigate the negative influence from the sub-models with large prediction error, and alleviate the uncertain impact resulting from the randomness of training subsets. Thus the basis function selection accuracy and robustness can be enhanced. Seven numerical examples and an engineering problem are utilized to test and verify the effectiveness of PRS-MOEM.
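
For reference, the bias-corrected Akaike information criterion mentioned above can be computed as in the sketch below. It uses the common least-squares form, AIC = n*ln(RSS/n) + 2k, with the small-sample correction 2k(k+1)/(n-k-1); whether the paper uses exactly this form, and how it counts parameters, is not stated in the abstract. The candidate models and their residual sums of squares are hypothetical.

```python
import math


def aicc(rss, n_samples, n_params):
    """Bias-corrected AIC for a least-squares fit with Gaussian errors."""
    if n_samples - n_params - 1 <= 0:
        raise ValueError("AICc requires n_samples > n_params + 1")
    aic = n_samples * math.log(rss / n_samples) + 2 * n_params
    correction = 2 * n_params * (n_params + 1) / (n_samples - n_params - 1)
    return aic + correction


# Hypothetical candidates: (label, residual sum of squares, number of basis functions).
candidates = [("linear", 10.0, 2), ("quadratic", 4.0, 3), ("cubic", 3.9, 4)]
n_samples = 10
scores = {name: aicc(rss, n_samples, k) for name, rss, k in candidates}

# The candidate with the lowest AICc is preferred.
best = min(scores, key=scores.get)
```

With few samples the correction term grows quickly with the number of basis functions, which is why the cubic candidate loses here despite its slightly smaller residual.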


2018 ◽  
Vol 12 (1) ◽  
pp. 1-16 ◽  
Author(s):  
Shreshth Nagpal ◽  
Caitlin Mueller ◽  
Arfa Aijazi ◽  
Christoph F. Reinhart
