An Enhanced Squared Exponential Kernel With Manhattan Similarity Measure for High Dimensional Gaussian Process Models

2021 ◽  
Author(s):  
Yanwen Xu ◽  
Pingfeng Wang

Abstract The Gaussian Process (GP) model has become one of the most popular surrogate modeling methods and exhibits superior performance in many engineering design applications. However, the standard Gaussian process model cannot handle high-dimensional applications. The root of the problem lies in the GP model's similarity measure, which relies on the Euclidean distance; in high-dimensional cases this distance becomes uninformative, causing accuracy and efficiency issues. Few studies have explored this issue. In this study, we therefore propose an enhanced squared exponential kernel based on the Manhattan distance, which better preserves the meaningfulness of proximity measures and is preferable for GP models in high-dimensional cases. Experiments show that the proposed approach achieves superior performance on high-dimensional problems. Based on the analysis and experimental results for different similarity metrics, the paper provides a guide for choosing the similarity measures that yield the most accurate and efficient Kriging models across different sample sizes and dimension levels.
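
The abstract does not give the exact functional form of the enhanced kernel, but the core idea can be illustrated by substituting the Manhattan (L1) distance for the Euclidean distance in the usual squared exponential template. The following Python sketch is only an assumption of what such a kernel could look like; the length scale, variance, and exact exponent used by the authors may differ.

import numpy as np

def manhattan_se_kernel(X1, X2, length_scale=1.0, variance=1.0):
    # Squared-exponential-style kernel with the pairwise Euclidean distance
    # replaced by the Manhattan (L1) distance. Illustrative only: the exact
    # form used in the paper is not stated in the abstract.
    d1 = np.abs(X1[:, None, :] - X2[None, :, :]).sum(axis=-1)   # (n1, n2) L1 distances
    return variance * np.exp(-0.5 * (d1 / length_scale) ** 2)

# Toy usage: covariance matrix for five 50-dimensional points
X = np.random.default_rng(0).normal(size=(5, 50))
K = manhattan_se_kernel(X, X, length_scale=10.0)
print(K.shape)  # (5, 5)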

Author(s):  
Yanwen Xu ◽  
Pingfeng Wang

Abstract The Gaussian Process (GP) model has become one of the most popular methods for developing computationally efficient surrogate models in many engineering design applications, including simulation-based design optimization and uncertainty analysis. When more observations are used for high-dimensional problems, estimating the best parameters of a Gaussian Process model remains an essential yet challenging task due to the considerable computation cost. One of the most commonly used methods to estimate model parameters is Maximum Likelihood Estimation (MLE). A common bottleneck in MLE is computing the log determinant and inverse of a large positive definite matrix. In this paper, five commonly used gradient-based and gradient-free optimizers, namely Sequential Quadratic Programming (SQP), the Quasi-Newton method, the Interior Point method, the Trust Region method, and Pattern Line Search, are compared for likelihood optimization in high-dimensional GP surrogate modeling problems. The comparison focuses on estimation accuracy, computational efficiency, and robustness of each method for different types of kernel functions.
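
The log determinant and the linear solve that dominate the MLE cost are typically obtained from a single Cholesky factorization of the covariance matrix. The sketch below shows the standard negative log marginal likelihood of a zero-mean GP; the optimizer is a generic quasi-Newton routine (SciPy's L-BFGS-B) used purely as a stand-in, not one of the five solvers compared in the paper, and the kernel and data are placeholders.

import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize

def se_kernel(X1, X2, length_scale):
    # Standard squared exponential kernel on Euclidean distance
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def neg_log_marginal_likelihood(log_params, X, y):
    # Negative log marginal likelihood of a zero-mean GP; the log determinant
    # and the solve against K both come from one Cholesky factorization.
    length_scale, noise = np.exp(log_params)
    K = se_kernel(X, X, length_scale) + noise * np.eye(len(X))
    L = cho_factor(K, lower=True)
    alpha = cho_solve(L, y)
    log_det = 2.0 * np.sum(np.log(np.diag(L[0])))
    return 0.5 * (y @ alpha + log_det + len(y) * np.log(2.0 * np.pi))

# Placeholder data and a generic quasi-Newton fit (stand-in optimizer)
rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 10))
y = np.sin(X.sum(axis=1)) + 0.1 * rng.normal(size=100)
res = minimize(neg_log_marginal_likelihood, x0=np.log([1.0, 0.1]),
               args=(X, y), method="L-BFGS-B")
print(np.exp(res.x))  # estimated length scale and noise variance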


Sensors ◽  
2019 ◽  
Vol 19 (21) ◽  
pp. 4610 ◽  
Author(s):  
Adolfo Molada-Tebar ◽  
Gabriel Riutort-Mayol ◽  
Ángel Marqués-Mateu ◽  
José Luis Lerma

In this paper, we propose a novel approach to the colorimetric camera characterization procedure based on a Gaussian process (GP). GPs are powerful and flexible nonparametric models for multivariate nonlinear functions. To validate the GP model, we compare the results with those of a second-order polynomial model, which is the most widely used regression model for characterization purposes. We applied the methodology to a set of raw images of rock art scenes collected with two different Single Lens Reflex (SLR) cameras. A leave-one-out cross-validation (LOOCV) procedure was used to assess the predictive performance of the models in terms of CIE XYZ residuals and ΔE*ab color differences. ΔE*ab values of less than 3 CIELAB units were achieved. The output sRGB characterized images show that both regression models are suitable for practical applications in cultural heritage documentation. However, the results show that the colorimetric characterization based on the Gaussian process provides significantly better results, with lower residuals and ΔE*ab values. We also analyzed the noise induced into the output image by the camera characterization. As the noise depends on the specific camera, proper camera selection is essential for the photogrammetric work.
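
A minimal way to reproduce the comparison between a GP and a second-order polynomial under leave-one-out cross-validation is sketched below with scikit-learn. The RGB/XYZ arrays are synthetic placeholders (real studies use measured colour-checker patches), the RBF kernel choice is an assumption, and only an XYZ RMSE is reported rather than the CIELAB ΔE*ab differences used in the paper.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data: raw camera RGB for 24 patches and reference CIE XYZ
rng = np.random.default_rng(0)
rgb = rng.uniform(size=(24, 3))
xyz = rgb @ rng.uniform(0.5, 1.5, size=(3, 3)) + 0.02 * rng.normal(size=(24, 3))

models = {
    "2nd-order polynomial": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    "Gaussian process": GaussianProcessRegressor(ConstantKernel() * RBF(), normalize_y=True),
}

# Leave-one-out predictions and RMSE in XYZ for each model
for name, model in models.items():
    pred = cross_val_predict(model, rgb, xyz, cv=LeaveOneOut())
    rmse = np.sqrt(((pred - xyz) ** 2).mean())
    print(f"{name}: LOOCV RMSE in XYZ = {rmse:.4f}")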


Author(s):  
Hongxu Zhao ◽  
Ran Jin ◽  
Su Wu ◽  
Jianjun Shi

Thickness uniformity of wafers is a critical quality measure in the wire saw slicing process. Nonuniformity occurs when the material removal rate (MRR) changes over time during slicing, and it poses a significant problem for downstream processes such as lapping and polishing. Therefore, the MRR should be modeled and controlled to maintain thickness uniformity. In this paper, a PDE-constrained Gaussian process model is developed based on a global Galerkin discretization of the governing partial differential equations (PDEs). Three features are incorporated into the statistical model: (1) the PDEs governing the wire saw slicing process, obtained from engineering knowledge, (2) the systematic errors of the manufacturing process, and (3) the random errors, including both random manufacturing errors and measurement noise. Real experiments are conducted to provide data for validating the PDE-constrained Gaussian process model, by estimating the model coefficients and then using the model to predict the overall MRR profile. The cross-validation results indicate that the prediction performance of the PDE-constrained Gaussian process model is better than that of the widely used universal Kriging model with a second-order polynomial mean function.


Author(s):  
Wei Li ◽  
Akhil Garg ◽  
Mi Xiao ◽  
Liang Gao

Abstract The power of electric vehicles (EVs) comes from lithium-ion batteries (LIBs). LIBs are sensitive to temperature: temperatures that are too high or too low affect the performance and safety of EVs. Therefore, a stable and efficient battery thermal management system (BTMS) is essential for an EV. This article presents a comprehensive study on liquid-cooled BTMS. Two cooling schemes are designed: a serpentine channel and a U-shaped channel. The results show that the cooling effect of the two schemes is roughly the same, but the U-shaped channel significantly decreases the pressure drop (PD) loss. The U-shaped channel is then parameterized and modeled. A machine learning method, the Gaussian process (GP) model, is used to express outputs such as the temperature difference, temperature standard deviation, and pressure drop. A multi-objective optimization model is established using the GP models, and the NSGA-II method is employed to drive the optimization process. The optimized scheme is compared with the initial design. The main findings are summarized as follows: the cooling water velocity v decreases from 0.3 m/s to 0.22 m/s, by 26.67%, and the pressure drop decreases from 431.40 Pa to 327.11 Pa, by 24.18%. The optimized solution thus significantly reduces the pressure drop and helps reduce parasitic power. The proposed method can provide a useful guideline for the liquid cooling design of large-scale battery packs.
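
The surrogate-then-optimize workflow described above can be sketched as follows: fit one GP per response to the sampled design data, then run NSGA-II on the GP predictions. The sketch assumes scikit-learn for the GPs and the pymoo library for NSGA-II; the design variables, bounds, and response functions are invented placeholders, not the CFD data from the study.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import ElementwiseProblem
from pymoo.optimize import minimize

# Placeholder design-of-experiments data: two design variables (e.g. channel
# width and coolant velocity) and three responses that in the study would
# come from CFD simulations of the U-shaped channel.
rng = np.random.default_rng(0)
X = rng.uniform([1.0, 0.1], [5.0, 0.5], size=(40, 2))
Y = np.column_stack([
    5.0 + 2.0 / X[:, 1] + 0.1 * X[:, 0],   # temperature difference
    1.0 + 0.5 / X[:, 1],                   # temperature standard deviation
    50.0 * X[:, 1] ** 2 / X[:, 0],         # pressure drop
])

# One GP surrogate per response
gps = [GaussianProcessRegressor(ConstantKernel() * RBF([1.0, 1.0]), normalize_y=True).fit(X, Y[:, j])
       for j in range(3)]

class BTMSProblem(ElementwiseProblem):
    # Minimize the three GP-predicted responses over the design space
    def __init__(self):
        super().__init__(n_var=2, n_obj=3, xl=[1.0, 0.1], xu=[5.0, 0.5])

    def _evaluate(self, x, out, *args, **kwargs):
        out["F"] = [gp.predict(x.reshape(1, -1))[0] for gp in gps]

res = minimize(BTMSProblem(), NSGA2(pop_size=40), ("n_gen", 30), seed=1, verbose=False)
print(res.F[:3])  # a few points on the approximated Pareto front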


Author(s):  
Hongyi Xu ◽  
Zhen Jiang ◽  
Daniel W. Apley ◽  
Wei Chen

Data-driven random process models have become increasingly important for uncertainty quantification (UQ) in science and engineering applications, due to their merit of capturing both the marginal distributions and the correlations of high-dimensional responses. However, the choice of a random process model is neither unique nor straightforward. To quantitatively validate the accuracy of random process UQ models, new metrics are needed to measure their capability in capturing the statistical information of high-dimensional data collected from simulations or experimental tests. In this work, two goodness-of-fit (GOF) metrics, namely, a statistical moment-based metric (SMM) and an M-margin U-pooling metric (MUPM), are proposed for comparing different stochastic models, taking into account their capabilities of capturing the marginal distributions and the correlations in spatial/temporal domains. This work demonstrates the effectiveness of the two proposed metrics by comparing the accuracies of four random process models (Gaussian process (GP), Gaussian copula, Hermite polynomial chaos expansion (PCE), and Karhunen–Loeve (K–L) expansion) in multiple numerical examples and an engineering example of stochastic analysis of microstructural materials properties. In addition to the new metrics, this paper provides insights into the pros and cons of various data-driven random process models in UQ.
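
The exact definitions of the SMM and MUPM metrics are not given in this abstract, but the underlying idea of a moment-based goodness-of-fit check can be illustrated as follows: compare the pointwise means, standard deviations, and correlation matrix of the observed realizations with those of realizations drawn from a fitted random process model. The function and the toy Gaussian-process example below are illustrative assumptions only.

import numpy as np

def moment_discrepancy(data, model_samples):
    # Crude moment-based comparison between observed realizations of a random
    # process (rows = realizations, columns = spatial/temporal points) and
    # realizations drawn from a fitted stochastic model. Not the SMM or MUPM
    # of the paper, only an illustration of the idea.
    return {
        "mean": np.abs(data.mean(axis=0) - model_samples.mean(axis=0)).mean(),
        "std": np.abs(data.std(axis=0) - model_samples.std(axis=0)).mean(),
        "corr": np.abs(np.corrcoef(data, rowvar=False)
                       - np.corrcoef(model_samples, rowvar=False)).mean(),
    }

# Toy check: a GP model with the wrong correlation length scores worse
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
true_cov = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 0.1 ** 2) + 1e-8 * np.eye(50)
bad_cov = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 0.5 ** 2) + 1e-8 * np.eye(50)
data = rng.multivariate_normal(np.zeros(50), true_cov, size=200)
good = rng.multivariate_normal(np.zeros(50), true_cov, size=200)
bad = rng.multivariate_normal(np.zeros(50), bad_cov, size=200)
print(moment_discrepancy(data, good))   # small discrepancies
print(moment_discrepancy(data, bad))    # larger "corr" discrepancy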


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-22 ◽  
Author(s):  
Xunfeng Wu ◽  
Shiwen Zhang ◽  
Zhe Gong ◽  
Junkai Ji ◽  
Qiuzhen Lin ◽  
...  

In recent years, a number of recombination operators have been proposed for multiobjective evolutionary algorithms (MOEAs). One kind of recombination operator is designed based on the Gaussian process model. However, this approach uses only one standard Gaussian process model with a fixed variance, which may not work well across diverse multiobjective optimization problems (MOPs). To alleviate this problem, this paper introduces a decomposition-based multiobjective evolutionary algorithm with adaptive multiple Gaussian process models, aiming to provide a more effective heuristic search for various MOPs. To select the most suitable Gaussian process model, an adaptive selection strategy is designed based on the performance improvements achieved on a number of decomposed subproblems. In this way, the proposed algorithm has more search patterns and can produce more diversified solutions. The performance of the algorithm is validated on well-known F, UF, and WFG test instances, and the experiments confirm that it shows advantages over six competitive MOEAs.
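
The adaptive selection strategy is only described at a high level here; one common way to realize such a mechanism is credit-based operator selection, where each candidate Gaussian process model (e.g. a different variance setting) is picked with probability proportional to how often it recently improved the decomposed subproblems. The sketch below shows that generic mechanism, not the exact selection rule of the cited algorithm.

import numpy as np

def choose_gp_model(improvements, usages, rng, eps=1e-6):
    # Credit-based selection among candidate GP recombination models:
    # the probability of picking a model is proportional to its recent
    # rate of improving the decomposed subproblems.
    rates = (improvements + eps) / (usages + eps)
    return rng.choice(len(rates), p=rates / rates.sum())

# Toy usage: three candidate GP models (e.g. small / medium / large variance)
rng = np.random.default_rng(0)
improvements = np.zeros(3)
usages = np.zeros(3)
for generation in range(100):
    k = choose_gp_model(improvements, usages, rng)
    usages[k] += 1
    # Placeholder: pretend the second model improves subproblems most often
    improvements[k] += rng.random() < (0.2, 0.6, 0.3)[k]
print(improvements / np.maximum(usages, 1))   # observed success rates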


2021 ◽  
Author(s):  
Santiago Belda ◽  
Matías Salinero ◽  
Eatidal Amin ◽  
Luca Pipia ◽  
Pablo Morcillo-Pallarés ◽  
...  

In general, modeling phenological evolution is a challenging task, mainly because of time series gaps and noisy data coming from different viewing and illumination geometries, cloud cover, seasonal snow, and the revisit interval needed to acquire data for the exact same location. For that reason, reliable gap-filling fitting functions and smoothing filters are frequently required for retrievals at the highest feasible accuracy. Of specific interest for filling gaps in time series is the emergence of machine learning regression algorithms (MLRAs), which can serve as fitting functions. Among the multiple MLRA approaches currently available, kernel-based methods developed in a Bayesian framework deserve special attention because they are adaptive and provide associated uncertainty estimates; Gaussian Process Regression (GPR) is a prime example.

Recent studies demonstrated the effectiveness of GPR for gap-filling of biophysical parameter time series because the hyperparameters can be optimally set for each time series (one per pixel in the area) with a single optimization procedure. The entire procedure of learning a GPR model relies only on an appropriate selection of the kernel type and the hyperparameters involved in estimating the covariance of the input data. Despite this clear strategic advantage, the most important shortcomings of the technique are (1) the high computational cost and (2) the memory requirements of training, which grow cubically and quadratically with the number of training samples, respectively. This can become problematic when processing large amounts of data, such as Sentinel-2 (S2) time series tiles. Hence, optimization strategies are needed to speed up GPR processing while maintaining its superior accuracy.

To mitigate this computational burden and avoid the repetitive per-pixel training procedure, we evaluated whether the GPR hyperparameters can be preoptimized over a reduced set of representative pixels and kept fixed over a more extended crop area. We used S2 LAI time series over an agricultural region in Castile and León (north-west Spain) and tested different covariance functions: the exponential kernel, the squared exponential kernel, and the Matérn kernel with parameter 3/2 or 5/2. The performance of the image reconstructions was compared against the standard per-pixel GPR training process. Results showed that accuracies were of the same order (12% RMSE degradation), whereas processing was up to 90 times faster. Crop phenology indicators were also calculated and compared, revealing similar temporal patterns with differences in the start and end of the growing season of no more than five days. To the benefit of crop monitoring applications, all the gap-filling and phenology indicator retrieval techniques have been implemented in the freely downloadable GUI toolbox DATimeS (Decomposition and Analysis of Time Series Software, https://artmotoolbox.com/).
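
The preoptimization strategy can be sketched with scikit-learn (the study itself uses its own toolbox): optimize the kernel hyperparameters once on a representative pixel, then reuse the fitted kernel with optimization switched off for every remaining pixel. The acquisition days and LAI values below are invented placeholders, and the Matérn 3/2 plus noise kernel is just one of the covariance choices mentioned above.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, ConstantKernel, WhiteKernel

# Placeholder LAI time series: acquisition days (with gaps) and LAI values
# for a "representative" pixel and for another pixel of the same crop area.
days_rep = np.array([5, 25, 45, 80, 100, 130, 160, 200, 230, 260])[:, None]
lai_rep  = np.array([0.3, 0.5, 1.2, 2.8, 3.6, 4.0, 3.1, 1.5, 0.6, 0.4])
days_px  = np.array([10, 50, 90, 140, 190, 250])[:, None]
lai_px   = np.array([0.4, 1.5, 3.4, 3.8, 1.9, 0.5])

# Step 1: optimize the hyperparameters once, on the representative pixel.
kernel = ConstantKernel() * Matern(length_scale=30.0, nu=1.5) + WhiteKernel()
gpr_rep = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(days_rep, lai_rep)

# Step 2: reuse the optimized kernel with optimization switched off, so every
# remaining pixel is fitted without the costly hyperparameter training step.
gpr_px = GaussianProcessRegressor(kernel=gpr_rep.kernel_, optimizer=None,
                                  normalize_y=True).fit(days_px, lai_px)

# Gap-filled reconstruction at a daily step, with uncertainty.
grid = np.arange(1, 271)[:, None]
mean, std = gpr_px.predict(grid, return_std=True)
print(mean[:5], std[:5])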


2020 ◽  
Vol 36 (12) ◽  
pp. 3795-3802
Author(s):  
Arttu Arjas ◽  
Andreas Hauptmann ◽  
Mikko J Sillanpää

Abstract Motivation Improved DNA technology has made it practical to estimate single-nucleotide polymorphism (SNP) heritability among distantly related individuals with unknown relationships. For growth- and development-related traits, it is meaningful to base SNP-heritability estimation on longitudinal data due to the time dependency of the process. However, only a few statistical methods have been developed so far for estimating dynamic SNP-heritability and quantifying its full uncertainty. Results We introduce a completely tuning-free Bayesian Gaussian process (GP)-based approach for estimating dynamic variance components and heritability as a function of these components. For parameter estimation, we use a modern Markov chain Monte Carlo method which allows full uncertainty quantification. Several datasets are analysed, and our results clearly illustrate that the 95% credible intervals of the proposed joint estimation method (which ‘borrows strength’ from adjacent time points) are significantly narrower than those of a two-stage baseline method that first estimates the variance components at each time point independently and then performs smoothing. We compare the method with a random regression model implemented in the MTG2 and BLUPF90 software, and quantitative measures indicate superior performance of our method. Results are presented for simulated and real data with up to 1000 time points. Finally, we demonstrate the scalability of the proposed method on simulated data with tens of thousands of individuals. Availability and implementation The C++ implementation dynBGP and the simulated data are available on GitHub: https://github.com/aarjas/dynBGP. The programmes can be run in R. Real datasets are available in the QTL archive: https://phenome.jax.org/centers/QTLA. Supplementary information Supplementary data are available at Bioinformatics online.


2017 ◽  
Vol 40 (6) ◽  
pp. 1799-1807 ◽  
Author(s):  
Mehdi Ghasemi Naraghi ◽  
Yousef Alipouri

In this paper, we utilize the probability density function of the data to estimate the minimum variance lower bound (MVLB) of a nonlinear system. For this purpose, the Gaussian Process (GP) model is used. With this model, given a new input and based on past observations, we naturally obtain the variance of the predictive distribution of the future output, which enables us to estimate the MVLB as well as the estimation uncertainty. An additional advantage of the proposed method over others is its ability to estimate the MVLB recursively. The application of this method to a real-life dynamic process (an experimental four-tank process) indicates that the approach gives very credible estimates of the MVLB.
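
The core idea can be illustrated with a hedged scikit-learn sketch: fit a GP from past inputs to outputs and read off the predictive variance at new operating points as a data-driven estimate of the irreducible output variance. The regressors, kernel, and data below are invented placeholders, and the recursive update and the exact MVLB definition (which involves the process time delay) are not reproduced here.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

# Placeholder closed-loop data: regressors built from past inputs/outputs and
# the corresponding process output with 0.1-std irreducible noise.
rng = np.random.default_rng(0)
U = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(U[:, 0]) + 0.5 * U[:, 1] + 0.1 * rng.normal(size=200)

kernel = ConstantKernel() * RBF(length_scale=[1.0, 1.0, 1.0]) + WhiteKernel()
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(U, y)

# Predictive standard deviation at new operating points: a data-driven
# surrogate for the unavoidable (minimum-variance) part of the output spread.
U_new = rng.uniform(-1.0, 1.0, size=(5, 3))
_, std = gp.predict(U_new, return_std=True)
print("estimated lower bound on output std:", std.round(3))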

