Linking Gaussian process regression with data-driven manifold embeddings for nonlinear data fusion

2019 ◽  
Vol 9 (3) ◽  
pp. 20180083 ◽  
Author(s):  
Seungjoon Lee ◽  
Felix Dietrich ◽  
George E. Karniadakis ◽  
Ioannis G. Kevrekidis

In statistical modelling with Gaussian process regression, it has been shown that combining (few) high-fidelity data with (many) low-fidelity data can enhance prediction accuracy, compared to prediction based on the few high-fidelity data only. Such information fusion techniques for multi-fidelity data commonly approach the high-fidelity model f_h(t) as a function of two variables (t, s), and then use f_l(t) as the s data. More generally, the high-fidelity model can be written as a function of several variables (t, s_1, s_2, …); the low-fidelity model f_l and, say, some of its derivatives can then be substituted for these variables. In this paper, we will explore mathematical algorithms for multi-fidelity information fusion that use such an approach towards improving the representation of the high-fidelity function with only a few training data points. Given that f_h may not be a simple function (and sometimes not even a function) of f_l, we demonstrate that using additional functions of t, such as derivatives or shifts of f_l, can drastically improve the approximation of f_h through Gaussian processes. We also point out a connection with ‘embedology’ techniques from topology and dynamical systems. Our illustrative examples range from instructive caricatures to computational biology models, such as Hodgkin–Huxley neural oscillations.
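
As a rough illustration of the idea in this abstract, the sketch below fits a Gaussian process to a few high-fidelity samples using the augmented input (t, f_l(t), f_l(t − τ)). The two test functions, the delay τ, the squared-exponential kernel, its fixed length scale and the unit prior variance are all assumptions chosen for illustration; none of them is taken from the paper, and no hyperparameter optimization is attempted.

```python
import numpy as np

def f_low(t):
    # Hypothetical low-fidelity model (illustrative choice only).
    return np.sin(8 * np.pi * t)

def f_high(t):
    # Hypothetical high-fidelity model: phase-shifted and squared,
    # so it is not a single-valued function of f_low(t) alone.
    return np.sin(8 * np.pi * t + np.pi / 10) ** 2

def embed(t, tau=0.02):
    # Augmented input: t, f_l(t) and a shifted copy f_l(t - tau).
    return np.column_stack([t, f_low(t), f_low(t - tau)])

def rbf(A, B, length=1.0):
    # Squared-exponential kernel between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_fit_predict(X_train, y_train, X_test, noise=1e-6):
    # Standard zero-mean GP posterior mean and variance (unit prior variance).
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf(X_test, X_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = 1.0 - (v**2).sum(axis=0)
    return mean, var

# Few high-fidelity samples; the low-fidelity model is cheap to evaluate anywhere.
t_train = np.linspace(0.0, 1.0, 15)
t_test = np.linspace(0.0, 1.0, 200)
mean, var = gp_fit_predict(embed(t_train), f_high(t_train), embed(t_test))
rmse = float(np.sqrt(np.mean((mean - f_high(t_test)) ** 2)))
```

Dropping the third column of `embed` gives the two-variable (t, s) formulation mentioned above, which makes it easy to compare the two input choices on the same data.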

Author(s):  
Arvind Keprate ◽  
R. M. Chandima Ratnayake ◽  
Shankar Sankararaman

The main aim of this paper is to perform the validation of the adaptive Gaussian process regression model (AGPRM) developed by the authors for the Stress Intensity Factor (SIF) prediction of a crack propagating in topside piping. For validation purposes, the values of SIF obtained from experiments available in the literature are used. Sixty-six data points (consisting of L, a, c and SIF values obtained by experiments) are used to train the AGPRM, while four independent data sets are used for validation purposes. The experimental validation of the AGPRM also consists of the comparison of the prediction accuracy of AGPRM and Finite Element Method (FEM) relative to the experimentally derived SIF values. Four metrics, namely, Root Mean Square Error (RMSE), Average Absolute Error (AAE), Maximum Absolute Error (MAE), and Coefficient of Determination (R²), are used to compare the accuracy. A case study illustrating the development and experimental validation of the AGPRM is presented. Results indicate that the prediction accuracy of the AGPRM is comparable with, and even higher than, that of the FEM, provided the training points of the AGPRM are aptly chosen.
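
The four accuracy metrics named in this abstract can be sketched as follows. These are the common textbook definitions; the abstract does not state the exact normalisation used in the paper, so treat the formulas as an assumption. The sample values are made up for illustration and are not SIF data from the study.

```python
import math

def rmse(y_true, y_pred):
    # Root mean square error.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def average_absolute_error(y_true, y_pred):
    # Mean of the absolute errors (unnormalised).
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def maximum_absolute_error(y_true, y_pred):
    # Largest single absolute error.
    return max(abs(t - p) for t, p in zip(y_true, y_pred))

def r_squared(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up reference and predicted values, for illustration only.
measured = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 18.0]

scores = {
    "RMSE": rmse(measured, predicted),
    "AAE": average_absolute_error(measured, predicted),
    "MAE": maximum_absolute_error(measured, predicted),
    "R2": r_squared(measured, predicted),
}
```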


2019 ◽  
Vol 2019 ◽  
pp. 1-20 ◽  
Author(s):  
Haopeng Zhang ◽  
Cong Zhang ◽  
Zhiguo Jiang ◽  
Yuan Yao ◽  
Gang Meng

In this paper, we address the problem of vision-based satellite recognition and pose estimation, which is to recognize the satellite from multiviews and estimate the relative poses using imaging sensors. We propose a vision-based method to solve these two problems using Gaussian process regression (GPR). Assuming that the regression function mapping from the image (or feature) of the target satellite to its category or pose follows a Gaussian process (GP) properly parameterized by a mean function and a covariance function, the predictive equations can be easily obtained by a maximum-likelihood approach when training data are given. These explicit formulations can not only offer the category or estimated pose by the mean value of the predicted output but also give its uncertainty by the variance which makes the predicted result convincing and applicable in practice. Besides, we also introduce a manifold constraint to the output of the GPR model to improve its performance for satellite pose estimation. Extensive experiments are performed on two simulated image datasets containing satellite images of 1D and 2D pose variations, as well as different noises and lighting conditions. Experimental results validate the effectiveness and robustness of our approach.
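
For reference, the standard Gaussian process predictive equations that this abstract alludes to can be written as below. The notation is an assumption, not the paper's: X and y are the training inputs and targets, x_* is a test input, m is the mean function, k the covariance function, K the matrix with entries k(x_i, x_j), k_* the vector with entries k(x_i, x_*), and σ_n² an assumed Gaussian noise variance.

```latex
\begin{aligned}
\mu(x_*)      &= m(x_*) + k_*^{\top}\,\bigl(K + \sigma_n^{2} I\bigr)^{-1}\,\bigl(y - m(X)\bigr),\\
\sigma^{2}(x_*) &= k(x_*, x_*) - k_*^{\top}\,\bigl(K + \sigma_n^{2} I\bigr)^{-1}\,k_* .
\end{aligned}
```

The mean μ(x_*) plays the role of the estimated category or pose, and σ²(x_*) is the variance that quantifies its uncertainty.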


2021 ◽  
Vol 7 (2) ◽  
pp. 287-290
Author(s):  
Jannik Prüßmann ◽  
Jan Graßhoff ◽  
Philipp Rostalski

Gaussian processes are a versatile tool for data processing. Unfortunately, due to storage and runtime requirements, standard Gaussian process (GP) methods are limited to a few thousand data points. Thus, they are infeasible in most biomedical, spatio-temporal problems. The methods treated in this work cover GP inference and hyperparameter optimization, exploiting the Kronecker structure of covariance matrices. To solve regression and source separation problems, two different approaches are presented. The first approach uses efficient matrix-vector-products, whilst the second approach is based on efficient solutions to the eigendecomposition. The latter also enables efficient hyperparameter optimization. In comparison to standard GP methods, the proposed methods can be applied to very large biomedical datasets without any further performance loss and perform substantially faster. The performance is demonstrated on esophageal manometry data, where the cardiac and respiratory signal components are to be inferred by source separation.
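
The two ingredients mentioned in this abstract, fast matrix-vector products and eigendecomposition-based solves for Kronecker-structured covariances, rest on standard linear-algebra identities. The sketch below shows those identities for a two-factor covariance K_a ⊗ K_b; it is not the authors' implementation, and the kernel, grid sizes and noise level are assumptions made for illustration.

```python
import numpy as np

def rbf(x, length):
    # Squared-exponential kernel on a 1-D grid.
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def kron_matvec(K_a, K_b, x):
    # Computes (K_a ⊗ K_b) x without ever forming the Kronecker product.
    X = x.reshape(K_a.shape[0], K_b.shape[0])
    return (K_a @ X @ K_b.T).ravel()

def kron_solve(K_a, K_b, y, noise):
    # Solves (K_a ⊗ K_b + noise * I) alpha = y using eigendecompositions
    # of the small factors only.
    w_a, Q_a = np.linalg.eigh(K_a)
    w_b, Q_b = np.linalg.eigh(K_b)
    Y = y.reshape(len(w_a), len(w_b))
    Y_tilde = Q_a.T @ Y @ Q_b                  # rotate into the joint eigenbasis
    Y_tilde = Y_tilde / (np.outer(w_a, w_b) + noise)  # eigenvalues of the full matrix
    return (Q_a @ Y_tilde @ Q_b.T).ravel()     # rotate back

# Small illustrative problem: a 6 x 5 grid (e.g. time x space).
rng = np.random.default_rng(0)
K_a = rbf(np.linspace(0.0, 1.0, 6), 0.3)
K_b = rbf(np.linspace(0.0, 1.0, 5), 0.2)
y = rng.standard_normal(30)

product = kron_matvec(K_a, K_b, y)
alpha = kron_solve(K_a, K_b, y, noise=0.1)
```

For factors of size n_a and n_b, both routines work with the n_a × n_a and n_b × n_b matrices only, rather than the full (n_a n_b) × (n_a n_b) covariance, which is where the storage and runtime savings come from.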


Author(s):  
Qingtao Tang ◽  
Li Niu ◽  
Yisen Wang ◽  
Tao Dai ◽  
Wangpeng An ◽  
...  

Gaussian Process Regression (GPR) is a powerful Bayesian method. However, the performance of GPR can be significantly degraded when the training data are contaminated by outliers, including target outliers and input outliers. Although there are some variants of GPR (e.g., GPR with Student-t likelihood (GPRT)) aiming to handle outliers, most of the variants focus on handling the target outliers while little effort has been made to deal with the input outliers. In contrast, in this work, we aim to handle both the target outliers and the input outliers at the same time. Specifically, we replace the Gaussian noise in GPR with independent Student-t noise to cope with the target outliers. Moreover, to enhance the robustness w.r.t. the input outliers, we use a Student-t Process prior instead of the common Gaussian Process prior, leading to Student-t Process Regression with Student-t Likelihood (TPRT). We theoretically show that TPRT is more robust to both input and target outliers than GPR and GPRT, and prove that both GPR and GPRT are special cases of TPRT. Various experiments demonstrate that TPRT outperforms GPR and its variants on both synthetic and real datasets.


2000 ◽  
Vol 12 (11) ◽  
pp. 2719-2741 ◽  
Author(s):  
Volker Tresp

The Bayesian committee machine (BCM) is a novel approach to combining estimators that were trained on different data sets. Although the BCM can be applied to the combination of any kind of estimators, the main foci are gaussian process regression and related systems such as regularization networks and smoothing splines for which the degrees of freedom increase with the number of training data. Somewhat surprisingly, we find that the performance of the BCM improves if several test points are queried at the same time and is optimal if the number of test points is at least as large as the degrees of freedom of the estimator. The BCM also provides a new solution for on-line learning with potential applications to data mining. We apply the BCM to systems with fixed basis functions and discuss its relationship to gaussian process regression. Finally, we show how the ideas behind the BCM can be applied in a non-Bayesian setting to extend the input-dependent combination of estimators.
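
A minimal sketch of the BCM combination rule for Gaussian predictions is given below, assuming a zero-mean prior over the query points: the combined precision is the sum of the module precisions minus (M − 1) times the prior precision, and the combined mean is the precision-weighted sum of the module means. The function name and the toy numbers are hypothetical and chosen for illustration only.

```python
import numpy as np

def bcm_combine(means, covs, prior_cov):
    # means[i], covs[i]: posterior mean and covariance of module i at the
    # query points; prior_cov: prior covariance at the same query points.
    M = len(means)
    precision = -(M - 1) * np.linalg.inv(prior_cov)
    weighted = np.zeros_like(means[0], dtype=float)
    for m, C in zip(means, covs):
        C_inv = np.linalg.inv(C)
        precision = precision + C_inv   # accumulate module precisions
        weighted = weighted + C_inv @ m  # accumulate precision-weighted means
    cov = np.linalg.inv(precision)
    return cov @ weighted, cov

# Toy case with one query point and two modules (made-up numbers):
# prior variance 1, each module reports mean 1.0 with variance 0.5.
prior = np.array([[1.0]])
module_means = [np.array([1.0]), np.array([1.0])]
module_covs = [np.array([[0.5]]), np.array([[0.5]])]
bcm_mean, bcm_cov = bcm_combine(module_means, module_covs, prior)
```

In this toy case the combination gives a mean of 4/3 and a variance of 1/3, which matches the exact posterior for two independent unit-noise observations of value 2 under a unit-variance prior. Passing several query points at once, with full covariance matrices, corresponds to the joint querying that the abstract reports as beneficial.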


2021 ◽  
Author(s):  
Phillip Boyle

Gaussian processes have proved to be useful and powerful constructs for the purposes of regression. The classical method proceeds by parameterising a covariance function, and then infers the parameters given the training data. In this thesis, the classical approach is augmented by interpreting Gaussian processes as the outputs of linear filters excited by white noise. This enables a straightforward definition of dependent Gaussian processes as the outputs of a multiple output linear filter excited by multiple noise sources. We show how dependent Gaussian processes defined in this way can also be used for the purposes of system identification. One well-known problem with Gaussian process regression is that the computational complexity scales poorly with the amount of training data. We review one approximate solution that alleviates this problem, namely reduced rank Gaussian processes. We then show how the reduced rank approximation can be applied to allow for the efficient computation of dependent Gaussian processes. We then examine the application of Gaussian processes to the solution of other machine learning problems. To do so, we review methods for the parameterisation of full covariance matrices. Furthermore, we discuss how improvements can be made by marginalising over alternative models, and introduce methods to perform these computations efficiently. In particular, we introduce sequential annealed importance sampling as a method for calculating model evidence in an on-line fashion as new data arrives. Gaussian process regression can also be applied to optimisation. An algorithm is described that uses model comparison between multiple models to find the optimum of a function while taking as few samples as possible. This algorithm shows impressive performance on the standard control problem of double pole balancing. Finally, we describe how Gaussian processes can be used to efficiently estimate gradients of noisy functions, and numerically estimate integrals.


2020 ◽  
Vol 10 (15) ◽  
pp. 5216
Author(s):  
Anh Hong Nguyen ◽  
Michael Rath ◽  
Erik Leitinger ◽  
Khang Van Nguyen ◽  
Klaus Witrisal

The consideration of ultra-wideband (UWB) and mm-wave signals allows for a channel description decomposed into specular multipath components (SMCs) and dense/diffuse multipath. In this paper, the amplitude and phase of SMCs are studied. Gaussian Process regression (GPR) is used as a tool to analyze and predict the SMC amplitudes and phases based on a measured training data set. In this regard, the dependency of the amplitude (and phase) on the angle-of-arrival/angle-of-departure of a multipath component is analyzed, which accounts for the incident angle and incident position of the signal at a reflecting surface—and thus for the reflection characteristics of the building material—and for the antenna gain patterns. The GPR model describes the similarities between different data points. Based on its model parameters and the training data, the amplitudes of SMCs are predicted at receiver positions that have not been measured in the experiment. The method can be used to predict a UWB channel impulse response at an arbitrary position in the environment.


2022 ◽  
Vol 7 (01) ◽  
pp. 31-51
Author(s):  
Tanya Peart ◽  
Nicolas Aubin ◽  
Stefano Nava ◽  
John Cater ◽  
Stuart Norris

Velocity Prediction Programs (VPPs) are commonly used to help predict and compare the performance of different sail designs. A VPP requires an aerodynamic input force matrix which can be computationally expensive to calculate, limiting its application in industrial sail design projects. The use of multi-fidelity kriging surrogate models has previously been presented by the authors to reduce this cost, with high-fidelity data for a new sail being modelled and the low-fidelity data provided by data from existing, but different, sail designs. The difference in fidelity is not due to the simulation method used to obtain the data, but instead how similar the sail’s geometry is to the new sail design. An important consideration for the construction of these models is the choice of low-fidelity data points, which provide information about the trend of the model curve between the high-fidelity data. A method is required to select the best existing sail design to use for the low-fidelity data when constructing a multi-fidelity model. The suitability of an existing sail design as a low fidelity model could be evaluated based on the similarity of its geometric parameters with the new sail. It is shown here that for upwind jib sails, the similarity of the broadseam between the two sails best indicates the ability of a design to be used as low-fidelity data for a lift coefficient surrogate model. The lift coefficient surrogate model error predicted by the regression is shown to be close to 1% of the lift coefficient surrogate error for most points. Larger discrepancies are observed for a drag coefficient surrogate error regression.


2020 ◽  
Vol 1618 ◽  
pp. 022043
Author(s):  
Leif Erik Andersson ◽  
Bart Doekemeijer ◽  
Daan van der Hoek ◽  
Jan-Willem van Wingerden ◽  
Lars Imsland
