Scholarly Journals: Gaussian Processes in Machine Learning

Author(s):  
Carl Edward Rasmussen
Author(s):  
Michael McCartney ◽  
Matthias Haeringer ◽  
Wolfgang Polifke

Abstract This paper examines and compares commonly used machine learning algorithms in terms of their performance in the interpolation and extrapolation of FDFs, based on experimental and simulation data. Algorithm performance is evaluated by interpolating and extrapolating FDFs, and the impact of the resulting errors on the limit cycle amplitudes is then evaluated using the xFDF framework. The best-performing algorithms for interpolation and extrapolation were found to be the widely used cubic spline interpolation and the Gaussian process regressor. The data itself was found to be an important factor in determining the predictive performance of a model; therefore, a method of optimally selecting data points at test time using Gaussian processes was demonstrated. The aim is to allow a minimal number of data points to be collected while still providing enough information to model the FDF accurately. The extrapolation performance was shown to decay very quickly with distance from the covered domain, so emphasis should be placed on selecting measurement points that expand this domain. Gaussian processes also provide a confidence estimate for their predictions, which is used to carry out uncertainty quantification in order to understand model sensitivities. This was demonstrated through application to the xFDF framework.
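As an illustration of the two GP capabilities highlighted in this abstract, predictive uncertainty and uncertainty-driven selection of data points, the following is a minimal sketch using scikit-learn. The one-dimensional toy data, kernel choices, and domain are illustrative assumptions, not the authors' xFDF setup.

```python
# Minimal sketch (not the xFDF framework): GP regression with predictive
# uncertainty, plus a simple uncertainty-driven choice of the next data point.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy "measurements": a smooth 1-D response sampled at a few points.
X_train = np.array([[0.1], [0.3], [0.5], [0.7], [0.9]])
y_train = np.sin(2 * np.pi * X_train).ravel()

kernel = 1.0 * RBF(length_scale=0.2) + WhiteKernel(noise_level=1e-3)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(X_train, y_train)

# Predict on a grid that extends beyond the training domain; the predictive
# standard deviation grows quickly with extrapolation distance.
X_test = np.linspace(0.0, 1.5, 61).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)
print("max std inside domain :", std[X_test.ravel() <= 0.9].max())
print("max std outside domain:", std[X_test.ravel() > 0.9].max())

# Simple active selection: measure next where the model is least certain.
next_point = X_test[np.argmax(std)]
print("suggested next measurement at x =", float(next_point[0]))
```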


2012 ◽  
Vol 198-199 ◽  
pp. 1333-1337 ◽  
Author(s):  
San Xi Wei ◽  
Zong Hai Sun

Gaussian processes (GPs) are a very promising technique that has been applied to both regression and classification problems. In recent years, models based on Gaussian process priors have attracted much attention in machine learning. Binary (or two-class, C=2) classification using Gaussian processes is a well-developed method. In this paper, a multi-class (C>2) classification method built on binary GP classification is illustrated. Good accuracy can be obtained with this method. In the experiments, its decision time and accuracy are also compared with those of a support vector machine (SVM).
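A hedged sketch of the kind of comparison described here: a C>2 classifier assembled from binary GP classifiers (one-vs-rest) versus an SVM. The dataset, kernels, and timing procedure below are assumptions, not the authors' experimental setup.

```python
# Sketch: multi-class classification built from binary GP classifiers
# (one-vs-rest), compared with an SVM on a standard 3-class dataset.
import time
from sklearn.datasets import load_iris
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # 3-class problem (C=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    # one_vs_rest combines C binary GP classifiers into a C-class decision
    "GP (binary, one-vs-rest)": GaussianProcessClassifier(kernel=RBF(), multi_class="one_vs_rest"),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    t0 = time.perf_counter()
    acc = model.score(X_te, y_te)          # accuracy on held-out data
    dt = (time.perf_counter() - t0) * 1e3  # decision (prediction) time in ms
    print(f"{name}: accuracy={acc:.3f}, decision time={dt:.2f} ms")
```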


2020 ◽  
Vol 60 ◽  
Author(s):  
Riya Aggarwal ◽  
Mike Meylan ◽  
Bishnu Lamichhane ◽  
Chris Wensrich

A number of techniques and applications in neutron imaging that exploit wavelength-resolved measurements have been developed recently. One such technique, known as energy-resolved neutron imaging, receives ample attention because of its capability not only to visualise but also to quantify physical attributes with spatial resolution. The objective of this article is to develop a reconstruction algorithm for elastic strain tomography from Bragg edge neutron transmission strain images obtained from a pulsed neutron beam with high resolution. This technique has several advantages over those using monochromatic neutron beams from continuous sources, for example finer wavelength resolution. In contrast to conventional Radon-based computed tomography, where neutron transmission reconstruction revolves around the inversion of the longitudinal ray transform, which has uniqueness issues, the reconstruction in the proposed algorithm is based on a least-squares approach constrained by equilibrium formulated through the finite element method.
References
B. Abbey, S. Y. Zhang, W. J. J. Vorster, and A. M. Korsunsky. Feasibility study of neutron strain tomography. Proc. Eng., 1(1):185–188, 2009. doi:10.1016/j.proeng.2009.06.043.
R. Aggarwal, M. H. Meylan, B. P. Lamichhane, and C. M. Wensrich. Energy resolved neutron imaging for strain reconstruction using the finite element method. J. Imag., 6(3):13, 2020. doi:10.3390/jimaging6030013.
J. N. Hendriks, A. W. T. Gregg, C. M. Wensrich, A. S. Tremsin, T. Shinohara, M. Meylan, E. H. Kisi, V. Luzin, and O. Kirsten. Bragg-edge elastic strain tomography for in situ systems from energy-resolved neutron transmission imaging. Phys. Rev. Mat., 1:053802, 2017. doi:10.1103/PhysRevMaterials.1.053802.
C. Jidling, J. Hendriks, N. Wahlstrom, A. Gregg, T. B. Schon, C. Wensrich, and A. Wills. Probabilistic modelling and reconstruction of strain. Nuc. Inst. Meth. Phys. Res. B, pages 141–155, 2018. doi:10.1016/j.nimb.2018.08.051.
W. R. B. Lionheart and P. J. Withers. Diffraction tomography of strain. Inv. Prob., 31:045005, 2015. doi:10.1088/0266-5611/31/4/045005.
C. E. Rasmussen and C. K. I. Williams. Gaussian processes for machine learning. MIT Press, 2006. URL https://mitpress.mit.edu/books/gaussian-processes-machine-learning.
C. M. Wensrich, E. Kisi, V. Luzin, and O. Kirstein. Non-contact measurement of the stress within granular materials via neutron diffraction. AIP Conf. Proc., 1542:441–444, 2013. doi:10.1063/1.4811962.
R. Woracek, J. Santisteban, A. Fedrigo, and M. Strobl. Diffraction in neutron imaging – a review. Nuc. Inst. Meth. Phys. Res. A, 878:141–158, 2018. doi:10.1016/j.nima.2017.07.040.
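The reconstruction described in this abstract is, at its core, a least-squares fit subject to a linear equilibrium constraint. Below is a generic sketch of such an equality-constrained solve via the KKT system; the random matrices are stand-ins for the forward model and the finite-element equilibrium operator, not a Bragg-edge transmission model.

```python
# Generic sketch: fit measurements A x ≈ b subject to a linear "equilibrium"
# constraint C x = 0, solved through the KKT system. A, b, and C are random
# stand-ins and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 20, 8, 3           # measurements, unknowns, constraints
A = rng.normal(size=(n, m))  # forward model (e.g. averaged strain along rays)
b = rng.normal(size=n)       # measured values
C = rng.normal(size=(k, m))  # discretised equilibrium conditions

# KKT system for: minimise ||A x - b||^2  subject to  C x = 0
KKT = np.block([[A.T @ A, C.T],
                [C, np.zeros((k, k))]])
rhs = np.concatenate([A.T @ b, np.zeros(k)])
x = np.linalg.solve(KKT, rhs)[:m]
print("constraint residual:", np.linalg.norm(C @ x))  # ~0 by construction
```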


2021 ◽  
Author(s):  
Rajbir Singh Nirwan

Machine learning (ML) is so pervasive in today's life that we often do not even realise we are using systems based on it. It is also evolving faster than ever before. When deploying ML systems that make decisions on their own, we need to think about their ignorance of our uncertain world. The uncertainty might arise from scarcity of the data, bias in the data, or even a mismatch between the real world and the ML model. Given all these uncertainties, we need to think about how to build systems that are not totally ignorant of them. Bayesian ML can deal with these problems to some extent. Specifying the model using probabilities provides a convenient way to quantify uncertainties, which can then be included in the decision-making process. In this thesis, we introduce the Bayesian ansatz to modeling and apply Bayesian ML models in finance and economics. In particular, we dig deeper into Gaussian processes (GPs) and the Gaussian process latent variable model (GPLVM). Applied to the returns of several assets, the GPLVM provides the covariance structure as well as a latent-space embedding of the assets. Several financial applications can be built upon the output of the GPLVM. To demonstrate this, we build an automated asset allocation system and a predictor for missing asset prices, and we identify other structure in financial data. It turns out that the GPLVM exhibits a rotational symmetry in the latent space, which makes it harder to fit. Our second publication reports how to deal with that symmetry. We propose another parameterization of the model using Householder transformations, by which the symmetry is broken. A reparameterization changes a Bayesian model unless the prior is adjusted accordingly, so we provide the correct prior distribution of the new parameters such that the model, i.e. the data density, is unchanged under the reparameterization. After applying the reparameterization to Bayesian PCA, we show that the symmetry of nonlinear models can be broken in the same way. In our last project, we propose a new method for matching quantile observations which uses order statistics. Using order statistics as the likelihood, instead of a Gaussian likelihood, has several advantages. We compare these two models and highlight their advantages and disadvantages. To demonstrate our method, we fit quantile salary data from several European countries. Given several candidate models for the fit, our method also provides a metric for choosing the best option. We hope that this thesis illustrates some benefits of Bayesian modeling (especially Gaussian processes) in finance and economics and its usage when uncertainties are to be quantified.
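A small numerical check of the rotational symmetry mentioned in this abstract: an RBF kernel on latent coordinates depends only on pairwise distances, so rotating the latent embedding leaves the GPLVM covariance, and hence the data likelihood, unchanged. The latent points and kernel below are illustrative assumptions, not the thesis's model.

```python
# Demonstration that a rotation of the latent space does not change the
# RBF covariance, i.e. the latent embedding is only identifiable up to
# rotations/reflections. Pure NumPy/SciPy; the latent points are random.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import ortho_group

rng = np.random.default_rng(1)
Z = rng.normal(size=(6, 2))             # latent embedding of 6 "assets" in 2-D
R = ortho_group.rvs(2, random_state=1)  # random rotation/reflection

def rbf_kernel(Z, lengthscale=1.0):
    d2 = cdist(Z, Z, "sqeuclidean")
    return np.exp(-0.5 * d2 / lengthscale**2)

K, K_rot = rbf_kernel(Z), rbf_kernel(Z @ R)
print("max |K - K_rotated| =", np.abs(K - K_rot).max())  # ~0: not identifiable
```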


2021 ◽  
pp. 1-36
Author(s):  
Liwei Wang ◽  
Suraj Yerramilli ◽  
Akshay Iyer ◽  
Daniel Apley ◽  
Ping Zhu ◽  
...  

Abstract Scientific and engineering problems often require the use of artificial intelligence to aid understanding and the search for promising designs. While Gaussian processes (GPs) stand out as easy-to-use and interpretable learners, they have difficulty accommodating big datasets, qualitative inputs, and multi-type responses obtained from different simulators, which have become common challenges for data-driven design applications. In this paper, we propose a GP model that utilizes latent variables and functions obtained through variational inference to address these challenges simultaneously. The method is built upon the latent-variable Gaussian process (LVGP) model, in which qualitative factors are mapped into a continuous latent space to enable GP modeling of mixed-variable datasets. By extending variational inference to LVGP models, the large training dataset is replaced by a small set of inducing points, addressing the scalability issue. Output response vectors are represented by a linear combination of independent latent functions, forming a flexible kernel structure that can handle multi-type responses. Comparative studies demonstrate that the proposed method scales well for large datasets while outperforming state-of-the-art machine learning methods without requiring much hyperparameter tuning. In addition, an interpretable latent space is obtained to draw insights into the effect of qualitative factors, such as those associated with “building blocks” of architectures and element choices in metamaterial and materials design. Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism with aperiodic microstructures and multiple materials.
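A simplified sketch of the latent-variable idea behind the LVGP component of this approach: each level of a qualitative factor is mapped to a point in a small continuous latent space, and a standard GP is then fit on the concatenated numerical and latent inputs. The toy data and the fixed latent coordinates are assumptions; the paper learns the latent mapping jointly with the GP and adds variational inducing points for scalability.

```python
# Sketch of mixed-variable GP modeling via a latent mapping of a qualitative
# factor. The latent coordinates here are fixed random values for illustration,
# not learned as in LVGP.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
levels = ["material_A", "material_B", "material_C"]   # qualitative factor
latent = {lvl: rng.normal(size=2) for lvl in levels}  # 2-D latent coordinates

def encode(x_num, lvl):
    """Concatenate numerical input(s) with the level's latent coordinates."""
    return np.concatenate([np.atleast_1d(x_num), latent[lvl]])

# Toy training data: the response depends on the numerical input and the level.
X = np.array([encode(x, lvl) for x in np.linspace(0, 1, 10) for lvl in levels])
y = np.array([np.sin(3 * xi[0]) + xi[1] - xi[2] for xi in X])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=np.ones(3)), normalize_y=True)
gp.fit(X, y)
print(gp.predict([encode(0.5, "material_B")]))
```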


Author(s):  
Carl Edward Rasmussen ◽  
Christopher K. I. Williams

2020 ◽  
Author(s):  
Maria Moreno de Castro

The presence of automated decision making continuously increases in today's society. Algorithms based on machine and deep learning decide how much we pay for insurance, translate our thoughts to speech, and shape our consumption of goods (via e-marketing) and knowledge (via search engines). Machine and deep learning models are ubiquitous in science too; in particular, many promising examples are being developed to prove their feasibility for earth science applications, like finding temporal trends or spatial patterns in data or improving parameterization schemes for climate simulations.

However, most machine and deep learning applications aim to optimise performance metrics (for instance, accuracy, the fraction of times the model prediction was right), which are rarely good indicators of trust (i.e., why were these predictions right?). In fact, with the increase of data volume and model complexity, machine learning and deep learning predictions can be very accurate but also prone to rely on spurious correlations, to encode and magnify bias, and to draw conclusions that do not incorporate the underlying dynamics governing the system. Because of that, the uncertainty of the predictions and our confidence in the model are difficult to estimate, and the relation between inputs and outputs becomes hard to interpret.

Since it is challenging to shift a community from “black” to “glass” boxes, it is more useful to implement Explainable Artificial Intelligence (XAI) techniques right at the beginning of machine learning and deep learning adoption rather than trying to fix fundamental problems later. The good news is that most of the popular XAI techniques are basically sensitivity analyses, because they consist of a systematic perturbation of some model components in order to observe how this affects the model predictions. The techniques comprise random sampling, Monte-Carlo simulations, and ensemble runs, which are common methods in geosciences. Moreover, many XAI techniques are reusable because they are model-agnostic and are applied after the model has been fitted. In addition, interpretability provides robust arguments when communicating machine and deep learning predictions to scientists and decision-makers.

In order to assist not only practitioners but also end-users in evaluating machine and deep learning results, we will explain the intuition behind some popular techniques of XAI and of aleatory and epistemic uncertainty quantification: (1) Permutation Importance and Gaussian processes on the inputs (i.e., the perturbation of the model inputs); (2) Monte-Carlo Dropout, Deep Ensembles, Quantile Regression, and Gaussian processes on the weights (i.e., the perturbation of the model architecture); (3) Conformal Predictors (useful to estimate the confidence interval on the outputs); and (4) Layerwise Relevance Propagation (LRP), Shapley values, and Local Interpretable Model-Agnostic Explanations (LIME) (designed to visualize how each feature in the data affected a particular prediction). We will also introduce some best practices, like the detection of anomalies in the training data before training, the implementation of fallbacks when the prediction is not reliable, and physics-guided learning by including constraints in the loss function to avoid physical inconsistencies, like the violation of conservation laws.
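As a concrete example of the first group of techniques mentioned above, the following is a short sketch of permutation importance using scikit-learn; the synthetic regression task and the random forest model are illustrative assumptions.

```python
# Permutation importance: shuffle one input feature at a time and record how
# much the model's held-out score drops; the mean drop is the importance.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, n_features=5, n_informative=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)

# Each feature is permuted n_repeats times on the held-out data.
result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
for i, (mu, sigma) in enumerate(zip(result.importances_mean, result.importances_std)):
    print(f"feature {i}: importance = {mu:.3f} ± {sigma:.3f}")
```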

