Nonparametric Bayesian machine learning and signal processing

Machine Learning for Signal Processing, 2019, pp. 313-344
DOI: 10.1093/oso/9780198714934.003.0010

Author(s): Max A. Little

Keyword(s): Machine Learning, Stochastic Processes, Gaussian Process, Dirichlet Process, Predictive Accuracy, Gaussian Process Regression, Time Signal, Dirichlet Process Mixture, Power Spectral, Machine Learning Applications

We have seen that stochastic processes play an important foundational role in a wide range of methods in DSP. For example, we treat a discrete-time signal as a Gaussian process and thereby obtain many mathematically simplified algorithms, particularly those based on the power spectral density. At the same time, in machine learning it has generally been observed that nonparametric methods outperform parametric methods in predictive accuracy, since they can adapt to data of arbitrary complexity. However, these techniques are not Bayesian, so we are unable to carry out important inferential procedures such as drawing samples from the underlying probabilistic model or computing posterior confidence intervals. Bayesian models, on the other hand, are often mathematically tractable only if parametric, with a corresponding loss of predictive accuracy. An alternative, discussed in this section, is to extend the mathematical tractability of stochastic processes to Bayesian methods. This leads to so-called Bayesian nonparametrics, exemplified by techniques such as Gaussian process regression and Dirichlet process mixture modelling, which have been shown to be extremely useful in practical DSP and machine learning applications.
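The two inferential procedures mentioned in this abstract, drawing samples from the model and computing posterior intervals, can be illustrated with a minimal Gaussian process regression sketch. This is not code from the chapter: the squared-exponential kernel, the toy sine data, the noise variance and every function name below are assumptions chosen only for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and covariance of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train)
    k_ts = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    # Cholesky factor of the noisy training covariance, used for stable solves.
    chol = np.linalg.cholesky(k_tt + noise_var * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    cov = k_ss - v.T @ v
    return mean, cov

# Toy data (assumed): noisy samples of a sine wave.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 12)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(x_train.size)
x_test = np.linspace(0.0, 5.0, 50)

mean, cov = gp_posterior(x_train, y_train, x_test, noise_var=0.01)

# Pointwise 95% posterior interval for the latent function.
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
lower, upper = mean - 1.96 * std, mean + 1.96 * std

# One draw from the posterior over functions (small jitter for numerical stability).
sample = rng.multivariate_normal(mean, cov + 1e-8 * np.eye(x_test.size))
```

Here `mean` is the point prediction, `lower` and `upper` bound the posterior interval at each test input, and `sample` is a single function drawn from the posterior.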


Machine Learning in Fine Wine Price Prediction

Journal of Wine Economics, 2015, Vol 10 (2), pp. 151-172
DOI: 10.1017/jwe.2015.17

Cited By ~ 9

Author(s): Michelle Yeo, Tristan Fletcher, John Shawe-Taylor

Keyword(s): Machine Learning, Gaussian Process, Predictive Accuracy, Time Series Prediction, Feature Learning, Gaussian Process Regression, Machine Learning Techniques, Linear Regression Models, Price Prediction, Learning Techniques

Advanced machine learning techniques such as Gaussian process regression and multi-task learning are novel in the area of wine price prediction; previous research in this area has been restricted to parametric linear regression models. Using historical price data for the 100 wines in the Liv-Ex 100 index, this paper makes three main contributions to the field. First, we cluster the wines into two distinct clusters based on autocorrelation. Second, we implement Gaussian process regression on these wines, with predictive accuracy surpassing both the trivial benchmark and the simple ARMA and GARCH time series prediction benchmarks. Third, we implement an algorithm that performs multi-task feature learning with kernels on the wine returns, as an extension to our optimal Gaussian process regression model. Using the optimal covariance kernel from Gaussian process regression, we achieve predictive results comparable to those of Gaussian process regression. Altogether, our research suggests that there is potential in using advanced machine learning techniques in wine price prediction. (JEL Classifications: C6, G12)


Exchange Spin Coupling from Gaussian Process Regression

2020
DOI: 10.26434/chemrxiv.12589541.v3

Author(s): Marc Philipp Bahlke, Natnael Mogos, Jonny Proppe, Carmen Herrmann

Keyword(s): Machine Learning, Gaussian Process, Gaussian Process Regression, Molecular Magnets, Molecular Structures, Spin Coupling, Structure Property, Data Set, Uncertainty Estimates

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not yet been studied. We employ Gaussian process regression because it can potentially deal with small training sets (as are likely for the rather complex molecular structures required for exploring spin coupling) and because it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger, experimentally relevant complexes. Exchange spin coupling is about as easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and materials science.


Modeling of Cutting Force in the Turning of AISI 4340 Using Gaussian Process Regression Algorithm

Applied Sciences, 2021, Vol 11 (9), pp. 4055
DOI: 10.3390/app11094055

Author(s): Mahdi S. Alajmi, Abdullah M. Almeshal

Keyword(s): Gaussian Process, Cutting Force, Predictive Accuracy, Gaussian Process Regression, Machining Process, Support Vector, Process Data, Cutting Force Prediction, Artificial Neural Network (ANN), AISI 4340

Machining process data can be utilized to predict cutting force and optimize process parameters. Cutting force is an essential parameter that has a significant impact on the metal turning process. In this study, a cutting force prediction model for turning AISI 4340 alloy steel was developed using Gaussian process regression (GPR), support vector machines (SVM), and artificial neural network (ANN) methods. The GPR simulations demonstrated a reliable prediction of surface roughness for the dry turning method with R² = 0.9843, MAPE = 5.12%, and RMSE = 1.86%. Performance comparisons between GPR, SVM, and ANN show that GPR is an effective method that can ensure high predictive accuracy of the cutting force in the turning of AISI 4340.


Application of Gaussian Process Regression (GPR) in Gas Hydrate Mitigation

Journal of Advanced Research in Fluid Mechanics and Thermal Sciences, 2021, Vol 88 (2), pp. 27-37
DOI: 10.37934/arfmts.88.2.2737

Author(s): Sachin Dev Suresh, Ali Qasim, Bhajan Lal, Syed Muhammad Imran, Khor Siak Foo

Keyword(s): Machine Learning, Gaussian Process, Gas Hydrate, Learning Algorithm, Hydrate Formation, Gaussian Process Regression, Normal Operation, Coefficient of Determination, Gas Hydrate Formation, Testing Data

The production of oil and natural gas generates a significant share of Malaysia’s revenue, thereby strengthening the country’s economy. The flow assurance industry faces several impediments to the smooth operation of transmission pipelines, of which gas hydrate formation is the most important: it disrupts normal operation by plugging the pipeline. Gas hydrate is a crystalline structure, formed under high-pressure and low-temperature conditions, that consists of a network of hydrogen bonds between host molecules of water and guest molecules of the incoming gases. Industry uses different types of chemical inhibitors in pipelines to suppress hydrate formation. To address this problem, machine learning algorithms have been introduced as part of risk management strategies. The objective of this paper is to apply a machine learning (ML) model, Gaussian process regression (GPR), which is a new approach to mitigating the growth of gas hydrate. The input parameters are the concentration and pressure of carbon dioxide (CO2) and methane (CH4) gas hydrates, and the output parameter is the average depression temperature (ADT). The parameter values are taken from available data sets, which enable GPR to predict the results accurately as measured by the coefficient of determination, R², and the mean squared error, MSE. The GPR model achieved the highest R² values, 97.25% for training data and 96.71% for testing data. Its MSE values were also the lowest, 0.019 for training data and 0.023 for testing data.


The Role of Textualisation and Argumentation in Understanding the Machine Learning Process

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017
DOI: 10.24963/ijcai.2017/765

Author(s): Kacper Sokol, Peter Flach

Keyword(s): Machine Learning, Predictive Accuracy, Spatial Perception, Black Box, High Dimensional, Box Models, Machine Learning Applications, Black Box Models, Machine Learning Models

Understanding data, models and predictions is important for machine learning applications. Due to the limitations of our spatial perception and intuition, analysing high-dimensional data is inherently difficult. Furthermore, black-box models achieving high predictive accuracy are widely used, yet the logic behind their predictions is often opaque. Use of textualisation (a natural language narrative of selected phenomena) can tackle these shortcomings. When it is extended with argumentation theory, we could envisage machine learning models and predictions arguing persuasively for their choices.


Easy representation of multivariate functions with low-dimensional terms via Gaussian process regression kernel design: applications to machine learning of potential energy surfaces and kinetic energy densities from sparse data

Machine Learning: Science and Technology, 2022
DOI: 10.1088/2632-2153/ac4949

Author(s): Sergei Manzhos, Eita Sasaki, Manabu Ihara

Keyword(s): Machine Learning, Kinetic Energy, Potential Energy, Gaussian Process, Potential Energy Surfaces, Gaussian Process Regression, Multivariate Functions, Low Dimensional, Energy Surfaces, Kernel Design

We show that Gaussian process regression (GPR) allows multivariate functions to be represented with low-dimensional terms via kernel design. When using a kernel built with high-dimensional model representation (HDMR), one obtains a similar type of representation to the previously proposed HDMR-GPR scheme, while the approach is faster and simpler to use. We tested the approach on cases where highly accurate machine learning is required from sparse data, by fitting potential energy surfaces and kinetic energy densities.
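One way to picture a representation with low-dimensional terms through kernel design is a first-order additive kernel, in which the covariance is a sum of one-dimensional kernels, one per input coordinate. The sketch below is an assumed illustration of that general idea rather than the authors' implementation: the kernel form, length scale, jitter, toy target and all names are hypothetical.

```python
import numpy as np

def rbf_1d(a, b, length_scale=0.5):
    """Squared-exponential kernel on a single input coordinate."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / length_scale**2)

def additive_kernel_terms(xa, xb, length_scale=0.5):
    """Per-coordinate kernel matrices; their sum is the additive kernel."""
    return [rbf_1d(xa[:, d], xb[:, d], length_scale) for d in range(xa.shape[1])]

rng = np.random.default_rng(1)
x_train = rng.uniform(-1.0, 1.0, size=(40, 3))
# Toy target that is itself a sum of one-dimensional functions (assumed for the demo).
y_train = np.sin(2.0 * x_train[:, 0]) + x_train[:, 1] ** 2 - 0.5 * x_train[:, 2]

# Fit: solve (K + jitter * I) alpha = y with the summed kernel.
k_train = sum(additive_kernel_terms(x_train, x_train))
alpha = np.linalg.solve(k_train + 1e-6 * np.eye(len(x_train)), y_train)

# Predict: each coordinate contributes its own one-dimensional term.
x_test = rng.uniform(-1.0, 1.0, size=(5, 3))
terms = [k_d @ alpha for k_d in additive_kernel_terms(x_test, x_train)]
prediction = sum(terms)
```

Because the kernel is a sum, the posterior mean splits into one term per coordinate (`terms`), and `prediction` is their sum. Higher-order variants would add kernels over pairs or larger subsets of coordinates.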


Nonparametric Local Pseudopotentials with Machine Learning: A Tin Pseudopotential Built Using Gaussian Process Regression

The Journal of Physical Chemistry A, 2020, Vol 124 (52), pp. 11111-11124
DOI: 10.1021/acs.jpca.0c05723

Author(s): Johann Lüder, Sergei Manzhos

Keyword(s): Machine Learning, Gaussian Process, Gaussian Process Regression


Green LAI Mapping and Cloud Gap-Filling Using Gaussian Process Regression in Google Earth Engine

Remote Sensing, 2021, Vol 13 (3), pp. 403
DOI: 10.3390/rs13030403

Author(s): Luca Pipia, Eatidal Amin, Santiago Belda, Matías Salinero-Delgado, Jochem Verrelst

Keyword(s): Machine Learning, Time Series, Gaussian Process, Gaussian Process Regression, Google Earth, Machine Learning Techniques, Gap Filling, Area Index, Uncertainty Estimates, Google Earth Engine

Over the last decade, Gaussian process regression (GPR) has proved to be a competitive machine learning regression algorithm for Earth observation applications, with attractive unique properties such as band relevance ranking and uncertainty estimates. More recently, GPR has also proved to be a proficient time series processor for filling gaps in optical imagery, typically due to cloud cover. This makes GPR well suited to large-scale spatiotemporal processing of satellite imagery into cloud-free products of biophysical variables. With the advent of the Google Earth Engine (GEE) cloud platform, new opportunities have emerged to process local-to-planetary scale satellite data using advanced machine learning techniques and convert them into gap-filled vegetation property products. However, GPR is not yet part of the GEE ecosystem. To circumvent this limitation, this work proposes a general adaptation of the GPR formulation to a parallel processing framework and its integration into GEE. To demonstrate the functioning and utility of the developed workflow, a GPR model predicting green leaf area index (LAIG) from Sentinel-2 imagery was imported. Although running this GPR model in GEE allows any corner of the world to be mapped into LAIG at a resolution of 20 m, here we show some demonstration cases over western Europe with zoom-ins over Spain. Thanks to the computational power of GEE, the mapping takes place on the fly. Additionally, a GPR-based gap-filling strategy based on pre-optimized kernel hyperparameters is put forward for the generation of multi-orbit cloud-free LAIG maps with an unprecedented level of detail, and for the extraction of regularly sampled LAIG time series at the pixel level. The ability to plug a locally trained GPR model into the GEE framework and process it instantly opens up a new paradigm of remote sensing image processing.
