One-Match-Ahead Forecasting in Two-Team Sports with Stacked Bayesian Regressions

2018 ◽  
Vol 8 (3) ◽  
pp. 159-171 ◽  
Author(s):  
Max W. Y. Lam

Abstract There is growing interest in applying machine learning algorithms to real-world problems by explicitly deriving models based on probabilistic reasoning. Sports analytics, favoured mostly by the statistics community and less discussed in the machine learning community, is the focus of this paper. Specifically, we model two-team sports for the purpose of one-match-ahead forecasting. We present a pioneering modeling approach based on stacked Bayesian regressions, in which the winning probability can be calculated analytically. Benefiting from its regression flexibility and strong empirical performance, Sparse Spectrum Gaussian Process Regression (SSGPR) – an improved algorithm over standard Gaussian Process Regression (GPR) – was used to solve the Bayesian regression tasks, resulting in a novel predictive model called TLGProb. For evaluation, TLGProb was applied to a popular sports event – the National Basketball Association (NBA). Finally, TLGProb correctly predicted 85.28% of the matches in the NBA 2014/2015 regular season, surpassing existing predictive models for the NBA.
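The paper's SSGPR implementation is not reproduced here. As a minimal, hedged sketch of how a Gaussian predictive distribution over the score margin yields an analytic winning probability, the following uses scikit-learn's standard GPR as a stand-in for SSGPR; the feature vectors and training targets are synthetic placeholders.

    # Hedged sketch: standard GPR as a stand-in for SSGPR; data and features are hypothetical.
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(200, 8))                            # per-match feature vectors (e.g. team strength differences)
    y_train = X_train[:, 0] * 5 + rng.normal(scale=3, size=200)    # synthetic score margins (home - away)

    gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gpr.fit(X_train, y_train)

    x_next = rng.normal(size=(1, 8))                               # features of the one match ahead
    mu, sigma = gpr.predict(x_next, return_std=True)

    # With a Gaussian predictive margin N(mu, sigma^2), the home-win probability is analytic:
    p_home_win = norm.cdf(mu[0] / sigma[0])
    print(f"P(home win) = {p_home_win:.3f}")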

Author(s):  
Sarini Jayasinghe ◽  
Paolo Paoletti ◽  
Chris Sutcliffe ◽  
John Dardis ◽  
Nick Jones ◽  
...  

This study evaluates whether a combination of photodiode sensor measurements, taken during laser powder bed fusion (L-PBF) builds, can be used to predict the resulting build quality via a purely data-based approach. We analyse the relationship between build density and features that are extracted from sensor data collected from three different photodiodes. The study uses a Singular Value Decomposition to extract lower-dimensional features from the photodiode measurements, which are then fed into machine learning algorithms. Several unsupervised learning methods are employed to classify low-density (< 99% part density) and high-density (≥ 99% part density) specimens. Subsequently, a supervised learning method (Gaussian Process regression) is used to predict build density directly. Using the unsupervised clustering approaches, applied to features extracted from the photodiode sensor data and from observations relating to the energy transferred to the material, build density was predicted with up to 93.54% accuracy. With regard to the supervised regression approach, a Gaussian Process algorithm was capable of predicting the build density with an RMS error of 3.65%. The study shows, therefore, that there is potential for machine learning algorithms to predict indicators of L-PBF build quality from photodiode measurements taken during builds. Moreover, the work herein describes approaches that are predominantly probabilistic, thus facilitating uncertainty quantification in machine-learnt predictions of L-PBF build quality.
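A rough, hedged sketch of the described pipeline (SVD feature extraction, unsupervised clustering into low/high-density groups, and GPR predicting density directly), assuming a hypothetical matrix of photodiode signals per specimen; none of the data below are from the study.

    # Hedged sketch of the described pipeline; photodiode signals and densities are synthetic placeholders.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(1)
    signals = rng.normal(size=(60, 500))             # one row of photodiode samples per specimen (hypothetical)
    density = 99 + rng.normal(scale=0.8, size=60)    # synthetic build densities in %

    # Lower-dimensional features via SVD (keep the leading singular directions)
    U, S, Vt = np.linalg.svd(signals, full_matrices=False)
    features = U[:, :5] * S[:5]                      # five SVD features per specimen

    # Unsupervised step: two clusters, interpreted as low (<99%) vs. high (>=99%) density
    clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

    # Supervised step: GPR directly predicting density from the SVD features
    gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gpr.fit(features, density)
    pred, std = gpr.predict(features, return_std=True)   # predictive mean and uncertainty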


2020 ◽  
Vol 34 (04) ◽  
pp. 3866-3873
Author(s):  
Peter Fenner ◽  
Edward Pyzer-Knapp

Much of machine learning relies on the use of large amounts of data to train models to make predictions. When this data comes from multiple sources, for example when evaluation of data against a machine learning model is offered as a service, there can be privacy issues and legal concerns over the sharing of data. Fully homomorphic encryption (FHE) allows data to be computed on whilst encrypted, which can provide a solution to the problem of data privacy. However, FHE is both slow and restrictive, so existing algorithms must be manipulated to make them work efficiently under the FHE paradigm. Some commonly used machine learning algorithms, such as Gaussian process regression, are poorly suited to FHE and cannot be manipulated to work both efficiently and accurately. In this paper, we show that a modular approach, which applies FHE to only the sensitive steps of a workflow that need protection, allows one party to make predictions on their data using a Gaussian process regression model built from another party's data, without either party gaining access to the other's data, in a way which is both accurate and efficient. This construction is, to our knowledge, the first example of an effectively encrypted Gaussian process.
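The paper's FHE construction is not reproduced here. As a hedged, plaintext-only sketch of the modular idea: the GP predictive mean is a dot product between the query's kernel vector and a weight vector alpha = (K + sigma^2 I)^{-1} y that the model owner can precompute in the clear on their own data, so only the final inner product involving the query would need to be evaluated under encryption. All names below are illustrative and no encryption library is used.

    # Plaintext numpy sketch of which GP prediction step would be moved under FHE.
    import numpy as np

    def rbf(A, B, lengthscale=1.0):
        # Squared-exponential kernel between rows of A and rows of B
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)

    rng = np.random.default_rng(2)
    X, y = rng.normal(size=(50, 3)), rng.normal(size=50)   # model owner's private data (hypothetical)
    sigma2 = 0.1

    # Model owner precomputes alpha in the clear on their own data.
    alpha = np.linalg.solve(rbf(X, X) + sigma2 * np.eye(50), y)

    # Query owner's private point; in the encrypted workflow, k_star and the dot product
    # below are the sensitive steps that would be computed homomorphically.
    x_star = rng.normal(size=(1, 3))
    k_star = rbf(X, x_star)[:, 0]
    predictive_mean = k_star @ alpha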


2009 ◽  
Vol 21 (3) ◽  
pp. 786-792 ◽  
Author(s):  
Manfred Opper ◽  
Cédric Archambeau

The variational approximation of posterior distributions by multivariate gaussians has been much less popular in the machine learning community compared to the corresponding approximation by factorizing distributions. This is for a good reason: the gaussian approximation is in general plagued by an O(N²) number of variational parameters to be optimized, N being the number of random variables. In this letter, we discuss the relationship between the Laplace and the variational approximation, and we show that for models with gaussian priors and factorizing likelihoods, the number of variational parameters is actually O(N). The approach is applied to gaussian process regression with nongaussian likelihoods.
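A hedged sketch of where the O(N) count comes from, following the standard statement of this result (the notation here is illustrative, not copied from the letter): with a gaussian prior p(x) = N(0, K) and a likelihood that factorizes over the components x_i, the stationary gaussian variational posterior can be written with only a diagonal correction to the prior precision,

    q(x) = \mathcal{N}(m, \Sigma), \qquad
    \Sigma = \left( K^{-1} + \operatorname{diag}(\lambda) \right)^{-1}, \qquad
    m = K \nu,

so only the two N-vectors λ and ν need to be optimized, rather than a free mean plus a full covariance matrix with O(N²) entries.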


2020 ◽  
Author(s):  
Marc Philipp Bahlke ◽  
Natnael Mogos ◽  
Jonny Proppe ◽  
Carmen Herrmann

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets with potential applications in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression, since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger, experimentally relevant complexes. Exchange spin coupling is as easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor and highlighting the interesting question of the role of chemical intuition vs. systematic or automated feature selection for machine learning in chemistry and materials science.
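A hedged illustration of the kind of comparison described (not the authors' code): GPR on a simple, chemically intuitive descriptor of copper-bridge angle and copper-copper distance, set against linear ridge regression. The descriptor values and coupling constants below are synthetic placeholders.

    # Hedged sketch: GPR vs. linear ridge regression on a simple geometric descriptor; all data are synthetic.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    # Descriptor per complex: [Cu-bridge-Cu angle (deg), Cu-Cu distance (Angstrom)] -- hypothetical values
    X = np.column_stack([rng.uniform(90, 110, 257), rng.uniform(2.9, 3.2, 257)])
    J = -2.0 * (X[:, 0] - 97.5) + rng.normal(scale=5, size=257)   # synthetic exchange couplings (cm^-1)

    gpr = GaussianProcessRegressor(kernel=RBF(length_scale=[5.0, 0.1]) + WhiteKernel(), normalize_y=True)
    ridge = Ridge(alpha=1.0)

    print("GPR   R2:", cross_val_score(gpr, X, J, cv=5).mean())
    print("Ridge R2:", cross_val_score(ridge, X, J, cv=5).mean())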


2020 ◽  
Vol 237 (12) ◽  
pp. 1430-1437
Author(s):  
Achim Langenbucher ◽  
Nóra Szentmáry ◽  
Jascha Wendelstein ◽  
Peter Hoffmann

Abstract Background and Purpose In the last decade, artificial intelligence and machine learning algorithms have become more and more established for the screening and detection of diseases and pathologies, as well as for describing interactions between measures where classical methods are too complex or fail. The purpose of this paper is to model the measured postoperative position of an intraocular lens implant after cataract surgery, based on preoperatively assessed biometric effect sizes, using techniques of machine learning. Patients and Methods In this study, we enrolled 249 eyes of patients who underwent elective cataract surgery at Augenklinik Castrop-Rauxel. Eyes were measured preoperatively with the IOLMaster 700 (Carl Zeiss Meditec), as well as preoperatively and postoperatively with the Casia 2 OCT (Tomey). Based on the preoperative effect sizes (axial length, corneal thickness, internal anterior chamber depth, thickness of the crystalline lens, mean corneal radius and corneal diameter), a selection of 17 machine learning algorithms was tested for prediction performance in calculating the internal anterior chamber depth (AQD_post) and the axial position of the equatorial plane of the lens in the pseudophakic eye (LEQ_post). Results The 17 machine learning algorithms (from 4 families) varied in root mean squared/mean absolute prediction error between 0.187/0.139 mm and 0.255/0.204 mm (AQD_post), and between 0.183/0.135 mm and 0.253/0.206 mm (LEQ_post), using 5-fold cross-validation. The Gaussian Process Regression model using an exponential kernel showed the best performance in terms of root mean squared error for prediction of AQD_post and LEQ_post. If the entire dataset is used (without splitting into training and validation data), a simple multivariate linear regression model yields a root mean squared prediction error for AQD_post/LEQ_post of 0.188/0.187 mm, versus 0.166/0.159 mm for the best-performing Gaussian Process Regression model. Conclusion In this paper we show the principles of supervised machine learning applied to predicting the measured physical postoperative axial position of intraocular lenses. Based on our limited data pool and the algorithms used in our setting, the benefit of machine learning algorithms appears to be limited compared to a standard multivariate regression model.
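A hedged sketch of the kind of evaluation described (not the authors' code): Gaussian process regression with an exponential kernel (Matérn with ν = 1/2 in scikit-learn) under 5-fold cross-validation, using synthetic stand-ins for the six preoperative predictors and AQD_post.

    # Hedged sketch: GPR with an exponential kernel (Matern nu=0.5) under 5-fold CV; all data are synthetic stand-ins.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern, WhiteKernel
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(4)
    X = rng.normal(size=(249, 6))        # six preoperative predictors (axial length, corneal thickness, ...), standardized
    aqd_post = 4.5 + 0.2 * X[:, 0] + rng.normal(scale=0.2, size=249)   # synthetic postoperative anterior chamber depth (mm)

    kernel = Matern(nu=0.5) + WhiteKernel()      # Matern nu=0.5 is the exponential kernel
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)

    pred = cross_val_predict(gpr, X, aqd_post, cv=5)
    rmse = np.sqrt(np.mean((pred - aqd_post) ** 2))
    mae = np.mean(np.abs(pred - aqd_post))
    print(f"5-fold CV RMSE = {rmse:.3f} mm, MAE = {mae:.3f} mm")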


2018 ◽  
Vol 18 (3-4) ◽  
pp. 623-637 ◽  
Author(s):  
ARINDAM MITRA ◽  
CHITTA BARAL

Abstract Over the years the Artificial Intelligence (AI) community has produced several datasets which have given machine learning algorithms the opportunity to learn various skills across various domains. However, a subclass of these machine learning algorithms that aims at learning logic programs, namely the Inductive Logic Programming algorithms, has often failed at the task due to the vastness of these datasets. This has impacted the usability of knowledge representation and reasoning techniques in the development of AI systems. In this research, we try to address this scalability issue for the algorithms that learn answer set programs. We present a sound and complete algorithm which takes the input in a slightly different manner and performs an efficient and more user-controlled search for a solution. We show via experiments that our algorithm can learn from two popular datasets from the machine learning community, namely bAbI (a question answering dataset) and MNIST (a dataset for handwritten digit recognition), which to the best of our knowledge was not previously possible. The system is publicly available at https://goo.gl/KdWAcV.


Author(s):  
Sachin Dev Suresh ◽  
Ali Qasim ◽  
Bhajan Lal ◽  
Syed Muhammad Imran ◽  
Khor Siak Foo

The production of oil and natural gas contributes a significant amount of revenue in Malaysia, thereby strengthening the country’s economy. The flow assurance industry faces impediments to the smooth operation of transmission pipelines, of which gas hydrate formation is the most important: it disrupts normal pipeline operation by plugging the line. Under high-pressure and low-temperature conditions, gas hydrates form a crystalline structure consisting of a network of hydrogen bonds between host molecules of water and guest molecules of the incoming gases. Industry uses different types of chemical inhibitors in pipelines to suppress hydrate formation. To overcome this problem, machine learning algorithms have been introduced as part of risk management strategies. The objective of this paper is to utilize a Machine Learning (ML) model, namely Gaussian Process Regression (GPR), a new approach being applied to mitigate the growth of gas hydrates. The input parameters are the concentration and pressure of Carbon Dioxide (CO2) and Methane (CH4) gas hydrates, whereas the output parameter is the Average Depression Temperature (ADT). The values of these parameters are taken from available data sets, and the accuracy of the GPR predictions is assessed in terms of the Coefficient of Determination (R2) and the Mean Squared Error (MSE). The results showed that the GPR model provided the highest R2 values, of 97.25% and 96.71% for the training and testing data, respectively. The MSE values for GPR were also the lowest, at 0.019 for the training data and 0.023 for the testing data.
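A hedged sketch of the workflow described (inputs: gas concentration and pressure; output: average depression temperature; evaluation via R2 and MSE on training and testing data). Synthetic data stand in for the literature data sets used in the paper.

    # Hedged sketch: GPR predicting average depression temperature (ADT) from concentration and pressure; synthetic data.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel
    from sklearn.metrics import mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(5)
    X = np.column_stack([rng.uniform(0.1, 5.0, 150),     # gas concentration (hypothetical units)
                         rng.uniform(2.0, 10.0, 150)])   # pressure (hypothetical range, MPa)
    adt = 0.5 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.1, size=150)   # synthetic ADT values (K)

    X_tr, X_te, y_tr, y_te = train_test_split(X, adt, test_size=0.2, random_state=0)
    gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True).fit(X_tr, y_tr)

    for name, Xs, ys in [("train", X_tr, y_tr), ("test", X_te, y_te)]:
        pred = gpr.predict(Xs)
        print(name, "R2 =", r2_score(ys, pred), "MSE =", mean_squared_error(ys, pred))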


Author(s):  
Nannan Li ◽  
Xinyu Wu ◽  
Huiwen Guo ◽  
Dan Xu ◽  
Yongsheng Ou ◽  
...  

In this paper, we propose a new approach for anomaly detection in video surveillance. The approach is based on a nonparametric Bayesian regression model built upon Gaussian process priors. It establishes a set of basic vectors describing motion patterns from low-level features via online clustering, and then constructs a Gaussian process regression model to approximate the distribution of motion patterns in kernel space. We analyze different anomaly measure criteria derived from the Gaussian process regression model and compare their performance. To reduce false detections caused by crowd occlusion, we utilize supplementary information from previous frames to assist in anomaly detection for the current frame. In addition, we address the problem of hyperparameter tuning and discuss methods of efficient calculation to reduce computational overhead. The approach is verified on published anomaly detection datasets and compared with other existing methods. The experimental results demonstrate that it can detect various anomalies efficiently and accurately.
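A hedged, simplified sketch of the core idea (not the authors' implementation): cluster low-level motion features into basic vectors, fit a GP regression to the observed pattern frequencies, and flag patterns with low predicted frequency (relative to the predictive uncertainty) as anomalous. Features, cluster counts, and the scoring rule below are illustrative only.

    # Hedged sketch: GPR over clustered motion patterns with a likelihood-style anomaly score; all data are synthetic.
    import numpy as np
    from scipy.stats import norm
    from sklearn.cluster import MiniBatchKMeans
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(6)
    motion_features = rng.normal(size=(1000, 4))        # low-level motion features from training frames (hypothetical)

    # "Basic vectors" of motion patterns via mini-batch (online-style) clustering
    km = MiniBatchKMeans(n_clusters=20, random_state=0, n_init=3).fit(motion_features)
    centers = km.cluster_centers_
    counts = np.bincount(km.labels_, minlength=20).astype(float)
    freq = counts / counts.sum()                         # how often each pattern occurs in normal footage

    # GPR approximating the distribution of motion patterns over the pattern (kernel) space
    gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True).fit(centers, freq)

    def anomaly_score(feature):
        # Low predicted frequency relative to predictive uncertainty => anomalous pattern
        mu, std = gpr.predict(feature.reshape(1, -1), return_std=True)
        return norm.cdf(-mu[0] / std[0])

    test_feature = rng.normal(size=4) + 5.0              # a far-from-training motion pattern
    print("anomaly score:", anomaly_score(test_feature))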


2016 ◽  
Vol 15 (1) ◽  
pp. 59-63
Author(s):  
Morgan Stuart

Abstract Sports informatics and computer science in sport are perhaps the most exciting and fast-moving disciplines across all of sports science. The tremendous parallel growth in digital technology, non-invasive sensor devices, computer vision and machine learning has empowered sports analytics in ways perhaps never seen before. This growth presents great challenges for new entrants and seasoned veterans of sports analytics alike. Keeping pace with new technological innovations requires a thorough and systematic understanding of many diverse topics, from computer programming to database design, machine learning algorithms and sensor technology. Nevertheless, as quickly as the state-of-the-art technology changes, the foundational skills and knowledge about computer science in sport are lasting. Furthermore, resources for students and practitioners across this range of areas are scarce, and the newly released textbook Computer Science in Sport: Research and Practice, edited by Professor Arnold Baca, provides much of the foundation knowledge required for working in sports informatics. This is certainly a comprehensive text that will be a valuable resource for many readers.


2021 ◽  
Vol 2070 (1) ◽  
pp. 012243
Author(s):  
A Varun ◽  
Mechiri Sandeep Kumar ◽  
Karthik Murumulla ◽  
Tatiparthi Sathvik

Abstract Lathe turning is one of the manufacturing sector’s most basic and important operations. From small businesses to large corporations, optimising machining operations is a key priority. Cooling systems in machining play an important role in determining surface roughness. The machine learning model under discussion assesses the surface roughness of lathe-turned surfaces for a variety of materials. To forecast surface roughness, the machine learning model is trained using machining parameters, material characteristics, tool properties, and cooling conditions such as dry, MQL, and hybrid nanoparticle-mixed MQL. Mixing in appropriate nanoparticles, such as copper or aluminium, may significantly improve the cooling system’s heat absorption. To create a data collection for training and testing the model, many standard journals and publications are used. Surface roughness varies with the combination of work parameters. In MATLAB, a Gaussian Process Regression (GPR) method will be utilised to construct a model and predict surface roughness. To improve prediction outcomes and make the model more flexible, data from a variety of publications were included. Some characteristics were omitted in order to minimise data noise. Different statistical factors will be explored to predict surface roughness.
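The abstract describes a GPR model built in MATLAB; as a hedged, analogous sketch in Python, with synthetic stand-ins for the machining inputs, cooling-condition encoding, and surface roughness values.

    # Hedged sketch analogous to the described MATLAB GPR model; inputs and roughness values are synthetic stand-ins.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(7)
    # Columns: cutting speed, feed, depth of cut, material hardness, cooling condition (0=dry, 1=MQL, 2=nano-MQL)
    X = np.column_stack([rng.uniform(50, 250, 120), rng.uniform(0.05, 0.3, 120),
                         rng.uniform(0.2, 2.0, 120), rng.uniform(150, 400, 120),
                         rng.integers(0, 3, 120)])
    ra = 0.8 + 4.0 * X[:, 1] - 0.1 * X[:, 4] + rng.normal(scale=0.1, size=120)   # synthetic surface roughness Ra (um)

    X_tr, X_te, y_tr, y_te = train_test_split(X, ra, test_size=0.25, random_state=0)
    gpr = GaussianProcessRegressor(kernel=RBF(length_scale=np.ones(5)) + WhiteKernel(),
                                   normalize_y=True).fit(X_tr, y_tr)
    pred, std = gpr.predict(X_te, return_std=True)       # roughness prediction with uncertainty
    print("RMSE:", np.sqrt(np.mean((pred - y_te) ** 2)))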

