An uncertainty-aware hybrid modelling approach using probabilistic machine learning

Author(s):  
Rasmus Fjordbak Nielsen ◽  
Nima Nazemzadeh ◽  
Martin Peter Andersson ◽  
Krist V. Gernaey ◽  
Seyed Soheil Mansouri

Materials ◽
2021 ◽  
Vol 14 (8) ◽  
pp. 1883
Author(s):  
Frederic E. Bock ◽  
Sören Keller ◽  
Norbert Huber ◽  
Benjamin Klusemann

Within the field of materials mechanics, considering physical laws in machine learning predictions alongside the use of data can enable low prediction errors and robustness, as opposed to predictions based on data alone. On the one hand, exclusive use of fundamental physical relationships might show significant deviations between predictions and reality, due to simplifications and assumptions. On the other hand, using only data and neglecting well-established physical laws can create the need for unreasonably large data sets, which are required to exhibit low bias and are usually expensive to collect. However, fundamental but simplified physics in combination with a corrective model that compensates for possible deviations, e.g., from experimental data, can lead to physics-based predictions with low prediction errors, even when data is scarce. In this article, it is demonstrated that a hybrid model approach, consisting of a physics-based model that is corrected via an artificial neural network, represents a more efficient prediction tool than a purely data-driven model. In particular, a semi-analytical model serves as an efficient low-fidelity model with noticeable prediction errors outside its calibration domain. An artificial neural network is used to correct the semi-analytical solution towards a desired reference solution provided by high-fidelity finite element simulations, while the efficiency of the semi-analytical model is maintained and its applicability range is enhanced. We use residual stresses induced by laser shock peening as a use-case example. In addition, it is shown that non-unique relationships between model inputs and outputs lead to high prediction errors, and that identifying salient input features via dimensionality analysis is highly beneficial for achieving low prediction errors.
In a generalization task, predictions are made outside the process parameter space of the training region while remaining within the trained range of corrections. The corrective model predictions show substantially smaller errors than purely data-driven model predictions, which illustrates one of the benefits of the hybrid modelling approach. Ultimately, when the number of samples in the data set is reduced, the generalization of the physics-related corrective model outperforms the purely data-driven model, which also demonstrates the efficient applicability of the proposed hybrid modelling approach to problems where data is scarce.
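The correction scheme described above can be sketched in a few lines: a cheap low-fidelity model is augmented by a small neural network trained on the residual to a high-fidelity reference, and the hybrid prediction is the sum of the two. The functions `f_lo` and `f_hi` and all hyperparameters below are illustrative stand-ins, not the semi-analytical and finite element models of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D use case: f_hi stands in for a high-fidelity reference
# (e.g. an FE simulation), f_lo for a cheap low-fidelity model that
# misses part of the physics (here, the quadratic term).
def f_hi(x):
    return np.sin(3 * x) + 0.3 * x**2

def f_lo(x):
    return np.sin(3 * x)

# Sparse training data: samples of the residual f_hi - f_lo.
x_train = rng.uniform(-2, 2, size=(40, 1))
r_train = f_hi(x_train) - f_lo(x_train)

# Tiny one-hidden-layer network trained by full-batch gradient descent.
W1 = rng.normal(0, 0.5, (1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)
lr = 0.05
for _ in range(5000):
    h = np.tanh(x_train @ W1 + b1)            # forward pass
    pred = h @ W2 + b2
    err = pred - r_train                      # gradient of MSE w.r.t. pred
    gW2 = h.T @ err / len(x_train); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h**2)            # backprop through tanh
    gW1 = x_train.T @ dh / len(x_train); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

def hybrid(x):
    """Low-fidelity model plus learned correction."""
    h = np.tanh(x @ W1 + b1)
    return f_lo(x) + h @ W2 + b2

x_test = np.linspace(-2, 2, 100).reshape(-1, 1)
err_lo = np.abs(f_lo(x_test) - f_hi(x_test)).mean()  # uncorrected error
err_hy = np.abs(hybrid(x_test) - f_hi(x_test)).mean()
```

The same residual-learning pattern applies unchanged when `f_lo` is a semi-analytical solver and `f_hi` an expensive simulation; only the training data generation changes.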


1991 ◽  
Vol 24 (6) ◽  
pp. 25-33
Author(s):  
A. J. Jakeman ◽  
P. G. Whitehead ◽  
A. Robson ◽  
J. A. Taylor ◽  
J. Bai

The paper illustrates analysis of the assumptions of the statistical component of a hybrid modelling approach for predicting environmental extremes. This shows how to assess the applicability of the approach to water quality problems. The analysis involves data on stream acidity from the Birkenes catchment in Norway. The modelling approach is hybrid in that it uses: (1) a deterministic or process-based description to simulate (non-stationary) long term trend values of environmental variables, and (2) probability distributions which are superimposed on the trend values to characterise the frequency of shorter term concentrations. This permits assessment of management strategies and of sensitivity to climate variables by adjusting the values of major forcing variables in the trend model. Knowledge of the variability about the trend is provided by: (a) identification of an appropriate parametric form of the probability density function (pdf) of the environmental attribute (e.g. stream acidity variables) whose extremes are of interest, and (b) estimation of pdf parameters using the output of the trend model.
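The two-component structure described above can be illustrated with a toy series: a deterministic trend model supplies the long-term level, and a parametric pdf fitted to the residuals characterises short-term variability and extremes. The Gaussian deviation model, thresholds, and trend shift below are illustrative assumptions, not the Birkenes catchment models.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

# Hypothetical daily series: a deterministic long-term trend (component 1)
# plus short-term variability about it (component 2).
t = np.arange(3650)
trend = 5.0 + 0.0005 * t                  # stand-in for the trend model output
obs = trend + rng.normal(0, 0.8, t.size)  # observed concentrations

# (a) choose a parametric pdf for deviations about the trend (a Gaussian
# here, purely for illustration); (b) estimate its parameters from the
# trend-model residuals.
resid = obs - trend
mu, sigma = resid.mean(), resid.std(ddof=1)

def exceedance_prob(threshold, trend_value):
    """P(concentration > threshold) at a given trend level,
    under the fitted Gaussian deviation model."""
    z = (threshold - trend_value - mu) / sigma
    return 0.5 * (1 - erf(z / sqrt(2)))

# Management scenario: shift the trend (e.g. reduced deposition) and
# re-evaluate the frequency of exceeding a critical level.
p_now = exceedance_prob(7.0, trend[-1])
p_scenario = exceedance_prob(7.0, trend[-1] - 0.5)
```

Adjusting the forcing variables of the trend model and re-evaluating the superimposed pdf is exactly the sensitivity-analysis loop the abstract describes.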


2021 ◽  
pp. 002224372110329
Author(s):  
Nicolas Padilla ◽  
Eva Ascarza

The success of Customer Relationship Management (CRM) programs ultimately depends on the firm's ability to identify and leverage differences across customers — a very difficult task when firms attempt to manage new customers, for whom only the first purchase has been observed. For those customers, the lack of repeated observations poses a structural challenge to inferring unobserved differences across them. This is what we call the “cold start” problem of CRM, whereby companies have difficulties leveraging existing data when they attempt to make inferences about customers at the beginning of their relationship. We propose a solution to the cold start problem by developing a probabilistic machine learning modeling framework that leverages the information collected at the moment of acquisition. The main aspect of the model is that it flexibly captures latent dimensions that govern the behaviors observed at acquisition as well as future propensities to buy and to respond to marketing actions using deep exponential families. The model can be integrated with a variety of demand specifications and is flexible enough to capture a wide range of heterogeneity structures. We validate our approach in a retail context and empirically demonstrate the model's ability to identify high-value customers as well as those most sensitive to marketing actions, right after their first purchase.
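The inference pattern behind such a cold-start model can be shown with a deliberately simplified sketch: latent customer traits are inferred from the single acquisition observation, then used to predict future propensities. The paper learns this latent structure with deep exponential families; the two hard-coded segments, likelihoods, and propensities below are purely hypothetical and stand in for that learned structure.

```python
import numpy as np

# Toy sketch: two latent customer segments with assumed priors,
# acquisition-signal likelihoods, and future buy propensities.
segments = ["deal-seeker", "loyalist"]
prior = np.array([0.6, 0.4])            # P(segment)

# P(acquisition signal | segment); the signal here is a single binary
# feature, e.g. "was the first purchase made on discount?"
# rows: segment, cols: signal = 0 (no) / 1 (yes)
lik = np.array([[0.2, 0.8],
                [0.7, 0.3]])

propensity = np.array([0.15, 0.45])     # P(repeat buy | segment)

def posterior(signal):
    """Bayes update from the single observation at acquisition."""
    unnorm = prior * lik[:, signal]
    return unnorm / unnorm.sum()

def expected_propensity(signal):
    """Posterior-weighted future buy propensity for a new customer."""
    return float(posterior(signal) @ propensity)

p_discount = expected_propensity(1)     # first purchase on discount
p_full = expected_propensity(0)         # first purchase at full price
```

Even one acquisition observation shifts the posterior over latent traits enough to separate expected future value, which is the core of the proposed cold-start solution.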


2021 ◽  
Vol 37 (3) ◽  
pp. 585-617
Author(s):  
Teresa Bono ◽  
Karen Croxson ◽  
Adam Giles

The use of machine learning as an input into decision-making is on the rise, owing to its ability to uncover hidden patterns in large data and improve prediction accuracy. Questions have been raised, however, about the potential distributional impacts of these technologies, with one concern being that they may perpetuate or even amplify human biases from the past. Exploiting detailed credit file data for 800,000 UK borrowers, we simulate a switch from a traditional (logit) credit scoring model to ensemble machine-learning methods. We confirm that machine-learning models are more accurate overall. We also find that they do as well as the simpler traditional model on relevant fairness criteria, where these criteria pertain to overall accuracy and error rates for population subgroups defined along protected or sensitive lines (gender, race, health status, and deprivation). We do observe some differences in the way credit-scoring models perform for different subgroups, but these manifest under a traditional modelling approach and switching to machine learning neither exacerbates nor eliminates these issues. The paper discusses some of the mechanical and data factors that may contribute to statistical fairness issues in the context of credit scoring.
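The subgroup fairness criteria mentioned above (accuracy and error rates per protected group) are straightforward to compute once labels, predictions, and group membership are available. The synthetic portfolio below is purely illustrative; it is not the paper's credit file data or scoring models.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical scored portfolio: true default labels, model scores,
# and a binary protected attribute for each borrower.
n = 10000
group = rng.integers(0, 2, n)            # subgroup membership (0 or 1)
y = (rng.random(n) < 0.2).astype(int)    # 1 = defaulted
score = 0.4 * y + 0.6 * rng.random(n)    # model score, correlated with y
pred = (score > 0.5).astype(int)         # decision at a fixed threshold

def subgroup_rates(y, pred, mask):
    """Accuracy, false positive rate, false negative rate for one subgroup."""
    y_g, p_g = y[mask], pred[mask]
    acc = (y_g == p_g).mean()
    fpr = ((p_g == 1) & (y_g == 0)).sum() / max((y_g == 0).sum(), 1)
    fnr = ((p_g == 0) & (y_g == 1)).sum() / max((y_g == 1).sum(), 1)
    return acc, fpr, fnr

rates_0 = subgroup_rates(y, pred, group == 0)
rates_1 = subgroup_rates(y, pred, group == 1)
```

Comparing these per-group rates across the logit and machine-learning models is the essence of the statistical fairness check the abstract describes.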


2021 ◽  
Author(s):  
Florian Wellmann ◽  
Miguel de la Varga ◽  
Nilgün Güdük ◽  
Jan von Harten ◽  
Fabian Stamm ◽  
...  

Geological models, as 3-D representations of subsurface structures and property distributions, are used in many economic, scientific, and societal decision processes. These models are built on prior assumptions and imperfect information, and they often result from an integration of geological and geophysical data types with varying quality. These aspects result in uncertainties about the predicted subsurface structures and property distributions, which will affect the subsequent decision process.

We discuss approaches to evaluate uncertainties in geological models and to integrate geological and geophysical information in combined workflows. A first step is the consideration of uncertainties in prior model parameters on the basis of uncertainty propagation (forward uncertainty quantification). When applied to structural geological models with discrete classes, these methods result in a class probability for each point in space, often represented in tessellated grid cells. These results can then be visualized or forwarded to process simulations. Another option is to add risk functions for subsequent decision analyses. In recent work, these geological uncertainty fields have also been used as an input to subsequent geophysical inversions.

A logical extension to these existing approaches is the integration of geological forward operators into inverse frameworks, to enable a full flow of inference for a wider range of relevant parameters. We investigate here specifically the use of probabilistic machine learning tools in combination with geological and geophysical modeling. Challenges exist due to the hierarchical nature of the probabilistic models, but modern sampling strategies allow for efficient sampling in these complex settings. We showcase the application with examples combining geological modeling and geophysical potential field measurements in an integrated model for improved decision making.
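Forward uncertainty quantification for a structural model with discrete classes can be sketched minimally: sample the uncertain prior parameters, re-run the geological forward model per draw, and average the resulting class assignments per grid cell. The single uncertain horizontal interface below is a deliberately trivial stand-in for a real structural model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy forward UQ: one horizontal geological interface with uncertain depth.
depths = np.linspace(0, 100, 50)          # 1-D grid of cell depths (m)
n_draws = 2000
interface = rng.normal(60, 5, n_draws)    # prior on interface depth (m)

# Forward model: class 1 below the interface, class 0 above it.
# Shape (n_draws, n_cells): one discrete realization per prior draw.
realizations = (depths[None, :] > interface[:, None]).astype(int)

# Class-1 probability per cell: fraction of realizations assigning class 1.
# This is the per-cell class probability field discussed above.
p_class1 = realizations.mean(axis=0)
```

The same Monte Carlo loop applies to full 3-D structural models; only the forward model call changes, and the resulting probability field can be visualized, passed to process simulations, or combined with risk functions.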

