An uncertainty-aware hybrid modelling approach using probabilistic machine learning

Author(s):  
Rasmus Fjordbak Nielsen ◽  
Nima Nazemzadeh ◽  
Martin Peter Andersson ◽  
Krist V. Gernaey ◽  
Seyed Soheil Mansouri

Materials ◽
2021 ◽  
Vol 14 (8) ◽  
pp. 1883
Author(s):  
Frederic E. Bock ◽  
Sören Keller ◽  
Norbert Huber ◽  
Benjamin Klusemann

Within the field of materials mechanics, considering physical laws in machine learning predictions alongside the use of data can enable low prediction errors and robustness, as opposed to predictions based on data alone. On the one hand, exclusive use of fundamental physical relationships might show significant deviations between predictions and reality, due to simplifications and assumptions. On the other hand, using only data and neglecting well-established physical laws can create the need for unreasonably large data sets, which are required to exhibit low bias and are usually expensive to collect. However, fundamental but simplified physics in combination with a corrective model that compensates for possible deviations, e.g., from experimental data, can lead to physics-based predictions with low prediction errors, even when data is scarce. In this article, it is demonstrated that a hybrid model approach, consisting of a physics-based model that is corrected via an artificial neural network, represents a more efficient prediction tool than a purely data-driven model. In particular, a semi-analytical model serves as an efficient low-fidelity model with noticeable prediction errors outside its calibration domain. An artificial neural network is used to correct the semi-analytical solution towards a desired reference solution provided by high-fidelity finite element simulations, while the efficiency of the semi-analytical model is maintained and its applicability range is enhanced. We use residual stresses induced by laser shock peening as a use-case example. In addition, it is shown that non-unique relationships between model inputs and outputs lead to high prediction errors, and that identifying salient input features via dimensionality analysis is highly beneficial for achieving low prediction errors.
In a generalization task, predictions are made outside the process parameter space of the training region while remaining within the trained range of corrections. The corrective model predictions show substantially smaller errors than purely data-driven model predictions, which illustrates one of the benefits of the hybrid modelling approach. Ultimately, when the number of samples in the data set is reduced, the generalization of the physics-related corrective model outperforms the purely data-driven model, which also demonstrates the efficient applicability of the proposed hybrid modelling approach to problems where data is scarce.
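The correction scheme described above can be sketched in a few lines: a cheap low-fidelity model is augmented by a small neural network trained on the residual to a high-fidelity reference, and the hybrid prediction is the sum of the two. The functions `f_lo` and `f_hi` and all hyperparameters below are illustrative stand-ins, not the semi-analytical and finite element models of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D use case: f_hi stands in for a high-fidelity reference
# (e.g. an FE simulation), f_lo for a cheap low-fidelity model that
# misses part of the physics (here, the quadratic term).
def f_hi(x):
    return np.sin(3 * x) + 0.3 * x**2

def f_lo(x):
    return np.sin(3 * x)

# Sparse training data: samples of the residual f_hi - f_lo.
x_train = rng.uniform(-2, 2, size=(40, 1))
r_train = f_hi(x_train) - f_lo(x_train)

# Tiny one-hidden-layer network trained by full-batch gradient descent.
W1 = rng.normal(0, 0.5, (1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)
lr = 0.05
for _ in range(5000):
    h = np.tanh(x_train @ W1 + b1)            # forward pass
    pred = h @ W2 + b2
    err = pred - r_train                      # gradient of MSE w.r.t. pred
    gW2 = h.T @ err / len(x_train); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h**2)            # backprop through tanh
    gW1 = x_train.T @ dh / len(x_train); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

def hybrid(x):
    """Low-fidelity model plus learned correction."""
    h = np.tanh(x @ W1 + b1)
    return f_lo(x) + h @ W2 + b2

x_test = np.linspace(-2, 2, 100).reshape(-1, 1)
err_lo = np.abs(f_lo(x_test) - f_hi(x_test)).mean()  # uncorrected error
err_hy = np.abs(hybrid(x_test) - f_hi(x_test)).mean()
```

The same residual-learning pattern applies unchanged when `f_lo` is a semi-analytical solver and `f_hi` an expensive simulation; only the training data generation changes.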


1991 ◽  
Vol 24 (6) ◽  
pp. 25-33
Author(s):  
A. J. Jakeman ◽  
P. G. Whitehead ◽  
A. Robson ◽  
J. A. Taylor ◽  
J. Bai

The paper illustrates analysis of the assumptions of the statistical component of a hybrid modelling approach for predicting environmental extremes. This shows how to assess the applicability of the approach to water quality problems. The analysis involves data on stream acidity from the Birkenes catchment in Norway. The modelling approach is hybrid in that it uses: (1) a deterministic or process-based description to simulate (non-stationary) long term trend values of environmental variables, and (2) probability distributions which are superimposed on the trend values to characterise the frequency of shorter term concentrations. This permits assessment of management strategies and of sensitivity to climate variables by adjusting the values of major forcing variables in the trend model. Knowledge of the variability about the trend is provided by: (a) identification of an appropriate parametric form of the probability density function (pdf) of the environmental attribute (e.g. stream acidity variables) whose extremes are of interest, and (b) estimation of pdf parameters using the output of the trend model.
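The two-component structure described above can be illustrated with a toy series: a deterministic trend model supplies the long-term level, and a parametric pdf fitted to the residuals characterises short-term variability and extremes. The Gaussian deviation model, thresholds, and trend shift below are illustrative assumptions, not the Birkenes catchment models.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

# Hypothetical daily series: a deterministic long-term trend (component 1)
# plus short-term variability about it (component 2).
t = np.arange(3650)
trend = 5.0 + 0.0005 * t                  # stand-in for the trend model output
obs = trend + rng.normal(0, 0.8, t.size)  # observed concentrations

# (a) choose a parametric pdf for deviations about the trend (a Gaussian
# here, purely for illustration); (b) estimate its parameters from the
# trend-model residuals.
resid = obs - trend
mu, sigma = resid.mean(), resid.std(ddof=1)

def exceedance_prob(threshold, trend_value):
    """P(concentration > threshold) at a given trend level,
    under the fitted Gaussian deviation model."""
    z = (threshold - trend_value - mu) / sigma
    return 0.5 * (1 - erf(z / sqrt(2)))

# Management scenario: shift the trend (e.g. reduced deposition) and
# re-evaluate the frequency of exceeding a critical level.
p_now = exceedance_prob(7.0, trend[-1])
p_scenario = exceedance_prob(7.0, trend[-1] - 0.5)
```

Adjusting the forcing variables of the trend model and re-evaluating the superimposed pdf is exactly the sensitivity-analysis loop the abstract describes.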


2021 ◽  
pp. 002224372110329
Author(s):  
Nicolas Padilla ◽  
Eva Ascarza

The success of Customer Relationship Management (CRM) programs ultimately depends on the firm's ability to identify and leverage differences across customers — a very difficult task when firms attempt to manage new customers, for whom only the first purchase has been observed. For those customers, the lack of repeated observations poses a structural challenge to inferring unobserved differences across them. This is what we call the “cold start” problem of CRM, whereby companies have difficulties leveraging existing data when they attempt to make inferences about customers at the beginning of their relationship. We propose a solution to the cold start problem by developing a probabilistic machine learning modeling framework that leverages the information collected at the moment of acquisition. The main aspect of the model is that it flexibly captures latent dimensions that govern the behaviors observed at acquisition as well as future propensities to buy and to respond to marketing actions using deep exponential families. The model can be integrated with a variety of demand specifications and is flexible enough to capture a wide range of heterogeneity structures. We validate our approach in a retail context and empirically demonstrate the model's ability to identify high-value customers as well as those most sensitive to marketing actions, right after their first purchase.
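The inference pattern behind such a cold-start model can be shown with a deliberately simplified sketch: latent customer traits are inferred from the single acquisition observation, then used to predict future propensities. The paper learns this latent structure with deep exponential families; the two hard-coded segments, likelihoods, and propensities below are purely hypothetical and stand in for that learned structure.

```python
import numpy as np

# Toy sketch: two latent customer segments with assumed priors,
# acquisition-signal likelihoods, and future buy propensities.
segments = ["deal-seeker", "loyalist"]
prior = np.array([0.6, 0.4])            # P(segment)

# P(acquisition signal | segment); the signal here is a single binary
# feature, e.g. "was the first purchase made on discount?"
# rows: segment, cols: signal = 0 (no) / 1 (yes)
lik = np.array([[0.2, 0.8],
                [0.7, 0.3]])

propensity = np.array([0.15, 0.45])     # P(repeat buy | segment)

def posterior(signal):
    """Bayes update from the single observation at acquisition."""
    unnorm = prior * lik[:, signal]
    return unnorm / unnorm.sum()

def expected_propensity(signal):
    """Posterior-weighted future buy propensity for a new customer."""
    return float(posterior(signal) @ propensity)

p_discount = expected_propensity(1)     # first purchase on discount
p_full = expected_propensity(0)         # first purchase at full price
```

Even one acquisition observation shifts the posterior over latent traits enough to separate expected future value, which is the core of the proposed cold-start solution.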


2021 ◽  
Vol 37 (3) ◽  
pp. 585-617
Author(s):  
Teresa Bono ◽  
Karen Croxson ◽  
Adam Giles

The use of machine learning as an input into decision-making is on the rise, owing to its ability to uncover hidden patterns in large data and improve prediction accuracy. Questions have been raised, however, about the potential distributional impacts of these technologies, with one concern being that they may perpetuate or even amplify human biases from the past. Exploiting detailed credit file data for 800,000 UK borrowers, we simulate a switch from a traditional (logit) credit scoring model to ensemble machine-learning methods. We confirm that machine-learning models are more accurate overall. We also find that they do as well as the simpler traditional model on relevant fairness criteria, where these criteria pertain to overall accuracy and error rates for population subgroups defined along protected or sensitive lines (gender, race, health status, and deprivation). We do observe some differences in the way credit-scoring models perform for different subgroups, but these manifest under a traditional modelling approach and switching to machine learning neither exacerbates nor eliminates these issues. The paper discusses some of the mechanical and data factors that may contribute to statistical fairness issues in the context of credit scoring.
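The subgroup fairness criteria mentioned above (accuracy and error rates per protected group) are straightforward to compute once labels, predictions, and group membership are available. The synthetic portfolio below is purely illustrative; it is not the paper's credit file data or scoring models.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical scored portfolio: true default labels, model scores,
# and a binary protected attribute for each borrower.
n = 10000
group = rng.integers(0, 2, n)            # subgroup membership (0 or 1)
y = (rng.random(n) < 0.2).astype(int)    # 1 = defaulted
score = 0.4 * y + 0.6 * rng.random(n)    # model score, correlated with y
pred = (score > 0.5).astype(int)         # decision at a fixed threshold

def subgroup_rates(y, pred, mask):
    """Accuracy, false positive rate, false negative rate for one subgroup."""
    y_g, p_g = y[mask], pred[mask]
    acc = (y_g == p_g).mean()
    fpr = ((p_g == 1) & (y_g == 0)).sum() / max((y_g == 0).sum(), 1)
    fnr = ((p_g == 0) & (y_g == 1)).sum() / max((y_g == 1).sum(), 1)
    return acc, fpr, fnr

rates_0 = subgroup_rates(y, pred, group == 0)
rates_1 = subgroup_rates(y, pred, group == 1)
```

Comparing these per-group rates across the logit and machine-learning models is the essence of the statistical fairness check the abstract describes.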


2021 ◽  
Author(s):  
Florian Wellmann ◽  
Miguel de la Varga ◽  
Nilgün Güdük ◽  
Jan von Harten ◽  
Fabian Stamm ◽  
...  

Geological models, as 3-D representations of subsurface structures and property distributions, are used in many economic, scientific, and societal decision processes. These models are built on prior assumptions and imperfect information, and they often result from an integration of geological and geophysical data types with varying quality. These aspects result in uncertainties about the predicted subsurface structures and property distributions, which will affect the subsequent decision process.

We discuss approaches to evaluate uncertainties in geological models and to integrate geological and geophysical information in combined workflows. A first step is the consideration of uncertainties in prior model parameters on the basis of uncertainty propagation (forward uncertainty quantification). When applied to structural geological models with discrete classes, these methods result in a class probability for each point in space, often represented in tessellated grid cells. These results can then be visualized or forwarded to process simulations. Another option is to add risk functions for subsequent decision analyses. In recent work, these geological uncertainty fields have also been used as an input to subsequent geophysical inversions.

A logical extension to these existing approaches is the integration of geological forward operators into inverse frameworks, to enable a full flow of inference for a wider range of relevant parameters. We investigate here specifically the use of probabilistic machine learning tools in combination with geological and geophysical modeling. Challenges exist due to the hierarchical nature of the probabilistic models, but modern sampling strategies allow for efficient sampling in these complex settings. We showcase the application with examples combining geological modeling and geophysical potential field measurements in an integrated model for improved decision making.
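Forward uncertainty quantification for a structural model with discrete classes can be sketched minimally: sample the uncertain prior parameters, re-run the geological forward model per draw, and average the resulting class assignments per grid cell. The single uncertain horizontal interface below is a deliberately trivial stand-in for a real structural model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy forward UQ: one horizontal geological interface with uncertain depth.
depths = np.linspace(0, 100, 50)          # 1-D grid of cell depths (m)
n_draws = 2000
interface = rng.normal(60, 5, n_draws)    # prior on interface depth (m)

# Forward model: class 1 below the interface, class 0 above it.
# Shape (n_draws, n_cells): one discrete realization per prior draw.
realizations = (depths[None, :] > interface[:, None]).astype(int)

# Class-1 probability per cell: fraction of realizations assigning class 1.
# This is the per-cell class probability field discussed above.
p_class1 = realizations.mean(axis=0)
```

The same Monte Carlo loop applies to full 3-D structural models; only the forward model call changes, and the resulting probability field can be visualized, passed to process simulations, or combined with risk functions.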

