scholarly journals A Bayesian Framework for Adsorption Energy Prediction on Bimetallic Alloy Catalysts

Author(s):  
Osman Mamun ◽  
Kirsten Winther ◽  
Jacob Boes ◽  
Thomas Bligaard

For high-throughput screening of materials for heterogeneous catalysis, scaling relations provides an efficient scheme to estimate the chemisorption energies of hydrogenated species. However, conditioning on a single descriptor ignores the model uncertainty and leads to sub optimal prediction of the chemisorption energy. In this paper, we extend the single descriptor linear scaling relation to a multi descriptor linear regression models to leverage the correlation between adsorption energy of any two pair of adsorbates. With a large dataset, we use Bayesian Information Criteria (BIC) as the model evidence to select the best linear regression model that are derived from non-informative priors. Furthermore, Gaussian Process Regression (GPR) based on the meaningful convolution of physical properties of the metal-adsorbate complex can be used to predict the baseline residual of the selected model. This integrated Bayesian model selection and Gaussian process regression, dubbed as residual learning, can achieve performance comparable to standard DFT error (0.1 eV) for most adsorbate system. For sparse and small datasets, we propose an ad hoc Bayesian Model Averaging (BMA) approach to make a robust prediction. With this Bayesian framework, we significantly reduce the model uncertainty and improve the prediction accuracy. The possibilities of the framework for high-throughput catalytic materials exploration in a realistic setting is illustrated using large and small sets of both dense and sparse simulated dataset generated from a public database of bimetallic alloys available in Catalysis-Hub.org.

Author(s):  
Osman Mamun ◽  
Kirsten Winther ◽  
Jacob Boes ◽  
Thomas Bligaard

For high-throughput screening of materials for heterogeneous catalysis, scaling relations provides an efficient scheme to estimate the chemisorption energies of hydrogenated species. However, conditioning on a single descriptor ignores the model uncertainty and leads to sub optimal prediction of the chemisorption energy. In this paper, we extend the single descriptor linear scaling relation to a multi descriptor linear regression models to leverage the correlation between adsorption energy of any two pair of adsorbates. With a large dataset, we use Bayesian Information Criteria (BIC) as the model evidence to select the best linear regression model that are derived from non-informative priors. Furthermore, Gaussian Process Regression (GPR) based on the meaningful convolution of physical properties of the metal-adsorbate complex can be used to predict the baseline residual of the selected model. This integrated Bayesian model selection and Gaussian process regression, dubbed as residual learning, can achieve performance comparable to standard DFT error (0.1 eV) for most adsorbate system. For sparse and small datasets, we propose an ad hoc Bayesian Model Averaging (BMA) approach to make a robust prediction. With this Bayesian framework, we significantly reduce the model uncertainty and improve the prediction accuracy. The possibilities of the framework for high-throughput catalytic materials exploration in a realistic setting is illustrated using large and small sets of both dense and sparse simulated dataset generated from a public database of bimetallic alloys available in Catalysis-Hub.org.


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Osman Mamun ◽  
Kirsten T. Winther ◽  
Jacob R. Boes ◽  
Thomas Bligaard

AbstractFor high-throughput screening of materials for heterogeneous catalysis, scaling relations provides an efficient scheme to estimate the chemisorption energies of hydrogenated species. However, conditioning on a single descriptor ignores the model uncertainty and leads to suboptimal prediction of the chemisorption energy. In this article, we extend the single descriptor linear scaling relation to a multi-descriptor linear regression models to leverage the correlation between adsorption energy of any two pair of adsorbates. With a large dataset, we use Bayesian Information Criteria (BIC) as the model evidence to select the best linear regression model. Furthermore, Gaussian Process Regression (GPR) based on the meaningful convolution of physical properties of the metal-adsorbate complex can be used to predict the baseline residual of the selected model. This integrated Bayesian model selection and Gaussian process regression, dubbed as residual learning, can achieve performance comparable to standard DFT error (0.1 eV) for most adsorbate system. For sparse and small datasets, we propose an ad hoc Bayesian Model Averaging (BMA) approach to make a robust prediction. With this Bayesian framework, we significantly reduce the model uncertainty and improve the prediction accuracy. The possibilities of the framework for high-throughput catalytic materials exploration in a realistic setting is illustrated using large and small sets of both dense and sparse simulated dataset generated from a public database of bimetallic alloys available in Catalysis-Hub.org.


2021 ◽  
Author(s):  
Carlos R Oliveira ◽  
Eugene D Shapiro ◽  
Daniel M Weinberger

Vaccine effectiveness (VE) studies are often conducted after the introduction of new vaccines to ensure they provide protection in real-world settings. Although susceptible to confounding, the test-negative case-control study design is the most efficient method to assess VE post-licensure. Control of confounding is often needed during the analyses, which is most efficiently done through multivariable modeling. When a large number of potential confounders are being considered, it can be challenging to know which variables need to be included in the final model. This paper highlights the importance of considering model uncertainty by re-analyzing a Lyme VE study using several confounder selection methods. We propose an intuitive Bayesian Model Averaging (BMA) framework for this task and compare the performance of BMA to that of traditional single-best-model-selection methods. We demonstrate how BMA can be advantageous in situations when there is uncertainty about model selection by systematically considering alternative models and increasing transparency.


2014 ◽  
Vol 10 (8) ◽  
pp. 2023-2030 ◽  
Author(s):  
Xun Huang ◽  
Zhike Zi

A new method that uses Bayesian model averaging for linear regression to infer molecular interactions in biological systems with high prediction accuracy and high computational efficiency.


2019 ◽  
Vol 220 (2) ◽  
pp. 1368-1378
Author(s):  
M Bertin ◽  
S Marin ◽  
C Millet ◽  
C Berge-Thierry

SUMMARY In low-seismicity areas such as Europe, seismic records do not cover the whole range of variable configurations required for seismic hazard analysis. Usually, a set of empirical models established in such context (the Mediterranean Basin, northeast U.S.A., Japan, etc.) is considered through a logic-tree-based selection process. This approach is mainly based on the scientist’s expertise and ignores the uncertainty in model selection. One important and potential consequence of neglecting model uncertainty is that we assign more precision to our inference than what is warranted by the data, and this leads to overly confident decisions and precision. In this paper, we investigate the Bayesian model averaging (BMA) approach, using nine ground-motion prediction equations (GMPEs) issued from several databases. The BMA method has become an important tool to deal with model uncertainty, especially in empirical settings with large number of potential models and relatively limited number of observations. Two numerical techniques, based on the Markov chain Monte Carlo method and the maximum likelihood estimation approach, for implementing BMA are presented and applied together with around 1000 records issued from the RESORCE-2013 database. In the example considered, it is shown that BMA provides both a hierarchy of GMPEs and an improved out-of-sample predictive performance.


Sign in / Sign up

Export Citation Format

Share Document