Calibration and prediction for the inexact SIR model

2022 ◽  
Vol 19 (3) ◽  
pp. 2800-2818
Author(s):  
Yan Wang ◽  
Guichen Lu ◽  
Jiang Du

A Susceptible-Infective-Recovered (SIR) model is usually unable to mimic the actual epidemiological system exactly. The sources of this inaccuracy include observation errors and the model discrepancy that arises from the assumptions and simplifications the SIR model makes. Hence, this work proposes calibration and prediction methods for the SIR model given a one-time reported number of infected cases. Since the observation errors in the reported data are assumed to be heteroscedastic, we propose two predictors of the actual epidemiological system that model the discrepancy with a Gaussian Process model. One is the calibrated SIR model; the other is the discrepancy-corrected predictor, which combines the calibrated SIR model with the Gaussian Process predictor to correct for the model discrepancy. A wild bootstrap method quantifies the uncertainty of the two predictors, and two numerical studies assess the performance of the proposed method. The numerical results show that the proposed predictors outperform existing ones and that the prediction accuracy of the discrepancy-corrected predictor improves by at least 49.95%.
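The discrepancy-corrected predictor lends itself to a compact illustration: calibrate the SIR parameters against the reported counts, then fit a Gaussian Process to the residuals and add its prediction back. The sketch below follows that recipe on synthetic data with generic SciPy/scikit-learn tools; all names are illustrative, and for simplicity it uses a homoscedastic noise kernel rather than the paper's heteroscedastic error model, and it skips the wild-bootstrap uncertainty quantification.

```python
# Minimal sketch of a discrepancy-corrected SIR predictor (illustrative names,
# synthetic data; not the authors' code).
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def sir_infected(theta, t, n=1.0, i0=1e-3):
    """Integrate the SIR ODEs and return I(t) for rates theta = (beta, gamma)."""
    beta, gamma = theta
    def rhs(_, y):
        s, i = y
        return [-beta * s * i / n, beta * s * i / n - gamma * i]
    sol = solve_ivp(rhs, (t[0], t[-1]), [n - i0, i0], t_eval=t)
    return sol.y[1]

t_obs = np.linspace(0.0, 60.0, 40)
y_obs = sir_infected((0.30, 0.10), t_obs) + 0.01 * np.random.default_rng(0).normal(size=t_obs.size)

# Step 1: calibrate (beta, gamma) by least squares against the reported cases.
def loss(th):
    if np.any(np.asarray(th) <= 0):
        return np.inf                       # keep the rates physical
    return np.sum((sir_infected(th, t_obs) - y_obs) ** 2)

theta_hat = minimize(loss, x0=[0.2, 0.05], method="Nelder-Mead").x

# Step 2: model the remaining discrepancy with a Gaussian process;
# the WhiteKernel absorbs the (here homoscedastic) observation noise.
resid = y_obs - sir_infected(theta_hat, t_obs)
gp = GaussianProcessRegressor(RBF() + WhiteKernel(), normalize_y=True)
gp.fit(t_obs.reshape(-1, 1), resid)

# Discrepancy-corrected prediction = calibrated SIR + GP correction.
t_new = np.linspace(0.0, 60.0, 200)
y_pred = sir_infected(theta_hat, t_new) + gp.predict(t_new.reshape(-1, 1))
```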

2018 ◽  
Author(s):  
Caitlin C. Bannan ◽  
David Mobley ◽  
A. Geoff Skillman

A variety of fields would benefit from accurate pKa predictions, especially drug design, given the effect a change in ionization state can have on a molecule's physicochemical properties. Participants in the recent SAMPL6 blind challenge were asked to submit predictions for microscopic and macroscopic pKas of 24 drug-like small molecules. We recently built a general model for predicting pKas using a Gaussian process regression trained on physical and chemical features of each ionizable group. Our pipeline takes a molecular graph and uses the OpenEye Toolkits to calculate features describing the removal of a proton. These features are fed into a Scikit-learn Gaussian process to predict microscopic pKas, which are then used to analytically determine macroscopic pKas. Our Gaussian process is trained on a set of 2,700 macroscopic pKas from monoprotic and select diprotic molecules. Here, we share our results for microscopic and macroscopic predictions in the SAMPL6 challenge. Overall, we ranked in the middle of the pack compared to other participants, but our fairly good agreement with experiment is still promising considering that the challenge molecules are chemically diverse and often polyprotic while our training set is predominantly monoprotic. Of particular importance to us when building this model was to include an uncertainty estimate, based on the chemistry of the molecule, that would reflect the likely accuracy of our prediction. Our model reports large uncertainties for molecules whose chemistry appears to lie outside its domain of applicability, along with good agreement in quantile-quantile plots, indicating that it can predict its own accuracy. The challenge highlighted a variety of ways to improve our model, including adding more polyprotic molecules to our training set and more carefully considering which functional groups we do or do not identify as ionizable.
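The regression stage of such a pipeline is straightforward to sketch with scikit-learn. The example below maps per-site descriptors to pKa with a Gaussian process and reports the per-prediction standard deviation that underlies the uncertainty estimates discussed above; the feature values, kernel choice, and training targets are placeholders, since the authors' descriptors come from the OpenEye Toolkits.

```python
# Hedged sketch of a GP pKa regressor with per-prediction uncertainty.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern, WhiteKernel

# Rows = ionizable sites, columns = physical/chemical descriptors of the
# deprotonation (hypothetical numbers for illustration only).
X_train = np.array([[0.12, 1.8, 3], [0.40, 2.5, 1], [0.05, 1.1, 2],
                    [0.33, 2.1, 1], [0.21, 1.6, 2]])
y_train = np.array([4.8, 9.4, 3.1, 8.7, 6.2])   # experimental pKa (made up)

kernel = ConstantKernel() * Matern(nu=2.5) + WhiteKernel()
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(X_train, y_train)

# return_std gives the chemistry-dependent uncertainty the abstract highlights:
# queries far from the training set come back with a large predictive sigma.
X_new = np.array([[0.25, 1.9, 2]])
mean, std = gp.predict(X_new, return_std=True)
print(f"predicted pKa = {mean[0]:.2f} +/- {std[0]:.2f}")
```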


Energies ◽  
2021 ◽  
Vol 14 (15) ◽  
pp. 4392
Author(s):  
Jia Zhou ◽  
Hany Abdel-Khalik ◽  
Paul Talbot ◽  
Cristian Rabiti

This manuscript develops a workflow, driven by data analytics algorithms, to support the optimization of the economic performance of an Integrated Energy System (IES). The goal is to determine the optimum mix of capacities from a set of different energy producers (e.g., nuclear, gas, wind, and solar). A stochastic optimizer based on Gaussian Process Modeling is employed, which requires numerous samples for its training. Each sample is a time series describing the demand, load, or other operational and economic profiles for the various types of energy producers. These samples are synthetically generated by a reduced order modeling algorithm that reads a limited set of historical data, such as demand and load data from past years. Numerous data analysis methods are employed to construct the reduced order models, including, for example, Auto Regressive Moving Average models, Fourier series decomposition, and a peak detection algorithm. All these algorithms are designed to detrend the data and extract features that can be employed to generate synthetic time histories preserving the statistical properties of the original, limited historical data. The optimization cost function is based on an economic model that assesses the effective cost of energy using two figures of merit: the specific cash flow stream for each energy producer and the total Net Present Value. An initial guess for the optimal capacities is obtained using the screening curve method. The results of the Gaussian Process model-based optimization are assessed against an exhaustive Monte Carlo search, which indicates that the optimization results are reasonable. The workflow has been implemented inside Idaho National Laboratory's Risk Analysis Virtual Environment (RAVEN) framework. The main contribution of this study is to address several challenges in current methods for optimizing energy portfolios in IES: first, the feasibility of generating synthetic time series for periodic peak data; second, the computational burden of conventional stochastic optimization of an energy portfolio, associated with the need for repeated executions of system models; third, the lack, in previous studies, of comparisons of the impact of the economic parameters. The proposed workflow provides a scientifically defensible strategy to support decision-making in the electricity market and to help energy distributors develop a better understanding of the performance of integrated energy systems.
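The synthetic-history step can be illustrated with generic tools: detrend a limited record with a Fourier basis, fit an ARMA model to the residual, and simulate new residual paths that preserve its statistics. The sketch below does this with NumPy and statsmodels on a made-up hourly demand series; it stands in for, but does not reproduce, the RAVEN implementation.

```python
# Minimal sketch: Fourier detrending + ARMA residual model for synthetic
# time-history generation (illustrative data and model orders).
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
t = np.arange(24 * 365)                      # one year of hourly demand (made up)
demand = 50 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, t.size)

# Detrend: least-squares fit of a daily Fourier pair (the "reduced order model").
basis = np.column_stack([np.ones_like(t, dtype=float),
                         np.sin(2 * np.pi * t / 24),
                         np.cos(2 * np.pi * t / 24)])
coef, *_ = np.linalg.lstsq(basis, demand, rcond=None)
residual = demand - basis @ coef

# Fit ARMA(2, 1) to the detrended residual, then simulate a synthetic residual
# path and re-apply the deterministic trend to get one training sample.
arma = ARIMA(residual, order=(2, 0, 1)).fit()
synthetic = basis @ coef + arma.simulate(nsimulations=t.size)
```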


Author(s):  
Daniel Blatter ◽  
Anandaroop Ray ◽  
Kerry Key

Summary Bayesian inversion of electromagnetic data produces crucial uncertainty information on inferred subsurface resistivity. Due to their high computational cost, however, Bayesian inverse methods have largely been restricted to computationally expedient 1D resistivity models. In this study, we successfully demonstrate, for the first time, a fully 2D, trans-dimensional Bayesian inversion of magnetotelluric data. We render this problem tractable from a computational standpoint by using a stochastic interpolation algorithm known as a Gaussian process to achieve a parsimonious parametrization of the model vis-à-vis the dense parameter grids used in numerical forward modeling codes. The Gaussian process links a trans-dimensional, parallel tempered Markov chain Monte Carlo sampler, which explores the parsimonious model space, to MARE2DEM, an adaptive finite element forward solver. MARE2DEM computes the model response using a dense parameter mesh with resistivity assigned via the Gaussian process model. We demonstrate the new trans-dimensional Gaussian process sampler by inverting both synthetic and field magnetotelluric data for 2D models of electrical resistivity, with the field data example converging within 10 days on 148 cores, a non-negligible but tractable computational cost. For the field data inversion, our algorithm achieves a parameter reduction of over 32× compared to the fixed parameter grid used for the MARE2DEM regularized inversion. Resistivity probability distributions computed from the ensemble of models produced by the inversion yield credible intervals and interquartile plots that quantitatively show the non-linear 2D uncertainty in model structure. This uncertainty could then be propagated to other physical properties that impact resistivity, including bulk composition, porosity, and pore-fluid content.
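The core idea of the parsimonious parametrization can be sketched in a few lines: a Gaussian process interpolates log-resistivity from a small, variable set of control points onto the dense mesh the forward solver needs. The toy example below uses scikit-learn for the interpolation; the coordinates and values are illustrative, and the trans-dimensional MCMC sampler and MARE2DEM coupling are omitted.

```python
# Simplified sketch of GP interpolation from sparse nuclei to a dense mesh
# (illustrative numbers; not the authors' sampler or forward-solver coupling).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Trans-dimensional state: a small set of (y, z) nuclei with log10-resistivity.
nuclei = np.array([[0.0, 0.5], [2.0, 1.5], [4.0, 3.0]])     # km
log_rho = np.array([1.0, 2.5, 0.5])                         # log10(ohm-m)

# Near-noise-free GP regression reduces to smooth interpolation between nuclei.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-10)
gp.fit(nuclei, log_rho)

# Evaluate on the dense mesh a finite element solver would actually use.
yy, zz = np.meshgrid(np.linspace(0, 5, 64), np.linspace(0, 4, 64))
mesh_log_rho = gp.predict(np.column_stack([yy.ravel(), zz.ravel()]))
# mesh_log_rho.reshape(64, 64) would then be handed to the forward solver.
```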


2011 ◽  
Vol 328-330 ◽  
pp. 524-529
Author(s):  
Jun Yan Ma ◽  
Xiao Ping Liao ◽  
Wei Xia ◽  
Xue Lian Yan

As a powerful modeling tool, the Gaussian process (GP) takes a Bayesian statistical approach and provides a highly nonlinear regression technique for general scientific and engineering tasks. The first step in constructing a Gaussian process model is to estimate the best values of the hyperparameters, which are then used in the second step, where a satisfactory nonlinear model is fitted. In this paper, a modified Wolfe line search approach for hyperparameter estimation, which maximizes the marginal likelihood using the conjugate gradient method, is proposed. We then analyze parameter correlations based on the estimated hyperparameter values in order to control warpage, a main defect of thin shell parts in injection molding.
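The first step can be made concrete with a short example: write the negative log marginal likelihood of a squared-exponential GP and hand it to a conjugate gradient optimizer (SciPy's CG routine itself enforces Wolfe conditions in its line search; the paper's modified line search is not reproduced here). The data and initial values below are illustrative.

```python
# Sketch: GP hyperparameter estimation by maximizing the marginal likelihood
# with a conjugate gradient optimizer (toy 1D data).
import numpy as np
from scipy.optimize import minimize

X = np.linspace(0, 1, 30)[:, None]
y = np.sin(6 * X[:, 0]) + 0.1 * np.random.default_rng(2).normal(size=30)

def neg_log_marginal_likelihood(log_hyp):
    """log_hyp = log of (signal variance, length scale, noise variance)."""
    sf2, ell, sn2 = np.exp(log_hyp)
    d2 = (X - X.T) ** 2                       # pairwise squared distances
    K = sf2 * np.exp(-0.5 * d2 / ell**2) + sn2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 y'K^{-1}y + 0.5 log|K| + (n/2) log 2*pi
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * len(X) * np.log(2 * np.pi)

res = minimize(neg_log_marginal_likelihood, x0=np.zeros(3), method="CG")
print("estimated hyperparameters:", np.exp(res.x))
```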


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 245-246
Author(s):  
Cláudio U Magnabosco ◽  
Fernando Lopes ◽  
Valentina Magnabosco ◽  
Raysildo Lobo ◽  
Leticia Pereira ◽  
...  

Abstract The aim of this study was to evaluate prediction methods, validation approaches, and pseudo-phenotypes for predicting the genomic breeding values of feed-efficiency-related traits in Nellore cattle. Phenotypic and genotypic information from 4,329 and 3,594 animals, respectively, was used; the animals were tested for residual feed intake (RFI), dry matter intake (DMI), feed efficiency (FE), feed conversion ratio (FCR), residual body weight gain (RG), and residual intake and body weight gain (RIG). Six prediction methods were used: ssGBLUP, BayesA, BayesB, BayesCπ, BLASSO, and BayesR. Three validation approaches were used: 1) random: the data was randomly divided into ten subsets and validation was performed in each subset in turn; 2) age: the division into the training (2010 to 2016) and validation (2017) populations was based on year of birth; 3) estimated breeding value (EBV) accuracy: animals with accuracy above 0.45 formed the training population and those below 0.45 the validation population. We assessed the accuracy and bias of the genomic breeding values (GEBV). The results showed that GEBV accuracy was highest when predictions were obtained with ssGBLUP (0.05 to 0.31) (Figure 1). The low heritabilities obtained, mainly for FE (0.07 ± 0.03) and FCR (0.09 ± 0.03), limited GEBV accuracy, which ranged from low to moderate. The regression coefficient estimates were close to 1 and similar across prediction methods, validation approaches, and pseudo-phenotypes. Cross-validation yielded the most accurate predictions, ranging from 0.07 to 0.037. Prediction accuracy was higher for phenotypes adjusted for fixed effects than for EBV and deregressed EBV (by 30.0% and 34.3%, respectively). Genomic prediction can provide reliable estimates of genomic breeding values for RFI, DMI, RG, and RIG, and these traits may even achieve higher genetic gain than FE and FCR.
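The validation bookkeeping translates naturally into code. The sketch below implements the age-based split and scores any prediction method by GEBV accuracy (correlation with the pseudo-phenotype scaled by the square root of heritability) and bias (regression slope, ideally near 1); the column names, heritability value, and toy data are illustrative, not the study's.

```python
# Schematic sketch of the age-based validation split and GEBV scoring.
import numpy as np
import pandas as pd

def validate(df, gebv_col="gebv", pheno_col="adj_pheno", h2=0.20):
    """Accuracy = cor(GEBV, pseudo-phenotype)/sqrt(h2); slope near 1 = unbiased."""
    valid = df[df["birth_year"] == 2017]            # "age" validation population
    r = np.corrcoef(valid[gebv_col], valid[pheno_col])[0, 1]
    slope = np.polyfit(valid[gebv_col], valid[pheno_col], 1)[0]
    return r / np.sqrt(h2), slope

# Toy stand-in for the phenotyped and genotyped animals.
rng = np.random.default_rng(3)
n = 500
df = pd.DataFrame({"birth_year": rng.integers(2010, 2018, n),
                   "gebv": rng.normal(size=n)})
df["adj_pheno"] = 0.4 * df["gebv"] + rng.normal(size=n)

accuracy, bias = validate(df)
print(f"accuracy = {accuracy:.2f}, bias (slope) = {bias:.2f}")
```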

