Supplementary material to "A Bayesian Ensemble Data Assimilation to Constrain Model Parameters and Land Use Carbon Emissions"

The disparity in the impact of COVID-19 on minority populations in the United States has been well established in the available data on deaths, case counts, and adverse outcomes. However, critical metrics used by public health officials and epidemiologists, such as a time-dependent viral reproductive number ($R_t$), can be hard to calculate from these data, especially for individual populations. Furthermore, disparities in the availability of testing, record-keeping infrastructure, or government funding in disadvantaged populations can produce incomplete data sets. In this work, we apply ensemble data assimilation techniques, which optimally combine model and data, to produce a more complete data set providing better estimates of the critical metrics used by public health officials and epidemiologists. We employ a multi-population SEIR (Susceptible, Exposed, Infected and Recovered) model with a time-dependent reproductive number and an age-stratified contact rate matrix for each population. We assimilate the daily death data for populations separated by ethnic/racial groupings using a technique called Ensemble Smoothing with Multiple Data Assimilation (ESMDA) to estimate model parameters and produce an $R_t(n)$ for the $n^{th}$ population. We do this with three distinct approaches: (1) using the same contact matrices and prior $R_t(n)$ for each population, (2) assigning contact matrices with increased contact rates for working-age and older adults to populations experiencing disparity, and (3) as in (2) but with a time-continuous update to $R_t(n)$. We make a study of 9 U.S. states and the District of Columbia, providing a complete time series of the pandemic in each and, in some cases, identifying disparities not otherwise evident in the aggregate statistics.
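
The ESMDA update named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation with a constant inflation coefficient equal to the number of assimilation steps and a diagonal observation error covariance. The function name `esmda`, its arguments, and the toy linear forward model are hypothetical and stand in for the SEIR model and death data; this is not the authors' implementation.

```python
import numpy as np

def esmda(forward, prior, d_obs, obs_var, n_steps=4, seed=0):
    """Sketch of an ESMDA parameter update (illustrative, not the paper's code).

    forward : maps a parameter vector (n_param,) to predicted data (n_obs,)
    prior   : prior parameter ensemble, shape (n_param, n_ens)
    d_obs   : observed data, shape (n_obs,)
    obs_var : observation error variance (assumed equal for all observations)
    """
    rng = np.random.default_rng(seed)
    m = prior.copy()
    n_ens = m.shape[1]
    alpha = float(n_steps)  # constant inflation, so the 1/alpha terms sum to 1
    c_d = obs_var * np.eye(d_obs.size)
    for _ in range(n_steps):
        # Run the forward model for every ensemble member.
        d_pred = np.column_stack([forward(m[:, j]) for j in range(n_ens)])
        dm = m - m.mean(axis=1, keepdims=True)
        dd = d_pred - d_pred.mean(axis=1, keepdims=True)
        c_md = dm @ dd.T / (n_ens - 1)  # parameter-data cross-covariance
        c_dd = dd @ dd.T / (n_ens - 1)  # predicted-data covariance
        # Perturb the observations with inflated observation noise.
        noise = rng.normal(size=d_pred.shape) * np.sqrt(alpha * obs_var)
        d_pert = d_obs[:, None] + noise
        m = m + c_md @ np.linalg.solve(c_dd + alpha * c_d, d_pert - d_pred)
    return m

# Toy usage: recover two parameters of a hypothetical linear model.
G = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
truth = np.array([2.0, -1.0])
prior = np.random.default_rng(1).normal(size=(2, 200))
posterior = esmda(lambda p: G @ p, prior, G @ truth, obs_var=0.01)
```

In a real application the forward model would be nonlinear and the same data would be assimilated repeatedly, which is where the inflated observation errors matter.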

FRONTLINE COMMUNITIES AND SARS-COV-2 - MULTI-POPULATION MODELING WITH AN ASSESSMENT OF DISPARITY BY RACE/ETHNICITY USING ENSEMBLE DATA ASSIMILATION

10.1101/2021.02.27.21252589 ◽ 2021

Author(s): Emmanuel Fleurantin ◽ Christian Sampson ◽ Daniel Paul Maes ◽ Justin Bennet ◽ Tayler Fernandez-Nunez ◽ ...

Keyword(s): Data Assimilation ◽ Global Economy ◽ Reproductive Number ◽ Model Parameters ◽ Modeling Framework ◽ Population Mixing ◽ Ensemble Data Assimilation ◽ Ensemble Data ◽ Contact Matrix ◽ Medical Infrastructure

Abstract. The COVID-19 pandemic has imposed many strenuous effects on the global economy, community, and medical infrastructure. Since the outbreak, researchers and policymakers have scrambled to develop ways to identify how COVID-19 will affect specific sub-populations so that good public health decisions can be made. To this end, we adapt the work of Evensen et al. [1], which introduces a SEIR model that incorporates an age-stratified contact matrix and a time-dependent effective reproduction number R, and uses ensemble data assimilation to estimate model parameters. The adaptation is an extension of Evensen’s modeling framework, in which we model sub-populations with varying risks of contracting SARS-CoV-2 (the virus that causes COVID-19) in a particular state, each with a characteristic age-stratified contact matrix. In this work, we focus on 9 U.S. states as well as the District of Columbia. We estimate the effective reproductive number as a function of time for our different sub-populations and then divide them into two groups: frontline communities (FLCs) and the complement (NFLCs). Our model accounts for mixing both within populations (intra-population mixing) and between populations (inter-population mixing). Our data is conditioned on the daily numbers of accumulated deaths for each sub-population. We aim to test and demonstrate methodologies that can be used to assess critical metrics of the pandemic’s evolution which are difficult to measure directly. The output may ultimately be of use to measure the successes or failures of the pandemic response and provide experts and policymakers with a tool to create better plans for a future outbreak or pandemic. We consider the results of this work to be a reanalysis of pandemic evolution across differently affected sub-populations, which may also be used to improve modeling and forecasts.
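
To make the role of the contact matrix and the time-dependent reproduction number concrete, here is a minimal sketch of a multi-group SEIR step. It is an illustrative simplification, not the Evensen et al. model: the function names, the forward-Euler scheme, the incubation and infectious periods, the row-normalized mixing matrix, and all numbers in the usage lines are assumptions.

```python
import numpy as np

def seir_step(state, r_t, contact, pop, t_inc=5.0, t_inf=5.0, dt=1.0):
    """Advance a multi-group SEIR state by one forward-Euler step.

    state   : array (4, n_groups) with S, E, I, R counts per group
    r_t     : effective reproduction number at this time
    contact : (n_groups, n_groups) mixing matrix, rows summing to 1 (assumed)
    pop     : group sizes, shape (n_groups,)
    """
    s, e, i, r = state
    # Force of infection felt by each group from infectious people in all groups.
    lam = (r_t / t_inf) * (contact @ (i / pop))
    new_e = lam * s * dt        # S -> E
    new_i = e / t_inc * dt      # E -> I
    new_r = i / t_inf * dt      # I -> R
    return np.array([s - new_e, e + new_e - new_i, i + new_i - new_r, r + new_r])

def simulate(state0, r_of_t, contact, pop, n_days):
    """Run the model for n_days and return the trajectory (n_days + 1, 4, n_groups)."""
    traj = [state0]
    for day in range(n_days):
        traj.append(seir_step(traj[-1], r_of_t(day), contact, pop))
    return np.array(traj)

# Hypothetical two-group example: off-diagonal entries give inter-population mixing.
pop = np.array([8.0e5, 2.0e5])
contact = np.array([[0.9, 0.1], [0.3, 0.7]])
infected0 = np.array([10.0, 10.0])
state0 = np.array([pop - infected0, np.zeros(2), infected0, np.zeros(2)])
traj = simulate(state0, lambda day: 2.5 if day < 30 else 0.8, contact, pop, 120)
```

In a data assimilation setting, quantities such as the reproduction number schedule would be treated as uncertain parameters and updated from the observed deaths.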

Methods to mitigate loss of variance due to sampling errors in ensemble data assimilation with non-local model parameters

Journal of Petroleum Science and Engineering ◽ 10.1016/j.petrol.2018.08.056 ◽ 2019 ◽ Vol 172 ◽ pp. 690-706 ◽ Cited By ~ 5

Author(s): Johann M. Lacerda ◽ Alexandre A. Emerick ◽ Adolfo P. Pires

Keyword(s): Data Assimilation ◽ Local Model ◽ Model Parameters ◽ Sampling Errors ◽ Ensemble Data Assimilation ◽ Ensemble Data ◽ Non Local

Initiation of ensemble data assimilation

Tellus A Dynamic Meteorology and Oceanography ◽ 10.3402/tellusa.v58i2.14766 ◽ 2006 ◽ Cited By ~ 1

Author(s): M. Zupanski ◽ S. J. Fletcher ◽ I. M. Navon ◽ B. Uzunoglu ◽ R. P. Heikes ◽ ...

Keyword(s): Data Assimilation ◽ Ensemble Data Assimilation ◽ Ensemble Data

Using a machine learning proxy for localization in ensemble data assimilation

Computational Geosciences ◽ 10.1007/s10596-020-10031-0 ◽ 2021 ◽ Vol 25 (3) ◽ pp. 931-944

Author(s): Johann M. Lacerda ◽ Alexandre A. Emerick ◽ Adolfo P. Pires

Keyword(s): Machine Learning ◽ Data Assimilation ◽ Ensemble Data Assimilation ◽ Ensemble Data

Maximum Likelihood Ensemble Filter: Theoretical Aspects

Monthly Weather Review ◽ 10.1175/mwr2946.1 ◽ 2005 ◽ Vol 133 (6) ◽ pp. 1710-1726 ◽ Cited By ~ 222

Author(s): Milija Zupanski

Keyword(s): Maximum Likelihood ◽ Data Assimilation ◽ Cost Function ◽ Ensemble Data Assimilation ◽ Error Covariance ◽ Ensemble Data ◽ Analysis Error ◽ Nonlinear Observation ◽ The Cost ◽ Maximum Likelihood Ensemble Filter

Abstract. A new ensemble-based data assimilation method, named the maximum likelihood ensemble filter (MLEF), is presented. The analysis solution maximizes the likelihood of the posterior probability distribution, obtained by minimization of a cost function that depends on a general nonlinear observation operator. The MLEF belongs to the class of deterministic ensemble filters, since no perturbed observations are employed. As in variational and ensemble data assimilation methods, the cost function is derived using a Gaussian probability density function framework. Like other ensemble data assimilation algorithms, the MLEF produces an estimate of the analysis uncertainty (e.g., analysis error covariance). In addition to the common use of ensembles in calculation of the forecast error covariance, the ensembles in MLEF are exploited to efficiently calculate the Hessian preconditioning and the gradient of the cost function. A sufficient number of iterative minimization steps is 2–3, because of superior Hessian preconditioning. The MLEF method is well suited for use with highly nonlinear observation operators, for a small additional computational cost of minimization. The consistent treatment of nonlinear observation operators through optimization is an advantage of the MLEF over other ensemble data assimilation algorithms. The cost of MLEF is comparable to the cost of existing ensemble Kalman filter algorithms. The method is directly applicable to most complex forecast models and observation operators. In this paper, the MLEF method is applied to data assimilation with the one-dimensional Korteweg–de Vries–Burgers equation. The tested observation operator is quadratic, in order to make the assimilation problem more challenging. The results illustrate the stability of the MLEF performance, as well as the benefit of the cost function minimization. The improvement is noted in terms of the rms error, as well as the analysis error covariance. The statistics of innovation vectors (observation minus forecast) also indicate a stable performance of the MLEF algorithm. Additional experiments suggest the amplified benefit of targeted observations in ensemble data assimilation.
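
For orientation, a cost function of the kind this abstract describes is usually written in the Gaussian form shared by variational and ensemble methods. The notation below is conventional and assumed here rather than taken from the abstract: $\mathbf{x}_b$ is the background (forecast) state, $\mathbf{P}_f$ the forecast error covariance, $\mathbf{y}$ the observations, $\mathbf{R}$ the observation error covariance, and $H$ the possibly nonlinear observation operator.

```latex
J(\mathbf{x}) =
  \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathrm{T}}\,\mathbf{P}_f^{-1}\,(\mathbf{x}-\mathbf{x}_b)
  + \tfrac{1}{2}\,\bigl[\mathbf{y}-H(\mathbf{x})\bigr]^{\mathrm{T}}\,\mathbf{R}^{-1}\,\bigl[\mathbf{y}-H(\mathbf{x})\bigr]
```

If $H$ is quadratic, for example $H(\mathbf{x}) = \mathbf{x}^2$ applied elementwise (one possible form, assumed for illustration), then $J$ is no longer quadratic in $\mathbf{x}$, so its minimum has to be found iteratively rather than in a single linear update.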

An adaptive covariance relaxation method for ensemble data assimilation

Quarterly Journal of the Royal Meteorological Society ◽ 10.1002/qj.2576 ◽ 2015 ◽ Vol 141 (692) ◽ pp. 2898-2906 ◽ Cited By ~ 24

Author(s): Yue Ying ◽ Fuqing Zhang

Keyword(s): Data Assimilation ◽ Relaxation Method ◽ Ensemble Data Assimilation ◽ Ensemble Data

Convection-Permitting Forecasts Initialized with Continuously Cycling Limited-Area 3DVAR, Ensemble Kalman Filter, and “Hybrid” Variational–Ensemble Data Assimilation Systems

Monthly Weather Review ◽ 10.1175/mwr-d-13-00100.1 ◽ 2014 ◽ Vol 142 (2) ◽ pp. 716-738 ◽ Cited By ~ 49

Author(s): Craig S. Schwartz ◽ Zhiquan Liu

Keyword(s): Kalman Filter ◽ Data Assimilation ◽ Three Dimensional ◽ Skill Score ◽ Limited Area ◽ Computational Domain ◽ Ensemble Data Assimilation ◽ Ensemble Data ◽ Precipitation Characteristics ◽ High Precipitation

Abstract Analyses with 20-km horizontal grid spacing were produced from parallel continuously cycling three-dimensional variational (3DVAR), ensemble square root Kalman filter (EnSRF), and “hybrid” variational–ensemble data assimilation (DA) systems between 0000 UTC 6 May and 0000 UTC 21 June 2011 over a domain spanning the contiguous United States. Beginning 9 May, the 0000 UTC analyses initialized 36-h Weather Research and Forecasting Model (WRF) forecasts containing a large convection-permitting 4-km nest. These 4-km 3DVAR-, EnSRF-, and hybrid-initialized forecasts were compared to benchmark WRF forecasts initialized by interpolating 0000 UTC Global Forecast System (GFS) analyses onto the computational domain. While important differences regarding mean state characteristics of the 20-km DA systems were noted, verification efforts focused on the 4-km precipitation forecasts. The 3DVAR-, hybrid-, and EnSRF-initialized 4-km precipitation forecasts performed similarly regarding general precipitation characteristics, such as timing of the diurnal cycle, and all three forecast sets had high precipitation biases at heavier rainfall rates. However, meaningful differences emerged regarding precipitation placement as quantified by the fractions skill score. For most forecast hours, the hybrid-initialized 4-km precipitation forecasts were better than the EnSRF-, 3DVAR-, and GFS-initialized forecasts, and the improvement was often statistically significant at the 95th percentile. These results demonstrate the potential of limited-area continuously cycling hybrid DA configurations and suggest additional hybrid development is warranted.
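
The fractions skill score used above to quantify precipitation placement can be sketched as follows. This follows the commonly used definition, FSS = 1 - MSE / MSE_ref, computed on neighborhood exceedance fractions. The function names, the square zero-padded neighborhood, and the toy fields are assumptions for illustration and do not reproduce the paper's verification setup.

```python
import numpy as np

def neighborhood_fraction(binary, n):
    """Fraction of exceedance points in the n x n window around each grid point.

    n is assumed odd; points outside the domain count as non-exceedance.
    """
    pad = n // 2
    padded = np.pad(binary.astype(float), pad)
    rows, cols = binary.shape
    total = np.zeros(binary.shape)
    # Sum the shifted copies of the field that make up the moving window.
    for di in range(n):
        for dj in range(n):
            total += padded[di:di + rows, dj:dj + cols]
    return total / (n * n)

def fractions_skill_score(forecast, observed, threshold, n):
    """FSS for one threshold and neighborhood width n (in grid points)."""
    pf = neighborhood_fraction(forecast >= threshold, n)
    po = neighborhood_fraction(observed >= threshold, n)
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)
    return 1.0 - mse / mse_ref if mse_ref > 0 else float("nan")

# Toy usage: a forecast rain cell displaced two grid points from the observed one.
observed = np.zeros((20, 20))
forecast = np.zeros((20, 20))
observed[10, 10] = 5.0
forecast[10, 12] = 5.0
fss_point = fractions_skill_score(forecast, observed, threshold=1.0, n=1)
fss_neigh = fractions_skill_score(forecast, observed, threshold=1.0, n=5)
```

A displaced feature scores poorly at grid scale but is credited once the neighborhood is wide enough to contain both the forecast and the observed cell, which is why the score is suited to convection-permitting forecasts.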

A Bayesian ensemble data assimilation to constrain model parameters and land-use carbon emissions

Biogeosciences ◽ 10.5194/bg-15-2909-2018 ◽ 2018 ◽ Vol 15 (9) ◽ pp. 2909-2930 ◽ Cited By ~ 18

Author(s): Sebastian Lienert ◽ Fortunat Joos

Keyword(s): Land Use ◽ Carbon Emissions ◽ Latin Hypercube Sampling ◽ Historical Period ◽ Model Parameters ◽ Dynamic Global Vegetation Model ◽ Order Of Magnitude ◽ The Difference ◽ Wood Harvest ◽ Global Vegetation

Abstract. A dynamic global vegetation model (DGVM) is applied in a probabilistic framework and benchmarking system to constrain uncertain model parameters by observations and to quantify carbon emissions from land-use and land-cover change (LULCC). Processes featured in DGVMs include parameters which are prone to substantial uncertainty. To cope with these uncertainties Latin hypercube sampling (LHS) is used to create a 1000-member perturbed parameter ensemble, which is then evaluated with a diverse set of global and spatiotemporally resolved observational constraints. We discuss the performance of the constrained ensemble and use it to formulate a new best-guess version of the model (LPX-Bern v1.4). The observationally constrained ensemble is used to investigate historical emissions due to LULCC (ELUC) and their sensitivity to model parametrization. We find a global ELUC estimate of 158 (108, 211) PgC (median and 90 % confidence interval) between 1800 and 2016. We compare ELUC to other estimates both globally and regionally. Spatial patterns are investigated and estimates of ELUC of the 10 countries with the largest contribution to the flux over the historical period are reported. We consider model versions with and without additional land-use processes (shifting cultivation and wood harvest) and find that the difference in global ELUC is on the same order of magnitude as parameter-induced uncertainty and in some cases could potentially even be offset with appropriate parameter choice.
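
The Latin hypercube sampling step described above can be sketched in a few lines. This is a generic illustration of the technique, not the LPX-Bern workflow: the function name, the three parameter ranges, and the uniform priors are hypothetical.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, seed=0):
    """Draw a Latin hypercube sample with uniform marginals.

    bounds : sequence of (low, high) pairs, one per parameter
    Returns an array of shape (n_samples, n_params).
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    n_dim = bounds.shape[0]
    # One jittered point inside each of n_samples equal-width strata per dimension.
    unit = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dim))) / n_samples
    # Shuffle the strata independently per dimension to decouple the parameters.
    for k in range(n_dim):
        unit[:, k] = rng.permutation(unit[:, k])
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])

# Hypothetical 1000-member perturbed parameter ensemble over three parameters.
param_bounds = [(0.1, 0.9), (10.0, 50.0), (1.0e-3, 1.0e-1)]
ensemble = latin_hypercube(1000, param_bounds)
```

Each parameter range is covered evenly by construction, so a 1000-member ensemble explores the parameter space more uniformly than the same number of independent random draws would.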
