EnKF and Hybrid Gain Ensemble Data Assimilation. Part I: EnKF Implementation

2015, Vol 143 (12), pp. 4847-4864
Author(s): Mats Hamrud, Massimo Bonavita, Lars Isaksen

Abstract The desire to do detailed comparisons between variational and more scalable ensemble-based data assimilation systems in a semioperational environment has led to the development of a state-of-the-art EnKF system at ECMWF. A broad description of the ECMWF EnKF is given in this paper, focusing on differences from standard EnKF practice. In particular, the novel algorithm used to control imbalances between the mass and wind fields in the EnKF analysis is discussed. The scalability and computational properties of the EnKF are reviewed, and the implementation choices adopted at ECMWF are described. The sensitivity of the ECMWF EnKF to ensemble size, horizontal resolution, and representation of model errors is also discussed. A comparison with 4DVar can be found in Part II of this two-part study.
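The ECMWF-specific choices (such as the imbalance-control algorithm) are not reproduced here, but the analysis step that every EnKF variant builds on is compact. A minimal, illustrative perturbed-observations EnKF update in Python/NumPy — a textbook sketch, not the ECMWF implementation:

```python
import numpy as np

def enkf_update(X, y, H, R, rng):
    """One stochastic (perturbed-observations) EnKF analysis step.

    X : (n, m) ensemble of m model states of dimension n
    y : (p,) observation vector
    H : (p, n) linear observation operator
    R : (p, p) observation-error covariance
    """
    n, m = X.shape
    A = X - X.mean(axis=1, keepdims=True)            # ensemble anomalies
    Pf = A @ A.T / (m - 1)                           # sample background covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)   # Kalman gain
    # perturb the observations so the analysis ensemble keeps the right spread
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=m).T
    return X + K @ (Y - H @ X)

# toy usage: a 3-variable state, 50 members, observing the first variable
rng = np.random.default_rng(0)
X = 5.0 + rng.normal(size=(3, 50))
y = np.array([0.0])
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.1]])
Xa = enkf_update(X, y, H, R, rng)
```

With a small observation error, the analysis mean of the observed variable is pulled well toward the observation.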

2021
Author(s): Hannes Helmut Bauser, Daniel Berg, Kurt Roth

Abstract. Data assimilation methods are used throughout the geosciences to combine information from uncertain models and uncertain measurement data. However, geophysical systems differ in character and can be distinguished as divergent or convergent. In divergent systems initially nearby states drift apart, while in convergent systems they coalesce. This difference has implications for the application of sequential ensemble data assimilation methods. This study explores these implications on two exemplary systems: the divergent Lorenz-96 model and the convergent description of soil water movement by the Richards equation. The results show that sequential ensemble data assimilation methods require a sufficiently strong divergent component, which makes the transfer of the methods from divergent to convergent systems challenging. We demonstrate through a set of case studies that it is imperative to represent model errors adequately and to incorporate parameter uncertainties in ensemble data assimilation for convergent systems.
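The divergent behaviour of Lorenz-96 is easy to demonstrate numerically: two twin trajectories started a tiny distance apart separate rapidly. A self-contained sketch (the convergent Richards-equation counterpart is not reproduced here):

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Lorenz-96: dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F, cyclic in i."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def integrate(x, dt=0.01, steps=500):
    """Fourth-order Runge-Kutta integration."""
    for _ in range(steps):
        k1 = l96_tendency(x)
        k2 = l96_tendency(x + 0.5 * dt * k1)
        k3 = l96_tendency(x + 0.5 * dt * k2)
        k4 = l96_tendency(x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x

# two initially nearby states drift apart (divergent behaviour)
x_a = np.full(40, 8.0)
x_a[0] += 0.01                 # kick the system off its unstable equilibrium
x_b = x_a.copy()
x_b[1] += 1e-3                 # tiny twin perturbation
sep0 = np.linalg.norm(x_a - x_b)
sep1 = np.linalg.norm(integrate(x_a) - integrate(x_b))
```

After five model time units the separation has grown by orders of magnitude, which is exactly the property that lets an ensemble spread sample forecast uncertainty in divergent systems.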


2017, Vol 146 (1), pp. 49-62
Author(s): Sam Hatfield, Aneesh Subramanian, Tim Palmer, Peter Düben

Abstract A new approach for improving the accuracy of data assimilation, by trading numerical precision for ensemble size, is introduced. Data assimilation is inherently uncertain because of the use of noisy observations and imperfect models. Thus, the larger rounding errors incurred from reducing precision may be within the tolerance of the system. Lower-precision arithmetic is cheaper, and so by reducing precision in ensemble data assimilation, computational resources can be redistributed toward, for example, a larger ensemble size. Because larger ensembles provide a better estimate of the underlying distribution and are less reliant on covariance inflation and localization, lowering precision could actually permit an improvement in the accuracy of weather forecasts. Here, this idea is tested on an ensemble data assimilation system comprising the Lorenz ’96 toy atmospheric model and the ensemble square root filter. The system is run at double, single, and half precision (the latter using an emulation tool), and the performance of each precision is measured through mean error statistics and rank histograms. The sensitivity of these results to the observation error and the length of the observation window is addressed. Then, by reinvesting the saved computational resources from reducing precision into the ensemble size, assimilation error can be reduced for (hypothetically) no extra cost. This results in increased forecasting skill relative to double-precision assimilation.
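The core intuition — rounding error is dwarfed by the other error sources in the system — can be checked directly. The sketch below emulates half precision by casting through `np.float16` (a generic stand-in, not the authors' emulation tool) and compares the resulting rounding error of an ensemble mean against the ensemble's own sampling error:

```python
import numpy as np

def at_precision(x, dtype):
    """Emulate reduced-precision storage by rounding through `dtype`."""
    return np.asarray(x).astype(dtype).astype(np.float64)

# compare ensemble-mean estimates at full and half precision against
# the ensemble's own sampling error
rng = np.random.default_rng(0)
truth = rng.normal(size=40)
ensemble = truth + 0.5 * rng.normal(size=(20, 40))   # 20 members, obs error 0.5

mean64 = ensemble.mean(axis=0)
mean16 = at_precision(ensemble, np.float16).mean(axis=0)

rounding_err = np.abs(mean16 - mean64).max()   # cost of half precision
sampling_err = np.abs(mean64 - truth).max()    # cost of a finite ensemble
```

The half-precision rounding error is orders of magnitude below the sampling error, which is why the saved resources can be reinvested in more members.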


2013, Vol 141 (6), pp. 1804-1821
Author(s): J. P. Hacker, W. M. Angevine

Abstract Experiments with the single-column implementation of the Weather Research and Forecasting Model provide a basis for deducing land–atmosphere coupling errors in the model. Coupling occurs both through heat and moisture fluxes across the land–atmosphere interface and roughness sublayer, and through turbulent heat, moisture, and momentum fluxes in the atmospheric surface layer. This work primarily addresses the turbulent fluxes, which are parameterized following Monin–Obukhov similarity theory applied to the atmospheric surface layer. By combining ensemble data assimilation and parameter estimation, the model error can be characterized. Ensemble data assimilation of 2-m temperature and water vapor mixing ratio, and 10-m wind components, forces the model to follow observations during a month-long simulation for a column over the well-instrumented Atmospheric Radiation Measurement (ARM) Central Facility near Lamont, Oklahoma. One-hour errors in predicted observations are systematically small but nonzero, and the systematic errors measure bias as a function of local time of day. Analysis increments for state elements nearby (15 m AGL) can be too small or have the wrong sign, indicating systematically biased covariances and model error. Experiments using the ensemble filter to objectively estimate a parameter controlling the thermal land–atmosphere coupling show that the parameter adapts to offset the model errors, but that the errors cannot be eliminated. Results suggest either structural errors or further parametric errors that may be difficult to estimate. Experiments omitting atypical observations such as soil and flux measurements lead to qualitatively similar deductions, showing the potential for assimilating common in situ observations as an inexpensive framework for deducing and isolating model errors.
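Estimating a coupling parameter with the ensemble filter is usually done by state augmentation: each member carries the parameter alongside the state, and the filter's state–parameter covariances update it. A generic sketch with a hypothetical scalar relaxation model (not the WRF single-column setup) in which the truth uses coupling strength c = 0.5 while the ensemble starts biased at c ≈ 0.2:

```python
import numpy as np

def enkf_step(Z, y, obs_err, rng):
    """Perturbed-observations EnKF update of an augmented [state; parameter] ensemble."""
    m = Z.shape[1]
    A = Z - Z.mean(axis=1, keepdims=True)
    P = A @ A.T / (m - 1)
    H = np.array([[1.0, 0.0]])                # only the state is observed
    R = np.array([[obs_err**2]])
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    Y = y + obs_err * rng.normal(size=m)      # perturbed observations
    return Z + K @ (Y - Z[0])[None, :]

rng = np.random.default_rng(0)
c_true, x_true, obs_err = 0.5, 10.0, 0.1
# each member carries [x; c]; c is a hypothetical coupling-strength parameter
Z = np.vstack([10.0 + rng.normal(size=50),
               0.2 + 0.2 * rng.normal(size=50)])
err0 = abs(Z[1].mean() - c_true)              # initial parameter error
for _ in range(30):
    x_true = x_true + 0.1 * c_true * (20.0 - x_true)   # truth: relax toward 20
    Z[0] = Z[0] + 0.1 * Z[1] * (20.0 - Z[0])           # member forecasts
    Z = enkf_step(Z, x_true + obs_err * rng.normal(), obs_err, rng)
err1 = abs(Z[1].mean() - c_true)              # parameter error after cycling
```

Members with too-small c lag the observed relaxation, so the innovation covaries with c and the filter pulls the parameter toward the true value — unless, as the paper finds for structural errors, the parameter instead absorbs biases it cannot represent.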


2016
Author(s): Emanuele Emili, Selime Gürol, Daniel Cariolle

Abstract. Model errors play a significant role in air-quality forecasts. Accounting for them in the data assimilation (DA) procedures is decisive to obtain improved forecasts. We address this issue using a reduced-order chemical transport model based on quasi-geostrophic dynamics and a detailed tropospheric chemistry mechanism, which we name QG-Chem. This model has been coupled to a generic software library for data assimilation and used to assess the potential of the 4DEnVar algorithm for air-quality analyses and forecasts. The assets of 4DEnVar include the possibility to deal with multivariate aspects of atmospheric chemistry and to account for model errors of a generic type. A simple diagnostic procedure for detecting model errors is proposed, based on the 4DEnVar analysis and one additional model forecast. A large number of idealized data assimilation experiments are shown for several chemical species of relevance for air-quality forecasts (O3, NOx, CO and CO2), with very different atmospheric lifetimes and chemical couplings. Experiments are done both under a perfect model hypothesis and including model error through perturbation of surface chemical emissions, for two meteorological and chemical regimes. Some key elements of the 4DEnVar algorithm such as the ensemble size and localization are also discussed. A comparison with results of 3D-Var, widely used in operational centers, shows that, for some species, analyses and next-day forecast errors can be halved when model error is taken into account. This result was obtained using a small ensemble size, which remains affordable for most operational centers. We conclude that 4DEnVar has a promising potential for operational air-quality models. We finally highlight areas that deserve further research for applying 4DEnVar to large-scale chemistry models, i.e., localization techniques, propagation of analysis covariance between DA cycles and treatment of chemical nonlinearities. QG-Chem provides a useful tool in this regard.


2015, Vol 143 (10), pp. 3931-3947
Author(s): Benjamin Ménétrier, Thomas Auligné

Abstract Localization and hybridization are two methods used in ensemble data assimilation to improve the accuracy of sample covariances. It is shown in this paper that it is beneficial to consider them jointly in the framework of linear filtering of sample covariances. Following previous work on localization, an objective method is provided to optimize both localization and hybridization coefficients simultaneously. Theoretical and experimental evidence shows that if optimal weights are used, localized-hybridized sample covariances are always more accurate than their localized-only counterparts, whatever the static covariance matrix specified for the hybridization. Experimental results obtained using a 1000-member ensemble as a reference show that the method developed in this paper can efficiently provide localization and hybridization coefficients consistent with the variable, vertical level, and ensemble size. Spatially heterogeneous optimization is shown to improve the accuracy of the filtered covariances, and consideration of both vertical and horizontal covariances is proven to have an impact on the hybridization coefficients.
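The "linear filtering" framing means the final covariance is a weighted combination of a Schur-localized sample covariance and a static matrix. A minimal sketch of that construction — with fixed placeholder weights, not the optimal coefficients the paper derives:

```python
import numpy as np

def localized_hybrid_cov(A, B_static, loc_length, beta_ens, beta_stat):
    """Localized-hybridized covariance:
    B = beta_ens * (L ∘ P_ens) + beta_stat * B_static,
    where ∘ is the Schur (elementwise) product and L a localization taper.
    beta_ens/beta_stat are placeholder weights, not the optimal ones."""
    n, m = A.shape
    P_ens = A @ A.T / (m - 1)                  # raw sample covariance
    d = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    L = np.exp(-0.5 * (d / loc_length) ** 2)   # Gaussian taper (illustrative)
    return beta_ens * L * P_ens + beta_stat * B_static

rng = np.random.default_rng(0)
n, m = 30, 10
A = rng.normal(size=(n, m))
A -= A.mean(axis=1, keepdims=True)             # centred ensemble anomalies
B_static = np.eye(n)                           # stand-in static covariance
B = localized_hybrid_cov(A, B_static, loc_length=3.0,
                         beta_ens=0.7, beta_stat=0.3)
```

Distant spurious sample covariances are damped to (almost) the static contribution alone, while the diagonal keeps a blend of ensemble variance and static variance.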


2012, Vol 140 (7), pp. 2346-2358
Author(s): Daniel Hodyss

Abstract A practical data assimilation algorithm is presented that explicitly accounts for skewness in the prior distribution. The algorithm operates as a global solve (all observations are considered at once) using a minimization-based approach and Schur–Hadamard (elementwise) localization. The central feature of this technique is the squaring of the innovation and the ensemble perturbations so as to create an extended state space that accounts for the second, third, and fourth moments of the prior distribution. This new technique is illustrated in a simple scalar system as well as in a Boussinesq model configured to simulate nonlinearly evolving shear instabilities (Kelvin–Helmholtz waves). It is shown that an ensemble size of at least 100 members is needed to adequately resolve the third and fourth moments required for the algorithm. For ensembles of this size it is shown that this new technique is superior to a state-of-the-art ensemble Kalman filter in situations with significant skewness; otherwise, the new algorithm reduces to the performance of the ensemble Kalman filter.
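The moment-carrying trick can be seen in isolation: augmenting each ensemble perturbation with its (recentred) square produces an extended state whose sample covariance contains the third moment in the cross block and fourth-moment information in the squared block. A sketch of just that augmentation (the full algorithm's minimization and localization are not reproduced):

```python
import numpy as np

def extended_anomalies(X):
    """Augment ensemble perturbations with their recentred elementwise squares.
    cov(anomaly, anomaly^2) equals the sample third central moment."""
    A = X - X.mean(axis=1, keepdims=True)
    A2 = A**2 - (A**2).mean(axis=1, keepdims=True)
    return np.vstack([A, A2])

rng = np.random.default_rng(0)
m = 5000
skewed = rng.exponential(size=(1, m))        # strongly skewed prior samples
symmetric = rng.normal(size=(1, m))          # symmetric prior samples

def cross_moment(X):
    E = extended_anomalies(X)
    return (E[0] * E[1]).sum() / (m - 1)     # cov(anomaly, anomaly^2)

c_skew = cross_moment(skewed)     # near the Exp(1) third central moment, 2
c_sym = cross_moment(symmetric)   # near zero for a symmetric prior
```

For the skewed prior the cross block is clearly nonzero, so an update in the extended space can exploit skewness; for the symmetric prior it vanishes, consistent with the algorithm reducing to EnKF-like behavior.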


2016, Vol 9 (11), pp. 3933-3959
Author(s): Emanuele Emili, Selime Gürol, Daniel Cariolle

Abstract. Model errors play a significant role in air quality forecasts. Accounting for them in the data assimilation (DA) procedures is decisive to obtain improved forecasts. We address this issue using a reduced-order coupled chemistry–meteorology model based on quasi-geostrophic dynamics and a detailed tropospheric chemistry mechanism, which we name QG-Chem. This model has been coupled to the software library for the data assimilation Object Oriented Prediction System (OOPS) and used to assess the potential of the 4DEnVar algorithm for air quality analyses and forecasts. The assets of 4DEnVar include the possibility to deal with multivariate aspects of atmospheric chemistry and to account for model errors of a generic type. A simple diagnostic procedure for detecting model errors is proposed, based on the 4DEnVar analysis and one additional model forecast. A large number of idealized data assimilation experiments are shown for several chemical species of relevance for air quality forecasts (O3, NOx, CO and CO2) with very different atmospheric lifetimes and chemical couplings. Experiments are done both under a perfect model hypothesis and including model error through perturbation of surface chemical emissions. Some key elements of the 4DEnVar algorithm such as the ensemble size and localization are also discussed. A comparison with results of 3D-Var, widely used in operational centers, shows that, for some species, analysis and next-day forecast errors can be halved when model error is taken into account. This result was obtained using a small ensemble size, which remains affordable for most operational centers. We conclude that 4DEnVar has a promising potential for operational air quality models. We finally highlight areas that deserve further research for applying 4DEnVar to large-scale chemistry models, i.e., localization techniques, propagation of analysis covariance between DA cycles and treatment for chemical nonlinearities. QG-Chem can provide a useful tool in this regard.
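The defining feature of EnVar methods is that the analysis is sought in the subspace spanned by the ensemble anomalies. A single-window, linear-observation sketch of that idea (an illustrative toy, not OOPS or QG-Chem code): minimize J(w) = (m-1)/2 |w|² + 1/2 (y - H(x_b + A w))ᵀ R⁻¹ (y - H(x_b + A w)) over ensemble-space weights w, which for linear H reduces to a small m × m linear solve.

```python
import numpy as np

def envar_analysis(xb, A, y, H, R):
    """EnVar sketch: analysis x = xb + A w, with w minimizing the
    ensemble-space cost function. For linear H the gradient condition is
    ((m-1) I + (HA)^T R^-1 (HA)) w = (HA)^T R^-1 (y - H xb)."""
    m = A.shape[1]
    HA = H @ A
    Ri = np.linalg.inv(R)
    lhs = (m - 1) * np.eye(m) + HA.T @ Ri @ HA
    w = np.linalg.solve(lhs, HA.T @ Ri @ (y - H @ xb))
    return xb + A @ w

rng = np.random.default_rng(0)
n, m = 20, 8
xb = np.full(n, 2.0)                   # biased background
A = rng.normal(size=(n, m))
A -= A.mean(axis=1, keepdims=True)     # centred ensemble anomalies
H = np.eye(5, n)                       # observe the first 5 variables
R = 0.1 * np.eye(5)
y = np.zeros(5)                        # observations sit at zero
xa = envar_analysis(xb, A, y, H, R)
```

Even with only eight members, the observed components are pulled strongly toward the observations; the unobserved ones move only through the anomaly-implied covariances, which is where localization (one of the open areas the abstract lists) becomes necessary at scale.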


2020, Vol 10 (24), pp. 9010
Author(s): Sujeong Lim, Myung-Seo Koo, In-Hyuk Kwon, Seon Ki Park

Ensemble data assimilation systems generally suffer from underestimated background error covariance, which leads to a filter divergence problem: the analysis drifts away from the true state because the observation influence is ignored when the estimate of model uncertainty is too small. To alleviate this problem, we have developed and implemented stochastically perturbed hybrid physical–dynamical tendencies in the local ensemble transform Kalman filter of a global numerical weather prediction model, the Korean Integrated Model (KIM). This approach accounts for the model errors associated with the computational representation of the underlying partial differential equations and with imperfect physical parameterizations. The new stochastic perturbation hybrid tendencies scheme generally improved the background error covariances in regions where the ensemble spread was not sufficiently expressed by the control experiment, which used additive inflation and relaxation to prior spread.
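Tendency-perturbation schemes of this family multiply each member's model tendency by one plus a smoothly evolving random pattern, so that identical deterministic tendencies acquire member-to-member spread. A generic SPPT-style sketch with an AR(1) pattern in time — an illustration of the idea, not the KIM hybrid physical–dynamical scheme itself:

```python
import numpy as np

def sppt_like(tendency, phi, sigma, r_prev, rng):
    """Multiplicative tendency perturbation with an AR(1) random pattern:
    r_k = phi * r_{k-1} + sigma * sqrt(1 - phi^2) * eta,   T' = (1 + r_k) * T."""
    r = phi * r_prev + sigma * np.sqrt(1.0 - phi**2) * rng.normal(size=r_prev.shape)
    return (1.0 + r) * tendency, r

rng = np.random.default_rng(0)
members, npts = 20, 100
tendency = np.ones((members, npts))     # identical deterministic tendency
r = np.zeros((members, npts))
for _ in range(10):                     # spin the AR(1) pattern up
    perturbed, r = sppt_like(tendency, phi=0.9, sigma=0.3, r_prev=r, rng=rng)
spread = perturbed.std(axis=0).mean()   # member-to-member spread created
```

The perturbation is mean-zero, so the ensemble-mean tendency is (statistically) unchanged while the spread, and hence the background error covariance, is enlarged where the scheme acts.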


2005, Vol 133 (11), pp. 3132-3147
Author(s): Thomas M. Hamill, Jeffrey S. Whitaker

Abstract Insufficient model resolution is one source of model error in numerical weather predictions. Methods for parameterizing this error in ensemble data assimilations are explored here. Experiments were conducted with a two-layer primitive equation model, where the assumed true state was a T127 forecast simulation. Ensemble data assimilations were performed with the same model at T31 resolution, assimilating imperfect observations drawn from the T127 forecast. By design, the magnitude of errors due to model truncation was much larger than the error growth due to initial condition uncertainty, making this a stringent test of the ability of an ensemble-based data assimilation to deal with model error. Two general methods, “covariance inflation” and “additive error,” were considered for parameterizing the model error at the resolved scales (T31 and larger) due to interaction with the unresolved scales (T32 to T127). Covariance inflation expanded the background forecast members’ deviations about the ensemble mean, while additive error added specially structured noise to each ensemble member forecast before the update step. The method of parameterizing this model error had a substantial effect on the accuracy of the ensemble data assimilation. Covariance inflation produced ensembles with analysis errors that were no lower than the analysis errors from three-dimensional variational (3D-Var) assimilation, and for the method to avoid filter divergence, the assimilations had to be periodically reseeded. Covariance inflation uniformly expanded the model spread; however, the actual growth of model errors depended on the dynamics, growing proportionally more in the midlatitudes. The inappropriately uniform inflation progressively degraded the capacity of the ensemble to span the actual forecast error. The most accurate approach was an additive model-error parameterization, which reduced the error difference between 3D-Var and a near-perfect assimilation system by ∼40%. In the lowest-error simulations, additive errors were parameterized using samples of model error from a time series of differences between T63 and T31 forecasts. Scaled samples of differences between model forecast states separated by 24 h were also tested as additive error parameterizations, as well as scaled samples of the T31 model state’s anomaly from the T31 model climatology. The latter two methods produced analyses that were progressively less accurate. The decrease in accuracy was likely due to their inappropriately long spatial correlation length scales.
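The two parameterizations compared above operate on the ensemble in simple, contrasting ways: inflation rescales deviations about the mean, while additive error draws structured samples and adds them to each member. A minimal sketch of both, using a random stand-in error bank (the paper's best version draws its samples from T63 minus T31 forecast differences, which are not available here):

```python
import numpy as np

def covariance_inflation(X, r):
    """Expand each member's deviation about the ensemble mean by a factor r."""
    Xm = X.mean(axis=1, keepdims=True)
    return Xm + r * (X - Xm)

def additive_error(X, error_bank, rng):
    """Add randomly drawn model-error samples to each member
    before the update step."""
    m = X.shape[1]
    idx = rng.integers(0, error_bank.shape[1], size=m)
    return X + error_bank[:, idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 25))              # 25-member background ensemble
Xi = covariance_inflation(X, r=1.12)       # inflation: mean preserved, spread x1.12
bank = 0.5 * rng.normal(size=(10, 200))    # stand-in bank of model-error samples
Xa = additive_error(X, bank, rng)          # additive error: spread enlarged
```

The structural difference matters: inflation can only magnify the directions the ensemble already spans, whereas additive error can inject new error structures — consistent with the paper finding the additive approach more accurate.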


2018
Author(s): Jens Dörpinghaus, Sebastian Schaaf, Marc Jacobs

Abstract Document clustering is widely used in science for data retrieval and organisation. DocClustering is developed to include and use a novel algorithm called PS-Document Clustering, first introduced in 2017. This method combines approaches from graph theory with state-of-the-art NLP technologies. The new heuristic has been shown to be superior to conventional algorithms and provides, given a suitable similarity measure, a more accurate clustering of biological and medical data. Since the application is written for research on biomedical literature, interfaces for PubMed and SCAIView are available. This brief report provides the source code as well as a short overview of the new features, novel heuristics and approaches. The software can be obtained from the authors or downloaded directly from GitHub, see https://github.com/jd-s/DocClustering.

