Coupling the K-nearest neighbors and locally weighted linear regression with ensemble Kalman filter for data-driven data assimilation

2021 ◽  
Vol 13 (1) ◽  
pp. 1395-1413
Author(s):  
Manhong Fan ◽  
Yulong Bai ◽  
Lili Wang ◽  
Lihong Tang ◽  
Lin Ding

Abstract Machine learning-based data-driven methods are increasingly being used to extract structures and essences from the ever-increasing pool of geoscience-related big data, which are often used in relation to the atmosphere, oceans, and land surfaces. This study focuses on applying a data-driven forecast model to the classical ensemble Kalman filter process to reconstruct, analyze, and elucidate the model. In this study, a nonparametric sampler from a catalog of historical datasets, namely, a nearest neighbor or analog sampler, is given by numerical simulations. Based on this catalog (sampler), the dynamics physics model is reconstructed using the K-nearest neighbors algorithm. The optimal values of the surrogate model are found, and the forecast step is performed using locally weighted linear regression. Several numerical experiments carried out using the Lorenz-63 and Lorenz-96 models demonstrate that the proposed approach performs as well as the ensemble Kalman filter for larger catalog sizes. This approach is restricted to the ensemble Kalman filter form. However, the basic strategy is not restricted to any particular version of the Kalman filter. It is found that this combined approach can outperform the generally used sequential data assimilation approach when the size of the catalog is substantially large.
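The forecast step described in this abstract (K-nearest neighbors in a catalog, then locally weighted linear regression on those neighbors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the state is a scalar rather than a Lorenz-63 or Lorenz-96 vector, the function name `analog_forecast`, the Gaussian kernel, its bandwidth rule, and the logistic-map catalog are all hypothetical choices made for brevity.

```python
import math


def analog_forecast(x, catalog, k=10):
    """One-step forecast of a scalar state x from a catalog of
    (analog, successor) pairs: K-NN search, then a locally weighted
    linear regression fitted on the K neighbors."""
    # Step 1: the K catalog entries whose analog lies closest to x.
    neighbors = sorted(catalog, key=lambda pair: abs(pair[0] - x))[:k]
    dists = [abs(a - x) for a, _ in neighbors]

    # Gaussian kernel weights. Using the largest neighbor distance as
    # bandwidth is an illustrative assumption, not taken from the paper.
    bandwidth = max(dists) or 1.0
    weights = [math.exp(-((d / bandwidth) ** 2)) for d in dists]

    # Step 2: weighted least-squares fit, successor ~ intercept + slope * analog.
    total = sum(weights)
    mean_a = sum(w * a for w, (a, _) in zip(weights, neighbors)) / total
    mean_s = sum(w * s for w, (_, s) in zip(weights, neighbors)) / total
    var_a = sum(w * (a - mean_a) ** 2 for w, (a, _) in zip(weights, neighbors))
    if var_a < 1e-15:
        # All neighbors coincide: fall back to a locally constant forecast.
        return mean_s
    cov_as = sum(
        w * (a - mean_a) * (s - mean_s) for w, (a, s) in zip(weights, neighbors)
    )
    slope = cov_as / var_a
    return mean_s + slope * (x - mean_a)


# Toy catalog: a logistic-map trajectory stands in for the Lorenz simulations.
trajectory = [0.2]
for _ in range(2000):
    trajectory.append(3.9 * trajectory[-1] * (1.0 - trajectory[-1]))
toy_catalog = list(zip(trajectory[:-1], trajectory[1:]))

# Compare the analog forecast at x = 0.3 with the true map value.
print(analog_forecast(0.3, toy_catalog), 3.9 * 0.3 * 0.7)
```

In a data-driven ensemble Kalman filter, a function of this kind would replace the numerical model when propagating each ensemble member between analysis times. The abstract's observation that accuracy improves with catalog size is consistent with the sketch: a larger catalog gives closer neighbors, so the local linear fit has less curvature to absorb.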

2014 ◽  
Vol 2 (5) ◽  
pp. 3289-3349 ◽  
Author(s):  
M. C. Rochoux ◽  
S. Ricci ◽  
D. Lucor ◽  
B. Cuenot ◽  
A. Trouvé

Abstract. This paper is the first part in a series of two articles and presents a data-driven wildfire simulator for forecasting wildfire spread scenarios, at a reduced computational cost that is consistent with operational systems. The prototype simulator features the following components: a level-set-based fire propagation solver FIREFLY that adopts a regional-scale modeling viewpoint, treats wildfires as surface propagating fronts, and uses a description of the local rate of fire spread (ROS) as a function of environmental conditions based on Rothermel's model; a series of airborne-like observations of the fire front positions; and a data assimilation algorithm based on an ensemble Kalman filter (EnKF) for parameter estimation. This stochastic algorithm partly accounts for the non-linearities between the input parameters of the semi-empirical ROS model and the fire front position, and is sequentially applied to provide a spatially-uniform correction to wind and biomass fuel parameters as observations become available. A wildfire spread simulator combined with an ensemble-based data assimilation algorithm is therefore a promising approach to reduce uncertainties in the forecast position of the fire front and to introduce a paradigm-shift in the wildfire emergency response. In order to reduce the computational cost of the EnKF algorithm, a surrogate model based on a polynomial chaos (PC) expansion is used in place of the forward model FIREFLY in the resulting hybrid PC-EnKF algorithm. The performance of EnKF and PC-EnKF is assessed on synthetically-generated simple configurations of fire spread to provide valuable information and insight on the benefits of the PC-EnKF approach as well as on a controlled grassland fire experiment. The results indicate that the proposed PC-EnKF algorithm features similar performance to the standard EnKF algorithm, but at a much reduced computational cost. 
In particular, the re-analysis and forecast skills of data assimilation strongly relate to the spatial and temporal variability of the errors in the ROS model parameters.
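The EnKF-based parameter estimation summarized in this abstract can be illustrated with a scalar sketch. Everything named here is an assumption for illustration: `front_position` is a hypothetical linear stand-in for the forward model (it is not FIREFLY or a polynomial chaos surrogate), a single wind-like parameter is corrected instead of several wind and fuel parameters, and the perturbed-observation form of the filter is one common variant that the abstract does not specify.

```python
import random


def enkf_parameter_update(thetas, forward, obs, obs_var, rng):
    """Stochastic (perturbed-observation) EnKF update of a scalar
    parameter ensemble, given one scalar observation."""
    n = len(thetas)
    # Push every parameter sample through the forward model.
    preds = [forward(t) for t in thetas]
    mean_t = sum(thetas) / n
    mean_y = sum(preds) / n

    # Sample cross-covariance (parameter, prediction) and prediction variance.
    cov_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(thetas, preds)) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in preds) / (n - 1)

    # Scalar Kalman gain: how far the parameter moves per unit of innovation.
    gain = cov_ty / (var_y + obs_var)

    # Each member is nudged towards its own noisy copy of the observation.
    sigma = obs_var ** 0.5
    return [
        t + gain * (obs + rng.gauss(0.0, sigma) - y) for t, y in zip(thetas, preds)
    ]


def front_position(wind):
    """Hypothetical forward model: front position grows linearly with wind."""
    return 2.0 * wind


rng = random.Random(0)
prior = [rng.gauss(1.0, 1.0) for _ in range(200)]  # biased prior on the parameter
posterior = enkf_parameter_update(prior, front_position, obs=6.0, obs_var=0.01, rng=rng)
print(sum(prior) / len(prior), sum(posterior) / len(posterior))
```

Because the gain is built from ensemble statistics, the only cost that grows with model complexity is the loop of `forward` calls. That loop is the part a cheap surrogate would replace in a hybrid scheme such as the PC-EnKF described above.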


2005 ◽  
Vol 133 (11) ◽  
pp. 3081-3094 ◽  
Author(s):  
A. Caya ◽  
J. Sun ◽  
C. Snyder

Abstract A four-dimensional variational data assimilation (4DVAR) algorithm is compared to an ensemble Kalman filter (EnKF) for the assimilation of radar data at the convective scale. Using a cloud-resolving model, simulated, imperfect radar observations of a supercell storm are assimilated under the assumption of a perfect forecast model. Overall, both assimilation schemes perform well and are able to recover the supercell with comparable accuracy, given radial-velocity and reflectivity observations where rain was present. 4DVAR produces generally better analyses than the EnKF given observations limited to a period of 10 min (or three volume scans), particularly for the wind components. In contrast, the EnKF typically produces better analyses than 4DVAR after several assimilation cycles, especially for model variables not functionally related to the observations. The advantages of the EnKF in later cycles arise at least in part from the fact that the 4DVAR scheme implemented here does not use a forecast from a previous cycle as background or evolve its error covariance. Possible reasons for the initial advantage of 4DVAR are deficiencies in the initial ensemble used by the EnKF, the temporal smoothness constraint used in 4DVAR, and nonlinearities in the evolution of forecast errors over the assimilation window.


2010 ◽  
Vol 138 (5) ◽  
pp. 1550-1566 ◽  
Author(s):  
Mark Buehner ◽  
P. L. Houtekamer ◽  
Cecilien Charette ◽  
Herschel L. Mitchell ◽  
Bin He

Abstract An intercomparison of the Environment Canada variational and ensemble Kalman filter (EnKF) data assimilation systems is presented in the context of global deterministic NWP. In an EnKF experiment having the same spatial resolution as the inner loop in the four-dimensional variational data assimilation system (4D-Var), the mean of each analysis ensemble is used to initialize the higher-resolution deterministic forecasts. Five different variational data assimilation experiments are also conducted. These include both 4D-Var and 3D-Var (with first guess at appropriate time) experiments using either (i) prescribed background-error covariances similar to those used operationally, which are static in time and include horizontally homogeneous and isotropic correlations; or (ii) flow-dependent covariances computed from the EnKF background ensembles with spatial covariance localization applied. The fifth variational data assimilation experiment is a new approach called the Ensemble-4D-Var (En-4D-Var). This approach uses 4D flow-dependent background-error covariances estimated from EnKF ensembles to produce a 4D analysis without the need for tangent-linear or adjoint versions of the forecast model. In this first part of a two-part paper, results from a series of idealized assimilation experiments are presented. In these experiments, only a single observation or vertical profile of observations is assimilated to explore the impact of various fundamental differences among the EnKF and the various variational data assimilation approaches considered. In particular, differences in the application of covariance localization in the EnKF and variational approaches are shown to have a significant impact on the assimilation of satellite radiance observations. The results also demonstrate that 4D-Var and the EnKF can both produce similar 4D background-error covariances within a 6-h assimilation window. 
In the second part, results from medium-range deterministic forecasts for the study period of February 2007 are presented for the EnKF and the five variational data assimilation approaches considered.


2007 ◽  
Vol 135 (10) ◽  
pp. 3484-3495 ◽  
Author(s):  
Brian J. Etherton

Abstract An ensemble Kalman filter (EnKF) estimates the error statistics of a model forecast using an ensemble of model forecasts. One use of an EnKF is data assimilation, resulting in the creation of an increment to the first-guess field at the observation time. Another use of an EnKF is to propagate error statistics of a model forecast forward in time, such as is done for optimizing the location of adaptive observations. Combining these two uses of an ensemble Kalman filter, a “preemptive forecast” can be generated. In a preemptive forecast, the increment to the first-guess field is, using ensembles, propagated to some future time and added to the future control forecast, resulting in a new forecast. This new forecast requires no more time to produce than the time needed to run a data assimilation scheme, as no model integration is necessary. In an observing system simulation experiment (OSSE), a barotropic vorticity model was run to produce a 300-day “nature run.” The same model, run with a different vorticity forcing scheme, served as the forecast model. The model produced 24- and 48-h forecasts for each of the 300 days. The model was initialized every 24 h by assimilating observations of the nature run using a hybrid ensemble Kalman filter–three-dimensional variational data assimilation (3DVAR) scheme. In addition to the control forecast, a 64-member forecast ensemble was generated for each of the 300 days. Every 24 h, given a set of observations, the 64-member ensemble, and the control run, an EnKF was used to create 24-h preemptive forecasts. The preemptive forecasts were more accurate than the unmodified, original 48-h forecasts, though not quite as accurate as the 24-h forecast obtained from a new model integration initialized by assimilating the same observations as were used in the preemptive forecasts. 
The accuracy of the preemptive forecasts improved significantly when 1) the ensemble-based error statistics used by the EnKF were localized using a Schur product and 2) a model error term was included in the background error covariance matrices.
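Localizing ensemble-based error statistics with a Schur product, as mentioned in this abstract, means multiplying the sample covariance element by element with a distance-dependent correlation matrix. The sketch below assumes a one-dimensional grid and a Gaussian taper. Both are illustrative choices, since the abstract does not name the correlation function (a compactly supported function such as Gaspari-Cohn is a common choice in the literature), and the helper names are hypothetical.

```python
import math
import random


def sample_covariance(ensemble):
    """Sample covariance of an ensemble given as a list of state vectors."""
    m, n = len(ensemble), len(ensemble[0])
    means = [sum(member[i] for member in ensemble) / m for i in range(n)]
    return [
        [
            sum((mem[i] - means[i]) * (mem[j] - means[j]) for mem in ensemble) / (m - 1)
            for j in range(n)
        ]
        for i in range(n)
    ]


def gaussian_taper(n, length_scale):
    """Correlation matrix that decays with grid distance |i - j| (assumed form)."""
    return [
        [math.exp(-0.5 * ((i - j) / length_scale) ** 2) for j in range(n)]
        for i in range(n)
    ]


def schur_localize(cov, taper):
    """Schur (element-wise) product of a covariance matrix and a taper."""
    return [
        [c * t for c, t in zip(cov_row, taper_row)]
        for cov_row, taper_row in zip(cov, taper)
    ]


# A small ensemble of unrelated noise: any long-range covariance is spurious.
rng = random.Random(0)
ensemble = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(5)]
raw = sample_covariance(ensemble)
localized = schur_localize(raw, gaussian_taper(6, length_scale=1.0))
print(raw[0][5], localized[0][5])
```

The variances on the diagonal are left unchanged because the taper equals one there, while covariances between distant grid points are damped towards zero. This suppresses the sampling noise that a small ensemble introduces at long range.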


2010 ◽  
Vol 138 (5) ◽  
pp. 1567-1586 ◽  
Author(s):  
Mark Buehner ◽  
P. L. Houtekamer ◽  
Cecilien Charette ◽  
Herschel L. Mitchell ◽  
Bin He

Abstract An intercomparison of the Environment Canada variational and ensemble Kalman filter (EnKF) data assimilation systems is presented in the context of producing global deterministic numerical weather forecasts. Five different variational data assimilation approaches are considered including four-dimensional variational data assimilation (4D-Var) and three-dimensional variational data assimilation (3D-Var) with first guess at the appropriate time (3D-FGAT). Also included among these is a new approach, called Ensemble-4D-Var (En-4D-Var), that uses 4D ensemble background-error covariances from the EnKF. A description of the experimental configurations and results from single-observation experiments are presented in the first part of this two-part paper. The present paper focuses on results from medium-range deterministic forecasts initialized with analyses from the EnKF and the five variational data assimilation approaches for the period of February 2007. All experiments assimilate exactly the same full set of meteorological observations and use the same configuration of the forecast model to produce global deterministic medium-range forecasts. The quality of forecasts in the short (medium) range obtained by using the EnKF ensemble mean analysis is slightly degraded (improved) in the extratropics relative to using the 4D-Var analysis with background-error covariances similar to those used operationally. The use of the EnKF flow-dependent error covariances in the variational system (4D-Var or 3D-FGAT) leads to large (modest) forecast improvements in the southern extratropics (tropics) as compared with using covariances similar to the operational system (a gain of up to 9 h at day 5). 
The En-4D-Var approach leads to (i) either improved or similar forecast quality when compared with the 4D-Var experiment similar to the currently operational system, (ii) slightly worse forecast quality when compared with the 4D-Var experiment with EnKF error covariances, and (iii) generally similar forecast quality when compared with the EnKF experiment.


Author(s):  
Nicolas Papadakis ◽  
Etienne Mémin ◽  
Anne Cuzol ◽  
Nicolas Gengembre

2016 ◽  
Vol 66 (8) ◽  
pp. 955-971 ◽  
Author(s):  
Stéphanie Ponsar ◽  
Patrick Luyten ◽  
Valérie Dulière
