Deep Data Assimilation: Integrating Deep Learning with Data Assimilation

2021 ◽  
Vol 11 (3) ◽  
pp. 1114
Author(s):  
Rossella Arcucci ◽  
Jiangcheng Zhu ◽  
Shuang Hu ◽  
Yi-Ke Guo

In this paper, we propose Deep Data Assimilation (DDA), an integration of Data Assimilation (DA) with Machine Learning (ML). DA is the Bayesian approximation of the true state of a physical system at a given time, obtained by optimally combining time-distributed observations with a dynamic model. We use an ML model to learn the assimilation process. Specifically, a recurrent neural network is trained on the state of the dynamical system and the results of the DA process. At each iteration, we learn a function that accumulates the misfit between the results of the forecasting model and the results of the DA. We then compose this function with the dynamic model. The resulting composition is a dynamic model that includes the features of the DA process and can be used for future prediction without the need for DA. We prove that the DDA approach reduces the model error, which decreases at each iteration; this is achieved through the use of DA in the training process. DDA is particularly useful in cases where observations are not available for some time steps and DA cannot be applied to reduce the model error. The effectiveness of the method is validated by examples and a sensitivity study. DDA is applied to two test cases: the double integral mass dot system and the Lorenz system. However, the algorithm and numerical methods proposed in this work can be applied to other physics problems that involve other equations and/or state variables.
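As an illustration only, the composition step described in this abstract can be sketched on a toy problem. This is not the authors' implementation: the recurrent neural network is replaced by a linear least-squares fit, the dynamical system is a hypothetical two-dimensional rotation, the DA step uses a fixed scalar gain, and only one training iteration is performed. All names (`M_true`, `M_model`, `gain`, `G`, `M_dda`) and numerical values are assumptions made for the sketch.

```python
import numpy as np

def rotation(theta, scale=1.0):
    """2x2 rotation matrix, optionally damped by `scale`."""
    c, s = np.cos(theta), np.sin(theta)
    return scale * np.array([[c, -s], [s, c]])

rng = np.random.default_rng(0)
M_true = rotation(0.30)               # hypothetical true dynamics
M_model = rotation(0.25, scale=0.97)  # imperfect forecasting model

sigma_b, sigma_o = 0.04, 0.02         # assumed background / observation std
gain = sigma_b**2 / (sigma_b**2 + sigma_o**2)  # scalar gain, H = I

x_true = np.array([1.0, 0.0])
x_a = x_true.copy()
forecasts, analyses = [], []
for _ in range(500):
    x_true = M_true @ x_true
    y = x_true + sigma_o * rng.standard_normal(2)  # noisy observation
    x_f = M_model @ x_a                            # model forecast
    x_a = x_f + gain * (y - x_f)                   # DA analysis
    forecasts.append(x_f)
    analyses.append(x_a)

F = np.array(forecasts)  # shape (500, 2)
A = np.array(analyses)   # shape (500, 2)

# Learn a linear map G with analysis ≈ G @ forecast
# (a stand-in for the neural network used in the paper).
G_T, *_ = np.linalg.lstsq(F, A, rcond=None)
G = G_T.T

# Compose the learned map with the forecasting model.
M_dda = G @ M_model

err_model = np.linalg.norm(M_model - M_true)
err_dda = np.linalg.norm(M_dda - M_true)
print(f"model error before: {err_model:.4f}, after composition: {err_dda:.4f}")
```

The composed operator `M_dda` can then be stepped forward without any observations, which is the situation the abstract highlights.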

2016 ◽  
Vol 145 (1) ◽  
pp. 97-116 ◽  
Author(s):  
Douglas R. Allen ◽  
Craig H. Bishop ◽  
Sergey Frolov ◽  
Karl W. Hoppel ◽  
David D. Kuhl ◽  
...  

Abstract An ensemble-based tangent linear model (TLM) is described and tested in data assimilation experiments using a global shallow-water model (SWM). A hybrid variational data assimilation system was developed with a 4D variational (4DVAR) solver that could be run either with a conventional TLM or a local ensemble TLM (LETLM) that propagates analysis corrections using only ensemble statistics. An offline ensemble Kalman filter (EnKF) is used to generate and maintain the ensemble. The LETLM uses data within a local influence volume, similar to the local ensemble transform Kalman filter, to linearly propagate the state variables at the central grid point. After tuning the LETLM with offline 6-h forecasts of analysis corrections, cycling experiments were performed that assimilated randomly located SWM height observations, based on a truth run with forced bottom topography. The performance using the LETLM is similar to that of the conventional TLM, suggesting that a well-constructed LETLM could free 4D variational methods from dependence on conventional TLMs. This is a first demonstration of the LETLM application within a context of a hybrid-4DVAR system applied to a complex two-dimensional fluid dynamics problem. Sensitivity tests are included that examine LETLM dependence on several factors including length of cycling window, size of analysis correction, spread of initial ensemble perturbations, ensemble size, and model error. LETLM errors are shown to increase linearly with correction size in the linear regime, while TLM errors increase quadratically. As nonlinearity (or forecast model error) increases, the two schemes asymptote to the same solution.
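The idea of propagating corrections with a linear operator built only from ensemble statistics inside a local influence volume can be sketched as follows. This is a minimal illustration under stated assumptions, not the LETLM of the paper: the grid is one-dimensional and periodic, the "true" dynamics are a hypothetical three-point linear stencil, and each row of the operator is fitted by ordinary least squares. The names `letlm_row`, `build_letlm`, and `halfwidth` are invented for this sketch.

```python
import numpy as np

def letlm_row(X0_local, x1_center):
    """Least-squares estimate of one row of a local linear propagator.

    X0_local : (n_local, n_ens) initial-time perturbations in the influence volume
    x1_center: (n_ens,) final-time perturbations at the central grid point
    """
    # Solve x1_center ≈ row @ X0_local in the least-squares sense.
    row, *_ = np.linalg.lstsq(X0_local.T, x1_center, rcond=None)
    return row

def build_letlm(X0, X1, halfwidth):
    """Assemble the full operator row by row from local fits (periodic grid)."""
    n = X0.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        idx = np.arange(i - halfwidth, i + halfwidth + 1) % n
        L[i, idx] = letlm_row(X0[idx, :], X1[i, :])
    return L

rng = np.random.default_rng(0)
n, n_ens = 12, 8  # fewer members than grid points

# Hypothetical linear dynamics with a periodic three-point stencil.
I = np.eye(n)
M = 0.6 * I + 0.3 * np.roll(I, 1, axis=1) + 0.1 * np.roll(I, -1, axis=1)

# Ensemble perturbations (mean removed) before and after the model step.
X0 = rng.standard_normal((n, n_ens))
X0 -= X0.mean(axis=1, keepdims=True)
X1 = M @ X0

L = build_letlm(X0, X1, halfwidth=1)

# Propagate an arbitrary analysis correction with the ensemble-derived operator.
dx0 = rng.standard_normal(n)
print(np.allclose(L @ dx0, M @ dx0))
```

Because each local fit has only three unknowns, eight members are enough to determine every row even though the full state has twelve points. A global fit with the same ensemble would be underdetermined.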


2015 ◽  
Vol 143 (5) ◽  
pp. 1568-1582 ◽  
Author(s):  
Juan Ruiz ◽  
Manuel Pulido

Abstract This work explores the potential of online parameter estimation as a technique for model error treatment under an imperfect model scenario, in an ensemble-based data assimilation system, using a simple atmospheric general circulation model and an observing system simulation experiment (OSSE) approach. Model error is introduced in the imperfect model scenario by changing the value of the parameters associated with different schemes. The parameters of the moist convection scheme are the only ones to be estimated in the data assimilation system. In this work, parameter estimation is compared and combined with techniques that account for the lack of ensemble spread and for the systematic model error. The OSSEs show that when parameter estimation is combined with model error treatment techniques (multiplicative and additive inflation, or a bias correction technique), it further improves analysis quality and medium-range forecast skill relative to OSSEs that use the model error treatment techniques without parameter estimation. The improvement produced by parameter estimation is mainly a consequence of the optimization of the parameter values. The estimated parameters do not converge to the value used to generate the observations in the imperfect model scenario; however, the analysis error is reduced and the forecast skill is improved.
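One common way to estimate parameters online in an ensemble system is to append them to the state vector and let the ensemble cross-covariances update them. The abstract does not state that this is the scheme used, so the sketch below is an assumption. It applies a perturbed-observation ensemble Kalman filter with multiplicative inflation to a hypothetical scalar model `x_next = a * x + forcing`, where `a` is the uncertain parameter. Unlike the imperfect model scenario of the paper, this toy has no other source of model error, so the parameter can approach its true value.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ens, n_steps = 100, 200
a_true, forcing = 0.8, 1.0  # hypothetical true parameter and forcing
sigma_o = 0.1               # observation error standard deviation
inflation = 1.02            # multiplicative inflation factor

def inflate(v, factor):
    """Multiplicative inflation: stretch perturbations about the ensemble mean."""
    m = v.mean()
    return m + factor * (v - m)

# Augmented ensemble: each member carries a state x and a parameter a.
x = rng.normal(0.0, 1.0, n_ens)
a = rng.normal(0.5, 0.1, n_ens)  # deliberately biased first guess
a_first_guess = a.mean()
x_true = 0.0

for _ in range(n_steps):
    # Truth run and noisy observation of the state.
    x_true = a_true * x_true + forcing
    y = x_true + sigma_o * rng.standard_normal()

    # Forecast: each member uses its own parameter; the parameter persists.
    x = a * x + forcing
    x, a = inflate(x, inflation), inflate(a, inflation)

    # Ensemble variance of x and cross-covariance between a and x.
    dx, da = x - x.mean(), a - a.mean()
    var_x = dx @ dx / (n_ens - 1)
    cov_ax = da @ dx / (n_ens - 1)
    denom = var_x + sigma_o**2

    # Perturbed-observation update of both state and parameter.
    innov = y + sigma_o * rng.standard_normal(n_ens) - x
    x = x + (var_x / denom) * innov
    a = a + (cov_ax / denom) * innov

a_estimate = a.mean()
print(f"first guess: {a_first_guess:.3f}, estimate: {a_estimate:.3f}, truth: {a_true}")
```

The parameter is never observed directly. It is corrected only through its sampled covariance with the observed state, which is why maintaining ensemble spread (here through inflation) matters for the estimate.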


2013 ◽  
Vol 13 (3) ◽  
pp. 583-596 ◽  
Author(s):  
M. Coustau ◽  
S. Ricci ◽  
V. Borrell-Estupina ◽  
C. Bouvier ◽  
O. Thual

Abstract. Mediterranean catchments in southern France are threatened by potentially devastating fast floods which are difficult to anticipate. In order to improve the skill of rainfall-runoff models in predicting such flash floods, hydrologists use data assimilation techniques to provide real-time updates of the model using observational data. This approach seeks to reduce the uncertainties present in different components of the hydrological model (forcing, parameters or state variables) in order to minimize the error in simulated discharges. This article presents a data assimilation procedure, the best linear unbiased estimator (BLUE), used with the goal of improving the peak discharge predictions generated by the event-based Soil Conservation Service lag and route (SCS-LR) hydrological model. For a given prediction date, selected model inputs are corrected by assimilating discharge data observed at the basin outlet. This study is conducted on the Lez Mediterranean basin in southern France. The key objectives of this article are (i) to select the parameter(s) which allow for the most efficient and reliable correction of the simulated discharges, (ii) to demonstrate the impact of the correction of the initial condition upon simulated discharges, and (iii) to identify and understand conditions in which this technique fails to improve the forecast skill. The correction of the initial moisture deficit of the soil reservoir proves to be the most efficient control parameter for adjusting the peak discharge. Using data assimilation, this correction leads to an average improvement of 12% in the flood peak magnitude forecast in 75% of cases. The investigation of the other 25% of cases points out a number of precautions for the appropriate use of this data assimilation procedure.
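For readers unfamiliar with the BLUE, the standard update can be sketched in a few lines. The formula is the textbook one; everything else is an assumption for illustration: the function name `blue_update`, the single control variable (standing in for an initial soil moisture deficit), the linearized observation operator `H`, and all numerical values are hypothetical and are not taken from the article.

```python
import numpy as np

def blue_update(x_b, B, y, H, R):
    """Best linear unbiased estimator.

    x_b: background control vector        B: background error covariance
    y  : observations                     R: observation error covariance
    H  : linear(ized) observation operator mapping controls to observations
    """
    innovation = y - H @ x_b
    S = H @ B @ H.T + R                 # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)      # gain matrix
    x_a = x_b + K @ innovation          # analysis
    A = (np.eye(x_b.size) - K @ H) @ B  # analysis error covariance
    return x_a, A

# Hypothetical numbers: one control parameter, one discharge observation.
x_b = np.array([100.0])   # background value of the control parameter
B = np.array([[400.0]])   # background error variance
H = np.array([[0.5]])     # assumed sensitivity of discharge to the parameter
y = np.array([60.0])      # observed discharge
R = np.array([[25.0]])    # observation error variance

x_a, A = blue_update(x_b, B, y, H, R)
print(x_a, A)  # x_a ≈ [116.], A ≈ [[80.]]
```

In this example the background predicts a discharge of 50 while 60 is observed, so the control parameter is raised from 100 to 116 and its error variance falls from 400 to 80.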


2011 ◽  
Vol 21 (12) ◽  
pp. 3619-3626 ◽  
Author(s):  
Alberto Carrassi ◽  
Stéphane Vannitsem

In this paper, a method is proposed to account for model error due to unresolved scales in sequential data assimilation. An equation for the model error covariance required in the extended Kalman filter update is derived, along with an approximation suitable for application to the large-scale dynamics typical of environmental modeling. This approach is tested in the context of a low-order chaotic dynamical system. The results show that the filter skill is significantly improved by implementing the proposed scheme for the treatment of the unresolved scales.


2019 ◽  
Vol 29 (5) ◽  
pp. 053102 ◽  
Author(s):  
Franz Hamilton ◽  
Tyrus Berry ◽  
Timothy Sauer

Author(s):  
Alban Farchi ◽  
Patrick Laloyaux ◽  
Massimo Bonavita ◽  
Marc Bocquet

Recent developments in machine learning (ML) have demonstrated impressive skill in reproducing complex spatiotemporal processes. However, contrary to data assimilation (DA), the underlying assumption behind ML methods is that the system is fully observed and without noise, which is rarely the case in numerical weather prediction. To circumvent this issue, it is possible to embed the ML problem into a DA formalism characterised by a cost function similar to that of the weak-constraint 4D-Var (Bocquet et al., 2019; Bocquet et al., 2020). In practice, ML and DA are combined to solve the problem: DA is used to estimate the state of the system while ML is used to estimate the full model.

In realistic systems, the model dynamics can be very complex and it may not be possible to reconstruct them from scratch. An alternative could be to learn the model error of an already existing model using the same approach combining DA and ML. In this presentation, we test the feasibility of this method using a quasi-geostrophic (QG) model. After a brief description of the QG model, we introduce a realistic model error to be learnt. We then assess the potential of ML methods to reconstruct this model error, first with perfect (full and noiseless) observations and then with sparse and noisy observations. We show in either case to what extent the trained ML models correct the mid-term forecasts. Finally, we show how the trained ML models can be used in a DA system and to what extent they correct the analysis.

Bocquet, M., Brajard, J., Carrassi, A., and Bertino, L.: Data assimilation as a learning tool to infer ordinary differential equation representations of dynamical models, Nonlin. Processes Geophys., 26, 143–162, 2019.

Bocquet, M., Brajard, J., Carrassi, A., and Bertino, L.: Bayesian inference of chaotic dynamics by merging data assimilation, machine learning and expectation-maximization, Foundations of Data Science, 2 (1), 55–80, 2020.

Farchi, A., Laloyaux, P., Bonavita, M., and Bocquet, M.: Using machine learning to correct model error in data assimilation and forecast applications, arXiv:2010.12605, submitted.


2015 ◽  
Vol 144 (1) ◽  
pp. 213-224 ◽  
Author(s):  
Chiara Piccolo ◽  
Mike Cullen

Abstract A natural way to set up an ensemble forecasting system is to use a model with additional stochastic forcing representing the model error and to derive the initial uncertainty by using an ensemble of analyses generated with this model. Current operational practice has tended to separate the problems of generating initial uncertainty and forecast uncertainty. Thus, in ensemble forecasts, it is normal to use physically based stochastic forcing terms to represent model errors, while in generating analysis uncertainties, artificial inflation methods are used to ensure that the analysis spread is sufficient given the observations. In this paper a more unified approach is tested that uses the same stochastic forcing in the analyses and forecasts and estimates the model error forcing from data assimilation diagnostics. This is shown to be successful if there are sufficient observations. Ensembles used in data assimilation have to be reliable in a broader sense than the usual forecast verification methods; in particular, they need to have the correct covariance structure, which is demonstrated.

