Review of: Combining data assimilation and machine learning to emulate a dynamical model from sparse and noisy observations: a case study with the Lorenz 96 model

2019
Author(s):
Peter Düben

2019
Author(s):
Julien Brajard,
Alberto Carrassi,
Marc Bocquet,
Laurent Bertino
Abstract. A novel method, based on the combination of data assimilation and machine learning, is introduced. The new hybrid approach is designed for a twofold purpose: (i) emulating hidden, possibly chaotic, dynamics and (ii) predicting their future states. The method alternates between a data assimilation step, here an ensemble Kalman filter, and a neural network step. Data assimilation is used to optimally combine a surrogate model with sparse, noisy data. The resulting analysis is spatially complete and can thus be used as a training set for the neural network to upgrade the surrogate model. The two steps are then repeated iteratively. Numerical experiments have been carried out using the chaotic 40-variable Lorenz 96 model, demonstrating both convergence and statistical skill. The skill metrics include short-term forecast skill out to two Lyapunov times, the retrieval of positive Lyapunov exponents, and the power density spectrum. The sensitivity of the method to critical setup parameters is also presented: forecast skill decreases smoothly with increasing observational noise but drops abruptly if less than half of the model domain is observed. The synergy demonstrated with a low-dimensional system is encouraging for more sophisticated dynamics and motivates further investigation into merging data assimilation and machine learning.
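The alternation between the data assimilation step and the neural network step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stochastic (perturbed-observation) EnKF, the scikit-learn MLPRegressor standing in for the neural network, the observation network, and all sizes and noise levels are assumptions, and the observations and first-guess analysis are random placeholders.

```python
# Minimal sketch of the alternating DA-ML cycle (illustrative only).
# Assumptions: the surrogate maps the state at time k to time k+1; every
# other variable is observed; observations and the first-guess analysis
# are random placeholders.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
N_X, N_ENS, N_STEPS, SIGMA_OBS = 40, 30, 500, 1.0
obs_idx = np.arange(0, N_X, 2)              # sparse observation network
H = np.eye(N_X)[obs_idx]                    # observation operator

def enkf_analysis(ens, y, sigma):
    """Stochastic (perturbed-observation) EnKF update of ensemble `ens`."""
    Xf = ens - ens.mean(0)                  # forecast perturbations
    Yf = Xf @ H.T                           # perturbations in observation space
    gain = Xf.T @ Yf @ np.linalg.inv(
        Yf.T @ Yf + (len(ens) - 1) * sigma**2 * np.eye(len(y)))
    y_pert = y + sigma * rng.standard_normal((len(ens), len(y)))
    return ens + (y_pert - ens @ H.T) @ gain.T

surrogate = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500)
observations = rng.standard_normal((N_STEPS, len(obs_idx)))   # placeholder obs
analysis = rng.standard_normal((N_STEPS, N_X))                # crude first guess

for iteration in range(5):                  # alternate the ML and DA steps
    # ML step: (re)train the surrogate on consecutive analysis states.
    surrogate.fit(analysis[:-1], analysis[1:])
    # DA step: cycle the EnKF using the surrogate as the forecast model.
    ens = analysis[0] + 0.1 * rng.standard_normal((N_ENS, N_X))
    for k in range(N_STEPS):
        ens = enkf_analysis(ens, observations[k], SIGMA_OBS)
        analysis[k] = ens.mean(0)           # spatially complete analysis
        ens = surrogate.predict(ens)        # forecast to the next cycle
```

In practice the initial analysis would be built from the observations themselves rather than from noise; the loop above only illustrates how each analysis becomes the training set for the next surrogate.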



Author(s):
Julien Brajard,
Alberto Carrassi,
Marc Bocquet,
Laurent Bertino

In recent years, machine learning (ML) has been proposed to devise data-driven parametrizations of unresolved processes in dynamical numerical models. In most cases, the ML training leverages high-resolution simulations to provide a dense, noiseless target state. Our goal is to go beyond the use of high-resolution simulations and train ML-based parametrizations using direct data, in the realistic scenario of noisy and sparse observations. The algorithm proposed in this work is a two-step process. First, data assimilation (DA) techniques are applied to estimate the full state of the system from a truncated model. The unresolved part of the truncated model is viewed as a model error in the DA system. In the second step, ML is used to emulate the unresolved part, yielding a predictor of model error given the state of the system. Finally, the ML-based parametrization is added to the physical core of the truncated model to produce a hybrid model. The DA component of the proposed method relies on an ensemble Kalman filter, while the ML parametrization is represented by a neural network. The approach is applied to the two-scale Lorenz model and to MAOOAM, a reduced-order coupled ocean-atmosphere model. We show that in both cases the hybrid model yields forecasts with better skill than the truncated model. Moreover, the attractor of the system is significantly better represented by the hybrid model than by the truncated model. This article is part of the theme issue ‘Machine learning for weather and climate modelling’.
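The two-step construction can be sketched as follows. It is a minimal illustration under stated assumptions: a single-scale Lorenz-96 step stands in for the truncated physical model, a scikit-learn MLPRegressor stands in for the neural-network parametrization, and the analysis trajectory is a random placeholder for the output of the DA step.

```python
# Minimal sketch of the two-step hybrid construction (illustrative only).
import numpy as np
from sklearn.neural_network import MLPRegressor

DT = 0.05                                   # illustrative time step

def truncated_model(x, dt=DT):
    """Placeholder resolved physics: one Euler step of single-scale Lorenz-96."""
    dxdt = (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + 8.0
    return x + dt * dxdt

# Step 1 (DA, assumed already done): the analysis estimates the full state;
# the mismatch between the analysis and the truncated-model forecast is
# interpreted as the model error due to unresolved processes.
rng = np.random.default_rng(0)
analysis = rng.standard_normal((1000, 40))  # placeholder for the DA analyses
forecast = np.array([truncated_model(a) for a in analysis[:-1]])
model_error = analysis[1:] - forecast       # per-step model-error target

# Step 2 (ML): learn the model error as a function of the resolved state.
error_net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000)
error_net.fit(analysis[:-1], model_error)

def hybrid_model(x, dt=DT):
    """Physical core plus the learned model-error correction."""
    return truncated_model(x, dt) + error_net.predict(x[None, :])[0]
```

With a real DA analysis in place of the placeholder, the resulting hybrid_model can be time-stepped in place of the truncated model, which is the structure of the algorithm described in the abstract.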


2021
Author(s):
Lucia Yang,
Ian Grooms

We propose to use analogs of the forecast mean to generate an ensemble of perturbations for use in ensemble optimal interpolation (EnOI) or ensemble variational (EnVar) methods. In addition to finding analogs from a library, we propose a new method of constructing analogs using autoencoders (a machine learning method). To extend the scalability of constructed analogs for use in data assimilation on geophysical models, we propose using patching schemes to divide the global spatial domain into digestible chunks. Using patches makes training the generative models possible and has the added benefit of being able to exploit parallel computing. The resulting analog methods using analogs from a catalog (AnEnOI), constructed analogs (cAnEnOI), and patched constructed analogs (p-cAnEnOI) are tested in the context of a multiscale Lorenz '96 model, with standard EnOI and an ensemble square root filter for comparison. The use of analogs from a modestly sized catalog is shown to improve the performance of EnOI, with limited marginal improvements resulting from increases in the catalog size. The method using constructed analogs is found to perform as well as a full ensemble square root filter and to be robust over a wide range of tuning parameters. Lastly, we find that p-cAnEnOI with larger patches produces the best data assimilation performance despite having larger reconstruction errors. All patch variants except the one that uses the smallest patch size outperform cAnEnOI as well as some traditional data assimilation methods such as the ensemble square root filter.
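The analog-ensemble idea can be sketched in its simplest, catalog-based form (AnEnOI): the catalog states nearest to the forecast mean supply the perturbations for an ensemble-OI update. The autoencoder-constructed analogs and the patching scheme are not shown; the catalog, observation network, and noise level below are placeholder assumptions.

```python
# Minimal sketch of the catalog-analog variant (AnEnOI), illustrative only.
import numpy as np

rng = np.random.default_rng(0)
N_X, N_CAT, N_ANALOGS, SIGMA_OBS = 40, 2000, 25, 0.5
catalog = rng.standard_normal((N_CAT, N_X))   # placeholder library of past states
obs_idx = np.arange(0, N_X, 4)                # sparse observation locations
H = np.eye(N_X)[obs_idx]                      # observation operator

def analog_enoi_update(x_fcst, y_obs):
    """EnOI analysis of the forecast mean using analog-based perturbations."""
    # 1. Find the catalog members closest to the forecast mean.
    dist = np.linalg.norm(catalog - x_fcst, axis=1)
    analogs = catalog[np.argsort(dist)[:N_ANALOGS]]
    # 2. Build perturbations as deviations of the analogs from their mean.
    X = analogs - analogs.mean(0)
    Y = X @ H.T
    # 3. Standard ensemble-OI gain and update of the mean state.
    gain = X.T @ Y @ np.linalg.inv(
        Y.T @ Y + (N_ANALOGS - 1) * SIGMA_OBS**2 * np.eye(len(y_obs)))
    return x_fcst + gain @ (y_obs - H @ x_fcst)

x_fcst = rng.standard_normal(N_X)             # placeholder forecast mean
y_obs = H @ x_fcst + SIGMA_OBS * rng.standard_normal(len(obs_idx))
x_analysis = analog_enoi_update(x_fcst, y_obs)
```

In cAnEnOI the nearest-neighbour lookup would be replaced by sampling from a generative autoencoder, and in p-cAnEnOI that model would be trained and applied patch by patch over the spatial domain.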

