Combining data assimilation and machine learning to estimate parameters of a convective‐scale model

Author(s):  
S. Legler ◽  
T. Janjić

Author(s):  
Seunghee Lee ◽  
Seohui Park ◽  
Myong‐In Lee ◽  
Ganghan Kim ◽  
Jungho Im ◽  
...  


2019 ◽  
Author(s):  
Julien Brajard ◽  
Alberto Carrassi ◽  
Marc Bocquet ◽  
Laurent Bertino

Abstract. A novel method, based on the combination of data assimilation and machine learning, is introduced. The new hybrid approach is designed with a twofold scope: (i) emulating hidden, possibly chaotic, dynamics and (ii) predicting its future states. The method alternately applies a data assimilation step, here an ensemble Kalman filter, and a neural network. Data assimilation is used to optimally combine a surrogate model with sparse, noisy data. The resulting analysis is spatially complete and can thus be used as a training set by the neural network to upgrade the surrogate model. The two steps are then repeated iteratively. Numerical experiments have been carried out using the chaotic Lorenz 96 model, a 40-variable system, demonstrating both convergence and statistical skill. The skill metrics include short-term forecast skill out to two Lyapunov times, the retrieval of positive Lyapunov exponents and the power density spectrum. The sensitivity of the method to critical setup parameters is also presented: forecast skill decreases smoothly with increasing observational noise but drops abruptly if less than half of the model domain is observed. The synergy demonstrated with a low-dimensional system is encouraging for more sophisticated dynamics and motivates further investigation to merge data assimilation and machine learning.
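The alternation between the data assimilation step and the neural-network training can be illustrated with a short, self-contained sketch. This is not the authors' code: a Lorenz-96 truth is used only to generate synthetic sparse, noisy observations, the stochastic EnKF omits localisation and inflation tuning, and scikit-learn's MLPRegressor stands in for the neural-network surrogate.

```python
# Hedged sketch of the alternating DA-ML loop described in the abstract above.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n, n_ens, sigma_o = 40, 30, 1.0           # state size, ensemble size, obs error std
obs_idx = np.arange(0, n, 2)              # observe every second variable (sparse network)

def lorenz96_step(x, F=8.0, dt=0.01):
    """Hidden truth dynamics, used only to generate synthetic observations."""
    dxdt = (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F
    return x + dt * dxdt                  # crude Euler step, sufficient for a sketch

truth = rng.standard_normal(n)
observations = []
for _ in range(200):
    truth = lorenz96_step(truth)
    observations.append(truth[obs_idx] + sigma_o * rng.standard_normal(obs_idx.size))

def enkf_analysis(Xf, y):
    """Stochastic EnKF update of the forecast ensemble Xf (n x n_ens) with obs y."""
    A = Xf - Xf.mean(axis=1, keepdims=True)
    P = A @ A.T / (n_ens - 1)
    R = sigma_o**2 * np.eye(obs_idx.size)
    K = P[:, obs_idx] @ np.linalg.inv(P[np.ix_(obs_idx, obs_idx)] + R)
    Ypert = y[:, None] + sigma_o * rng.standard_normal((obs_idx.size, n_ens))
    return Xf + K @ (Ypert - Xf[obs_idx])

surrogate = MLPRegressor(hidden_layer_sizes=(100,), max_iter=500)
ensemble = rng.standard_normal((n, n_ens))
trained = False

for iteration in range(5):                # alternate DA and NN training
    analysis_means = []
    for y in observations:
        if trained:                       # forecast each member with the current surrogate
            ensemble = surrogate.predict(ensemble.T).T
            ensemble += 0.1 * rng.standard_normal(ensemble.shape)  # crude additive inflation
        # first pass: surrogate not yet trained, persistence forecast is used implicitly
        ensemble = enkf_analysis(ensemble, y)
        analysis_means.append(ensemble.mean(axis=1))
    analyses = np.asarray(analysis_means)       # spatially complete training set
    surrogate.fit(analyses[:-1], analyses[1:])  # learn the resolvent x_k -> x_{k+1}
    trained = True
```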



Author(s):  
Julien Brajard ◽  
Alberto Carrassi ◽  
Marc Bocquet ◽  
Laurent Bertino

In recent years, machine learning (ML) has been proposed to devise data-driven parametrizations of unresolved processes in dynamical numerical models. In most cases, the ML training leverages high-resolution simulations to provide a dense, noiseless target state. Our goal is to go beyond the use of high-resolution simulations and train an ML-based parametrization using direct data, in the realistic scenario of noisy and sparse observations. The algorithm proposed in this work is a two-step process. First, data assimilation (DA) techniques are applied to estimate the full state of the system from a truncated model. The unresolved part of the truncated model is viewed as a model error in the DA system. In a second step, ML is used to emulate the unresolved part, i.e. to learn a predictor of the model error given the state of the system. Finally, the ML-based parametrization model is added to the physical core of the truncated model to produce a hybrid model. The DA component of the proposed method relies on an ensemble Kalman filter, while the ML parametrization is represented by a neural network. The approach is applied to the two-scale Lorenz model and to MAOOAM, a reduced-order coupled ocean-atmosphere model. We show that in both cases, the hybrid model yields forecasts with better skill than the truncated model. Moreover, the attractor of the system is significantly better represented by the hybrid model than by the truncated model. This article is part of the theme issue ‘Machine learning for weather and climate modelling’.
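The second step and the resulting hybrid model can be sketched as follows. All names here are illustrative assumptions: `analysis_states.npy` is a placeholder for analyses produced by a DA system (e.g. an EnKF), a one-scale Lorenz-96 step stands in for the truncated physical core, and scikit-learn's MLPRegressor stands in for the model-error network; the paper itself uses the two-scale Lorenz and MAOOAM models with its own architecture.

```python
# Hedged sketch: learn the unresolved part (model error) of a truncated model
# from DA analyses, then build a hybrid model = physical core + learned correction.
import numpy as np
from sklearn.neural_network import MLPRegressor

def truncated_model(x, F=8.0, dt=0.01):
    """Placeholder physical core: a one-scale Lorenz-96 Euler step (fast scales missing)."""
    dxdt = (np.roll(x, -1, axis=-1) - np.roll(x, 2, axis=-1)) * np.roll(x, 1, axis=-1) - x + F
    return x + dt * dxdt

# Analyses are assumed to come from a DA system (shape: n_times x n_state);
# the file name is a placeholder for illustration only.
analyses = np.load("analysis_states.npy")

# Target: the part of the analysed evolution that the truncated model misses.
model_error = analyses[1:] - truncated_model(analyses[:-1])

error_net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000)
error_net.fit(analyses[:-1], model_error)          # state -> model-error predictor

def hybrid_step(x):
    """Hybrid model: truncated physical core plus the learned model-error correction."""
    return truncated_model(x) + error_net.predict(x[None, :])[0]

# Free forecast with the hybrid model, started from the last analysis
x = analyses[-1].copy()
trajectory = []
for _ in range(100):
    x = hybrid_step(x)
    trajectory.append(x)
```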





2020 ◽  
Vol 148 (6) ◽  
pp. 2457-2477 ◽  
Author(s):  
Yuefei Zeng ◽  
Tijana Janjić ◽  
Alberto de Lozar ◽  
Stephan Rasp ◽  
Ulrich Blahak ◽  
...  

Abstract Different approaches for representing model error due to unresolved scales and processes are compared in convective-scale data assimilation, including the physically based stochastic perturbation (PSP) scheme for turbulence, an advanced warm bubble approach that automatically detects and triggers absent convective cells, and additive noise based on model truncation error. Analysis of the kinetic energy spectrum guides the understanding of differences in precipitation forecasts. It is found that the PSP scheme results in more ensemble spread in assimilation cycles, but its effects on the root-mean-square error (RMSE) are neutral. This leads to positive impacts on precipitation forecasts that last up to 3 h. The warm bubble technique does not create more spread, but is effective in reducing the RMSE and in improving precipitation forecasts for up to 3 h. The additive noise approach contributes greatly to ensemble spread, but it results in a larger RMSE during assimilation cycles. Nevertheless, it considerably improves the skill of precipitation forecasts for up to 6 h. Combining the additive noise with either the PSP scheme or the warm bubble technique reduces the RMSE within cycles and improves the skill of the precipitation forecasts, with the latter combination being more beneficial.
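A minimal sketch of the additive-noise idea, assuming a pre-computed library of model-truncation-error samples (for instance, differences between a coarse-grained high-resolution run and the model's own forecast); the file name, amplitude, and array shapes are illustrative assumptions, and the paper's implementation differs in detail.

```python
# Hedged sketch: perturb each ensemble member with a randomly drawn, scaled
# sample from a library of model-truncation-error fields.
import numpy as np

rng = np.random.default_rng(1)
error_library = np.load("truncation_error_samples.npy")   # (n_samples, n_state), placeholder
alpha = 0.5                                                # tunable noise amplitude

def add_truncation_noise(ensemble):
    """Add a scaled, randomly drawn truncation-error sample to each member (n_state x n_ens)."""
    n_ens = ensemble.shape[1]
    draws = rng.integers(0, error_library.shape[0], size=n_ens)
    return ensemble + alpha * error_library[draws].T

# Typical use within the cycling, applied to the ensemble after each analysis:
# ensemble = add_truncation_noise(ensemble)
```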



2016 ◽  
Vol 23 (1) ◽  
pp. 1-12 ◽  
Author(s):  
R. Legrand ◽  
Y. Michel ◽  
T. Montmerle

Abstract. In numerical weather prediction, the problem of estimating initial conditions with a variational approach is usually based on a Bayesian framework associated with a Gaussianity assumption of the probability density functions of both observations and background errors. In practice, Gaussianity of errors is tied to linearity, in the sense that a nonlinear model will yield non-Gaussian probability density functions. In this context, standard methods relying on the Gaussian assumption may perform poorly. This study aims to describe some aspects of non-Gaussianity of forecast and analysis errors in a convective-scale model using a Monte Carlo approach based on an ensemble of data assimilations. For this purpose, an ensemble of 90 members of cycled perturbed assimilations has been run over a highly precipitating case of interest. Non-Gaussianity is measured using the K2 statistic from the D'Agostino test, which is related to the sum of the squares of univariate skewness and kurtosis. Results confirm that specific humidity is the least Gaussian variable according to that measure and also that non-Gaussianity is generally more pronounced in the boundary layer and in cloudy areas. The dynamical control variables used in our data assimilation, namely vorticity and divergence, also show distinct non-Gaussian behaviour. It is shown that while non-Gaussianity increases with forecast lead time, it is efficiently reduced by the data assimilation step, especially in areas well covered by observations. Our findings may have implications for the choice of the control variables.
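The K2 diagnostic described above can be computed directly with SciPy, whose normaltest implements D'Agostino and Pearson's test (the statistic is the sum of the squared z-scores of skewness and kurtosis). A minimal sketch, with a synthetic skewed field standing in for specific humidity and illustrative array shapes; only the 90-member ensemble size is taken from the study.

```python
# Hedged sketch of the non-Gaussianity diagnostic: D'Agostino K2 at every grid
# point, computed across the ensemble dimension.
import numpy as np
from scipy.stats import normaltest

rng = np.random.default_rng(2)
n_members, nz, ny, nx = 90, 20, 50, 50
humidity = rng.gamma(shape=2.0, scale=1.0, size=(n_members, nz, ny, nx))  # stand-in field

# axis=0 applies the test independently at each grid point, across the 90 members;
# the returned statistic is K2 = z_skewness**2 + z_kurtosis**2.
k2, pvalue = normaltest(humidity, axis=0)

print("mean K2:", k2.mean(),
      "fraction rejecting Gaussianity at 5%:", (pvalue < 0.05).mean())
```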



2015 ◽  
Vol 2 (4) ◽  
pp. 1061-1090 ◽  
Author(s):  
R. Legrand ◽  
Y. Michel ◽  
T. Montmerle

Abstract. In numerical weather prediction, the problem of estimating initial conditions is usually based on a Bayesian framework. Two common derivations respectively lead to the Kalman filter and to variational approaches. They rely on either assumptions of linearity or assumptions of Gaussianity of the probability density functions of both observation and background errors. In practice, linearity and Gaussianity of errors are tied to one another, in the sense that a nonlinear model will yield non-Gaussian probability density functions, and that standard methods may perform poorly in the context of non-Gaussian probability density functions. This study aims to describe some aspects of non-Gaussianity of forecast and analysis errors in a convective-scale model using a Monte Carlo approach based on an ensemble of data assimilations. For this purpose, an ensemble of 90 members of cycled perturbed assimilations has been run over a highly precipitating case of interest. Non-Gaussianity is measured using the K2 statistic from the D'Agostino test, which is related to the sum of the squares of univariate skewness and kurtosis. Results confirm that specific humidity is the least Gaussian variable according to that measure, and also that non-Gaussianity is generally more pronounced in the boundary layer and in cloudy areas. The mass control variables used in our data assimilation, namely vorticity and divergence, also show distinct non-Gaussian behavior. It is shown that while non-Gaussianity increases with forecast lead time, it is efficiently reduced by the data assimilation step, especially in areas well covered by observations. Our findings may have implications for the choice of the control variables.



Author(s):  
Alban Farchi ◽  
Patrick Laloyaux ◽  
Massimo Bonavita ◽  
Marc Bocquet

