Prediction of near-surface temperatures using a non-linear machine learning post-processing model

Author(s):  
Emy Alerskans ◽  
Joachim Nyborg ◽  
Morten Birk ◽  
Eigil Kaas

<p>Numerical weather prediction (NWP) models are known to exhibit systematic errors, especially for near-surface variables such as air temperature. This is partly due to deficiencies in the physical formulation of the model dynamics and the inability of these models to successfully handle sub-grid phenomena. Forecasts that better match the locally observed weather can be obtained by post-processing NWP model output using local meteorological observations. Here, we have implemented a non-linear post-processing model based on machine learning techniques with the aim of post-processing near-surface air temperature forecasts from a global coarse-resolution model in order to produce localized forecasts. The model is trained on observational data from a network of private weather stations and forecast data from the global coarse-resolution NWP model. Independent data is used to assess the performance of the model and the results are compared with the performance of the raw NWP model output. Overall, the non-linear machine learning post-processing method reduces the bias and the standard deviation compared to the raw NWP forecast and produces a forecast that better matches the locally observed weather.</p>
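The two verification measures named in this abstract, bias and standard deviation of the forecast error, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `bias_and_std` and all temperature values are made up for the example, and the population (divide-by-n) standard deviation is an assumption.

```python
import math

def bias_and_std(forecasts, observations):
    """Return (mean error, standard deviation of the error) for paired samples."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    n = len(errors)
    bias = sum(errors) / n
    # Population variance of the errors (divide by n); an assumption of this sketch.
    variance = sum((e - bias) ** 2 for e in errors) / n
    return bias, math.sqrt(variance)

# Made-up temperatures in degrees Celsius, for illustration only.
observed = [10.0, 12.0, 11.0, 9.0]
raw_forecast = [12.0, 13.0, 14.0, 10.0]
post_processed = [10.5, 11.5, 11.5, 8.5]

raw_bias, raw_std = bias_and_std(raw_forecast, observed)
pp_bias, pp_std = bias_and_std(post_processed, observed)
```

Comparing the two pairs of numbers on independent data is the kind of check the abstract describes for raw versus post-processed forecasts.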

2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Ke Zhou ◽  
Hailei Liu ◽  
Xiaobo Deng ◽  
Hao Wang ◽  
Shenglan Zhang

Six machine-learning approaches, including multivariate linear regression (MLR), gradient boosting decision tree, k-nearest neighbors, random forest, extreme gradient boosting (XGB), and deep neural network (DNN), were compared for near-surface air-temperature (Tair) estimation from the new generation of Chinese geostationary meteorological satellite Fengyun-4A (FY-4A) observations. The brightness temperatures in split-window channels from the Advanced Geostationary Radiation Imager (AGRI) of FY-4A and numerical weather prediction data from the global forecast system were used as the predictor variables for Tair estimation. The performance of each model and the temporal and spatial distribution of the estimated Tair errors were analyzed. The results showed that the XGB model had better overall performance, with R2 of 0.902, bias of −0.087°C, and root-mean-square error of 1.946°C. The spatial variation characteristics of the Tair error of the XGB method were less obvious than those of the other methods. The XGB model can provide more stable and high-precision Tair for a large-scale Tair estimation over China and can serve as a reference for Tair estimation based on machine-learning models.


2021 ◽  
Author(s):  
Thordis Thorarinsdottir ◽  
Jana Sillmann ◽  
Marion Haugen ◽  
Nadine Gissibl ◽  
Marit Sandstad

<p>Reliable projections of extremes in near-surface air temperature (SAT) by climate models become more and more important as global warming is leading to significant increases in the hottest days and decreases in the coldest nights around the world with considerable impacts on various sectors, such as agriculture, health and tourism.</p><p>Climate model evaluation has traditionally been performed by comparing summary statistics that are derived from simulated model output and corresponding observed quantities using, for instance, the root mean squared error (RMSE) or mean bias as also used in the model evaluation chapter of the fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC AR5). Both RMSE and mean bias compare averages over time and/or space, ignoring the variability, or the uncertainty, in the underlying values. Particularly when interested in the evaluation of climate extremes, climate models should be evaluated by comparing the probability distribution of model output to the corresponding distribution of observed data.</p><p>To address this shortcoming, we use the integrated quadratic distance (IQD) to compare distributions of simulated indices to the corresponding distributions from a data product. The IQD is the proper divergence associated with the proper continuous ranked probability score (CRPS) as it fulfills essential decision-theoretic properties for ranking competing models and testing equality in performance, while also assessing the full distribution.</p><p>The IQD is applied to evaluate CMIP5 and CMIP6 simulations of monthly maximum (TXx) and minimum near-surface air temperature (TNn) over the data-dense regions Europe and North America against both observational and reanalysis datasets. There is no notable difference between the model generations CMIP5 and CMIP6 when the model simulations are compared against the observational dataset HadEX2. However, the CMIP6 models show better agreement with the reanalysis ERA5 than the CMIP5 models, with a few exceptions. Overall, the climate models show higher skill when compared against ERA5 than when compared against HadEX2. While the model rankings vary with region, season and index, the model evaluation is robust against changes in the grid resolution considered in the analysis.</p>
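The IQD between two distributions is the integral of the squared difference between their cumulative distribution functions. A minimal sketch for two finite samples is given below, using empirical CDFs, which are step functions, so the integral reduces to a finite sum. The function names are hypothetical and the use of raw empirical CDFs is an assumption of this example, not necessarily how the study estimates the distributions.

```python
from bisect import bisect_right

def ecdf(sorted_sample, x):
    """Empirical CDF of a sorted sample evaluated at x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)

def integrated_quadratic_distance(sample_f, sample_g):
    """Integral of (F(x) - G(x))^2 over x for the empirical CDFs of two samples."""
    f_sorted = sorted(sample_f)
    g_sorted = sorted(sample_g)
    # Both step functions are constant between consecutive pooled sample points.
    grid = sorted(set(f_sorted) | set(g_sorted))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        diff = ecdf(f_sorted, left) - ecdf(g_sorted, left)
        total += diff * diff * (right - left)
    return total
```

Identical samples give a distance of zero, and the value grows as the two distributions move apart, which is what allows models to be ranked against a reference data product.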


2020 ◽  
Author(s):  
Florian Dupuy ◽  
Olivier Mestre ◽  
Léo Pfitzner

<p>Cloud cover is crucial information for many applications such as planning land observation missions from space. However, cloud cover remains a challenging variable to forecast, and Numerical Weather Prediction (NWP) models suffer from significant biases, hence justifying the use of statistical post-processing techniques. In our application, the ground truth is a gridded cloud cover product derived from satellite observations over Europe, and predictors are spatial fields of various variables produced by ARPEGE (Météo-France global NWP) at the corresponding lead time.</p><p>In this study, ARPEGE cloud cover is post-processed using a convolutional neural network (CNN). CNNs are the most popular machine learning tool for dealing with images. In our case, a CNN makes it possible to integrate the spatial information contained in NWP outputs. We show that a simple U-Net architecture produces significant improvements over Europe. Compared to the raw ARPEGE forecasts, MAE drops from 25.1 % to 17.8 % and RMSE decreases from 37.0 % to 31.6 %. Considering specific needs for earth observation, special attention was paid to forecasts of low cloud cover conditions (< 10 %). For this particular nebulosity class, we show that the hit rate jumps from 40.6 to 70.7 (which is the order of magnitude of what can be achieved using classical machine learning algorithms such as random forests) while the false alarm rate decreases from 38.2 to 29.9. This is an excellent result, since improving hit rates by means of random forests usually also results in a slight increase in false alarms.</p>


Atmosphere ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 823
Author(s):  
Ting Peng ◽  
Xiefei Zhi ◽  
Yan Ji ◽  
Luying Ji ◽  
Ye Tian

Extended-range temperature prediction is of great importance for public health, energy and agriculture. Two machine learning methods, namely neural networks and natural gradient boosting (NGBoost), are applied to improve the prediction skill of the 2-m maximum air temperature with lead times of 1–35 days over East Asia based on the Environmental Modeling Center, Global Ensemble Forecast System (EMC-GEFS), under the Subseasonal Experiment (SubX) of the National Centers for Environmental Prediction (NCEP). The ensemble model output statistics (EMOS) method is used as the benchmark for comparison. The results show that all the post-processing methods can efficiently reduce the prediction biases and uncertainties, especially in lead weeks 1–2. The two machine learning methods outperform EMOS by approximately 0.2 in terms of the continuous ranked probability score (CRPS) overall. The neural networks and NGBoost are the best-performing models in more than 90% of the study area over the validation period. In our study, CRPS, which is not a common loss function in machine learning, is introduced to make probabilistic forecasting possible for traditional neural networks. Moreover, we extend the NGBoost model to probabilistic temperature forecasting in the atmospheric sciences, where it achieves satisfactory performance.
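To illustrate how the CRPS can serve as a loss, the sketch below evaluates the well-known closed-form CRPS of a Gaussian predictive distribution against a single observation. The Gaussian form is an assumption made for this example; the abstract does not state which predictive distribution the networks output, and the function name `crps_gaussian` is hypothetical.

```python
import math

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a normal forecast N(mu, sigma^2) for observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

# Illustrative values only: a forecast of 25 +/- 2 degrees against an observed 27.
example_score = crps_gaussian(25.0, 2.0, 27.0)
```

Lower values are better, and the score has the same unit as the forecast variable. Averaging it over a training batch would give a differentiable objective in the predicted mean and spread.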


2015 ◽  
Vol 8 (8) ◽  
pp. 2645-2653 ◽  
Author(s):  
C. G. Nunalee ◽  
Á. Horváth ◽  
S. Basu

Abstract. Recent decades have witnessed a drastic increase in the fidelity of numerical weather prediction (NWP) modeling. Currently, both research-grade and operational NWP models regularly perform simulations with horizontal grid spacings as fine as 1 km. This migration towards higher resolution potentially improves NWP model solutions by increasing the resolvability of mesoscale processes and reducing dependency on empirical physics parameterizations. However, at the same time, the accuracy of high-resolution simulations, particularly in the atmospheric boundary layer (ABL), is also sensitive to orographic forcing which can have significant variability on the same spatial scale as, or smaller than, NWP model grids. Despite this sensitivity, many high-resolution atmospheric simulations do not consider uncertainty with respect to the selection of the static terrain height data set. In this paper, we use the Weather Research and Forecasting (WRF) model to simulate realistic cases of lower tropospheric flow over and downstream of mountainous islands using the default global 30 s United States Geological Survey terrain height data set (GTOPO30), the Shuttle Radar Topography Mission (SRTM), and the Global Multi-resolution Terrain Elevation Data set (GMTED2010) terrain height data sets. While the differences between the SRTM-based and GMTED2010-based simulations are extremely small, the GTOPO30-based simulations differ significantly. Our results demonstrate cases where the differences between the source terrain data sets are significant enough to produce entirely different orographic wake mechanics, such as vortex shedding vs. no vortex shedding. These results are also compared to MODIS visible satellite imagery and ASCAT near-surface wind retrievals. Collectively, these results highlight the importance of utilizing accurate static orographic boundary conditions when running high-resolution mesoscale models.


Author(s):  
Hayk Grigoryan ◽  
Rita Abrahamyan

The Lesser Caucasus Mountains cross the territory of Armenia, creating vast differences in altitude, terrain, temperature and precipitation between provinces and towns. Even Armenia's lowlands are 500 to 1500 m above sea level. Armenia's highlands extend up to Aragats mountain at 4090 m; 75% of the territory is above 1000 m, 50% is above 2000 m, and 3.4% is above 3000 m. This paper presents a cloud service with interactive visualization and analytical capabilities for weather data in Armenia by integrating the two existing infrastructures for observational data and numerical weather prediction. The weather data used in the platform consist of near-surface atmospheric elements including air temperature, relative humidity, pressure, wind and precipitation. The visualization and analytics have been implemented for 2 m air temperature. The cloud service provides the Armenian State Hydrometeorological and Monitoring Service with analytical capabilities for comparing observation data with the results of a numerical weather prediction model, per station and per region, for a given period.


Author(s):  
Simon Veldkamp ◽  
Kirien Whan ◽  
Sjoerd Dirksen ◽  
Maurice Schmeits

Abstract. Current statistical post-processing methods for probabilistic weather forecasting are not capable of using full spatial patterns from the numerical weather prediction (NWP) model. In this paper we incorporate spatial wind speed information by using convolutional neural networks (CNNs) and obtain probabilistic wind speed forecasts in the Netherlands for 48 hours ahead, based on KNMI’s deterministic Harmonie-Arome NWP model. The probabilistic forecasts from the CNNs are shown to have higher Brier skill scores for medium to higher wind speeds, as well as a better continuous ranked probability score (CRPS) and logarithmic score, than the forecasts from fully connected neural networks and quantile regression forests. As a secondary result, we have compared the CNNs using 3 different density estimation methods (quantized softmax (QS), kernel mixture networks, and fitting a truncated normal distribution), and found the probabilistic forecasts based on the QS method to be best.


Author(s):  
SINCLAIR CHINYOKA ◽  
GERT-JAN STEENEVELD ◽  
THIERRY HEDDE

Abstract. This study improves surface wind predictions in an unresolved valley using an artificial neural network (ANN). Forecasting winds in complex terrain with a mesoscale model is challenging. This study assesses the quality of 3-km wind forecasts by the Weather Research and Forecasting (WRF) model and the potential of post-processing by an ANN within the 1-2 km wide Cadarache Valley in southeast France. Operational wind forecasts for 110m above ground level and the near-surface vertical potential temperature gradient with a lead time of 24-48h were used as ANN input. Observed horizontal wind components at 10m within the valley were used as targets during ANN training. We use the Directional ACCuracy (DACC45, wind direction error ≤ 45°) and mean absolute error to evaluate the WRF direct model output and the ANN results. By post-processing, the score for DACC45 improves from 56% in the WRF direct model output to 79% after applying the ANN. Furthermore, the ANN performed well during the day and night, but poorly during the morning and afternoon transitions. The ANN improves the DACC45 at 10m even for poor WRF forecasts (direction bias ≥ 45°) from 42% to 72%. A shorter lead time and finer grid spacing (1 km) showed negligible impact which suggests that a 3 km grid spacing and a 24-48h lead time is effective and relatively cheap to apply. We find that WRF performs well in near-neutral conditions and poorly in other atmospheric stability conditions. The ANN post-treatment consistently improves the wind forecast for all stability classes to a DACC45 of about 80%. The study demonstrates the ability to improve Cadarache valley wind forecasts using an ANN as post-processing for WRF daily forecasts.
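A directional accuracy score of the DACC45 kind can be sketched as the fraction of forecasts whose wind direction error is at most 45°, with the error measured the short way around the compass. The abstract gives only the 45° threshold, so the exact scoring convention here is an assumption, and the function names are hypothetical.

```python
def direction_error(forecast_deg, observed_deg):
    """Smallest absolute angle in degrees between two wind directions (0 to 180)."""
    diff = abs(forecast_deg - observed_deg) % 360.0
    return min(diff, 360.0 - diff)

def dacc(forecast_dirs, observed_dirs, threshold_deg=45.0):
    """Fraction of forecast/observation pairs with direction error <= threshold."""
    hits = sum(
        1
        for f, o in zip(forecast_dirs, observed_dirs)
        if direction_error(f, o) <= threshold_deg
    )
    return hits / len(forecast_dirs)

# Made-up directions in degrees, for illustration only.
score = dacc([350.0, 90.0, 180.0, 0.0], [10.0, 180.0, 170.0, 44.0])
```

The wrap-around handling matters: a 350° forecast against a 10° observation is a 20° error, not 340°.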


2020 ◽  
Author(s):  
Kang Yanyan ◽  
Li Haochen ◽  
Xia Jiangjiang ◽  
Zhang Yingxin

<p>Weather forecasts play an important role in the Olympic Games, especially for the mountain snow events, where they help to find a "window period" for competition. The Winter Olympics tracks are located on very complex terrain, so detailed weather forecasts are needed. A post-processing method based on machine learning is used for weather prediction up to 10 days ahead with 1-km spatial resolution and 1-hour temporal resolution, which can greatly improve the accuracy and refinement of numerical weather prediction (NWP). ECMWF/RMAPS model data and automatic weather station (AWS) data from 2015-2018 are used as training and test data, comprising 48 features and 4 labels (the observed 2m temperature, relative humidity, 10m wind speed and wind direction). The model data are on grid points, while the AWS data are at station points. We use the nearest 9 model points to predict each station point, instead of interpolating between the grid points and the station point, so the number of features in the dataset becomes 48*9. This eliminates the interpolation error from grid point to station and takes the spatial distribution into account to some extent. The machine learning methods we used are SVM, Random Forest, Gradient Boosting Decision Tree (GBDT) and XGBoost. We find that XGBoost performs best, slightly better than GBDT and Random Forest. We did some feature engineering before training and found that more features do not necessarily give a better model; 10 features are enough. Interestingly, features that are closely related to the label values, such as the model-output 2m temperature, 10m wind speed and wind direction, become less important as the forecast time increases, while some features that forecasters pay little attention to, such as latent heat flux and snow depth, become more important in the 6-10 day predictions. It is therefore necessary to train the model with dynamic weight parameters for different forecast times. With post-processing based on machine learning, the forecast accuracy is greatly improved compared with the EC model. The averaged forecast accuracy over 0-10 days for 2m relative humidity and 10m wind speed and direction increased by almost 15%, and the temperature accuracy increased by 20%~40% (40% for 0-3 days, with the accuracy decreasing with forecast time).</p>
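The idea of feeding the 9 nearest model grid points to the learner, rather than a single interpolated value, can be sketched as below. The sketch assumes the 9 points form the 3x3 block centred on the grid point nearest to the station and that the station is not on the grid edge; the abstract does not specify either, and the function name and the toy grid are hypothetical.

```python
def neighbourhood_features(fields, row, col):
    """Flatten the 3x3 block around (row, col) of every 2-D field into one list.

    fields: list of 2-D grids (lists of rows), one grid per model variable.
    row, col: index of the grid point nearest to the station (must be interior).
    """
    features = []
    for field in fields:
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                features.append(field[row + d_row][col + d_col])
    return features

# Toy data: 48 model variables on a 5x5 grid, values chosen only to be traceable.
toy_fields = [
    [[k * 100 + r * 10 + c for c in range(5)] for r in range(5)]
    for k in range(48)
]
feature_vector = neighbourhood_features(toy_fields, 2, 2)
```

With 48 variables this yields 48*9 = 432 inputs per station and forecast time, which a tree-based learner such as those named in the abstract could consume directly.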

