Comparison of data-driven methods for downscaling ensemble weather forecasts

Abstract. This study investigates dynamically different data-driven methods, specifically a statistical downscaling model (SDSM), a time-lagged feedforward neural network (TLFN), and an evolutionary polynomial regression (EPR) technique, for downscaling numerical weather ensemble forecasts generated by a medium range forecast (MRF) model. Given the coarse resolution (about 200-km grid spacing) of the MRF model, optimal use of the weather forecasts at the local or watershed scale requires appropriate downscaling techniques. The selected methods are applied to downscale ensemble daily precipitation and temperature series for the Chute-du-Diable basin located in northeastern Canada. The downscaling results show that the TLFN and EPR perform similarly in downscaling ensemble daily precipitation as well as daily maximum and minimum temperature series, regardless of season. Both the TLFN and EPR are more efficient downscaling techniques than SDSM for both the ensemble daily precipitation and temperature.

Hydrology and Earth System Sciences ◽

10.5194/hess-12-615-2008 ◽

2008 ◽

Vol 12 (2) ◽

pp. 615-624 ◽

Cited By ~ 14

Author(s):

P. Coulibaly ◽

N. Evora

Keyword(s):

Polynomial Regression ◽

Daily Precipitation ◽

Temperature Series ◽

Data Driven ◽

Daily Maximum ◽

Ensemble Forecasts ◽

Weather Forecasts ◽

Evolutionary Polynomial Regression ◽

Medium Range Forecast ◽

Precipitation And Temperature

Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology – Part 1: Concepts and methodology

Hydrology and Earth System Sciences Discussions ◽

10.5194/hessd-6-7055-2009 ◽

2009 ◽

Vol 6 (6) ◽

pp. 7055-7093 ◽

Cited By ~ 4

Author(s):

A. Elshorbagy ◽

G. Corzo ◽

S. Srinivasulu ◽

D. P. Solomatine

Keyword(s):

Polynomial Regression ◽

Predictive Accuracy ◽

Lower Layer ◽

Data Driven ◽

Support Vector ◽

K Nearest Neighbors ◽

Evolutionary Polynomial Regression ◽

Modeling Techniques ◽

Modeling Experiment ◽

Data Driven Modeling

Abstract. A comprehensive data driven modeling experiment is presented in a two-part paper. In this first part, an extensive data-driven modeling experiment is proposed. The most important concerns regarding the way data driven modeling (DDM) techniques and data were handled, compared, and evaluated, and the basis on which findings and conclusions were drawn are discussed. A concise review of key articles that presented comparisons among various DDM techniques is presented. Six DDM techniques, namely, neural networks, genetic programming, evolutionary polynomial regression, support vector machines, M5 model trees, and K-nearest neighbors are proposed and explained. Multiple linear regression and naïve models are also suggested as baselines for comparison with the various techniques. Five datasets from Canada and Europe representing evapotranspiration, upper and lower layer soil moisture content, and rainfall-runoff process are described and proposed for the modeling experiment. Twelve different realizations (groups) from each dataset are created by a procedure involving random sampling. Each group contains three subsets: training, cross-validation, and testing. Each modeling technique is proposed to be applied to each of the 12 groups of each dataset. This way, both predictive accuracy and uncertainty of the modeling techniques can be evaluated. The implementation of the modeling techniques, results and analysis, and the findings of the modeling experiment are deferred to the second part of this paper.

Comparison of three data-driven techniques in modelling the evapotranspiration process

Journal of Hydroinformatics ◽

10.2166/hydro.2010.029 ◽

2010 ◽

Vol 12 (4) ◽

pp. 365-379 ◽

Cited By ~ 21

Author(s):

I. El-Baroudy ◽

A. Elshorbagy ◽

S. K. Carey ◽

O. Giustolisi ◽

D. Savic

Keyword(s):

Case Studies ◽

Water Resource Management ◽

Polynomial Regression ◽

Hydrological Cycle ◽

Global Scale ◽

Data Driven ◽

Actual Evapotranspiration ◽

Time Lags ◽

Evolutionary Polynomial Regression ◽

Input Variables

Evapotranspiration is one of the main components of the hydrological cycle as it accounts for more than two-thirds of the precipitation losses at the global scale. Reliable estimates of actual evapotranspiration are crucial for effective watershed modelling and water resource management, yet direct measurements of the evapotranspiration losses are difficult and expensive. This research explores the utility and effectiveness of data-driven techniques in modelling actual evapotranspiration measured by an eddy covariance system. The authors compare the performance of Evolutionary Polynomial Regression (EPR) to that of Artificial Neural Networks (ANNs) and Genetic Programming (GP). Furthermore, this research investigates the effect of previous states (time lags) of the meteorological input variables on characterizing actual evapotranspiration. The models developed using the EPR, based on the two case studies at the Mildred Lake mine, AB, Canada, provided comparable performance to the models of GP and ANNs. Moreover, the EPR provided simpler models than those developed by the other data-driven techniques, particularly in one of the case studies. The inclusion of the previous states of the input variables slightly enhanced the performance of the developed model, which in turn indicates the dynamic nature of the evapotranspiration process.

Spatio-temporal variation of extreme indices derived from observed and reanalysis products for detection of climate change across India

10.5194/egusphere-egu2020-5837 ◽

2020 ◽

Author(s):

Sachidanand Kumar ◽

Kironmala Chanda ◽

Srinivas Pasupuleti

Keyword(s):

Climate Change ◽

Extreme Precipitation ◽

Daily Precipitation ◽

Temporal Trends ◽

Maximum Temperature ◽

Daily Minimum Temperature ◽

Daily Maximum Temperature ◽

Daily Maximum ◽

Spatio Temporal ◽

Precipitation And Temperature

Abstract. This article reports the research findings of a recent study (Kumar et al., 2020) that utilizes eight indices of climate change recommended by the Expert Team on Climate Change Detection and Indices (ETCCDI) for analyzing spatio-temporal trends in extreme precipitation and temperature at the daily scale across India. Observed gridded precipitation (1971-2017) and temperature (1971-2013) datasets from the India Meteorological Department (IMD) are used along with reanalysis products from the Climate Prediction Centre (CPC). The trends are estimated using the non-parametric Mann-Kendall (MK) test and regression analysis. The trends in ‘wet days’ (daily precipitation greater than the 95th percentile) and ‘dry days’ (daily precipitation lower than the 5th percentile) are examined considering the entire year (annual) as well as the monsoon months only (seasonal). At the annual scale, about 13% of the grid locations indicated a significant trend (either increasing or decreasing at the 5% significance level) in the index R95p (rainfall contribution from extreme ‘wet days’), while 20% of the locations indicated a significant trend in R5p (rainfall contribution from extreme ‘dry days’). For the seasonal analysis (June to September), the corresponding figures are nil and 21%, respectively. The spatio-temporal trends in ‘warm days’ (daily maximum temperature greater than the 95th percentile), ‘warm nights’ (daily minimum temperature greater than the 95th percentile), ‘cold days’ (daily maximum temperature lower than the 5th percentile) and ‘cold nights’ (daily minimum temperature lower than the 5th percentile) are also investigated for the aforementioned period.
The number of ‘warm days’ per year increased significantly at 14% of the locations, while the number of ‘cold days’, ‘warm nights’ and ‘cold nights’ per year decreased significantly at 42%, 34% and 39% of the locations, respectively. The extreme temperature indices are also investigated for the future using CanESM2 projected data for RCP8.5 after suitable bias correction. Most of the locations (49% to 84%) indicate a significant increasing (decreasing) trend in ‘warm days’ (‘cold days’) in the three epochs, 2006-2040, 2041-2070 and 2071-2100. Moreover, most locations (60% to 81%) show an increasing trend in ‘warm nights’ and a decreasing trend in ‘cold nights’ in all the epochs. A similar investigation for the historical and future periods using CPC data as the reference indicates that the trends, compared with IMD observations, seem to be in agreement for temperature extremes but are spatially more extensive in the case of CPC precipitation extremes.

Keywords: extreme precipitation and temperature, climate change indices, spatio-temporal variation, India

References: Kumar S., Chanda K., Srinivas P. (2020), Spatiotemporal analysis of extreme indices derived from daily precipitation and temperature for climate change detection over India, Theoretical and Applied Climatology, Springer, in press, DOI: 10.1007/s00704-020-03088-5.
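
The abstract above estimates trends with the Mann-Kendall test. As a rough illustration only, the following Python sketch computes the MK statistic S, a standardized score Z, and a two-sided p-value under the normal approximation. The function name `mann_kendall` is hypothetical and not taken from the study, and the variance here omits the tie correction, which real index series with many repeated values would likely need.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Simplifying assumption: the variance of S has no tie correction.
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S counts concordant minus discordant pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    var_s = n * (n - 1) * (2 * n + 5) / 18.0

    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p
```

Applied to the yearly count of, say, ‘warm days’ at one grid location, a p-value below 0.05 would correspond to a significant trend at the 5% level used in the abstract, with the sign of S giving the direction.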

Crisp discharge forecasts and grey uncertainty bands using data-driven models

Hydrology Research ◽

10.2166/nh.2012.121 ◽

2012 ◽

Vol 43 (5) ◽

pp. 589-602 ◽

Cited By ~ 9

Author(s):

S. Alvisi ◽

E. Creaco ◽

M. Franchini

Keyword(s):

Polynomial Regression ◽

Standard Technique ◽

Least Square ◽

Data Driven ◽

Lead Times ◽

Ann Model ◽

Total Uncertainty ◽

Evolutionary Polynomial Regression ◽

Similar Accuracy ◽

Comparison Of The Results

A data-driven artificial neural network (ANN) model and a data-driven evolutionary polynomial regression (EPR) model are here used to set up two real-time crisp discharge forecasting models whose crisp parameters are estimated through the least-square criterion. In order to represent the total uncertainty of each model in performing the forecast, their parameters are then considered as grey numbers. Comparison of the results obtained through the application of the two models to a real case study shows that the crisp models based on ANN and EPR provide similar accuracy for short forecasting lead times; for long forecasting lead times, the performance of the EPR model deteriorates with respect to that of the ANN model. As regards the uncertainty bands produced by the grey formulation of the two data-driven models, it is shown that, in the ANN case, these bands are on average narrower than those obtained by using a standard technique such as the Box–Cox transformation of the errors; in the EPR case, these bands are on average larger. These results therefore suggest that the performance of a grey data-driven model depends on its inner structure and that, for the specific models here considered, the ANN is to be preferred.

A symbolic data-driven technique based on evolutionary polynomial regression

Journal of Hydroinformatics ◽

10.2166/hydro.2006.020b ◽

2006 ◽

Vol 8 (3) ◽

pp. 207-222 ◽

Cited By ~ 174

Author(s):

Orazio Giustolisi ◽

Dragan A. Savic

Keyword(s):

Polynomial Regression ◽

Computing Methodology ◽

Resistance Coefficient ◽

Regression Method ◽

Data Driven ◽

Evolutionary Polynomial Regression ◽

Symbolic Data ◽

Computational Performance ◽

Regression Techniques ◽

Physical Insight

This paper describes a new hybrid regression method that combines the best features of conventional numerical regression techniques with the genetic programming symbolic regression technique. The key idea is to employ an evolutionary computing methodology to search for a model of the system/process being modelled and to employ parameter estimation to obtain constants using least squares. The new technique, termed Evolutionary Polynomial Regression (EPR), overcomes shortcomings in the GP process, such as computational performance, the number of evolutionary parameters to tune, and the complexity of the symbolic models. Similarly, it alleviates issues arising from numerical regression, including difficulties in using physical insight and over-fitting problems. This paper demonstrates that EPR is good, both in interpolating data and in scientific knowledge discovery. As an illustration, EPR is used to identify polynomial formulæ with progressively increasing levels of noise, to interpolate the Colebrook-White formula for a pipe resistance coefficient and to discover a formula for a resistance coefficient from experimental data.
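
The two-stage idea in the abstract above (a structural search for the model form, then least squares for the constants) can be sketched in a few lines of Python. This is a deliberately reduced illustration and not the authors' algorithm: it assumes a single polynomial term plus a bias, y = a0 + a1 * x1^e1 * ... * xk^ek, and it swaps the evolutionary search for an exhaustive search over a small set of candidate exponents. The names `fit_line` and `epr_like_search` and the default exponent set are hypothetical.

```python
import itertools


def fit_line(z, y):
    """Closed-form ordinary least squares for y = a0 + a1 * z."""
    n = len(z)
    mz = sum(z) / n
    my = sum(y) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    if szz == 0.0:
        # The term is constant, so only the bias can be estimated.
        return my, 0.0
    a1 = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / szz
    return my - a1 * mz, a1


def epr_like_search(X, y, exponents=(0, 0.5, 1, 2)):
    """Return (sse, exponents, a0, a1) for the best single-term model.

    Assumption: exhaustive search over the exponent grid stands in for
    the evolutionary search; inputs are assumed to be positive.
    """
    best = None
    k = len(X[0])
    for es in itertools.product(exponents, repeat=k):
        # Structural step: build the candidate term for every sample.
        z = []
        for row in X:
            term = 1.0
            for x, e in zip(row, es):
                term *= x ** e
            z.append(term)
        # Parameter step: least squares for the constants.
        a0, a1 = fit_line(z, y)
        sse = sum((a0 + a1 * zi - yi) ** 2 for zi, yi in zip(z, y))
        if best is None or sse < best[0]:
            best = (sse, es, a0, a1)
    return best
```

Because the result is an explicit exponent tuple and two constants, the fitted model can be read off as a formula, which is the kind of symbolic output the abstract contrasts with purely numerical regression.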

Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology - Part 1: Concepts and methodology

Hydrology and Earth System Sciences ◽

10.5194/hess-14-1931-2010 ◽

2010 ◽

Vol 14 (10) ◽

pp. 1931-1941 ◽

Cited By ~ 120

Author(s):

A. Elshorbagy ◽

G. Corzo ◽

S. Srinivasulu ◽

D. P. Solomatine

Keyword(s):

Polynomial Regression ◽

Lower Layer ◽

Data Driven ◽

Support Vector ◽

K Nearest Neighbors ◽

Evolutionary Polynomial Regression ◽

Vector Machines ◽

Modeling Techniques ◽

Modeling Experiment ◽

Data Driven Modeling

Abstract. A comprehensive data driven modeling experiment is presented in a two-part paper. In this first part, an extensive data-driven modeling experiment is proposed. The most important concerns regarding the way data driven modeling (DDM) techniques and data were handled, compared, and evaluated, and the basis on which findings and conclusions were drawn are discussed. A concise review of key articles that presented comparisons among various DDM techniques is presented. Six DDM techniques, namely, neural networks, genetic programming, evolutionary polynomial regression, support vector machines, M5 model trees, and K-nearest neighbors are proposed and explained. Multiple linear regression and naïve models are also suggested as baselines for comparison with the various techniques. Five datasets from Canada and Europe representing evapotranspiration, upper and lower layer soil moisture content, and rainfall-runoff process are described and proposed, in the second paper, for the modeling experiment. Twelve different realizations (groups) from each dataset are created by a procedure involving random sampling. Each group contains three subsets: training, cross-validation, and testing. Each modeling technique is proposed to be applied to each of the 12 groups of each dataset. This way, both prediction accuracy and uncertainty of the modeling techniques can be evaluated. The description of the datasets, the implementation of the modeling techniques, results and analysis, and the findings of the modeling experiment are deferred to the second part of this paper.
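
The resampling design described in the abstract above (twelve random realizations, each split into training, cross-validation, and testing subsets) might look roughly like the following Python sketch. The abstract does not state the subset proportions or the sampling procedure, so the 50/25/25 split, the plain shuffle, and the name `make_realizations` are all assumptions made for illustration.

```python
import random


def make_realizations(n_samples, n_groups=12,
                      fractions=(0.5, 0.25, 0.25), seed=0):
    """Create n_groups random splits of sample indices.

    Each split holds training, cross-validation, and testing subsets.
    Assumption: the split fractions are illustrative, not from the paper.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)
    n_train = int(fractions[0] * n_samples)
    n_cv = int(fractions[1] * n_samples)

    groups = []
    for _ in range(n_groups):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random ordering per realization
        groups.append({
            "training": idx[:n_train],
            "cross_validation": idx[n_train:n_train + n_cv],
            "testing": idx[n_train + n_cv:],
        })
    return groups
```

Fitting one technique on each of the twelve groups gives twelve test scores per dataset. Their average would speak to accuracy and their spread to uncertainty, which is one plausible reading of how the design lets both be evaluated.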

Downscaling Precipitation and Temperature with Temporal Neural Networks

Journal of Hydrometeorology ◽

10.1175/jhm409.1 ◽

2005 ◽

Vol 6 (4) ◽

pp. 483-496 ◽

Cited By ~ 102

Author(s):

Paulin Coulibaly ◽

Yonas B. Dibike ◽

François Anctil

Keyword(s):

Neural Network ◽

Climate Model ◽

Global Climate ◽

Global Climate Model ◽

Temperature Series ◽

Specific Humidity ◽

Daily Maximum ◽

Impact Studies ◽

Hydrologic Impact ◽

Precipitation And Temperature

Abstract. The issues of downscaling the outputs of a global climate model (GCM) to a scale that is appropriate to hydrological impact studies are investigated using a temporal neural network approach. The time-lagged feed-forward neural network (TLFN) is proposed for downscaling daily total precipitation and daily maximum and minimum temperature series for the Serpent River watershed in northern Quebec (Canada). The downscaling models are developed and validated using large-scale predictor variables derived from the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis dataset. Atmospheric predictors such as specific humidity, wind velocity, and geopotential height are identified as the most relevant inputs to the downscaling models. The performance of the TLFN downscaling model is also compared to a statistical downscaling model (SDSM). The downscaling results suggest that the TLFN is an efficient method for downscaling both daily precipitation and temperature series. The best downscaling models were then applied to the outputs of the Canadian Global Climate Model (CGCM1), forced with the Intergovernmental Panel on Climate Change (IPCC) IS92a scenario. Changes in average precipitation between the current and the future scenarios predicted by the TLFN are generally found to be smaller than those predicted by the SDSM model. Furthermore, application of the downscaled data for hydrologic impact analysis in the Serpent River resulted in an overall increasing trend in mean annual flow as well as earlier spring peak flow. The results also demonstrate the emphasis that should be given to identifying the appropriate downscaling tools for impact studies by showing how a future climate scenario downscaled with different downscaling methods could result in significantly different hydrologic impact simulation results for the same watershed.
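
A time-lagged feed-forward network sees the predictors at the current time step together with their values at earlier steps. One common way to realize this, shown below as a minimal Python sketch, is to build a delay-line input vector for each day before passing it to an ordinary feed-forward network. The function name `make_lagged_inputs` and the flattening order are illustrative assumptions. The abstract does not specify the memory structure or lag depth used in the study.

```python
def make_lagged_inputs(predictors, targets, n_lags):
    """Pair each target with predictor vectors at t, t-1, ..., t-n_lags.

    predictors: list of per-day predictor lists (e.g. humidity, wind).
    targets: list of per-day local values (e.g. station precipitation).
    The first n_lags days are dropped because their history is incomplete.
    """
    if len(predictors) != len(targets):
        raise ValueError("predictors and targets must have equal length")
    samples = []
    for t in range(n_lags, len(predictors)):
        window = []
        for lag in range(n_lags + 1):
            # Most recent day first, then progressively older days.
            window.extend(predictors[t - lag])
        samples.append((window, targets[t]))
    return samples
```

With two predictors and `n_lags=2`, each input vector has six entries, so the network's input layer grows with the lag depth while the rest of the architecture is unchanged.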

Postprocessing of Ensemble Weather Forecasts Using a Stochastic Weather Generator

Monthly Weather Review ◽

10.1175/mwr-d-13-00180.1 ◽

2014 ◽

Vol 142 (3) ◽

pp. 1106-1124 ◽

Cited By ~ 13

Author(s):

Jie Chen ◽

François P. Brissette ◽

Zhi Li

Keyword(s):

Bias Correction ◽

Weather Forecast ◽

Correction Method ◽

Weather Generator ◽

Ensemble Forecasts ◽

Stochastic Weather Generator ◽

Weather Forecasts ◽

Bias Correction Method ◽

Ensemble Weather Forecasts ◽

Precipitation And Temperature

Abstract. This study proposes a new statistical method for postprocessing ensemble weather forecasts using a stochastic weather generator. Key parameters of the weather generator were linked to the ensemble forecast means for both precipitation and temperature, allowing the generation of an infinite number of daily time series that are fully coherent with the ensemble weather forecast. This method was verified through postprocessing reforecast datasets derived from the Global Forecast System (GFS) for forecast leads ranging between 1 and 7 days over two Canadian watersheds in the Province of Quebec. The calibration of the ensemble weather forecasts was based on a cross-validation approach that leaves one year out for validation and uses the remaining years for training the model. The proposed method was compared with a simple bias correction method for ensemble precipitation and temperature forecasts using a set of deterministic and probabilistic metrics. The results show underdispersion and biases for the raw GFS ensemble weather forecasts, which indicated that they were poorly calibrated. The proposed method significantly increased the predictive power of ensemble weather forecasts for forecast leads ranging between 1 and 7 days, and was consistently better than the bias correction method. The ability to generate discrete, autocorrelated daily time series makes the ensemble weather forecasts straightforward to use in forecasting models commonly used in the fields of hydrology or agriculture. This study further indicates that the calibration of ensemble forecasts for a period up to one week is reasonable for precipitation, and for temperature it could be reasonable for another week.
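
The leave-one-year-out calibration mentioned in the abstract above can be illustrated with the simple baseline rather than the weather generator itself. The Python sketch below holds out each year in turn, estimates a bias from the remaining years, and corrects the held-out forecasts. The abstract does not define its "simple bias correction method", so the additive mean bias used here is an assumption (a multiplicative factor may suit precipitation better), and all function names are hypothetical.

```python
def leave_one_year_out(years):
    """Yield (training_years, validation_year) for each distinct year."""
    distinct = sorted(set(years))
    for held_out in distinct:
        yield [y for y in distinct if y != held_out], held_out


def mean_bias(forecasts, observations):
    """Additive bias: mean forecast minus mean observation."""
    return (sum(forecasts) / len(forecasts)
            - sum(observations) / len(observations))


def cross_validated_correction(records):
    """Correct each year's forecasts with a bias fitted on the other years.

    records: list of (year, forecast, observation) tuples covering at
    least two distinct years. Returns {year: [corrected forecasts]}.
    """
    corrected = {}
    all_years = [r[0] for r in records]
    for train_years, val_year in leave_one_year_out(all_years):
        train = [r for r in records if r[0] in train_years]
        bias = mean_bias([r[1] for r in train], [r[2] for r in train])
        corrected[val_year] = [r[1] - bias for r in records
                               if r[0] == val_year]
    return corrected
```

Keeping the validation year out of the bias estimate means each corrected forecast is scored on data the calibration never saw, which is the point of the cross-validation design.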
