Coupled machine learning and the limits of acceptability approach applied in parameter identification for a distributed hydrological model

Abstract. Monte Carlo (MC) methods have been widely used in uncertainty analysis and parameter identification for hydrological models. The main challenge with these approaches is, however, the prohibitive number of model runs required to obtain an adequate sample size, which may take from days to months, especially when the simulations are run in distributed mode. In the past, emulators have been used to minimize the computational burden of the MC simulation through direct estimation of the residual-based response surfaces. Here, we apply emulators of an MC simulation in parameter identification for a distributed conceptual hydrological model using two likelihood measures, i.e. the absolute bias of model predictions (Score) and another based on the time-relaxed limits of acceptability concept (pLoA). Three machine learning models (MLMs) were built using model parameter sets and response surfaces with a limited number of model realizations (4000). The developed MLMs were applied to predict pLoA and Score for a large set of model parameters (95 000). The behavioural parameter sets were identified using a time-relaxed limits of acceptability approach based on the predicted pLoA values, and applied to estimate the quantile streamflow predictions weighted by their respective Score. The three MLMs were able to adequately mimic the response surfaces directly estimated from MC simulations, and the models identified using the coupled ML emulators and the limits of acceptability approach performed very well in reproducing the median streamflow prediction during both the calibration and validation periods.


Coupled machine learning and the limits of acceptability approach applied in parameter identification for a distributed hydrological model

Hydrology and Earth System Sciences ◽

10.5194/hess-24-4641-2020 ◽

2020 ◽

Vol 24 (9) ◽

pp. 4641-4658 ◽

Cited By ~ 1

Author(s):

Aynom T. Teweldebrhan ◽

Thomas V. Schuler ◽

John F. Burkhart ◽

Morten Hjorth-Jensen

Keyword(s):

Machine Learning ◽

Parameter Identification ◽

Hydrological Model ◽

Response Surfaces ◽

Model Parameters ◽

Large Set ◽

Distributed Hydrological Model ◽

Streamflow Prediction ◽

Main Challenge ◽

MC Simulation

Abstract. Monte Carlo (MC) methods have been widely used in uncertainty analysis and parameter identification for hydrological models. The main challenge with these approaches is, however, the prohibitive number of model runs required to acquire an adequate sample size, which may take from days to months – especially when the simulations are run in distributed mode. In the past, emulators have been used to minimize the computational burden of the MC simulation through direct estimation of the residual-based response surfaces. Here, we apply emulators of an MC simulation in parameter identification for a distributed conceptual hydrological model using two likelihood measures, i.e. the absolute bias of model predictions (Score) and another based on the time-relaxed limits of acceptability concept (pLoA). Three machine-learning models (MLMs) were built using model parameter sets and response surfaces with a limited number of model realizations (4000). The developed MLMs were applied to predict pLoA and Score for a large set of model parameters (95 000). The behavioural parameter sets were identified using a time-relaxed limits of acceptability approach, based on the predicted pLoA values, and applied to estimate the quantile streamflow predictions weighted by their respective Score. The three MLMs were able to adequately mimic the response surfaces directly estimated from MC simulations with an R² value of 0.7 to 0.92. Similarly, the models identified using the coupled machine-learning (ML) emulators and limits of acceptability approach have performed very well in reproducing the median streamflow prediction during the calibration and validation periods, with an average Nash–Sutcliffe efficiency value of 0.89 and 0.83, respectively.
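The workflow described in this abstract (train an emulator on a small sample, predict pLoA for many more parameter sets, keep the behavioural sets, then form weighted quantiles) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the k-nearest-neighbour emulator is only a stand-in for the three MLMs, `toy_ploa` is a hypothetical replacement for an expensive hydrological model run, the threshold of 0.7 is arbitrary, and the predicted pLoA is used as the weight where the abstract uses Score.

```python
import random

def knn_predict(train_x, train_y, query, k=5):
    """Emulator stand-in: average the targets of the k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)

def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalised weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for v, w in pairs:
        cum += w / total
        if cum >= q:
            return v
    return pairs[-1][0]  # guard against rounding when q is close to 1

def toy_ploa(params):
    """Hypothetical stand-in for one model run; returns a pLoA-like value in [0, 1]."""
    a, b = params
    return max(0.0, 1.0 - abs(a - 0.5) - abs(b - 0.5))

rng = random.Random(0)

# Small "expensive" sample used to train the emulator.
train_x = [(rng.random(), rng.random()) for _ in range(200)]
train_y = [toy_ploa(p) for p in train_x]

# Larger candidate sample that is only evaluated through the emulator.
candidates = [(rng.random(), rng.random()) for _ in range(1000)]
predicted = [knn_predict(train_x, train_y, c) for c in candidates]

# Behavioural sets: predicted pLoA at or above an assumed threshold.
threshold = 0.7
behavioural = [(c, p) for c, p in zip(candidates, predicted) if p >= threshold]

# Toy "streamflow" per behavioural set, weighted here by predicted pLoA.
flows = [10.0 + 5.0 * c[0] for c, _ in behavioural]
weights = [p for _, p in behavioural]
median_flow = weighted_quantile(flows, weights, 0.5)
```

The emulator is never asked for streamflow itself; it only ranks parameter sets, so the full model would still have to be run for the behavioural sets that survive the threshold.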


Improving streamflow predictions at ungauged locations with real-time updating: application of an EnKF-based state-parameter estimation strategy

Hydrology and Earth System Sciences ◽

10.5194/hess-18-3923-2014 ◽

2014 ◽

Vol 18 (10) ◽

pp. 3923-3936 ◽

Cited By ~ 20

Author(s):

X. Xie ◽

S. Meng ◽

S. Liang ◽

Y. Yao

Keyword(s):

Real Time ◽

Land Surface ◽

Hydrological Model ◽

Similarity Measures ◽

State Parameter ◽

Model Parameters ◽

Prediction Errors ◽

Distributed Hydrological Model ◽

Streamflow Prediction ◽

Short Period

Abstract. The challenge of streamflow predictions at ungauged locations is primarily attributed to various uncertainties in hydrological modelling. Many studies have been devoted to addressing this issue. The similarity regionalization approach, a commonly used strategy, is usually limited by subjective selection of similarity measures. This paper presents an application of a partitioned update scheme based on the ensemble Kalman filter (EnKF) to reduce the prediction uncertainties. This scheme performs real-time updating for states and parameters of a distributed hydrological model by assimilating gauged streamflow. The streamflow predictions are constrained by the physical rainfall-runoff processes defined in the distributed hydrological model and by the correlation information transferred from gauged to ungauged basins. This scheme is successfully demonstrated in a nested basin with real-world hydrological data where the subbasins have immediate upstream and downstream neighbours. The results suggest that the assimilated observed data from downstream neighbours have more important roles in reducing the streamflow prediction errors at ungauged locations. The real-time updated model parameters remain stable with reasonable spreads after short-period assimilation, while their estimation trajectories have slow variations, which may be attributable to climate and land surface changes. Although this real-time updating scheme is intended for streamflow predictions in nested basins, it can be a valuable tool in separate basins to improve hydrological predictions by assimilating multi-source data sets, including ground-based and remote-sensing observations.
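To make the idea of jointly updating states and parameters with an ensemble Kalman filter concrete, here is a minimal sketch of a stochastic EnKF analysis step on an augmented (state, parameter) ensemble with one scalar observation. It is the textbook augmented-state form, not the partitioned scheme of the paper, and the linear-reservoir toy problem (`flow = k * storage`), the ensemble size, and all names are assumptions made for illustration.

```python
import random

def enkf_update(states, params, obs, obs_var, rng):
    """One stochastic EnKF analysis step; the observation measures the state directly."""
    n = len(states)
    mean_s = sum(states) / n
    mean_p = sum(params) / n
    # Sample variance of the state and state-parameter covariance.
    var_s = sum((s - mean_s) ** 2 for s in states) / (n - 1)
    cov_ps = sum((p - mean_p) * (s - mean_s)
                 for p, s in zip(params, states)) / (n - 1)
    # Kalman gains for the state and for the (unobserved) parameter.
    gain_s = var_s / (var_s + obs_var)
    gain_p = cov_ps / (var_s + obs_var)
    new_states, new_params = [], []
    for s, p in zip(states, params):
        # Each member assimilates its own perturbed copy of the observation.
        perturbed = obs + rng.gauss(0.0, obs_var ** 0.5)
        innovation = perturbed - s
        new_states.append(s + gain_s * innovation)
        new_params.append(p + gain_p * innovation)
    return new_states, new_params

# Toy problem (hypothetical): simulated flow = k * storage, with unknown k.
rng = random.Random(42)
true_k, storage = 0.3, 100.0
params = [rng.uniform(0.1, 0.9) for _ in range(50)]
states = [k * storage for k in params]
obs = true_k * storage

new_states, new_params = enkf_update(states, params, obs, 1.0, rng)
prior_mean_k = sum(params) / len(params)
posterior_mean_k = sum(new_params) / len(new_params)
```

The parameter is never observed; it is corrected only through its sample covariance with the simulated flow, which is the same mechanism that lets gauged streamflow inform parameters elsewhere in a correlated system.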


Improving streamflow predictions at ungauged locations with real-time updating: application of an EnKF-based state-parameter estimation strategy

Hydrology and Earth System Sciences Discussions ◽

10.5194/hessd-10-13441-2013 ◽

2013 ◽

Vol 10 (11) ◽

pp. 13441-13473

Author(s):

X. Xie ◽

S. Meng ◽

S. Liang ◽

Y. Yao

Keyword(s):

Real Time ◽

Land Surface ◽

Hydrological Model ◽

Similarity Measures ◽

State Parameter ◽

Model Parameters ◽

Prediction Errors ◽

Distributed Hydrological Model ◽

Streamflow Prediction ◽

Short Period

Abstract. The challenge of streamflow predictions at ungauged locations is primarily attributed to various uncertainties in hydrological modelling. Many studies have been devoted to addressing this issue. The similarity regionalization approach, a commonly used strategy, is usually limited by subjective selection of similarity measures. This paper presents an application of a partitioned update scheme based on the ensemble Kalman filter (EnKF) to reduce the prediction uncertainties. This scheme performs real-time updating for states and parameters of a distributed hydrological model by assimilating gauged streamflow. The streamflow predictions are constrained by the physical rainfall-runoff processes defined in the distributed hydrological model and by the correlation information transferred from gauged to ungauged basins. This scheme is successfully demonstrated in a nested basin with real-world hydrological data where the subbasins have immediate upstream and downstream neighbours. The results suggest that the assimilated observed data from downstream neighbours have more important roles in reducing the streamflow prediction errors at ungauged locations. The real-time updated model parameters remain stable after short-period assimilation, while their estimation trajectories have slow variations, which may be attributable to climate and land surface changes. Although this real-time updating scheme is intended for streamflow predictions in nested basins, it can be a valuable tool in separate basins to improve hydrological predictions by assimilating multi-source datasets, e.g. ground-based and remote-sensing observations.


Conditioning Model Ensembles to Various Observed Data (Field and Regional Level) by Applying Machine-Learning-Augmented Workflows to a Mature Field with 70 Years of Production History

SPE Reservoir Evaluation & Engineering ◽

10.2118/205188-pa ◽

2021 ◽

pp. 1-18

Author(s):

Gisela Vanegas ◽

John Nejedlik ◽

Pascale Neff ◽

Torsten Clemens

Keyword(s):

Machine Learning ◽

Oil Recovery ◽

Numerical Models ◽

Operating Conditions ◽

Model Parameters ◽

Large Set ◽

Model Parameter ◽

Production History ◽

Hydrocarbon Fields ◽

Parameter Distributions

Summary. Forecasting production from hydrocarbon fields is challenging because of the large number of uncertain model parameters and the multitude of observed data that are measured. The large number of model parameters leads to uncertainty in the production forecast from hydrocarbon fields. Changing operating conditions [e.g., implementation of improved oil recovery or enhanced oil recovery (EOR)] results in model parameters becoming sensitive in the forecast that were not sensitive during the production history. Hence, simulation approaches need to be able to address uncertainty in model parameters as well as conditioning numerical models to a multitude of different observed data. Sampling from distributions of various geological and dynamic parameters allows for the generation of an ensemble of numerical models that could be falsified using principal-component analysis (PCA) for different observed data. If the numerical models are not falsified, machine-learning (ML) approaches can be used to generate a large set of parameter combinations that can be conditioned to the different observed data. The data conditioning is followed by a final step ensuring that parameter interactions are covered. The methodology was applied to a sandstone oil reservoir with more than 70 years of production history containing dozens of wells. The resulting ensemble of numerical models is conditioned to all observed data. Furthermore, the resulting posterior-model parameter distributions are only modified from the prior-model parameter distributions if the observed data are informative for the model parameters. Hence, changes in operating conditions can be forecast under uncertainty, which is essential if nonsensitive parameters in the history are sensitive in the forecast.


Diagnostic calibration of a hydrological model in an alpine area

Hydrology and Earth System Sciences Discussions ◽

10.5194/hessd-11-1253-2014 ◽

2014 ◽

Vol 11 (1) ◽

pp. 1253-1300 ◽

Cited By ~ 4

Author(s):

Z. He ◽

F. Tian ◽

H. C. Hu ◽

H. V. Gupta ◽

H. P. Hu

Keyword(s):

Hydrological Modeling ◽

Hydrological Model ◽

Generation Mechanism ◽

Spatiotemporal Variability ◽

Model Parameters ◽

Distributed Hydrological Model ◽

Runoff Generation ◽

Alpine Area ◽

Long Time ◽

Tailan River Basin

Abstract. Hydrological modeling depends on single- or multiple-objective strategies for parameter calibration using long time sequences of observed streamflow. Here, we demonstrate a diagnostic approach to the calibration of a hydrological model of an alpine area in which we partition the hydrograph based on the dominant runoff generation mechanism (groundwater baseflow, glacier melt, snowmelt, and direct runoff). The partitioning reflects the spatiotemporal variability in snowpack, glaciers, and temperature. Model parameters are grouped by runoff generation mechanism, and each group is calibrated separately via a stepwise approach. This strategy helps to reduce the problem of equifinality and, hence, model uncertainty. We demonstrate the method for the Tailan River basin (1324 km²) in the Tianshan Mountains of China with the help of a semi-distributed hydrological model (THREW).


Using nowcasting technique and data assimilation in a meteorological model to improve very short range hydrological forecasts

Hydrology and Earth System Sciences ◽

10.5194/hess-23-3823-2019 ◽

2019 ◽

Vol 23 (9) ◽

pp. 3823-3841 ◽

Cited By ~ 7

Author(s):

Maria Laura Poletti ◽

Francesco Silvestro ◽

Silvio Davolio ◽

Flavio Pignone ◽

Nicola Rebora

Keyword(s):

Data Assimilation ◽

Hydrological Model ◽

Weather Prediction ◽

Distributed Hydrological Model ◽

Streamflow Prediction ◽

Flood Prediction ◽

Meteorological Model ◽

Distributed Approach ◽

NWP Model ◽

Small Catchments

Abstract. Forecasting flash floods some hours in advance is still a challenge, especially in environments made up of many small catchments. Hydrometeorological forecasting systems generally allow for predicting the possibility of having very intense rainfall events on quite large areas with good performance, even 12–24 h in advance. However, they are not able to predict the exact rainfall location if we consider portions of a territory of 10 to 1000 km² as the order of magnitude. The scope of this work is to exploit both observations and modelling sources to improve the discharge prediction in small catchments with a lead time of 2–8 h. The models used to achieve the goal are essentially (i) a probabilistic rainfall nowcasting model able to extrapolate the rainfall evolution from observations, (ii) a non-hydrostatic high-resolution numerical weather prediction (NWP) model and (iii) a distributed hydrological model able to provide a streamflow prediction in each pixel of the studied domain. These tools are used, together with radar observations, in a synergistic way, exploiting the information of each element in order to complement each other. For this purpose observations are used in a frequently updated data assimilation framework to drive the NWP system, whose output is in turn used to improve the information as input to the nowcasting technique in terms of a predicted rainfall volume trend; finally nowcasting and NWP outputs are blended, generating an ensemble of rainfall scenarios used to feed the hydrological model and produce a prediction in terms of streamflow. The flood prediction system is applied to three major events that occurred in the Liguria region (Italy), first to produce a standard analysis on predefined basin control sections and then using a distributed approach that exploits the capabilities of the employed hydrological model. The results obtained for these three analysed events show that the use of the present approach is promising. Although not in all cases, the blending technique clearly enhances the prediction capacity of the hydrological nowcasting chain with respect to the use of input coming only from the nowcasting technique; moreover, a worsening of the performance is observed less often and, where it occurs, is ascribable to the critical transition between the nowcasting and the NWP model rainfall field.


Streamflow prediction in ungauged basins: benchmarking the efficiency of deep learning

E3S Web of Conferences ◽

10.1051/e3sconf/202016301001 ◽

2020 ◽

Vol 163 ◽

pp. 01001

Author(s):

Georgy Ayzel ◽

Liubov Kurochkina ◽

Eduard Kazakov ◽

Sergei Zhuravlev

Keyword(s):

Deep Learning ◽

Hydrological Model ◽

State Of The Art ◽

Flash Flood ◽

Early Warning Systems ◽

Model Parameters ◽

Ungauged Basins ◽

Streamflow Prediction ◽

Basin Scale ◽

The Impact

Streamflow prediction is a vital public service that helps to establish flash-flood early warning systems and to assess the impact of projected climate change on water management. However, the availability of streamflow observations limits the use of state-of-the-art streamflow prediction techniques to basins where hydrometric gauging stations exist. Since most river basins in the world are ungauged, the development of specialized techniques for reliable streamflow prediction in ungauged basins (PUB) is of crucial importance. In recent years, the emerging field of deep learning has provided a myriad of new models that can breathe new life into the stagnating PUB methods. In the presented study, we benchmark the streamflow prediction efficiency of Long Short-Term Memory (LSTM) networks against the standard technique of GR4J hydrological model parameter regionalization (HMREG) at 200 basins in Northwest Russia. Results show that the LSTM-based regional hydrological model significantly outperforms the HMREG scheme in terms of median Nash-Sutcliffe efficiency (NSE), which is 0.73 and 0.61 for LSTM and HMREG, respectively. Moreover, LSTM achieves a median NSE comparable to that of the basin-scale calibration of GR4J (0.75). Therefore, this study underlines the high potential of deep learning for PUB by demonstrating new state-of-the-art performance in this field.
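For readers unfamiliar with the benchmark metric, the Nash-Sutcliffe efficiency compares the squared error of a simulation with the variance of the observations around their mean: NSE = 1 - Σ(sim - obs)² / Σ(obs - mean(obs))². A minimal sketch follows; the function name and the error handling are choices made here for illustration, not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the simulation against the observations.
    error = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    # Spread of the observations around their own mean.
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - error / spread
```

A value of 0 means the simulation is no better than always predicting the observed mean, and negative values mean it is worse, which is why median NSE across basins is a common way to summarise regional performance.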


Machine Learning Model for GSM BSC Control Plane Units

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1044.0886s19 ◽

2019 ◽

Vol 8 (6S) ◽

pp. 219-223

Keyword(s):

Machine Learning ◽

Back Propagation ◽

Back Propagation Neural Network ◽

Model Parameters ◽

Large Set ◽

Data Set ◽

Wide Acceptance ◽

Machine Learning Model ◽

Machine Learning Approach ◽

Accuracy Of Prediction

At maximum traffic intensity, i.e. during the busy hour, the measured CPU load of the GSM BSC signalling units (BSUs) is at its peak. A BSU's CPU load is a function of the number of transceivers (TRXs) mapped to it and hence of the volume of offered traffic handled by the unit. The CPU load is also a function of the nature of the offered load, i.e. for the same volume of offered load, the CPU load under the nominal traffic profile differs from that under some other arbitrary traffic profile. To manage future traffic growth, a model to estimate the BSU CPU load is essential. In recent times, using Machine Learning (ML) to develop such a model has gained wide acceptance. Because the CPU load depends on a large set of parameters and is therefore difficult to estimate, a machine learning approach is more scalable. In this paper, we describe a back-propagation neural network model that was developed to estimate the BSU CPU load. We describe the model parameters, design choices and implementation architecture, and estimate its prediction accuracy on an evaluation data set. We also discuss alternative ML architectures and compare their prediction accuracies with that of the primary ML model.


Extending flood forecasting lead time in large watershed by coupling WRF QPF with distributed hydrological model

10.5194/hess-2016-512 ◽

2016 ◽

Author(s):

Ji Li ◽

Yangbo Chen ◽

Huanyu Wang ◽

Jianming Qin ◽

Jie Li

Keyword(s):

Lead Time ◽

Hydrological Model ◽

Southern China ◽

Weather Forecast ◽

Forecast Model ◽

Flood Forecasting ◽

Rain Gauge ◽

Model Parameters ◽

Distributed Hydrological Model ◽

Flood Warning

Abstract. Long lead time flood forecasting is very important for large watershed flood mitigation as it provides more time for flood warning and emergency responses. The latest numerical weather forecast models can provide 1–15-day quantitative precipitation forecast (QPF) products in grid format; coupling these products with a distributed hydrological model can produce long lead time watershed flood forecasting products. This paper studies the feasibility of coupling the Liuxihe Model with the WRF QPF for flood forecasting in a large watershed in southern China. The WRF QPF products have three lead times (24, 48 and 72 hours) and a grid resolution of 20 km × 20 km. The Liuxihe Model is set up with freely downloaded terrain property data; the model parameters were previously optimized with rain gauge observed precipitation and were re-optimized with the WRF QPF. Results show that the WRF QPF is biased relative to the rain gauge precipitation, and a post-processing method is proposed for the WRF QPF products, which improves the flood forecasting capability. Model parameter re-optimization also improves the model's performance, which suggests that the model parameters should be optimized with the QPF rather than with the rain gauge precipitation. As lead time increases, the accuracy of the WRF QPF decreases, and so does the flood forecasting capability. Flood forecasting products produced by coupling the Liuxihe Model with the WRF QPF provide a good reference for large watershed flood warning because of their long lead time and reasonable results.


Regionalization of hydrological model parameters using gradient boosting machine

10.5194/hess-2021-382 ◽

2021 ◽

Author(s):

Zhihong Song ◽

Jun Xia ◽

Gangsheng Wang ◽

Dunxian She ◽

Chen Hu ◽

...

Keyword(s):

Machine Learning ◽

Hydrological Model ◽

Spatial Coherence ◽

Gradient Boosting ◽

Model Parameters ◽

Explanatory Variables ◽

Terrain Attributes ◽

Gradient Boosting Machine ◽

Stepwise MLR ◽

Predictions In Ungauged Basins

Abstract. Regionalization of hydrological model parameters is key to hydrological predictions in ungauged basins. The commonly used multiple linear regression (MLR) method may not be applicable in complex and nonlinear relationships between model parameters and watershed properties. Moreover, most regionalization methods assume lumped parameters for each catchment without considering within-catchment heterogeneity. Here we incorporated the Penman-Monteith-Leuning (PML) equation into the Distributed Time-Variant Gain Model (DTVGM) to improve the mechanistic representation of the evapotranspiration process. We calibrated six key model parameters grid-by-grid across China using a multivariable calibration strategy, which incorporates spatiotemporal runoff and evapotranspiration (ET) datasets (0.25°, monthly) as reference. In addition, we used the gradient boosting machine (GBM), a machine learning technique, to portray the dependence of model parameters on soil and terrain attributes in four distinct climatic zones across China. We show that the modified DTVGM could reasonably estimate the runoff and ET over China using the calibrated parameters, but performed better in humid than arid regions for the validation period. The regionalized parameters by the GBM method exhibited better spatial coherence relative to the calibrated grid-by-grid parameters. In addition, GBM outperformed the stepwise MLR method in both parameter regionalization and gridded runoff simulations at national scale, though the improvement is not significant pertaining to watershed streamflow validation due to most of the watersheds being located in humid regions. We also revealed that the slope, saturated soil moisture content, and elevation are the most important explanatory variables to inform model parameters based on the GBM approach. 
The machine-learning-based regionalization approach provides an effective alternative to deriving hydrological model parameters by using watershed properties in ungauged regions.
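As a rough illustration of how a gradient boosting machine can relate a model parameter to watershed attributes, the sketch below implements gradient boosting for squared-error loss from scratch, using one-split regression stumps as weak learners. It is not the implementation used in the study: the stump learner, the learning rate, the number of rounds, and the toy attributes (a "slope" and an "elevation" column with a step-shaped target) are all assumptions made for illustration.

```python
def fit_stump(xs, residuals):
    """Find the single feature/threshold split that minimises squared error."""
    best = None
    for j in range(len(xs[0])):
        # Skip the largest value so the right-hand side is never empty.
        for t in sorted({x[j] for x in xs})[:-1]:
            left = [r for x, r in zip(xs, residuals) if x[j] <= t]
            right = [r for x, r in zip(xs, residuals) if x[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        raise ValueError("no valid split: every feature is constant")
    return best[1:]

def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting loop: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        j, t, lm, rm = fit_stump(xs, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + learning_rate * (lm if x[j] <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate

def predict_gbm(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, lr = model
    return base + sum(lr * (lm if x[j] <= t else rm) for j, t, lm, rm in stumps)

# Toy data (hypothetical): the target parameter steps up with the first attribute.
xs = [(i / 10, (i * 7 % 10) / 10) for i in range(10)]
ys = [1.0 if x[0] > 0.45 else 0.0 for x in xs]
model = fit_gbm(xs, ys)
```

Because each round fits the residuals left by the previous rounds, the ensemble can capture thresholds and nonlinear dependencies that a multiple linear regression would average out, which is the motivation the abstract gives for preferring GBM over stepwise MLR.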
