Streamflow forecasting based on the hybrid decomposition-ensemble model

2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Xiaomei Sun ◽  
Haiou Zhang ◽  
Jian Wang ◽  
Chendi Shi ◽  
Dongwen Hua ◽  
...  

Abstract Reliable and accurate streamflow forecasting plays a vital role in the optimal management of water resources. To improve the stability and accuracy of streamflow forecasting, a hybrid decomposition-ensemble model named VMD-LSTM-GBRT, which is sensitive to sampling, noise and long historical changes of streamflow, was established. The variational mode decomposition (VMD) algorithm was first applied to extract features, which were then learned by several long short-term memory (LSTM) networks. Simultaneously, an ensemble tree, a gradient boosting tree for regression (GBRT), was trained to model the relationships between the extracted features and the original streamflow. The outputs of these LSTMs were finally reconstructed by the GBRT model to obtain the streamflow forecasts. A historical daily streamflow series (from 1/1/1997 to 31/12/2014) for Yangxian station, Han River, China, was investigated using the proposed model. VMD-LSTM-GBRT was compared with peer models in three respects: (1) feature extraction algorithm, for which ensemble empirical mode decomposition (EEMD) was used; (2) feature learning technique, for which deep neural networks (DNNs) and support vector machines for regression (SVRs) were used; and (3) ensemble strategy, for which the summation strategy was used. The results indicate that the VMD-LSTM-GBRT model outperforms all other peer models in terms of the root mean square error (RMSE = 36.3692), determination coefficient (R² = 0.9890), mean absolute error (MAE = 9.5246) and peak percentage threshold statistics (PPTS(5) = 0.0391%). The proposed approach, based on the memory of long historical changes with deep feature representations, had good stability and high prediction precision.
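The decompose, forecast per component, then recombine pattern that the abstract compares against (the summation strategy) can be sketched roughly as follows. This is only an illustration under loud assumptions: a trailing moving average stands in for VMD, simple drift and mean rules stand in for the learned LSTM models, and all function names and sample values are hypothetical. The GBRT-based ensemble of the paper is not reproduced here.

```python
from statistics import fmean


def moving_average(series, window):
    """Trailing moving average; early points average whatever history exists."""
    return [fmean(series[max(0, i - window + 1):i + 1]) for i in range(len(series))]


def decompose(series, window=3):
    """Stand-in for VMD (assumption): a smooth part plus a residual that sum to the series."""
    smooth = moving_average(series, window)
    residual = [x - s for x, s in zip(series, smooth)]
    return smooth, residual


def forecast_smooth(component):
    """Stand-in for a learned model: extend the last step of the smooth part (drift)."""
    return component[-1] + (component[-1] - component[-2])


def forecast_residual(component):
    """Stand-in for a learned model: predict the mean of the residual part."""
    return fmean(component)


def summation_ensemble(series, window=3):
    """One-step-ahead forecast obtained by summing the per-component forecasts."""
    smooth, residual = decompose(series, window)
    return forecast_smooth(smooth) + forecast_residual(residual)


# Made-up daily flows, used only to exercise the sketch.
flows = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
next_flow = summation_ensemble(flows)
```

Replacing the final sum with a trained regressor over the component forecasts would be the step at which a learned ensemble such as GBRT enters.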


2021 ◽  
Author(s):  
Yani Lian ◽  
Jungang Luo ◽  
Jingmin Wang ◽  
Ganggang Zuo

Abstract Many previous studies have developed decomposition and ensemble models to improve runoff forecasting performance. However, these decomposition-based models usually introduce large decomposition errors into the modeling process. Although the variation in runoff time series is largely driven by climate change, many previous studies considering climate change focused only on rainfall-runoff modeling, with few meteorological factors as input. Therefore, a climate-driven streamflow forecasting (CDSF) framework was proposed to improve the runoff forecasting accuracy. This framework is realized using principal component analysis (PCA), long short-term memory (LSTM) and Bayesian optimization (BO), and the realization is referred to as PCA-LSTM-BO. To validate the effectiveness and superiority of the PCA-LSTM-BO method, it was compared with one autoregressive LSTM model and two other CDSF models based on PCA, BO, and either support vector regression (SVR) or gradient boosting regression trees (GBRT), namely PCA-SVR-BO and PCA-GBRT-BO, respectively. A generalization performance index based on the Nash-Sutcliffe efficiency (NSE), called the GI(NSE) value, is proposed to evaluate the generalizability of the model. The results show that (1) the proposed model is significantly better than the other benchmark models in terms of the mean square error (MSE ≤ 185.782), NSE (≥ 0.819), and GI(NSE) (≤ 0.223) for all the forecasting scenarios; (2) the PCA in the CDSF framework can improve the forecasting capacity and generalizability; (3) the CDSF framework is superior to the autoregressive LSTM models for all the forecasting scenarios; and (4) the GI(NSE) value is demonstrated to be effective in selecting the optimal model with better generalizability.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-21
Author(s):  
Hui Hu ◽  
Jianfeng Zhang ◽  
Tao Li

Data-driven methods are very useful for streamflow forecasting when the underlying physical relationships are not entirely clear. However, obtaining a data-driven model that is sufficiently accurate for streamflow forecasting often remains challenging. This study proposes a new data-driven model that combines variational mode decomposition (VMD) with prediction models for daily streamflow forecasting. The prediction models include the autoregressive moving average (ARMA), the gradient boosting regression tree (GBRT), the support vector regression (SVR), and the backpropagation neural network (BPNN). The latest decomposition model, the VMD algorithm, was first applied to extract the multiscale features from the entire time series and to decompose them into several subseries, which were then predicted using the forecast models. The ensemble forecast was finally reconstructed by summing. Historical daily streamflow series recorded at the Wushan and Weijiabao hydrologic stations in China from 1 January 2001 to 31 December 2014 were investigated using the proposed VMD-based models. Three quantitative evaluation indexes, namely the Nash–Sutcliffe efficiency coefficient (NSE), the root mean square error (RMSE), and the mean absolute error (MAE), were used to evaluate and compare the predicted results of the proposed VMD-based models with those of two other models: a nondecomposition method (BPNN) and a BPNN based on ensemble empirical mode decomposition (EEMD-BPNN). Furthermore, a comparative analysis of the performance of the VMD-BPNN model under different forecast periods (1, 3, 5, and 7 days) was performed. The results showed that the proposed VMD-based models consistently achieved good performance in the testing stage and had relatively good stability and representativeness. Specifically, the VMD-BPNN model considered both prediction accuracy and computational efficiency. 
The results show that the reliability of the forecasting decreased as the forecast period increased. The model performed satisfactorily up to a 7-day lead time. The VMD-BPNN model could be applied as a promising, reliable, and robust prediction tool for short-term streamflow forecasting.
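The three evaluation indexes named in the abstract have standard textbook definitions, which the minimal sketch below writes out in plain Python. The function names and the sample values are illustrative assumptions and are not taken from the study.

```python
from math import sqrt
from statistics import fmean


def rmse(observed, predicted):
    """Root mean square error: square root of the mean squared residual."""
    return sqrt(fmean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def mae(observed, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    return fmean([abs(o - p) for o, p in zip(observed, predicted)])


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared-error sum
    to the squared deviation of the observations from their mean."""
    mean_obs = fmean(observed)
    error_sum = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error_sum / spread_sum


# Made-up flows, used only to exercise the functions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
scores = {"RMSE": rmse(observed, predicted),
          "MAE": mae(observed, predicted),
          "NSE": nse(observed, predicted)}
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the forecast is no better than always predicting the observed mean.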


2019 ◽  
Author(s):  
Ganggang Zuo ◽  
Jungang Luo ◽  
Ni Wang ◽  
Yani Lian ◽  
Xinxin He

Abstract. Streamflow forecasting is a crucial component in the management and control of water resources. Decomposition-based approaches in particular have demonstrated improved streamflow forecasting performance. However, it is not practical to first decompose the entire streamflow series into several signal components and then divide the data samples of each component into training and validation sets for signal component prediction, because some validation information that is not available in practical streamflow forecasting is then used in the training process. Unfortunately, first dividing the entire streamflow series into training and validation sets and then decomposing each set separately leads to undesirable boundary effects and complicates forecasting. Moreover, establishing a model for each signal component is quite laborious, and summing the component predictions may lead to error accumulation. In addition, summing the decomposition results may sometimes lead to inaccurate reconstruction of the original streamflow. To address these shortcomings of decomposition-based models and improve the forecasting performance in basins lacking meteorological observations (e.g., precipitation and temperature), we propose a two-stage decomposition prediction (TSDP) framework, realize this framework using variational mode decomposition (VMD) and support vector regression (SVR), and refer to this realization as VMD-SVR. In the first stage of the TSDP framework, the entire streamflow series was divided into training and validation sets, each of which was then decomposed separately to avoid the influence of validation information on training. In the second stage, a single model for streamflow prediction was established using a set of mixed shuffled samples. This scheme saves modelling time and reduces the influence of the boundary effects. 
We demonstrate experimentally the effectiveness, efficiency and reliability of the TSDP framework and its VMD-SVR realization in terms of the boundary effect reduction, decomposition performance, prediction outcomes, time consumption, overfitting, and forecasting capability for long lead times. Specifically, five comparative experiments were conducted based on the ensemble empirical mode decomposition (EEMD), singular spectrum analysis (SSA), discrete wavelet transform (DWT) and SVR. The experimental results on monthly runoff collected from three stations on the Wei River show the superiority of the TSDP framework over the benchmark models.
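The trade-off described in this abstract, validation information leaking into training versus boundary effects, can be illustrated with a toy example. The sketch assumes a centered moving average as a stand-in for VMD and uses made-up flow values; it is not the TSDP framework itself, only a demonstration of why the order of splitting and decomposing matters.

```python
from statistics import fmean


def centered_smooth(series, half_width=1):
    """Centered moving average (stand-in for a decomposition, assumption).
    Near the ends the window is truncated, which is a simple boundary effect."""
    n = len(series)
    return [fmean(series[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


# Made-up flows: the last two values (validation period) contain a flood peak.
flow = [5.0, 6.0, 7.0, 8.0, 20.0, 22.0]
split = 4
train = flow[:split]

# (a) Decompose the whole series, then split: the last training value of the
#     smooth component already "sees" the first validation value.
decompose_then_split = centered_smooth(flow)[:split]

# (b) Split, then decompose the training part alone: no validation data is
#     used, but the last value is computed from a truncated window.
split_then_decompose = centered_smooth(train)

# Difference at the last training point caused purely by the ordering.
leak = decompose_then_split[-1] - split_then_decompose[-1]
```

In this toy case the interior training points agree under both orderings, and only the point adjacent to the split differs, which is where both the leakage and the boundary effect concentrate.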


2021 ◽  
Vol 11 ◽  
Author(s):  
R'mani Haulcy ◽  
James Glass

Alzheimer's Disease (AD) is a form of dementia that affects the memory, cognition, and motor skills of patients. Extensive research has been done to develop accessible, cost-effective, and non-invasive techniques for the automatic detection of AD. Previous research has shown that speech can be used to distinguish between healthy patients and afflicted patients. In this paper, the ADReSS dataset, a dataset balanced by gender and age, was used to automatically classify AD from spontaneous speech. The performance of five classifiers, as well as a convolutional neural network and long short-term memory network, was compared when trained on audio features (i-vectors and x-vectors) and text features (word vectors, BERT embeddings, LIWC features, and CLAN features). The same audio and text features were used to train five regression models to predict the Mini-Mental State Examination score for each patient, a score that has a maximum value of 30. The top-performing classification models were the support vector machine and random forest classifiers trained on BERT embeddings, which both achieved an accuracy of 85.4% on the test set. The best-performing regression model was the gradient boosting regression model trained on BERT embeddings and CLAN features, which had a root mean squared error of 4.56 on the test set. The performance on both tasks illustrates the feasibility of using speech to classify AD and predict neuropsychological scores.


2019 ◽  
Vol 22 (2) ◽  
pp. 310-326 ◽  
Author(s):  
Yujie Li ◽  
Zhongmin Liang ◽  
Yiming Hu ◽  
Binquan Li ◽  
Bin Xu ◽  
...  

Abstract In this study, we evaluate elastic net regression (ENR), support vector regression (SVR), random forest (RF) and eXtreme Gradient Boosting (XGB) models and propose a modified multi-model integration method named a modified stacking ensemble strategy (MSES) for monthly streamflow forecasting. We apply the above methods to the Three Gorges Reservoir in the Yangtze River Basin, and the results show the following: (1) RF and XGB present better and more stable forecast performance than ENR and SVR. It can be concluded that the machine learning-based models have the potential for monthly streamflow forecasting. (2) The MSES can effectively reconstruct the original training data in the first layer and optimize the XGB model in the second layer, improving the forecast performance. We believe that the MSES is a computing framework worthy of development, with simple mathematical structure and low computational cost. (3) The forecast performance mainly depends on the size and distribution characteristics of the monthly streamflow sequence, which is still difficult to predict using only climate indices.
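For readers unfamiliar with stacking, a generic two-layer version can be sketched as below. It is not the MSES proposed in the study: the base models (a mean predictor and a straight-line fit), the interleaved fold scheme, the least-squares second layer and every name in the code are assumptions chosen to keep the example self-contained.

```python
from statistics import fmean


def fit_mean(xs, ys):
    """Base model 1: always predict the training mean."""
    m = fmean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Base model 2: ordinary least-squares straight line."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def out_of_fold(fit, xs, ys, k):
    """Layer 1: predict each sample with a model that did not train on it."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        keep = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in keep], [ys[i] for i in keep])
        for i in range(fold, len(xs), k):
            preds[i] = model(xs[i])
    return preds


def fit_stack(xs, ys, k=3):
    """Layer 2: least-squares weights on the out-of-fold predictions,
    solved from the 2x2 normal equations."""
    base_fits = [fit_mean, fit_line]
    z0, z1 = (out_of_fold(fit, xs, ys, k) for fit in base_fits)
    a = sum(p * p for p in z0)
    b = sum(p * q for p, q in zip(z0, z1))
    d = sum(q * q for q in z1)
    e = sum(p * y for p, y in zip(z0, ys))
    g = sum(q * y for q, y in zip(z1, ys))
    det = a * d - b * b
    # Fall back to equal weights if the two prediction columns are collinear.
    w = ((e * d - b * g) / det, (a * g - b * e) / det) if abs(det) > 1e-12 else (0.5, 0.5)
    models = [fit(xs, ys) for fit in base_fits]  # refit base models on all data
    return lambda x: w[0] * models[0](x) + w[1] * models[1](x)


# Made-up, exactly linear data, used only to exercise the sketch.
xs = [float(i) for i in range(9)]
ys = [2.0 * x + 1.0 for x in xs]
stack = fit_stack(xs, ys)
prediction = stack(10.0)
```

Using out-of-fold predictions as second-layer inputs keeps the meta-model from simply rewarding whichever base model overfits the training data most.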

