Climate-driven Model Based on Long Short-Term Memory and Bayesian Optimization for Multi-day-ahead Daily Streamflow Forecasting

Author(s):  
Yani Lian ◽  
Jungang Luo ◽  
Jingmin Wang ◽  
Ganggang Zuo

Abstract Many previous studies have developed decomposition and ensemble models to improve runoff forecasting performance. However, these decomposition-based models usually introduce large decomposition errors into the modeling process. The variation in runoff time series is greatly driven by climate change, yet many previous studies considering climate change focused only on rainfall-runoff modeling, with few meteorological factors as input. Therefore, a climate-driven streamflow forecasting (CDSF) framework was proposed to improve the runoff forecasting accuracy. This framework is realized using principal component analysis (PCA), long short-term memory (LSTM) and Bayesian optimization (BO), and is referred to as PCA-LSTM-BO. To validate the effectiveness and superiority of the PCA-LSTM-BO method, it was compared with one autoregressive LSTM model and two other CDSF models based on PCA, BO, and either support vector regression (SVR) or gradient boosting regression trees (GBRT), namely, PCA-SVR-BO and PCA-GBRT-BO, respectively. A generalization performance index based on the Nash-Sutcliffe efficiency (NSE), called the GI(NSE) value, is proposed to evaluate the generalizability of the model. The results show that (1) the proposed model is significantly better than the other benchmark models in terms of the mean square error (MSE <= 185.782), NSE (>= 0.819), and GI(NSE) (<= 0.223) for all the forecasting scenarios; (2) the PCA in the CDSF framework can improve the forecasting capacity and generalizability; (3) the CDSF framework is superior to the autoregressive LSTM models for all the forecasting scenarios; and (4) the GI(NSE) value is demonstrated to be effective in selecting the optimal model with better generalizability.
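The two standard metrics named in this abstract, MSE and NSE, can be sketched as follows. This is a minimal illustration using the textbook definitions, not the authors' code. The function names and sample values are hypothetical. The GI(NSE) index is not reproduced here because the abstract does not give its formula.

```python
def mse(observed, simulated):
    """Mean square error between two equal-length sequences."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squares around the observed mean)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sse / sst


# Hypothetical daily streamflow values, for illustration only.
observed = [10.0, 12.0, 15.0, 11.0, 9.0]
simulated = [11.0, 12.0, 14.0, 10.0, 9.0]
print(mse(observed, simulated), nse(observed, simulated))
```

An NSE of 1 means a perfect match, and an NSE of 0 means the forecast is no better than always predicting the observed mean.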

2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Xiaomei Sun ◽  
Haiou Zhang ◽  
Jian Wang ◽  
Chendi Shi ◽  
Dongwen Hua ◽  
...  

Abstract Reliable and accurate streamflow forecasting plays a vital role in the optimal management of water resources. To improve the stability and accuracy of streamflow forecasting, a hybrid decomposition-ensemble model named VMD-LSTM-GBRT, which is sensitive to sampling, noise and long historical changes of streamflow, was established. The variational mode decomposition (VMD) algorithm was first applied to extract features, which were then learned by several long short-term memory (LSTM) networks. Simultaneously, an ensemble tree, a gradient boosting tree for regression (GBRT), was trained to model the relationships between the extracted features and the original streamflow. The outputs of these LSTMs were finally reconstructed by the GBRT model to obtain the streamflow forecasting results. A historical daily streamflow series (from 1/1/1997 to 31/12/2014) for Yangxian station, Han River, China, was investigated by the proposed model. VMD-LSTM-GBRT was compared with peer models with respect to three aspects: (1) feature extraction algorithm, for which ensemble empirical mode decomposition (EEMD) was used; (2) feature learning techniques, for which deep neural networks (DNNs) and support vector machines for regression (SVRs) were exploited; and (3) ensemble strategy, for which the summation strategy was used. The results indicate that the VMD-LSTM-GBRT model outperforms all other peer models in terms of the root mean square error (RMSE = 36.3692), determination coefficient (R2 = 0.9890), mean absolute error (MAE = 9.5246) and peak percentage threshold statistics (PPTS(5) = 0.0391%). The proposed approach, based on the memory of long historical changes with deep feature representations, had good stability and high prediction precision.


Water ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 437
Author(s):  
Heechan Han ◽  
Changhyun Choi ◽  
Jaewon Jung ◽  
Hung Soo Kim

Accurate runoff prediction is an important task in various fields such as agriculture, hydrology, and environmental studies. With massive improvements in computational systems and hardware, deep learning-based approaches have recently been applied for more accurate runoff prediction. In this study, the long short-term memory model with a sequence-to-sequence structure was applied for hourly runoff predictions from 2015 to 2019 in the Russian River basin, California, USA. The proposed model was used to predict hourly runoff with a lead time of 1–6 h using runoff data observed at upstream stations. The model was evaluated in terms of event-based performance using statistical metrics including root mean square error, Nash-Sutcliffe Efficiency, peak runoff error, and peak time error. The results show that the proposed model outperforms support vector machine and conventional long short-term memory models. In addition, the model has the best predictive ability for runoff events, which means that it can be effective for developing short-term flood forecasting and warning systems. The results of this study demonstrate that the deep learning-based approach for hourly runoff forecasting has high predictive power and that the sequence-to-sequence structure is an effective method to improve the prediction results.
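A multi-step forecasting setup like the one described, with lead times of 1–6 h, is commonly prepared by slicing the series into paired input and target windows. The sketch below illustrates that idea only. It simplifies to a single series, whereas the study uses runoff from upstream stations. The function name, the input length of 4 hours, and the sample values are assumptions, not details from the paper.

```python
def make_windows(series, n_in, n_out):
    """Split a series into (input window, target window) pairs.

    Each input holds n_in consecutive values; each target holds the
    next n_out values, i.e. lead times 1..n_out.
    """
    if n_in < 1 or n_out < 1:
        raise ValueError("n_in and n_out must be positive")
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        pairs.append((x, y))
    return pairs


# Hypothetical hourly runoff values, for illustration only.
hourly_runoff = [float(v) for v in range(1, 13)]
pairs = make_windows(hourly_runoff, n_in=4, n_out=6)
for x, y in pairs:
    print(x, "->", y)
```

Each pair could then serve as one training sample for a sequence-to-sequence model, with the encoder reading the input window and the decoder emitting the six lead-time values.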


2021 ◽  
Vol 35 (4) ◽  
pp. 1167-1181
Author(s):  
Yun Bai ◽  
Nejc Bezak ◽  
Bo Zeng ◽  
Chuan Li ◽  
Klaudija Sapač ◽  
...  

2021 ◽  
pp. 016555152110065
Author(s):  
Rahma Alahmary ◽  
Hmood Al-Dossari

Sentiment analysis (SA) aims to extract users’ opinions automatically from their posts and comments. Almost all prior works have used machine learning algorithms. Recently, SA research has shown promising performance using the deep learning approach. However, deep learning is data-hungry and requires large datasets to learn, so more time is needed for data annotation. In this research, we proposed a semiautomatic approach using Naïve Bayes (NB) to annotate a new dataset in order to reduce the human effort and time spent on the annotation process. We created a dataset for the purpose of training and testing the classifier by collecting Saudi dialect tweets. The accuracy achieved by the NB classifier was 83%. The trained semiautomatic model was used to annotate the new dataset, which was then used to train and test deep learning classifiers to perform Saudi dialect SA. The three deep learning classifiers tested in this research were convolutional neural network (CNN), long short-term memory (LSTM) and bidirectional long short-term memory (Bi-LSTM). Support vector machine (SVM) was used as the baseline for comparison. Overall, the performance of the deep learning classifiers exceeded that of SVM. CNN achieved the highest performance, Bi-LSTM outperformed both LSTM and SVM, and LSTM outperformed SVM. The proposed semiautomatic annotation approach is usable and promising for increasing speed and saving time and effort in the annotation process.
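The general idea of classifier-assisted annotation can be sketched with a small multinomial Naïve Bayes model that labels new texts and defers uncertain ones to a human. This is an illustration under stated assumptions, not the authors' pipeline. The confidence threshold, the human-review step, the function names, and the English toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    class_counts = Counter(label for _, label in labeled)
    word_counts = defaultdict(Counter)
    for text, label in labeled:
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict_proba(model, text):
    """Return posterior class probabilities for one text (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


def auto_annotate(model, texts, threshold=0.8):
    """Label texts the model is confident about; defer the rest (assumed step)."""
    accepted, needs_review = [], []
    for text in texts:
        probs = predict_proba(model, text)
        label = max(probs, key=probs.get)
        if probs[label] >= threshold:
            accepted.append((text, label))
        else:
            needs_review.append(text)
    return accepted, needs_review


# Hypothetical seed set and unlabeled texts, for illustration only.
seed = [
    ("great service love it", "positive"),
    ("love this great phone", "positive"),
    ("terrible service hate it", "negative"),
    ("hate this terrible phone", "negative"),
]
model = train_nb(seed)
accepted, needs_review = auto_annotate(
    model, ["love great", "this phone", "hate terrible"]
)
print(accepted, needs_review)
```

The accepted pairs would form the automatically annotated dataset passed to the downstream classifiers, while the deferred texts are where human effort would be concentrated.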


Author(s):  
Ralph Sherwin A. Corpuz ◽  

Analyzing natural language-based Customer Satisfaction (CS) is a tedious process. This is particularly true if one is to manually categorize large datasets. Fortunately, the advent of supervised machine learning techniques has paved the way toward the design of efficient categorization systems used for CS. This paper presents the feasibility of designing a text categorization model using two popular and robust algorithms, the Support Vector Machine (SVM) and the Long Short-Term Memory (LSTM) Neural Network, in order to automatically categorize complaints, suggestions, feedback, and commendations. The study found that, in terms of training accuracy, SVM has a best rating of 98.63% while LSTM has a best rating of 99.32%. Such results mean that both SVM and LSTM algorithms are on par with each other in terms of training accuracy, but SVM is significantly faster than LSTM, by approximately 35.47 s. The training performance results of both algorithms are attributed to the limitations of the dataset size, the high dimensionality of both the English and Tagalog languages, and the applicability of the feature engineering techniques used. Interestingly, based on the results of actual implementation, both algorithms are found to be 100% effective in accurately predicting the correct CS categories. Hence, the extent of preference between the two algorithms boils down to the available dataset and the skill in optimizing these algorithms through feature engineering techniques and in implementing them toward actual text categorization applications.

