An improved framework to predict river flow time series data

PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7183 ◽  
Author(s):  
Hafiza Mamona Nazir ◽  
Ijaz Hussain ◽  
Ishfaq Ahmad ◽  
Muhammad Faisal ◽  
Ibrahim M. Almanjahie

Due to the non-stationary and noisy characteristics of river flow time series data, pre-processing methods are often adopted to address multi-scale and noise complexity. In this paper, we propose an improved framework comprising Complete Ensemble Empirical Mode Decomposition with Adaptive Noise-Empirical Bayesian Threshold (CEEMDAN-EBT). CEEMDAN-EBT is employed to decompose non-stationary river flow time series data into Intrinsic Mode Functions (IMFs). The derived IMFs are divided into two parts: noise-dominant IMFs and noise-free IMFs. Firstly, the noise-dominant IMFs are denoised using the empirical Bayesian threshold to integrate the noise and sparsity of the IMFs. Secondly, the denoised IMFs and noise-free IMFs are used as inputs to data-driven and simple stochastic models, respectively, to predict the river flow time series data. Finally, the predicted IMFs are aggregated to obtain the final prediction. The proposed framework is illustrated using four rivers of the Indus Basin System. Prediction performance is compared using Mean Square Error (MSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). Our proposed method, CEEMDAN-EBT-MM, produced the smallest MAPE for all four case studies compared with the other methods. This suggests that the proposed hybrid model can be used as an efficient tool for providing reliable predictions of non-stationary and noisy time series data to policymakers, for example for planning power generation and water resource management.
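The three error measures named in this abstract have standard textbook definitions, sketched below in plain Python. The function names and the toy flow values are illustrative assumptions and are not taken from the paper; MAPE is reported here as a percentage and is undefined when an observed value is zero.

```python
def mse(actual, predicted):
    """Mean Square Error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes no observed value is zero (division by the observation).
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Made-up observed and predicted flows, for illustration only.
    observed = [100.0, 200.0, 400.0]
    forecast = [110.0, 180.0, 400.0]
    print(mse(observed, forecast), mae(observed, forecast), mape(observed, forecast))
```

Because MAPE scales each error by the observation, it can be compared across rivers with very different flow magnitudes, which MSE and MAE cannot.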

2017 ◽  
Vol 49 (3) ◽  
pp. 711-723 ◽  
Author(s):  
Xiaorong Lu ◽  
Xuelei Wang ◽  
Liang Zhang ◽  
Ting Zhang ◽  
Chao Yang ◽  
...  

Abstract Due to the effects of anthropogenic activities and natural climate change, streamflows of rivers have gradually decreased. In order to maintain reliable water supplies, reservoir operation and water resource management, accurate streamflow forecasts are very important. Based on monthly flow data from five hydrological stations in the middle and lower parts of the Hanjiang River Basin, between 1989 and 2009, we consider an efficient approach of adopting the gene expression programming model based on wavelet decomposition and de-noising (WDDGEP) to forecast river flow. Original flow time series data are initially decomposed into one sub-signal approximation and seven sub-signal details using the dmey wavelet. A wavelet threshold de-noising method is also applied in this study. Data that have been de-noised after decomposition are then adopted as inputs for WDDGEP models. Finally, the forecasted sub-signal results are summed to formulate an ensemble forecast for the original monthly flow series. A comparison of the prediction accuracy between the two models is based on three performance evaluation measures. Results show that the new WDDGEP models can effectively enhance accuracy in forecasting streamflow, and the proposed wavelet-based de-noising of the observed non-stationary time series is an effective measure to improve simulation accuracy.
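The decompose, threshold, and reconstruct idea behind wavelet de-noising can be sketched in a few lines. This is a deliberately simplified illustration: it uses a single-level Haar transform with soft thresholding, not the dmey wavelet or the seven detail levels described in the abstract, and the threshold value is a free parameter chosen by the caller. All function names are hypothetical.

```python
import math


def haar_decompose(signal):
    """One-level Haar transform of an even-length sequence.

    Returns (approximation, detail) coefficient lists.
    """
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def haar_reconstruct(approx, detail):
    """Invert haar_decompose."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def denoise(signal, threshold):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_decompose(signal)
    return haar_reconstruct(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the signal is returned unchanged; with a very large threshold each pair of neighbouring samples collapses to its mean, which shows how the threshold trades detail for smoothness.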


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Ari Wibisono ◽  
Petrus Mursanto ◽  
Jihan Adibah ◽  
Wendy D. W. T. Bayu ◽  
May Iffah Rizki ◽  
...  

Abstract Real-time information mining of a big dataset consisting of time series data is a very challenging task. For this purpose, we propose using the mean distance and the standard deviation to enhance the accuracy of the existing fast incremental model tree with the drift detection (FIMT-DD) algorithm. The standard FIMT-DD algorithm uses the Hoeffding bound as its splitting criterion. We propose the further use of the mean distance and standard deviation, which are used to split a tree more accurately than the standard method. We verify our proposed method using the large Traffic Demand Dataset, which consists of 4,000,000 instances; Tennet’s big wind power plant dataset, which consists of 435,268 instances; and a road weather dataset, which consists of 30,000,000 instances. The results show that our proposed FIMT-DD algorithm improves the accuracy compared to the standard method and Chernoff bound approach. The measured errors demonstrate that our approach results in a lower Mean Absolute Percentage Error (MAPE) in every stage of learning by approximately 2.49% compared with the Chernoff Bound method and 19.65% compared with the standard method.
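For readers unfamiliar with the Hoeffding bound mentioned above, the sketch below computes the usual form, epsilon = sqrt(R^2 ln(1/delta) / (2n)), for a quantity with range R observed n times at confidence 1 - delta. The split rule shown (compare the ratio of the second-best to the best split merit against 1 - epsilon) is an assumption about how incremental tree learners commonly apply the bound; the abstract does not spell it out, and the mean-distance and standard-deviation criteria proposed by the authors are not reproduced here.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound epsilon for n observations of a variable with the
    given range, holding with probability 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_merit, second_merit, delta, n, value_range=1.0):
    """Hypothetical split test: split once the best candidate is
    reliably better than the runner-up.

    The ratio second/best lies in [0, 1], hence the default range of 1.
    """
    if best_merit <= 0:
        return False
    ratio = second_merit / best_merit
    return ratio < 1.0 - hoeffding_bound(value_range, delta, n)
```

The bound shrinks as more instances arrive, so two close candidates that cannot be separated after 200 instances may be separable after several thousand.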


2011 ◽  
Vol 50 (4II) ◽  
pp. 715-732 ◽  
Author(s):  
Naseeb Zada ◽  
Malik Muhammad ◽  
Khan Bahadar

Given the importance of international trade and export performance in economic growth, this study examines the determinants of Pakistan's exports using time series data over the period 1975-2008. A simultaneous equation approach is followed, and the demand and supply side equations are specified with appropriate variables. This is a country-wise disaggregated analysis of Pakistan versus its trade partners, and the estimation strategy is based on two approaches. First we employ the Generalised Method of Moments (GMM), which is followed by the Empirical Bayesian technique to get consistent estimates. The GMM technique is believed to be efficient for time series data provided the sample size is sufficiently large. In small samples, the estimates might not be precise and might appear with implausible signs and insignificant magnitudes. To avoid the sample bias and other problems, we employ the Empirical Bayesian technique, which provides much more precise estimates. The results obtained via the GMM technique are somewhat mixed, although most of the coefficients are statistically significant and carry their expected signs. In order to compare and validate these results, the Empirical Bayesian technique is employed. This offers considerable improvement over the previous results: all the variables are found to be highly significant with the correct sign across the countries concerned, with the exception of a few cases. The price and income elasticities in both the demand and supply side equations carry their expected signs and significant magnitudes for the trading partners. The findings suggest that exports of Pakistan are highly sensitive to changes in world demand and world prices. This establishes the importance of demand side factors like world GDP, the real exchange rate, and world prices in determining the exports of Pakistan. On the supply side, we find relatively small price and income elasticities. The results reveal that demand for exports is relatively higher for countries in the NAFTA, European Union and Middle East regions. The study recommends particular concentration on the trade partners in these regions to improve the export performance of Pakistan.
Keywords: Exports, GMM, Empirical Bayesian Method, Pakistan


Agriculture ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. 612
Author(s):  
Helin Yin ◽  
Dong Jin ◽  
Yeong Hyeon Gu ◽  
Chang Jin Park ◽  
Sang Keun Han ◽  
...  

It is difficult to forecast vegetable prices because they are affected by numerous factors, such as weather and crop production, and the time-series data have strong non-linear and non-stationary characteristics. To address these issues, we propose the STL-ATTLSTM (STL-Attention-based LSTM) model, which integrates the seasonal trend decomposition using the Loess (STL) preprocessing method and attention mechanism based on long short-term memory (LSTM). The proposed STL-ATTLSTM forecasts monthly vegetable prices using various types of information, such as vegetable prices, weather information of the main production areas, and market trading volumes. The STL method decomposes time-series vegetable price data into trend, seasonality, and remainder components. It uses the remainder component by removing the trend and seasonality components. In the model training process, attention weights are assigned to all input variables; thus, the model’s prediction performance is improved by focusing on the variables that affect the prediction results. The proposed STL-ATTLSTM was applied to five crops, namely cabbage, radish, onion, hot pepper, and garlic, and its performance was compared to three benchmark models (i.e., LSTM, attention LSTM, and STL-LSTM). The performance results show that the LSTM model combined with the STL method (STL-LSTM) achieved a 12% higher prediction accuracy than the attention LSTM model that did not use the STL method and solved the prediction lag arising from high seasonality. The attention LSTM model improved the prediction accuracy by approximately 4% to 5% compared to the LSTM model. The STL-ATTLSTM model achieved the best performance, with an average root mean square error (RMSE) of 380, and an average mean absolute percentage error (MAPE) of 7%.
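The split into trend, seasonality, and remainder can be illustrated with a classical additive decomposition. Note the substitution: this sketch uses a centered moving average and per-phase means, not the Loess smoothing that STL relies on, so it only shows what the three components represent. Function names are hypothetical, and the series is assumed to span at least two full periods.

```python
def centered_moving_average(x, period):
    """Trend estimate; None where the window does not fit."""
    n, half = len(x), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = x[t - half:t + half + 1]
        if period % 2 == 1:
            trend[t] = sum(window) / period
        else:
            # Even period: half weight on the two end points (2 x period MA).
            inner = sum(window[1:-1])
            trend[t] = (0.5 * window[0] + inner + 0.5 * window[-1]) / period
    return trend


def decompose_additive(x, period):
    """Return (trend, seasonal, remainder) with x = trend + seasonal + remainder."""
    trend = centered_moving_average(x, period)
    buckets = [[] for _ in range(period)]
    for t, (value, tr) in enumerate(zip(x, trend)):
        if tr is not None:
            buckets[t % period].append(value - tr)
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period  # center the seasonal pattern on zero
    pattern = [r - offset for r in raw]
    seasonal = [pattern[t % period] for t in range(len(x))]
    remainder = [
        None if tr is None else value - tr - s
        for value, tr, s in zip(x, trend, seasonal)
    ]
    return trend, seasonal, remainder
```

For monthly price data the period would be 12; the remainder series is what is left after the trend and the repeating seasonal pattern are removed.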


2020 ◽  
Vol 9 (3) ◽  
pp. 306-315
Author(s):  
Febyani Rachim ◽  
Tarno Tarno ◽  
Sugito Sugito

Imports are one of the ways a region meets the needs of its population in order to stabilize prices and maintain stock availability. The value of imports in Central Java throughout 2016 amounted to 8811.05 million US dollars. Central Java is among the top 10 provinces in Indonesia by import value, with a share of 6.50%. Import data in Central Java are time series data. To maintain the stability of imports in Central Java, it is deemed necessary to make a plan based on a statistical model. One time series model that can be applied is the fuzzy time series model with the Chen method and the S. R. Singh method, because these methods are suitable for cyclically patterned data with monthly periods, such as the import data for Central Java. Important concepts in building the model are fuzzy sets, membership functions, basic set operators, fuzzy variables, universe sets and domains. The fuzzy time series modeling procedure is carried out in several stages: the universe of discourse is determined and divided into several intervals, and the fuzzy sets are then defined so that fuzzification can be performed. After that, the fuzzy logical relations and fuzzy logical relation groups are determined. Accuracy for both methods is calculated using the symmetric Mean Absolute Percentage Error (sMAPE). In this study, the sMAPE value obtained with the Chen fuzzy time series method is 10.95%, which indicates good forecasting ability, while the sMAPE value obtained with the S. R. Singh fuzzy time series method is 5.50%, which indicates very good forecasting ability. It can be concluded that the S. R. Singh fuzzy time series method achieves a better sMAPE value than the Chen method.
Keywords: Import value, fuzzy time series, Chen, S. R. Singh, sMAPE
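The stages listed in this abstract (partition the universe of discourse, fuzzify, build fuzzy logical relation groups, forecast) can be sketched as a simplified first-order Chen-style forecaster. This is an illustration under assumptions, not the authors' implementation: each observation is assigned crisply to the interval it falls in rather than through full membership functions, the interval count and bounds are supplied by the caller, and the S. R. Singh method is not shown. The function name is hypothetical.

```python
def chen_forecast_next(series, lower, upper, n_intervals):
    """One-step-ahead forecast from a simplified Chen-style fuzzy time series.

    The universe [lower, upper] is split into equal intervals; the forecast
    is the mean midpoint of the intervals that have followed the current one.
    """
    width = (upper - lower) / n_intervals
    midpoints = [lower + (i + 0.5) * width for i in range(n_intervals)]

    def fuzzify(value):
        # Index of the interval containing the value, clipped to the universe.
        index = int((value - lower) / width)
        return min(max(index, 0), n_intervals - 1)

    states = [fuzzify(v) for v in series]

    # Fuzzy logical relation groups: state -> set of states observed next.
    groups = {}
    for current, following in zip(states, states[1:]):
        groups.setdefault(current, set()).add(following)

    last = states[-1]
    # If the current state has never been followed by anything, fall back
    # to its own midpoint.
    consequents = groups.get(last, {last})
    return sum(midpoints[i] for i in consequents) / len(consequents)
```

For example, with three intervals over [0, 30] and the series 10, 20, 10, 20, 10, the last state has only ever been followed by the top interval, so the forecast is that interval's midpoint.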


2017 ◽  
Vol 145 (6) ◽  
pp. 1118-1129 ◽  
Author(s):  
K. W. WANG ◽  
C. DENG ◽  
J. P. LI ◽  
Y. Y. ZHANG ◽  
X. Y. LI ◽  
...  

SUMMARY Tuberculosis (TB) affects people globally and is being reconsidered as a serious public health problem in China. Reliable forecasting is useful for the prevention and control of TB. This study proposes a hybrid model combining autoregressive integrated moving average (ARIMA) with a nonlinear autoregressive (NAR) neural network for forecasting the incidence of TB from January 2007 to March 2016. Prediction performance was compared between the hybrid model and the ARIMA model. The best-fit hybrid model combined an ARIMA (3,1,0) × (0,1,1)₁₂ model with a NAR neural network with four delays and 12 neurons in the hidden layer. The ARIMA-NAR hybrid model, which exhibited lower mean square error, mean absolute error, and mean absolute percentage error of 0·2209, 0·1373, and 0·0406, respectively, in the modelling performance, could produce more accurate forecasting of TB incidence compared to the ARIMA model. This study shows that developing and applying the ARIMA-NAR hybrid model is an effective method to fit the linear and nonlinear patterns of time-series data, and this model could be helpful in the prevention and control of TB.
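In the seasonal ARIMA specification quoted above, the middle entries d = 1 and D = 1 with seasonal period 12 mean the series is differenced once at lag 1 and once at lag 12 before the autoregressive and moving-average terms are fitted. The sketch below shows only that differencing step; it does not fit the ARIMA or NAR models, and the function names are hypothetical.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def seasonal_and_regular_difference(series, season=12):
    """Apply one seasonal difference (lag `season`) and one regular
    difference (lag 1), i.e. D = 1 and d = 1."""
    return difference(difference(series, lag=season), lag=1)
```

A monthly series made of a straight-line trend plus a pattern that repeats every 12 months is reduced to zeros by these two differences, which is why they are used to make trending, seasonal incidence data stationary.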

