On the use of cross-validation for time series predictor evaluation

2012 ◽  
Vol 191 ◽  
pp. 192-213 ◽  
Author(s):  
Christoph Bergmeir ◽  
José M. Benítez
Author(s):  
Abdulrahim Mohammed ◽  
Ahmed Khedr ◽  
Duaa AlHaj ◽  
Reem Al Khalifa ◽  
Abdulla Alqaddoumi

Water ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 34
Author(s):  
Sebastian C. Ibañez ◽  
Carlo Vincienzo G. Dajac ◽  
Marissa P. Liponhay ◽  
Erika Fille T. Legara ◽  
Jon Michael H. Esteban ◽  
...  

Forecasting reservoir water levels is essential in water supply management, impacting both operations and intervention strategies. This paper examines the short-term and long-term forecasting performance of several statistical and machine learning-based methods for predicting the water levels of the Angat Dam in the Philippines. A total of six forecasting methods are compared: naïve/persistence; seasonal mean; autoregressive integrated moving average (ARIMA); gradient boosting machines (GBM); and two deep neural networks (DNN) using a long short-term memory-based (LSTM) encoder-decoder architecture: a univariate model (DNN-U) and a multivariate model (DNN-M). Daily historical water levels from 2001 to 2021 are used in predicting future water levels. In addition, we include meteorological data (rainfall and the Oceanic Niño Index) and irrigation data as exogenous variables. To evaluate the forecast accuracy of our methods, we use a time series cross-validation approach to establish a more robust estimate of the error statistics. Our results show that our DNN-U model has the best accuracy in the 1-day-ahead scenario with a mean absolute error (MAE) and root mean square error (RMSE) of 0.2 m. In the 30-day-, 90-day-, and 180-day-ahead scenarios, the DNN-M shows the best performance with MAE (RMSE) scores of 2.9 (3.3), 5.1 (6.0), and 6.7 (8.1) m, respectively. Additionally, we demonstrate that further improvements in performance are possible by scanning over all possible combinations of the exogenous variables and only using a subset of them as features. In summary, we provide a comprehensive framework for evaluating water level forecasting by defining a baseline accuracy, analyzing performance across multiple prediction horizons, using time series cross-validation to assess accuracy and uncertainty, and examining the effects of exogenous variables on forecasting performance. In the process, our work addresses several notable gaps in the methodologies of previous works.
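The evaluation idea in this abstract (a persistence baseline scored by MAE and RMSE over several horizons using time series cross-validation) can be illustrated with a minimal rolling-origin sketch. This is not the authors' pipeline: the function names, the fold layout (`initial_train`, `step`), and the synthetic series are assumptions made purely for illustration.

```python
import math

def rolling_origin_splits(n, initial_train, horizon, step=1):
    """Yield (train_end, target_index) pairs for rolling-origin evaluation.

    The training window is series[:train_end]; the forecast target lies
    `horizon` steps after the last training observation.
    """
    train_end = initial_train
    while train_end + horizon <= n:
        yield train_end, train_end + horizon - 1
        train_end += step

def persistence_forecast(train, horizon):
    # Naive/persistence baseline: carry the last observed value forward.
    return train[-1]

def evaluate(series, forecaster, initial_train, horizon, step=1):
    """Return (MAE, RMSE) of `forecaster` over all rolling-origin folds."""
    errors = []
    for train_end, target in rolling_origin_splits(
        len(series), initial_train, horizon, step
    ):
        prediction = forecaster(series[:train_end], horizon)
        errors.append(series[target] - prediction)
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse

# Synthetic, hypothetical "water level" series with a constant trend of
# 0.5 units per step (not real Angat Dam data).
levels = [200.0 + 0.5 * t for t in range(20)]

mae_1, rmse_1 = evaluate(levels, persistence_forecast, initial_train=10, horizon=1)
mae_5, rmse_5 = evaluate(levels, persistence_forecast, initial_train=10, horizon=5)
print(mae_1, rmse_1, mae_5, rmse_5)
```

Because every training window ends strictly before its target, no future observation leaks into a forecast. On this trending toy series the persistence error grows linearly with the horizon, which is the kind of baseline behaviour a multi-horizon comparison is meant to expose.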


Author(s):  
Yun Zhang ◽  
Lianhuan Wei ◽  
Jiayu Li ◽  
Shanjun Liu ◽  
Yachun Mao ◽  
...  

More and more high-speed railways are under construction in China. Slow settlement along high-speed railway tracks and newly built stations can lead to inhomogeneous deformation of the local area, and its accumulation may threaten the safe operation of the high-speed rail system. In this paper, surface deformation of a newly built high-speed railway station and of the railway lines in the Shenyang region is retrieved by time series InSAR analysis using multi-orbit COSMO-SkyMed images. The paper focuses on the non-uniform subsidence caused by changes in the local environment along the railway. The accuracy of the settlement results can be verified by cross-validating the results obtained from two different orbits during the same period.


2004 ◽  
Vol 14 (03) ◽  
pp. 1037-1051 ◽  
Author(s):  
S. A. BILLINGS ◽  
K. L. LEE

A new NARMA-based smoothing algorithm is introduced for chaotic and nonchaotic time series. The new algorithm employs a cross-validation method to determine the smoother structure, requires very little user interaction, and can be combined with wavelet thresholding to further enhance the noise reduction. Numerical examples are included to illustrate the application of the new algorithm.


2013 ◽  
Vol 284-287 ◽  
pp. 3111-3114
Author(s):  
Hsiang Chuan Liu ◽  
Wei Sung Chen ◽  
Ben Chang Shia ◽  
Chia Chen Lee ◽  
Shang Ling Ou ◽  
...  

In this paper, a novel fuzzy measure, the high-order lambda measure, is proposed. Based on the Choquet integral with respect to this new measure, a novel composition forecasting model is also proposed, which combines the GM(1,1) forecasting model, the time series model, and the exponential smoothing model. To evaluate the efficiency of this improved composition forecasting model, an experiment on real data was conducted using the 5-fold cross-validation mean square error. The following models were compared: the Choquet integral composition forecasting model with the P-measure, Lambda-measure, L-measure, and high-order lambda measure, respectively; a ridge regression composition forecasting model; a multiple linear regression composition forecasting model; and the traditional linear weighted composition forecasting model. The experimental results showed that the Choquet integral composition forecasting model with respect to the high-order lambda measure has the best performance.
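For readers unfamiliar with the aggregation step, the following is a minimal sketch of the standard discrete Choquet integral used to combine several forecasts under a fuzzy measure. The abstract does not define the high-order lambda measure, so it is not reproduced here; the two measures below (an additive one and a simple non-additive one), the source names, and the forecast values are hypothetical stand-ins.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of non-negative values.

    values  : dict mapping source name -> forecast value (>= 0)
    measure : callable mapping a frozenset of source names to [0, 1],
              monotone, with measure(all sources) == 1
    """
    ordered = sorted(values, key=values.get)  # ascending by value
    total = 0.0
    previous = 0.0
    for i, source in enumerate(ordered):
        # Coalition of all sources whose value is at least the current one.
        coalition = frozenset(ordered[i:])
        total += (values[source] - previous) * measure(coalition)
        previous = values[source]
    return total

# Hypothetical forecasts from three component models.
forecasts = {"gm11": 1.0, "time_series": 2.0, "exp_smoothing": 3.0}

# Additive measure: the Choquet integral reduces to a weighted average.
weights = {"gm11": 0.2, "time_series": 0.3, "exp_smoothing": 0.5}

def additive_measure(coalition):
    return sum(weights[name] for name in coalition)

# A simple non-additive (but monotone) measure, chosen only for illustration.
def squared_cardinality_measure(coalition):
    return (len(coalition) / len(forecasts)) ** 2

combined_additive = choquet_integral(forecasts, additive_measure)
combined_nonadditive = choquet_integral(forecasts, squared_cardinality_measure)
print(combined_additive, combined_nonadditive)
```

With the additive measure the result equals the linear weighted combination (0.2·1 + 0.3·2 + 0.5·3), which is why the traditional linear weighted model can be viewed as a special case. A non-additive measure lets the weight of a model depend on which other models it is grouped with.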


2019 ◽  
Vol 11 (21) ◽  
pp. 2512 ◽  
Author(s):  
Nicolas Karasiak ◽  
Jean-François Dejoux ◽  
Mathieu Fauvel ◽  
Jérôme Willm ◽  
Claude Monteil ◽  
...  

Mapping forest composition using multiseasonal optical time series remains a challenge. Highly contrasted results are reported from one study to another, suggesting that drivers of classification errors are still under-explored. We evaluated the performances of single-year Formosat-2 time series to discriminate tree species in temperate forests in France and investigated how predictions vary statistically and spatially across multiple years. Our objective was to better estimate the impact of spatial autocorrelation in the validation data on measurement accuracy and to understand which drivers in the time series are responsible for classification errors. The experiments were based on 10 Formosat-2 image time series irregularly acquired during the seasonal vegetation cycle from 2006 to 2014. Because of heavy cloud cover in 2006, an alternative 2006 time series using only cloud-free images was added. Thirteen tree species were classified in each single-year dataset based on the Support Vector Machine (SVM) algorithm. The performances were assessed using a spatial leave-one-out cross-validation (SLOO-CV) strategy, thereby guaranteeing full independence of the validation samples, and compared with standard non-spatial leave-one-out cross-validation (LOO-CV). The results show relatively close statistical performances from one year to the next despite the differences between the annual time series. Good agreement between years was observed in monospecific tree plantations of broadleaf species, versus high disparity in other forests composed of different species. A strong positive bias in the accuracy assessment (up to 0.4 of Overall Accuracy (OA)) was also found when spatial dependence in the validation data was not removed. Using the SLOO-CV approach, the average OA values per year ranged from 0.48 for 2006 to 0.60 for 2013, which satisfactorily represents the spatial instability of species prediction between years.
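The contrast between standard and spatial leave-one-out cross-validation can be sketched as follows. This is only an illustration under stated assumptions: a 1-nearest-neighbour classifier stands in for the SVM, the buffer radius and sample layout are invented, and the synthetic data are deliberately contrived so that pixels from the same stand look alike.

```python
import math

def nearest_neighbor_label(train, query_features):
    """Predict with 1-nearest-neighbour in feature space (stand-in for an SVM)."""
    best = min(train, key=lambda s: math.dist(s["features"], query_features))
    return best["label"]

def leave_one_out_accuracy(samples, buffer_radius=None):
    """Leave-one-out overall accuracy.

    buffer_radius=None gives standard LOO-CV. Otherwise, training samples
    located within `buffer_radius` of the held-out sample are removed,
    which is the spatial variant (SLOO-CV).
    """
    correct = 0
    evaluated = 0
    for i, held_out in enumerate(samples):
        train = [
            s for j, s in enumerate(samples)
            if j != i and (
                buffer_radius is None
                or math.dist(s["xy"], held_out["xy"]) > buffer_radius
            )
        ]
        if not train:
            continue  # nothing left to train on for this sample
        evaluated += 1
        predicted = nearest_neighbor_label(train, held_out["features"])
        correct += predicted == held_out["label"]
    return correct / evaluated

def stand(label, x, base_feature):
    # Two neighbouring pixels of one forest stand with near-identical features.
    return [
        {"label": label, "xy": (x, 0.0), "features": (base_feature,)},
        {"label": label, "xy": (x, 1.0), "features": (base_feature + 0.1,)},
    ]

# Hypothetical data: four stands, two species, spatially far apart.
samples = (
    stand("oak", 0.0, 0.0) + stand("pine", 100.0, 1.0)
    + stand("oak", 200.0, 2.0) + stand("pine", 300.0, 3.0)
)

standard_oa = leave_one_out_accuracy(samples)
spatial_oa = leave_one_out_accuracy(samples, buffer_radius=10.0)
print(standard_oa, spatial_oa)
```

In this toy setup standard LOO-CV scores every pixel against its own stand-mate and therefore looks far better than the spatial variant, which mirrors (in exaggerated form) the positive bias the abstract reports when spatial dependence is not removed.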


Biometrika ◽  
1988 ◽  
Vol 75 (3) ◽  
pp. 594-600 ◽  
Author(s):  
PIET DE JONG

2005 ◽  
Vol 79 (6-7) ◽  
pp. 363-369 ◽  
Author(s):  
D.W. Zheng ◽  
P. Zhong ◽  
X.L. Ding ◽  
W. Chen
