Empirical Study on the Impact of Different Sets of Parameters of Gradient Boosting Algorithms for Time-Series Forecasting with LightGBM

Time Series Forecasting has long been an important area of research in many domains, because many different types of data are stored as time series. Given the growing availability of data and computing power in recent years, Deep Learning has become a fundamental part of the new generation of Time Series Forecasting models, obtaining excellent results. As time series problems are studied in many different fields, a large number of new architectures have been developed in recent years. This has been helped by the growing availability of open source frameworks, which make the development of new custom network components easier and faster. In this paper, three Deep Learning architectures for Time Series Forecasting are presented: Recurrent Neural Networks (RNNs), the most classical and widely used architecture for Time Series Forecasting problems; Long Short-Term Memory (LSTM) networks, an evolution of RNNs developed to overcome the vanishing gradient problem; and Gated Recurrent Units (GRUs), another evolution of RNNs, similar to LSTM.

The article is devoted to modeling and forecasting the cost of international air transportation during a pandemic using deep learning methods. The author builds time series models of American Airlines (AAL) stock prices for a selected period using LSTM, GRU, and RNN recurrent neural network models and compares the accuracy of the forecast results.
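As a rough illustration of how forecasts from such models are typically framed and scored, the sketch below turns a series into lagged input windows and computes two common accuracy measures against a naive last-value baseline. The function names, the window length, and the price values are assumptions made for this example; the abstract does not state which windowing or error metrics the author used.

```python
import math


def make_windows(series, n_lags):
    """Turn a 1-D series into (input window, next value) pairs."""
    pairs = []
    for t in range(n_lags, len(series)):
        pairs.append((series[t - n_lags:t], series[t]))
    return pairs


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical closing prices, made up for illustration only.
prices = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]

windows = make_windows(prices, n_lags=3)
actual = [target for _, target in windows]

# Naive baseline: predict that the next value equals the last observed value.
naive = [window[-1] for window, _ in windows]

print("RMSE:", rmse(actual, naive))
print("MAE:", mae(actual, naive))
```

Any of the recurrent models named above would consume the same windows in place of the naive baseline, so their predictions can be scored with the same two functions.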


Time series forecasting for dynamic quality of web services: An empirical study

Journal of Systems and Software ◽ 10.1016/j.jss.2017.09.011 ◽ 2017 ◽ Vol 134 ◽ pp. 279-303 ◽ Cited By ~ 10

Author(s): Yang Syu ◽ Jong-Yih Kuo ◽ Yong-Yi Fanjiang

Keyword(s): Time Series ◽ Web Services ◽ Empirical Study ◽ Time Series Forecasting ◽ Dynamic Quality


Significance Tests for Boosted Location and Scale Models with Linear Base-Learners

The International Journal of Biostatistics ◽ 10.1515/ijb-2018-0110 ◽ 2019 ◽ Vol 15 (1)

Author(s): Tobias Hepp ◽ Matthias Schmid ◽ Andreas Mayr

Keyword(s): Type I Error ◽ Parametric Bootstrap ◽ Additive Models ◽ Model Specification ◽ Gradient Boosting ◽ Type I ◽ Wide Range ◽ Scale Models ◽ Boosting Algorithms ◽ The Impact

Generalized additive models for location, scale and shape (GAMLSS) offer very flexible solutions to a wide range of statistical analysis problems, but can be challenging in terms of proper model specification. This complex task can be simplified using regularization techniques such as gradient boosting algorithms, but the estimates derived from such models are shrunken towards zero, and it is consequently not straightforward to calculate proper confidence intervals or test statistics. In this article, we propose two strategies to obtain p-values for linear effect estimates in Gaussian location and scale models, based on permutation tests and a parametric bootstrap approach. These procedures can provide a solution for one of the remaining problems in the application of gradient boosting algorithms for distributional regression in biostatistical data analyses. Results from extensive simulations indicate that in low-dimensional data both suggested approaches are able to hold the type-I error threshold and provide reasonable test power, comparable to the Wald-type test for maximum likelihood inference. In high-dimensional data, when gradient boosting is the only feasible inference method for this model class, the power decreases but the type-I error is still under control. In addition, we demonstrate the application of both tests in an epidemiological study to analyse the impact of physical exercise on both the average and the stability of lung function in elderly people in Germany.
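The general idea of a permutation-based p-value for a linear effect can be sketched as follows. This is a simplified stand-in, not the authors' procedure: it uses an ordinary least-squares slope with a single covariate instead of a boosted location and scale model, and the function names, sample size, and number of permutations are assumptions chosen for illustration.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Two-sided permutation p-value for the slope of y on x.

    Shuffling y breaks any association with x, so the shuffled slopes
    approximate the distribution of the estimate under the null hypothesis.
    """
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(ols_slope(x, y_perm)) >= observed:
            hits += 1
    # Adding 1 to both counts treats the observed statistic as one permutation,
    # which keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Simulated data (hypothetical): one response with a true effect, one without.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(50)]
y_effect = [2.0 * xi + rng.gauss(0, 1) for xi in x]
y_null = [rng.gauss(0, 1) for _ in x]

p_effect = permutation_p_value(x, y_effect)
p_null = permutation_p_value(x, y_null)
print("p-value with true effect:", p_effect)
print("p-value without effect:", p_null)
```

Because the null distribution is built by refitting on shuffled data, the same recipe applies to a shrunken estimate for which no closed-form standard error is available: only the fitting function would need to be swapped out.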


The bias in reversing the Box–Cox transformation in time series forecasting: An empirical study based on neural networks

Neurocomputing ◽ 10.1016/j.neucom.2014.01.004 ◽ 2014 ◽ Vol 136 ◽ pp. 281-288 ◽ Cited By ~ 2

Author(s): Alexandre Fructuoso da Costa ◽ Antonio Fernando Crepaldi

Keyword(s): Neural Networks ◽ Time Series ◽ Empirical Study ◽ Time Series Forecasting ◽ Box Cox Transformation


Analysis of the impact of COVID-19 on collisions, fatalities and injuries using time series forecasting: The case of Greece

Accident Analysis & Prevention ◽ 10.1016/j.aap.2021.106391 ◽ 2021 ◽ pp. 106391

Author(s): Marios Sekadakis ◽ Christos Katrakazas ◽ Eva Michelaraki ◽ Fotini Kehagia ◽ George Yannis

Keyword(s): Time Series ◽ Time Series Forecasting ◽ The Impact ◽ Of Greece


Forecasting US movies box office performances in Turkey using machine learning algorithms

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-189120 ◽ 2020 ◽ Vol 39 (5) ◽ pp. 6579-6590

Author(s): Sandy Çağlıyor ◽ Başar Öztayşi ◽ Selime Sezgin

Keyword(s): Machine Learning ◽ Global Economy ◽ Learning Algorithms ◽ Forecast Model ◽ Machine Learning Algorithms ◽ Gradient Boosting ◽ High Stakes ◽ Box Office ◽ Industry Forecast ◽ The Impact

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performance, and the industry still struggles to predict box office performance in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. A dataset of 1559 movies is constructed from various sources. First, the independent variables are grouped as pre-release, distributor type, and international distribution, based on their characteristics. The number of attendances is discretized into three classes. Four popular machine learning algorithms (artificial neural networks, decision tree regression, gradient boosting trees, and random forests) are employed, and the impact of each variable group is observed by comparing the performance of the resulting models. Then the number of target classes is increased to five and eight, and the results are compared with previously developed models in the literature.
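The discretization step can be sketched as follows. The abstract does not say how the class boundaries were chosen, so this example assumes equal-frequency (tertile) bins; the function names and attendance figures are hypothetical and serve only to show the mechanics.

```python
def tertile_cutoffs(values):
    """Return the two cut points that split the sorted values into thirds."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def to_class(value, cutoffs):
    """Map a value to class 0 (low), 1 (medium), or 2 (high)."""
    low, high = cutoffs
    if value < low:
        return 0
    if value < high:
        return 1
    return 2


# Hypothetical attendance counts, made up for illustration only.
attendance = [1200, 560000, 34000, 8900, 150000, 72000, 2300, 980000, 41000]

cutoffs = tertile_cutoffs(attendance)
labels = [to_class(v, cutoffs) for v in attendance]
print("cutoffs:", cutoffs)
print("labels:", labels)
```

Moving to five or eight target classes, as the study later does, would follow the same pattern with more cut points.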
