A Study on Time Series Forecasting using Hybridization of Time Series Models and Neural Networks

Introduction: Auto-Regressive Integrated Moving Average (ARIMA) and Artificial Neural Networks (ANN) are, respectively, the leading linear and non-linear models in machine learning for time series forecasting. Objective: This survey paper reviews recent advances in machine learning and artificial intelligence techniques used for forecasting different events. Methods: The paper presents an extensive survey of work in machine learning in which hybrid models are compared with the basic models for forecasting on the basis of error measures such as Mean Absolute Deviation (MAD), Mean Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Normalized Root Mean Square Error (NRMSE). Results: Table 1 summarizes the important papers discussed, on the basis of parameters that explain the efficiency of hybrid models or of models used in isolation. Conclusion: Hybrid models have produced more accurate results than the same models used in isolation, yet some research papers argue that hybrids cannot always outperform individual models.
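
The error measures named in this abstract can be illustrated with a minimal Python sketch. The function name `forecast_errors` and the sample numbers are hypothetical, not taken from the paper. Two conventions are assumed here: MAD is computed the same way as MAE (a common usage in forecasting texts), and NRMSE is normalized by the range of the actual values (other normalizations, such as by the mean, are also in use).

```python
import math


def forecast_errors(actual, predicted):
    """Return common forecast error measures for two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error (MAD assumed identical)
    mse = sum(e * e for e in errors) / n    # mean square error
    rmse = math.sqrt(mse)                   # root mean square error

    # MAPE is undefined when any actual value is zero.
    if any(a == 0 for a in actual):
        mape = float("nan")
    else:
        mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # Assumption: NRMSE is RMSE divided by the range of the actual values.
    spread = max(actual) - min(actual)
    nrmse = rmse / spread if spread > 0 else float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "NRMSE": nrmse}


# Made-up numbers, for illustration only.
metrics = forecast_errors([100, 200, 300, 400], [110, 190, 310, 390])
```

Because MAPE and NRMSE are scale-free while MSE and RMSE carry the units of the data, comparisons across the studies listed here are only meaningful on the scale-free measures.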

Forecasting hourly emergency department arrival using time series analysis

British Journal of Healthcare Management ◽

10.12968/bjhc.2019.0067 ◽

2020 ◽

Vol 26 (1) ◽

pp. 34-43

Author(s):

Avishek Choudhury ◽

Estefania Urena

Keyword(s):

Emergency Department ◽

Time Series ◽

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Moving Average ◽

Time Series Forecasting ◽

Mean Square ◽

Mean Error ◽

Auto Regressive

Background/aims: The stochastic arrival of patients at hospital emergency departments complicates their management. More than 50% of hospital emergency departments tend to operate beyond their normal capacity and eventually fail to deliver high-quality care. To address this concern, much research has been carried out using yearly, monthly and weekly time-series forecasting. This article discusses the use of hourly time-series forecasting to help improve emergency department management by predicting future patient arrivals. Methods: Emergency department admission data from January 2014 to August 2017 were retrieved from a hospital in Iowa. The auto-regressive integrated moving average (ARIMA), Holt–Winters, TBATS, and neural network methods were implemented and compared as forecasters of hourly patient arrivals. Results: The ARIMA (3,0,0)(2,1,0) model was selected as the best fit, with the minimum Akaike information criterion and Schwarz Bayesian criterion. The model was stationary and passed the Box–Ljung correlation test and the Jarque–Bera test for normality. The mean error and root mean square error were selected as performance measures. A mean error of 1.001 and a root mean square error of 1.55 were obtained. Conclusions: ARIMA can be used to provide hourly forecasts of emergency department arrivals and can be implemented as a decision support system to aid staff when scheduling and adjusting for emergency department arrivals.
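
For readers unfamiliar with the notation, ARIMA (3,0,0)(2,1,0) denotes a seasonal model with three non-seasonal autoregressive terms, two seasonal autoregressive terms, and one seasonal difference. A sketch of its general form follows, together with the two performance measures the abstract reports. The symbols are notational assumptions: B is the backshift operator, s is the seasonal period (not stated in the abstract; s = 24 would be a natural guess for hourly data, but it is only a guess), and no constant term is assumed.

```latex
\left(1 - \phi_1 B - \phi_2 B^2 - \phi_3 B^3\right)
\left(1 - \Phi_1 B^{s} - \Phi_2 B^{2s}\right)
\left(1 - B^{s}\right) y_t = \varepsilon_t

\mathrm{ME} = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right),
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}
```

Unlike RMSE, the mean error keeps the sign of each residual, so it measures bias rather than spread.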

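Peramalan Data Ekspor Nonmigas Provinsi Kalimantan Timur Menggunakan Metode Weighted Fuzzy Time Series Lee [Forecasting Non-Oil-and-Gas Export Data of East Kalimantan Province Using the Lee Weighted Fuzzy Time Series Method]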

J Statistika: Jurnal Ilmiah Teori dan Aplikasi Statistika ◽

10.36456/jstat.vol14.no1.a3747 ◽

2021 ◽

Vol 14 (1) ◽

pp. 1-10

Author(s):

Muhammad Wahdeni Pramana ◽

Ika Purnamasari ◽

Surya Prangga

Keyword(s):

Time Series ◽

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Mean Absolute Percentage Error ◽

Fuzzy Time Series ◽

Percentage Error ◽

Mean Square ◽

Absolute Percentage Error

Exports are the trade or sale of goods from within a country to other countries. Non-oil-and-gas exports are one component of Gross Regional Domestic Product (GRDP), so forecasts of their future value are needed. Fuzzy Time Series (FTS) is a forecasting method based on fuzzy set theory and fuzzy logic, whose forecasts can be expressed in linguistic terms. The Lee Weighted Fuzzy Time Series (WFTS) method extends FTS by adding a weight to each relation pattern that is formed. The aim of this study is to forecast the non-oil-and-gas exports of East Kalimantan Province for November 2020 and to assess forecast accuracy using Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE). For the East Kalimantan non-oil-and-gas export data from January 2019 to October 2020, every weighting constant tested gave a MAPE below 10%. The best weighting constant gave the minimum MAPE of 3.62% and the minimum RMSE of 50.67. Using this best weighting constant, the forecast for November 2020 is 850.96 million USD.

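Prediksi Data Time Series Saham Bank BRI Dengan Mesin Belajar LSTM (Long Short-Term Memory) [Time Series Prediction of Bank BRI Stock Data Using LSTM (Long Short-Term Memory) Machine Learning]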

Journal of Informatic and Information Security ◽

10.31599/jiforty.v1i1.133 ◽

2020 ◽

Vol 1 (1) ◽

pp. 1-8

Author(s):

Adhitio Satyo Bayangkari Karno

Keyword(s):

Machine Learning ◽

Time Series ◽

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Short Term Memory ◽

Mean Square ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

This study aims to measure the accuracy of time series prediction using the LSTM (Long Short-Term Memory) machine learning method, and to determine the number of epochs needed to produce a small RMSE (Root Mean Square Error). The results show that the RMSE varies widely with the number of epochs used in training, which makes it difficult to choose the right number of epochs. Iterating the LSTM process over different numbers of epochs, and visualizing the results in a graph, makes it easier to find the number of epochs with the minimum RMSE. For the prediction of BBRI stock data, a good RMSE value was obtained (RMSE = 227.470333244533). Keywords: long short-term memory, machine learning, epoch, root mean square error, mean square error.

Prediction of hydrological time-series using extreme learning machine

Journal of Hydroinformatics ◽

10.2166/hydro.2015.020 ◽

2015 ◽

Vol 18 (2) ◽

pp. 345-353 ◽

Cited By ~ 14

Author(s):

Md Atiquzzaman ◽

Jaya Kandasamy

Keyword(s):

Neural Networks ◽

Time Series ◽

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Extreme Learning Machine ◽

Mean Square ◽

Feed Forward Neural Networks ◽

Learning Machine ◽

Hidden Layer

The application of feed-forward neural networks has been limited by the slow, conventional gradient-based learning algorithms used in training and by the iterative determination of network parameters. This paper demonstrates a method that partly overcomes these problems by using an extreme learning machine (ELM), which predicts hydrological time-series very quickly. ELMs, also called single-hidden layer feed-forward neural networks (SLFNs), can generalize well on extremely complex problems. ELM randomly chooses the parameters of a single hidden layer and analytically determines the output weights used to predict the output. The ELM method was applied to predict hydrological flow series for the Tryggevælde Catchment, Denmark, and for the Mississippi River at Vicksburg, USA. The results confirmed that ELM's performance was similar or better in terms of root mean square error (RMSE) and normalized root mean square error (NRMSE) compared with ANN and other previously published techniques, namely evolutionary computation based support vector machine (EC-SVM), the standard chaotic approach and the inverse approach.
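
The core ELM idea described above (random hidden-layer parameters, output weights solved analytically) can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the tanh activation, the number of hidden units, the lag length and the synthetic sine series are all hypothetical choices made for the example, and the series is not hydrological data.

```python
import numpy as np


def elm_fit(X, y, n_hidden=50, seed=0):
    """Fit an ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer outputs
    beta = np.linalg.pinv(H) @ y                 # output weights via Moore-Penrose pseudo-inverse
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the fitted output weights."""
    return np.tanh(X @ W + b) @ beta


def make_lagged(series, n_lags):
    """Build a one-step-ahead design matrix from the previous n_lags values."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y


# Synthetic series, for illustration only.
series = np.sin(0.3 * np.arange(300))
X, y = make_lagged(series, n_lags=5)
W, b, beta = elm_fit(X[:200], y[:200])
pred = elm_predict(X[200:], W, b, beta)
rmse = float(np.sqrt(np.mean((pred - y[200:]) ** 2)))
```

Training reduces to a single pseudo-inverse with no gradient iterations, which is where the speed advantage claimed in the abstract comes from.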

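PEMODELAN JUMLAH PENDUDUK MISKIN DI JAWA TENGAH MENGGUNAKAN GEOGRAPHICALLY WEIGHTED REGRESSION (GWR) [Modeling the Number of Poor People in Central Java Using Geographically Weighted Regression (GWR)]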

Jurnal Litbang Sukowati Media Penelitian dan Pengembangan ◽

10.32630/sukowati.v4i2.122 ◽

2019 ◽

Vol 4 (2) ◽

pp. 10

Author(s):

Sugi Haryanto ◽

Gilang Axelline Andriani

Keyword(s):

Sustainable Development ◽

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Geographically Weighted Regression ◽

Sustainable Development Goals ◽

Weighted Regression ◽

Percentage Error ◽

Mean Square ◽

Development Goals

Poverty is often used as a measure of the success of a regional head's leadership. Its eradication is also the first of the Sustainable Development Goals (SDGs). Appropriate policy is essential to achieving the sustainable development goals. Geographically Weighted Regression (GWR) modeling is important for building a model for each regency/city as a basis for policymaking. The variables used in this study are the number of poor people, the Human Development Index (HDI), the Open Unemployment Rate, and the Regency/City Minimum Wage. The aim of this study is to determine the factors that affect the number of poor people in each regency/city in Central Java. GWR modeling was more effective in describing the number of poor people in the regencies/cities of Central Java in 2018, as shown by an increase in R² and a decrease in Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE).

Subset selection of markers for the genome-enabled prediction of genetic values using radial basis function neural networks

Acta Scientiarum Agronomy ◽

10.4025/actasciagron.v43i1.46307 ◽

2020 ◽

Vol 43 ◽

pp. e46307 ◽

Cited By ~ 1

Author(s):

Isabela de Castro Sant'Anna ◽

Gabi Nunes Silva ◽

Moysés Nascimento ◽

Cosme Damião Cruz

Keyword(s):

Neural Networks ◽

Root Mean Square Error ◽

Radial Basis Function ◽

Mean Square Error ◽

Root Mean Square ◽

Basis Function ◽

Stepwise Regression ◽

Subset Selection ◽

Mean Square ◽

Selection Of

This paper aimed to evaluate the effectiveness of subset selection of markers for genome-enabled prediction of genetic values using radial basis function neural networks (RBFNN). To this end, an F1 population of 500 individuals, derived from the hybridization of divergent parents and genotyped with 1000 SNP-type markers, was simulated. Phenotypic traits were determined by adopting three different gene action models (additive, additive-dominant, and epistatic), representing two dominance situations (partial and complete), with quantitative traits having a heritability (h²) of 30 and 60%. Traits were controlled by 50 loci, considering two alleles per locus. Twelve different scenarios were represented in the simulation. Stepwise regression was applied before the prediction methods. Reliability and root mean square error were estimated using a fivefold cross-validation scheme. Overall, dimensionality reduction improved the reliability values for all scenarios. Specifically, with h² = 30% in the scenario with additive effects, the reliability value rose from 0.03 to 0.59 using RBFNN and from 0.10 to 0.57 with RR-BLUP. In the additive-dominant scenario, the reliability values changed from 0.12 to 0.59 using RBFNN and from 0.12 to 0.58 with RR-BLUP, and in the epistatic scenarios, the reliability values changed from 0.07 to 0.50 using RBFNN and from 0.06 to 0.47 with RR-BLUP. The results showed that applying stepwise regression before these techniques improved the accuracy of genetic value prediction and, above all, greatly reduced the root mean square error, while also shortening processing and analysis time through the reduction in dimensionality.

Flood Detection and Susceptibility Mapping Using Sentinel-1 Time Series, Alternating Decision Trees, and Bag-ADTree Models

Complexity ◽

10.1155/2020/4271376 ◽

2020 ◽

Vol 2020 ◽

pp. 1-21 ◽

Cited By ~ 1

Author(s):

Ayub Mohammadi ◽

Khalil Valizadeh Kamran ◽

Sadra Karimzadeh ◽

Himan Shahabi ◽

Nadhir Al-Ansari

Keyword(s):

Time Series ◽

Root Mean Square Error ◽

Decision Trees ◽

Mean Square Error ◽

Root Mean Square ◽

Susceptibility Mapping ◽

Mean Square ◽

Topographic Wetness Index ◽

Flood Detection ◽

Flooded Areas

Flooding is one of the most damaging natural hazards globally. During the past three years, floods have claimed hundreds of lives and caused millions of dollars of damage in Iran. In this study, we detected flood locations and mapped areas susceptible to floods using time series satellite data analysis as well as a new model of bagging ensemble-based alternating decision trees, namely, bag-ADTree. We used Sentinel-1 data for flood detection and time series analysis. We employed twelve conditioning parameters for mapping areas susceptible to floods: elevation, normalized difference vegetation index, slope, topographic wetness index, aspect, curvature, stream power index, lithology, drainage density, proximity to rivers, soil type, and rainfall. ADTree and bag-ADTree models were used for flood susceptibility mapping. We used the Sentinel Application Platform, Waikato Environment for Knowledge Analysis, ArcGIS, and Statistical Package for the Social Sciences for preprocessing, processing, and postprocessing of the data. We extracted 199 locations as flooded areas, which were checked using a global positioning system to ensure that flooded areas were detected correctly. Root mean square error, accuracy, and the area under the ROC curve (AUC) were used to validate the models. Root mean square error was 0.31 for ADTree and 0.3 for bag-ADTree. Accuracy was 86.61 for the bag-ADTree model and 85.44 for the ADTree method. Based on AUC, success and prediction rates were 0.736 and 0.786, respectively, for the bag-ADTree algorithm, and 0.714 and 0.784 for ADTree. This study can be a good source of information for crisis management in the study area.

Using Multi-Temporal MODIS NDVI Data to Monitor Tea Status and Forecast Yield: A Case Study at Tanuyen, Laichau, Vietnam

Remote Sensing ◽

10.3390/rs12111814 ◽

2020 ◽

Vol 12 (11) ◽

pp. 1814

Author(s):

Phamchimai Phan ◽

Nengcheng Chen ◽

Lei Xu ◽

Zeqiang Chen

Keyword(s):

Remote Sensing ◽

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Vegetation Index ◽

Coefficient Of Determination ◽

Percentage Error ◽

Support Vector ◽

Mean Square ◽

Multi Temporal

Tea is a cash crop that improves the quality of life for people in the Tanuyen District of Laichau Province, Vietnam. Tea yield, however, has stagnated in recent years due to changes in temperature, precipitation, the age of the tea bushes, and diseases. Developing an approach for monitoring tea bushes by remote sensing and Geographic Information Systems (GIS) might be a way to alleviate this problem. Using multi-temporal remote sensing data, the paper investigates changes in tea health and yield forecasting through the normalized difference vegetation index (NDVI). In this study, we used NDVI as a support tool to demonstrate temporal and spatial changes by extracting tea NDVI values and calculating the mean NDVI value. The results showed that the minimum NDVI value was 0.42, during January 2013 and February 2015 and 2016. The maximum NDVI value was in August 2015 and June 2017. The linear relationship between NDVI value and mean temperature was strong, with R² = 0.79. Our results confirm that combining meteorological data and NDVI data can achieve high yield-prediction performance. Three models were used to predict tea yield: support vector machine (SVM), random forest (RF), and the traditional linear regression model (TLRM). For the period 2009 to 2018, the tea yield prediction by the RF model was the best, with R² = 0.73; SVM gave 0.66 and the TLRM 0.57. Three evaluation indicators were used to assess accuracy: the coefficient of determination (R²), root-mean-square error (RMSE), and percentage error of tea yield (PETY). The highest accuracy for the three models was in 2015, with R² ≥ 0.87, RMSE < 50 kg/ha, and PETY below 3%. In the other years, the prediction accuracy was higher in the SVM and RF models. The RF algorithm performed better on PETY (≤10%), and its root mean square error was significantly lower (≤80 kg/ha). RMSE and PETY showed relatively good values in the TLRM model, with RMSE from 80 to 100 kg/ha and PETY from 8 to 15%.

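Penerapan Generalized Regression Neural Networks untuk Memprediksi Produksi Padi Terhadap Perubahan Iklim [Application of Generalized Regression Neural Networks to Predict Rice Production under Climate Change]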

Jurnal Teknologi Rekayasa ◽

10.31544/jtera.v2.i2.2017.117-124 ◽

2017 ◽

Vol 2 (2) ◽

pp. 117 ◽

Cited By ~ 1

Author(s):

Muhammad Alkaff ◽

Yuslena Sari

Keyword(s):

Neural Networks ◽

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Input Data ◽

Mean Square ◽

Generalized Regression Neural Networks ◽

Data Output ◽

Generalized Regression

Rice, the main staple food of the Indonesian people, is a food crop vulnerable to climate change. Recording and forecasting rice production are essential to support food-security policy. This study aims to forecast rice production in Barito Kuala Regency, the largest rice-producing regency in South Kalimantan, using climate data as input. The climate data come from the Syamsudin Noor Meteorological Station, and the output data are rice production data from the Central Bureau of Statistics (BPS) of South Kalimantan Province. The method used to forecast rice production is Generalized Regression Neural Networks (GRNN). Testing yielded a Root Mean Square Error (RMSE) of 0.296 with a smoothness parameter of 1. Keywords: rice, climate, Barito Kuala, GRNN, RMSE
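
A GRNN prediction is a Gaussian-kernel weighted average of the training targets, with a single smoothness parameter controlling how quickly the weights fall off with distance. The following Python sketch illustrates that calculation only. The function name `grnn_predict` and the toy data are hypothetical, not the study's climate or rice-production data, and the default `sigma=1.0` simply mirrors the smoothness value quoted in the abstract.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=1.0):
    """Predict one value as a Gaussian-kernel weighted average of training targets."""
    weights = []
    for x in train_x:
        # Squared Euclidean distance between the query and one training pattern.
        d2 = sum((xi - qi) ** 2 for xi, qi in zip(x, query))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; try a larger sigma")
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Toy data, for illustration only.
train_x = [[0.0], [1.0], [2.0]]
train_y = [0.0, 1.0, 2.0]

smooth = grnn_predict(train_x, train_y, [1.0], sigma=1.0)   # broad kernel: averages neighbours
sharp = grnn_predict(train_x, train_y, [0.1], sigma=0.05)   # narrow kernel: nearest pattern dominates
```

There is no iterative training: the network stores the training patterns, and the smoothness parameter is the only value to tune.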

Hydrochars as Emerging Biofuels: Recent Advances and Application of Artificial Neural Networks for the Prediction of Heating Values

Energies ◽

10.3390/en13174572 ◽

2020 ◽

Vol 13 (17) ◽

pp. 4572

Author(s):

Ioannis O. Vardiambasis ◽

Theodoros N. Kapetanakis ◽

Christos D. Nikolopoulos ◽

Trinh Kieu Trang ◽

Toshiki Tsubota ◽

...

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Sewage Sludge ◽

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Food Waste ◽

Hydrothermal Carbonization ◽

Mean Square ◽

Artificial Neural

In this study, the growing scientific field of alternative biofuels was examined with respect to hydrochars produced from renewable biomasses. Hydrochars are the solid products of hydrothermal carbonization (HTC), and their properties depend on the initial biomass and on the temperature and duration of treatment. The basic (Scopus) and advanced (Citespace) analysis of the literature showed that this is a dynamic research area, with several sub-fields of intense activity. The focus of researchers on sewage sludge and food waste as hydrochar precursors was highlighted and reviewed. It was established that hydrochars behave better as fuels than these feedstocks do. Food waste can be particularly useful in co-hydrothermal carbonization with ash-rich materials. In the case of sewage sludge, simultaneous P recovery from the HTC wastewater may add more value to the process. For both feedstocks, results from large-scale HTC are practically non-existent. Following the review, related data from the years 2014–2020 were retrieved and fitted with four different artificial neural networks (ANNs). Based on the elemental content, HTC temperature and time (as inputs), the higher heating values (HHVs) and yields (as outputs) could be successfully predicted, regardless of the original biomass used for hydrochar production. ANN3 (based on C, O, H content, and HTC temperature) showed the best HHV prediction performance (R² 0.917, root mean square error 1.124); however, hydrochars' HHVs could also be satisfactorily predicted from the C content alone (ANN1, R² 0.897, root mean square error 1.289).
