Short and Very Short Term Firm-level Load Forecasting for Warehouses: A Comparison of Machine Learning and Deep Learning Models

Commercial buildings are a significant consumer of energy worldwide. Logistics facilities, and specifically warehouses, are a common building type yet under-researched in the demand-side energy forecasting literature. Warehouses have an idiosyncratic profile when compared to other commercial and industrial buildings, with a significant reliance on a small number of energy systems. As such, warehouse owners and operators are increasingly entering into energy performance contracts with energy service companies (ESCOs) to minimise environmental impact, reduce costs, and improve competitiveness. ESCOs and warehouse owners and operators require accurate forecasts of their energy consumption so that precautionary and mitigation measures can be taken. This paper explores the performance of three machine learning models (Support Vector Regression (SVR), Random Forest, and Extreme Gradient Boosting (XGBoost)), three deep learning models (Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU)), and a classical time series model, Autoregressive Integrated Moving Average (ARIMA), for predicting daily energy consumption. The dataset comprises 8,040 records generated over an 11-month period from January to November 2020 from a non-refrigerated logistics facility located in Ireland. The grid search method was used to identify the best configuration for each model. The proposed XGBoost models outperform the other models for both very short term load forecasting (VSTLF) and short term load forecasting (STLF); the ARIMA model performed the worst.
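The grid search step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: simple exponential smoothing stands in for the SVR, Random Forest, XGBoost, and recurrent models, the `load` values are hypothetical, and the parameter names `alpha` and `window` are assumptions chosen for illustration. The point is only the mechanics: every combination in the grid is scored on a chronologically later hold-out, and the lowest error wins.

```python
from itertools import product


def ses_forecast(history, alpha):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def walk_forward_mae(series, n_test, alpha, window):
    """Mean absolute error over the last n_test points.

    Each test point is forecast only from the `window` observations
    that precede it, so no future data leaks into the forecast.
    """
    errors = []
    for t in range(len(series) - n_test, len(series)):
        history = series[max(0, t - window):t]
        errors.append(abs(series[t] - ses_forecast(history, alpha)))
    return sum(errors) / len(errors)


def grid_search(series, n_test, grid):
    """Score every parameter combination and return the best one."""
    best_params, best_score = None, float("inf")
    for alpha, window in product(grid["alpha"], grid["window"]):
        score = walk_forward_mae(series, n_test, alpha, window)
        if score < best_score:
            best_params, best_score = {"alpha": alpha, "window": window}, score
    return best_params, best_score


# Hypothetical daily consumption values (kWh), for illustration only.
load = [410, 425, 400, 390, 430, 300, 280, 415, 420, 405, 395, 435, 310, 285]
grid = {"alpha": [0.2, 0.5, 0.8], "window": [3, 7]}
best_params, best_score = grid_search(load, n_test=4, grid=grid)
print(best_params, round(best_score, 2))
```

With a real model, the body of `walk_forward_mae` would be replaced by a fit-and-predict call; the chronological split is kept because shuffled cross-validation lets later observations inform earlier forecasts.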

Short-Term Firm-Level Energy-Consumption Forecasting for Energy-Intensive Manufacturing: A Comparison of Machine Learning and Deep Learning Models

Algorithms ◽

10.3390/a13110274 ◽

2020 ◽

Vol 13 (11) ◽

pp. 274 ◽

Cited By ~ 1

Author(s):

Andrea Maria N. C. Ribeiro ◽

Pedro Rafael X. do Carmo ◽

Iago Richard Rodrigues ◽

Djamel Sadok ◽

Theo Lynn ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Energy Consumption ◽

Mitigation Measures ◽

Support Vector ◽

Level Energy ◽

Learning Models ◽

Short Term ◽

Case Site

To minimise environmental impact, to avoid regulatory penalties, and to improve competitiveness, energy-intensive manufacturing firms require accurate forecasts of their energy consumption so that precautionary and mitigation measures can be taken. Deep learning is widely touted as a superior analytical technique to traditional artificial neural networks, machine learning, and other classical time-series models due to its high dimensionality and problem-solving capabilities. Despite this, research on its application in demand-side energy forecasting is limited. We compare two benchmarks (Autoregressive Integrated Moving Average (ARIMA) and an existing manual technique used at the case site) against three deep-learning models (simple Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU)) and two machine-learning models (Support Vector Regression (SVR) and Random Forest) for short-term load forecasting (STLF) using data from a Brazilian thermoplastic resin manufacturing plant. We use the grid search method to identify the best configuration for each model and then use Diebold–Mariano testing to confirm the results. The results suggest that the legacy approach used at the case site is the worst performing and that the GRU model outperformed all other models tested.
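For readers unfamiliar with Diebold–Mariano testing, the following is a minimal sketch of the statistic for one-step-ahead forecasts under squared-error loss. It assumes the loss differential is serially uncorrelated and uses a normal approximation for the p-value; the abstract does not state which loss function or variance estimator the authors used, so those choices are assumptions, and the error values are hypothetical.

```python
import math


def diebold_mariano(errors_a, errors_b):
    """Diebold-Mariano statistic for one-step forecasts, squared-error loss.

    A positive statistic means model A has the larger average loss.
    Assumes the loss differential has no serial correlation, which is
    only reasonable for a forecast horizon of one step.
    """
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equal-length error series with n >= 2")
    # Loss differential at each time step.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    if var_d == 0:
        raise ValueError("loss differential has zero variance")
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value from the standard normal approximation.
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value


# Hypothetical forecast errors for two competing models.
errors_a = [12.0, -9.5, 14.2, -11.0, 10.3, -13.1, 9.8, -12.4]
errors_b = [3.1, -2.0, 4.5, -1.2, 2.8, -3.9, 1.5, -2.6]
stat, p_value = diebold_mariano(errors_a, errors_b)
print(f"DM = {stat:.2f}, p = {p_value:.4f}")
```

For multi-step horizons the variance term would normally include autocovariances of the loss differential, and small samples are often handled with a corrected statistic; both are omitted here for brevity.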

Multi-Sequence LSTM-RNN Deep Learning and Metaheuristics for Electric Load Forecasting

Energies ◽

10.3390/en13020391 ◽

2020 ◽

Vol 13 (2) ◽

pp. 391 ◽

Cited By ~ 8

Author(s):

Salah Bouktif ◽

Ali Fiaz ◽

Ali Ouni ◽

Mohamed Adel Serhani

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Energy Consumption ◽

Load Forecasting ◽

Optimal Configuration ◽

Electric Load ◽

Learning Models ◽

Short Term ◽

Metaheuristic Search ◽

Electric Load Forecasting

Short term electric load forecasting plays a crucial role for utility companies, as it allows for the efficient operation and management of power grid networks, optimal balancing between production and demand, and reduced production costs. As the volume and variety of energy data provided by building automation systems, smart meters, and other sources are continuously increasing, long short-term memory (LSTM) deep learning models have become an attractive approach for energy load forecasting. These models are characterized by their capability to learn long-term dependencies in collected electric data, which leads to accurate predictions that outperform several alternative statistical and machine learning approaches. Unfortunately, applying LSTM models may not produce acceptable forecasting results, not only because of noisy electric data but also because of the naive selection of hyperparameter values. Therefore, an optimal configuration of an LSTM model is necessary to describe the electric consumption patterns and discover the time-series dynamics in the energy domain. Finding such an optimal configuration is, on the one hand, a combinatorial problem where selection is done from a very large space of choices; on the other hand, it is a learning problem where the hyperparameters should reflect the energy consumption domain knowledge, such as the influential time lags, seasonality, periodicity, and other temporal attributes. To handle this problem, we use in this paper metaheuristic-search-based algorithms, known for their ability to alleviate search complexity as well as their capacity to learn from the domain where they are applied, to find optimal or near-optimal values for the set of tunable LSTM hyperparameters in the electrical energy consumption domain. We tailor both a genetic algorithm (GA) and particle swarm optimization (PSO) to learn hyperparameters for load forecasting in the context of big energy-consumption data. The statistical analysis of the obtained results shows that the multi-sequence deep learning model tuned by the metaheuristic search algorithms provides more accurate results than the benchmark machine learning models and the LSTM model whose inputs and hyperparameters were established through limited experience and a small number of experiments.
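As a rough illustration of metaheuristic hyperparameter search, the sketch below runs a textbook particle swarm optimization over two continuous hyperparameters. It is not the tailored GA or PSO from the paper: `validation_loss` is a hypothetical stand-in for training an LSTM and measuring its validation error, the hyperparameter names and bounds are assumptions, and the inertia and acceleration coefficients are common default values rather than anything reported by the authors.

```python
import random


def validation_loss(learning_rate, n_units):
    """Hypothetical stand-in for 'train an LSTM, return validation error'.

    The minimum is placed at (0.01, 64) purely for illustration.
    """
    return ((learning_rate - 0.01) / 0.01) ** 2 + ((n_units - 64) / 64) ** 2


def pso(objective, bounds, n_particles=20, n_iters=60, seed=0,
        inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimise `objective` inside box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]             # personal bests
    best_val = [objective(*p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    for _ in range(n_iters):
        for i in range(n_particles):
            for k, (lo, hi) in enumerate(bounds):
                # Pull each particle toward its own best and the swarm's best.
                vel[i][k] = (inertia * vel[i][k]
                             + c_personal * rng.random() * (best_pos[i][k] - pos[i][k])
                             + c_global * rng.random() * (g_pos[k] - pos[i][k]))
                # Clip the new position to the search box.
                pos[i][k] = min(hi, max(lo, pos[i][k] + vel[i][k]))
            val = objective(*pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


# Assumed search ranges: learning rate and number of hidden units.
bounds = [(0.0001, 0.1), (8, 256)]
best, loss = pso(validation_loss, bounds)
print(best, loss)
```

In practice each objective evaluation is a full model fit, so the number of particles and iterations is the main cost driver, and integer hyperparameters such as the number of units would be rounded before training.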

Short-Term Firm-Level Energy Consumption Forecasting for Energy-Intensive Manufacturing: A Comparison of Machine Learning and Deep Learning Models

10.20944/preprints202009.0491.v1 ◽

2020 ◽

Author(s):

Andrea Maria N. C. Ribeiro ◽

Pedro Rafael X. do Carmo ◽

Iago Rodrigues ◽

Djamel Sadok ◽

Theo Lynn ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Energy Consumption ◽

Mitigation Measures ◽

Support Vector ◽

Level Energy ◽

Learning Models ◽

Short Term ◽

Case Site

To minimise environmental impact, avoid regulatory penalties, and improve competitiveness, energy-intensive manufacturing firms require accurate forecasts of their energy consumption so that precautionary and mitigation measures can be taken. Deep learning is widely touted as a superior analytical technique to traditional artificial neural networks, machine learning, and other classical time series models due to its high dimensionality and problem solving capabilities. Despite this, research on its application in demand-side energy forecasting is limited. We compare two benchmarks (Autoregressive Integrated Moving Average (ARIMA), and an existing manual technique used at the case site) against three deep learning models (simple Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU)) and three machine learning models (Support Vector Regression (SVR), Random Forest, and K-Nearest Neighbors (KNN)) for short term load forecasting (STLF) using data from a Brazilian thermoplastic resin manufacturing plant. We use the grid search method to identify the best configuration for each model, and then use Diebold-Mariano testing to confirm the results. Results suggest that the legacy approach used at the case site is the worst performing, and that the GRU model outperformed all other models tested.

Modelling on Car-Sharing Serial Prediction Based on Machine Learning and Deep Learning

Complexity ◽

10.1155/2022/8843000 ◽

2022 ◽

Vol 2022 ◽

pp. 1-20

Author(s):

Nihad Brahimi ◽

Huaping Zhang ◽

Lin Dai ◽

Jianzi Zhang

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Research Work ◽

Gradient Boosting ◽

Support Vector ◽

Learning Models ◽

Car Sharing ◽

K Nearest Neighbors ◽

Extreme Gradient Boosting ◽

On The Road

The car-sharing system is a popular rental model for cars in shared use. It has become particularly attractive due to its flexibility; that is, the car can be rented and returned anywhere within one of the authorized parking slots. The main objective of this research work is to predict car usage in parking stations and to investigate the factors that help to improve the prediction. Thus, new strategies can be designed to put more cars on the road and fewer in the parking stations. To achieve that, various machine learning models, namely vector autoregression (VAR), support vector regression (SVR), eXtreme gradient boosting (XGBoost), and k-nearest neighbors (kNN), and deep learning models, specifically long short-term memory (LSTM), gated recurrent unit (GRU), convolutional neural network (CNN), CNN-LSTM, and multilayer perceptron (MLP), were applied to different kinds of features. These features include past usage levels, Chongqing’s environmental conditions, and temporal information. After comparing the obtained results using different metrics, we found that CNN-LSTM outperformed the other methods in predicting future car usage. Meanwhile, the model using all the different feature categories gives a more precise prediction than any of the models using one feature category at a time.

Machine learning models to identify low adherence to influenza vaccination among Korean adults with cardiovascular disease

BMC Cardiovascular Disorders ◽

10.1186/s12872-021-01925-7 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Moojung Kim ◽

Young Jae Kim ◽

Sung Jin Park ◽

Kwang Gi Kim ◽

Pyung Chun Oh ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Influenza Vaccination ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Age Group ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models

Background: Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially in the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination. Methods: Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. The classification process was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results: The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared; for the ≥ 65 age group, XGB (84.7%) and RF (84.7%) have the best accuracies, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM has the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions: The machine learning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.

Short-Term Forecasting of Photovoltaic Solar Power Production Using Variational Auto-Encoder Driven Deep Learning Approach

Applied Sciences ◽

10.3390/app10238400 ◽

2020 ◽

Vol 10 (23) ◽

pp. 8400 ◽

Cited By ~ 1

Author(s):

Abdelkader Dairi ◽

Fouzi Harrou ◽

Ying Sun ◽

Sofiane Khadraoui

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Solar Power ◽

Power Production ◽

Superior Performance ◽

Support Vector ◽

Learning Models ◽

Short Term ◽

Learning Methods ◽

Short Term Forecasting

The accurate modeling and forecasting of the power output of photovoltaic (PV) systems are critical to efficiently managing their integration in smart grids, delivery, and storage. This paper intends to provide efficient short-term forecasting of solar power production using a Variational AutoEncoder (VAE) model. Adopting the VAE-driven deep learning model is expected to improve forecasting accuracy because of its suitable performance in time-series modeling and flexible nonlinear approximation. Both single- and multi-step-ahead forecasts are investigated in this work. Data from two grid-connected plants (a 243 kW parking lot canopy array in the US and a 9 MW PV system in Algeria) are employed to show the investigated deep learning models’ performance. Specifically, the forecasting outputs of the proposed VAE-based forecasting method have been compared with seven deep learning methods, namely recurrent neural network, long short-term memory (LSTM), bidirectional LSTM, convolutional LSTM network, gated recurrent units, stacked autoencoder, and restricted Boltzmann machine, and two commonly used machine learning methods, namely logistic regression and support vector regression. The results of this investigation demonstrate the satisfying performance of deep learning techniques in forecasting solar power and point out that the VAE consistently performed better than the other methods. The results also confirmed the superior performance of deep learning models compared to the two considered baseline machine learning models.

Machine Learning Models of Acute Kidney Injury Prediction in Acute Pancreatitis Patients

Gastroenterology Research and Practice ◽

10.1155/2020/3431290 ◽

2020 ◽

Vol 2020 ◽

pp. 1-8

Author(s):

Cheng Qu ◽

Lin Gao ◽

Xian-qiang Yu ◽

Mei Wei ◽

Guo-quan Fang ◽

...

Keyword(s):

Machine Learning ◽

Acute Kidney Injury ◽

Acute Pancreatitis ◽

Logistic Regression ◽

Kidney Injury ◽

Gradient Boosting ◽

Support Vector ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models

Background. Acute kidney injury (AKI) has long been recognized as a common and important complication of acute pancreatitis (AP). In this study, machine learning (ML) techniques were used to establish predictive models for AKI in AP patients during hospitalization. This is a retrospective review of prospectively collected data of AP patients admitted within one week after the onset of abdominal pain to our department from January 2014 to January 2019. Eighty patients developed AKI in hospital after admission (AKI group) and 254 patients did not (non-AKI group). With the provision of additional information such as demographic characteristics or laboratory data, support vector machine (SVM), random forest (RF), classification and regression tree (CART), and extreme gradient boosting (XGBoost) were used to build models of AKI prediction, and their predictive performance was compared with that of the classic model using logistic regression (LR). XGBoost performed best in predicting AKI, with an AUC of 91.93% among the machine learning models. The AUC of logistic regression analysis was 87.28%. The present findings suggest that, compared to the classical logistic regression model, machine learning models using features that can be easily obtained at admission performed better in predicting AKI in AP patients.

A Comparative Analysis of Machine Learning Models for Prediction of Insurance Uptake in Kenya

10.20944/preprints202010.0186.v1 ◽

2020 ◽

Author(s):

Nelson Yego ◽

Juma Kasozi ◽

Joseph Nkrunziza

Keyword(s):

Machine Learning ◽

Random Forest ◽

Characteristic Curve ◽

Confusion Matrix ◽

Gradient Boosting ◽

Support Vector ◽

Sampled Data ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models

The role of insurance in financial inclusion as well as in economic growth is immense. However, low uptake seems to impede the growth of the sector, hence the need for a model that robustly predicts uptake of insurance among potential clients. In this research, we compared the performances of eight (8) machine learning models in predicting the uptake of insurance. The classifiers considered were Logistic Regression, Gaussian Naive Bayes, Support Vector Machines, K Nearest Neighbors, Decision Tree, Random Forest, Gradient Boosting Machines, and Extreme Gradient Boosting. The data used in the classification was from the 2016 Kenya FinAccess Household Survey. Comparison of performance was done for both upsampled and downsampled data due to data imbalance. For upsampled data, the Random Forest classifier showed the highest accuracy and precision compared to other classifiers, but for downsampled data, gradient boosting was optimal. It is noteworthy that for both upsampled and downsampled data, tree-based classifiers were more robust than others in insurance uptake prediction. However, in spite of hyper-parameter optimization, the area under the receiver operating characteristic curve remained highest for Random Forest as compared to other tree-based models. Also, the confusion matrix for Random Forest showed the fewest false positives and the most true positives, hence it could be construed as the most robust model for predicting insurance uptake. Finally, the most important feature in predicting uptake was having a bank product, hence bancassurance could be said to be a plausible channel of distribution of insurance products.
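The upsampling and downsampling used to handle class imbalance can be sketched as follows. This is plain random resampling and only an assumption about what the authors did; the abstract does not name the exact procedure. The function name `resample` and the toy rows are hypothetical.

```python
import random
from collections import Counter


def resample(rows, labels, mode, seed=0):
    """Balance a labelled dataset by random up- or downsampling.

    mode="up":   duplicate minority-class rows until classes match the largest.
    mode="down": drop majority-class rows until classes match the smallest.
    """
    if mode not in ("up", "down"):
        raise ValueError("mode must be 'up' or 'down'")
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    sizes = [len(members) for members in by_class.values()]
    target = max(sizes) if mode == "up" else min(sizes)
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target:
            # Enough rows: sample without replacement.
            chosen = rng.sample(members, target)
        else:
            # Too few rows: keep all and add random duplicates.
            chosen = members + rng.choices(members, k=target - len(members))
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: 8 non-adopters (0) and 2 adopters (1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2
up_rows, up_labels = resample(rows, labels, "up")
down_rows, down_labels = resample(rows, labels, "down")
print(Counter(up_labels), Counter(down_labels))
```

Resampling is normally applied to the training split only, so that evaluation metrics still reflect the original class balance.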

Machine learning models for screening carotid atherosclerosis in asymptomatic adults

Scientific Reports ◽

10.1038/s41598-021-01456-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Jian Yu ◽

Yan Zhou ◽

Qiong Yang ◽

Xiaoling Liu ◽

Lili Huang ◽

...

Keyword(s):

Machine Learning ◽

Physical Examination ◽

Carotid Atherosclerosis ◽

Gradient Boosting ◽

Support Vector ◽

Learning Models ◽

Cerebrovascular Events ◽

Extreme Gradient Boosting ◽

Routine Physical Examination ◽

Machine Learning Models

Carotid atherosclerosis (CAS) is a risk factor for cardiovascular and cerebrovascular events, but duplex ultrasonography isn’t recommended in routine screening for asymptomatic populations according to medical guidelines. We aim to develop machine learning models to screen CAS in asymptomatic adults. A total of 2732 asymptomatic subjects for routine physical examination in our hospital were included in the study. We developed machine learning models to classify subjects with or without CAS using decision tree, random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), and multilayer perceptron (MLP) with 17 candidate features. The performance of the models was assessed on the testing dataset. The model using MLP achieved the highest accuracy (0.748), positive predictive value (0.743), F1 score (0.742), area under receiver operating characteristic curve (AUC) (0.766), and Kappa score (0.445) among all classifiers. It is followed by the models using XGBoost and SVM. In conclusion, the model using MLP is the best one to screen CAS in asymptomatic adults based on the results from routine physical examination, followed by those using XGBoost and SVM. Those models may provide an effective and applicable method for physicians and primary care doctors to screen asymptomatic CAS without risk factors in the general population, and improve risk prediction and prevention of cardiovascular and cerebrovascular events in asymptomatic adults.

Short-Term Electricity Load Forecasting with Machine Learning

Information ◽

10.3390/info12020050 ◽

2021 ◽

Vol 12 (2) ◽

pp. 50

Author(s):

Ernesto Aguilar Madrid ◽

Nuno Antonio

Keyword(s):

Machine Learning ◽

Load Forecasting ◽

Production Costs ◽

Electricity Production ◽

Gradient Boosting ◽

Multiple Sources ◽

Short Term ◽

Extreme Gradient Boosting ◽

Short Term Load Forecasting ◽

Electricity Load

Accurate short-term load forecasting (STLF) is one of the most critical inputs for power plant unit commitment planning. STLF reduces the overall planning uncertainty added by the intermittent production of renewable sources; thus, it helps to minimize the hydrothermal electricity production costs in a power grid. Although there is some research in the field and even several research applications, there is a continual need to improve forecasts. This research proposes a set of machine learning (ML) models to improve the accuracy of 168 h forecasts. The developed models employ features from multiple sources, such as historical load, weather, and holidays. Of the five ML models developed and tested in various load profile contexts, the Extreme Gradient Boosting Regressor (XGBoost) algorithm showed the best results, surpassing previous historical weekly predictions based on neural networks. Additionally, because XGBoost models are based on an ensemble of decision trees, they facilitated the model’s interpretation, which provided a relevant additional result: the features’ importance in the forecasting.
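The idea of combining historical load with calendar and holiday information can be illustrated with a small feature-building sketch. The feature names, the lag choices (24 h and 168 h), and the synthetic load profile are all assumptions for illustration; the abstract does not list the actual features, and weather inputs are left out here.

```python
from datetime import datetime, timedelta


def build_features(timestamps, load, holidays, lags=(24, 168)):
    """Build (features, targets) for hourly load.

    Each row holds lagged load values plus calendar and holiday flags.
    Rows without a full lag history are skipped.
    """
    max_lag = max(lags)
    features, targets = [], []
    for t in range(max_lag, len(load)):
        ts = timestamps[t]
        row = {f"load_lag_{lag}h": load[t - lag] for lag in lags}
        row["hour"] = ts.hour
        row["day_of_week"] = ts.weekday()   # Monday = 0
        row["is_holiday"] = int(ts.date() in holidays)
        features.append(row)
        targets.append(load[t])
    return features, targets


# Synthetic hourly series covering nine days, for illustration only.
start = datetime(2021, 1, 1)
timestamps = [start + timedelta(hours=h) for h in range(24 * 9)]
load = [1050 if 8 <= ts.hour < 20 else 1000 for ts in timestamps]
holidays = {(start + timedelta(days=8)).date()}  # hypothetical holiday

features, targets = build_features(timestamps, load, holidays)
print(len(features), features[0])
```

The 24 h lag suits next-hour or next-day illustration only; for a 168 h horizon, only lags at least as long as the horizon would be known when the forecast is issued.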
