Short-Term Energy Forecasting Using Machine-Learning-Based Ensemble Voting Regression

Symmetry ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 160
Author(s):  
Pyae-Pyae Phyo ◽  
Yung-Cheol Byun ◽  
Namje Park

Balancing energy supply and demand is indispensable for energy producers. Accordingly, the electric power industry has turned to short-term energy forecasting to support its management systems. This paper first compares multiple machine learning (ML) regressors during the training process. The five best ML algorithms, namely extra trees regressor (ETR), random forest regressor (RFR), light gradient boosting machine (LGBM), gradient boosting regressor (GBR), and K neighbors regressor (KNN), are trained to build the proposed voting regressor (VR) model. Final predictions are made with the proposed ensemble VR and compared with the five selected ML benchmark models. A statistical autoregressive integrated moving average (ARIMA) model is also compared with the proposed model. For the experiments, energy usage and weather data were gathered from four regions of Jeju Island. Error measures, including mean absolute percentage error (MAPE), mean absolute error (MAE), and mean squared error (MSE), are computed to evaluate forecasting performance. The proposed model outperforms the six baseline models, giving a minimum MAPE of 0.845% on the whole test set. This improved performance shows that our approach is promising for symmetrical forecasting using time series energy data in the power system sector.
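As a rough illustration of the two ingredients named in this abstract, the sketch below averages the forecasts of several base regressors (the simplest form of a voting regressor) and computes MAE, MSE, and MAPE. The abstract does not say how the base models are weighted, so the unweighted average is an assumption, and all numbers and function names are made up for the example.

```python
from statistics import fmean


def voting_predict(model_predictions):
    """Average the forecasts of several base regressors, point by point."""
    # model_predictions holds one list of forecasts per base model, all equal in length.
    return [fmean(step) for step in zip(*model_predictions)]


def mae(actual, predicted):
    """Mean absolute error."""
    return fmean(abs(a - p) for a, p in zip(actual, predicted))


def mse(actual, predicted):
    """Mean squared error."""
    return fmean((a - p) ** 2 for a, p in zip(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (undefined if an actual value is 0)."""
    return 100.0 * fmean(abs((a - p) / a) for a, p in zip(actual, predicted))


# Hypothetical loads and forecasts from three base models (illustrative numbers only).
actual = [100.0, 200.0, 400.0]
forecasts = [
    [90.0, 210.0, 380.0],
    [110.0, 190.0, 420.0],
    [106.0, 206.0, 412.0],
]
ensemble = voting_predict(forecasts)
print(ensemble, mae(actual, ensemble), mse(actual, ensemble), mape(actual, ensemble))
```

In this toy case the individual models miss by up to 20 units while the averaged forecast misses by at most 4, which is the kind of error cancellation an ensemble relies on.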

10.29007/mbb7 ◽  
2020 ◽  
Author(s):  
Maher Selim ◽  
Ryan Zhou ◽  
Wenying Feng ◽  
Omar Alam

Many statistical and machine learning models for prediction take historical data as input and produce a single output value or a small number of output values. To forecast over many timesteps, it is necessary to run the model recursively. This leads to a compounding of errors, which has adverse effects on accuracy for long forecast periods. In this paper, we show this can be mitigated through the addition of generating features which can have an “anchoring” effect on recurrent forecasts, limiting the amount of compounded error in the long term. This is studied experimentally on a benchmark energy dataset using two machine learning models, LSTM and XGBoost. Prediction accuracy over differing forecast lengths is compared using the forecasting MAPE. It is found that, for the LSTM model, short-term energy forecasting is more accurate when a past energy consumption value is used as a feature than when it is not. The opposite holds for long-term energy forecasting. For the XGBoost model, accuracy for both short- and long-term energy forecasting is higher when past values are not used as a feature.
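The compounding effect described here can be seen in a minimal sketch of recursive forecasting, where each prediction is fed back in as the next input. The one-step model below is a hypothetical stand-in (a persistence model that overestimates by 1%), not the LSTM or XGBoost models studied in the paper.

```python
def recursive_forecast(one_step_model, history, steps):
    """Forecast `steps` values by feeding each prediction back in as input."""
    window = list(history)
    forecast = []
    for _ in range(steps):
        next_value = one_step_model(window)
        forecast.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [next_value]
    return forecast


def biased_persistence(window):
    """Hypothetical one-step model: repeats the last value with a 1% overestimate."""
    return window[-1] * 1.01


# The true series is assumed constant at 100, so any deviation is forecast error.
forecast = recursive_forecast(biased_persistence, [100.0, 100.0, 100.0], steps=5)
errors = [abs(value - 100.0) for value in forecast]
print(errors)
```

Because every step inherits the error of the previous one, the absolute error grows with the forecast horizon even though the one-step bias is fixed.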


2021 ◽  
Vol 13 (4) ◽  
pp. 576
Author(s):  
Hua Su ◽  
Xuemei Lu ◽  
Zuoqi Chen ◽  
Hongsheng Zhang ◽  
Wenfang Lu ◽  
...  

Chlorophyll-a (chl-a) is an important parameter of water quality and its concentration can be directly retrieved from satellite observations. The Ocean and Land Color Instrument (OLCI), a new-generation water-color sensor onboard Sentinel-3A and Sentinel-3B, is an excellent tool for marine environmental monitoring. In this study, we introduce a new machine learning model, Light Gradient Boosting Machine (LightGBM), for estimating time-series chl-a concentration in Fujian’s coastal waters using multitemporal OLCI data and in situ data. We applied the Case 2 Regional CoastColour (C2RCC) processor to obtain OLCI band reflectance and constructed four spectral indices based on OLCI feature bands as supplementary input features. We also used root-mean-square error (RMSE), mean absolute error (MAE), median absolute percentage error (MAPE), and R2 as performance indicators. The results indicate that the addition of spectral indices can easily improve the prediction accuracy of the model, and normalized fluorescence height index (NFHI) has the best performance, with an RMSE of 0.38 µg/L, MAE of 0.22 µg/L, MAPE of 28.33%, and R2 of 0.785. Moreover, we used the well-known band ratio and three-band methods for chl-a estimation validation, and another two OLCI chl-a products were adopted for comparison (OC4Me chl-a and Inverse Modelling Technique (IMT) Neural Net chl-a). The results confirmed that the LightGBM model outperforms the traditional methods and OLCI chl-a products. This study provides an effective remote sensing technique for coastal chl-a concentration estimation and promotes the advantage of OLCI data in ocean color remote sensing.


Author(s):  
Marie Luthfi Ashari ◽  
Mujiono Sadikin

To win market competition, pharmaceutical companies must produce high-quality medicinal products. Producing quality products requires good and efficient production planning, and one basis of production planning is sales forecasting. PT. Metiska Farma has applied a forecasting method in its production process, but the resulting forecasts are inaccurate, so market demand is not met optimally. To reduce this inaccuracy, the research presented in this paper tests forecasting with a machine learning technique, Long Short Term Memory (LSTM) regression. The proposed technique is tested on a sales dataset for product “X” from PT. Metiska Farma, with Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE) as performance measures. The results are the average error values obtained from modeling the training data and the testing data. They show that LSTM regression yields sales predictions with an RMSE of 286.465.424 for the training data and 187.013.430 for the testing data. The MAPE values are 787% and 309% for the training and testing data, respectively.


Author(s):  
Napoleon Bezas ◽  
Christos Timplalexis ◽  
Athanasios I. Salamanis ◽  
Vasileios Karapatsias ◽  
Dimosthenis Ioannidis ◽  
...  

Residential load forecasting is one of the most important tasks of the overall supply management process in electrical grids, since it enables smart grid services such as demand response (DR). Hence, several approaches for accurate residential load forecasting have been proposed in the relevant literature. However, most existing methods focus on forecasting performance and neglect other aspects of the problem, such as training time and model size (i.e., memory usage). In this paper, we introduce a new model for both short-term and day-ahead residential load forecasting. The model synthesizes a heterogeneous feature set made up of automatically selected lagged values from the load time series and manually extracted temporal features. Then, the tree-based algorithm light gradient boosting machine (LGBM) is fed with the constructed feature set and used as a regression model. Finally, a data-lightweight strategy is used for retraining the proposed model, which leads to both high forecasting accuracy and low training times. The proposed model has been extensively evaluated on a large real-world residential load dataset. The experimental results indicate that the proposed model achieves higher forecasting performance with lower training times and smaller model sizes than state-of-the-art solutions.
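The feature set described in this abstract, lagged load values combined with temporal features, might be assembled along the following lines. This is only a sketch: the lags are fixed by hand here (the paper selects them automatically), the calendar features are an assumed choice, and the hourly data are synthetic.

```python
from datetime import datetime, timedelta


def build_features(timestamps, loads, lags):
    """Return (X, y) rows that combine lagged loads with calendar features."""
    max_lag = max(lags)
    X, y = [], []
    # Start at max_lag so that every requested lag is available.
    for t in range(max_lag, len(loads)):
        lagged = [loads[t - lag] for lag in lags]
        ts = timestamps[t]
        # Temporal features: hour of day, day of week, weekend flag.
        temporal = [ts.hour, ts.weekday(), int(ts.weekday() >= 5)]
        X.append(lagged + temporal)
        y.append(loads[t])
    return X, y


# Synthetic hourly series: two days starting on Friday, 1 January 2021.
start = datetime(2021, 1, 1, 0)
timestamps = [start + timedelta(hours=h) for h in range(48)]
loads = [float(h % 24) for h in range(48)]

# Hypothetical lag choice: previous hour and same hour on the previous day.
X, y = build_features(timestamps, loads, lags=[1, 24])
print(X[0], y[0])
```

Each row of `X` could then be passed to any regressor, such as a gradient boosting model, with `y` as the target.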


2021 ◽  
Vol 13 (24) ◽  
pp. 13782
Author(s):  
Soyoung Park ◽  
Sanghun Son ◽  
Jaegu Bae ◽  
Doi Lee ◽  
Jae-Jin Kim ◽  
...  

Particulate matter (PM) as an air pollutant is harmful to the human body as well as to the ecosystem. It is crucial to understand the spatiotemporal PM distribution in order to effectively implement reduction methods. However, ground-based air quality monitoring sites are limited in providing reliable concentration values owing to their patchy distribution. Here, we aimed to predict daily PM10 concentrations using boosting algorithms such as gradient boosting machine (GBM), extreme gradient boost (XGB), and light gradient boosting machine (LightGBM). The three models performed well in estimating the spatial contrasts and temporal variability in daily PM10 concentrations. In particular, the LightGBM model outperformed the GBM and XGB models, with an adjusted R2 of 0.84, a root mean squared error of 12.108 μg/m3, a mean absolute error of 8.543 μg/m3, and a mean absolute percentage error of 16%. Despite its high performance, the LightGBM model showed low spatial prediction accuracy near the southwest part of the study area. Additionally, temporal differences were found between the observed and predicted values at high concentrations. These outcomes indicate that such methods can provide intuitive and reliable PM10 concentration values for the management, prevention, and mitigation of air pollution. In the future, performance accuracy could be improved through consideration of different variables related to spatial and seasonal characteristics.
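For readers unfamiliar with the adjusted R2 reported above, the standard definition penalizes R2 for the number of predictors. The sketch below uses that textbook formula; the sample size and feature count are invented for illustration, since the abstract does not state them.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical values: R2 of 0.85 from 101 samples and 10 predictors.
print(adjusted_r_squared(0.85, n_samples=101, n_features=10))
```

With few samples relative to predictors the adjustment is noticeable; as the sample count grows, adjusted R2 approaches plain R2.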


Author(s):  
Christos S Ioakimidis ◽  
Napoleon Bezas ◽  
Christos Timplalexis ◽  
Athanasios I. Salamanis ◽  
Vasileios Karapatsias ◽  
...  

Residential load forecasting is one of the most important tasks of the overall supply management process in electrical grids, since it enables smart grid services such as demand response (DR). Hence, several approaches for accurate residential load forecasting have been proposed in the relevant literature. However, most existing methods focus on forecasting performance and neglect other aspects of the problem, such as training time and model size (i.e., memory usage). In this paper, we introduce a new model for both short-term and day-ahead residential load forecasting. The model synthesizes a heterogeneous feature set made up of automatically selected lagged values from the load time series and manually extracted temporal features. Then, the tree-based algorithm light gradient boosting machine (LGBM) is fed with the constructed feature set and used as a regression model. Finally, a data-lightweight strategy is used for retraining the proposed model, which leads to both high forecasting accuracy and low training times. The proposed model has been extensively evaluated on a large real-world residential load dataset. The experimental results indicate that the proposed model achieves higher forecasting performance with lower training times and smaller model sizes than state-of-the-art solutions.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jong Ho Kim ◽  
Haewon Kim ◽  
Ji Su Jang ◽  
Sung Mi Hwang ◽  
So Young Lim ◽  
...  

Background: Predicting a difficult airway is challenging in patients with limited airway evaluation. The aim of this study is to develop and validate a model that predicts difficult laryngoscopy by machine learning, using neck circumference and thyromental height as predictors that can be measured even in patients with limited airway evaluation. Methods: Variables for prediction of difficult laryngoscopy included age, sex, height, weight, body mass index, neck circumference, and thyromental distance. Difficult laryngoscopy was defined as Grade 3 or 4 by the Cormack-Lehane classification. The preanesthesia and anesthesia data of 1677 patients who had undergone general anesthesia at a single center were collected. The data set was randomly stratified into a training set (80%) and a test set (20%), with equal distribution of difficult laryngoscopy. The training data set was used to train five algorithms (logistic regression, multilayer perceptron, random forest, extreme gradient boosting, and light gradient boosting machine). The prediction models were validated on the test set. Results: The random forest model performed best (area under receiver operating characteristic curve = 0.79 [95% confidence interval: 0.72–0.86], area under precision-recall curve = 0.32 [95% confidence interval: 0.27–0.37]). Conclusions: Machine learning can predict difficult laryngoscopy through a combination of several predictors including neck circumference and thyromental height. The performance of the model could be improved with more data, new variables, and a combination of models.
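The stratified 80/20 split described in the methods can be sketched as follows: indices are grouped by class, shuffled within each class, and divided so that both sets keep the same share of difficult cases. The function name, seed, and label counts are assumptions made for the example, not details from the study.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with class proportions preserved."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class for the test set.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Made-up labels: 1 = difficult laryngoscopy, 0 = not difficult (10% positive).
labels = [1] * 10 + [0] * 90
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx), sum(labels[i] for i in test_idx))
```

Stratifying matters most when the positive class is rare, because a plain random split can leave the test set with too few difficult cases to evaluate on.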


2021 ◽  
Author(s):  
Abdul Muqtadir Khan

With the advancement of machine learning (ML) applications, some recent research has been conducted to optimize fracturing treatments. A variety of models are available that use different objective functions and mathematical techniques for optimization. There is a need to extend ML techniques to optimize the choice of algorithm. For fracturing treatment design, the literature on comparative algorithm performance is sparse. The research predominantly shows that, compared to the most commonly used regressors and classifiers, some form of boosting technique consistently performs better in model testing and prediction accuracy. A database was constructed for a heterogeneous reservoir. Four widely used boosting algorithms were applied to the database to predict the design solely from the output of a short injection/falloff test. Feature importance analysis was done on eight output parameters from the falloff analysis, and six were finalized for the model construction. The outputs selected for prediction were fracturing fluid efficiency, proppant mass, maximum proppant concentration, and injection rate. Extreme gradient boost (XGBoost), categorical boost (CatBoost), adaptive boost (AdaBoost), and light gradient boosting machine (LGBM) were the algorithms finalized for the comparative study. A sensitivity analysis was done for different numbers of classes (four, five, and six) to establish a balance between accuracy and prediction granularity. The results showed that the best algorithm choice was between XGBoost and CatBoost for the predicted parameters under certain model construction conditions. The accuracy for all outputs for the holdout sets varied between 80 and 92%, supporting wider use of these models. Data science has contributed to various oil and gas industry domains and has tremendous applications in the stimulation domain.
The research and review conducted in this paper add a valuable resource for the user to build digital databases and use the appropriate algorithm without much trial and error. Implementing this model reduced the complexity of the proppant fracturing treatment redesign process, enhanced operational efficiency, and reduced fracture damage by eliminating minifrac steps with crosslinked gel.


2018 ◽  
Author(s):  
Jatin Kumar ◽  
Qianxiao Li ◽  
Karen Y.T. Tang ◽  
Tonio Buonassisi ◽  
Anibal L. Gonzalez-Oyarce ◽  
...  

Inverse design is an outstanding challenge in disordered systems with multiple length scales such as polymers, particularly when designing polymers with desired phase behavior. We demonstrate high-accuracy tuning of poly(2-oxazoline) cloud point via machine learning. With a design space of four repeating units and a range of molecular masses, we achieve an accuracy of 4°C root mean squared error (RMSE) in a temperature range of 24–90°C, employing gradient boosting with decision trees. The RMSE is >3x better than linear and polynomial regression. We perform inverse design via particle-swarm optimization, predicting and synthesizing 17 polymers with constrained design at 4 target cloud points from 37 to 80°C. Our approach challenges the status quo in polymer design with a machine learning algorithm that is capable of fast and systematic discovery of new polymers.
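Inverse design by particle-swarm optimization can be sketched as a search over design variables that minimizes the gap between a forward model's prediction and a target property. In the sketch below the forward model is a hypothetical linear stand-in for the paper's gradient-boosting predictor, and the swarm parameters (inertia 0.7, acceleration coefficients 1.5) are common textbook choices rather than values from the paper.

```python
import random


def particle_swarm_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive and social coefficients

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep the particle inside the allowed design space.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def predicted_cloud_point(x):
    """Hypothetical forward model: cloud point (°C) from one composition fraction."""
    return 24.0 + 66.0 * x[0]


target = 57.0  # desired cloud point in °C (illustrative)
design, error = particle_swarm_minimize(
    lambda x: (predicted_cloud_point(x) - target) ** 2, bounds=[(0.0, 1.0)]
)
print(design, predicted_cloud_point(design))
```

In a real workflow the objective would call the trained property predictor, and any constraints on the design would be encoded in the bounds or added to the objective as penalties.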

