A systematic feature selection procedure for short-term data-driven building energy forecasting model development

2019 ◽  
Vol 183 ◽  
pp. 428-442 ◽  
Author(s):  
Liang Zhang ◽  
Jin Wen
Author(s):  
Liang Zhang ◽  
Jin Wen ◽  
Yimin Chen

An accurate building energy forecasting model is a key component for real-time and advanced control of building energy systems and for building-to-grid integration. With the fast deployment and advancement of building automation systems, data are collected in buildings by hundreds and sometimes thousands of sensors every few minutes, which provides great potential for data-driven building energy forecasting. When building energy forecasting models are developed from a large number of potential inputs, feature selection is a critical procedure for ensuring model accuracy and computational efficiency. Though the theory of feature selection is well developed in the statistics and machine learning fields, it is not well studied in the application of building energy modeling. In this paper, a feature selection framework proposed in an earlier study is examined using a real campus building in Philadelphia. This framework combines domain knowledge and statistical methods and is developed for short-term data-driven building energy forecasting. The case study examines the feasibility of using the framework to develop a whole-building energy forecasting model and a chiller energy forecasting model. Results show that, for both the whole-building and chiller energy forecasting applications, the model built with the systematic feature selection process performs better (in terms of cross-validation error of the forecasted output) than the other models, including one that uses conventional inputs and one that uses only a single feature selection technique.
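The comparison criterion named in the abstract, cross-validation error of the forecasted output, can be illustrated with a minimal sketch. This is not the authors' framework: the one-variable least-squares model, the interleaved folds, and the names `outdoor_temp`, `unrelated_sensor`, and `energy` are all hypothetical stand-ins chosen for illustration, and interleaved folds ignore the temporal ordering that a real forecasting study would need to respect.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def kfold_rmse(xs, ys, k=5):
    """RMSE over all held-out points, using k interleaved folds."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        for i in test_idx:
            squared_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return math.sqrt(sum(squared_errors) / n)

# Hypothetical toy data: energy depends only on outdoor temperature.
outdoor_temp = [20.0 + (i % 12) for i in range(60)]
unrelated_sensor = [float((7 * i) % 11) for i in range(60)]
energy = [3.0 * t + 1.0 for t in outdoor_temp]

rmse_temp = kfold_rmse(outdoor_temp, energy)
rmse_unrelated = kfold_rmse(unrelated_sensor, energy)
print(rmse_temp < rmse_unrelated)  # the informative input gives the lower CV error
```

The same scoring function could be applied to any candidate input set, which is what makes cross-validation error usable as a common yardstick across feature selection strategies.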


2017 ◽  
Vol 2645 (1) ◽  
pp. 157-167 ◽  
Author(s):  
Jishun Ou ◽  
Jingxin Xia ◽  
Yao-Jan Wu ◽  
Wenming Rao

Urban traffic flow forecasting is essential to proactive traffic control and management. Most existing forecasting methods depend on proper and reliable input features, for example, weather conditions and spatiotemporal lagged variables of traffic flow. However, the feature selection process is often done manually without comprehensive evaluation, which leads to inaccurate results. To address this challenge, this paper presents an approach that combines the bias-corrected random forests algorithm with a data-driven feature selection strategy for short-term urban traffic flow forecasting. First, several input features were extracted from traffic flow time series data. Then the importance of these features was quantified with the permutation importance measure. Next, a data-driven feature selection strategy was introduced to identify the most important features. Finally, the forecasting model was built on the bias-corrected random forests algorithm and the selected features. The proposed approach was validated with data collected from three types of urban roads (expressway, major arterial, and minor arterial) in Kunshan City, China. It was also compared with 10 existing approaches to verify its effectiveness. The results of the validation and comparison show that, even without further model tuning, the proposed approach achieves the lowest average mean absolute error and root mean square error on six stations and the second-best average mean absolute percentage error. In addition, the use of the feature selection strategy improves training efficiency compared with the original random forests method.
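The permutation importance step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: a fixed toy predictor (`toy_predict`) stands in for the bias-corrected random forests model, and the threshold rule in `select_features` is a hypothetical placeholder for the paper's data-driven selection strategy.

```python
import random

def mean_absolute_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Average increase in MAE when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            error = mean_absolute_error(y, [predict(row) for row in X_perm])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def select_features(importances, threshold):
    """Hypothetical rule: keep features whose importance exceeds a threshold."""
    return [j for j, imp in enumerate(importances) if imp > threshold]

# Toy data (assumption): flow depends on feature 0 only; feature 1 is irrelevant.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 100), data_rng.uniform(0, 100)] for _ in range(200)]
y = [2.0 * row[0] for row in X]

def toy_predict(row):
    # Stand-in for a trained forecasting model.
    return 2.0 * row[0]

importances = permutation_importance(toy_predict, X, y)
selected = select_features(importances, threshold=1e-9)
print(selected)
```

Because the measure only needs predictions from an already trained model, it can be applied to any forecasting model without retraining for each feature.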


2017 ◽  
Vol 190 ◽  
pp. 1245-1257 ◽  
Author(s):  
Cong Feng ◽  
Mingjian Cui ◽  
Bri-Mathias Hodge ◽  
Jie Zhang

Energies ◽  
2019 ◽  
Vol 12 (6) ◽  
pp. 1140 ◽  
Author(s):  
Xin Gao ◽  
Xiaobing Li ◽  
Bing Zhao ◽  
Weijia Ji ◽  
Xiao Jing ◽  
...  

Many factors affect short-term electric load, and their superposition makes the load non-linear and non-stationary. Separating different load components from the original load series can help to improve prediction accuracy, but directly modeling and predicting each decomposed time series component gives rise to multiple random errors and increases the prediction workload. This paper proposes a short-term electricity load forecasting model based on an empirical mode decomposition-gated recurrent unit (EMD-GRU) with feature selection (FS-EMD-GRU). First, the original load series is decomposed into several sub-series by EMD. Then, the correlation between each sub-series and the original load series is analyzed with the Pearson correlation coefficient. The sub-series that are highly correlated with the original load series are selected as features and input into the GRU network together with the original load series to establish the prediction model. Three public data sets provided by a U.S. public utility and load data from a region in northwestern China were used to evaluate the effectiveness of the proposed method. The experimental results showed that the average prediction accuracy of the proposed method on the four data sets was 96.9%, 95.31%, 95.72%, and 97.17%, respectively. The prediction accuracy of the proposed method was higher than that of single GRU, support vector regression (SVR), and random forest (RF) models and of EMD-GRU, EMD-SVR, and EMD-RF models.
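The correlation-based selection step can be sketched as below. The decomposition itself is not reproduced here: the three components (`trend`, `daily`, `noise`) are hand-made stand-ins for EMD sub-series, and the 0.3 cutoff on the absolute correlation is an illustrative assumption, since the abstract does not state the threshold used.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_subseries(load, sub_series, threshold):
    """Keep the sub-series whose |r| with the original load meets the threshold."""
    return [name for name, series in sub_series.items()
            if abs(pearson(series, load)) >= threshold]

# Hypothetical stand-ins for EMD sub-series over 48 hourly steps.
n = 48
sub_series = {
    "trend": [0.5 * t for t in range(n)],
    "daily": [10.0 * math.sin(2 * math.pi * t / 24) for t in range(n)],
    "noise": [0.1 * (-1) ** t for t in range(n)],
}
load = [sum(series[t] for series in sub_series.values()) for t in range(n)]

selected = select_subseries(load, sub_series, threshold=0.3)
print(selected)
# In the paper's pipeline, the selected sub-series and the original load
# would then be fed together into the GRU network.
```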


Author(s):  
Ziyao Wang ◽  
Huaqiang Li ◽  
Zizhuo Tang ◽  
Yang Liu

Accurate ultra-short-term load forecasting is of great significance for real-time power generation scheduling and the development of power cyber-physical systems (Power CPS). However, forecasting future load from current high-dimensional, diverse, and heterogeneous electric power consumption information raises new challenges for effective feature selection and accurate load forecasting algorithms. Very few existing works consider feature selection for electric power consumption information and its impact on the subsequent load forecasting model. To address this gap, features that are critical to load forecasting are selected with an embedded feature selection algorithm based on LightGBM to form an optimal feature set. Using this feature set, an ultra-short-term load forecasting model based on sequence-to-sequence (S2S) learning and the gated recurrent unit (GRU), incorporating the Bahdanau attention (BA) mechanism, is presented. The S2S-GRU model is based on an encoding–decoding framework that is compatible with input and output data series of variable lengths. Introducing the BA mechanism addresses the GRU's loss of earlier information. Experimental results show, first, that the presented feature selection algorithm helps to improve the performance of the load forecasting model and, second, that the presented load forecasting model finds a compromise between forecasting efficiency and accuracy.
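The Bahdanau (additive) attention mechanism mentioned above scores each encoder state against the current decoder state, normalizes the scores with a softmax, and forms a weighted sum. The sketch below shows only that computation in plain Python. The weight matrices `W_q` and `W_k`, the vector `v`, and the toy states are hypothetical fixed values rather than trained parameters, and the GRU encoder and decoder are not modeled.

```python
import math

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def additive_attention(query, keys, W_q, W_k, v):
    """score_i = v . tanh(W_q q + W_k h_i); context = sum_i softmax(score)_i * h_i."""
    q_proj = matvec(W_q, query)
    scores = []
    for h in keys:
        k_proj = matvec(W_k, h)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(keys[0])
    context = [sum(w * h[d] for w, h in zip(weights, keys)) for d in range(dim)]
    return weights, context

# Hypothetical toy values: three encoder states and one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, -0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]

weights, context = additive_attention(
    decoder_state, encoder_states, W_q=identity, W_k=identity, v=[1.0, 1.0]
)
print(weights, context)
```

Because the context vector is recomputed from all encoder states at every decoding step, information from early time steps stays directly reachable instead of passing only through the recurrent state.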

