Machine Learning Based Integrated Feature Selection Approach for Improved Electricity Demand Forecasting in Decentralized Energy Systems

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 91463-91475 ◽  
Author(s):  
Abinet Tesfaye Eseye ◽  
Matti Lehtonen ◽  
Toni Tukia ◽  
Semen Uimonen ◽  
R. John Millar
Author(s):  
Rodrigo Porteiro ◽  
Luis Hernández-Callejo ◽  
Sergio Nesmachnow

This article presents electricity demand forecasting models for industrial and residential facilities, developed using ensemble machine learning strategies. Short-term electricity demand forecasting benefits both consumers and suppliers, as it helps improve energy efficiency policies and the rational use of resources. Computational intelligence models are developed for day-ahead electricity demand forecasting. An ensemble strategy is applied to build the day-ahead forecasting model from several one-hour models. Three data preprocessing steps are carried out: treating missing values, removing outliers, and standardization. Feature extraction is performed to reduce overfitting, shorten training time, and improve accuracy. The best model is optimized using grid search over the hyperparameter space. Then, an ensemble of 24 instances is generated to build the complete day-ahead forecasting model. Given the computational complexity of the applied techniques, they are developed and evaluated on the National Supercomputing Center (Cluster-UY), Uruguay. Three different real data sets are used for evaluation: an industrial park in Burgos (Spain), the total electricity demand for Uruguay, and demand from a distribution substation in Montevideo (Uruguay). Standard performance metrics are applied to evaluate the proposed models. The main results indicate that the best day-ahead model, based on ExtraTreesRegressor, has a mean absolute percentage error of 2.55% on industrial data, 5.17% on total consumption data, and 9.09% on substation data.
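The structure described in this abstract, 24 one-hour models combined into a day-ahead forecast and scored with mean absolute percentage error (MAPE), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `HourlyMeanModel` is a hypothetical stand-in for the per-hour regressor (the article reports using ExtraTreesRegressor), the function names are invented for this example, and the demand values are synthetic.

```python
from statistics import mean


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero.
    """
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


class HourlyMeanModel:
    """Hypothetical stand-in for a one-hour regressor.

    It predicts the mean demand observed for its hour; a real setup
    would train a proper regressor on extracted features instead.
    """

    def fit(self, values):
        self.value_ = mean(values)
        return self

    def predict(self):
        return self.value_


def fit_day_ahead_ensemble(history):
    """Fit one model per hour of the day.

    history: list of days, each a list of 24 hourly demand values.
    Returns a list of 24 fitted models, indexed by hour.
    """
    return [
        HourlyMeanModel().fit([day[hour] for day in history])
        for hour in range(24)
    ]


def predict_day(ensemble):
    """Assemble the day-ahead forecast from the 24 one-hour models."""
    return [model.predict() for model in ensemble]


# Synthetic data (assumption): three past days with a simple upward trend.
history = [[100.0 + hour + day for hour in range(24)] for day in range(3)]
ensemble = fit_day_ahead_ensemble(history)
forecast = predict_day(ensemble)

# Synthetic "actual" demand for the next day, used only to exercise mape().
actual = [100.0 + hour + 3 for hour in range(24)]
error = mape(actual, forecast)
print(f"MAPE: {error:.2f}%")
```

Keeping one model per hour means each instance only has to learn the demand pattern for a single hour, at the cost of training and tuning 24 models instead of one.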


2020 ◽  
Vol 10 (22) ◽  
pp. 8093
Author(s):  
Jun Wang ◽  
Yuanyuan Xu ◽  
Hengpeng Xu ◽  
Zhe Sun ◽  
Zhenglu Yang ◽  
...  

Feature selection research has consistently devoted great effort to dimension reduction for various machine learning tasks. Existing feature selection models focus on selecting the most discriminative features for learning targets. However, this strategy is weak at handling two kinds of features, the irrelevant and the redundant ones, which are collectively referred to as noisy features. These features may hamper the construction of optimal low-dimensional subspaces and compromise the learning performance of downstream tasks. In this study, we propose a novel multi-label feature selection approach by embedding label correlations (dubbed ELC) to address these issues. In particular, we extract label correlations to obtain reliable label space structures and employ them to steer feature selection. In this way, the label and feature spaces can be expected to be consistent, and noisy features can be effectively eliminated. An extensive experimental evaluation on public benchmarks validated the superiority of ELC.
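The abstract does not say how label correlations are extracted. As a rough illustration of the idea only, one common way to quantify pairwise label correlation in a multi-label data set is cosine similarity between the columns of a binary label matrix. The sketch below assumes that measure; it is not the ELC method, and the function name and toy data are invented for this example.

```python
from math import sqrt


def label_cosine_similarity(Y):
    """Pairwise cosine similarity between label columns.

    Y: list of samples, each a list of 0/1 label indicators.
    Returns an L x L matrix (list of lists), where L is the label count.
    Labels that never occur get similarity 0.0 with every label.
    """
    n_labels = len(Y[0])
    cols = [[row[j] for row in Y] for j in range(n_labels)]
    norms = [sqrt(sum(v * v for v in col)) for col in cols]
    sim = [[0.0] * n_labels for _ in range(n_labels)]
    for i in range(n_labels):
        for j in range(n_labels):
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim


# Toy label matrix (assumption): 4 samples, 3 labels.
Y = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
S = label_cosine_similarity(Y)
for row in S:
    print([round(v, 3) for v in row])
```

In this toy matrix, labels 0 and 1 co-occur often and score higher than labels 0 and 2, while labels 1 and 2 never co-occur and score zero.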


Malware is a serious threat to users. Security researchers have presented various solutions, striving to achieve efficient malware detection, while malware attackers devise avoidance techniques to escape detection systems. The key challenge is that the volume of malware grows every hour, causing significant damage to users' privacy. In addition, the training process takes much longer when unnecessary features are mined. Feature selection is effective in obtaining a distinctive feature set for detecting malware. In this paper, we propose a malware detection system that uses a hybrid feature selection approach to detect malware efficiently with a reduced feature set. Machine learning based classification is performed with eight classifiers on two malware datasets. The experiments were conducted both without and with feature selection. The empirical results show that classification using the selected feature set and the XGB classifier identifies malware efficiently, with accuracies of 98.9% and 99.26% on the two datasets.

