Online Streaming Feature Selection via Conditional Independence

Online feature selection is a challenging topic in data mining. It aims to reduce the dimensionality of streaming features by removing irrelevant and redundant features in real time. Existing methods, such as Alpha-investing and Online Streaming Feature Selection (OSFS), have been proposed for this purpose, but they suffer from low prediction accuracy and high running time when the streaming features exhibit low redundancy and high relevance. In this paper, we propose a novel online streaming feature selection algorithm, named ConInd, that uses a three-layer filtering strategy to process streaming features and overcome these drawbacks. Through three layers of filtering, i.e., null-conditional independence, single-conditional independence, and multi-conditional independence, we obtain an approximate Markov blanket with high accuracy and low running time. To validate its efficiency, we implemented the proposed algorithm and tested its performance on widely used datasets, i.e., NIPS 2003 and Causality Workbench. Extensive experimental results demonstrate that ConInd offers significant improvements in prediction accuracy and running time compared to Alpha-investing and OSFS. ConInd offers 5.62% higher average prediction accuracy than Alpha-investing and a 53.56% lower average running time than OSFS when the dataset has low redundancy and high relevance. In addition, the ratio of the average number of features for ConInd is 242% less than that for Alpha-investing.
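The three filtering layers named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Fisher z-test on partial correlations, the function names `is_independent` and `three_layer_filter`, and the `alpha` and `max_k` parameters are all assumptions chosen for illustration, and the abstract does not say which independence test ConInd uses.

```python
import math
from itertools import combinations

import numpy as np


def is_independent(data, x, t, cond=(), alpha=0.05):
    """Fisher z-test on the partial correlation of columns x and t given cond.

    Returns True when independence cannot be rejected at level alpha.
    (Assumed test; the abstract does not name the one used by ConInd.)
    """
    cols = [x, t, *cond]
    corr = np.corrcoef(data[:, cols], rowvar=False)
    prec = np.linalg.pinv(corr)  # pseudo-inverse tolerates near-singular input
    denom = prec[0, 0] * prec[1, 1]
    if denom <= 0:
        return True  # degenerate case, treated as independent in this sketch
    r = -prec[0, 1] / math.sqrt(denom)
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(data.shape[0] - len(cond) - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return p_value > alpha


def three_layer_filter(data, target, stream, alpha=0.05, max_k=3):
    """Process feature indices in arrival order and return the kept ones."""
    selected = []
    for f in stream:
        # Layer 1: null-conditional independence (irrelevant feature).
        if is_independent(data, f, target, (), alpha):
            continue
        # Layer 2: single-conditional independence (redundant given one feature).
        if any(is_independent(data, f, target, (s,), alpha) for s in selected):
            continue
        # Layer 3: multi-conditional independence (redundant given a subset).
        if any(is_independent(data, f, target, subset, alpha)
               for k in range(2, min(max_k, len(selected)) + 1)
               for subset in combinations(selected, k)):
            continue
        selected.append(f)
    return selected
```

The cheap tests run first, so most irrelevant or redundant features are discarded before the costly subset search in the third layer. This sketch never re-examines features it has already kept, which a full Markov blanket learner would typically do.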


Online Streaming Features Selection via Markov Blanket

Symmetry ◽ 10.3390/sym14010149 ◽ 2022 ◽ Vol 14 (1) ◽ pp. 149

Author(s): Waqar Khan ◽ Lingfu Kong ◽ Brekhna Brekhna ◽ Ling Wang ◽ Huigui Yan

Keyword(s): Feature Selection ◽ Real World ◽ Conditional Independence ◽ Prediction Accuracy ◽ Features Selection ◽ Sample Sizes ◽ Markov Blanket ◽ Independence Tests ◽ Online Feature Selection ◽ Real World Datasets

Streaming feature selection is a well-established method for selecting the relevant subset of features from high-dimensional data and reducing learning complexity. However, little attention has been paid to online feature selection through the Markov Blanket (MB). Several studies based on traditional MB learning reported low prediction accuracy and were evaluated on few datasets because the number of conditional independence tests is high and time-consuming. This paper presents a novel algorithm called Online Feature Selection Via Markov Blanket (OFSVMB), based on a statistical conditional independence test, that offers high accuracy and lower computation time. It reduces the number of conditional independence tests and incorporates online relevance and redundancy analysis to check the relevance between each incoming feature and the target variable T, discard redundant features from the Parents-Child (PC) and Spouses (SP) sets online, and find PC and SP simultaneously. The performance of OFSVMB is compared with traditional MB learning algorithms, including IAMB, STMB, HITON-MB, BAMB, and EEMB, and with streaming feature selection algorithms, including OSFS, Alpha-investing, and SAOLA, on 9 benchmark Bayesian Network (BN) datasets and 14 real-world datasets. For performance evaluation, F1, precision, and recall are measured at significance levels of 0.01 and 0.05 on the benchmark BN and real-world datasets, and 12 classifiers are evaluated at a significance level of 0.01. On benchmark BN datasets with sample sizes of 500 and 5000, OFSVMB achieved significantly higher accuracy than IAMB, STMB, HITON-MB, BAMB, and EEMB in terms of F1, precision, and recall, while also running faster. It finds a more accurate MB regardless of the size of the feature set. On real-world datasets, OFSVMB offers substantial improvements in mean prediction accuracy across the 12 classifiers, with both small and large sample sizes, over OSFS, Alpha-investing, and SAOLA, but it is slower than these algorithms because they find only the PC set and not SP. Furthermore, the sensitivity analysis shows that OFSVMB is more accurate in selecting the optimal features.
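As a worked illustration of the precision, recall, and F1 measures mentioned above, the sketch below scores a learned Markov blanket against a known one. The function name `mb_scores` and the variable labels are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def mb_scores(true_mb, learned_mb):
    """Return (precision, recall, F1) of a learned MB against the true MB."""
    true_mb, learned_mb = set(true_mb), set(learned_mb)
    hits = len(true_mb & learned_mb)  # correctly identified MB members
    precision = hits / len(learned_mb) if learned_mb else 0.0
    recall = hits / len(true_mb) if true_mb else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the true MB of T is {A, B, C, D}; the learner returns {A, B, E}.
precision, recall, f1 = mb_scores({"A", "B", "C", "D"}, {"A", "B", "E"})
```

Here two of the three returned variables are correct (precision 2/3), two of the four true members are found (recall 1/2), and F1 is their harmonic mean, 4/7.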


Fuzzy Rank Based Parallel Online Feature Selection Method using Multiple Sliding Windows

Open Computer Science ◽ 10.1515/comp-2020-0169 ◽ 2021 ◽ Vol 11 (1) ◽ pp. 275-287

Author(s): B. Venkatesh ◽ J. Anuradha

Keyword(s): Feature Selection ◽ Feature Selection Method ◽ Selection Method ◽ Streaming Data ◽ Selection Methods ◽ Sliding Windows ◽ Real World Applications ◽ Benchmark Datasets ◽ Online Feature Selection ◽ Online Streaming

In real-world applications, the dimensions of data are generated dynamically, and traditional batch feature selection methods are not suitable for streaming data. Online streaming feature selection methods have therefore gained attention, but existing methods have drawbacks such as low classification accuracy, failure to remove redundant and irrelevant features, and a large number of selected features. In this paper, we propose a parallel online feature selection method using multiple sliding windows and fuzzy fast-mRMR feature selection analysis, which selects minimally redundant and maximally relevant features and overcomes the drawbacks of existing online streaming feature selection methods. Parallel processing is used to increase the speed of the proposed method. To evaluate its performance, k-NN, SVM, and Decision Tree classifiers are used, and the results are compared against state-of-the-art online feature selection methods. Accuracy, Precision, Recall, and F1-Score are used as evaluation metrics on benchmark datasets. The experimental analysis shows that the proposed method achieves more than 95% accuracy on most of the datasets and outperforms existing online streaming feature selection methods.
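The minimum-redundancy, maximum-relevance idea over windows of arriving features can be sketched as follows. This is plain greedy mRMR on discrete features with a single sequential window. The fuzzy ranking, the fast-mRMR optimizations, and the parallel multi-window processing described in the abstract are omitted, and the names `mutual_info`, `mrmr`, and `windowed_mrmr` are assumptions.

```python
import math
from collections import Counter


def mutual_info(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    joint, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def mrmr(columns, y, pool, k):
    """Greedily pick up to k feature indices from pool by relevance minus redundancy."""
    selected, remaining = [], list(pool)
    while remaining and len(selected) < k:
        def score(j):
            relevance = mutual_info(columns[j], y)
            if not selected:
                return relevance
            redundancy = sum(mutual_info(columns[j], columns[s])
                             for s in selected) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def windowed_mrmr(columns, y, window, k):
    """Re-run mRMR each time a new window of features arrives."""
    kept = []
    for start in range(0, len(columns), window):
        arrivals = range(start, min(start + window, len(columns)))
        kept = mrmr(columns, y, kept + list(arrivals), k)
    return kept
```

Each window competes against the features kept so far, so a duplicate feature kept early can be displaced once a less redundant feature arrives in a later window.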


Online Streaming Feature Selection via Multi-Conditional Independence and Mutual Information Entropy

International Journal of Computational Intelligence Systems ◽ 10.2991/ijcis.d.200423.002 ◽ 2020 ◽ Vol 13 (1) ◽ pp. 479

Author(s): Hongyi Wang ◽ Dianlong You

Keyword(s): Feature Selection ◽ Mutual Information ◽ Information Entropy ◽ Conditional Independence ◽ Online Streaming


An Attention-Based Multilayer GRU Model for Multistep-Ahead Short-Term Load Forecasting

Sensors ◽ 10.3390/s21051639 ◽ 2021 ◽ Vol 21 (5) ◽ pp. 1639

Author(s): Seungmin Jung ◽ Jihoon Moon ◽ Sungwoo Park ◽ Eenjun Hwang

Keyword(s): Power Consumption ◽ Prediction Models ◽ Short Term Memory ◽ Load Forecasting ◽ Input Sequence ◽ Short Term ◽ Performance Improvements ◽ Short Term Load Forecasting ◽ Significant Performance ◽ Input Variables

Recently, multistep-ahead prediction has attracted much attention in electric load forecasting because it can deal with sudden changes in power consumption caused by events such as fires and heat waves up to a day ahead of the present time. Recurrent neural networks (RNNs), including long short-term memory and gated recurrent unit (GRU) networks, can reflect previous points well when predicting the current point. Due to this property, they have been widely used for multistep-ahead prediction. The GRU model is simple and easy to implement; however, its prediction performance is limited because it considers all input variables equally. In this paper, we propose a short-term load forecasting model using an attention-based GRU to focus more on the crucial variables and demonstrate that this can achieve significant performance improvements, especially when the input sequence of the RNN is long. Through extensive experiments, we show that the proposed model outperforms other recent multistep-ahead prediction models in building-level power consumption forecasting.


A Comparative Study among New Hybrid Root Finding Algorithms and Traditional Methods

Mathematics ◽ 10.3390/math9111306 ◽ 2021 ◽ Vol 9 (11) ◽ pp. 1306

Author(s): Elsayed Badr ◽ Sultan Almotairi ◽ Abdallah El Ghamry

Keyword(s): Comparative Study ◽ Numerical Results ◽ Traditional Methods ◽ Running Time ◽ Root Finding ◽ Average Running Time ◽ False Position ◽ Newton Raphson ◽ Number Of Iterations

In this paper, we propose a novel blended algorithm that has the advantages of the trisection method and the false position method. Numerical results indicate that the proposed algorithm outperforms the secant, the trisection, the Newton–Raphson, the bisection and the regula falsi methods, as well as the hybrid of the last two methods proposed by Sabharwal, with regard to the number of iterations and the average running time.
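One way to blend trisection with false position is sketched below. It is an assumed combination for illustration, not necessarily the scheme proposed in the paper: each iteration evaluates the two trisection points and the false-position point, then keeps the narrowest sub-interval that still brackets the root. The name `blended_root` and its parameters are hypothetical.

```python
def blended_root(f, a, b, tol=1e-12, max_iter=200):
    """Return (root estimate, iterations used) for f on a bracketing interval [a, b]."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a, 0
    if fb == 0:
        return b, 0
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for iteration in range(1, max_iter + 1):
        third = (b - a) / 3
        secant = b - fb * (b - a) / (fb - fa)  # false-position point
        # Walk the candidate points left to right until the sign changes.
        for x in sorted({a + third, b - third, secant}):
            fx = f(x)
            if abs(fx) <= tol:
                return x, iteration
            if fa * fx < 0:      # sign change: the root lies in [a, x]
                b, fb = x, fx
                break
            a, fa = x, fx        # no sign change yet: move the left end
        if b - a <= tol:
            return (a + b) / 2, iteration
    return (a + b) / 2, max_iter


# Example: approximate the square root of 2 on [0, 2].
root, iterations = blended_root(lambda x: x * x - 2, 0.0, 2.0)
```

The trisection points guarantee that the bracket shrinks to at most one third of its width per iteration, while the false-position point often lands much closer to the root on smooth functions.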


Data Downlink System in the Vast IOT Node Condition Assisted by UAV, Large Intelligent Surface, and Power and Data Beacon

Sensors ◽ 10.3390/s20205748 ◽ 2020 ◽ Vol 20 (20) ◽ pp. 5748

Author(s): Zhibo Zhang ◽ Qing Chang ◽ Na Zhao ◽ Chen Li ◽ Tianrun Li

Keyword(s): Communication Systems ◽ Optimization Problems ◽ Minimum Energy ◽ Numerical Algorithms ◽ Flight Trajectory ◽ Performance Improvements ◽ Minimum Energy Consumption ◽ Significant Performance ◽ Downlink System ◽ The Internet Of Things

The future development of communication systems will create a great demand for the internet of things (IOT), where the overall control of all IOT nodes will become an important problem. Considering the essential issues of miniaturization and energy conservation, in this study, a new data downlink system is designed in which all IOT nodes harvest energy first and then receive data. To avoid the unsolvable problem of pre-locating all positions of vast IOT nodes, a device called the power and data beacon (PDB) is proposed. This acts as a relay station for energy and data. In addition, we model future scenes in which a communication system is assisted by unmanned aerial vehicles (UAVs), large intelligent surfaces (LISs), and PDBs. In this paper, we propose and solve the problem of determining the optimal flight trajectory to reach the minimum energy consumption or minimum time consumption. Four future feasible scenes are analyzed and then the optimization problems are solved based on numerical algorithms. Simulation results show that there are significant performance improvements in energy/time with the deployment of LISs and reasonable UAV trajectory planning.


On the Average Running Time of Odd–Even Merge Sort

Journal of Algorithms ◽ 10.1006/jagm.1996.0812 ◽ 1997 ◽ Vol 22 (2) ◽ pp. 329-346 ◽ Cited By ~ 2

Author(s): Christine Rüb

Keyword(s): Running Time ◽ Average Running Time ◽ Merge Sort


Online feature selection for pixel classification

Proceedings of the 22nd international conference on Machine learning - ICML '05 ◽ 10.1145/1102351.1102383 ◽ 2005 ◽ Cited By ~ 25

Author(s): Karen Glocer ◽ Damian Eads ◽ James Theiler

Keyword(s): Feature Selection ◽ Pixel Classification ◽ Online Feature Selection ◽ Selection For


VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data

Journal of Applied Crystallography ◽ 10.1107/s0021889811038970 ◽ 2011 ◽ Vol 44 (6) ◽ pp. 1272-1276 ◽ Cited By ~ 7956

Author(s): Koichi Momma ◽ Fujio Izumi

Keyword(s): Search Algorithm ◽ Voronoi Tessellation ◽ Three Dimensional ◽ Structure Parameters ◽ Volumetric Data ◽ Complex Molecules ◽ Visualization System ◽ Performance Improvements ◽ Significant Performance

VESTA is a three-dimensional visualization system for crystallographic studies and electronic state calculations. It has been upgraded to the latest version, VESTA 3, implementing new features including drawing the external morphology of crystals; superimposing multiple structural models, volumetric data and crystal faces; calculation of electron and nuclear densities from structure parameters; calculation of Patterson functions from structure parameters or volumetric data; integration of electron and nuclear densities by Voronoi tessellation; visualization of isosurfaces with multiple levels; determination of the best plane for selected atoms; an extended bond-search algorithm to enable more sophisticated searches in complex molecules and cage-like structures; undo and redo in graphical user interface operations; and significant performance improvements in rendering isosurfaces and calculating slices.


Fräsen partikelverstärkter Titanmatrix-Verbundwerkstoffe*/Milling of particle-reinforced titanium metal matrix composites

wt Werkstattstechnik online ◽ 10.37544/1436-4980-2017-04-105 ◽ 2017 ◽ Vol 107 (04) ◽ pp. 301-305

Author(s): E. Prof. Uhlmann ◽ F. Kaulfersch

Keyword(s): High Performance ◽ High Wear Resistance ◽ Tool Geometry ◽ Matrix Composites ◽ Titanium Matrix ◽ Titanium Matrix Composites ◽ Performance Improvements ◽ Significant Performance ◽ High Temperature Components ◽ Process Strategy

Particle-reinforced titanium matrix composites enable significant performance improvements in structural and functional high-temperature components. However, the high wear resistance, toughness and hardness that result from particle reinforcement pose a major challenge in machining these high-performance materials. By conducting milling experiments with varied tool geometry, cutting material and process strategy, parameter ranges were identified that enable reliable machining of particle-reinforced titanium matrix composites.
