SENSOR (GROUP FEATURE) SELECTION WITH CONTROLLED REDUNDANCY IN A CONNECTIONIST FRAMEWORK

2014 · Vol 24 (06) · pp. 1450021
Author(s):  
RUDRASIS CHAKRABORTY ◽  
CHIN-TENG LIN ◽  
NIKHIL R. PAL

For many applications, to reduce processing time and the cost of decision making, we need to reduce the number of sensors, where each sensor produces a set of features. This sensor selection problem is a generalized feature selection problem. Here, we first present a sensor (group-feature) selection scheme based on multilayer perceptron networks. Because this scheme sometimes selects redundant groups of features, we then propose a selection scheme that can control the level of redundancy between the selected groups. The idea is general and can be used with any learning scheme. In this context, we define different measures of sensor dependency (dependency between groups of features). We also present an alternative learning scheme that is more effective than the original one, and we adapt the proposed scheme to radial basis function (RBF) networks. We demonstrate the effectiveness of our schemes on several data sets. The advantages of our approach are threefold: it considers all groups together and hence can exploit nonlinear interaction between groups, if any; it can simultaneously select useful groups and learn the underlying system; and it allows the level of redundancy among groups to be controlled.
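The abstract does not spell out the MLP-based formulation, but the redundancy-control idea can be sketched with a simple stand-in: score each group by its relevance to the target and greedily skip any group whose dependency on an already-selected group exceeds a threshold. Everything below — the correlation-based dependency measure, the function names, and the 0.7 threshold — is an illustrative assumption, not the authors' method.

```python
import numpy as np

def group_dependency(X, g1, g2):
    """Dependency between two feature groups: the maximum absolute
    Pearson correlation over all cross-group feature pairs."""
    C = np.corrcoef(X[:, list(g1) + list(g2)], rowvar=False)
    k = len(g1)
    return np.max(np.abs(C[:k, k:]))

def select_groups(X, y, groups, n_select, max_dependency=0.7):
    """Greedily pick groups in decreasing relevance to y, skipping any
    group whose dependency on a selected group exceeds max_dependency."""
    def relevance(g):
        return np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in g])
    order = sorted(range(len(groups)),
                   key=lambda i: relevance(groups[i]), reverse=True)
    selected = []
    for i in order:
        if len(selected) == n_select:
            break
        if all(group_dependency(X, groups[i], groups[s]) <= max_dependency
               for s in selected):
            selected.append(i)
    return selected

# Demo: group 1 is a noisy copy of group 0, group 2 is weakly relevant
# but independent; redundancy control should pick groups 0 and 2.
rng = np.random.default_rng(0)
g0 = rng.normal(size=(500, 2))
y = g0[:, 0] + g0[:, 1]
g1 = g0 + 0.5 * rng.normal(size=(500, 2))          # redundant with group 0
g2 = np.column_stack([y + 2 * rng.normal(size=500),
                      rng.normal(size=500)])        # weakly relevant
X = np.hstack([g0, g1, g2])
chosen = select_groups(X, y, [[0, 1], [2, 3], [4, 5]], n_select=2)
```

Without the dependency check, the two most relevant groups (0 and 1) would both be chosen even though one duplicates the other; the threshold forces the selector to fall through to the less relevant but non-redundant group 2.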

2016 · Vol 2016 · pp. 1-7
Author(s):  
Junyang Qiu ◽  
Zhisong Pan

Universum data, defined as a set of unlabeled examples that do not belong to any class of interest, have been shown to encode prior knowledge by representing meaningful information in the same domain as the problem at hand. Universum data have proved effective in improving learning performance in many tasks, such as classification and clustering. Inspired by this favorable performance, we address a novel semisupervised feature selection problem in this paper, called semisupervised feature selection with Universum, which can simultaneously exploit the unlabeled data and the Universum data. Experiments on several UCI data sets show that the proposed algorithms achieve superior performance over conventional unsupervised and supervised methods.
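As an illustrative sketch only (not the paper's algorithm), one way to let Universum data guide feature scoring is to reward per-feature class separation while penalizing features on which the Universum examples do not fall between the two classes. The function name, the Fisher-like score, and the midpoint penalty below are all assumptions for illustration.

```python
import numpy as np

def universum_feature_scores(X_pos, X_neg, X_univ, alpha=1.0):
    """Score each feature by class separation, penalized when the
    Universum data does not sit between the two class means."""
    mp, mn = X_pos.mean(axis=0), X_neg.mean(axis=0)
    spread = X_pos.std(axis=0) + X_neg.std(axis=0) + 1e-12
    separation = np.abs(mp - mn) / spread            # Fisher-like relevance
    midpoint = (mp + mn) / 2
    gap = np.abs(mp - mn) + 1e-12
    penalty = np.abs(X_univ.mean(axis=0) - midpoint) / gap
    return separation - alpha * penalty

# Demo: feature 0 separates the classes with Universum at the midpoint,
# feature 1 separates them but the Universum sits on the positive side,
# feature 2 carries no class information.
rng = np.random.default_rng(1)
n = 300
X_pos = np.column_stack([rng.normal(2, 1, n), rng.normal(2, 1, n),
                         rng.normal(0, 1, n)])
X_neg = np.column_stack([rng.normal(-2, 1, n), rng.normal(-2, 1, n),
                         rng.normal(0, 1, n)])
X_univ = np.column_stack([rng.normal(0, 1, n), rng.normal(2, 1, n),
                          rng.normal(0, 1, n)])
scores = universum_feature_scores(X_pos, X_neg, X_univ)
```

In the demo, feature 0 gets the highest score: it is as discriminative as feature 1, but only for feature 0 do the Universum examples lie near the decision region between the classes.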


2013 · Vol 2013 · pp. 1-13
Author(s):  
Hong Zhao ◽  
Fan Min ◽  
William Zhu

Feature selection is an essential process in data mining applications since it reduces a model’s complexity. However, feature selection with various types of costs is still a new research topic. In this paper, we study the cost-sensitive feature selection problem for numeric data with measurement errors. The major contributions of this paper are fourfold. First, a new data model is built to address test costs and misclassification costs as well as error boundaries; it is distinguished from existing models mainly by the error boundaries. Second, a covering-based rough set model with normally distributed measurement errors is constructed. With this model, coverings are constructed from data rather than assigned by users. Third, a new cost-sensitive feature selection problem is defined on this model, which is more realistic than existing feature selection problems. Fourth, both backtracking and heuristic algorithms are proposed to deal with the new problem. Experimental results show the efficiency of the pruning techniques for the backtracking algorithm and the effectiveness of the heuristic algorithm. This study is a step toward realistic applications of cost-sensitive learning.
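The abstract does not detail the rough-set construction, but the trade-off it describes — paying test costs for features versus paying misclassification costs for errors — can be sketched with a simple stand-in heuristic: greedily add the feature that most reduces total cost, estimated here with a nearest-centroid classifier. The function names, the greedy strategy, and the classifier choice are all illustrative assumptions, not the paper's algorithms.

```python
import numpy as np

def nc_error(X, y, feats):
    """Training-set error of a nearest-centroid classifier on a subset."""
    if not feats:
        # with no features, predict the majority class
        _, counts = np.unique(y, return_counts=True)
        return 1.0 - counts.max() / len(y)
    Xs = X[:, feats]
    classes = np.unique(y)
    centroids = np.array([Xs[y == c].mean(axis=0) for c in classes])
    d = ((Xs[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[np.argmin(d, axis=1)]
    return float(np.mean(pred != y))

def greedy_cost_sensitive_select(X, y, test_costs, mis_cost):
    """Greedily add the feature that most reduces
    total cost = sum of test costs + mis_cost * error rate,
    stopping when no addition lowers the total."""
    selected = []
    best_total = mis_cost * nc_error(X, y, selected)
    while True:
        trials = [(sum(test_costs[k] for k in selected + [j])
                   + mis_cost * nc_error(X, y, selected + [j]), j)
                  for j in range(X.shape[1]) if j not in selected]
        if not trials:
            break
        total, j = min(trials)
        if total >= best_total:
            break
        selected.append(j)
        best_total = total
    return selected, best_total

# Demo: feature 0 is cheap and discriminative, feature 1 is cheap noise,
# feature 2 is discriminative but too expensive to be worth testing.
rng = np.random.default_rng(2)
n = 100
y = np.array([0] * n + [1] * n)
f0 = np.concatenate([rng.normal(-3, 1, n), rng.normal(3, 1, n)])
f1 = rng.normal(0, 1, 2 * n)
f2 = np.concatenate([rng.normal(-3, 1, n), rng.normal(3, 1, n)])
X = np.column_stack([f0, f1, f2])
selected, total_cost = greedy_cost_sensitive_select(
    X, y, test_costs=[1.0, 1.0, 100.0], mis_cost=50.0)
```

With these costs the selector stops after feature 0: adding the redundant expensive feature 2 would raise the test cost far more than it could reduce the expected misclassification cost.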


Author(s):  
A. M. Bagirov ◽  
A. M. Rubinov ◽  
J. Yearwood

The feature selection problem involves selecting a subset of features that is sufficient for determining structures or clusters in a given dataset and for making predictions. This chapter presents an algorithm for feature selection based on optimization methods. To verify the effectiveness of the proposed algorithm, we applied it to a number of publicly available real-world databases. The results of the numerical experiments are presented and discussed; they demonstrate that the algorithm performs well on the datasets considered.

