Metric Based Attribute Reduction Method in Dynamic Decision Tables

2016 ◽  
Vol 16 (2) ◽  
pp. 3-15 ◽  
Author(s):  
Demetrovics Janos ◽  
Nguyen Thi Lan Huong ◽  
Vu Duc Thi ◽  
Nguyen Long Giang

Feature selection is a vital problem that must be solved effectively in knowledge discovery in databases and pattern recognition for two basic reasons: minimizing costs and classifying data accurately. Feature selection using rough set theory is also called attribute reduction. It has attracted a lot of attention from researchers, and numerous promising results have been obtained. However, most of these methods apply to static data, and attribute reduction in dynamic databases is still in its early stages. This paper focuses on developing incremental methods and algorithms to derive reducts, employing a distance measure, when decision systems vary in their condition attribute set. We also conduct experiments on UCI data sets, and the experimental results show that the proposed algorithms are better in terms of time consumption and reduct cardinality than a non-incremental heuristic algorithm and the incremental approach using information entropy proposed by the authors in [17].

Author(s):  
Nguyen Thi Lan Huong ◽  
Nguyen Long Giang

Feature selection is a crucial problem that needs to be solved effectively in knowledge discovery in databases for two basic reasons: to minimize cost and to classify data accurately. Feature selection using rough set theory, also called attribute reduction, has attracted much attention from researchers, and many results have been obtained. However, attribute reduction in dynamic databases is still in its early stages. This paper focuses on developing incremental methods and algorithms to derive reducts, employing a distance measure, when objects are added, deleted, or updated. Because the algorithms do not have to be re-run on the changed universal set, they significantly reduce computation time.


Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 155 ◽  
Author(s):  
Lin Sun ◽  
Xiaoyu Zhang ◽  
Jiucheng Xu ◽  
Shiguang Zhang

Attribute reduction is an important preprocessing step for data mining and has become a hot research topic in rough set theory. Neighborhood rough set theory can overcome the shortcoming that classical rough set theory may lose some useful information when continuous-valued data sets are discretized. In this paper, to improve the classification performance of complex data, a novel attribute reduction method using neighborhood entropy measures in neighborhood rough sets is proposed. The method combines the algebra view with the information view, and it can deal with continuous data whilst maintaining the classification information of the original attributes. First, to efficiently analyze the uncertainty of knowledge in neighborhood rough sets, a new average neighborhood entropy is presented by combining neighborhood approximate precision with neighborhood entropy; it is based on the strong complementarity between the algebra definition of attribute significance and the definition in the information view. Then, a concept of decision neighborhood entropy is investigated for handling the uncertainty and noisiness of neighborhood decision systems; it integrates the credibility degree with the coverage degree of neighborhood decision systems to fully reflect the decision ability of attributes. Moreover, some of their properties are derived and the relationships among these measures are established, which helps to understand the essence of knowledge content and the uncertainty of neighborhood decision systems. Finally, a heuristic attribute reduction algorithm is proposed to improve the classification performance of complex data sets. The experimental results on an illustrative instance and several public data sets demonstrate that the proposed method is very effective for selecting the most relevant attributes with great classification performance.
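As a rough illustration of why neighborhood rough sets need no discretization, the sketch below computes a plain neighborhood granule and the standard positive-region dependency degree on continuous values. It does not implement the paper's average or decision neighborhood entropy; the function names, the Euclidean distance, the radius `delta`, and the toy data are all assumptions made for this example.

```python
import math

def neighborhood(X, i, attrs, delta):
    """Indices of samples within Euclidean distance delta of sample i on attrs."""
    return [
        j for j in range(len(X))
        if math.sqrt(sum((X[i][a] - X[j][a]) ** 2 for a in attrs)) <= delta
    ]

def neighborhood_dependency(X, y, attrs, delta):
    """Fraction of samples whose whole neighborhood shares their decision label."""
    consistent = sum(
        1 for i in range(len(X))
        if all(y[j] == y[i] for j in neighborhood(X, i, attrs, delta))
    )
    return consistent / len(X)

# Hypothetical continuous data: attribute 0 separates the classes, attribute 1 does not.
X = [[0.10, 0.90], [0.15, 0.20], [0.80, 0.85], [0.85, 0.10]]
y = [0, 0, 1, 1]

dep_attr0 = neighborhood_dependency(X, y, [0], delta=0.2)
dep_attr1 = neighborhood_dependency(X, y, [1], delta=0.2)
```

With this toy data, attribute 0 alone keeps every neighborhood pure while attribute 1 alone keeps none, which is the kind of signal a heuristic reduction algorithm could use to rank attributes.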


2014 ◽  
Vol 644-650 ◽  
pp. 1607-1619 ◽  
Author(s):  
Tao Yan ◽  
Chong Zhao Han

Z. Pawlak’s rough set theory has been widely applied in analyzing ordinary information systems and decision tables. However, few studies have been conducted on the attribute selection problem in incomplete decision systems because of its complexity. Therefore, it is necessary to investigate effective algorithms to tackle this issue. In this paper, a new rough conditional entropy based uncertainty measure is introduced to evaluate the significance of subsets of attributes in incomplete decision systems. Moreover, some important properties of rough conditional entropy are derived and three attribute selection approaches are constructed, including an exhaustive approach, a heuristic approach, and a probabilistic approach. Finally, a series of experiments on practical incomplete data sets are carried out to assess the proposed approaches. The final experimental results indicate that two of these approaches perform satisfactorily in the process of attribute selection in incomplete decision systems.


Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 138 ◽  
Author(s):  
Lin Sun ◽  
Lanying Wang ◽  
Jiucheng Xu ◽  
Shiguang Zhang

For continuous numerical data sets, neighborhood rough sets-based attribute reduction is an important step for improving classification performance. However, most of the traditional reduction algorithms can only handle finite sets, and yield low accuracy and high cardinality. In this paper, a novel attribute reduction method using Lebesgue and entropy measures in neighborhood rough sets is proposed, which can deal with continuous numerical data whilst maintaining the original classification information. First, the Fisher score method is employed to eliminate irrelevant attributes and significantly reduce computational complexity for high-dimensional data sets. Then, the Lebesgue measure is introduced into neighborhood rough sets to investigate uncertainty measures. To analyze the uncertainty and noise of neighborhood decision systems well, some neighborhood entropy-based uncertainty measures are presented based on Lebesgue and entropy measures, and by combining the algebra view with the information view in neighborhood rough sets, a neighborhood roughness joint entropy is developed in neighborhood decision systems. Moreover, some of their properties are derived and the relationships are established, which helps to understand the essence of knowledge and the uncertainty of neighborhood decision systems. Finally, a heuristic attribute reduction algorithm is designed to improve the classification performance of large-scale complex data. The experimental results on an illustrative instance and several public data sets show that the proposed method is very effective for selecting the most relevant attributes with high classification accuracy.
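The Fisher score prefiltering step mentioned above can be sketched as follows. This uses the common textbook form of the score (between-class scatter divided by within-class scatter, per feature); whether the paper uses exactly this variant is an assumption, and the function name and toy data are hypothetical.

```python
from collections import defaultdict

def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter / within-class scatter."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_all = sum(col) / len(col)
        between = within = 0.0
        for rows in by_class.values():
            vals = [r[j] for r in rows]
            mean_c = sum(vals) / len(vals)
            var_c = sum((v - mean_c) ** 2 for v in vals) / len(vals)
            between += len(vals) * (mean_c - mean_all) ** 2
            within += len(vals) * var_c
        # Zero within-class scatter is reported as infinity in this sketch.
        scores.append(between / within if within > 0 else float("inf"))
    return scores

# Hypothetical data: feature 0 separates the two classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 1.0], [3.0, 5.2], [3.2, 0.8]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
```

Features with low scores would be dropped before the more expensive neighborhood computations; the cutoff is a tuning choice that the abstract does not specify.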


2019 ◽  
Vol 9 (14) ◽  
pp. 2841 ◽  
Author(s):  
Nan Zhang ◽  
Xueyi Gao ◽  
Tianyou Yu

Attribute reduction is a challenging problem in rough set theory, which has been applied in many research fields, including knowledge representation, machine learning, and artificial intelligence. The main objective of attribute reduction is to obtain a minimal attribute subset that can retain the same classification or discernibility properties as the original information system. Recently, many attribute reduction algorithms, such as positive region preservation, generalized decision preservation, and distribution preservation, have been proposed. The existing attribute reduction algorithms for generalized decision preservation are mainly based on the discernibility matrix and are, thus, computationally very expensive and hard to use in large-scale and high-dimensional data sets. To overcome this problem, we introduce the similarity degree for generalized decision preservation. On this basis, the inner and outer significance measures are proposed. By using heuristic strategies, we develop two quick reduction algorithms for generalized decision preservation. Finally, theoretical and experimental results show that the proposed heuristic reduction algorithms are effective and efficient.
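To make the preserved property concrete, the sketch below computes the generalized decision of each object (the set of decision values found in its equivalence class) and checks whether an attribute subset leaves it unchanged. It is not the paper's similarity-degree algorithm; the function names and the toy table are assumptions for illustration.

```python
from collections import defaultdict

def generalized_decision(rows, attrs, decision):
    """For each row, the set of decision values seen in its equivalence class on attrs."""
    seen = defaultdict(set)
    for row in rows:
        seen[tuple(row[a] for a in attrs)].add(row[decision])
    return [frozenset(seen[tuple(row[a] for a in attrs)]) for row in rows]

def preserves_generalized_decision(rows, subset, all_attrs, decision):
    """True if subset gives every row the same generalized decision as all_attrs."""
    return (generalized_decision(rows, subset, decision)
            == generalized_decision(rows, all_attrs, decision))

# Hypothetical decision table: a and b are condition attributes, d is the decision.
table = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "yes"},
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "no"},
]

keeps_a = preserves_generalized_decision(table, ["a"], ["a", "b"], "d")
keeps_b = preserves_generalized_decision(table, ["b"], ["a", "b"], "d")
```

Here `{a}` preserves the generalized decision and `{b}` does not, so a heuristic search would keep `a` and discard `b`. Grouping by hashed value tuples avoids building a pairwise discernibility matrix.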


Author(s):  
Qing-Hua Zhang ◽  
Long-Yang Yao ◽  
Guan-Sheng Zhang ◽  
Yu-Ke Xin

In this paper, a new incremental knowledge acquisition method is proposed based on rough set theory, decision trees, and granular computing. To process dynamic data effectively, the paper first analyzes how to describe the data with rough set theory, compute equivalence classes, and calculate the positive region with a hash algorithm. Then, attribute reduction, value reduction, and rule set extraction are completed efficiently using the hash algorithm. Finally, for each newly added data item, the incremental knowledge acquisition method is used to update the original rules. Both algorithm analysis and experiments show that, for processing dynamic information systems, the proposed algorithm has lower time complexity than the traditional algorithms and the incremental knowledge acquisition algorithms based on granular computing, owing to the efficiency of the hash algorithm, and that it is more effective on huge data sets.
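A minimal sketch of hash-based equivalence classes and the positive region, assuming a Python dict serves as the hash table; this illustrates the general idea only and is not the authors' algorithm. The function names and the toy table are hypothetical.

```python
from collections import defaultdict

def equivalence_classes(rows, attrs):
    """Group row indices by their values on attrs, using a dict as the hash table."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    return list(classes.values())

def positive_region(rows, cond_attrs, decision):
    """Indices of rows whose equivalence class agrees on the decision value."""
    pos = []
    for block in equivalence_classes(rows, cond_attrs):
        if len({rows[i][decision] for i in block}) == 1:
            pos.extend(block)
    return sorted(pos)

# Hypothetical decision table: a and b are condition attributes, d is the decision.
table = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "no"},
    {"a": 0, "b": 0, "d": "yes"},
]

pos_ab = positive_region(table, ["a", "b"], "d")
pos_a = positive_region(table, ["a"], "d")
```

Each pass over the table is linear in the number of rows for a fixed attribute set, whereas comparing every pair of objects is quadratic.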


2014 ◽  
Vol 644-650 ◽  
pp. 2120-2123 ◽  
Author(s):  
De Zhi An ◽  
Guang Li Wu ◽  
Jun Lu

At present there are many data mining methods. This paper studies the application of rough set method in data mining, mainly on the application of attribute reduction algorithm based on rough set in the data mining rules extraction stage. Rough set in data mining is often used for reduction of knowledge, and thus for the rule extraction. Attribute reduction is one of the core research contents of rough set theory. In this paper, the traditional attribute reduction algorithm based on rough sets is studied and improved, and for large data sets of data mining, a new attribute reduction algorithm is proposed.


2016 ◽  
Vol 16 (4) ◽  
pp. 13-28 ◽  
Author(s):  
Cao Chinh Nghia ◽  
Demetrovics Janos ◽  
Nguyen Long Giang ◽  
Vu Duc Thi

In the traditional rough set theory approach, attribute reduction methods are performed on decision tables with a discretized value domain, that is, decision tables obtained by data discretization methods. In recent years, researchers have proposed methods based on the fuzzy rough set approach to solve the problem of attribute reduction in decision tables with a numerical value domain. In this paper, we propose a fuzzy distance between two partitions and an attribute reduction method for numerical decision tables based on the proposed fuzzy distance. Experiments on data sets show that the classification accuracy of the proposed method is better than that of methods based on fuzzy entropy.


2021 ◽  
Vol 17 (3) ◽  
pp. 44-67
Author(s):  
Nguyen Truong Thang ◽  
Giang Long Nguyen ◽  
Hoang Viet Long ◽  
Nguyen Anh Tuan ◽  
Tuan Manh Tran ◽  
...  

Attribute reduction is a crucial problem in the process of data mining and knowledge discovery in big data. In incomplete decision systems, the tolerance rough set model is fundamental to solving the problem: it computes the reduct in order to reduce execution time. However, existing proposals used the traditional filter approach, so the reduct was not optimal in the number of attributes or in classification accuracy. The problem is critical in dynamic incomplete decision systems, which are more appropriate for real-world applications. Therefore, this paper proposes two novel incremental algorithms that combine the filter and wrapper approaches, namely IFWA_ADO and IFWA_DEO, for dynamic incomplete decision systems. IFWA_ADO computes the reduct incrementally when multiple objects are added, while IFWA_DEO updates the reduct when multiple objects are removed. These algorithms are also verified on six data sets. Experimental results show that the filter-wrapper algorithms achieve higher performance than the other filter incremental algorithms.
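The tolerance relation underlying this model can be sketched as below, assuming the usual definition for incomplete data: two objects are tolerant when, on every attribute, their values are equal or at least one of them is missing. This is not IFWA_ADO or IFWA_DEO; the `MISSING` marker, function names, and toy rows are hypothetical.

```python
MISSING = None  # assumed marker for an unknown attribute value

def tolerant(u, v, attrs):
    """True if u and v agree on every attribute where both values are known."""
    return all(
        u[a] == v[a] or u[a] is MISSING or v[a] is MISSING
        for a in attrs
    )

def tolerance_classes(rows, attrs):
    """For each row, the indices of all rows tolerant with it on attrs."""
    return [
        [j for j, v in enumerate(rows) if tolerant(u, v, attrs)]
        for u in rows
    ]

# Hypothetical incomplete table with two condition attributes (columns 0 and 1).
rows = [
    (1, MISSING),
    (1, 0),
    (0, 0),
    (1, 1),
]

classes = tolerance_classes(rows, [0, 1])
```

Unlike equivalence classes, tolerance classes can overlap: row 0 is tolerant with rows 1 and 3, but rows 1 and 3 are not tolerant with each other.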


2014 ◽  
Vol 2014 ◽  
pp. 1-15 ◽  
Author(s):  
Tao Yan ◽  
Chongzhao Han

Pawlak's classical rough set theory has been applied in analyzing ordinary information systems and decision systems. However, few studies have been carried out on the attribute selection problem in incomplete decision systems because of its complexity. It is therefore necessary to investigate effective algorithms to deal with this issue. In this paper, a new rough conditional entropy-based uncertainty measure is introduced to evaluate the significance of subsets of attributes in incomplete decision systems. Furthermore, some important properties of rough conditional entropy are derived and three attribute selection approaches are constructed, including an exhaustive search strategy approach, a heuristic search strategy approach, and a probabilistic search strategy approach for incomplete decision systems. Moreover, several experiments on real-life incomplete data sets are conducted to assess the efficiency of the proposed approaches. The final experimental results indicate that two of these approaches can give satisfying performances in the process of attribute selection in incomplete decision systems.

