Efficient Algorithms for Dynamic Incomplete Decision Systems

2021 ◽  
Vol 17 (3) ◽  
pp. 44-67
Author(s):  
Nguyen Truong Thang ◽  
Giang Long Nguyen ◽  
Hoang Viet Long ◽  
Nguyen Anh Tuan ◽  
Tuan Manh Tran ◽  
...  

Attribute reduction is a crucial problem in data mining and knowledge discovery in big data. In incomplete decision systems, models based on tolerance rough sets are fundamental to solving the problem by computing the reduct to reduce execution time. However, these proposals used the traditional filter approach, so the resulting reduct was not optimal in the number of attributes or in classification accuracy. The problem is critical in dynamic incomplete decision systems, which are more appropriate for real-world applications. Therefore, this paper proposes two novel incremental algorithms that combine the filter and wrapper approaches, namely IFWA_ADO and IFWA_DEO, for dynamic incomplete decision systems. IFWA_ADO computes the reduct incrementally when multiple objects are added, while IFWA_DEO updates the reduct when multiple objects are removed. These algorithms are also verified on six data sets. Experimental results show that the filter-wrapper algorithms outperform the other incremental filter algorithms.

2021 ◽  
Vol 17 (2) ◽  
pp. 39-62
Author(s):  
Nguyen Long Giang ◽  
Le Hoang Son ◽  
Nguyen Anh Tuan ◽  
Tran Thi Ngan ◽  
Nguyen Nhu Son ◽  
...  

The tolerance rough set model is an effective tool for solving the attribute reduction problem directly on incomplete decision systems without pre-processing missing values. In practical applications, incomplete decision systems are often changed and updated, especially by adding or removing attributes. To find reducts on dynamic incomplete decision systems, researchers have proposed many incremental algorithms that decrease execution time. However, the proposed incremental algorithms are mainly based on the filter approach, in which classification accuracy is calculated only after the reduct has been obtained. As a result, these filter algorithms do not achieve the best result in terms of the number of attributes in the reduct and classification accuracy. This paper proposes two distance-based filter-wrapper incremental algorithms: IFWA_AA for the case of adding attributes and IFWA_DA for the case of deleting attributes. Experimental results show that the proposed filter-wrapper incremental algorithm IFWA_AA significantly decreases the number of attributes in the reduct and improves classification accuracy compared to filter incremental algorithms such as UARA and IDRA.
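
As a rough illustration of how a tolerance relation can work on incomplete data without imputing missing values, the sketch below computes the tolerance class of every object in a small table. It assumes missing values are encoded as `None` and that two objects are tolerant when they agree on every selected attribute wherever both values are known. The names `tolerant` and `tolerance_classes` are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence, Set

MISSING = None  # assumption: a missing attribute value is encoded as None


def tolerant(x: Sequence[Optional[object]],
             y: Sequence[Optional[object]],
             attrs: Sequence[int]) -> bool:
    """Return True if x and y agree on every attribute in attrs,
    treating a missing value as compatible with any value."""
    return all(
        x[a] is MISSING or y[a] is MISSING or x[a] == y[a]
        for a in attrs
    )


def tolerance_classes(table: Sequence[Sequence[Optional[object]]],
                      attrs: Sequence[int]) -> List[Set[int]]:
    """For each object, collect the indices of all objects tolerant to it."""
    return [
        {j for j, y in enumerate(table) if tolerant(x, y, attrs)}
        for x in table
    ]


# Toy table: three objects, two condition attributes, one missing value.
toy_table = [(1, None), (1, 2), (0, 2)]
classes_all = tolerance_classes(toy_table, [0, 1])
classes_second = tolerance_classes(toy_table, [1])
```

Dropping an attribute can only enlarge tolerance classes, which is why reduct algorithms need a measure to check how much discriminating power is lost when an attribute is removed.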


2019 ◽  
Vol 57 (4) ◽  
pp. 499
Author(s):  
Nguyen Ba Quang ◽  
Nguyen Long Giang ◽  
Dang Thi Oanh

The tolerance rough set model is an effective tool for attribute reduction in incomplete decision tables. In recent years, some incremental algorithms have been proposed to find reducts of dynamic incomplete decision tables in order to reduce computation time. However, they are classical filter algorithms, in which the classification accuracy of decision tables is computed after obtaining the reduct. Therefore, the reducts obtained by these algorithms are not optimal in terms of reduct cardinality and classification accuracy. In this paper, we propose the incremental filter-wrapper algorithm IDS_IFW_AO to find one reduct of an incomplete decision table when multiple objects are added. The experimental results on some sample datasets show that the proposed filter-wrapper algorithm IDS_IFW_AO is more effective than the filter algorithm IARM-I [17] in terms of classification accuracy and reduct cardinality.


Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 138 ◽  
Author(s):  
Lin Sun ◽  
Lanying Wang ◽  
Jiucheng Xu ◽  
Shiguang Zhang

For continuous numerical data sets, neighborhood rough sets-based attribute reduction is an important step for improving classification performance. However, most traditional reduction algorithms can only handle finite sets, and yield low accuracy and high cardinality. In this paper, a novel attribute reduction method using Lebesgue and entropy measures in neighborhood rough sets is proposed, which can deal with continuous numerical data whilst maintaining the original classification information. First, the Fisher score method is employed to eliminate irrelevant attributes and significantly reduce computational complexity for high-dimensional data sets. Then, the Lebesgue measure is introduced into neighborhood rough sets to investigate uncertainty measures. In order to analyze the uncertainty and noisiness of neighborhood decision systems well, some neighborhood entropy-based uncertainty measures are presented based on Lebesgue and entropy measures, and by combining the algebra view with the information view in neighborhood rough sets, a neighborhood roughness joint entropy is developed in neighborhood decision systems. Moreover, some of their properties are derived and the relationships are established, which help to understand the essence of knowledge and the uncertainty of neighborhood decision systems. Finally, a heuristic attribute reduction algorithm is designed to improve the classification performance of large-scale complex data. The experimental results on an instance and several public data sets show that the proposed method is very effective for selecting the most relevant attributes with high classification accuracy.


Author(s):  
Mekour Norreddine

Feature selection is one of the problems that arise with gene expression data. It is an important process for choosing which features matter for prediction, and there are two general approaches to it: the filter approach and the wrapper approach. In this chapter, the authors combine a filter approach based on information gain ranking with a wrapper approach that uses a genetic algorithm as the search method. The authors evaluate their approach on two gene expression data sets: Leukemia and Central Nervous System. The decision tree classifier (C4.5) is used for improving the classification performance.


2016 ◽  
Vol 16 (2) ◽  
pp. 3-15 ◽  
Author(s):  
Demetrovics Janos ◽  
Nguyen Thi Lan Huong ◽  
Vu Duc Thi ◽  
Nguyen Long Giang

Feature selection is a vital problem that needs to be solved effectively in knowledge discovery in databases and pattern recognition for two basic reasons: minimizing costs and accurately classifying data. Feature selection using rough set theory is also called attribute reduction. It has attracted a lot of attention from researchers, and numerous promising results have been obtained. However, most of them apply to static data, and attribute reduction in dynamic databases is still in its early stages. This paper focuses on developing incremental methods and algorithms to derive reducts, employing a distance measure, when the condition attribute set of a decision system changes. We also conduct experiments on UCI data sets, and the experimental results show that the proposed algorithms are better in terms of time consumption and reduct cardinality than the non-incremental heuristic algorithm and the incremental approach using information entropy proposed by the authors in [17].


Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 155 ◽  
Author(s):  
Lin Sun ◽  
Xiaoyu Zhang ◽  
Jiucheng Xu ◽  
Shiguang Zhang

Attribute reduction is an important preprocessing step for data mining and has become a hot research topic in rough set theory. Neighborhood rough set theory can overcome the shortcoming that classical rough set theory may lose some useful information in the process of discretization for continuous-valued data sets. In this paper, to improve the classification performance of complex data, a novel attribute reduction method using neighborhood entropy measures, combining the algebra view with the information view in neighborhood rough sets, is proposed, which can deal with continuous data whilst maintaining the classification information of the original attributes. First, to efficiently analyze the uncertainty of knowledge in neighborhood rough sets, a new average neighborhood entropy is presented by combining neighborhood approximate precision with neighborhood entropy, based on the strong complementarity between the algebra definition of attribute significance and the definition in the information view. Then, a concept of decision neighborhood entropy is investigated for handling the uncertainty and noisiness of neighborhood decision systems, which integrates the credibility degree with the coverage degree of neighborhood decision systems to fully reflect the decision ability of attributes. Moreover, some of their properties are derived and the relationships among these measures are established, which helps to understand the essence of knowledge content and the uncertainty of neighborhood decision systems. Finally, a heuristic attribute reduction algorithm is proposed to improve the classification performance of complex data sets. The experimental results on an instance and several public data sets demonstrate that the proposed method is very effective for selecting the most relevant attributes with great classification performance.
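
To make the idea of working on continuous data without discretization more concrete, the following is a minimal sketch of the basic neighborhood rough set constructs only, not of the entropy measures proposed in the paper. It assumes Euclidean distance and a user-chosen radius `delta`; the function names `neighborhood`, `positive_region`, and `dependency` are hypothetical.

```python
import math
from typing import Sequence, Set


def neighborhood(data: Sequence[Sequence[float]], i: int, delta: float) -> Set[int]:
    """Indices of all objects within Euclidean distance delta of object i."""
    return {j for j, y in enumerate(data) if math.dist(data[i], y) <= delta}


def positive_region(data: Sequence[Sequence[float]],
                    labels: Sequence[int],
                    delta: float) -> Set[int]:
    """Objects whose whole neighborhood carries the same decision label."""
    return {
        i for i in range(len(data))
        if all(labels[j] == labels[i] for j in neighborhood(data, i, delta))
    }


def dependency(data: Sequence[Sequence[float]],
               labels: Sequence[int],
               delta: float) -> float:
    """Fraction of objects that fall in the positive region."""
    return len(positive_region(data, labels, delta)) / len(data)


# Toy data: four objects with one continuous attribute and a binary decision.
toy_data = [(0.0,), (0.1,), (0.5,), (0.55,)]
toy_labels = [0, 0, 1, 0]
toy_positive = positive_region(toy_data, toy_labels, delta=0.2)
toy_dependency = dependency(toy_data, toy_labels, delta=0.2)
```

The radius `delta` plays the role that discretization plays in classical rough sets: a larger radius makes neighborhoods coarser, and a radius of zero reduces the relation to exact equality.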


The tolerance rough set model is an effective tool for reducing attributes in incomplete decision tables. Over the past 40 years, several attribute reduction methods have been proposed to improve execution time and the number of attributes in the reduct. However, they are classical filter algorithms, in which the classification accuracy of decision tables is computed after obtaining the reducts. Therefore, the reducts obtained by these algorithms are not optimal in terms of reduct cardinality and classification accuracy. In this paper, we propose a filter-wrapper algorithm to find a reduct in incomplete decision tables. We then use this measure to determine the importance of each attribute and select attributes based on the calculated importance (filter phase). In the next step, we find the reduct with the highest classification accuracy by iterating over the elements of the set containing the sequence of attributes selected in the first step (wrapper phase). To verify the effectiveness of the method, we conduct experiments on six well-known UCI data sets. Experimental results show that the proposed method increases classification accuracy and reduces the cardinality of the reduct compared to Algorithm 1 [12].
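
The two phases described above can be sketched generically as follows. This is only an illustrative outline under stated assumptions, not the authors' algorithm: the abstract does not name its measure, so `measure` here is a hypothetical caller-supplied function that scores an attribute subset (higher is better), and `accuracy` is a hypothetical caller-supplied function that returns the classification accuracy obtained with a subset.

```python
from typing import Callable, List, Sequence


def filter_wrapper_reduct(attributes: Sequence[str],
                          measure: Callable[[List[str]], float],
                          accuracy: Callable[[List[str]], float]) -> List[str]:
    """Greedy filter phase followed by an accuracy-driven wrapper phase."""
    target = measure(list(attributes))  # score of the full attribute set
    selected: List[str] = []
    remaining = list(attributes)
    candidates: List[List[str]] = []  # nested subsets built by the filter phase

    # Filter phase: repeatedly add the attribute that gives the best score,
    # stopping once the selected subset scores as well as the full set.
    while remaining:
        best = max(remaining, key=lambda a: measure(selected + [a]))
        selected.append(best)
        remaining.remove(best)
        candidates.append(list(selected))
        if measure(selected) >= target:
            break

    if not candidates:
        return []

    # Wrapper phase: keep the candidate with the highest accuracy.
    # max() returns the first maximum, so ties go to the smaller subset.
    return max(candidates, key=accuracy)


# Toy stand-ins (assumptions): only "a" and "b" are informative,
# and the toy classifier does best with a single attribute.
def toy_measure(subset: List[str]) -> float:
    return float(len(set(subset) & {"a", "b"}))


def toy_accuracy(subset: List[str]) -> float:
    return {1: 0.9, 2: 0.8}.get(len(subset), 0.5)


toy_reduct = filter_wrapper_reduct(["a", "b", "c"], toy_measure, toy_accuracy)
```

Because the wrapper phase only evaluates the nested prefixes produced by the filter phase, the classifier is trained at most once per selected attribute rather than once per possible subset.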


2014 ◽  
Vol 644-650 ◽  
pp. 1607-1619 ◽  
Author(s):  
Tao Yan ◽  
Chong Zhao Han

Z. Pawlak’s rough set theory has been widely applied in analyzing ordinary information systems and decision tables. However, few studies have been conducted on the attribute selection problem in incomplete decision systems because of its complexity. Therefore, it is necessary to investigate effective algorithms to tackle this issue. In this paper, a new rough conditional entropy-based uncertainty measure is introduced to evaluate the significance of subsets of attributes in incomplete decision systems. Moreover, some important properties of rough conditional entropy are derived and three attribute selection approaches are constructed: an exhaustive approach, a heuristic approach, and a probabilistic approach. Finally, a series of experiments on practical incomplete data sets is carried out to assess the proposed approaches. The experimental results indicate that two of these approaches perform satisfactorily in attribute selection in incomplete decision systems.


2021 ◽  
pp. 1-15
Author(s):  
Rongde Lin ◽  
Jinjin Li ◽  
Dongxiao Chen ◽  
Jianxin Huang ◽  
Yingsheng Chen

The fuzzy covering rough set model is a popular and important theoretical tool for computing uncertainty, and provides an effective approach for attribute reduction. However, attribute reductions derived directly from fuzzy lower or upper approximations still retain a large amount of redundant information, which leads to a lower attribute reduction ratio. This paper introduces a kind of parametric observation set on the approximations, and further proposes so-called parametric observational consistency, which is applied to attribute reduction in fuzzy multi-covering decision systems. Then the related discernibility matrix is developed to provide a way of performing attribute reduction. In addition, for multiple observational parameters, this article also introduces a recursive method to gradually construct the multiple discernibility matrix by composing the refined discernibility matrix and the incremental discernibility matrix based on previous ones. In this case, an attribute reduction algorithm is proposed. Finally, experiments are used to demonstrate the feasibility and effectiveness of the proposed method.

