An attribute discretization algorithm based on Rough Set and information entropy

Author(s): He Liu, Da-You Liu, Xiao-Hu Shi, Ying Gao

2013, Vol 416-417, pp. 1399-1403
Author(s): Zhi Cai Shi, Yong Xiang Xia, Chao Gang Yu, Jin Zu Zhou

Discretization is one of the most important steps in applying rough set theory. In this paper, we analyze the shortcomings of existing related work. We then propose a novel discretization algorithm based on information loss and give its mathematical description. The algorithm uses information loss as its measure so as to reduce the loss of information entropy during discretization. It was applied to different samples with the same attributes, drawn from KDDcup99 and from intrusion detection systems. The experimental results show that the algorithm is sensitive to the choice of samples for only some of the attributes. However, it does not compromise the effectiveness of intrusion detection, and it remarkably improves the response performance of intrusion detection.
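
The abstract does not give the algorithm's mathematical description, so the following is only a minimal sketch of one way an entropy-based information loss could be measured: the increase in class entropy, conditioned on the intervals, when a single cut point is removed. The function names, the toy data and this particular definition of loss are assumptions for illustration, not the authors' formulation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(values, labels, cuts):
    """Class entropy given the intervals induced by the cut points."""
    bins = {}
    for v, y in zip(values, labels):
        idx = sum(v > c for c in cuts)  # index of the interval containing v
        bins.setdefault(idx, []).append(y)
    n = len(values)
    return sum(len(b) / n * entropy(b) for b in bins.values())


def information_loss(values, labels, cuts, cut):
    """Hypothetical loss: rise in conditional entropy when `cut` is dropped."""
    reduced = [c for c in cuts if c != cut]
    before = conditional_entropy(values, labels, cuts)
    after = conditional_entropy(values, labels, reduced)
    return after - before


# Toy data (made up): one continuous attribute and a two-class decision.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = ["a", "a", "a", "b", "b", "b"]
cuts = [1.5, 3.5]

loss_redundant = information_loss(values, labels, cuts, 1.5)
loss_essential = information_loss(values, labels, cuts, 3.5)
```

In this toy case, dropping the cut at 1.5 loses nothing because both neighbouring intervals contain only class "a", while dropping the cut at 3.5 mixes the two classes and gives a positive loss. A greedy discretizer built on this idea would remove the cut with the smallest loss first.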


Author(s): JIYE LIANG, ZHONGZHI SHI

Rough set theory is a relatively new mathematical tool for use in computer applications in circumstances which are characterized by vagueness and uncertainty. In this paper, we introduce the concepts of information entropy, rough entropy and knowledge granulation in rough set theory, and establish the relationships among those concepts. These results will be very helpful for understanding the essence of concept approximation and establishing granular computing in rough set theory.
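
The abstract names three measures without defining them. The sketch below uses the forms commonly cited in the rough set literature for a partition of a universe U into blocks X_1, ..., X_m: information entropy E = sum (|X_i|/|U|)(1 - |X_i|/|U|), rough entropy E_r = -sum (|X_i|/|U|) log2(1/|X_i|), and knowledge granulation GK = (1/|U|^2) sum |X_i|^2. Treat these forms, the function names and the toy partition as assumptions rather than a quotation of the paper.

```python
from math import log2, isclose


def partition(universe, key):
    """Group objects into equivalence classes by the value of `key`."""
    blocks = {}
    for x in universe:
        blocks.setdefault(key(x), []).append(x)
    return list(blocks.values())


def information_entropy(blocks, n):
    """E = sum p_i * (1 - p_i), with p_i = |X_i| / |U|."""
    return sum(len(b) / n * (1 - len(b) / n) for b in blocks)


def rough_entropy(blocks, n):
    """E_r = -sum p_i * log2(1 / |X_i|)."""
    return -sum(len(b) / n * log2(1 / len(b)) for b in blocks)


def knowledge_granulation(blocks, n):
    """GK = (1 / |U|^2) * sum |X_i|^2."""
    return sum(len(b) ** 2 for b in blocks) / n ** 2


def shannon_entropy(blocks, n):
    """H = -sum p_i * log2(p_i)."""
    return -sum(len(b) / n * log2(len(b) / n) for b in blocks)


# Toy universe of 8 objects split into blocks of sizes 4, 2 and 2.
universe = list(range(8))
blocks = partition(universe, lambda x: 0 if x < 4 else (1 if x < 6 else 2))
n = len(universe)

E = information_entropy(blocks, n)
GK = knowledge_granulation(blocks, n)
Er = rough_entropy(blocks, n)
H = shannon_entropy(blocks, n)

# Under the assumed definitions, two simple identities hold:
assert isclose(E + GK, 1.0)
assert isclose(Er + H, log2(n))
```

Under these definitions, a finer partition raises the information entropy and lowers both the rough entropy and the knowledge granulation, which is one way to read the relationships the abstract refers to.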


Author(s): Yaling Xun, Qingxia Yin, Jifu Zhang, Haifeng Yang, Xiaohui Cui

Author(s): Qiong Chen, Mengxing Huang

Feature discretization is an important preprocessing technique for massive data in industrial control. It improves the efficiency of edge-cloud computing by transforming continuous features into discrete ones, so as to meet the requirements of high-quality cloud services. Compared with other discretization methods, discretization based on rough sets has achieved good results in many applications because it can make full use of the known knowledge base without any prior information. However, the equivalence classes of a rough set are ordinary sets, which makes it difficult to describe the fuzzy components in the data, and accuracy is low for some complex data types in big data environments. Therefore, we propose a rough fuzzy model based discretization algorithm (RFMD). First, we use fuzzy c-means clustering to obtain the membership of each sample in each category. Then, we fuzzify the equivalence classes of the rough set using the obtained memberships, and establish the fitness function of a genetic algorithm based on the rough fuzzy model to select the optimal discrete breakpoints on the continuous features. Finally, we compare the proposed method with discretization algorithms based on rough sets, on information entropy, and on the chi-square test, using remote sensing datasets. The experimental results verify the effectiveness of our method.
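
The first step described above, obtaining fuzzy memberships, can be illustrated with the standard fuzzy c-means membership formula u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)), where d_ij is the distance from sample i to cluster center j and m > 1 is the fuzzifier. The sketch below is a minimal one-dimensional version with fixed centers. The function name, the toy values and the centers are assumptions, and the center update, the fuzzified equivalence classes and the genetic algorithm of RFMD are not reproduced here.

```python
def fcm_memberships(values, centers, m=2.0):
    """Fuzzy c-means membership of each 1-D value in each cluster center.

    Uses u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)), with m > 1.
    """
    power = 2.0 / (m - 1.0)
    result = []
    for v in values:
        dists = [abs(v - c) for c in centers]
        if 0.0 in dists:
            # The value coincides with one or more centers: share full
            # membership among them to avoid a division by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            row = [h / total for h in hits]
        else:
            row = [1.0 / sum((d_i / d_k) ** power for d_k in dists)
                   for d_i in dists]
        result.append(row)
    return result


# Toy feature values and two hypothetical cluster centers.
values = [0.0, 1.0, 2.0, 5.0]
centers = [0.0, 4.0]
memberships = fcm_memberships(values, centers)
```

Each row sums to 1. The value 2.0, which is equidistant from both centers, receives membership 0.5 in each, and the value 1.0 receives about 0.9 in the nearer cluster. Graded memberships of this kind are what let a rough fuzzy model describe samples near a class boundary instead of forcing them into a single crisp class.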


2013, Vol 278-280, pp. 1167-1173
Author(s): Guo Qiang Sun, Hong Li Wang, Jing Hui Lu, Xing He

Rough set theory is mainly used for analysing and processing fuzzy and uncertain information and knowledge. However, most of the data we usually obtain are continuous; rough set theory can preprocess these data and obtain satisfactory discretization results. Discretization of continuous attributes is therefore an important part of rough set theory. Field Programmable Gate Arrays (FPGAs) have become the main platform for implementing digital system designs. To improve the processing speed of discretization, this paper proposes an FPGA-based discretization algorithm for continuous attributes in rough sets that exploits the speed advantage of FPGAs and incorporates the attribute dependency degree. This method can save much of the preprocessing time in rough set analysis and improve operating efficiency.
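
The attribute dependency degree mentioned above is usually defined as γ_C(D) = |POS_C(D)| / |U|, the fraction of objects whose condition-attribute values determine the decision unambiguously. The following software sketch (not an FPGA design) shows how that degree could be used to judge a candidate set of cut points. The function names, the toy table and the cut values are assumptions for illustration, and the paper's actual hardware algorithm is not reproduced.

```python
from collections import defaultdict


def discretize(value, cuts):
    """Map a continuous value to the index of its interval."""
    return sum(value > c for c in cuts)


def dependency_degree(rows, condition_attrs, decision_attr):
    """gamma_C(D): share of objects in consistent equivalence classes."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in condition_attrs)
        classes[key].append(row[decision_attr])
    # An equivalence class lies in the positive region when all of its
    # objects share the same decision value.
    positive = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return positive / len(rows)


def gamma_for_cuts(raw, decisions, cuts):
    """Dependency degree after discretizing one attribute with `cuts`."""
    rows = [{"x": discretize(v, cuts), "d": d}
            for v, d in zip(raw, decisions)]
    return dependency_degree(rows, ["x"], "d")


# Toy decision table (made up): one continuous attribute, one decision.
raw = [36.5, 37.0, 38.2, 39.1]
decisions = ["no", "no", "yes", "yes"]

gamma_good = gamma_for_cuts(raw, decisions, [37.5])
gamma_poor = gamma_for_cuts(raw, decisions, [36.8])
gamma_none = gamma_for_cuts(raw, decisions, [])
```

Here the cut at 37.5 keeps the table fully consistent (dependency degree 1.0), the cut at 36.8 leaves only one object in the positive region (0.25), and using no cut at all gives 0.0. A discretizer guided by dependency degree would prefer the smallest set of cuts that preserves the degree of the original table.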

