maximal frequent itemsets
Recently Published Documents


Total documents: 103 (five years: 10)

H-index: 9 (five years: 2)

2021 ◽  
Vol 11 (21) ◽  
pp. 10399
Author(s):  
Yalong Zhang ◽  
Wei Yu ◽  
Qiuqin Zhu ◽  
Xuan Ma ◽  
Hisakazu Ogura

In association rule mining, all frequent itemsets are found first, and the confidence of each association rule is then calculated from the support of those itemsets. Because every non-empty subset of a frequent itemset is itself frequent, all frequent itemsets can be recovered simply by finding all maximal frequent itemsets (MFIs), that is, frequent itemsets none of whose proper supersets are frequent. This study proposes an algorithm, named right-hand side expanding (RHSE), that finds all MFIs exactly. First, an Expanding Operation is designed which, starting from any given frequent itemset, adds items according to certain rules to form supersets of that itemset, all of which are MFIs. Next, this operation is applied with each frequent 1-itemset in turn as the starting point, so that all MFIs are eventually found. Owing to the design of the Expanding Operation, every MFI is found, and the path by which it is found is unique, which avoids redundancy in both time and space. Because it avoids redundant computation while still finding all MFIs, the algorithm runs fast and is applicable to large, high-dimensional transaction data. Finally, a detailed experimental report on 10 open standard transaction sets is given, including results on data sets with millions of transactions.
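
The relationship between MFIs and the full set of frequent itemsets can be illustrated with a minimal brute-force sketch in Python. This is not the RHSE algorithm from the abstract; the toy transactions, the support threshold, and all function names are assumptions made purely for illustration, and the enumeration is exponential in the number of items.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all frequent itemsets (toy data only)."""
    items = sorted(set().union(*transactions))
    frequent = []
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            c = frozenset(candidate)
            # Support = number of transactions containing the candidate.
            if sum(1 for t in transactions if c <= t) >= min_support:
                frequent.append(c)
    return frequent

def maximal_frequent_itemsets(frequent):
    """Keep only the itemsets that have no frequent proper superset."""
    return [f for f in frequent if not any(f < g for g in frequent)]

def nonempty_subsets(itemsets):
    """All non-empty subsets of the given itemsets."""
    out = set()
    for m in itemsets:
        for k in range(1, len(m) + 1):
            out.update(frozenset(s) for s in combinations(sorted(m), k))
    return out

# Hypothetical toy database; absolute support threshold of 3 transactions.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
)]
frequent = frequent_itemsets(transactions, min_support=3)
mfis = maximal_frequent_itemsets(frequent)
recovered = nonempty_subsets(mfis)

print(sorted(sorted(m) for m in mfis))
print(recovered == set(frequent))
```

On this toy input the MFIs are {a, b}, {a, c}, and {b, c}, and their non-empty subsets are exactly the frequent itemsets, which is the property the abstract relies on. Note that the MFIs alone do not carry the supports of their subsets, so computing rule confidence still requires counting those supports.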


2021 ◽  
pp. 175-186
Author(s):  
Bemarisika Parfait ◽  
André Totohasina

Given a large collection of transactions containing items, a common problem in association rule mining is the huge size of the extracted rule set. Pruning uninteresting and redundant association rules is a promising approach to this problem. In this paper, we propose a Condensed Representation for Positive and Negative Association Rules that represents non-redundant rules, both exact and approximate, based on the sets of frequent generator itemsets, frequent closed itemsets, maximal frequent itemsets, and minimal infrequent itemsets in database B. Experiments on dense (highly correlated) databases show a significant reduction in the size of the extracted association rule set in database B.
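
As a rough illustration of three of the itemset families named above, the following Python sketch classifies the frequent itemsets of a toy database as closed, generator, or maximal using the standard definitions. It does not reproduce the paper's condensed representation; the data, threshold, and function names are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def classify(transactions, min_support):
    """Return (closed, generators, maximal) among the frequent itemsets."""
    items = sorted(set().union(*transactions))
    frequent = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if support(frozenset(c), transactions) >= min_support
    ]
    closed, generators, maximal = set(), set(), set()
    for s in frequent:
        sup = support(s, transactions)
        # Checking one-item extensions suffices because support is anti-monotone.
        supersets = [s | {x} for x in items if x not in s]
        # Closed: every proper superset has strictly smaller support.
        if all(support(g, transactions) < sup for g in supersets):
            closed.add(s)
        # Generator: every proper subset (including the empty set) has strictly larger support.
        if all(support(s - {x}, transactions) > sup for x in s):
            generators.add(s)
        # Maximal: no proper superset is frequent.
        if all(support(g, transactions) < min_support for g in supersets):
            maximal.add(s)
    return closed, generators, maximal

# Hypothetical toy database; absolute support threshold of 2 transactions.
transactions = [frozenset(t) for t in (
    {"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}, {"a"},
)]
closed, generators, maximal = classify(transactions, min_support=2)
print(sorted(sorted(s) for s in closed))
print(sorted(sorted(s) for s in generators))
print(sorted(sorted(s) for s in maximal))
```

Here {b} is a generator but not closed, because {a, b} has the same support, while {a, b} is both closed and maximal. Every maximal frequent itemset is closed, but not the other way around ({a} is closed yet not maximal).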


2019 ◽  
Vol 35 (4) ◽  
pp. 337-354
Author(s):  
Bac Le ◽  
Lien Kieu ◽  
Dat Tran

In the past few years, privacy issues have received considerable attention in the data mining literature. However, the problem of data security cannot be solved simply by restricting data collection or preventing unauthorized access; it calls for solutions that not only protect sensitive information but also preserve the accuracy of data mining results and do not compromise sensitive knowledge related to individual privacy or competitive advantage in business. Sensitive association rule hiding is an important issue in privacy-preserving data mining. The aim of association rule hiding is to minimize the side effects on the sanitized database, which means reducing the number of missing non-sensitive rules and the number of generated ghost rules. Current methods for hiding sensitive rules cause side effects and data loss. In this paper, we introduce a new distortion-based method to hide sensitive rules. The method determines critical transactions based on the number of non-sensitive maximal frequent itemsets that contain at least one item in the consequent of the sensitive rule, since these itemsets can be directly affected by the modified transactions. Using this set, the number of non-sensitive itemsets that need to be considered is reduced dramatically. We compute the smallest number of transactions to modify in advance to minimize the damage to the database. Comparative experiments on real datasets show that the proposed method achieves better results than other methods, with fewer side effects and less data loss.
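
The side effects mentioned above can be expressed as simple set differences between the rules mined before and after sanitization. The Python sketch below only illustrates those measures, not the authors' hiding method; the rule encoding as (antecedent, consequent) string pairs, the example rules, and the function name `side_effects` are all assumptions.

```python
def side_effects(rules_before, rules_after, sensitive):
    """Compare rule sets mined from the original and the sanitized database.

    Returns three sets:
      hiding_failures: sensitive rules that can still be mined after sanitization
      lost_rules:      non-sensitive rules that were mined before but are now missing
      ghost_rules:     rules that appear only after sanitization
    """
    hiding_failures = sensitive & rules_after
    lost_rules = (rules_before - sensitive) - rules_after
    ghost_rules = rules_after - rules_before
    return hiding_failures, lost_rules, ghost_rules

# Hypothetical rule sets, each rule written as (antecedent, consequent).
rules_before = {("a", "b"), ("b", "c"), ("a", "c")}
sensitive = {("a", "b")}
rules_after = {("b", "c"), ("c", "d")}

hiding_failures, lost_rules, ghost_rules = side_effects(
    rules_before, rules_after, sensitive
)
print(hiding_failures, lost_rules, ghost_rules)
```

In this made-up example the sensitive rule is hidden successfully, but one non-sensitive rule is lost and one ghost rule is created; a hiding method aims to keep the last two sets as small as possible.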


Author(s):  
Mohamed-Bachir Belaid ◽  
Christian Bessiere ◽  
Nadjib Lazaar

Frequent itemset mining is one of the most studied tasks in knowledge discovery. It is often reduced to mining the positive border of frequent itemsets, i.e. maximal frequent itemsets. Infrequent itemset mining, on the other hand, can be reduced to mining the negative border, i.e. minimal infrequent itemsets. We propose a generic framework based on constraint programming to mine both borders of frequent itemsets. One can easily decide which border to mine by setting a simple parameter. For this, we introduce two new global constraints, FREQUENTSUBS and INFREQUENTSUPERS, with complete polynomial propagators. We then consider the problem of mining borders with additional constraints. We prove that this problem is coNP-hard, ruling out the hope for the existence of a single CSP solving this problem (unless coNP ⊆ NP).
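
The two borders can be made concrete with a small brute-force sketch in Python. It is not the constraint programming framework described above and does not model the FREQUENTSUBS or INFREQUENTSUPERS constraints; the toy data, threshold, and function names are assumptions for illustration only.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def borders(transactions, min_support):
    """Return (positive_border, negative_border) by exhaustive enumeration."""
    items = sorted(set().union(*transactions))
    all_sets = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
    ]
    frequent = {s for s in all_sets if support(s, transactions) >= min_support}
    # Positive border: frequent itemsets with no frequent proper superset.
    positive = {s for s in frequent if not any(s < g for g in frequent)}
    # Negative border: infrequent itemsets whose proper subsets are all frequent.
    # Checking subsets one item smaller is enough; a singleton's only proper
    # subset is the empty set, which is treated as frequent here.
    negative = {
        s for s in all_sets
        if s not in frequent
        and all(len(s) == 1 or (s - {x}) in frequent for x in s)
    }
    return positive, negative

# Hypothetical toy database; absolute support threshold of 3 transactions.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
)]
positive, negative = borders(transactions, min_support=3)
print(sorted(sorted(s) for s in positive))
print(sorted(sorted(s) for s in negative))
```

On this input the positive border is {a, b}, {a, c}, {b, c} and the negative border is {d} and {a, b, c}. Every itemset is then either a subset of some positive-border element (frequent) or a superset of some negative-border element (infrequent).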

