FCILINK: Mining Frequent Closed Itemsets Based on a Link Structure between Transactions

2005 ◽  
Vol 04 (04) ◽  
pp. 257-267
Author(s):  
Kyong Rok Han ◽  
Jae Yearn Kim

The problem of discovering association rules between items in a database is an emerging area of research. Its goal is to extract significant patterns or interesting rules from large databases. Recent studies of mining association rules have proposed a closure mechanism. It is no longer necessary to mine the set of all of the frequent itemsets and their association rules. Rather, it is sufficient to mine the frequent closed itemsets and their corresponding rules. In the past, a number of algorithms for mining frequent closed itemsets have been based on items. In this paper, we use the transaction itself for mining frequent closed itemsets. An efficient algorithm called FCILINK is proposed that is based on a link structure between transactions. A given database is scanned once and then a much smaller sub-database is scanned twice. Our experimental results show that our algorithm is faster than previously proposed methods. Furthermore, our approach is significantly more efficient for dense databases.
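
The closure mechanism mentioned above can be illustrated with a small sketch. It is not the FCILINK algorithm or its link structure; it only shows the standard definition that the closure of an itemset is the set of items shared by every transaction containing it, and that an itemset is closed when it equals its own closure. The function name `closure` and the toy database are assumptions made for illustration.

```python
def closure(itemset, transactions):
    """Return the items common to every transaction that contains `itemset`.

    `itemset` and each transaction are frozensets. Returns None when no
    transaction contains the itemset (an assumption of this sketch).
    """
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None
    return frozenset.intersection(*covering)


# Hypothetical toy database of four transactions.
transactions = [
    frozenset("abc"),
    frozenset("abcd"),
    frozenset("bd"),
    frozenset("abd"),
]

# {a} occurs in abc, abcd and abd, which all also contain b,
# so {a} is not closed while {a, b} is.
print(sorted(closure(frozenset("a"), transactions)))   # ['a', 'b']
print(sorted(closure(frozenset("ab"), transactions)))  # ['a', 'b']
```

Because {a} and {a, b} occur in exactly the same transactions, reporting only {a, b} loses no support information, which is why mining closed itemsets can suffice.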

2005 ◽  
Vol 1 (3) ◽  
pp. 129-135
Author(s):  
Jun Luo ◽  
Sanguthevar Rajasekaran

Association rule mining is an important data mining problem that has been studied extensively. In this paper, a simple but Fast algorithm for Intersecting attribute lists using hash Tables (FIT) is presented. FIT is designed for efficiently computing all the frequent itemsets in large databases. It uses an idea similar to Eclat but performs much better than Eclat for two reasons: 1) FIT makes fewer comparisons in total for each intersection operation between two attribute lists, and 2) FIT significantly reduces the total number of intersection operations. Our experimental results demonstrate that FIT performs much better than the Eclat and Apriori algorithms.
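
As a rough illustration of the Eclat-style idea referred to above, the sketch below stores, for each item, the set of transaction ids in which it occurs and obtains the support of an itemset by intersecting those sets. It is not the FIT algorithm; Python's built-in `set` (itself a hash table) merely stands in for the hash tables named in the abstract, and the helper names `build_tidlists` and `support` are assumptions.

```python
from collections import defaultdict


def build_tidlists(transactions):
    """Map each item to the set of transaction ids that contain it."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)
    return dict(tidlists)


def support(itemset, tidlists):
    """Count the transactions containing every item of a non-empty itemset."""
    if not itemset:
        raise ValueError("itemset must be non-empty")
    # Start from the shortest list so each intersection probes as few ids as possible.
    lists = sorted((tidlists.get(item, set()) for item in itemset), key=len)
    result = set(lists[0])
    for other in lists[1:]:
        result &= other
        if not result:  # no transaction can match any more
            break
    return len(result)


transactions = [{"a", "b", "c"}, {"a", "b", "c", "d"}, {"b", "d"}, {"a", "b", "d"}]
tidlists = build_tidlists(transactions)
print(support({"a", "b"}, tidlists))  # 3
print(support({"c", "d"}, tidlists))  # 1
```

Sorting by length and stopping on an empty result are two simple ways to cut down the comparisons per intersection; the specific savings claimed for FIT are not reproduced here.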


2017 ◽  
Vol 26 (1) ◽  
pp. 69-85
Author(s):  
Mohammed M. Fouad ◽  
Mostafa G.M. Mostafa ◽  
Abdulfattah S. Mashat ◽  
Tarek F. Gharib

Association rules provide important knowledge that can be extracted from transactional databases. Owing to the massive exchange of information nowadays, databases become dynamic and change rapidly and periodically: new transactions are added to the database and/or old transactions are updated or removed from the database. Incremental mining was introduced to overcome the problem of maintaining previously generated association rules in dynamic databases. In this paper, we propose an efficient algorithm (IMIDB) for incremental itemset mining in large databases. The algorithm utilizes the trie data structure for indexing dynamic database transactions. Performance comparison of the proposed algorithm to recently cited algorithms shows that a significant improvement of about two orders of magnitude is achieved by our algorithm. Also, the proposed algorithm exhibits linear scalability with respect to database size.
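
The abstract does not describe how IMIDB lays out its trie, so the following is only a generic sketch of indexing transactions in a trie that tolerates additions and removals. Each transaction is stored as a path of its sorted items, and every node counts the transactions passing through it. The class and method names are assumptions.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # item -> TrieNode
        self.count = 0      # number of transactions whose path passes through this node


class TransactionTrie:
    """Index transactions as sorted item paths with per-node counters."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, transaction, delta=1):
        node = self.root
        node.count += delta
        for item in sorted(transaction):
            node = node.children.setdefault(item, TrieNode())
            node.count += delta

    def remove(self, transaction):
        # Assumes the transaction was added earlier; counters are simply decremented.
        self.add(transaction, delta=-1)

    def prefix_count(self, prefix):
        """Count stored transactions whose sorted item sequence starts with `prefix`."""
        node = self.root
        for item in sorted(prefix):
            node = node.children.get(item)
            if node is None:
                return 0
        return node.count


trie = TransactionTrie()
for t in ("abc", "abd", "bd"):
    trie.add(t)
print(trie.prefix_count("ab"))  # 2
trie.remove("abd")
print(trie.prefix_count("ab"))  # 1
```

Updating a database then touches only the path of the changed transaction rather than requiring a full rescan, which is the general appeal of such an index for incremental mining.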


2011 ◽  
Vol 145 ◽  
pp. 292-296
Author(s):  
Lee Wen Huang

Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data in databases. Mining closed large itemsets extends the mining of association rules: it aims to find the necessary subset of large itemsets that is representative of all large itemsets. In this paper, we design a hybrid approach that takes the characteristics of the data into account to mine the closed large itemsets efficiently. Two features of market basket analysis are considered: the number of items is large, and the number of associated items for each item is small. By combining the cut-point method and the hash concept, the new algorithm can find the closed large itemsets efficiently. The simulation results show that the new algorithm outperforms the FP-CLOSE algorithm in both execution time and storage space.


2009 ◽  
Vol 12 (11) ◽  
pp. 49-56
Author(s):  
Bac Hoai Le ◽  
Bay Dinh Vo

In traditional association rule mining, finding all rules in a database that satisfy minSup and minConf becomes problematic when the number of frequent itemsets is large. It is therefore necessary to have a method that mines fewer rules which still cover all the rules produced by the traditional method. One such approach is the mining of essential rules: it keeps only the rules whose left-hand side is minimal and whose right-hand side is maximal (with respect to the parent-child relationship). In this paper, we propose a new algorithm for mining the essential rules from the lattice of frequent closed itemsets. Using the parent-child relationships already present in the lattice reduces the cost of checking those relationships and thus the time needed to mine the rules.


2019 ◽  
Vol 8 (2) ◽  
pp. 3885-3889

Closed itemsets are frequent itemsets that uniquely determine the exact frequency of all frequent itemsets. They reduce the massive output to a smaller magnitude without redundancy. In this paper, we present PSS-MCI, an efficient candidate-generation-based approach for mining all closed itemsets. It enumerates closed itemsets using a hash tree, candidate generation, and superset and subset checking. It uses a partition-based strategy to avoid unnecessary computation for itemsets that are not useful. Using an efficient algorithm, it determines all closed itemsets from a single scan over the database. However, several unnecessary itemsets are hashed into the buckets. To overcome this limitation, heuristics are incorporated into PSS-MCI. Empirical evaluation and results show that PSS-MCI outperforms all candidate-generation-based and other approaches. Further, PSS-MCI explores all closed itemsets.
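
The claim that closed itemsets determine the frequency of every frequent itemset can be checked on a toy example. The sketch below is a naive superset check, not PSS-MCI (no hash tree, partitioning, or heuristics): it keeps an itemset only if no proper superset has the same support, then recovers the support of any frequent itemset as the largest support among its closed supersets. The function names and the support table are assumptions.

```python
def closed_itemsets(supports):
    """Keep itemsets that have no proper superset with the same support."""
    closed = {}
    for itemset, sup in supports.items():
        has_equal_superset = any(
            itemset < other and sup == other_sup
            for other, other_sup in supports.items()
        )
        if not has_equal_superset:
            closed[itemset] = sup
    return closed


def support_from_closed(itemset, closed):
    """Recover a frequent itemset's support from the closed itemsets alone."""
    candidates = [sup for c, sup in closed.items() if itemset <= c]
    return max(candidates) if candidates else None


# Hypothetical supports of all itemsets with support >= 2 in the toy
# database {abc, abcd, bd, abd}.
supports = {
    frozenset(k): v
    for k, v in {
        "a": 3, "b": 4, "c": 2, "d": 3,
        "ab": 3, "ac": 2, "ad": 2, "bc": 2, "bd": 3,
        "abc": 2, "abd": 2,
    }.items()
}

closed = closed_itemsets(supports)
print(len(supports), len(closed))                     # 11 5
print(support_from_closed(frozenset("ad"), closed))   # 2
```

Here eleven frequent itemsets collapse to five closed ones, and the supports of the other six can still be read back exactly.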


2013 ◽  
Vol 411-414 ◽  
pp. 386-389 ◽  
Author(s):  
Tian Tian Xu ◽  
Xiang Jun Dong

Negative frequent itemsets (NFIS) like (a1a2¬a3a4) have played important roles in real applications because we can mine valuable negative association rules from them. In one of our previous works, we proposed a method, named e-NFIS, to mine NFIS from positive frequent itemsets (PFIS). However, e-NFIS uses only a single minimum support, which implicitly assumes that all items in the database are of the same nature or of similar frequencies. This is often not the case in real-life applications, so many methods for mining frequent itemsets with multiple minimum supports have been proposed. These methods allow users to assign different minimum supports to different items, but they mine only PFIS and do not consider negative ones. In this paper, we therefore propose a new method, named e-msNFIS, to mine NFIS from PFIS based on multiple minimum supports. e-msNFIS contains three steps: 1) using existing methods to mine PFIS with multiple minimum supports; 2) using the same method as in e-NFIS to generate NCIS from the PFIS obtained in step 1; 3) calculating the support of these NCIS using only the support of PFIS and then obtaining NFIS. Experimental results show that e-msNFIS is efficient.
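
The abstract does not give the formula used in step 3, so the sketch below shows only the general inclusion-exclusion identity by which the support of an itemset with negated items follows from positive supports alone, for example sup(a¬c) = sup(a) − sup(ac). The function name `negative_support` and the support table are assumptions, and the sketch presumes that the support of every required positive itemset is available.

```python
from itertools import combinations


def negative_support(positive, negated, sup):
    """Support of (positive items present, negated items absent).

    Applies inclusion-exclusion over subsets S of the negated items:
    the sum of (-1)^|S| * sup(positive ∪ S).
    `sup` maps frozensets of items to positive supports.
    """
    base = frozenset(positive)
    negated = tuple(negated)
    total = 0
    for k in range(len(negated) + 1):
        for subset in combinations(negated, k):
            total += (-1) ** k * sup[base | frozenset(subset)]
    return total


# Hypothetical positive supports from the toy database {abc, abcd, bd, abd}.
sup = {
    frozenset("a"): 3, frozenset("b"): 4,
    frozenset("ab"): 3, frozenset("ac"): 2, frozenset("bc"): 2,
    frozenset("abc"): 2,
}

print(negative_support("a", "c", sup))   # 1  (only abd has a without c)
print(negative_support("b", "ac", sup))  # 1  (only bd has b without a or c)
```

No extra database scan is needed for the negative supports, which matches the motivation stated in the abstract.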


2010 ◽  
Vol 09 (06) ◽  
pp. 873-888 ◽  
Author(s):  
TZUNG-PEI HONG ◽  
CHING-YAO WANG ◽  
CHUN-WEI LIN

Mining knowledge from large databases has become a critical task for organizations. Managers commonly use the obtained sequential patterns to make decisions. In the past, databases were usually assumed to be static. In real-world applications, however, transactions may be updated. In this paper, a maintenance algorithm for rapidly updating sequential patterns for real-time decision making is proposed. The proposed algorithm utilizes previously discovered large sequences in the maintenance process, thus greatly reducing the number of database rescans and improving performance. Experimental results verify the performance of the proposed approach. The proposed algorithm provides real-time knowledge that can be used for decision making.


2013 ◽  
Vol 333-335 ◽  
pp. 1247-1250 ◽  
Author(s):  
Na Xin Peng

To address the problem that most weighted association rule algorithms lack anti-monotonicity, this paper presents a weighted support-confidence framework that supports anti-monotonicity. On this basis, a Boolean weighted association rule algorithm and a weighted fuzzy association rule algorithm are presented, both of which use the pruning strategy of the Apriori algorithm to improve the efficiency of generating frequent itemsets. Experimental results show that both algorithms perform well.


Author(s):  
Yongshun Gong ◽  
Tiantian Xu ◽  
Xiangjun Dong ◽  
Guohua Lv

Negative sequential patterns (NSPs), which focus on nonoccurring but interesting behaviors (e.g. missing consumption records), provide a special perspective for analyzing sequential patterns. So far, very few methods have been proposed to solve the NSP mining problem, and these methods mine NSPs only from positive sequential patterns (PSPs). However, just as many useful negative association rules are mined from infrequent itemsets, many meaningful NSPs can also be found from infrequent positive sequences (IPSs). The challenge of mining NSPs from IPSs is how to constrain which IPSs can be used during NSP mining because, without constraints, the number of IPSs would be too large to handle. In this study, we first propose a strategy to constrain which IPSs are available for mining NSPs. Then we give a storage optimization method to hold this IPS information. Finally, an efficient algorithm called Efficient mining Negative Sequential Pattern from both Frequent and Infrequent positive sequential patterns (e-NSPFI) is proposed for mining NSPs. The experimental results show that e-NSPFI can efficiently find many more interesting negative patterns than e-NSP.

