FCILINK: Mining Frequent Closed Itemsets Based on a Link Structure between Transactions

The problem of discovering association rules between items in a database is an emerging area of research. Its goal is to extract significant patterns or interesting rules from large databases. Recent studies of mining association rules have proposed a closure mechanism. It is no longer necessary to mine the set of all of the frequent itemsets and their association rules. Rather, it is sufficient to mine the frequent closed itemsets and their corresponding rules. In the past, a number of algorithms for mining frequent closed itemsets have been based on items. In this paper, we use the transaction itself for mining frequent closed itemsets. An efficient algorithm called FCILINK is proposed that is based on a link structure between transactions. A given database is scanned once and then a much smaller sub-database is scanned twice. Our experimental results show that our algorithm is faster than previously proposed methods. Furthermore, our approach is significantly more efficient for dense databases.
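
To make the closure idea concrete: an itemset is closed when it equals the intersection of all transactions that contain it. The Python sketch below applies that definition by brute force. It is not the FCILINK algorithm and does not model its link structure; the toy database `db` and the function names are assumptions made for illustration.

```python
from itertools import combinations

def closure(itemset, transactions):
    """Intersect all transactions containing `itemset`; return (closure, support)."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None, 0
    return frozenset.intersection(*covering), len(covering)

def frequent_closed_itemsets(transactions, min_support):
    """Brute force: close every candidate itemset and keep the frequent closures."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    closed = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            closed_set, support = closure(frozenset(candidate), transactions)
            if support >= min_support:
                # Different candidates may share one closure; the dict deduplicates them.
                closed[closed_set] = support
    return closed

# Hypothetical toy database of four transactions.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
result = frequent_closed_itemsets(db, min_support=2)
```

On this toy database, seven itemsets are frequent at a minimum support of 2, but only four of them are closed, which illustrates why mining closed itemsets yields a smaller result set.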

FIT: A Fast Algorithm for Discovering Frequent Itemsets in Large Databases

Computing Letters ◽

10.1163/1574040054861285 ◽

2005 ◽

Vol 1 (3) ◽

pp. 129-135

Author(s):

Jun Luo ◽

Sanguthevar Rajasekaran

Keyword(s):

Data Mining ◽

Association Rules ◽

Fast Algorithm ◽

Frequent Itemsets ◽

Experimental Results ◽

Important Data ◽

Computational Performance ◽

Large Databases ◽

Intersection Operation ◽

Better Than

Association rule mining is an important data mining problem that has been studied extensively. In this paper, a simple but Fast algorithm for Intersecting attribute lists using hash Tables (FIT) is presented. FIT is designed to compute all the frequent itemsets in large databases efficiently. It uses an idea similar to Eclat but has much better computational performance than Eclat for two reasons: 1) FIT makes fewer comparisons for each intersection operation between two attribute lists, and 2) FIT significantly reduces the total number of intersection operations. Our experimental results demonstrate that FIT performs much better than the Eclat and Apriori algorithms.
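
For readers unfamiliar with the Eclat-style idea mentioned in this abstract, the following Python sketch shows vertical mining: each item keeps the set of transaction ids that contain it, and the support of a larger itemset is the size of the intersection of those sets. Python's built-in `set` is hash-based, so it stands in here for a hash-table intersection. This is a plain Eclat-like sketch, not FIT itself; the function names and the toy database are assumptions.

```python
from collections import defaultdict

def build_tidlists(transactions):
    """Map each item to the set of ids of the transactions containing it."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)
    return tidlists

def eclat(prefix, candidates, min_support, out):
    """Depth-first search; `candidates` holds (item, tidset) pairs frequent under `prefix`."""
    for i, (item, tids) in enumerate(candidates):
        itemset = prefix + (item,)
        out[itemset] = len(tids)
        extensions = []
        for other, other_tids in candidates[i + 1:]:
            shared = tids & other_tids  # hash-based set intersection
            if len(shared) >= min_support:
                extensions.append((other, shared))
        if extensions:
            eclat(itemset, extensions, min_support, out)

def mine(transactions, min_support):
    """Return a dict mapping each frequent itemset (sorted tuple) to its support."""
    tidlists = build_tidlists(transactions)
    start = sorted(
        ((item, tids) for item, tids in tidlists.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    out = {}
    eclat((), start, min_support, out)
    return out

# Hypothetical toy database.
frequent = mine([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}], min_support=2)
```

Each recursive call intersects only tid-sets that are already frequent under the current prefix, so the intersections shrink as the itemsets grow.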

IMIDB: An Algorithm for Indexed Mining of Incremental Databases

Journal of Intelligent Systems ◽

10.1515/jisys-2015-0107 ◽

2017 ◽

Vol 26 (1) ◽

pp. 69-85

Author(s):

Mohammed M. Fouad ◽

Mostafa G.M. Mostafa ◽

Abdulfattah S. Mashat ◽

Tarek F. Gharib

Keyword(s):

Data Structure ◽

Association Rules ◽

Efficient Algorithm ◽

Performance Comparison ◽

Incremental Mining ◽

Itemset Mining ◽

Large Databases ◽

Transactional Databases ◽

Dynamic Databases ◽

Database Size

Association rules provide important knowledge that can be extracted from transactional databases. Owing to the massive exchange of information nowadays, databases become dynamic and change rapidly and periodically: new transactions are added to the database and/or old transactions are updated or removed from the database. Incremental mining was introduced to overcome the problem of maintaining previously generated association rules in dynamic databases. In this paper, we propose an efficient algorithm (IMIDB) for incremental itemset mining in large databases. The algorithm utilizes the trie data structure for indexing dynamic database transactions. Performance comparison of the proposed algorithm to recently cited algorithms shows that a significant improvement of about two orders of magnitude is achieved by our algorithm. Also, the proposed algorithm exhibits linear scalability with respect to database size.
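
As a rough illustration of indexing transactions in a trie so that additions and removals do not require rebuilding the index, consider the Python sketch below. It is not the IMIDB data structure, whose details the abstract does not give; the class and method names are assumptions, and `remove` assumes the transaction was previously added.

```python
class TrieNode:
    def __init__(self):
        self.count = 0      # number of stored transactions passing through this node
        self.children = {}  # item -> TrieNode

class TransactionTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, transaction, delta=1):
        """Insert a transaction as a path of its items in sorted order."""
        node = self.root
        node.count += delta
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, TrieNode())
            node.count += delta

    def remove(self, transaction):
        """Undo a previous add; zero-count nodes are left in place for simplicity."""
        self.add(transaction, delta=-1)

    def support(self, itemset):
        """Count the stored transactions that contain every item of `itemset`."""
        return self._count(self.root, sorted(set(itemset)))

    def _count(self, node, remaining):
        if not remaining:
            return node.count
        target = remaining[0]
        total = 0
        for item, child in node.children.items():
            if item == target:
                total += self._count(child, remaining[1:])
            elif item < target:
                # Paths are sorted, so the target can still appear deeper on this branch.
                total += self._count(child, remaining)
        return total

# Hypothetical usage: index four transactions, then delete one.
trie = TransactionTrie()
for t in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["a", "b", "c"]):
    trie.add(t)
before = trie.support(["c"])
trie.remove(["a", "c"])
after = trie.support(["c"])
```

Because every update touches only the nodes on one path, the cost of an insertion or deletion depends on the transaction length rather than on the database size.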

A Hybrid Algorithm of Mining Closed Itemsets for Large Databases

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.145.292 ◽

2011 ◽

Vol 145 ◽

pp. 292-296

Author(s):

Lee Wen Huang

Keyword(s):

Data Mining ◽

Association Rules ◽

Execution Time ◽

Hybrid Algorithm ◽

Hybrid Approach ◽

Market Basket Analysis ◽

Market Basket ◽

Large Databases ◽

Closed Itemsets ◽

Simulation Results

Data mining is the process of nontrivial extraction of implicit, previously unknown, and potentially useful information from data in databases. Mining closed large itemsets extends association rule mining: it aims to find the subset of large itemsets that is representative of all large itemsets. In this paper, we design a hybrid approach that takes the characteristics of the data into account to mine closed large itemsets efficiently. Two features of market basket analysis are considered: the number of items is large, and the number of associated items for each item is small. By combining the cut-point method and the hash concept, the new algorithm can find the closed large itemsets efficiently. The simulation results show that the new algorithm outperforms the FP-CLOSE algorithm in both execution time and storage space.

MINING ESSENTIAL RULES USING FREQUENT CLOSED ITEMSETS LATTICE

Science and Technology Development Journal ◽

10.32508/stdj.v12i11.2311 ◽

2009 ◽

Vol 12 (11) ◽

pp. 49-56

Author(s):

Bac Hoai Le ◽

Bay Dinh Vo

Keyword(s):

Association Rules ◽

Frequent Itemsets ◽

Suitable Method ◽

Mining Method ◽

Parent Child Relationship ◽

Left Hand ◽

Child Relationship ◽

Closed Itemsets ◽

The Cost ◽

Parent Child

In traditional association rule mining, finding all rules in a database that satisfy minSup and minConf becomes problematic when the number of frequent itemsets is large. It is therefore necessary to have a method that mines fewer rules while still covering all rules produced by the traditional mining method. One such approach is mining essential rules: it keeps only the rules whose left-hand side is minimal and whose right-hand side is maximal (with respect to the parent-child relationship). In this paper, we propose a new algorithm for mining essential rules from the frequent closed itemsets lattice. We use the parent-child relationships stored in the lattice to reduce the cost of checking those relationships and thereby reduce the time needed to mine rules.

Mining Closed Item sets using Partition based Single Scan Algorithm

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.a1920.078219 ◽

2019 ◽

Vol 8 (2) ◽

pp. 3885-3889

Keyword(s):

Efficient Algorithm ◽

Empirical Evaluation ◽

Frequent Itemsets ◽

Frequent Item ◽

Closed Itemsets ◽

Frequent Item Sets

Closed itemsets are frequent itemsets that uniquely determine the exact frequency of all frequent itemsets. They reduce the massive output to a much smaller one without redundancy. In this paper, we present PSS-MCI, an efficient candidate-generation-based approach for mining all closed itemsets. It enumerates closed itemsets using a hash tree, candidate generation, and superset and subset checking. It uses a partition-based strategy to avoid unnecessary computation for itemsets that are not useful, and it determines all closed itemsets from a single scan over the database. However, several unnecessary itemsets are hashed into the buckets. To overcome this limitation, heuristics are incorporated into PSS-MCI. Empirical evaluation shows that PSS-MCI outperforms all candidate-generation-based and other approaches. Further, PSS-MCI finds all closed itemsets.
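
The superset checking mentioned in this abstract can be illustrated with a small Python sketch: given every frequent itemset and its support, an itemset is closed when no proper superset has the same support. This shows only that check, not PSS-MCI with its hash tree or partitioning; the function name and the toy supports are assumptions.

```python
def closed_only(supports):
    """Keep the itemsets that have no proper superset with an identical support.

    `supports` must map every frequent itemset (as a frozenset) to its support count.
    """
    closed = {}
    for itemset, count in supports.items():
        absorbed = any(
            itemset < other and count == other_count  # `<` tests for a proper subset
            for other, other_count in supports.items()
        )
        if not absorbed:
            closed[itemset] = count
    return closed

# Hypothetical supports of all frequent itemsets in a toy database.
all_frequent = {
    frozenset("a"): 4, frozenset("b"): 3, frozenset("c"): 3,
    frozenset("ab"): 3, frozenset("ac"): 3, frozenset("bc"): 2,
    frozenset("abc"): 2,
}
closed_sets = closed_only(all_frequent)
```

The support of a non-closed itemset can be read back from the closed ones: it equals the largest support among the closed supersets, for example 3 for `{b}` via `{a, b}`.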

E-MsNFIS: Efficient Negative Frequent Itemsets Mining Based on Multiple Minimum Supports

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.411-414.386 ◽

2013 ◽

Vol 411-414 ◽

pp. 386-389 ◽

Cited By ~ 1

Author(s):

Tian Tian Xu ◽

Xiang Jun Dong

Keyword(s):

Association Rules ◽

Real Life ◽

Frequent Itemsets ◽

Negative Association ◽

Experimental Results ◽

New Method ◽

Minimum Support ◽

Frequent Itemsets Mining ◽

Negative Association Rules ◽

Multiple Minimum Supports

Negative frequent itemsets (NFIS) such as (a1a2¬a3a4) play important roles in real applications because valuable negative association rules can be mined from them. In our previous work, we proposed a method named e-NFIS to mine NFIS from positive frequent itemsets (PFIS). However, e-NFIS uses only a single minimum support, which implicitly assumes that all items in the database are of the same nature or have similar frequencies. This is often not the case in real-life applications, so many methods for mining frequent itemsets with multiple minimum supports have been proposed. These methods allow users to assign different minimum supports to different items, but they mine only PFIS and do not consider negative ones. In this paper, we therefore propose a new method, named e-msNFIS, to mine NFIS from PFIS based on multiple minimum supports. e-msNFIS contains three steps: 1) using existing methods to mine PFIS with multiple minimum supports; 2) using the same method as in e-NFIS to generate NCIS from the PFIS obtained in step 1; 3) calculating the support of these NCIS using only the support of PFIS and then obtaining the NFIS. Experimental results show that e-msNFIS is efficient.
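
Step 3 relies on the fact that the support of an itemset with negated items can be derived from supports of positive itemsets alone. A standard way to do this is inclusion-exclusion, for example sup(a1a2¬a3a4) = sup(a1a2a4) − sup(a1a2a3a4). The Python sketch below implements the general identity; it is not claimed to be the exact formula used by e-NFIS or e-msNFIS. The helper names and the toy database are assumptions, and the sketch counts positive supports by scanning, whereas the abstract describes reusing supports of already mined PFIS.

```python
from itertools import combinations

def negative_support(positive, negated, support_of):
    """Support of (positive items present, negated items absent) via inclusion-exclusion.

    `support_of` is any callable returning the support of a positive itemset.
    """
    positive = frozenset(positive)
    negated = sorted(negated)
    total = 0
    for k in range(len(negated) + 1):
        for subset in combinations(negated, k):
            total += (-1) ** k * support_of(positive | frozenset(subset))
    return total

# Hypothetical toy database; supports are counted directly for demonstration only.
db = [
    {"a1", "a2", "a4"},
    {"a1", "a2", "a3", "a4"},
    {"a1", "a2", "a4"},
    {"a2", "a3"},
]

def count_support(itemset):
    return sum(1 for t in db if itemset <= t)

one_negated = negative_support({"a1", "a2", "a4"}, {"a3"}, count_support)
two_negated = negative_support({"a2"}, {"a1", "a3"}, count_support)
```

The number of terms doubles with each negated item, so this direct form is practical only when few items are negated.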

PROVIDING TIMELY UPDATED SEQUENTIAL PATTERNS IN DECISION MAKING

International Journal of Information Technology & Decision Making ◽

10.1142/s0219622010004147 ◽

2010 ◽

Vol 09 (06) ◽

pp. 873-888 ◽

Cited By ~ 4

Author(s):

TZUNG-PEI HONG ◽

CHING-YAO WANG ◽

CHUN-WEI LIN

Keyword(s):

Decision Making ◽

Real Time ◽

Real World ◽

Experimental Results ◽

Sequential Patterns ◽

The Past ◽

Large Databases ◽

Real World Applications ◽

Critical Task ◽

Maintenance Process

Mining knowledge from large databases has become a critical task for organizations. Managers commonly use the obtained sequential patterns to make decisions. In the past, databases were usually assumed to be static. In real-world applications, however, transactions may be updated. In this paper, a maintenance algorithm for rapidly updating sequential patterns for real-time decision making is proposed. The proposed algorithm utilizes previously discovered large sequences in the maintenance process, thus greatly reducing the number of database rescans and improving performance. Experimental results verify the performance of the proposed approach. The proposed algorithm provides real-time knowledge that can be used for decision making.

An Efficient Algorithm for Mining Frequent Itemsets in Large Databases

Applied Mathematics & Information Sciences ◽

10.18576/amis/130604 ◽

2019 ◽

Vol 13 (11) ◽

pp. 913-921

Keyword(s):

Efficient Algorithm ◽

Frequent Itemsets ◽

Large Databases ◽

Mining Frequent Itemsets

An Efficient Weighted Association Rules Mining Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.333-335.1247 ◽

2013 ◽

Vol 333-335 ◽

pp. 1247-1250 ◽

Cited By ~ 1

Author(s):

Na Xin Peng

Keyword(s):

Association Rules ◽

Frequent Itemsets ◽

Experimental Results ◽

Apriori Algorithm ◽

Association Rules Mining ◽

Fuzzy Association Rules ◽

Weighted Association Rules ◽

Mining Algorithm ◽

Pruning Strategy

To address the problem that most weighted association rule algorithms lack the anti-monotonicity property, this paper presents a weighted support-confidence framework that preserves anti-monotonicity. On this basis, a Boolean weighted association rule algorithm and a weighted fuzzy association rule algorithm are presented, both of which use the pruning strategy of the Apriori algorithm to generate frequent itemsets more efficiently. Experimental results show that both algorithms perform well.

e-NSPFI: Efficient Mining Negative Sequential Pattern from Both Frequent and Infrequent Positive Sequential Patterns

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001417500021 ◽

2017 ◽

Vol 31 (02) ◽

pp. 1750002 ◽

Cited By ~ 8

Author(s):

Yongshun Gong ◽

Tiantian Xu ◽

Xiangjun Dong ◽

Guohua Lv

Keyword(s):

Association Rules ◽

Efficient Algorithm ◽

Optimization Method ◽

Negative Association ◽

Experimental Results ◽

Sequential Pattern ◽

Sequential Patterns ◽

Negative Association Rules ◽

Storage Optimization

Negative sequential patterns (NSPs), which focus on nonoccurring but interesting behaviors (e.g. missing consumption records), provide a special perspective for analyzing sequential patterns. So far, very few methods have been proposed to address the NSP mining problem, and these methods mine NSPs only from positive sequential patterns (PSPs). However, just as many useful negative association rules are mined from infrequent itemsets, many meaningful NSPs can also be found from infrequent positive sequences (IPSs). The challenge of mining NSPs from IPSs is deciding which IPSs may be used during the mining process because, without constraints, the number of IPSs would be too large to handle. In this study, we first propose a strategy to constrain which IPSs are available for mining NSPs. We then give a storage optimization method to hold this IPS information. Finally, an efficient algorithm called Efficient mining Negative Sequential Pattern from both Frequent and Infrequent positive sequential patterns (e-NSPFI) is proposed for mining NSPs. The experimental results show that e-NSPFI can efficiently find many more interesting negative patterns than e-NSP.
