FIT: A Fast Algorithm for Discovering Frequent Itemsets in Large Databases

With the availability of inexpensive storage and progress in data collection tools, many organizations have created large databases of business and scientific data, which creates a pressing need and great opportunities for mining interesting knowledge from data. Mining association rules is an important topic in data mining research. This paper studies an algorithm for mining frequent itemsets based on a recognizable matrix and an algorithm for mining association rules based on an improved measure system. The method is then used to mine association rules from a students’ data table under Visual FoxPro 6.0.


FCILINK: Mining Frequent Closed Itemsets Based on a Link Structure between Transactions

Journal of Information & Knowledge Management ◽ 10.1142/s0219649205001213 ◽ 2005 ◽ Vol 04 (04) ◽ pp. 257-267

Author(s): Kyong Rok Han ◽ Jae Yearn Kim

Keyword(s): Association Rules ◽ Efficient Algorithm ◽ Frequent Itemsets ◽ Experimental Results ◽ Link Structure ◽ The Past ◽ Large Databases ◽ Closure Mechanism ◽ Closed Itemsets ◽ Significant Patterns

The problem of discovering association rules between items in a database is an emerging area of research. Its goal is to extract significant patterns or interesting rules from large databases. Recent studies of mining association rules have proposed a closure mechanism. It is no longer necessary to mine the set of all of the frequent itemsets and their association rules. Rather, it is sufficient to mine the frequent closed itemsets and their corresponding rules. In the past, a number of algorithms for mining frequent closed itemsets have been based on items. In this paper, we use the transaction itself for mining frequent closed itemsets. An efficient algorithm called FCILINK is proposed that is based on a link structure between transactions. A given database is scanned once and then a much smaller sub-database is scanned twice. Our experimental results show that our algorithm is faster than previously proposed methods. Furthermore, our approach is significantly more efficient for dense databases.
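To make the closure idea concrete, here is a minimal brute-force sketch in Python. It is not FCILINK and does not use a link structure between transactions; it only illustrates the definition that a frequent itemset is closed when no proper superset has the same support. The toy transactions and the function names are assumptions made for illustration.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def frequent_closed_itemsets(transactions, min_support):
    """Brute force: enumerate every itemset, keep the frequent ones, then
    keep those with no proper superset of equal support (the closed ones)."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                frequent[candidate] = count
    # A superset with equal support is itself frequent, so it is enough
    # to compare each frequent itemset against the other frequent ones.
    closed = {
        itemset: count
        for itemset, count in frequent.items()
        if not any(itemset < other and count == frequent[other] for other in frequent)
    }
    return frequent, closed

# Hypothetical toy database with four transactions.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"c"},
)]
frequent, closed = frequent_closed_itemsets(transactions, min_support=2)
print(len(frequent), len(closed))  # prints 7 3
```

On this toy input, seven frequent itemsets reduce to three closed ones ({c}, {a, b}, and {a, b, c}), from which the support of every other frequent itemset can be recovered. The enumeration is exponential in the number of items, so it is only suitable for small examples.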


Association Rule Mining on Spambase Dataset using Tanagra

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.c8022.049620 ◽ 2020 ◽ Vol 9 (6) ◽ pp. 890-894

Keyword(s): Data Mining ◽ Association Rules ◽ Execution Time ◽ Association Rule ◽ Association Rule Mining ◽ Frequent Itemsets ◽ Rule Mining ◽ Huge Amount ◽ Memory Space ◽ Better Than

A huge amount of data is generated on the internet every minute. This data is of no use unless we can extract useful information from it. Data mining is the process of extracting useful information or knowledge from this data that can then be used for various purposes. Discovering association rules is one of the most important data mining tasks. Association rules take the IF-THEN form. The leftmost part of the rule, the IF part, is called the antecedent and defines the condition; the rightmost part, the THEN part, is called the consequent and defines the result. In this paper, we present an overview and comparison of the Apriori, Apriori PT and Frequent Itemsets algorithms of the association component in the Tanagra tool. We analyzed performance based on execution time and memory used for different numbers of instances, support values and rule lengths on the Spambase dataset. The results show that when the support value is increased, Apriori PT takes less execution time and Apriori takes less memory space. When the number of instances is reduced, Frequent Itemsets performs best in both memory and execution time. When the rule length is increased, the Apriori algorithm performs better than Apriori PT and Frequent Itemsets.
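For readers unfamiliar with Apriori, the following is a minimal Python sketch of the textbook level-wise search, followed by the confidence of one rule with an antecedent and a consequent. It is not Tanagra's implementation, and it does not cover Apriori PT; the toy transactions, item names, and the `apriori` function are assumptions made for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise search. Frequent k-itemsets are joined into (k+1)-item
    candidates; a candidate is dropped before counting if any of its
    k-item subsets is not frequent."""
    n = len(transactions)
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # One pass over the data to measure the support of each candidate.
        level = {}
        for candidate in current:
            sup = sum(1 for t in transactions if candidate <= t) / n
            if sup >= min_support:
                level[candidate] = sup
        frequent.update(level)
        # Join step: unions of two frequent k-itemsets that have k + 1 items.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step: every k-item subset must itself be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k))
        }
        k += 1
    return frequent

# Hypothetical toy data; the item names are illustrative only.
transactions = [
    frozenset({"free", "money"}),
    frozenset({"free", "money", "offer"}),
    frozenset({"free", "offer"}),
    frozenset({"meeting"}),
]
frequent = apriori(transactions, min_support=0.5)

# Rule "IF money THEN free": confidence = sup({money, free}) / sup({money}).
rule_confidence = frequent[frozenset({"money", "free"})] / frequent[frozenset({"money"})]
print(len(frequent), rule_confidence)  # prints 5 1.0
```

Raising `min_support` shrinks each level and so reduces the number of candidates counted, which is one reason execution time in such comparisons depends on the support threshold.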


Reasoning about Frequent Patterns with Negation

Encyclopedia of Data Warehousing and Mining ◽ 10.4018/978-1-59140-557-3.ch177 ◽ 2011 ◽ pp. 941-946 ◽ Cited By ~ 3

Author(s): Marzena Kryszkiewicz

Keyword(s): Data Mining ◽ Association Rules ◽ Association Rule ◽ White Wine ◽ Frequent Patterns ◽ Sales Managers ◽ Important Data ◽ Large Databases ◽ Transaction Database ◽ Significant Patterns

Discovering frequent patterns in large databases is an important data mining problem. The problem was introduced in (Agrawal, Imielinski, & Swami, 1993) for a sales transaction database. Frequent patterns were defined there as sets of items that are purchased together frequently. Frequent patterns are commonly used for building association rules. For example, an association rule may state that 80% of customers who buy fish also buy white wine. This rule is derivable from the fact that fish occurs in 5% of sales transactions and set {fish, white wine} occurs in 4% of transactions. Patterns and association rules can be generalized by admitting negation. A sample association rule with negation could state that 75% of customers who buy coke also buy chips and neither beer nor milk. The knowledge of this kind is important not only for sales managers, but also in medical areas (Tsumoto, 2002). Admitting negation in patterns usually results in an abundance of mined patterns, which makes analysis of the discovered knowledge infeasible. It is thus preferable to discover and store a possibly small fraction of patterns, from which one can derive all other significant patterns when required. In this chapter, we introduce first lossless representations of frequent patterns with negation.
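The arithmetic behind the two sample rules can be checked with a short Python sketch. The fish and white wine figures come from the abstract; the supports used for the coke rule are hypothetical values chosen so that the result matches the quoted 75%, and the function names are illustrative. The support of a pattern with negated items is derived from supports of positive patterns by inclusion-exclusion.

```python
from fractions import Fraction

def confidence(sup_antecedent, sup_rule):
    """Confidence of X -> Y: sup(X and Y) / sup(X)."""
    return sup_rule / sup_antecedent

# Figures from the text: fish occurs in 5% of transactions,
# {fish, white wine} in 4%.
fish_rule_conf = confidence(Fraction(5, 100), Fraction(4, 100))

def support_excluding_two(sup_x, sup_x_a, sup_x_b, sup_x_a_b):
    """sup(X and not a and not b) by inclusion-exclusion:
    sup(X) - sup(X, a) - sup(X, b) + sup(X, a, b)."""
    return sup_x - sup_x_a - sup_x_b + sup_x_a_b

# Hypothetical supports (not from the source) for the rule
# coke -> chips and neither beer nor milk.
sup_coke = Fraction(20, 100)
sup_coke_chips = Fraction(18, 100)
sup_coke_chips_beer = Fraction(2, 100)
sup_coke_chips_milk = Fraction(2, 100)
sup_coke_chips_beer_milk = Fraction(1, 100)

sup_negated = support_excluding_two(
    sup_coke_chips, sup_coke_chips_beer, sup_coke_chips_milk, sup_coke_chips_beer_milk
)
negated_rule_conf = confidence(sup_coke, sup_negated)
print(fish_rule_conf, negated_rule_conf)  # prints 4/5 3/4
```

Exact fractions are used instead of floats so that 4% / 5% comes out as exactly 4/5, that is, 80%.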


Research of Improved FP-Growth Algorithm in Association Rules Mining

Scientific Programming ◽ 10.1155/2015/910281 ◽ 2015 ◽ Vol 2015 ◽ pp. 1-6 ◽ Cited By ~ 10

Author(s): Yi Zeng ◽ Shiqun Yin ◽ Jiangyue Liu ◽ Miao Zhang

Keyword(s): Data Mining ◽ Association Rules ◽ Experimental Results ◽ Frequent Pattern ◽ Association Rules Mining ◽ Classical Algorithm ◽ Pattern Growth ◽ Data Volume ◽ Better Than

Association rules mining is an important technology in data mining. The FP-Growth (frequent-pattern growth) algorithm is a classical algorithm in association rules mining. However, FP-Growth needs to scan the database twice, which reduces the efficiency of the algorithm. Through the study of association rules mining and the FP-Growth algorithm, we worked out two improved algorithms: the Painting-Growth algorithm and the N (not) Painting-Growth algorithm, which removes the painting steps and achieves the same result in another way. We compared the two improved algorithms with the FP-Growth algorithm. Experimental results show that Painting-Growth algorithm is more than 1050 and N Painting-Growth algorithm is less than 10000 in data volume; the performance of the two improved algorithms is better than that of the FP-Growth algorithm.
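The two database scans of classical FP-Growth can be sketched as follows in Python. This shows only the standard FP-tree construction (scan 1 counts items, scan 2 inserts each filtered and reordered transaction into a prefix tree); it does not mine the tree and is not the Painting-Growth or N Painting-Growth algorithm. The `Node` class, `build_fp_tree` function, and toy transactions are assumptions made for illustration.

```python
from collections import Counter

class Node:
    """One FP-tree node: an item, its count, and child nodes keyed by item."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_count):
    # Scan 1: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    keep = {item for item, c in counts.items() if c >= min_count}

    # Scan 2: insert each transaction, keeping only frequent items and
    # ordering them by descending count (ties broken alphabetically) so
    # that common prefixes share tree nodes.
    root = Node(None)
    for t in transactions:
        ordered = sorted(
            (item for item in set(t) if item in keep),
            key=lambda item: (-counts[item], item),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, counts

# Hypothetical toy database.
transactions = [["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]]
root, counts = build_fp_tree(transactions, min_count=2)
b = root.children["b"]
print(b.count, b.children["a"].count, b.children["c"].count)  # prints 4 2 1
```

The second scan is needed because the insertion order depends on the global item counts, which are only known after the first scan finishes.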


Reasoning about Frequent Patterns with Negation

Encyclopedia of Data Warehousing and Mining, Second Edition ◽ 10.4018/978-1-60566-010-3.ch254 ◽ 2011 ◽ pp. 1667-1674

Author(s): Marzena Kryszkiewicz

Keyword(s): Data Mining ◽ Association Rules ◽ Association Rule ◽ White Wine ◽ Frequent Patterns ◽ Sales Managers ◽ Important Data ◽ Large Databases ◽ Transaction Database ◽ Significant Patterns

Discovering frequent patterns in large databases is an important data mining problem. The problem was introduced in (Agrawal, Imielinski & Swami, 1993) for a sales transaction database. Frequent patterns were defined there as sets of items that are purchased together frequently. Frequent patterns are commonly used for building association rules. For example, an association rule may state that 80% of customers who buy fish also buy white wine. This rule is derivable from the fact that fish occurs in 5% of sales transactions and set {fish, white wine} occurs in 4% of transactions. Patterns and association rules can be generalized by admitting negation. A sample association rule with negation could state that 75% of customers who buy coke also buy chips and neither beer nor milk. The knowledge of this kind is important not only for sales managers, but also in medical areas (Tsumoto, 2002). Admitting negation in patterns usually results in an abundance of mined patterns, which makes analysis of the discovered knowledge infeasible. It is thus preferable to discover and store a possibly small fraction of patterns, from which one can derive all other significant patterns when required. In this chapter, we introduce first lossless representations of frequent patterns with negation.


A Hybrid Algorithm of Mining Closed Itemsets for Large Databases

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.145.292 ◽ 2011 ◽ Vol 145 ◽ pp. 292-296

Author(s): Lee Wen Huang

Keyword(s): Data Mining ◽ Association Rules ◽ Execution Time ◽ Hybrid Algorithm ◽ Hybrid Approach ◽ Market Basket Analysis ◽ Market Basket ◽ Large Databases ◽ Closed Itemsets ◽ Simulation Results

Data mining is the process of nontrivial extraction of implicit, previously unknown, and potentially useful information from data in databases. Mining closed large itemsets extends the mining of association rules: it aims to find the necessary subset of large itemsets that is representative of all large itemsets. In this paper, we design a hybrid approach that takes the character of the data into account to mine the closed large itemsets efficiently. Two features of market basket analysis are considered: the number of items is large, and the number of associated items for each item is small. By combining the cut-point method and the hash concept, the new algorithm can find the closed large itemsets efficiently. The simulation results show that the new algorithm outperforms the FP-CLOSE algorithm in both execution time and storage space.


Visual Data Mining for Discovering Association Rules

Data Warehousing and Mining ◽ 10.4018/978-1-59904-951-9.ch125 ◽ 2008 ◽ pp. 2105-2120

Author(s): Kesaraporn Techapichetvanich ◽ Amitava Datta

Keyword(s): Data Mining ◽ Association Rules ◽ Association Rule ◽ Large Data ◽ Data Sets ◽ Visual Data Mining ◽ Useful Knowledge ◽ Large Databases ◽ A New Technique ◽ Mining Association Rule

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the association rule mining process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, the users gain insight and deeper understanding of their data sets, as well as control over mining meaningful association rules.


Visual Data Mining for Discovering Association Rules

Business Applications and Computational Intelligence ◽ 10.4018/978-1-59140-702-7.ch011 ◽ 2011 ◽ pp. 209-226

Author(s): Kesaraporn Techapichetvanich ◽ Amitava Datta

Keyword(s): Data Mining ◽ Association Rules ◽ Association Rule ◽ Large Data ◽ Data Sets ◽ Visual Data Mining ◽ Useful Knowledge ◽ Large Databases ◽ A New Technique ◽ Mining Association Rule

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the association rule mining process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, the users gain insight and deeper understanding of their data sets, as well as control over mining meaningful association rules.


Research on Intelligent Recommendation Method and its Application on Internet Bookstore

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.121-122.447 ◽ 2010 ◽ Vol 121-122 ◽ pp. 447-452

Author(s): Qing Zhang Chen ◽ Yu Jie Pei ◽ Yan Jin ◽ Li Yan Zhang

Keyword(s): Data Mining ◽ Association Rules ◽ Recommendation System ◽ Recommendation Systems ◽ Experimental Results ◽ Personalized Recommendation ◽ The Internet

Because current personalized recommendation systems for Internet bookstores are too limited in function, this paper builds an Internet bookstore recommendation system based on “Strategic Data Mining” that can provide the personalized recommendations customers really want. The method uses AHP to obtain a weight attribute for each type of book, which represents its owner, and adds it to the association rules. The method then clusters customers and book types and gives some strategies for personalized recommendation. The Internet bookstore recommendation system is implemented with ASP.NET. The experimental results indicate that the system is feasible.
