Frequent closed itemsets lattice used in data mining

Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data in databases. Mining closed large itemsets extends association rule mining: it aims to find the necessary subset of large itemsets that can represent all large itemsets. In this paper, we design a hybrid approach that takes the characteristics of the data into account to mine closed large itemsets efficiently. Two features of market basket analysis are considered: the number of items is large, and the number of items associated with each item is small. By combining the cut-point method and the hash concept, the new algorithm finds the closed large itemsets efficiently. The simulation results show that the new algorithm outperforms the FP-CLOSE algorithm in both execution time and storage space.
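
To make the notion of a closed large (frequent) itemset concrete, the following is a minimal brute-force sketch, not the hybrid cut-point and hash algorithm of the paper. It uses the standard definition: a frequent itemset is closed if no proper superset has the same support. The transactions, the `min_support` value, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions; not taken from the paper.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "c"},
    {"c"},
    {"a", "b", "d"},
]


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Enumerate all frequent itemsets by brute force (exponential, toy use only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
    return result


def closed_itemsets(frequent):
    """Keep the itemsets that have no proper superset with equal support."""
    return {
        x: sx
        for x, sx in frequent.items()
        if not any(x < y and sx == sy for y, sy in frequent.items())
    }


frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

On this toy data, seven itemsets are frequent but only three are closed ({c}, {a, b}, and {a, b, c}), and the support of every frequent itemset can be recovered from the closed ones.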

Mining Frequent Closed Itemsets for Association Rules

Handbook of Research on Innovations in Database Technologies and Applications ◽

10.4018/978-1-60566-242-8.ch057 ◽

2009 ◽

pp. 537-546

Author(s):

Anamika Gupta ◽

Shikha Gupta ◽

Naveen Kumar

Keyword(s):

Data Mining ◽

Decision Making ◽

Marketing Strategies ◽

Strategic Decision ◽

Strategic Decision Making ◽

Rule Mining ◽

Market Basket ◽

Large Databases ◽

Very Large Databases ◽

Closed Itemsets

Association refers to correlations that exist among data. Association Rule Mining (ARM) is an important data-mining task. It refers to the discovery of rules between different sets of attributes/items in very large databases (Agrawal R. & Srikant R. 1994). The discovered rules help in strategic decision making in both commercial and scientific domains. A classical application of ARM is market basket analysis, an application of data mining in retail sales where associations between different items are discovered to analyze customers’ buying habits in order to develop better marketing strategies. ARM has also been used extensively in other applications such as spatio-temporal data, health care, bioinformatics, and web data (Han J., Cheng H., Xin D., & Yan X. 2007).
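
As an illustration of what an association rule measures in market basket analysis, the sketch below computes the usual support and confidence of a rule such as bread -> milk. The baskets and function names are hypothetical and are not taken from the chapter.

```python
# Hypothetical market baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item of `itemset`."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimate P(consequent | antecedent) from the baskets."""
    antecedent_support = support(antecedent, baskets)
    if antecedent_support == 0:
        raise ValueError("antecedent never occurs in the baskets")
    return support(antecedent | consequent, baskets) / antecedent_support


rule_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
```

Here bread and milk occur together in 2 of 4 baskets (support 0.5), and 2 of the 3 baskets containing bread also contain milk (confidence 2/3).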

MINING NON-REDUNDANT ASSOCIATION RULES BASED ON CONCISE BASES

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001407005600 ◽

2007 ◽

Vol 21 (04) ◽

pp. 659-675 ◽

Cited By ~ 11

Author(s):

YUE XU ◽

YUEFENG LI

Keyword(s):

Data Mining ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Rule Mining ◽

Mining Community ◽

Rule Set ◽

Closed Itemsets ◽

Minimal Generators

Association rule mining has many achievements in the area of knowledge discovery. However, the quality of the extracted association rules has not drawn adequate attention from researchers in the data mining community. One major concern with the quality of association rule mining is the size of the extracted rule set. Very often, tens of thousands of association rules are extracted, many of which are redundant and thus useless. In this paper, we first analyze the redundancy problem in association rules and then propose a reliable exact association rule basis from which more concise nonredundant rules can be extracted. We prove that the redundancy eliminated using the proposed reliable association rule basis does not reduce the belief in the extracted rules. Moreover, this paper proposes a level-wise approach for efficiently extracting closed itemsets and minimal generators, a key issue in closure-based association rule mining.
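
The relationship between minimal generators, closed itemsets, and exact rules can be sketched as follows. This is a brute-force illustration under the standard definitions (a minimal generator has no proper subset with the same support, and the closure of an itemset is the intersection of the transactions containing it); it is not the level-wise procedure or the reliable basis proposed in the paper. The data, `min_support`, and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy transactions; not taken from the paper.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "c"},
    {"c"},
    {"a", "b", "d"},
]
items = sorted(set().union(*transactions))
min_support = 2


def support(itemset):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset):
    """Intersect all transactions that contain `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        # By convention, an itemset that never occurs closes to all items.
        return frozenset(items)
    return frozenset(set.intersection(*covering))


def is_minimal_generator(itemset):
    """True if removing any single item strictly increases the support.

    Checking immediate subsets is enough because support is anti-monotone.
    """
    s = support(itemset)
    return all(support(itemset - {i}) > s for i in itemset)


# Map each frequent minimal generator to its closed itemset.
generators = {}
for k in range(1, len(items) + 1):
    for combo in combinations(items, k):
        x = frozenset(combo)
        if support(x) >= min_support and is_minimal_generator(x):
            generators[x] = closure(x)

# Exact rules (confidence 1): generator -> closure minus generator.
exact_rules = sorted(
    (sorted(g), sorted(c - g)) for g, c in generators.items() if c != g
)
```

On this toy data the exact rules are a -> b, b -> a, {a, c} -> b, and {b, c} -> a; the generator {c} is already closed and yields no rule.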

Closed Frequent Itemsets Mining Based on It-Tree

Journal of Medical Informatics and Decision Making ◽

10.14302/issn.2641-5526.jmid-20-3424 ◽

2020 ◽

Vol 1 (2) ◽

pp. 44-52

Author(s):

Youssef Fakir ◽

Chaima Ahle Touate ◽

Rachid Elayachi ◽

Mohamed Fakir

Keyword(s):

Data Mining ◽

Association Rule ◽

Computing Time ◽

Frequent Itemsets ◽

Closed Frequent Itemsets ◽

Hidden Knowledge ◽

Closed Itemsets ◽

Frequent Itemsets Mining ◽

Direct Counting ◽

Very High

In the last decade, the amount of data collected in various computer science applications has grown considerably. These large volumes of data need to be analysed in order to extract useful hidden knowledge. This work focuses on association rule extraction, one of the most popular techniques in data mining. Nevertheless, the number of extracted association rules is often very high, and many of them are redundant. In this paper, we propose an algorithm for mining closed itemsets that constructs an it-tree. This algorithm is compared with the DCI (direct counting & intersect) algorithm in terms of min support and computing time. CHARM is not memory-efficient: it needs to store all closed itemsets in memory. The lower min-sup is, the more frequent closed itemsets there are, so the amount of memory used by CHARM increases.
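
Tidset-based methods such as those mentioned above work on a vertical layout, where each item is mapped to the set of transaction identifiers (tids) that contain it, and the support of an itemset is obtained by intersecting those sets. The sketch below shows only this intersection step, assuming a non-empty itemset; it is not the authors' it-tree construction, and the data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical transactions keyed by transaction id (tid).
transactions = {
    1: {"a", "b", "c"},
    2: {"a", "b"},
    3: {"a", "b", "c"},
    4: {"c"},
    5: {"a", "b", "d"},
}


def to_vertical(transactions):
    """Convert the horizontal layout into a map from item to its tidset."""
    tidsets = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def tidset(itemset, tidsets):
    """Tids of transactions containing every item of a non-empty itemset."""
    return set.intersection(*(tidsets[item] for item in itemset))


tidsets = to_vertical(transactions)
t_ab = tidset({"a", "b"}, tidsets)
t_ac = tidset({"a", "c"}, tidsets)
support_ab = len(t_ab)
```

Because the tidset of {a} equals the tidset of {a, b} in this toy data, {a} has a superset with the same support and is therefore not closed; comparisons of this kind are what tidset-based closed itemset miners exploit.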

Toward Integrating Data Warehousing with Data Mining Techniques

Data Warehouses and OLAP ◽

10.4018/987-1-59904-364-7.ch011 ◽

2011 ◽

pp. 253-276 ◽

Cited By ~ 1

Author(s):

Rokia Missaoui ◽

Ganaël Jatteau ◽

Ameur Boujenoui ◽

Sami Naouali

Keyword(s):

Data Mining ◽

Data Warehousing ◽

Multidimensional Data ◽

Concept Lattices ◽

Data Mining Techniques ◽

Multidimensional Database ◽

On Demand ◽

Closed Itemsets ◽

Analytical Processing ◽

Ultimate Objective

In this paper, we present alternatives for coupling data warehousing and data mining techniques so that they can benefit from each other’s advances for the ultimate objective of efficiently providing a flexible answer to data mining queries addressed either to a bidimensional (relational) or a multidimensional database. In particular, we investigate two techniques: (i) the first one exploits concept lattices for generating frequent closed itemsets, clusters and association rules from multidimensional data, and (ii) the second one defines new operators similar in spirit to online analytical processing (OLAP) techniques to allow “data mining on demand” (i.e., data mining according to user’s needs and perspectives). The implementation of OLAP-like techniques relies on three operations on lattices, namely selection, projection and assembly. A detailed running example serves to illustrate the scope and benefits of the proposed techniques.

Toward Integrating Data Warehousing with Data Mining Techniques

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch211 ◽

2008 ◽

pp. 3346-3363

Author(s):

Rokia Missaoui ◽

Ganaël Jatteau ◽

Ameur Boujenoui ◽

Sami Naouali

Keyword(s):

Data Mining ◽

Data Warehousing ◽

Multidimensional Data ◽

Concept Lattices ◽

Data Mining Techniques ◽

Multidimensional Database ◽

On Demand ◽

Closed Itemsets ◽

Analytical Processing ◽

Ultimate Objective

In this paper, we present alternatives for coupling data warehousing and data mining techniques so that they can benefit from each other’s advances for the ultimate objective of efficiently providing a flexible answer to data mining queries addressed either to a bidimensional (relational) or a multidimensional database. In particular, we investigate two techniques: (i) the first one exploits concept lattices for generating frequent closed itemsets, clusters and association rules from multidimensional data, and (ii) the second one defines new operators similar in spirit to online analytical processing (OLAP) techniques to allow “data mining on demand” (i.e., data mining according to user’s needs and perspectives). The implementation of OLAP-like techniques relies on three operations on lattices, namely selection, projection and assembly. A detailed running example serves to illustrate the scope and benefits of the proposed techniques.

An Efficient Algorithm for Mining Frequent Closed Itemsets over Data Stream

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.151.570 ◽

2012 ◽

Vol 151 ◽

pp. 570-575

Author(s):

Guo Dong Li ◽

Ke Wen Xia

Keyword(s):

Data Mining ◽

Data Structure ◽

Data Streams ◽

Efficient Algorithm ◽

Data Stream ◽

New Method ◽

Closed Itemsets ◽

Low Efficiency

The NewMoment algorithm frequently performs the leftcheck operation during mining, which lowers its efficiency. To address this problem, this paper proposes a new method, called LevelMoment, to improve the NewMoment algorithm, which mines frequent closed itemsets over data streams. A new data structure with an added level node, called LevelCET, is also proposed. On this structure, a level checking strategy and an optimum frequent closed items checking strategy can quickly find all the frequent closed itemsets over data streams. Experiments and analysis show that the algorithm performs well.

Closed frequent itemsets mining based on It-Tree

Global Journal of Computer Sciences Theory and Research ◽

10.18844/gjcs.v11i1.4912 ◽

2021 ◽

Vol 11 (1) ◽

pp. 01-11

Author(s):

Youssef Fakir ◽

Chaima Ahle Touate ◽

Rachid Elayachi

Keyword(s):

Data Mining ◽

Association Rule ◽

Computing Time ◽

Frequent Itemsets ◽

Closed Frequent Itemsets ◽

Hidden Knowledge ◽

Closed Itemsets ◽

Frequent Itemsets Mining ◽

Direct Counting ◽

Very High

In the last decade, the amount of data collected in various computer science applications has grown considerably. These large volumes of data need to be analysed in order to extract useful hidden knowledge. This work focuses on association rule extraction, one of the most popular techniques in data mining. Nevertheless, the number of extracted association rules is often very high, and many of them are redundant. In this paper, we propose an algorithm for mining closed itemsets that constructs an it-tree. This algorithm is compared with the DCI (direct counting & intersect) algorithm in terms of min support and computing time. CHARM is not memory-efficient: it needs to store all closed itemsets in memory. The lower min-sup is, the more frequent closed itemsets there are, so the amount of memory used by CHARM increases.
