scholarly journals Closed frequent itemsets mining based on It-Tree

2021 ◽  
Vol 11 (1) ◽  
pp. 01-11
Author(s):  
Youssef Fakir ◽  
Chaima Ahle Touateb ◽  
Rachid Elayachi

In the last decade, the amount of collected data, in various computer science applications, has grown considerably. These large volumes of data need to be analysed in order to extract useful hidden knowledge. This work focuses on association rule extraction. This technique is one of the most popular in data mining. Nevertheless, the number of extracted association rules is often very high, and many of them are redundant. In this paper, we propose an algorithm, for mining closed itemsets, with the construction of an it-tree. This algorithm is compared with the DCI (direct counting & intersect) algorithm based on min support and computing time. CHARM is not memery-efficient. It needs to store all closed itemsets in the memory. The lower min-sup is, the more frequent closed itemsets there are so that the amounts of memory used by CHARM are increasing.

Author(s):  
Youssef Fakir ◽  
Chaima Ahle Touate ◽  
Rachid Elayachi ◽  
Mohamed Fakir

In the last decade, the amount of collected data, in various computer science applications, has grown considerably. These large volumes of data need to be analysed in order to extract useful hidden knowledge. This work focuses on association rule extraction. This technique is one of the most popular in data mining. Nevertheless, the number of extracted association rules is often very high, and many of them are redundant. In this paper, we propose an algorithm, for mining closed itemsets, with the construction of an it-tree. This algorithm is compared with the DCI (direct counting & intersect) algorithm based on min support and computing time. CHARM is not memery-efficient. It needs to store all closed itemsets in the memory. The lower min-sup is, the more frequent closed itemsets there are so that the amounts of memory used by CHARM are increasing.


2020 ◽  
Vol 1 (3) ◽  
pp. 1-7
Author(s):  
Sarbani Dasgupta ◽  
Banani Saha

In data mining, Apriori technique is generally used for frequent itemsets mining and association rule learning over transactional databases. The frequent itemsets generated by the Apriori technique provides association rules which are used for finding trends in the database. As the size of the database increases, sequential implementation of Apriori technique will take a lot of time and at one point of time the system may crash. To overcome this problem, several algorithms for parallel implementation of Apriori technique have been proposed. This paper gives a comparative study on various parallel implementation of Apriori technique .It also focuses on the advantages of using the Map Reduce technology, the latest technology used in parallelization of large dataset mining.


2019 ◽  
Vol 892 ◽  
pp. 157-167
Author(s):  
Fatimah Audah Md Zaki ◽  
Nurul Fariza Zulkurnain

The task in mining closed frequent itemsets requires the algorithm to mine the frequent ones then determine its closure. The efficiency of closure computation is very important as it will determine the total mining time and the required memory. Over the years, many closure computation methods have been proposed to achieve these goals. However, to the best of our knowledge, there is no suitable method that can be adapted for algorithms that enumerate the rowset lattice, which is effective for biological datasets. Therefore, this paper proposed a method for computing closure compare with the method used in BVBUC algorithm method. Finally, BVBUC_I is proposed and the performances of these algorithms were evaluated using two synthetic datasets and three real datasets. The results of these tests proved the efficiency of the proposed method.


2014 ◽  
Vol 998-999 ◽  
pp. 899-902 ◽  
Author(s):  
Cheng Luo ◽  
Ying Chen

Existing data miming algorithms have mostly implemented data mining under centralized environment, but the large-scale database exists in the distributed form. According to the existing problem of the distributed data mining algorithm FDM and its improved algorithms, which exist the problem that the frequent itemsets are lost and network communication cost too much. This paper proposes a association rule mining algorithm based on distributed data (ARADD). The mapping marks the array mechanism is included in the ARADD algorithm, which can not only keep the integrity of the frequent itemsets, but also reduces the cost of network communication. The efficiency of algorithm is proved in the experiment.


2012 ◽  
Vol 256-259 ◽  
pp. 2910-2913
Author(s):  
Jun Tan

Online mining of frequent closed itemsets over streaming data is one of the most important issues in mining data streams. In this paper, we proposed a novel sliding window based algorithm. The algorithm exploits lattice properties to limit the search to frequent close itemsets which share at least one item with the new transaction. Experiments results on synthetic datasets show that our proposed algorithm is both time and space efficient.


2013 ◽  
Vol 756-759 ◽  
pp. 3692-3695 ◽  
Author(s):  
Nai Li Liu ◽  
Lei Ma

Mining association rule is an important matter in data mining, in which mining maximum frequent patterns is a key problem. Many of the previous algorithms mine maximum frequent patterns by producing candidate patterns firstly, then pruning. But the cost of producing candidate patterns is very high, especially when there exists long patterns. In this paper, the structure of a FP-tree is improved, we propose a fast algorithm based on FP-Tree for mining maximum frequent patterns, the algorithm does not produce maximum frequent candidate patterns and is more effectively than other improved algorithms. The new FP-Tree is a one-way tree and only retains pointers to point its father in each node, so at least one third of memory is saved. Experiment results show that the algorithm is efficient and saves memory space.


2012 ◽  
Vol 195-196 ◽  
pp. 984-986
Author(s):  
Ming Ru Zhao ◽  
Yuan Sun ◽  
Jian Guo ◽  
Ping Ping Dong

Frequent itemsets mining is an important data mining task and a focused theme in data mining research. Apriori algorithm is one of the most important algorithm of mining frequent itemsets. However, the Apriori algorithm scans the database too many times, so its efficiency is relatively low. The paper has therefore conducted a research on the mining frequent itemsets algorithm based on a across linker. Through comparing with the classical algorithm, the improved algorithm has obvious advantages.


2013 ◽  
Vol 327 ◽  
pp. 197-200
Author(s):  
Guo Fang Kuang ◽  
Ying Cun Cao

The material is used by humans to manufacture the machines, components, devices and other products of substances. Association rules originated in the field of data mining, people use it to find large amounts of data between itemsets of the association. Apriori is a breadth-first algorithm to obtain the support is greater than the minimum support of frequent itemsets by repeatedly scanning the database. This paper presents the construction of materials science and information model based on association rule mining. Experimental data sets prove that the proposed algorithm is effective and reasonable.


Sign in / Sign up

Export Citation Format

Share Document