Closed Frequent Itemsets: Latest Research Papers

Closed frequent itemsets mining based on It-Tree

Global Journal of Computer Sciences Theory and Research ◽

10.18844/gjcs.v11i1.4912 ◽

2021 ◽

Vol 11 (1) ◽

pp. 01-11

Author(s):

Youssef Fakir ◽

Chaima Ahle Touate ◽

Rachid Elayachi

Keyword(s):

Data Mining ◽

Association Rule ◽

Computing Time ◽

Frequent Itemsets ◽

Closed Frequent Itemsets ◽

Hidden Knowledge ◽

Closed Itemsets ◽

Frequent Itemsets Mining ◽

Direct Counting ◽

Very High

In the last decade, the amount of data collected in various computer science applications has grown considerably. These large volumes of data need to be analysed in order to extract useful hidden knowledge. This work focuses on association rule extraction, one of the most popular techniques in data mining. Nevertheless, the number of extracted association rules is often very high, and many of them are redundant. In this paper, we propose an algorithm for mining closed itemsets that constructs an it-tree. This algorithm is compared with the DCI (direct counting & intersect) algorithm in terms of minimum support and computing time. CHARM is not memory-efficient: it needs to store all closed itemsets in memory. The lower the min-sup, the more frequent closed itemsets there are, so the amount of memory used by CHARM increases.
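The abstract relies on two notions it does not define: a frequent itemset is one contained in at least min-sup transactions, and it is closed when no proper superset has the same support. The following is a minimal brute-force sketch of those definitions using item-to-transaction-id sets (the vertical layout that itemset-tidset structures build on). It is not the authors' algorithm, nor CHARM or DCI; the function names and the toy database are assumptions made for illustration.

```python
from itertools import combinations


def tidsets(transactions):
    """Map each item to the set of transaction ids (tids) that contain it."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def closed_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every closed frequent itemset.

    Brute force: enumerates all item combinations, so it is only
    suitable for tiny examples.
    """
    vertical = tidsets(transactions)
    items = sorted(vertical)

    # Support of an itemset = size of the intersection of its items' tidsets.
    frequent = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            tids = set.intersection(*(vertical[i] for i in combo))
            if len(tids) >= min_support:
                frequent[frozenset(combo)] = len(tids)

    # Keep an itemset only if no proper superset has the same support.
    closed = {}
    for itemset, support in frequent.items():
        absorbed = any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
        if not absorbed:
            closed[itemset] = support
    return closed


# Hypothetical toy database of four transactions.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
result = closed_frequent_itemsets(db, min_support=2)
```

On this toy database with a minimum support of 2, five itemsets are frequent but only three are closed: {a} with support 4, and {a, b} and {a, c} with support 2 each. The itemsets {b} and {c} are dropped because {a, b} and {a, c} occur in exactly the same transactions.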

High Scalability Document Clustering Algorithm Based On Top-K Weighted Closed Frequent Itemsets

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i2.2987 ◽

2021 ◽

Vol 5 (2) ◽

pp. 359-368

Author(s):

Gede Aditra Pradnyana ◽

Arif Djunaidy

Keyword(s):

Clustering Algorithm ◽

Computing Time ◽

Curse Of Dimensionality ◽

Frequent Itemsets ◽

Frequent Pattern ◽

Minimum Support ◽

Closed Frequent Itemsets ◽

Clustering Quality ◽

Time Required ◽

F Measure

Document clustering based on frequent itemsets can be regarded as a new method of document clustering that aims to overcome the curse of dimensionality of the items produced by the documents being clustered. The Maximum Capturing (MC) technique is a document clustering algorithm based on frequent itemsets that is capable of producing better clustering quality than other similar algorithms. However, since the Maximum Capturing technique employs frequent itemsets, it still suffers from several weaknesses: item redundancy that may still cause the curse of dimensionality, difficulty in determining the minimum support value for the set of documents to be clustered, and the absence of weighting on the items in the resulting frequent itemsets. To cope with these weaknesses, this research develops a document clustering algorithm based on weighted top-k closed frequent itemsets, called the Weighted Maximum Capturing (WMC) algorithm. The proposed algorithm uses the frequent pattern tree algorithm to mine closed frequent itemsets from a set of documents without specifying the minimum support value of the items to be generated. Experimental results showed an improvement in clustering accuracy: the resulting average values were 0.713 for F-measure and 0.721 for purity, with improvement ratios of 1.4% for F-measure and 2% for purity. Nevertheless, the scalability test showed a very significant improvement: the WMC algorithm requires an average computing time of only 623.77 minutes, 518.05 minutes faster than the average computing time required by the MC algorithm.

NUCLEAR: An Efficient Methods for Mining Frequent Itemsets and Generators from Closed Frequent Itemsets

INFORMATION TECHNOLOGY IN INDUSTRY ◽

10.17762/itii.v7i2.65 ◽

2021 ◽

Vol 7 (2) ◽

Author(s):

Huy Quang Pham, Duc Tran, Ninh Bao Duong, Philippe Fournier-Viger, Alioune Ngom

Keyword(s):

Data Mining ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Experimental Results ◽

Extra Cost ◽

Effective Algorithm ◽

Closed Frequent Itemsets ◽

Mining Frequent Itemsets

Frequent itemset (FI) mining is an interesting data mining task. Instead of mining the FIs directly from the data, it is preferable to mine only the closed frequent itemsets (CFIs) first and then extract the FIs from each CFI. However, some algorithms require the generators of each CFI in order to extract the FIs, which incurs an extra cost. In this paper, we introduce an effective algorithm, called NUCLEAR, which can induce the FIs from the lattice of CFIs without needing the generators. It can also enumerate the generators in a similar fashion. Experimental results showed that NUCLEAR is effective compared to previous studies; in particular, the time for extracting the FIs is usually much smaller than that for mining the CFIs.
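The reason FIs can be extracted from CFIs at all is a standard property: every frequent itemset is a subset of some closed frequent itemset, and its support equals the largest support among the closed itemsets that contain it. The sketch below illustrates only that property by naive subset enumeration. It is not the NUCLEAR algorithm (which works on the CFI lattice), and the function name and input data are assumptions made for illustration.

```python
from itertools import combinations


def expand_closed(closed):
    """Recover every frequent itemset and its support from closed itemsets.

    `closed` maps frozenset -> support. Each non-empty subset of a closed
    itemset is frequent; its support is the maximum support over all
    closed itemsets containing it.
    """
    frequent = {}
    for cfi, support in closed.items():
        members = sorted(cfi)
        for size in range(1, len(members) + 1):
            for combo in combinations(members, size):
                key = frozenset(combo)
                # Keep the largest support seen for this subset so far.
                if support > frequent.get(key, 0):
                    frequent[key] = support
    return frequent


# Hypothetical closed frequent itemsets with their supports.
closed = {
    frozenset({"a"}): 4,
    frozenset({"a", "b"}): 2,
    frozenset({"a", "c"}): 2,
}
frequent = expand_closed(closed)
```

Here the three closed itemsets expand to five frequent itemsets: {a} keeps support 4, while {b}, {c}, {a, b} and {a, c} each get support 2. The enumeration is exponential in the size of each closed itemset, which is exactly the kind of cost a lattice-based method tries to avoid.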

Closed Frequent Itemsets Mining Based on It-Tree

Journal of Medical Informatics and Decision Making ◽

10.14302/issn.2641-5526.jmid-20-3424 ◽

2020 ◽

Vol 1 (2) ◽

pp. 44-52

Author(s):

Youssef Fakir ◽

Chaima Ahle Touate ◽

Rachid Elayachi ◽

Mohamed Fakir

Keyword(s):

Data Mining ◽

Association Rule ◽

Computing Time ◽

Frequent Itemsets ◽

Closed Frequent Itemsets ◽

Hidden Knowledge ◽

Closed Itemsets ◽

Frequent Itemsets Mining ◽

Direct Counting ◽

Very High

In the last decade, the amount of data collected in various computer science applications has grown considerably. These large volumes of data need to be analysed in order to extract useful hidden knowledge. This work focuses on association rule extraction, one of the most popular techniques in data mining. Nevertheless, the number of extracted association rules is often very high, and many of them are redundant. In this paper, we propose an algorithm for mining closed itemsets that constructs an it-tree. This algorithm is compared with the DCI (direct counting & intersect) algorithm in terms of minimum support and computing time. CHARM is not memory-efficient: it needs to store all closed itemsets in memory. The lower the min-sup, the more frequent closed itemsets there are, so the amount of memory used by CHARM increases.

Compressing Closed Frequent Itemsets with Controlled Information Loss

2019 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM) ◽

10.1109/ccem48484.2019.00015 ◽

2019 ◽

Author(s):

Pavitra Bai S. ◽

Ravikumar G.K. ◽

Narendra B.K.

Keyword(s):

Information Loss ◽

Frequent Itemsets ◽

Closed Frequent Itemsets

Improved BVBUC Algorithm to Discover Closed Itemsets in Long Biological Datasets

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.892.157 ◽

2019 ◽

Vol 892 ◽

pp. 157-167

Author(s):

Fatimah Audah Md Zaki ◽

Nurul Fariza Zulkurnain

Keyword(s):

Frequent Itemsets ◽

Suitable Method ◽

Closed Frequent Itemsets ◽

Closed Itemsets ◽

Synthetic Datasets

Mining closed frequent itemsets requires the algorithm to mine the frequent itemsets and then determine their closures. The efficiency of closure computation is very important, as it determines the total mining time and the required memory. Over the years, many closure computation methods have been proposed to achieve these goals. However, to the best of our knowledge, there is no suitable method that can be adapted for algorithms that enumerate the rowset lattice, which is effective for biological datasets. Therefore, this paper proposes a method for computing closures and compares it with the method used in the BVBUC algorithm. Finally, BVBUC_I is proposed, and the performance of these algorithms is evaluated using two synthetic datasets and three real datasets. The results of these tests demonstrate the efficiency of the proposed method.
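For readers unfamiliar with the term, the closure of an itemset is the set of items shared by every transaction (row) that contains it, and an itemset is closed when it equals its own closure. The sketch below shows this textbook definition only. It is not the closure method of BVBUC or BVBUC_I, and the function name and sample rows are assumptions made for illustration.

```python
def closure(itemset, transactions):
    """Return the closure of `itemset`: the items common to every
    transaction that contains it.

    Returns None when no transaction contains the itemset, since there
    is then no supporting row to intersect (a simplification chosen for
    this sketch).
    """
    itemset = frozenset(itemset)
    supporting = [frozenset(t) for t in transactions if itemset <= frozenset(t)]
    if not supporting:
        return None
    return frozenset.intersection(*supporting)


# Hypothetical toy database of four rows.
rows = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
closure_of_b = closure({"b"}, rows)    # rows 0 and 1 share the items a and b
closure_of_a = closure({"a"}, rows)    # all rows share only the item a
```

In this example {b} is not closed because its closure is {a, b}, whereas {a} is closed because its closure is {a} itself. Scanning all rows for every itemset is the naive approach; the cost of this step is why the abstract treats closure computation as the bottleneck.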

Maximal and closed frequent itemsets mining from uncertain database and data stream

International Journal of Data Science ◽

10.1504/ijds.2019.102792 ◽

2019 ◽

Vol 4 (3) ◽

pp. 237

Author(s):

Maliha Momtaz ◽

Abu Ahmed Ferdaus ◽

Chowdhury Farhan Ahmed ◽

Mohammad Samiullah

Keyword(s):

Data Stream ◽

Frequent Itemsets ◽

Closed Frequent Itemsets ◽

Frequent Itemsets Mining ◽

Uncertain Database

An Efficient Mining Algorithm of Closed Frequent Itemsets on Multi-core Processor

Advanced Data Mining and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-030-35231-8_8 ◽

2019 ◽

pp. 107-118

Author(s):

Huan Phan

Keyword(s):

Frequent Itemsets ◽

Mining Algorithm ◽

Closed Frequent Itemsets ◽

Multi Core Processor

Maximal and closed frequent itemsets mining from uncertain database and data stream

International Journal of Data Science ◽

10.1504/ijds.2019.10024378 ◽

2019 ◽

Vol 4 (3) ◽

pp. 237

Author(s):

Abu Ahmed Ferdaus ◽

Mohammad Samiullah ◽

Chowdhury Farhan Ahmed ◽

Maliha Momtaz

Keyword(s):

Data Stream ◽

Frequent Itemsets ◽

Closed Frequent Itemsets ◽

Frequent Itemsets Mining ◽

Uncertain Database

New and Efficient Algorithms for Producing Frequent Itemsets with the Map-Reduce Framework

Algorithms ◽

10.3390/a11120194 ◽

2018 ◽

Vol 11 (12) ◽

pp. 194

Author(s):

Yaron Gonen ◽

Ehud Gudes ◽

Kirill Kandalov

Keyword(s):

Data Mining ◽

Big Data ◽

Experimental Evaluation ◽

Distributed Databases ◽

Frequent Itemsets ◽

Parallel Architectures ◽

Efficient Algorithms ◽

Map Reduce ◽

Closed Frequent Itemsets ◽

New Algorithms

The Map-Reduce (MR) framework has become a popular framework for developing new parallel algorithms for Big Data. Developing efficient algorithms for mining big data and distributed databases has become an important problem. In this paper, we focus on algorithms that produce association rules and frequent itemsets. After reviewing the most recent algorithms that perform this task within the MR framework, we present two new algorithms: one for producing closed frequent itemsets, and a second for producing frequent itemsets when the database is updated and new data is added to the old database. Both algorithms include novel optimizations that are suited to the MR framework as well as to other parallel architectures. A detailed experimental evaluation shows the effectiveness and advantages of the algorithms over existing methods on large distributed databases.
