Mining Approximate Frequent Itemsets Using Pattern Growth Approach

Mining approximate frequent itemsets (AFIs) from noisy databases is computationally more expensive than traditional frequent itemset mining, because AFI mining algorithms generate a large number of candidate itemsets. This article proposes an algorithm that mines AFIs using a pattern growth approach. Its major contribution is that it mines core patterns and examines the approximate conditions of candidate AFIs directly, in a single phase with two full scans of the database. Related algorithms apply an Apriori-based candidate generate-and-test approach and require multiple phases to obtain the complete set of AFIs: the first phase generates core patterns, and the second phase examines the approximate conditions of the core patterns. Specifically, the article proposes novel techniques for mapping transactions onto an approximate FP-tree and for mining AFIs from the conditional patterns of the approximate FP-tree. The approximate FP-tree maps transactions onto shared branches when the transactions share a similar set of items. This reduces the size of the database and helps to compute the approximate conditions of candidate itemsets efficiently. We compare the performance of our algorithm with state-of-the-art AFI mining algorithms on benchmark databases. The experiments compare the processing time of the algorithms and their scalability with varying database size and transaction length. The results show that the pattern growth approach mines AFIs in less processing time than related Apriori-based algorithms.
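
The branch sharing described in this abstract can be illustrated with the classic (exact) FP-tree. The following is a minimal sketch of that standard structure only, built in two database scans; it is not the article's approximate FP-tree, and the names `Node`, `build_fp_tree`, `count_nodes`, and `min_count` are hypothetical.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, a count, and child nodes keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    # Scan 1: count in how many transactions each item occurs.
    freq = Counter(item for t in transactions for item in set(t))
    root = Node(None)
    # Scan 2: insert each transaction, keeping only frequent items and
    # ordering them by descending frequency so common prefixes share branches.
    for t in transactions:
        items = sorted(
            (i for i in set(t) if freq[i] >= min_count),
            key=lambda i: (-freq[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def count_nodes(node):
    """Number of item nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(c) for c in node.children.values())


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
tree = build_fp_tree(transactions, min_count=2)
```

In this toy database, the eight frequent item occurrences are stored in five tree nodes because three transactions share the branch that starts with `a`.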

Security and Verification of Server Data Using Frequent Itemset Mining in Ecommerce

International Journal of Synthetic Emotions ◽

10.4018/ijse.2017010103 ◽

2017 ◽

Vol 8 (1) ◽

pp. 31-43

Author(s):

Zuber Shaikh ◽

Antara Mohadikar ◽

Rachana Nayak ◽

Rohith Padamadan

Keyword(s):

Data Mining ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Graphical Password ◽

Itemset Mining ◽

Frequent Item ◽

Data Mining Algorithms ◽

Shoulder Surfing ◽

Mining Algorithms ◽

Frequent Item Sets

Frequent itemsets are sets of data values (e.g., product items) whose number of co-occurrences exceeds a given threshold. The challenge is that the design of proofs and verification objects has to be customized for different data mining algorithms. The proposed method implements a basic completeness verification and authentication approach in which the client uses a set of frequent itemsets as evidence and checks whether the server has omitted any of them from its returned result. This helps the client detect an untrusted server and makes the system more efficient by reducing time. In the authentication process, CaRP serves as both a captcha and a graphical password scheme. CaRP addresses a number of security problems altogether, such as online guessing attacks, relay attacks, and, if combined with dual-view technologies, shoulder-surfing attacks.
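
The definition of a frequent itemset and the evidence check described above can be sketched as follows. This is a simplified illustration, assuming the client already holds a few itemsets it knows to be frequent; the article's actual proofs and verification objects are not described in the abstract. The names `support`, `missing_evidence`, and `threshold` are hypothetical.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t))


def missing_evidence(evidence, server_result):
    """Evidence itemsets that the server failed to return."""
    returned = {frozenset(x) for x in server_result}
    return [e for e in evidence if frozenset(e) not in returned]


transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]
threshold = 1  # an itemset is frequent when its support exceeds this value

# Evidence: itemsets the client has verified to be frequent on its own.
evidence = [("milk", "bread"), ("bread", "eggs")]
assert all(support(e, transactions) > threshold for e in evidence)

# A result in which the server omitted one frequent itemset.
server_result = [("milk",), ("bread",), ("eggs",), ("milk", "bread")]
missing = missing_evidence(evidence, server_result)
```

A non-empty `missing` list signals that the returned result is incomplete, which is the condition the client uses to flag the server as untrusted.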

A Proposed Frequent Itemset Discovery Algorithm Based on Item Weights and Uncertainty

International Journal of Sociotechnology and Knowledge Development ◽

10.4018/ijskd.2020010106 ◽

2020 ◽

Vol 12 (1) ◽

pp. 98-118

Author(s):

Hanaa Ibrahim Abu Zahra ◽

Shaker El-Sappagh ◽

Tarek Ahmef El Shishtawy

Keyword(s):

High Performance ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Real Word ◽

Memory Consumption ◽

Itemset Mining ◽

Uncertain Database ◽

Additional Value ◽

Mining Algorithms ◽

New Algorithms

Most frequent itemset mining algorithms (FIMA) discover hidden relationships among seemingly unrelated items. They find the most frequent itemsets based only on how often the items occur in the dataset. These algorithms give all items the same importance and neglect differences in item importance. They also assume that the data are fully certain, but in most cases real-world data may be uncertain, that is, incomplete and/or imprecise. These two problems are the most common challenges that FIMA face. Some newer algorithms address the two issues separately: some handle item importance only, and others handle uncertainty only. Few algorithms deal with the two issues together. In this article, the single scan for weighted itemsets over the uncertain database (SSU-Wfim) is proposed. It builds on the single scan frequent itemsets algorithm (SS_FIM) and enhances it to deal with weighted items in an uncertain database. SSU_WFIM deals with the uncertainty of data by giving each item in a transaction an additional value that indicates its occurrence likelihood. It also assigns the items different values that define their weights. It uses a table called Ptable to save the items and their probability values. This table is used to generate all possible candidate itemsets. The results indicate the high performance of SSU-Wfim in terms of runtime, memory consumption, and scalability compared with the UApriori algorithm. The proposed algorithm saves more than 70% of time and memory for all tested datasets.
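
To make the combination of item weights and occurrence likelihoods concrete, the sketch below uses the expected-support model that is common in uncertain frequent itemset mining, scaled by the average item weight. Both the formula and the names `expected_support`, `weighted_expected_support`, and `weights` are assumptions for illustration; the abstract does not state the exact measure used by the proposed algorithm.

```python
from math import prod


def expected_support(itemset, transactions):
    """Sum over transactions of the product of the items' probabilities.

    Each transaction maps item -> occurrence probability. A transaction
    that lacks any item of the itemset contributes nothing.
    """
    return sum(
        prod(t[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def weighted_expected_support(itemset, transactions, weights):
    """Expected support scaled by the average weight of the items (assumed)."""
    avg_weight = sum(weights[i] for i in itemset) / len(itemset)
    return avg_weight * expected_support(itemset, transactions)


# Hypothetical uncertain database: item -> probability per transaction.
transactions = [
    {"a": 0.5, "b": 0.5},
    {"a": 1.0, "b": 0.5, "c": 0.25},
    {"a": 0.5},
]
weights = {"a": 1.0, "b": 0.5, "c": 0.25}

es_ab = expected_support(("a", "b"), transactions)
wes_ab = weighted_expected_support(("a", "b"), transactions, weights)
```

Here `es_ab` is 0.5 * 0.5 + 1.0 * 0.5 = 0.75, and the average weight of `a` and `b` is 0.75, so `wes_ab` is 0.5625.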

An UBMFFP Tree for Mining Multiple Fuzzy Frequent Itemsets

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488515500385 ◽

2015 ◽

Vol 23 (06) ◽

pp. 861-879 ◽

Cited By ~ 7

Author(s):

Jerry Chun-Wei Lin ◽

Tzung-Pei Hong ◽

Tsung-Ching Lin ◽

Shing-Tai Pan

Keyword(s):

Upper Bound ◽

Frequent Itemsets ◽

Tree Structure ◽

Frequent Pattern ◽

Second Phase ◽

Two Phase ◽

Tree Algorithm ◽

Large Databases ◽

Frequent Pattern Tree ◽

Mining Algorithms

Frequent itemsets are useful for discovering interesting associations hidden in large databases. Many mining algorithms use data with binary attributes to represent the occurrence of items and find frequent itemsets. However, many real-world applications provide a richer source of transactions with quantitative values. The fuzzy frequent pattern tree algorithm was thus proposed for extracting fuzzy frequent itemsets from the quantitative transactions. In this paper, a tree structure called the upper-bound multiple fuzzy frequent-pattern (UBMFFP)-tree is designed for improving the pruning effect in the mining process. A two-phase fuzzy mining approach based on the tree structure is also proposed to obtain the complete fuzzy frequent itemsets from a quantitative database. The proposed fuzzy mining approach recursively and efficiently finds the upper-bound fuzzy counts of itemsets with the aid of the tree structure. It prunes unpromising itemsets in the first phase, and then finds the actual fuzzy frequent itemsets in the second phase. Experimental results indicate that the proposed UBMFFP-tree algorithm has good performance in terms of execution time and number of tree nodes.

TKFIM: Top-K frequent itemset mining technique based on equivalence classes

PeerJ Computer Science ◽

10.7717/peerj-cs.385 ◽

2021 ◽

Vol 7 ◽

pp. e385

Author(s):

Saood Iqbal ◽

Abdul Shahid ◽

Muhammad Roman ◽

Zahid Khan ◽

Shaha Al-Otaibi ◽

...

Keyword(s):

State Of The Art ◽

Threshold Value ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Large Dataset ◽

Mining Technique ◽

Support Threshold ◽

Frequent Itemsets Mining ◽

And Performance ◽

The Given

Mining frequently used items is a significant subject of data mining studies. In the last ten years, due to innovative development, the quantity of data has grown exponentially, which imposes new challenges for frequent itemset (FI) mining applications. Recent algorithms, including both threshold-based and size-based algorithms, may produce misleading information. The threshold value plays a central role in generating frequent itemsets from a given dataset, and selecting a support threshold value is very difficult for those unaware of the dataset's characteristics. The performance of algorithms that find FIs without a support threshold is, however, deficient due to heavy computation. Therefore, we propose a method to discover FIs without a support threshold, called Top-k frequent itemsets mining (TKFIM). It uses equivalence classes and set-theory concepts for mining FIs. The proposed procedure does not miss any FIs; thus, accurate frequent patterns are mined. Furthermore, the results are compared with state-of-the-art techniques such as Top-k miner and Build Once and Mine Once (BOMO). The proposed TKFIM outperforms these approaches in terms of execution and performance, achieving gains of 92.70, 35.87, 28.53, and 81.27 percent over Top-k miner on the Chess, Mushroom, Connect, and T1014D100K datasets, respectively. Similarly, it achieves performance gains of 97.14, 100, 78.10, and 99.70 percent over BOMO on the Chess, Mushroom, Connect, and T1014D100K datasets, respectively. Therefore, it is argued that the proposed procedure may be adopted on large datasets for better performance.
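
The problem statement above, finding the k most frequent itemsets without a support threshold, can be illustrated with a small exhaustive sketch. It uses vertical transaction-id sets and set intersection, an idea borrowed from Eclat-style miners as an assumption; it does not reproduce TKFIM's equivalence-class pruning, and `top_k_itemsets`, `k`, and `max_size` are hypothetical names.

```python
import heapq
from itertools import combinations


def top_k_itemsets(transactions, k, max_size=3):
    """Return the k most frequent itemsets as (itemset, count) pairs."""
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)

    # Exhaustive enumeration up to max_size (fine for toy data only).
    counts = {}
    items = sorted(tidsets)
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            tids = set.intersection(*(tidsets[i] for i in combo))
            if tids:
                counts[combo] = len(tids)

    # Highest count first; ties are broken by itemset for determinism.
    return heapq.nsmallest(k, counts.items(), key=lambda kv: (-kv[1], kv[0]))


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
top3 = top_k_itemsets(transactions, k=3)
```

The user supplies only `k`; the support of the k-th result plays the role that a hand-picked threshold would otherwise play.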

Implementation of Improved Association Rule Mining Algorithms for Fast Mining with Efficient Tree Structures on Large Datasets

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.b3876.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 5136-5141

Keyword(s):

Association Rule ◽

Frequent Itemsets ◽

Large Datasets ◽

Frequent Itemset ◽

Rule Mining ◽

Tree Structures ◽

Significant Area ◽

Dataset Size ◽

Mining Algorithms ◽

Mining Frequent Itemsets

Association rule mining (ARM) is a significant area of knowledge mining that produces association rules, which are essential for decision making. Frequent itemset mining is challenging on large datasets: as the dataset size increases, the burden and time required to discover rules also increase. In this paper, ARM algorithms with tree structures such as the FP-tree, and FIN with the POC tree and the PPC tree, are discussed as ways to reduce overhead and time consumption. These algorithms use highly efficient data structures for mining frequent itemsets from the database. FIN uses Nodeset, a novel data structure, to extract frequent itemsets and the POC tree to store frequent itemset information. These techniques are extremely helpful in marketing. The proposed and implemented techniques show improved performance in terms of time and efficiency.

Enhancing the Performance of Large-scale Profitable Itemset Mining using Efficient Data Structures

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i8151.078919 ◽

2019 ◽

Vol 8 (9) ◽

pp. 1768-1772

Keyword(s):

Data Structures ◽

Large Scale ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Frequent Itemset Mining ◽

Second Phase ◽

Itemset Mining ◽

Efficient Data ◽

Profit Value ◽

Efficient Data Structures

The process of extracting the most frequently bought items from a transactional database is termed frequent itemset mining. Although it gives an idea of the best-selling itemsets, the method fails to identify the most profitable items in the database. It is not uncommon for frequent itemsets and profitable itemsets to have minimal intersection, and the process of extracting the most profitable itemsets is termed Greater Profitable Itemset (GPI) mining. Among the various approaches to mining GPI, [7] proposed a two-phase algorithm that optimizes the regeneration of GPI when the profit value of any item changes. It keeps track of the pruned items in the first phase and uses them to regenerate GPI efficiently in the second phase. This paper proposes an enhancement to the way these changes are tracked: the pruned itemsets are stored according to their constituent items, unlike the earlier algorithm, which stored records iteration-wise. Storing the itemsets according to their constituent items ensures that only the required itemsets are retrieved. In contrast, the earlier algorithm would fetch all the itemsets pruned in an iteration, regardless of their relevance. By fetching only relevant itemsets, the proposed method would significantly reduce the computational requirements.
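
Storing pruned itemsets by constituent item amounts to an inverted index. The sketch below shows that general idea under the assumption that a profit change to one item only requires revisiting pruned itemsets containing that item; the class `PrunedIndex` and its methods are hypothetical and do not reproduce the paper's data structure.

```python
from collections import defaultdict


class PrunedIndex:
    """Inverted index: item -> pruned itemsets that contain the item."""

    def __init__(self):
        self._by_item = defaultdict(set)

    def add(self, itemset):
        """Record a pruned itemset under each of its constituent items."""
        frozen = frozenset(itemset)
        for item in frozen:
            self._by_item[item].add(frozen)

    def affected_by(self, changed_item):
        """Pruned itemsets to re-examine when this item's profit changes."""
        return set(self._by_item.get(changed_item, set()))


index = PrunedIndex()
index.add({"pen", "ink"})
index.add({"pen", "paper"})
index.add({"ink", "stapler"})

# Only itemsets containing "paper" are fetched; the rest stay untouched.
affected = index.affected_by("paper")
```

With iteration-wise storage, the same lookup would have to scan every itemset pruned in an iteration and filter out the irrelevant ones.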

Mining Frequent Weighted Itemsets without Storing Transaction IDs and Generating Candidates

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488517500052 ◽

2017 ◽

Vol 25 (01) ◽

pp. 111-144 ◽

Cited By ~ 20

Author(s):

Gangin Lee ◽

Unil Yun ◽

Keun Ho Ryu

Keyword(s):

State Of The Art ◽

Frequent Itemset ◽

Experimental Results ◽

Frequent Itemset Mining ◽

Memory Usage ◽

Tree Structures ◽

Prefix Tree ◽

Itemset Mining ◽

Mining Methods ◽

Mining Algorithms

Weighted itemset mining, one of the important areas in frequent itemset mining, is an approach for mining meaningful itemsets that considers a different importance, or weight, for each item in a database. Because of its merits, weighted itemset mining has been studied actively. As one of the methods in weighted itemset mining, FWI (Frequent Weighted Itemset) mining calculates the weights of transactions from the weights of items and then finds FWIs based on the transaction weights. However, previous FWI mining methods still have limitations in terms of runtime and memory usage. For this reason, in this paper, we propose two algorithms for mining FWIs more efficiently from databases with item weights. In contrast to previous approaches, which store transaction IDs for mining FWIs, the proposed methods employ new types of prefix tree structures and mine these patterns more efficiently without storing any transaction ID. Through extensive experiments, we show that the proposed algorithms outperform state-of-the-art FWI mining algorithms in terms of runtime, memory usage, and scalability.
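
The step of deriving transaction weights from item weights can be sketched as below. It assumes the definition commonly used in the FWI literature, where a transaction's weight is the mean weight of its items and an itemset's weighted support is the share of total transaction weight held by the transactions containing it. The abstract does not spell out these formulas, the sketch scans transactions directly rather than using the paper's prefix trees, and the function names are hypothetical.

```python
def transaction_weight(transaction, weights):
    """Mean weight of the items in one transaction (assumed definition)."""
    return sum(weights[i] for i in transaction) / len(transaction)


def weighted_support(itemset, transactions, weights):
    """Share of total transaction weight in transactions containing itemset."""
    tws = [transaction_weight(t, weights) for t in transactions]
    wanted = set(itemset)
    covered = sum(tw for t, tw in zip(transactions, tws) if wanted <= set(t))
    return covered / sum(tws)


weights = {"a": 0.5, "b": 1.0}
transactions = [["a", "b"], ["a"], ["b"]]

# Transaction weights are 0.75, 0.5, and 1.0, for a total of 2.25.
ws_ab = weighted_support(("a", "b"), transactions, weights)
ws_a = weighted_support(("a",), transactions, weights)
```

An itemset would then be reported as an FWI when its weighted support reaches a user-given minimum.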

An efficient pattern growth approach for mining fault tolerant frequent itemsets

Expert Systems with Applications ◽

10.1016/j.eswa.2019.113046 ◽

2020 ◽

Vol 143 ◽

pp. 113046 ◽

Cited By ~ 1

Author(s):

Shariq Bashir

Keyword(s):

Fault Tolerant ◽

Frequent Itemsets ◽

Pattern Growth ◽

Growth Approach

An enhanced constraint based technique for frequent itemset mining in transactional databases

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.22.11807 ◽

2018 ◽

Vol 7 (2.22) ◽

pp. 45

Author(s):

Ramah Sivakumar ◽

Dr J.G.R. Sathiaseelan

Keyword(s):

Processing Time ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Frequent Patterns ◽

Unique Minimum ◽

Itemset Mining ◽

Transactional Databases ◽

Pros And Cons ◽

Social Applications ◽

Mining Frequent Itemsets

Mining frequent patterns has become a broad area of research in recent times, as it has numerous social applications. A variety of frequent patterns are used in diverse applications, and mining them in an optimized way is an important research concern. So far, many algorithms have been proposed for mining frequent itemsets, and each has its own pros and cons. The basic algorithms used in the process are Apriori, FP-Growth, and Eclat, and enhancements of these algorithms continue to appear. In this paper, an enhanced Varied Support Frequent Itemset (VSFIM) algorithm is proposed, which is an enhancement of the FP-Growth algorithm. In the proposed approach, a unique minimum support is provided for each item in the transaction, and then mining is done. The performance of the proposed algorithm is tested against existing algorithms. VSFIM is found to outperform the existing algorithms in both processing time and space utilization.
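
Giving each item its own minimum support can be illustrated as follows. The sketch assumes the convention, known from multiple-minimum-support mining, that an itemset's threshold is the lowest minimum support among its items; the abstract does not say how VSFIM combines per-item thresholds, so `is_frequent` and `min_sup` are hypothetical.

```python
def is_frequent(itemset, transactions, min_sup):
    """Check an itemset against a threshold derived from per-item supports."""
    wanted = set(itemset)
    count = sum(1 for t in transactions if wanted <= set(t))
    # Assumed rule: the itemset inherits the lowest threshold of its items.
    threshold = min(min_sup[i] for i in wanted)
    return count >= threshold


transactions = [
    {"bread", "milk"},
    {"bread", "caviar"},
    {"bread"},
    {"bread", "milk", "caviar"},
]
# Common items get a high threshold, rare items a low one.
min_sup = {"bread": 3, "milk": 2, "caviar": 2}

rare_pair = is_frequent(("bread", "caviar"), transactions, min_sup)
```

Under a single uniform threshold of 3, the pair `bread` and `caviar` (count 2) would be discarded, whereas the per-item thresholds keep it.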

A Comprehensive Survey of Frequent Itemsets Mining on Transactional Database with Weighted Items

Research and Development on Information and Communication Technology ◽

10.32913/mic-ict-research.v2021.n1.967 ◽

2021 ◽

Vol 2021 (1) ◽

pp. 19-28

Author(s):

Thanh Huan Phan ◽

Hoài Bắc Lê

Keyword(s):

Scale Up ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Future Research ◽

Technical Solution ◽

Comprehensive Survey ◽

Frequent Itemsets Mining ◽

Mining Algorithms

In 1993, Agrawal et al. proposed the first algorithm for mining traditional frequent itemsets on binary transactional databases with unweighted items. This algorithm is essential for finding hidden relationships among items in data. By 1998, with the development of various types of transactional databases, some researchers had proposed frequent itemset mining algorithms on transactional databases with weighted items (where the importance/meaning/value of items differs), which provide more knowledge than traditional frequent itemset mining. In this article, the authors present a survey of frequent itemset mining algorithms on transactional databases with weighted items over the past twenty years. This research helps researchers choose the right technical solution when scaling up in big data mining. Finally, the authors give their recommendations and directions for future research.
