Optimum Frequent Pattern Approach for Efficient Incremental Mining on Large Databases using Map Reduce

Abstract: Association rules provide important knowledge that can be extracted from transactional databases. Owing to the massive exchange of information nowadays, databases become dynamic and change rapidly and periodically: new transactions are added to the database and/or old transactions are updated or removed from the database. Incremental mining was introduced to overcome the problem of maintaining previously generated association rules in dynamic databases. In this paper, we propose an efficient algorithm (IMIDB) for incremental itemset mining in large databases. The algorithm utilizes the trie data structure for indexing dynamic database transactions. Performance comparison of the proposed algorithm to recently cited algorithms shows that a significant improvement of about two orders of magnitude is achieved by our algorithm. Also, the proposed algorithm exhibits linear scalability with respect to database size.
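The idea of indexing dynamic transactions in a trie can be illustrated with a minimal sketch. This is not the IMIDB algorithm itself, whose details the abstract does not give; the class and method names (`TransactionTrie`, `add`, `remove`, `support`) are hypothetical, and the sketch assumes that a removed transaction was previously added.

```python
class TrieNode:
    """One trie node; count = number of stored transactions passing through it."""
    __slots__ = ("count", "children")

    def __init__(self):
        self.count = 0
        self.children = {}


class TransactionTrie:
    """Hypothetical trie index over transactions (illustrative, not IMIDB)."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, transaction, delta=1):
        # Items are stored in sorted order so equal transactions share one path.
        node = self.root
        node.count += delta
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, TrieNode())
            node.count += delta

    def remove(self, transaction):
        # Assumes the transaction was added earlier; counts are decremented.
        self.add(transaction, delta=-1)

    def support(self, itemset):
        """Number of stored transactions that contain every item of itemset."""
        return self._count(self.root, sorted(set(itemset)), 0)

    def _count(self, node, target, i):
        if i == len(target):
            return node.count  # every transaction below this node matches
        total = 0
        for item, child in node.children.items():
            if item == target[i]:
                total += self._count(child, target, i + 1)
            elif item < target[i]:
                total += self._count(child, target, i)
            # item > target[i]: paths are sorted, so target[i] cannot occur below
        return total
```

Because additions and removals only touch one root-to-leaf path, the index can be updated without rescanning the whole database, which is the property incremental mining relies on.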

BIG DATA MINING FOR INTERESTING PATTERNS WITH MAP REDUCE TECHNIQUE

Asian Journal of Pharmaceutical and Clinical Research ◽

10.22159/ajpcr.2017.v10s1.19634 ◽

2017 ◽

Vol 10 (13) ◽

pp. 191

Author(s):

Nikhil Jamdar ◽

A Vijayalakshmi

Keyword(s):

Data Mining ◽

Pattern Mining ◽

Uncertain Data ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Map Reduce ◽

Frequent Patterns ◽

Precise Data ◽

Big Data Mining ◽

Transactional Databases

Many data mining algorithms search for interesting patterns in transactional databases of precise data. Frequent pattern mining is a technique for finding the items that occur frequently in the data. Most techniques find all the interesting patterns in a collection of precise data, where the items occurring in each transaction are known to the system with certainty. In many real-time applications, however, users are interested in only a tiny portion of the large set of frequent patterns. The proposed user-constrained mining approach helps find the frequent patterns in which the user is interested. It efficiently finds these patterns by applying user constraints to collections of uncertain data. Users specify their interests in the form of constraints, and the approach uses the Map Reduce model to find the uncertain frequent patterns that satisfy the user-specified constraints.
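A minimal sketch of constrained mining over uncertain data in a map/reduce style follows. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not define the support measure, so the sketch assumes the commonly used expected support (the sum over transactions of the product of the item probabilities), runs the map and reduce phases in a single process, and uses hypothetical names (`map_phase`, `reduce_phase`, `mine_constrained`) and a hypothetical `max_size` limit.

```python
from collections import defaultdict
from itertools import combinations
from math import prod


def map_phase(transaction, constraint, max_size=2):
    """Emit (itemset, probability) pairs for itemsets that satisfy the constraint.

    transaction maps each item to its existential probability (assumed model).
    """
    items = sorted(transaction)
    for k in range(1, max_size + 1):
        for itemset in combinations(items, k):
            if constraint(itemset):
                yield itemset, prod(transaction[i] for i in itemset)


def reduce_phase(pairs):
    """Sum the emitted probabilities per itemset to get the expected support."""
    totals = defaultdict(float)
    for itemset, probability in pairs:
        totals[itemset] += probability
    return totals


def mine_constrained(db, constraint, min_expected_support, max_size=2):
    pairs = (pair for t in db for pair in map_phase(t, constraint, max_size))
    totals = reduce_phase(pairs)
    return {k: v for k, v in totals.items() if v >= min_expected_support}


# Usage: keep only patterns that contain item "a".
db = [{"a": 0.9, "b": 0.5}, {"a": 0.8, "c": 0.4}, {"a": 0.5, "b": 1.0}]
patterns = mine_constrained(db, lambda s: "a" in s, min_expected_support=0.9)
```

Applying the constraint inside the map phase means itemsets the user does not care about never reach the reducer, which is where the saving in intermediate data would come from.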

H-Mine: Fast and space-preserving frequent pattern mining in large databases

IIE Transactions ◽

10.1080/07408170600897460 ◽

2007 ◽

Vol 39 (6) ◽

pp. 593-605 ◽

Cited By ~ 25

Author(s):

Jian Pei ◽

Jiawei Han ◽

Hongjun Lu† ◽

Shojiro Nishio ◽

Shiwei Tang ◽

...

Keyword(s):

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Large Databases

Frequent Pattern Mining Based on Pattern Space Division in Map/Reduce Cluster

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.588-589.2038 ◽

2012 ◽

Vol 588-589 ◽

pp. 2038-2041

Author(s):

Qian Liu ◽

Ming Chen

Keyword(s):

Pattern Mining ◽

Recursive Algorithm ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Map Reduce ◽

Data Set ◽

Combinatorial Explosion ◽

Space Division ◽

Pattern Space ◽

The Many

By means of pattern space division and based on Map/Reduce, the problem of processing the many-to-many relationship between the data set and the pattern set is converted into the problem of processing the many-to-many relationship between the data subsets and the pattern subspaces associated with the frequent 1-itemsets. The set of intermediate key/value pairs is thereby reduced so dramatically that the single-Map-node bottleneck caused by the combinatorial explosion of the candidate pattern space is avoided. Over three rounds of Map/Reduce tasks, the pattern space is constructed and divided, the filtering rules are established and employed, and the frequent patterns are mined in each pattern subspace independently. By exploiting both the universal traits of the entire pattern space and the individuality of each pattern subspace, an optimized non-recursive algorithm is designed and implemented to improve the efficiency of the mining phase.
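One plausible reading of this division can be sketched as follows. This is an assumption-laden illustration rather than the paper's algorithm: each frequent item owns the subspace of patterns whose last item (in frequency order) is that item, each subspace receives only the transaction prefixes it needs, and the three rounds are run as plain functions in one process. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_items(db):
    """Round 1: count the support of every single item."""
    return Counter(item for t in db for item in set(t))


def divide(db, frequent):
    """Round 2: route to each frequent item the prefixes that precede it."""
    rank = {item: r for r, item in enumerate(frequent)}
    subsets = defaultdict(list)
    for t in db:
        items = sorted((i for i in set(t) if i in rank), key=rank.get)
        for j, item in enumerate(items):
            subsets[item].append(tuple(items[:j]))
    return subsets


def mine_subspace(item, prefixes, min_support):
    """Round 3: count, without recursion, every pattern that ends in item."""
    counts = Counter()
    for prefix in prefixes:
        for k in range(len(prefix) + 1):
            for combo in combinations(prefix, k):
                counts[combo + (item,)] += 1
    return {p: c for p, c in counts.items() if c >= min_support}


def mine_by_subspace(db, min_support):
    counts = count_items(db)
    frequent = sorted(
        (i for i, c in counts.items() if c >= min_support),
        key=lambda i: (-counts[i], i),
    )
    result = {}
    for item, prefixes in divide(db, frequent).items():
        result.update(mine_subspace(item, prefixes, min_support))
    return result
```

Since no pattern belongs to two subspaces, each call to `mine_subspace` could run on a different node without coordination. The subset enumeration here is exponential in the prefix length and is only meant to keep the sketch short.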

Association Rules Analysis on FP-Growth Method in Predicting Sales

10.31227/osf.io/8m57c ◽

2017 ◽

Author(s):

Andysah Putera Utama Siahaan ◽

Mesran Mesran ◽

Andre Hasudungan Lubis ◽

Ali Ikhwan ◽

Supiyandi

Keyword(s):

Data Mining ◽

Association Rules ◽

Frequent Itemset ◽

Frequent Pattern ◽

Data Set ◽

Pattern Processing ◽

Large Databases ◽

Growth Method ◽

Association Rules Analysis ◽

A Company

A company's sales transaction data continues to grow day by day. Large amounts of data can be problematic for a company if they are not managed properly. Data mining is a field of science that unifies techniques from machine learning, pattern processing, statistics, databases, and visualization to handle the problem of retrieving information from large databases. The relationship sought in data mining can be a relationship between two or more items in one dimension. Among the association rule algorithms in data mining, the Frequent Pattern Growth (FP-Growth) algorithm is one of the alternatives that can be used to determine the most frequent itemsets in a data set.
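For readers unfamiliar with FP-Growth, a compact sketch of the general technique is given below. It is a textbook-style illustration, not the code used in the cited study; the names `Node`, `build_tree`, and `fp_growth` are hypothetical, and transactions are passed as `(items, weight)` pairs so that the same function can process conditional pattern bases.

```python
from collections import Counter, defaultdict


class Node:
    """FP-tree node holding an item, its parent, a count, and child nodes."""

    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}


def build_tree(weighted_transactions, min_support):
    """Build an FP-tree; return its header table and the frequent item counts."""
    counts = Counter()
    for items, weight in weighted_transactions:
        for item in set(items):
            counts[item] += weight
    frequent = {i: c for i, c in counts.items() if c >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, weight in weighted_transactions:
        # Insert frequent items in descending support order to share prefixes.
        ordered = sorted(
            (i for i in set(items) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += weight
    return header, frequent


def fp_growth(weighted_transactions, min_support, suffix=()):
    """Return {pattern: support} for all frequent patterns extending suffix."""
    header, frequent = build_tree(weighted_transactions, min_support)
    patterns = {}
    for item, support in frequent.items():
        patterns[tuple(sorted(suffix + (item,)))] = support
        # Conditional pattern base: the prefix path of every node of this item.
        base = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        patterns.update(fp_growth(base, min_support, suffix + (item,)))
    return patterns


# Usage: every sales transaction has weight 1.
sales = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
frequent_itemsets = fp_growth([(t, 1) for t in sales], min_support=2)
```

The database is scanned only to build the tree; all further counting happens on the compressed tree and its conditional bases, with no candidate generation.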

Incremental Mining of Popular Patterns from Transactional Databases

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.7.10913 ◽

2018 ◽

Vol 7 (2.7) ◽

pp. 636

Author(s):

G Vijay Kumar ◽

M Sreedevi ◽

K Bhargav ◽

P Mohan Krishna

Keyword(s):

Tree Structure ◽

Frequent Pattern ◽

Frequent Patterns ◽

Incremental Mining ◽

Incremental Value ◽

Transactional Databases ◽

Regular Patterns ◽

Pattern Problem

Since the frequent pattern mining problem was introduced, researchers have extended frequent patterns to various useful patterns, such as cyclic, periodic, and regular patterns, in emerging databases. This paper considers popular patterns, which capture the popularity of every item across incremental databases. The method used for mining popular patterns is the Incrpop-growth algorithm, which applies the Incrpop-tree structure. In incremental databases, the occurrence frequency and the occurrence behavior of a pattern change whenever a small set of new transactions is added to the database. The paper therefore proposes a new algorithm to mine popular patterns in incremental transactional databases using the Incrpop-tree structure. Finally, experiments have been carried out, and the reported results give information about compactness, time efficiency, and space efficiency.

An UBMFFP Tree for Mining Multiple Fuzzy Frequent Itemsets

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488515500385 ◽

2015 ◽

Vol 23 (06) ◽

pp. 861-879 ◽

Cited By ~ 7

Author(s):

Jerry Chun-Wei Lin ◽

Tzung-Pei Hong ◽

Tsung-Ching Lin ◽

Shing-Tai Pan

Keyword(s):

Upper Bound ◽

Frequent Itemsets ◽

Tree Structure ◽

Frequent Pattern ◽

Second Phase ◽

Two Phase ◽

Tree Algorithm ◽

Large Databases ◽

Frequent Pattern Tree ◽

Mining Algorithms

Frequent itemsets are useful for discovering interesting associations hidden in large databases. Many mining algorithms use data with binary attributes to represent the occurrence of items and find frequent itemsets. However, many real-world applications provide a richer source of transactions with quantitative values. The fuzzy frequent pattern tree algorithm was thus proposed for extracting fuzzy frequent itemsets from the quantitative transactions. In this paper, a tree structure called the upper-bound multiple fuzzy frequent-pattern (UBMFFP)-tree is designed for improving the pruning effect in the mining process. A two-phase fuzzy mining approach based on the tree structure is also proposed to obtain the complete fuzzy frequent itemsets from a quantitative database. The proposed fuzzy mining approach recursively and efficiently finds the upper-bound fuzzy counts of itemsets with the aid of the tree structure. It prunes unpromising itemsets in the first phase, and then finds the actual fuzzy frequent itemsets in the second phase. Experimental results indicate that the proposed UBMFFP-tree algorithm has good performance in terms of execution time and number of tree nodes.
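The two-phase idea (prune with an upper bound, then compute actual fuzzy counts) can be illustrated with a small sketch. It does not reproduce the UBMFFP-tree or the paper's bound: the triangular membership functions, the region names, and the bound used here (the smaller of the two single-term fuzzy counts, which can never be below the pair's actual fuzzy count) are assumptions chosen for illustration, and the sketch stops at 2-itemsets.

```python
from itertools import combinations


def memberships(quantity):
    """Hypothetical triangular membership functions for three fuzzy regions."""
    return {
        "Low": max(0.0, min(1.0, (6 - quantity) / 5)),
        "Middle": max(0.0, 1 - abs(quantity - 6) / 5),
        "High": max(0.0, min(1.0, (quantity - 6) / 5)),
    }


def fuzzify(db):
    """Turn {item: quantity} transactions into {(item, region): degree}."""
    return [
        {
            (item, region): degree
            for item, quantity in t.items()
            for region, degree in memberships(quantity).items()
            if degree > 0
        }
        for t in db
    ]


def fuzzy_count(itemset, fuzzy_db):
    """Sum over transactions of the minimum membership degree in the itemset."""
    return sum(min(t.get(term, 0.0) for term in itemset) for t in fuzzy_db)


def two_phase(db, min_count):
    fuzzy_db = fuzzify(db)
    terms = sorted({term for t in fuzzy_db for term in t})
    singles = {term: fuzzy_count((term,), fuzzy_db) for term in terms}
    frequent = {(term,): c for term, c in singles.items() if c >= min_count}
    # Phase 1: keep pairs of different items whose upper bound passes.
    candidates = [
        (x, y)
        for x, y in combinations(terms, 2)
        if x[0] != y[0] and min(singles[x], singles[y]) >= min_count
    ]
    # Phase 2: compute the actual fuzzy count of each surviving candidate.
    for pair in candidates:
        count = fuzzy_count(pair, fuzzy_db)
        if count >= min_count:
            frequent[pair] = count
    return frequent
```

Phase 1 never discards a truly frequent pair because the bound is at least the actual count, so phase 2 only has to verify the survivors.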
