An Efficient Parallel Algorithm for Mining Frequent Pattern

2012 ◽  
Vol 562-564 ◽  
pp. 876-881
Author(s):  
Guan Xun Cui ◽  
Qian Wu ◽  
Bo He ◽  
Wei Ni

Extracting frequent patterns from transaction-oriented databases is crucial to several data mining tasks, such as association rule generation, time series analysis, and classification. This paper proposes an Efficient Parallel algorithm for Mining frequent patterns (EPM), which improves on the Fast Distributed association rules Mining (FDM) algorithm. A hash table is used to generate candidate 2-itemsets more efficiently, and a Tid table is used to reduce the number of transactions in the transaction database. The algorithm adopts a master-slave parallel model for mining association rules to reduce communication cost. Experimental results show that the algorithm handles large databases efficiently.
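
The abstract does not describe the hash scheme, so the following is only a minimal sketch of hash-based pruning of candidate 2-itemsets in the spirit of direct hashing and pruning, not EPM itself. The function name `candidate_pairs`, the bucket count, and the sample data are assumptions made for illustration. Pairs are hashed into buckets during the first pass, and a pair is kept as a candidate only if both items are frequent and its bucket count reaches the minimum count.

```python
from collections import Counter
from itertools import combinations


def candidate_pairs(transactions, min_count, n_buckets=7):
    """Return candidate 2-itemsets after hash-bucket pruning (illustrative sketch)."""
    item_counts = Counter()
    bucket_counts = [0] * n_buckets

    # Single pass: count items and hash every pair in each transaction to a bucket.
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[hash(pair) % n_buckets] += 1

    frequent = sorted(i for i, c in item_counts.items() if c >= min_count)

    # A bucket count is an upper bound on the count of any pair hashed into it,
    # so pairs in light buckets can be discarded without a second count.
    return [
        p for p in combinations(frequent, 2)
        if bucket_counts[hash(p) % n_buckets] >= min_count
    ]


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
candidates = candidate_pairs(transactions, min_count=2)
```

Because of hash collisions the candidate list may contain a few infrequent pairs, but it never drops a frequent one; a later counting pass would confirm the survivors.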

2005 ◽  
Vol 277-279 ◽  
pp. 287-292
Author(s):  
Lu Na Byon ◽  
Jeong Hye Han

As electronic commerce grows, temporal association rules are being developed to offer personalized services that track customers' interests over time. In this article, we propose a temporal association rule and an algorithm for discovering it in a large transaction database using an exponential smoothing filter. Experimental results confirm that the approach is more precise and runs faster than existing temporal association rule methods.


2011 ◽  
Vol 317-319 ◽  
pp. 1868-1871
Author(s):  
Jian Hong Li

This paper focuses on an important research topic in data mining (DM) that relies heavily on association rules. It addresses the maintenance problem that arises when the transaction database is static but the minimum support and confidence coefficient change slightly. A novel algorithm based on incremental updating is proposed, termed NIUA (Novel Incremental Updating Algorithm). NIUA uses association rules to mine the database, aiming to uncover potential information or explanatory factors in massive data.


Author(s):  
Paul D. McNicholas ◽  
Yanchang Zhao

Association rules present one of the most versatile techniques for the analysis of binary data, with applications in areas as diverse as retail, bioinformatics, and sociology. In this chapter, the origin of association rules is discussed along with the functions by which association rules are traditionally characterised. Following the formal definition of an association rule, these functions – support, confidence and lift – are defined and various methods of rule generation are presented, spanning 15 years of development. There is some discussion about negations and negative association rules and an analogy between association rules and 2×2 tables is outlined. Pruning methods are discussed, followed by an overview of measures of interestingness. Finally, the post-mining stage of the association rule paradigm is put in the context of the preceding stages of the mining process.
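
As a concrete illustration of the three functions named above, here is a minimal sketch using their standard definitions: support is the fraction of transactions containing an itemset, confidence of A → B is support(A ∪ B) / support(A), and lift is confidence(A → B) / support(B). The function names and the toy transactions are assumptions for illustration, and the sketch assumes the antecedent and consequent each occur at least once.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent); assumes support(antecedent) > 0."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to the consequent's baseline support."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


# Hypothetical toy data, not taken from the chapter.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

s = support({"bread", "milk"}, transactions)      # 2 of 4 transactions
c = confidence({"bread"}, {"milk"}, transactions)  # 0.5 / 0.75
l = lift({"bread"}, {"milk"}, transactions)        # (2/3) / 0.75
```

A lift below 1, as in this toy example, indicates that the antecedent and consequent co-occur less often than they would if they were independent.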


Author(s):  
Subba Reddy Meruva ◽  
Venkateswarlu Bondu

An association rule defines a relationship among items, and frequent items are discovered using a support-confidence framework. This framework identifies strong association rules, i.e. those of interest to the user, with two thresholds: minimum support and minimum confidence. Traditional association rule mining methods, such as Apriori and frequent pattern growth (FP-growth), are widely used for discovering frequent itemsets. Their limitation is that they do not consider key factors of the items, such as profit, quantity, or cost, during the mining process. Applications such as e-commerce, marketing, healthcare, and web recommendation involve items that carry a utility or profit. In such cases, utility-based itemset mining methods play a vital role in generating effective association rules and are also useful for mining high-utility itemsets. This paper surveys high-utility itemset mining methods and discusses observations on existing methods, together with their experimental evaluation on benchmark datasets.


Data Mining ◽  
2013 ◽  
pp. 859-879
Author(s):  
Qin Ding ◽  
Gnanasekaran Sundarraj

Finding frequent patterns and association rules in large data has become a very important task in data mining. Various algorithms have been proposed to solve such problems, but most algorithms are only applicable to relational data. With the increasing use and popularity of XML representation, it is of importance yet challenging to find solutions to frequent pattern discovery and association rule mining of XML data. The challenge comes from the complexity of the structure in XML data. In this chapter, we provide an overview of the state-of-the-art research in content-based and structure-based mining of frequent patterns and association rules from XML data. We also discuss the challenges and issues, and provide our insight for solutions and future research directions.


2012 ◽  
Vol 236-237 ◽  
pp. 326-333
Author(s):  
Zhi Cheng Qu ◽  
Meng Ye ◽  
Bin Jiang

Association rules reveal interesting relationships between items in a transaction database. Traditional association rules, however, have two disadvantages. First, they assume that all items have the same significance in the database, which is unreasonable in many real applications and usually leads to incorrect results. Second, the traditional representation contains too much redundancy, which makes the rules difficult to mine and use. This paper addresses the problem of mining weighted concise association rules based on closed itemsets under a weighted support-significant framework, in which each item is assigned a weight according to its significance. By exploiting a specific technique, the proposed algorithm can mine all weighted concise association rules while pruning the duplicate weighted itemset search space. Experiments show that the proposed method leads to good results and achieves good performance.


2021 ◽  
Vol 2 (1) ◽  
pp. 132-139
Author(s):  
Wiwit Pura Nurmayanti ◽  
Hanipar Mahyulis Sastriana ◽  
Abdul Rahim ◽  
Muhammad Gazali ◽  
Ristu Haiban Hirzi ◽  
...  

Indonesia is an equatorial country with abundant natural wealth, from the seabed to the mountaintops. Part of its beauty lies in the mountains found across its provinces; West Nusa Tenggara, for example, is known for its beautiful mountain, Rinjani. The growth in outdoor activities has led many people to open outdoor shops in the West Nusa Tenggara region. Sales transaction data from these stores can be processed into information that benefits the stores themselves. This study uses market basket analysis to examine the associations (rules) among sales attributes, with the aim of determining the relationship patterns in the transactions. The data are sales transactions for outdoor goods. The analysis applies association rules with the Apriori algorithm and the frequent pattern growth (FP-Growth) algorithm. The study yields 10 rules with Apriori and 4 rules with FP-Growth. The resulting association rule is "if a consumer buys a portable stove, it is possible that portable gas will also be purchased", with a minimum support of 0.296 and a confidence of 0.774 for Apriori, and 0.296 and 0.750 for FP-Growth.


2018 ◽  
Vol 189 ◽  
pp. 10012
Author(s):  
Ming Yin ◽  
Wenjie Wang ◽  
Yang Liu ◽  
Dan Jiang

The FP-Growth algorithm is an association rule mining algorithm based on the frequent pattern tree (FP-Tree), which does not need to generate a large number of candidate sets. However, it requires two scans of the original transaction database to construct the FP-Tree, followed by recursive mining of the FP-Tree to generate frequent itemsets. In addition, the algorithm does not work effectively when the dataset is dense. To address the large memory usage and low time efficiency of this algorithm, this paper proposes an improved algorithm based on an adjacency table, which is stored in a hash table and considerably reduces lookup time. Experimental results show that the improved algorithm performs well, especially for mining frequent itemsets in dense datasets.
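
The two database scans mentioned above can be sketched as follows. This is a minimal illustration of standard FP-Tree construction only, not the paper's adjacency-table variant; the names `Node` and `build_fp_tree` are assumptions, and the header table, node links, and recursive mining step are omitted for brevity.

```python
from collections import Counter


class Node:
    """One FP-Tree node: an item, its count, and its children keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_count):
    # Scan 1: count the support of every item and keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    root = Node(None)
    # Scan 2: insert each transaction's frequent items in descending-support
    # order (ties broken by item name) so that common prefixes share nodes.
    for t in transactions:
        ordered = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root, frequent


root, frequent = build_fp_tree([["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]], min_count=2)
```

In this toy input every transaction contains "b", so all paths share a single "b" node under the root, which is the prefix compression that lets FP-Growth avoid candidate generation.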


2020 ◽  
Vol 10 (13) ◽  
pp. 4590
Author(s):  
Hyun-Jin Kim ◽  
Ji-Won Baek ◽  
Kyungyong Chung

This study proposes a method for optimizing an associative knowledge graph using TF-IDF based ranking scores. The proposed method calculates TF-IDF weights across all documents and generates a term ranking. Optimized transactions are then generated from the terms that score highly in this ranking. News data are first collected through crawling and then converted into a corpus through preprocessing. Unnecessary data are removed during preprocessing, which includes lowercase conversion and the removal of punctuation marks and stop words. Words are extracted from the document-term matrix, and transactions are generated from them. In the data cleaning process, the Apriori algorithm is applied to generate association rules and build a knowledge graph. To optimize the generated knowledge graph, the proposed method uses the TF-IDF based ranking scores to remove low-scoring terms and recreate the transactions. The association rule algorithm is then applied to the result to create an optimized knowledge model. Performance is evaluated in terms of rule generation speed and the usefulness of the association rules. The proposed method generates association rules about 22 seconds faster, and its lift value, used as the measure of usefulness, is about 0.43 to 2.51 higher than that of each of the conventional association rule algorithms.
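
The term-filtering step described above could look roughly like the sketch below. The abstract does not give the exact TF-IDF formula or how scores are aggregated, so this assumes term frequency normalised by document length, an inverse document frequency of log(N / df), and a per-term score summed over documents. The function names, the `keep` parameter, and the sample documents are hypothetical, and every document is assumed to be non-empty.

```python
import math
from collections import Counter


def tfidf_rank(docs):
    """Sum each term's TF-IDF weight over all documents (assumed aggregation)."""
    n = len(docs)
    df = Counter(term for d in docs for term in set(d))
    scores = Counter()
    for d in docs:
        for term, c in Counter(d).items():
            scores[term] += (c / len(d)) * math.log(n / df[term])
    return scores


def filter_transactions(docs, keep):
    """Keep only the `keep` highest-ranked terms and rebuild the transactions."""
    scores = tfidf_rank(docs)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    top = {term for term, _ in ranked[:keep]}
    return [sorted(set(d) & top) for d in docs]


# Hypothetical preprocessed documents (already lowercased and tokenised).
docs = [
    ["the", "stock", "rises"],
    ["the", "stock", "falls"],
    ["the", "rain", "falls"],
]
transactions = filter_transactions(docs, keep=4)
```

Here "the" appears in every document, so its weight is zero and it is the term dropped; the remaining transactions would then be passed to an association rule miner such as Apriori.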

