Coalesce based binary table: an enhanced algorithm for mining frequent patterns

2017 ◽  
Vol 7 (1.5) ◽  
pp. 51
Author(s):  
M. Sireesha ◽  
Srikanth Vemuru ◽  
S. N. TirumalaRao

Frequent itemset mining and association rule mining are key tasks in the knowledge discovery process. Various customized algorithms have been implemented in association rule mining to find the set of frequent patterns. Among these, Apriori is one of the standard algorithms for finding frequent itemsets, but it is inefficient because it scans the database several times and generates a large number of candidates. To overcome these limitations, this paper introduces a new algorithm called Coalesce based Binary Table. The algorithm scans the given database only once to generate a Binary Table, from which the frequent-1 itemsets are found. Next, infrequent-1 itemsets are identified and removed from the Binary Table, and the remaining items are rearranged in ascending order of support. For each frequent-1 itemset, a Coalesce matrix and an Index List are built to generate all frequent itemsets that have the same support count as the representative item; the remaining frequent itemsets are obtained in a depth-first manner. The main benefits of the proposed method are that the whole database is scanned only once and that there is no need to generate and check each candidate to find the frequent itemsets. In addition, frequent itemsets that have the same support count as a representative item can be identified directly by joining the representative item with all combinations of the Coalesce matrix. Coalesce based Binary Table therefore shortens the time needed to identify frequent itemsets, which improves efficiency.
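
The single-scan step described in this abstract can be illustrated with a minimal Python sketch. It covers only the construction of the binary table, the removal of infrequent-1 items, and the ascending-support ordering; the Coalesce matrix and Index List are not defined in the abstract and are not attempted here. The function name `build_binary_table` and the tie-breaking by item name are assumptions made for illustration.

```python
def build_binary_table(transactions, min_support):
    """Scan the transactions once and return (items, rows, support).

    items   -- frequent single items in ascending order of support
    rows    -- 0/1 table, one row per transaction, one column per kept item
    support -- support count of each kept item
    """
    # Single pass over the data: record which transactions contain each item.
    columns = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            columns.setdefault(item, set()).add(tid)

    # The support of a single item is the number of 1s in its column.
    support = {item: len(tids) for item, tids in columns.items()}

    # Drop infrequent-1 items and order the rest by ascending support.
    # Ties are broken by item name (an assumption, for determinism only).
    items = sorted(
        (item for item, count in support.items() if count >= min_support),
        key=lambda item: (support[item], item),
    )

    # Materialise the binary table from the recorded columns (no second scan).
    rows = [
        [1 if tid in columns[item] else 0 for item in items]
        for tid in range(len(transactions))
    ]
    return items, rows, {item: support[item] for item in items}


transactions = [["a", "b", "c"], ["a", "c"], ["a", "d"], ["b", "c"]]
items, rows, support = build_binary_table(transactions, min_support=2)
print(items)    # ['b', 'a', 'c']
print(support)  # {'b': 2, 'a': 3, 'c': 3}
```

Item "d" occurs once, so it is removed before the table is built; the remaining columns are ordered from least to most frequent.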

2020 ◽  
Vol 1 (3) ◽  
pp. 1-7
Author(s):  
Sarbani Dasgupta ◽  
Banani Saha

In data mining, the Apriori technique is generally used for frequent itemset mining and association rule learning over transactional databases. The frequent itemsets generated by the Apriori technique provide association rules, which are used to find trends in the database. As the size of the database increases, a sequential implementation of the Apriori technique takes a long time, and at some point the system may crash. To overcome this problem, several algorithms for parallel implementation of the Apriori technique have been proposed. This paper gives a comparative study of various parallel implementations of the Apriori technique. It also focuses on the advantages of using MapReduce, the latest technology used to parallelize the mining of large datasets.
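
As a rough illustration of why support counting parallelizes well, the sketch below simulates a map step and a reduce step in plain Python. It is not taken from any of the algorithms the paper compares. The partitioning, the function names, and the choice to count every k-item subset (rather than only Apriori-pruned candidates) are simplifying assumptions.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_count(partition, k):
    """Map step: count every k-item subset within one data partition."""
    counts = Counter()
    for transaction in partition:
        for itemset in combinations(sorted(set(transaction)), k):
            counts[itemset] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial count tables."""
    return left + right


def frequent_k_itemsets(partitions, k, min_support):
    # In a real MapReduce job each partition would be counted on its own worker.
    partial = [map_count(partition, k) for partition in partitions]
    total = reduce(reduce_counts, partial, Counter())
    return {itemset: count for itemset, count in total.items() if count >= min_support}


partitions = [
    [["a", "b"], ["a", "c"]],
    [["a", "b", "c"], ["b", "c"]],
]
print(frequent_k_itemsets(partitions, k=2, min_support=2))
```

Each partition is counted independently, so only the small per-partition count tables need to be combined.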


2008 ◽  
pp. 2993-3004
Author(s):  
George Tzanis ◽  
Christos Berberidis

Association rule mining is a popular task that involves the discovery of co-occurrences of items in transaction databases. Several extensions of the traditional association rule mining model have been proposed so far; however, the problem of mining for mutually exclusive items has not been directly tackled yet. Such information could be useful in various cases (e.g., when the expression of a gene excludes the expression of another), or it can be used as a serious hint in order to reveal inherent taxonomical information. In this article, we address the problem of mining pairs of items, such that the presence of one excludes the other. First, we provide a concise review of the literature, then we define this problem, we propose a probability-based evaluation metric, and finally a mining algorithm that we test on transaction data.


Author(s):  
Ling Zhou ◽  
Stephen Yau

Association rule mining among frequent items has been extensively studied in data mining research. However, in recent years, there has been an increasing demand for mining infrequent items (such as rare but expensive items). Since exploring interesting relationships among infrequent items has not been discussed much in the literature, in this chapter, the authors propose two simple, practical and effective schemes to mine association rules among rare items. Their algorithms can also be applied to frequent items with bounded length. Experiments are performed on the well-known IBM synthetic database. The authors’ schemes compare favorably to Apriori and FP-growth under the situation being evaluated. In addition, they explore quantitative association rule mining in transactional databases among infrequent items by associating quantities of items: some interesting examples are drawn to illustrate the significance of such mining.


Author(s):  
Carson K.-S. Leung ◽  
Fan Jiang ◽  
Edson M. Dela Cruz ◽  
Vijay Sekar Elango

Collaborative filtering uses data mining and analysis to develop a system that helps users make appropriate decisions in real-life applications by removing redundant information and providing valuable information to users. Data mining aims to extract from data the implicit, previously unknown and potentially useful information, such as association rules that reveal relationships between frequently co-occurring patterns in the antecedent and consequent parts of the rules. This chapter presents an algorithm called CF-Miner for collaborative filtering with an association rule miner. The CF-Miner algorithm first constructs bitwise data structures to capture important contents in the data. It then finds frequent patterns from the bitwise structures. Based on the mined frequent patterns, the algorithm forms association rules. Finally, the algorithm ranks the mined association rules to recommend appropriate merchandise products, goods or services to users. Evaluation results show the effectiveness of CF-Miner in using association rule mining in collaborative filtering.
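
The abstract does not specify CF-Miner's bitwise structures, so the following is only a generic sketch of the idea: each item gets one bit vector (stored here as a Python integer), and the support of an itemset is the number of 1 bits in the AND of its items' vectors. The names `build_bitmaps` and `support` are hypothetical.

```python
def build_bitmaps(transactions):
    """Return {item: int}; bit t of the int is 1 if transaction t contains the item."""
    bitmaps = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bitmaps, itemset):
    """Support of an itemset: count the 1 bits in the AND of its item bitmaps."""
    items = list(itemset)
    if not items:
        raise ValueError("itemset must not be empty")
    combined = bitmaps.get(items[0], 0)
    for item in items[1:]:
        combined &= bitmaps.get(item, 0)
    return bin(combined).count("1")


transactions = [
    ["milk", "bread"],
    ["milk"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
]
bitmaps = build_bitmaps(transactions)
print(support(bitmaps, ["milk", "bread"]))  # 2
```

Once the bit vectors exist, the support of any pattern can be computed without rescanning the transactions.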


Author(s):  
Reshu Agarwal

A modified framework that applies temporal association rule mining to inventory management is proposed in this article. The ordering policy of frequent items is determined and inventory is classified based on loss rule. This helps inventory managers to determine optimum order quantity of frequent items together with the most profitable item in each time-span. An example is illustrated to validate the results.


2012 ◽  
Vol 6-7 ◽  
pp. 625-630 ◽  
Author(s):  
Hong Sheng Xu

The concept lattice, which is built from a formal context and the partial order relation among its concepts, is the core data structure of formal concept analysis. The association rule mining process includes two phases: first, find all the frequent itemsets in the data collection; second, generate association rules from these frequent itemsets. This paper analyzes association rule mining algorithms such as Apriori and FP-Growth. The paper presents the construction of a search engine based on formal concept analysis and association rule mining. Experimental results show that the proposed algorithm has high efficiency.
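
The second phase, generating rules from frequent itemsets that have already been found, can be sketched as follows. This is the standard confidence-based procedure and not necessarily what the paper implements. The input format (a dictionary from `frozenset` itemsets to support counts) and the function name `generate_rules` are assumptions.

```python
from itertools import combinations


def generate_rules(supports, min_confidence):
    """Build rules (antecedent, consequent, confidence) from phase-one output.

    supports maps each frequent itemset (a frozenset) to its support count and
    must also contain every non-empty subset of each itemset.
    """
    rules = []
    for itemset, count in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for chosen in combinations(sorted(itemset), size):
                antecedent = frozenset(chosen)
                # confidence(A -> B) = support(A and B together) / support(A)
                confidence = count / supports[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


supports = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 2,
    frozenset({"a", "b"}): 2,
}
for antecedent, consequent, confidence in generate_rules(supports, min_confidence=0.8):
    print(sorted(antecedent), "->", sorted(consequent), confidence)
```

With these counts only the rule from "b" to "a" passes the 0.8 threshold, since the reverse rule has confidence 0.5.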


Author(s):  
Mafruz Ashrafi ◽  
David Taniar ◽  
Kate Smith

Association rule mining is one of the most widely used data mining techniques. To achieve a better performance, many efficient algorithms have been proposed. Despite these efforts, many of these algorithms require a large amount of main memory to enumerate all frequent itemsets, especially when the dataset is large or the user-specified support is low. Thus, it becomes apparent that we need to have an efficient main memory handling technique, which allows association rule mining algorithms to handle larger datasets in the main memory. To achieve this goal, in this chapter we propose an algorithm for vertical association rule mining that compresses a vertical dataset in an efficient manner, using bit vectors. Our performance evaluations show that the compression ratio attained by our proposed technique is better than those of the other well-known techniques.


Author(s):  
Reshu Agarwal ◽  
Sarla Pareek ◽  
Biswajit Sarkar ◽  
Mandeep Mittal

In this article, an inventory model for a retailer's ordering policy is studied. Multi-level association rule mining is used to find frequent itemsets at each level by applying a different threshold at each level. During order quantity estimation, the category, content, and brand of the items are considered, which leads to the discovery of more specific and concrete knowledge of the required order quantity. At each level, the optimum order quantity of frequent items is determined. This helps inventory managers order the optimal quantity of items according to the actual requirement for each item with respect to its category, content and brand. An example is devised to explain the new approach. Further, to understand the effect of this approach in a real scenario, experiments are conducted on an existing dataset.
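
A minimal sketch of level-specific thresholds is given below, reduced to two levels (category and item) for brevity; the article itself works with category, content, and brand. The item-to-category mapping, the threshold values, and the function name `frequent_by_level` are all hypothetical, and the order-quantity model is not covered.

```python
from collections import Counter


def frequent_by_level(transactions, item_to_category, thresholds):
    """Count support at the category and item levels with separate thresholds.

    thresholds is a dict with the keys "category" and "item".
    """
    category_counts = Counter()
    item_counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        item_counts.update(items)
        # A category is counted once per transaction, however many items hit it.
        category_counts.update({item_to_category[item] for item in items})

    frequent_categories = {
        category: count
        for category, count in category_counts.items()
        if count >= thresholds["category"]
    }
    # Descend only into items whose parent category is itself frequent.
    frequent_items = {
        item: count
        for item, count in item_counts.items()
        if count >= thresholds["item"] and item_to_category[item] in frequent_categories
    }
    return frequent_categories, frequent_items


transactions = [
    ["skim milk", "white bread"],
    ["whole milk"],
    ["skim milk", "rye bread"],
    ["skim milk"],
]
item_to_category = {
    "skim milk": "milk",
    "whole milk": "milk",
    "white bread": "bread",
    "rye bread": "bread",
}
print(frequent_by_level(transactions, item_to_category, {"category": 3, "item": 2}))
```

A lower threshold at the item level lets specific items qualify even though each one is rarer than its category as a whole.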


2014 ◽  
Vol 998-999 ◽  
pp. 899-902 ◽  
Author(s):  
Cheng Luo ◽  
Ying Chen

Existing data mining algorithms have mostly been implemented in a centralized environment, but large-scale databases exist in distributed form. The distributed data mining algorithm FDM and its improved variants suffer from two problems: frequent itemsets are lost, and the network communication cost is too high. This paper proposes an association rule mining algorithm based on distributed data (ARADD). The ARADD algorithm includes a mapping mark array mechanism, which not only preserves the integrity of the frequent itemsets but also reduces the cost of network communication. Experiments demonstrate the efficiency of the algorithm.


2004 ◽  
Vol 03 (03) ◽  
pp. 245-257 ◽  
Author(s):  
Kwang-Il Ahn ◽  
Jae-Yearn Kim

Association rule mining is an important research topic in data mining. Association rule mining consists of two steps: finding frequent itemsets and then extracting interesting rules from the frequent itemsets. In the first step, efficiency is important since discovering frequent itemsets is computationally time consuming. In the second step, unbiased assessment is important for good decision making. In this paper, we deal with both the efficiency of the mining algorithm and the measure of interest of the resulting rules. First, we present an algorithm for finding frequent itemsets that uses a vertical database. We also introduce a modified vertical data format to reduce the size of the database and an itemset reordering strategy to reduce the size of the intermediate tidsets. Second, we present a new measure to evaluate the interest of the resulting association rules. Our performance analysis shows that our proposed algorithm reduces the size of the intermediate tidsets that are generated during the mining process. The smaller tidsets make intersection operations faster. Using our interest-measuring test helps to avoid the discovery of misleading rules.
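
For readers unfamiliar with vertical mining, the sketch below shows the general tidset-intersection idea in an Eclat-style depth-first search. It does not reproduce the authors' modified vertical format or their interest measure. Ordering items by ascending support is a common heuristic and is assumed here; the abstract does not say which reordering strategy the paper uses.

```python
def eclat(transactions, min_support):
    """Return {frozenset(itemset): support} using tidset intersections."""
    # Vertical format: item -> set of ids of the transactions that contain it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidsets.setdefault(item, set()).add(tid)

    # Keep frequent items, ordered by ascending support (assumed heuristic).
    frequent = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: (len(pair[1]), pair[0]),
    )

    result = {}

    def extend(prefix, candidates):
        for index, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            result[frozenset(itemset)] = len(tids)
            # Intersect with each later candidate to form the next level.
            next_candidates = []
            for other, other_tids in candidates[index + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    extend(frozenset(), frequent)
    return result


transactions = [["a", "b", "c"], ["a", "c"], ["a", "d"], ["b", "c"]]
for itemset, count in eclat(transactions, min_support=2).items():
    print(sorted(itemset), count)
```

The support of each extended itemset is simply the size of the intersected tidset, so smaller intermediate tidsets translate directly into cheaper intersections.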

