DWFIST

2008 ◽  
pp. 3142-3163
Author(s):  
Rodrigo Salvador Monteiro ◽  
Geraldo Zimbrao ◽  
Holger Schwarz ◽  
Bernhard Mitschang ◽  
Jano Moreira de Souza

This chapter presents the core of the DWFIST approach, which is concerned with supporting the analysis and exploration of frequent itemsets and derived patterns, e.g., association rules in transactional datasets. The goal of this new approach is to provide: (1) flexible pattern-retrieval capabilities without requiring the original data during the analysis phase; and (2) a standard modeling for data warehouses of frequent itemsets, allowing an easier development and reuse of tools for analysis and exploration of itemset-based patterns. Instead of storing the original datasets, our approach organizes frequent itemsets holding on different partitions of the original transactions in a data warehouse that retains sufficient information for future analysis. A running example for mining calendar-based patterns on data streams is presented. Staging area tasks are discussed and standard conceptual and logical schemas are presented. Properties of this standard modeling allow retrieval of frequent itemsets holding on any set of partitions, along with upper and lower bounds on their frequency counts. Furthermore, precision guarantees for some interestingness measures of association rules are provided as well.
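The idea of retrieving frequency bounds over any set of partitions can be illustrated with a minimal sketch. It is not the DWFIST schema itself: the dictionary layout, the `min_count` field (an absolute per-partition frequency threshold), and the function name `frequency_bounds` are assumptions made for illustration. The sketch relies on one observation: if an itemset was not stored as frequent for a partition, its count there lies between 0 and `min_count - 1`.

```python
def frequency_bounds(itemset, partitions):
    """Return (lower, upper) bounds on the count of `itemset` over `partitions`.

    Each partition is a hypothetical dict with:
      - "min_count": absolute threshold used when the partition was mined
      - "counts": mapping frozenset -> exact count, for frequent itemsets only
    """
    lower = upper = 0
    for part in partitions:
        count = part["counts"].get(itemset)
        if count is not None:
            # Itemset was frequent here, so its exact count is known.
            lower += count
            upper += count
        else:
            # Not stored: the true count is somewhere in [0, min_count - 1].
            upper += max(part["min_count"] - 1, 0)
    return lower, upper


monday = {"min_count": 10, "counts": {frozenset({"bread", "milk"}): 25}}
tuesday = {"min_count": 5, "counts": {}}
bounds = frequency_bounds(frozenset({"bread", "milk"}), [monday, tuesday])
print(bounds)
```

The gap between the two bounds grows with the number of partitions in which the itemset was not frequent, which is why the per-partition threshold matters for precision.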


2011 ◽  
Vol 130-134 ◽  
pp. 2629-2632
Author(s):  
Jie Liu ◽  
Tian Qi Li ◽  
Jian Pei Zhang

Multi-parameter data perturbation is a family of methods that perturb the original data for privacy-preserving association rule mining. However, restoring the frequent itemsets in the multi-parameter perturbation algorithm is still slow. This paper proposes a method that improves the time efficiency of the multi-parameter randomized perturbation algorithm by exploiting the characteristics of the model used to restore frequent itemsets. The method improves time efficiency by obtaining the elements of the first row of the inverse of the transformation matrix. Finally, both theoretical analysis and experimental results show that the improved algorithm is more time-efficient and space-efficient than the original algorithm.


2009 ◽  
Vol 12 (11) ◽  
pp. 49-56
Author(s):  
Bac Hoai Le ◽  
Bay Dinh Vo

In traditional association rule mining, finding all rules in a database that satisfy minSup and minConf becomes problematic when the number of frequent itemsets is large. A method is therefore needed that mines fewer rules while still covering all rules produced by the traditional method. One such approach is the mining of essential rules: it keeps only those rules whose left-hand side is minimal and whose right-hand side is maximal (with respect to the parent-child relationship). In this paper, we propose a new algorithm for mining essential rules from the lattice of frequent closed itemsets. Using the parent-child relationships already present in the lattice reduces the cost of checking these relationships and thus the time needed to mine the rules.
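The "minimal left-hand side, maximal right-hand side" criterion can be sketched as a simple filter. This is one reading of the definition given in the abstract, written as a brute-force pairwise check; it is not the lattice-based algorithm the paper proposes, and the function name `essential_rules` and the rule representation (pairs of frozensets) are assumptions for illustration.

```python
def essential_rules(rules):
    """Drop every rule that is covered by another rule.

    A rule (lhs, rhs) is treated as covered when a different rule exists
    whose left-hand side is a subset of lhs and whose right-hand side is
    a superset of rhs.
    """
    rules = list(dict.fromkeys(rules))  # remove duplicates, keep order
    kept = []
    for lhs, rhs in rules:
        covered = any(
            (other_lhs, other_rhs) != (lhs, rhs)
            and other_lhs <= lhs   # other rule needs no more items on the left
            and rhs <= other_rhs   # and concludes at least as much on the right
            for other_lhs, other_rhs in rules
        )
        if not covered:
            kept.append((lhs, rhs))
    return kept


rules = [
    (frozenset("a"), frozenset("bc")),   # a -> b, c
    (frozenset("a"), frozenset("b")),    # covered by a -> b, c
    (frozenset("ad"), frozenset("bc")),  # covered by a -> b, c
    (frozenset("d"), frozenset("e")),    # unrelated, kept
]
kept = essential_rules(rules)
print(kept)
```

The pairwise check is quadratic in the number of rules; the abstract's point is that the parent-child links of the closed itemset lattice avoid exactly this cost.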


2021 ◽  
Vol 336 ◽  
pp. 05009
Author(s):  
Junrui Yang ◽  
Lin Xu

To address the shortcomings of the traditional "support-confidence" framework for association rule mining and the problems of mining negative association rules, the concept of an interestingness measure is introduced. The advantages and disadvantages of some commonly used interestingness measures are analyzed, and a new interestingness measure model is proposed by combining the cosine measure with an interestingness measure model based on the difference idea. The new measure can effectively express the relationship between the antecedent and the consequent of a rule. Based on this model, an association rule mining algorithm using the fused interestingness measure is proposed to improve mining accuracy. Experiments show that the algorithm performs better and can effectively help mine positive and negative association rules.
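For reference, the cosine measure of a rule A → B is commonly defined as P(A ∪ B) / sqrt(P(A) · P(B)), where P denotes support. The sketch below computes only this standard measure. The fused model proposed in the paper is not specified in the abstract and is not reproduced here; the function and parameter names are assumptions.

```python
import math


def cosine(support_ab, support_a, support_b):
    """Cosine measure of a rule A -> B from relative supports.

    support_ab: support of the itemset containing both A and B
    support_a, support_b: supports of antecedent and consequent
    """
    if support_a <= 0 or support_b <= 0:
        raise ValueError("antecedent and consequent supports must be positive")
    return support_ab / math.sqrt(support_a * support_b)


value = cosine(0.2, 0.4, 0.4)
print(round(value, 3))
```

Because the denominator is a geometric mean of the two supports, the value stays in [0, 1] and does not depend on the number of transactions that contain neither A nor B.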


2019 ◽  
Vol 8 (2S11) ◽  
pp. 3448-3453

Classification is a data mining technique that assigns the items in a database to target classes. The aim of classification is to accurately find the target class for each instance of the data. Associative classification is a classification method that uses class association rules. It is often found to be more accurate than some traditional classification methods. Its major disadvantage is the generation of redundant and weak class association rules, which increase the size and decrease the accuracy of the classifier. This paper proposes an efficient approach to building a compact and accurate classifier by using interestingness measures to prune rules. Interestingness measures play a vital role in reducing the size and increasing the accuracy of the classifier by pruning redundant or weak rules. Strong rules are retained and used to build the classifier. The data used in this paper come from the University of California Irvine Machine Learning Repository. The results show that the proposed approach is effective and can produce a highly compact and accurate classifier.
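Pruning weak class association rules with interestingness measures can be sketched as follows. The abstract does not name the measures or thresholds used, so confidence and lift, the default thresholds, and the rule fields below are all assumptions chosen for illustration.

```python
def prune_rules(rules, min_confidence=0.6, min_lift=1.0):
    """Return the names of rules that pass both (assumed) thresholds.

    Each rule is a hypothetical dict with relative supports:
      "support": support of antecedent together with the class label
      "antecedent_support": support of the antecedent alone
      "class_support": support of the class label alone
    """
    kept = []
    for rule in rules:
        confidence = rule["support"] / rule["antecedent_support"]
        lift = confidence / rule["class_support"]
        # Keep only rules that are both reliable and better than chance.
        if confidence >= min_confidence and lift > min_lift:
            kept.append(rule["name"])
    return kept


rules = [
    {"name": "r1", "support": 0.30, "antecedent_support": 0.40, "class_support": 0.50},
    {"name": "r2", "support": 0.10, "antecedent_support": 0.40, "class_support": 0.50},
    {"name": "r3", "support": 0.35, "antecedent_support": 0.50, "class_support": 0.80},
]
strong = prune_rules(rules)
print(strong)
```

Rule `r3` shows why confidence alone is not enough: its confidence is high, but the class is so common that the rule predicts it less often than the base rate.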


2018 ◽  
Vol 62 ◽  
pp. 817-829 ◽  
Author(s):  
Ling Wang ◽  
Jianyao Meng ◽  
Peipei Xu ◽  
Kaixiang Peng

Author(s):  
Rodrigo Salvador Monteiro ◽  
Geraldo Zimbrão ◽  
Holger Schwarz ◽  
Bernhard Mitschang ◽  
Jano Moreira de Souza

Calendar-based pattern mining aims at identifying patterns on specific calendar partitions. Potential calendar partitions are, for example, every Monday, every first working day of each month, or every holiday. Providing flexible mining capabilities for calendar-based partitions is especially challenging in a data stream scenario. The calendar partitions of interest are not known a priori, and at each point in time only a subset of the detailed data is available. The authors show how a data warehouse approach can be applied to this problem. The data warehouse that keeps track of frequent itemsets holding on different partitions of the original stream has low storage requirements. Nevertheless, it allows the derivation of pattern sets that are complete and precise. Furthermore, the authors demonstrate the effectiveness of their approach by a series of experiments.
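The calendar partitions named above can be made concrete with a small sketch that labels a date with the partitions it belongs to. The function names and label strings are assumptions, and "working day" is simplified here to Monday through Friday; a holiday partition would need an external holiday calendar and is left out.

```python
from datetime import date, timedelta


def first_working_day(year, month):
    """First Monday-to-Friday day of the month (holidays are ignored)."""
    day = date(year, month, 1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return day


def calendar_partitions(day):
    """Return the labels of the example partitions that contain `day`."""
    labels = []
    if day.weekday() == 0:
        labels.append("every Monday")
    if day == first_working_day(day.year, day.month):
        labels.append("first working day of month")
    return labels


labels_a = calendar_partitions(date(2023, 10, 2))  # Monday after a Sunday the 1st
labels_b = calendar_partitions(date(2023, 10, 4))  # an ordinary Wednesday
print(labels_a, labels_b)
```

A single day can fall into several partitions at once, which is one reason the partitions of interest cannot simply be fixed in advance on a stream.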

