An Interestingness Measure and Computation Method of Association Rules Based on Frequent Itemsets Relatedness

2011 ◽  
Vol 71-78 ◽  
pp. 4039-4043
Author(s):  
Xiang Chen ◽  
Xue Feng Zhou ◽  
Yong Zhang

To address the inadequacy of current interestingness measures for association rules, we present a novel method that measures interestingness using the relatedness among items in frequent itemsets. The method first computes the relatedness of each frequent 2-itemset that is a subset of a frequent k-itemset, as a linear combination of Complementarity Intensity (CI), Substitutability Intensity (SI) and Mutual Interaction (MI). The mean relatedness over all frequent 2-itemset subsets is then taken as the relatedness of the frequent k-itemset. Finally, a weighted method for computing association rule interestingness is given, based on the principle that the objective interestingness of an association rule is inversely proportional to the relatedness of its frequent itemset. The method can not only rank rules but also analyze the actual relationships among the items in frequent 2-itemsets, which is conducive to users' selection of rules.
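The aggregation described in this abstract can be sketched roughly as follows. This is only an illustration of the structure (a linear combination per pair, then a mean over pairs): the abstract does not define CI, SI or MI, nor the combination weights, so the equal weights, the function names, and the `pair_scores` dictionary are all assumptions.

```python
from itertools import combinations
from statistics import mean

def pair_relatedness(ci, si, mi, weights=(1/3, 1/3, 1/3)):
    """Linear combination of CI, SI and MI; the weights are placeholders."""
    a, b, c = weights
    return a * ci + b * si + c * mi

def itemset_relatedness(itemset, pair_scores):
    """Mean relatedness over all 2-item subsets of a frequent k-itemset.

    pair_scores maps frozenset({x, y}) to a precomputed pair relatedness.
    Every 2-item subset is assumed to have a score, which holds when the
    k-itemset is frequent, since all of its subsets are then frequent too.
    """
    pairs = [frozenset(p) for p in combinations(sorted(itemset), 2)]
    return mean(pair_scores[p] for p in pairs)
```

Under the stated principle, a rule drawn from an itemset with lower relatedness would be ranked as more interesting; the exact weighting used for that ranking is not given in the abstract and is not reproduced here.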

2021 ◽  
Author(s):  
Pouya Mehrannia ◽  
Behzad Moshiri ◽  
Otman Adam Basir

This paper introduces knowledgebase approximation and fusion using association rule aggregation as a means to accelerate insight induction from high-dimensional and disparate knowledgebases. Two typical observations make approximating knowledgebases of interest: (1) insights can often be derived from a partial set of the samples, and not necessarily from all of them; and (2) the knowledge of interest is rarely contained in one knowledgebase, but is rather distributed among a disparate set of non-identical knowledgebases. Moreover, the insights derivable from knowledgebases tend to be uncertain, even when they are derived from a holistic analysis of the knowledgebase. Thus, optimal knowledgebase approximation may yield computational efficiency without necessarily compromising insight accuracy. This paper presents a novel method to approximate a set of knowledgebases based on association rule aggregation using the disjunctive pooling rule. We show that this method can reduce insight discovery time while maintaining approximation accuracy within a desirable level.


2014 ◽  
Vol 685 ◽  
pp. 575-578
Author(s):  
Guang Jiang Wang ◽  
Shi Guo Jin

Association rule mining is an important data mining method, and finding frequent itemsets is its key step. The process of association rule mining is roughly divided into two steps: the first step is to find all frequent itemsets in the transaction set; the second step is to derive association rules from those frequent itemsets. This paper analyzes the collection and management of node information in wireless sensor networks and presents an application of association rule mining technology to the collection and management of wireless sensor network node data.
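The two steps named in this abstract can be sketched as below. This is a minimal brute-force illustration, not the paper's implementation: the function names and the relative-support and confidence thresholds are assumptions, and each transaction is assumed to be a Python set.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Step 1: return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, k):
            cset = frozenset(candidate)
            support = sum(cset <= t for t in transactions) / n
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            break  # no frequent k-itemset means no frequent (k+1)-itemset
    return result

def rules_from_itemsets(freq, min_confidence):
    """Step 2: split each frequent itemset into antecedent and consequent."""
    rules = []
    for itemset, support in freq.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # confidence(A => B) = support(A u B) / support(A)
                confidence = support / freq[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, support, confidence))
    return rules
```

Enumerating every combination is exponential in the number of items, so this form is only suitable for very small examples; practical miners prune the candidate space.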


2014 ◽  
Vol 536-537 ◽  
pp. 520-523
Author(s):  
Jia Liu ◽  
Zhen Ya Zhang ◽  
Hong Mei Cheng ◽  
Qian Sheng Fang

Usually, non-trivial network visiting behaviors implied in a network visiting log can be treated as frequent itemsets or association rules once the data in the log file are transformed into transactions, and association rule techniques can then be used to mine the frequent itemsets that a user or an application focuses on. To mine non-trivial network visiting behaviors effectively, an attention-based frequent itemset mining method is proposed in this paper. In the proposed method, the properties of the user's focus are described as an attention set, and the early selection model of attention is used as an information filter in the design of the method. Experimental results show that the proposed method is faster than the Apriori algorithm at mining the frequent itemsets on which attention is focused.


2013 ◽  
Vol 2013 ◽  
pp. 1-11 ◽  
Author(s):  
Hai Quoc Le ◽  
Somjit Arch-int ◽  
Ngamnij Arch-int

Association rule hiding plays a vital role in preserving sensitive knowledge when data are shared between enterprises. The aim of association rule hiding is to remove sensitive association rules from the released database while keeping side effects as low as possible. This research proposes an efficient algorithm for hiding a specified set of sensitive association rules based on the intersection lattice of frequent itemsets. We begin by analyzing the theory of the intersection lattice of frequent itemsets and the applicability of this theory to the association rule hiding problem. We then formulate two heuristics in order to (a) specify the victim items based on the characteristics of the intersection lattice of frequent itemsets and (b) identify transactions for data sanitization based on the weight of transactions. Next, we propose a new algorithm for hiding a specific set of sensitive association rules with minimum side effects and low complexity. Finally, experiments were carried out to evaluate the efficiency of the proposed approach. Our results showed that the proposed algorithm, AARHIL, achieved the lowest side effects and CPU time when compared with similar state-of-the-art approaches for hiding a specified set of sensitive association rules.


2004 ◽  
Vol 03 (04) ◽  
pp. 317-329 ◽  
Author(s):  
Imad Rahal ◽  
Dongmei Ren ◽  
William Perrizo

Association rule mining (ARM) is the data-mining process of finding all association rules in datasets that match user-defined measures of interest such as support and confidence. Usually, ARM proceeds by mining all frequent itemsets (a step known to be very computationally intensive), from which rules are then derived in a straightforward manner. In general, the search for frequent itemsets is pruned using the downward closure (or anti-monotonicity) property of support, which states that no itemset can be frequent unless all of its subsets are frequent. A large number of papers have addressed the problem of ARM, but not many of them have focused on scalability over very large datasets (i.e. datasets containing a very large number of transactions). In this paper, we propose a new model for representing data and mining frequent itemsets that is based on P-tree technology for compression and faster logical operations over vertically structured data, and on set enumeration trees for fast itemset enumeration. Experimental results show large improvements for our approach on large datasets when compared with other contemporary approaches in the literature.
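The downward closure property mentioned above is what makes level-wise pruning possible. The sketch below shows that pruning in the classic Apriori style; it is not the P-tree model proposed in the paper, and the function name and the absolute `min_count` threshold are assumptions made for illustration.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Level-wise frequent itemset mining with downward-closure pruning."""
    transactions = [frozenset(t) for t in transactions]

    def count(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions)

    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if count(frozenset([i])) >= min_count}
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = count(s)
        # Join step: unions of frequent k-itemsets that have k+1 items.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step: drop a candidate if any of its k-subsets is not frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k))
        }
        level = {c for c in candidates if count(c) >= min_count}
        k += 1
    return frequent
```

The prune step discards a candidate without scanning the data whenever one of its subsets is already known to be infrequent, which is where the savings come from.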


Author(s):  
Raed Shatnawi ◽  
Qutaibah Althebyan ◽  
Baraq Ghaleb ◽  
Mohammed Al-Maolegi

Academic advising is a time-consuming activity that takes considerable effort in guiding students to improve their performance. Traditional advising systems depend greatly on the effort of the advisor to find the best selection of courses to improve student performance in the next semester. There is a need to know the associations and patterns among course registrations. Finding associations among courses can guide students in selecting the appropriate courses that lead to performance improvement. In this paper, the authors propose to use association rule mining to help both students and advisors in selecting and prioritizing courses. Association rules find dependencies among courses that help students select courses based on their performance in previous courses. The association rule mining is conducted on thousands of student records to find associations between courses that have been registered by students in many previous semesters. The system has successfully generated a list of association rules that guide a particular student in selecting courses. The system was validated on the registration of 100 students, and the precision and recall showed acceptable prediction of courses.


Kybernetes ◽  
2018 ◽  
Vol 47 (3) ◽  
pp. 441-457 ◽  
Author(s):  
Cheng-Hsiung Weng ◽  
Tony Cheng-Kui Huang

Purpose: Customer lifetime value (CLV) scoring is highly effective when applied to marketing databases. Some researchers have extended the traditional association rule problem by associating a weight with each item in a transaction. However, studies of association rule mining have considered the relative benefits or significance of “items” rather than “transactions” belonging to different customers. Because not all customers are financially attractive to firms, it is crucial that their profitability be determined and that transactions be weighted according to CLV. This study aims to discover association rules from the CLV perspective.
Design/methodology/approach: This study extended the traditional association rule problem by allowing the association of CLV weight with a transaction to reflect the interest and intensity of customer values. Furthermore, the authors proposed a new algorithm, frequent itemsets of CLV weight (FICLV), to discover frequent itemsets from CLV-weighted transactions.
Findings: Experimental results from the survey data indicate that the proposed FICLV algorithm can discover valuable frequent itemsets. Moreover, the frequent itemsets identified using the FICLV algorithm outperform those discovered through conventional approaches for predicting customer purchasing itemsets in the coming period.
Originality/value: This study is the first to introduce the optimum approach for discovering frequent itemsets from transactions through considering CLV.
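One simple way to picture transaction-level weighting is a weighted support, in which each transaction contributes its weight rather than a count of one. The abstract does not give the FICLV formula, so the definition below (covered weight divided by total weight) and the function name are assumptions, not the authors' algorithm.

```python
def weighted_support(itemset, transactions, weights):
    """Share of total transaction weight carried by transactions containing itemset.

    weights[i] is a hypothetical CLV-style weight for transactions[i].
    """
    itemset = frozenset(itemset)
    total = sum(weights)
    if total == 0:
        return 0.0
    covered = sum(w for t, w in zip(transactions, weights) if itemset <= set(t))
    return covered / total
```

With equal weights this reduces to ordinary relative support, so the weighting only changes the result when customers differ in value.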


2021 ◽  
Vol 48 (4) ◽  
Author(s):  
Hafiz I. Ahmad ◽  
Alex T. H. Sim ◽  
Roliana Ibrahim ◽  
Mohammad Abrar ◽  
...  

Association rule mining (ARM) discovers frequent itemsets that reveal interesting associative and correlative relationships within data. This gives new insights of great value, both commercial and academic. Traditional ARM techniques discover interesting association rules based on a predefined minimum support threshold. However, there is no known standard for choosing the minimum support, and an inappropriate value may result in missing important rules. In addition, most of the rules discovered by these traditional ARM techniques refer to already known knowledge. To address these limitations of the minimum support threshold, this study proposes an algorithm that mines interesting association rules without minimum support, using predicate logic and a property of a proposed interestingness measure (the g measure). The algorithm scans the database and uses the g measure's property to search for interesting combinations. The selected combinations are mapped to pseudo-implications, and inference rules of logic are applied to the pseudo-implications to produce and validate the predicate rules. Experimental results show that the proposed technique performs better than state-of-the-art classification techniques, and that reliable predicate rules are discovered based on the reliability differences between the presence and absence of the rule's consequence.


Author(s):  
Vladimír Bartík

Association rules are one of the most frequently used types of knowledge discovered from databases. The problem of discovering association rules was first introduced in (Agrawal, Imielinski & Swami, 1993). Here, association rules are discovered from transactional databases, that is, sets of transactions in which each transaction is a set of items. An association rule is an expression of the form A ⇒ B, where A and B are sets of items. A typical application is market basket analysis. Here, the transaction is the content of a basket and the items are products. For example, if a rule milk ∧ juice ⇒ coffee is discovered, it is interpreted as: “If the customer buys milk and juice, s/he is likely to buy coffee too.” These rules are called single-dimensional Boolean association rules (Han & Kamber, 2001). The potential usefulness of a rule is expressed by means of two metrics: support and confidence. Many algorithms have been developed for mining association rules in transactional databases. The best known is the Apriori algorithm (Agrawal & Srikant, 1994), which has many modifications, e.g. (Kotásek & Zendulka, 2000). These algorithms usually consist of two phases: discovery of frequent itemsets and generation of association rules from them. A frequent itemset is a set of items having support greater than a threshold called minimum support. Association rule generation is controlled by another threshold referred to as minimum confidence. Association rules discovered can have a more general form and their mining is more complex than mining rules from transactional databases. In relational databases, association rules are ordinarily discovered from the data of one table (which can be the result of joining several other tables). The table can have many columns (attributes) defined on domains of different types. It is useful to distinguish two types of attributes. A categorical attribute (also called nominal) has a finite number of possible values with no ordering among the values (e.g. the country of a customer). A quantitative attribute is a numeric attribute whose domain is infinite or very large. In addition, it has an implicit ordering among values (e.g. the age and salary of a customer). An association rule (Age = [20…30]) ∧ (Country = “Czech Rep.”) ⇒ (Salary = [1000$...2000$]) says that if the customer is between 20 and 30 and is from the Czech Republic, s/he is likely to earn between 1000$ and 2000$ per month. Such rules with two or more predicates (items) containing different attributes are also called multidimensional association rules. If some attributes of the rules are quantitative, the rules are called quantitative association rules (Han & Kamber, 2001). If a table contains only categorical attributes, it is possible to use modified algorithms for mining association rules in transactional databases. The crucial problem is processing quantitative attributes: because their domains are very large, these algorithms cannot be used directly, and quantitative attributes must be discretized. This article deals with mining multidimensional association rules from relational databases, with the main focus on distance-based methods. One of them is a novel method developed by the authors.
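The discretization step can be illustrated with the simplest scheme, equal-width binning, which turns a numeric value into an interval item that a transactional algorithm can handle. This is only a baseline sketch: it is not the distance-based method the article focuses on, and the function names and item label format are assumptions.

```python
def equal_width_bins(values, n_bins):
    """Return (low, high) intervals that split the observed value range evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # Use hi itself as the final edge so the maximum always falls in the last bin.
    edges = [lo + i * width for i in range(n_bins)] + [hi]
    return list(zip(edges[:-1], edges[1:]))

def to_interval_item(attribute, value, bins):
    """Map a numeric value to an item label such as 'Age=[20, 30)'."""
    for i, (low, high) in enumerate(bins):
        last = i == len(bins) - 1
        # Intervals are half-open, except the last one, which includes its upper edge.
        if low <= value < high or (last and value == high):
            closing = "]" if last else ")"
            return f"{attribute}=[{low:g}, {high:g}{closing}"
    raise ValueError(f"{value} is outside the binned range")
```

Equal-width bins ignore how the values are actually distributed, which is one reason more careful partitioning methods are studied for quantitative attributes.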

