Mine Rule

Author(s):  
Rosa Meo ◽  
Giuseppe Psaila

Mining of association rules is one of the most adopted techniques for data mining in the most widespread application domains. A great deal of work has been carried out in the last years on the development of efficient algorithms for association rules extraction. Indeed, this problem is a computationally difficult task, known as NP-hard (Calders, 2004), which has been augmented by the fact that normally association rules are being extracted from very large databases. Moreover, in order to increase the relevance and interestingness of obtained results and to reduce the volume of the overall result, constraints on association rules are introduced and must be evaluated (Ng et al.,1998; Srikant et al., 1997). However, in this contribution, we do not focus on the problem of developing efficient algorithms but on the semantic problem behind the extraction of association rules (see Tsur et al. [1998] for an interesting generalization of this problem).

Author(s):  
LAWRENCE MAZLACK

Determining causality has been a tantalizing goal throughout human history. Proper sacrifices to the gods were thought to bring rewards; failure to make suitable observations were thought to lead to disaster. Today, data mining holds the promise of extracting unsuspected information from very large databases. Methods have been developed to build association rules from large data sets. Association rules indicate the strength of association of two or more data attributes. In many ways, the interest in association rules is that they offer the promise (or illusion) of causal, or at least, predictive relationships. However, association rules only calculate a joint probability; they do not express a causal relationship. If causal relationships could be discovered, it would be very useful. Our goal is to explore causality in the data mining context.


2011 ◽  
Vol 145 ◽  
pp. 292-296
Author(s):  
Lee Wen Huang

Data Mining means a process of nontrivial extraction of implicit, previously and potentially useful information from data in databases. Mining closed large itemsets is a further work of mining association rules, which aims to find the set of necessary subsets of large itemsets that could be representative of all large itemsets. In this paper, we design a hybrid approach, considering the character of data, to mine the closed large itemsets efficiently. Two features of market basket analysis are considered – the number of items is large; the number of associated items for each item is small. Combining the cut-point method and the hash concept, the new algorithm can find the closed large itemsets efficiently. The simulation results show that the new algorithm outperforms the FP-CLOSE algorithm in the execution time and the space of storage.


2008 ◽  
pp. 2105-2120
Author(s):  
Kesaraporn Techapichetvanich ◽  
Amitava Datta

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association-rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the mining association rule process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, the users gain insight and deeper understanding of their data sets, as well as control over mining meaningful association rules.


Author(s):  
Kesaraporn Techapichetvanich ◽  
Amitava Datta

Both visualization and data mining have become important tools in discovering hidden relationships in large data sets, and in extracting useful knowledge and information from large databases. Even though many algorithms for mining association rules have been researched extensively in the past decade, they do not incorporate users in the association-rule mining process. Most of these algorithms generate a large number of association rules, some of which are not practically interesting. This chapter presents a new technique that integrates visualization into the mining association rule process. Users can apply their knowledge and be involved in finding interesting association rules through interactive visualization, after obtaining visual feedback as the algorithm generates association rules. In addition, the users gain insight and deeper understanding of their data sets, as well as control over mining meaningful association rules.


2014 ◽  
Vol 998-999 ◽  
pp. 842-845 ◽  
Author(s):  
Jia Mei Guo ◽  
Yin Xiang Pei

Association rules extraction is one of the important goals of data mining and analyzing. Aiming at the problem that information lose caused by crisp partition of numerical attribute , in this article, we put forward a fuzzy association rules mining method based on fuzzy logic. First, we use c-means clustering to generate fuzzy partitions and eliminate redundant data, and then map the original data set into fuzzy interval, in the end, we extract the fuzzy association rules on the fuzzy data set as providing the basis for proper decision-making. Results show that this method can effectively improve the efficiency of data mining and the semantic visualization and credibility of association rules.


2005 ◽  
Vol 1 (3) ◽  
pp. 129-135
Author(s):  
Jun Luo ◽  
Sanguthevar Rajasekaran

Association rules mining is an important data mining problem that has been studied extensively. In this paper, a simple but Fast algorithm for Intersecting attributes lists using hash Tables (FIT) is presented. FIT is designed for efficiently computing all the frequent itemsets in large databases. It deploys an idea similar to Eclat but has a much better computational performance than Eclat due to two reasons: 1) FIT makes fewer total number of comparisons for each intersection operation between two attributes lists, and 2) FIT significantly reduces the total number of intersection operations. Our experimental results demonstrate that the performance of FIT is much better than that of Eclat and Apriori algorithms.


2017 ◽  
Author(s):  
Andysah Putera Utama Siahaan ◽  
Mesran Mesran ◽  
Andre Hasudungan Lubis ◽  
Ali Ikhwan ◽  
Supiyandi

Sales transaction data on a company will continue to increase day by day. Large amounts of data can be problematic for a company if it is not managed properly. Data mining is a field of science that unifies techniques from machine learning, pattern processing, statistics, databases, and visualization to handle the problem of retrieving information from large databases. The relationship sought in data mining can be a relationship between two or more in one dimension. The algorithm included in association rules in data mining is the Frequent Pattern Growth (FP-Growth) algorithm is one of the alternatives that can be used to determine the most frequent itemset in a data set.


Author(s):  
Suma B. ◽  
Shobha G.

<div>Association rule mining is a well-known data mining technique used for extracting hidden correlations between data items in large databases. In the majority of the situations, data mining results contain sensitive information about individuals and publishing such data will violate individual secrecy. The challenge of association rule mining is to preserve the confidentiality of sensitive rules when releasing the database to external parties. The association rule hiding technique conceals the knowledge extracted by the sensitive association rules by modifying the database. In this paper, we introduce a border-based algorithm for hiding sensitive association rules. The main purpose of this approach is to conceal the sensitive rule set while maintaining the utility of the database and association rule mining results at the highest level. The performance of the algorithm in terms of the side effects is demonstrated using experiments conducted on two real datasets. The results show that the information loss is minimized without sacrificing the accuracy. </div>


Sign in / Sign up

Export Citation Format

Share Document