Filtering Association Rules by Their Semantics and Structures

Rare Association Rule Mining and Knowledge Discovery ◽

10.4018/978-1-60566-754-6.ch014 ◽

2010 ◽

pp. 216-230

Author(s):

Rangsipan Marukatat

Keyword(s):

Association Rules ◽

Association Rule Mining ◽

Traffic Accidents ◽

Complete Information ◽

Simple Rule ◽

Rule Generation ◽

Rule Mining ◽

Data Set ◽

Domain Description

Association rule mining produces a large number of rules, many of which are redundant. When a data set contains infrequent items, the minimum support threshold must be set very low; otherwise, these items will not be discovered. The downside is even more redundancy. To deal with this dilemma, some researchers have proposed more efficient, and perhaps more complicated, rule generation methods. Others have suggested using simple rule generation methods and focusing instead on post-pruning of the rules. This chapter follows the latter approach. The classic Apriori algorithm is employed for rule generation. The goal is to gain as much insight as possible about the domain. Therefore, the discovered rules are filtered by their semantics and structures. An individual rule is classified by its own semantics, that is, by how clear its domain description is. It can be labelled as one of the following: strongly meaningless, weakly meaningless, partially meaningful, or meaningful. In addition, multiple rules are compared. Rules with repetitive patterns are removed, while those conveying the most complete information are retained. The techniques are applied to a real case study, an analysis of traffic accidents in Nakorn Pathom, Thailand.
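The support dilemma described in this abstract can be illustrated with a small sketch. The code below is not the chapter's implementation: it is a brute-force, Apriori-style enumeration over a hypothetical toy data set, and the names `support`, `frequent_itemsets`, and `rules` are assumptions introduced here for illustration only.

```python
from itertools import combinations

# Hypothetical toy transactions (not taken from the chapter's case study).
transactions = [
    {"rain", "night", "motorcycle"},
    {"rain", "night", "car"},
    {"rain", "motorcycle"},
    {"night", "motorcycle"},
    {"rain", "night", "motorcycle", "curve"},
    {"day", "car"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Enumerate frequent itemsets level by level (toy-sized data only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            s = support(frozenset(combo), transactions)
            if s >= min_support:
                result[frozenset(combo)] = s
                found = True
        if not found:
            break  # no frequent itemset of this size, so none of a larger size
    return result

def rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    freq = frequent_itemsets(transactions, min_support)
    out = []
    for itemset, s in freq.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                conf = s / freq[a]  # every subset of a frequent itemset is frequent
                if conf >= min_confidence:
                    out.append((a, itemset - a, s, conf))
    return out

high = rules(transactions, min_support=0.5, min_confidence=0.6)
low = rules(transactions, min_support=0.15, min_confidence=0.6)
print(len(high), len(low))
```

On this toy data, lowering the minimum support lets the rare items `curve` and `day` appear in rules, but it also returns more rules overall, which is the redundancy that post-pruning is meant to address.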

Reduction of Redundant Rules in Association Rule Mining-Based Bug Assignment

International Journal of Reliability Quality and Safety Engineering ◽

10.1142/s0218539317400058 ◽

2017 ◽

Vol 24 (06) ◽

pp. 1740005 ◽

Cited By ~ 3

Author(s):

Meera Sharma ◽

Abhishek Tandon ◽

Madhu Kumari ◽

V. B. Singh

Keyword(s):

Operating System ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Clustering Algorithm ◽

Large Data ◽

Software Project ◽

Rule Mining ◽

Data Set ◽

Bug Reports

Bug triaging is the process of deciding what to do with newly arriving bug reports. In this paper, we mine association rules to predict the assignee of a newly reported bug using different bug attributes, namely severity, priority, component, and operating system. To deal with the problem of large data sets, we divide the large data set into subsets using the k-means clustering algorithm. We use an Apriori algorithm in MATLAB to generate association rules and extract the rules for the top 5 assignees in each cluster. The proposed method has been empirically validated on 14,696 bug reports from Mozilla open source software projects, namely Seamonkey, Firefox, and Bugzilla. In our approach, we observe that with these attributes (severity, priority, component, and operating system) as antecedents, essential rules outnumber redundant rules, whereas in [M. Sharma and V. B. Singh, Clustering-based association rule mining for bug assignee prediction, Int. J. Business Intell. Data Mining 11(2) (2017) 130–150] essential rules are fewer than redundant rules in every cluster. The proposed method provides an improvement over existing techniques for the bug assignment problem.
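As a rough illustration of rules whose antecedents are bug attributes and whose consequent is an assignee, the sketch below counts attribute combinations directly. It is not the paper's MATLAB code, it omits the clustering step, and the bug records, assignee names, and the function `assignee_rules` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical bug reports; values and assignee names are illustrative only.
bugs = [
    {"severity": "major", "priority": "P1", "component": "UI", "os": "Windows", "assignee": "alice"},
    {"severity": "major", "priority": "P1", "component": "UI", "os": "Linux", "assignee": "alice"},
    {"severity": "minor", "priority": "P2", "component": "Network", "os": "Linux", "assignee": "bob"},
    {"severity": "minor", "priority": "P1", "component": "Network", "os": "Linux", "assignee": "bob"},
]
ATTRS = ("severity", "priority", "component", "os")

def assignee_rules(bugs, min_support, min_confidence):
    """Return rules {attr: value, ...} -> assignee that meet both thresholds."""
    n = len(bugs)
    antecedent_counts = Counter()
    rule_counts = Counter()
    for bug in bugs:
        pairs = [(a, bug[a]) for a in ATTRS]
        # Count every non-empty combination of attribute values as an antecedent.
        for k in range(1, len(pairs) + 1):
            for ante in combinations(pairs, k):
                antecedent_counts[ante] += 1
                rule_counts[(ante, bug["assignee"])] += 1
    found = []
    for (ante, who), count in rule_counts.items():
        sup = count / n
        conf = count / antecedent_counts[ante]
        if sup >= min_support and conf >= min_confidence:
            found.append((dict(ante), who, sup, conf))
    return found

found = assignee_rules(bugs, min_support=0.5, min_confidence=0.9)
for ante, who, sup, conf in found:
    print(ante, "->", who, f"support={sup:.2f} confidence={conf:.2f}")
```

Restricting the consequent to the assignee keeps the rule set focused on the prediction target; how the paper separates essential rules from redundant ones is not specified in the abstract and is not modelled here.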

A Kernel Density Estimation Based Interestingness Measure for Association Rule Mining

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.20-23.389 ◽

2010 ◽

Vol 20-23 ◽

pp. 389-394

Author(s):

Zhi Feng Hao ◽

Rui Chu Cai ◽

Tang Wu ◽

Yi Yuan Zhou

Keyword(s):

Density Estimation ◽

Association Rules ◽

Kernel Density Estimation ◽

Association Rule ◽

Association Rule Mining ◽

Kernel Density ◽

Rule Mining ◽

Data Set ◽

Interestingness Measures ◽

Interestingness Measure

Association rules provide a concise statement of potentially useful information and have been widely used in real applications. However, the usefulness of association rules depends heavily on the interestingness measure used to select interesting rules from millions of candidates. In this study, a probability analysis of association rules is conducted, and a discrete kernel density estimation based interestingness measure is proposed accordingly. The newly proposed interestingness measure makes the most of the information contained in the data set and obtains a much lower false discovery rate than existing interestingness measures. Experimental results show the effectiveness of the proposed interestingness measure.

A Novel Framework to Use Association Rule Mining for classification of traffic accident severity

Ingeniería solidaria ◽

10.16925/in.v13i21.1726 ◽

2017 ◽

Vol 13 (21) ◽

pp. 37-44 ◽

Cited By ~ 2

Author(s):

Meenu Gupta ◽

Vijender Kumar Solanki ◽

Vijay Kumar Singh

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Traffic Accident ◽

Traffic Accidents ◽

Prevention Measures ◽

Rule Mining ◽

Data Set ◽

Accident Severity ◽

Institute Of Technology

Introduction: Traffic accidents are an undesirable burden on society. Every year, around one million deaths and more than ten million injuries are reported due to traffic accidents. Hence, traffic accident prevention measures must be taken to reduce the accident rate. Different countries have different geographical and environmental conditions, and hence the accident factors differ from country to country. Traffic accident data analysis is very useful in revealing the factors that affect accidents in different countries. This article was written in 2016 at the Institute of Technology & Science, Mohan Nagar, Ghaziabad, UP, India. Methodology: We propose a framework that uses association rule mining (ARM) for the severity classification of traffic accident data obtained from police records in Muzaffarnagar district, Uttar Pradesh, India. Results: The results reveal some hidden factors that help explain road accidents in this region. Conclusions: The framework enables us to find three clusters in the data set. Each cluster represents a type of accident severity, i.e., fatal, major injury, and minor/no injury. The association rules exposed different factors that are associated with road accidents in each category. The extracted information can be used to adopt preventive measures that reduce accident severity in Muzaffarnagar district.

Profile-Based Assessment of Diseases Affective Factors Using Fuzzy Association Rule Mining Approach: A Case Study in Heart Diseases

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2021.103695 ◽

2021 ◽

pp. 103695

Author(s):

Ali Yavari ◽

Amir Rajabzadeh ◽

Fardin Abdali-Mohammadi

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Heart Diseases ◽

Rule Mining ◽

Affective Factors ◽

Fuzzy Association Rule ◽

Fuzzy Association Rule Mining

Privacy Preserving Association Rule Mining on Distributed Healthcare Data: COVID-19 and Breast Cancer Case Study

SN Computer Science ◽

10.1007/s42979-021-00801-7 ◽

2021 ◽

Vol 2 (6) ◽

Author(s):

Nikunj Domadiya ◽

Udai Pratap Rao

Keyword(s):

Breast Cancer ◽

Association Rule ◽

Association Rule Mining ◽

Breast Cancer Case ◽

Privacy Preserving ◽

Cancer Case ◽

Rule Mining ◽

Healthcare Data

Binary Particle Swarm Optimization-Based Association Rule Mining for Discovering Relationships between Machine Capabilities and Product Features

Mathematical Problems in Engineering ◽

10.1155/2018/2456010 ◽

2018 ◽

Vol 2018 ◽

pp. 1-16 ◽

Cited By ~ 2

Author(s):

Zhicong Kou ◽

Lifeng Xi

Keyword(s):

Particle Swarm Optimization ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Particle Swarm ◽

Performance Comparison ◽

Binary Particle Swarm Optimization ◽

Rule Mining ◽

Swarm Optimization ◽

Product Features

An effective data mining method to automatically extract association rules between manufacturing capabilities and product features from the available historical data is essential for an efficient and cost-effective product development and production. This paper proposes a new binary particle swarm optimization- (BPSO-) based association rule mining (BPSO-ARM) method for discovering the hidden relationships between machine capabilities and product features. In particular, BPSO-ARM does not need to predefine thresholds of minimum support and confidence, which improves its applicability in real-world industrial cases. Moreover, a novel overlapping measure indication is further proposed to eliminate those lower quality rules to further improve the applicability of BPSO-ARM. The effectiveness of BPSO-ARM is demonstrated on a benchmark case and an industrial case about the automotive part manufacturing. The performance comparison indicates that BPSO-ARM outperforms other regular methods (e.g., Apriori) for ARM. The experimental results indicate that BPSO-ARM is capable of discovering important association rules between machine capabilities and product features. This will help support planners and engineers for the new product design and manufacturing.

Predicting Anxiety in Routine Palliative Care Using Bayesian-Inspired Association Rule Mining

Frontiers in Digital Health ◽

10.3389/fdgth.2021.724049 ◽

2021 ◽

Vol 3 ◽

Author(s):

Oliver Haas ◽

Luis Ignacio Lopera Gonzalez ◽

Sonja Hofmann ◽

Christoph Ostgathe ◽

Andreas Maier ◽

...

Keyword(s):

Palliative Care ◽

Association Rule ◽

Association Rule Mining ◽

Predictive Accuracy ◽

Characteristic Curve ◽

Rule Mining ◽

Data Set ◽

Routinely Collected Data ◽

Previous State ◽

Insight Into

We propose a novel knowledge extraction method based on Bayesian-inspired association rule mining to classify anxiety in heterogeneous, routinely collected data from 9,924 palliative patients. The method extracts association rules mined using lift and local support as selection criteria. The extracted rules are used to assess the maximum evidence supporting and rejecting anxiety for each patient in the test set. We evaluated the predictive accuracy by calculating the area under the receiver operating characteristic curve (AUC). The evaluation produced an AUC of 0.89 and a set of 55 atomic rules with one item in the premise and the conclusion, respectively. The selected rules include variables like pain, nausea, and various medications. Our method outperforms the previous state of the art (AUC = 0.72). We analyzed the relevance and novelty of the mined rules. Palliative experts were asked about the correlation between variables in the data set and anxiety. By comparing expert answers with the retrieved rules, we grouped rules into expected and unexpected ones and found several rules for which experts' opinions and the data-backed rules differ, most notably with the patients' sex. The proposed method offers a novel way to predict anxiety in palliative settings using routinely collected data with an explainable and effective model based on Bayesian-inspired association rule mining. The extracted rules give further insight into potential knowledge gaps in the palliative care field.
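For readers unfamiliar with lift, one of the two selection criteria named in this abstract, the following sketch computes it for atomic rules (one item in the premise, one in the conclusion). The patient records are hypothetical, the function `lift` is an assumption introduced here, and the paper's "local support" criterion is not defined in the abstract, so it is left out.

```python
# Hypothetical records: each set lists the items observed for one patient.
records = [
    {"pain", "anxiety"},
    {"pain", "anxiety"},
    {"pain"},
    {"nausea"},
]

def lift(records, premise, conclusion):
    """Lift of the atomic rule premise -> conclusion.

    lift = P(premise and conclusion) / (P(premise) * P(conclusion)).
    Values above 1 mean the two items co-occur more often than independence predicts.
    Returns None when either item never occurs.
    """
    n = len(records)
    n_a = sum(premise in r for r in records)
    n_b = sum(conclusion in r for r in records)
    n_ab = sum(premise in r and conclusion in r for r in records)
    if n_a == 0 or n_b == 0:
        return None
    return (n_ab * n) / (n_a * n_b)

print(lift(records, "pain", "anxiety"))
print(lift(records, "nausea", "anxiety"))
```

In this toy data, `pain` and `anxiety` co-occur more often than chance would suggest, while `nausea` and `anxiety` never co-occur, so the second lift is zero.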

Association Rule Mining for the Talents Introduction Strategy: A Case Study of Zhejiang University of Finance & Economics

American Journal of Applied Mathematics ◽

10.11648/j.ajam.20180602.15 ◽

2018 ◽

Vol 6 (2) ◽

pp. 55 ◽

Cited By ~ 1

Author(s):

Wang Qin

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Rule Mining

Semi-Automatic Ontology Construction by Exploiting Functional Dependencies and Association Rules

Semantic Web ◽

10.4018/978-1-4666-3610-1.ch004 ◽

2013 ◽

pp. 76-96

Author(s):

Luca Cagliero ◽

Tania Cerquitelli ◽

Paolo Garza

Keyword(s):

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Description Logic ◽

Functional Dependency ◽

Functional Dependencies ◽

Rule Mining ◽

Domain Experts ◽

Data Schema ◽

Input Dataset

This paper presents a novel semi-automatic approach to construct conceptual ontologies over structured data by exploiting both the schema and content of the input dataset. It effectively combines two well-founded database and data mining techniques, i.e., functional dependency discovery and association rule mining, to support domain experts in the construction of meaningful ontologies, tailored to the analyzed data, by using Description Logic (DL). To this aim, functional dependencies are first discovered to highlight valuable conceptual relationships among attributes of the data schema (i.e., among concepts). The set of discovered correlations effectively supports analysts in the assertion of the Tbox ontological statements (i.e., the statements involving shared data conceptualizations and their relationships). Then, the analyst-validated dependencies are exploited to drive the association rule mining process. Association rules represent relevant and hidden correlations among data content, and they are used to provide valuable knowledge at the instance level. Pushing functional dependency constraints into the rule mining process allows analysts to look into and exploit only the most significant data item recurrences in the assertion of the Abox ontological statements (i.e., the statements involving concept instances and their relationships).
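The functional dependency step can be pictured with a minimal check of whether a dependency X -> Y holds in a table, meaning that every combination of X values maps to exactly one combination of Y values. This is only a sketch of the general notion, not the paper's discovery algorithm; the rows, column names, and the function `fd_holds` are hypothetical.

```python
# Hypothetical table rows; column names are illustrative only.
rows = [
    {"city": "Turin", "country": "Italy", "person": "A"},
    {"city": "Turin", "country": "Italy", "person": "B"},
    {"city": "Milan", "country": "Italy", "person": "C"},
    {"city": "Lyon", "country": "France", "person": "D"},
]

def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    lhs and rhs are sequences of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # The first value seen for a key is remembered; a different later value breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True

print(fd_holds(rows, ["city"], ["country"]))   # each city has one country
print(fd_holds(rows, ["country"], ["city"]))   # Italy maps to two cities
```

A dependency such as city -> country suggests a schema-level relationship between the two concepts, which is the kind of evidence the abstract describes analysts validating before rule mining.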

Constraint-Based Association Rule Mining

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch049 ◽

2011 ◽

pp. 307-312 ◽

Cited By ~ 10

Author(s):

Carson Kai-Sang Leung

Keyword(s):

Data Mining ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Computational Cost ◽

Knowledge Discovery In Databases ◽

Rule Mining ◽

The Subject ◽

User Focus ◽

High Computational Cost

The problem of association rule mining was introduced in 1993 (Agrawal et al., 1993). Since then, it has been the subject of numerous studies. Most of these studies focused on either performance issues or functionality issues. The former considered how to compute association rules efficiently, whereas the latter considered what kinds of rules to compute. Examples of the former include the Apriori-based mining framework (Agrawal & Srikant, 1994), its performance enhancements (Park et al., 1997; Leung et al., 2002), and the tree-based mining framework (Han et al., 2000); examples of the latter include extensions of the initial notion of association rules to other rules such as dependence rules (Silverstein et al., 1998) and ratio rules (Korn et al., 1998). In general, most of these studies basically considered the data mining exercise in isolation. They did not explore how data mining can interact with the human user, which is a key component in the broader picture of knowledge discovery in databases. Hence, they provided little or no support for user focus. Consequently, the user usually needs to wait for a long period of time to get numerous association rules, out of which only a small fraction may be interesting to the user. In other words, the user often incurs a high computational cost that is disproportionate to what he wants to get. This calls for constraint-based association rule mining.
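One common way to give the user focus is to let a user-specified constraint prune the search itself. The sketch below assumes an anti-monotone constraint (if an itemset violates it, so does every superset), here a hypothetical price budget. The data, the prices, and the names `within_budget` and `mine` are illustrative assumptions and are not taken from the encyclopedia entry.

```python
# Hypothetical item prices and transactions.
prices = {"bread": 2, "milk": 3, "caviar": 50}
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "caviar"},
    {"bread", "caviar"},
]

def within_budget(itemset, budget=10):
    """Anti-monotone constraint: the total price must not exceed the budget."""
    return sum(prices[i] for i in itemset) <= budget

def mine(transactions, min_count, constraint):
    """Level-wise search that keeps itemsets that are frequent and satisfy the constraint.

    Because both frequency and the constraint are anti-monotone, an itemset that
    fails either test is dropped and never extended to larger candidates.
    """
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items]
    kept = []
    while level:
        survivors = [
            s for s in level
            if constraint(s) and sum(s <= t for t in transactions) >= min_count
        ]
        kept.extend(survivors)
        # Join surviving itemsets that differ by one item to form the next level.
        level = list({a | b for a in survivors for b in survivors
                      if len(a | b) == len(a) + 1})
    return kept

kept = mine(transactions, min_count=2, constraint=within_budget)
print(sorted(sorted(s) for s in kept))
```

Here `caviar` is frequent but violates the budget, so it is discarded at the first level and no itemset containing it is ever counted, which reduces both the computation and the number of uninteresting results.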