Mining Rare Association Rules by Discovering Quasi-Functional Dependencies

Author(s):  
Giulia Bruno ◽  
Paolo Garza ◽  
Elisa Quintarelli

In the context of anomaly detection, the data mining technique of extracting association rules can be used to identify rare rules that represent infrequent situations. One method to detect rare rules is to first infer the normal behavior of objects in the form of quasi-functional dependencies (i.e., functional dependencies that frequently hold), and then to analyze rare violations of them. Quasi-functional dependencies are usually inferred from the current instance of a database. However, in several applications the database is not static: data are added or deleted continuously. Thus, the anomalies have to be updated because they change over time. In this chapter, we propose an incremental algorithm to efficiently maintain up-to-date rules (i.e., functional and quasi-functional dependencies). The impact of the cardinality of the data set and the number of new tuples on the execution time is evaluated through a set of experiments on synthetic and real databases, whose results are reported here.
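The idea of a dependency that "frequently holds" can be illustrated with a minimal sketch. It assumes a simple degree measure (the fraction of tuples that agree with the majority right-hand value of their left-hand group), which is not necessarily the definition used in the chapter; the function name `quasi_fd_report` and the sample rows are hypothetical, and the sketch is non-incremental.

```python
from collections import Counter, defaultdict


def quasi_fd_report(rows, lhs, rhs):
    """Return (degree, violations) for the candidate dependency lhs -> rhs.

    Assumption: degree = share of tuples whose rhs value equals the
    majority rhs value within their lhs group. Minority tuples are
    reported as candidate rare violations.
    """
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        groups[key][row[rhs]] += 1

    kept = 0
    violations = []
    for key, counts in groups.items():
        majority_value, majority_count = counts.most_common(1)[0]
        kept += majority_count
        for value, n in counts.items():
            if value != majority_value:
                violations.append((key, value, n))

    degree = kept / len(rows) if rows else 1.0
    return degree, violations


# Hypothetical data: city -> country holds for 4 of 5 tuples.
rows = [
    {"city": "Milan", "country": "IT"},
    {"city": "Milan", "country": "IT"},
    {"city": "Milan", "country": "IT"},
    {"city": "Turin", "country": "IT"},
    {"city": "Milan", "country": "FR"},
]
degree, violations = quasi_fd_report(rows, ["city"], "country")
print(degree)      # 0.8
print(violations)  # [(('Milan',), 'FR', 1)]
```

An incremental variant would presumably keep the per-group counters and update them as tuples are inserted or deleted instead of rescanning the table; that is the kind of maintenance the abstract describes, but the sketch above does not implement it.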

2018 ◽  
Vol 59 (6) ◽  
pp. 1103-1111 ◽  
Author(s):  
Joukje C Swinkels ◽  
Marjolein I Broese van Groenou ◽  
Alice de Boer ◽  
Theo G van Tilburg

Abstract
Background and Objectives: The general view is that partner-caregiver burden increases over time but findings are inconsistent. Moreover, the pathways underlying caregiver burden may differ between men and women. This study examines to what degree and why partner-caregiver burden changes over time. It adopts Pearlin’s Caregiver Stress Process Model, as it is expected that higher primary and secondary stressors will increase burden and larger amounts of resources will lower burden. Yet, the impact of stressors and resources may change over time. The wear-and-tear model predicts an increase of burden due to a stronger impact of stressors and lower impact of resources over time. Alternatively, the adaptation model predicts a decrease of burden due to a lower impact of stressors and higher impact of resources over time.
Research Design and Methods: We used 2 observations with a 1-year interval of 279 male and 443 female partner-caregivers, derived from the Netherlands Older Persons and Informal Caregivers Survey Minimum Data Set. We applied multilevel regression analysis, stratified by gender.
Results: Adjusted for all predictors, caregiver burden increased over time for both men and women. For female caregivers, the impact of poor spousal health on burden increased and the impact of fulfillment decreased over time. Among male caregivers, the impact of predictors did not change over time.
Discussion and Implications: The increase of burden over time supports the wear-and-tear model, in particular for women. This study highlights the need for gender-specific interventions that are focused on enabling older partners to be better prepared for long-term partner-care.


With improvements in information processing and memory capacity, vast amounts of data are collected for various data analysis purposes. Data mining techniques are used to extract useful knowledge from these data. However, in the process of extraction the data can be publicly exposed, which leads to breaches of private data. Privacy-preserving data mining is used to protect sensitive information from unwanted or unsanctioned disclosure. In this paper, we analyse the problem of discovering similarity checks for functional dependencies from a given dataset such that applying the (l, d) inference algorithm with generalization can anonymise the microdata without loss in utility. [8] This work presents a functional-dependency-based perturbation approach that hides sensitive information from the user by applying the (l, d) inference model to the dependency attributes based on Information Gain. This approach works on both categorical and numerical attributes. The perturbed data set does not affect the original data set; it maintains the same or very comparable patterns as the original data set. Hence the utility of the application remains high when compared to other data mining techniques. The accuracy of the original and perturbed datasets is compared and analysed using tools and a data mining classification algorithm.


Author(s):  
Anthony Scime ◽  
Karthik Rajasethupathy ◽  
Kulathur S. Rajasethupathy ◽  
Gregg R. Murray

Data mining is a collection of algorithms for finding interesting and unknown patterns or rules in data. However, different algorithms can result in different rules from the same data. The process presented here exploits these differences to find particularly robust, consistent, and noteworthy rules among much larger potential rule sets. More specifically, this research focuses on using association rules and classification mining to select the persistently strong association rules. Persistently strong association rules are association rules that are verifiable by classification mining the same data set. The process for finding persistent strong rules was executed against two data sets obtained from the American National Election Studies. Analysis of the first data set resulted in one persistent strong rule and one persistent rule, while analysis of the second data set resulted in 11 persistent strong rules and 10 persistent rules. The persistent strong rule discovery process suggests these rules are the most robust, consistent, and noteworthy among the much larger potential rule sets.


2014 ◽  
Vol 2014 ◽  
pp. 1-5 ◽  
Author(s):  
Sarawut Saichanma ◽  
Sucha Chulsomlee ◽  
Nonthaya Thangrua ◽  
Pornsuri Pongsuchart ◽  
Duangmanee Sanmun

It is undeniable that laboratory information is important in healthcare in many ways, such as management, planning, and quality improvement. Laboratory diagnoses and laboratory results for each patient are recorded at every treatment. These data are useful for retrospective studies exploring the relationship between laboratory results and diseases, which can increase efficiency in diagnosis and quality in laboratory reporting. Our study uses the J48 algorithm, a data mining technique, to predict abnormality in peripheral blood smears from 1,362 students using a data set of 13 hematological parameters gathered from an automated blood cell counter. We found that the decision tree created by the algorithm can be used as a practical guideline for RBC morphology prediction using 4 hematological parameters (MCV, MCH, Hct, and RBC). The average prediction of RBC morphology has a true positive rate, false positive rate, precision, recall, and accuracy of 0.940, 0.050, 0.945, 0.940, and 0.943, respectively. This new paradigm for managing medical laboratory information will be helpful in organizing, researching, and supporting correlation in multiple disciplines other than medical science, which will eventually lead to an improvement in the quality of test results and more accurate diagnosis.


2014 ◽  
Vol 998-999 ◽  
pp. 842-845 ◽  
Author(s):  
Jia Mei Guo ◽  
Yin Xiang Pei

Association rule extraction is one of the important goals of data mining and analysis. To address the information loss caused by crisp partitioning of numerical attributes, in this article we put forward a fuzzy association rule mining method based on fuzzy logic. First, we use c-means clustering to generate fuzzy partitions and eliminate redundant data; we then map the original data set into fuzzy intervals; finally, we extract fuzzy association rules from the fuzzy data set to provide a basis for proper decision-making. Results show that this method can effectively improve the efficiency of data mining and the semantic visualization and credibility of association rules.
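The mapping from a numerical value to fuzzy partitions can be sketched as follows. This is a minimal illustration that assumes the cluster centers are already known and uses the standard fuzzy c-means membership formula with fuzzifier `m`; the centers, the sample ages, and the function names are hypothetical and are not taken from the article.

```python
def fuzzy_memberships(x, centers, m=2.0):
    """Membership of value x in each fuzzy partition (one per center).

    Standard fuzzy c-means membership:
        u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))
    where d_i is the distance from x to center i.
    """
    distances = [abs(x - c) for c in centers]
    # A value that sits exactly on a center belongs fully to that partition.
    if 0.0 in distances:
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


def fuzzy_support(values, centers, index, m=2.0):
    """Fuzzy support of one partition: mean membership over all values."""
    total = sum(fuzzy_memberships(v, centers, m)[index] for v in values)
    return total / len(values)


# Hypothetical centers for "young", "middle", "old" on an age attribute.
centers = [20.0, 40.0, 60.0]
u = fuzzy_memberships(30.0, centers)
print(u)
print(fuzzy_support([20.0, 30.0, 40.0], centers, index=0))
```

Unlike a crisp split at, say, age 30, a value near a boundary contributes partially to both neighbouring partitions, which is the information loss the abstract aims to avoid.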


2017 ◽  
Vol 20 (3) ◽  
pp. 41-56 ◽  
Author(s):  
Foluso A. Akinsola ◽  
Nicholas M. Odhiambo

This paper surveys the existing literature on the relationship between inflation and economic growth in developed and developing countries, highlighting the theoretical and empirical indications. The study finds that the impact of inflation on economic growth varies from country to country and over time. The study also finds that the results from these studies depend on country‑specific characteristics, the data set used, and the methodology employed. On balance, the study finds overwhelming support in favour of a negative relationship between inflation and growth, especially in developed economies. However, there is still much controversy about the specific threshold level of inflation that is appropriate for growth. Most previous studies on this subject simply assume a unidirectional causal relationship between inflation and economic growth. To our knowledge, this may be the first review of its kind to survey, in detail, the existing research on the relationship between inflation and economic growth in developed and developing countries.


2017 ◽  
Author(s):  
Andysah Putera Utama Siahaan ◽  
Mesran Mesran ◽  
Andre Hasudungan Lubis ◽  
Ali Ikhwan ◽  
Supiyandi

Sales transaction data in a company continue to increase day by day. Large amounts of data can be problematic for a company if they are not managed properly. Data mining is a field of science that unifies techniques from machine learning, pattern processing, statistics, databases, and visualization to handle the problem of retrieving information from large databases. The relationship sought in data mining can be a relationship between two or more items in one dimension. Among the association rule algorithms in data mining, the Frequent Pattern Growth (FP-Growth) algorithm is one of the alternatives that can be used to determine the most frequent itemsets in a data set.
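What "most frequent itemset" means can be shown with a small sketch. Note that this is plain brute-force support counting, not FP-Growth: FP-Growth reaches the same frequent itemsets by building a compact prefix tree rather than enumerating combinations. The function name, the `max_size` cap, and the sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets with support >= min_support.

    Brute-force enumeration (NOT FP-Growth): every combination of up to
    max_size items in each transaction is counted once.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1

    n = len(transactions)
    return {
        itemset: count / n
        for itemset, count in counts.items()
        if count / n >= min_support
    }


# Hypothetical sales transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
result = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

Enumeration of this kind grows quickly with transaction width, which is the practical reason tree-based methods such as FP-Growth are used on large sales databases.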

