Improving the Quality of Association Rule Mining by Means of Rough Sets

Author(s):  
Daniel Delic ◽  
Hans-J. Lenz ◽  
Mattis Neiling
2017 ◽  
Vol 7 (1.1) ◽  
pp. 19
Author(s):  
T. Nusrat Jabeen ◽  
M. Chidambaram ◽  
G. Suseendran

Security and privacy have emerged as serious concerns, as business professionals do not wish to share their classified transaction data. Earlier work addressed the secure sharing of transaction databases. The performance of those methods is enhanced further by introducing the Security and Privacy aware Large Database Association Rule Mining (SPLD-ARM) framework. Improved Secured Association Rule Mining (ISARM) is now introduced for the horizontal and vertical segmentation of huge databases. Then k-anonymization methods, namely suppression-based and generalization-based anonymization, are employed to guarantee privacy. Finally, the Diffie-Hellman encryption algorithm is applied to safeguard sensitive information and to let the storage service provider work on encrypted information. The Diffie-Hellman algorithm improves the overall quality of the system by generating secure keys, so that the actual data is protected more efficiently. The proposed technique is implemented in a Java simulation environment, and the results indicate that it achieves both privacy and security.
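The abstract does not say how the Diffie-Hellman keys are generated or used, so the following is only a minimal sketch of the textbook key agreement. The modulus, generator, private values, and the `derive_key` helper are illustrative assumptions, not details taken from the paper, and the tiny parameters are not secure.

```python
import hashlib

# Toy public parameters (assumption): a real deployment would use a
# standardized group of at least 2048 bits, not a 5-bit prime.
P = 23  # public prime modulus
G = 5   # public generator (5 is a primitive root modulo 23)


def public_key(private: int) -> int:
    """Return G^private mod P, the value a party publishes."""
    return pow(G, private, P)


def shared_secret(their_public: int, my_private: int) -> int:
    """Combine the other party's public value with our own private value."""
    return pow(their_public, my_private, P)


def derive_key(secret: int) -> bytes:
    """Hypothetical helper: hash the shared secret into a 32-byte key."""
    return hashlib.sha256(str(secret).encode()).digest()


# Private values are hard-coded here only to keep the example reproducible.
alice_private, bob_private = 6, 15
alice_public = public_key(alice_private)
bob_public = public_key(bob_private)

# Each side computes the same secret without ever transmitting it.
alice_secret = shared_secret(bob_public, alice_private)
bob_secret = shared_secret(alice_public, bob_private)
session_key = derive_key(alice_secret)
```

Both parties end up with the same secret, which could then key a symmetric cipher that protects the transaction data; which cipher the paper pairs with the exchange is not stated.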


Electronics ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 100
Author(s):  
Daniele Apiletti ◽  
Eliana Pastor

Coffee is among the most popular beverages in many cities all over the world, being both at the core of the busiest shops and a long-standing tradition of recreational and social value for many people. Among the many coffee variants, espresso attracts the interest of different stakeholders: from citizens consuming espresso around the city, to local business activities, coffee-machine vendors and international coffee industries. The quality of espresso is one of the most discussed and investigated issues. So far, it has been addressed by means of human experts, electronic noses, and chemical approaches. The current work, instead, proposes a data-driven approach exploiting association rule mining. We analyze a real-world dataset of espresso brewing by professional coffee-making machines, and extract all correlations among external quality-influencing variables and actual metrics determining the quality of the espresso. Thanks to the application of association rule mining, a powerful, data-driven, exhaustive, and explainable approach, results are expressed in the form of human-readable rules combining the variables of interest, such as the grinder settings, the extraction time, and the dose amount. Novel insights from real-world coffee extractions collected in the field are presented, together with a data-driven approach able to uncover insights into the espresso quality and its impact on both the life of consumers and the choices of coffee-making industries.
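To make the notion of a human-readable rule concrete, here is a small sketch of the standard support, confidence, and lift measures for one rule. The records and item labels (`grind=fine`, `time=long`, `quality=low`) are invented for illustration and are not taken from the dataset the abstract describes.

```python
from typing import FrozenSet, List

# Hypothetical, already-discretized brewing records (assumption).
records: List[FrozenSet[str]] = [
    frozenset({"grind=fine", "time=long", "quality=low"}),
    frozenset({"grind=fine", "time=long", "quality=low"}),
    frozenset({"grind=fine", "time=normal", "quality=high"}),
    frozenset({"grind=medium", "time=normal", "quality=high"}),
    frozenset({"grind=medium", "time=normal", "quality=high"}),
]


def support(itemset: FrozenSet[str], data: List[FrozenSet[str]]) -> float:
    """Fraction of records that contain every item of the itemset."""
    return sum(1 for row in data if itemset <= row) / len(data)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               data: List[FrozenSet[str]]) -> float:
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent, data) / support(antecedent, data)


def lift(antecedent: FrozenSet[str], consequent: FrozenSet[str],
         data: List[FrozenSet[str]]) -> float:
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(antecedent, consequent, data) / support(consequent, data)


# Rule: {grind=fine, time=long} -> {quality=low}
antecedent = frozenset({"grind=fine", "time=long"})
consequent = frozenset({"quality=low"})
rule_support = support(antecedent | consequent, records)
rule_confidence = confidence(antecedent, consequent, records)
rule_lift = lift(antecedent, consequent, records)
```

On this toy data the rule holds in two of five records and every record matching the antecedent also matches the consequent, so a lift above 1 signals that the antecedent makes the low-quality outcome more likely than its base rate.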


Author(s):  
YUE XU ◽  
YUEFENG LI

Association rule mining has many achievements in the area of knowledge discovery. However, the quality of the extracted association rules has not drawn adequate attention from researchers in the data mining community. One major concern with the quality of association rule mining is the size of the extracted rule set. Very often, tens of thousands of association rules are extracted, many of which are redundant and therefore useless. In this paper, we first analyze the redundancy problem in association rules and then propose a reliable exact association rule basis from which more concise nonredundant rules can be extracted. We prove that the redundancy eliminated using the proposed reliable association rule basis does not reduce the belief in the extracted rules. Moreover, this paper proposes a level-wise approach for efficiently extracting closed itemsets and minimal generators, a key issue in closure-based association rule mining.
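The terms closed itemset and minimal generator can be illustrated on a tiny transaction database. The sketch below uses brute-force enumeration rather than the level-wise algorithm the paper proposes, and the toy transactions are an assumption made for illustration. It pairs each minimal generator with its closure to obtain exact (confidence 1) rules, which is the general idea behind closure-based rule bases and not necessarily the paper's reliable basis.

```python
from itertools import combinations

# Toy transaction database (assumption); each character is one item.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("c")]
items = sorted(set().union(*transactions))


def cover(itemset):
    """Transactions that contain every item of the itemset."""
    return [t for t in transactions if itemset <= t]


def closure(itemset):
    """Largest itemset shared by all transactions that contain the itemset."""
    rows = cover(itemset)
    return frozenset.intersection(*rows) if rows else frozenset(items)


def is_minimal_generator(itemset):
    """True if removing any single item strictly increases the support count."""
    count = len(cover(itemset))
    return all(len(cover(itemset - {i})) > count for i in itemset)


# Brute-force enumeration of every non-empty itemset with support >= 1.
candidates = [
    frozenset(combo)
    for size in range(1, len(items) + 1)
    for combo in combinations(items, size)
    if cover(frozenset(combo))
]

closed = {s for s in candidates if closure(s) == s}
generators = {s for s in candidates if is_minimal_generator(s)}

# Exact rules: generator -> (closure minus generator), confidence 1 by construction.
exact_rules = sorted(
    ("".join(sorted(g)), "".join(sorted(closure(g) - g)))
    for g in generators
    if closure(g) != g
)
# exact_rules == [("a", "b"), ("ac", "b"), ("b", "a"), ("bc", "a")]
```

Checking only the immediate subsets in `is_minimal_generator` is sufficient because support never increases as items are added, so any proper subset with equal support implies an immediate subset with equal support.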


2017 ◽  
Vol 26 (1) ◽  
pp. 139-152
Author(s):  
◽  
M. Umme Salma

Recent advancements in science and technology and advances in the medical field have paved the way for the accumulation of huge amounts of medical data in digital repositories, where they are stored for future endeavors. Mining medical data is a most challenging task, as the data are subject to many social concerns and ethical issues. Moreover, medical data are harder to interpret, as they contain many missing and misleading values and may sometimes be faulty. Thus, pre-processing tasks in medical data mining are of great importance, and the main focus is on feature selection, because the quality of the input determines the quality of the resultant data mining process. This paper provides insight into developing a feature selection process in which a data set subjected to constraint-governed association rule mining and interestingness measures yields a small feature subset capable of producing better classification results. In the experimental study, the feature subset was reduced to more than 50% by applying syntax-governed constraints and dimensionality-governed constraints, which produced a high-quality result. This approach yielded a classification accuracy of about 98% on the Breast Cancer Surveillance Consortium (BCSC) data set.


Author(s):  
Yun Sing Koh ◽  
Russel Pears

Rare association rule mining has received a great deal of attention in the past few years. In this chapter, the authors propose a multi-methodological approach to the problem of rare association rule mining that integrates three different strands of research in this area. Firstly, the authors make use of statistical techniques such as the Fisher test to determine whether itemsets co-occur by chance or not. Secondly, they use clustering as a pre-processing technique to improve the quality of the rare rules generated. Their third strategy is to weigh itemsets to ensure upward closure, thus checking unbounded growth of the rule base. Their results show that clustering isolates heterogeneous segments from each other, thus promoting the discovery of rules which would otherwise remain undiscovered. Likewise, the use of itemset weighting tends to improve rule quality by promoting the generation of rules with rarer itemsets that would otherwise not be possible with a simple weighting scheme that assigns an equal weight to all possible itemsets. The use of clustering enabled us to study in detail an important sub-class of rare rules, which we term absolute rare rules. Absolute rare rules are those that are not just rare in the dataset as a whole but also rare in the cluster from which they are derived.
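As an illustration of the first strand, the sketch below computes a one-sided Fisher exact test on the 2x2 contingency table of two itemsets X and Y. The choice of the right-tailed variant, the function name, and the example counts are assumptions; the chapter may configure the test differently.

```python
from math import comb


def fisher_right_tail(both: int, only_x: int, only_y: int, neither: int) -> float:
    """P-value for observing at least `both` co-occurrences of X and Y
    if X and Y were independent (hypergeometric upper tail)."""
    n = both + only_x + only_y + neither
    with_x = both + only_x  # transactions containing X
    with_y = both + only_y  # transactions containing Y
    total = comb(n, with_y)
    upper = min(with_x, with_y)
    tail = sum(
        comb(with_x, k) * comb(n - with_x, with_y - k)
        for k in range(both, upper + 1)
    )
    return tail / total


# Hypothetical counts: X and Y always appear together in 3 of 6 transactions.
p_together = fisher_right_tail(both=3, only_x=0, only_y=0, neither=3)

# Hypothetical counts: X and Y rarely appear together.
p_apart = fisher_right_tail(both=1, only_x=2, only_y=2, neither=1)
```

A small p-value, as in the first call, suggests that the co-occurrence is unlikely to be due to chance, which is the property a rare-rule miner can use in place of a high support threshold.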


2017 ◽  
Vol 9 (2) ◽  
pp. 1 ◽  
Author(s):  
Meenakshi Bansal ◽  
Dinesh Grover ◽  
Dhiraj Sharma

Mining sensitive rules is the most important task in data mining. Most existing techniques find sensitive rules based on crisp threshold values of support and confidence, which causes serious side effects to the original database. To avoid these crisp boundaries, this paper uses WFPPM (Weighted Fuzzy Privacy Preserving Mining) to extract sensitive association rules. WFPPM finds the sensitive rules by calculating the weights of the rules. First, we apply FP-Growth to mine association rules from the database. Next, we apply a fuzzy method to find the sensitive rules among the extracted rules. Experimental results show that the proposed scheme finds the actual sensitive rules without any modification while maintaining the quality of the released data, compared with previous techniques.


Author(s):  
Ronaldo Cristiano Prati

The Receiver Operating Characteristics (ROC) graph is a popular way of assessing the performance of classification rules. However, as such graphs are based on class conditional probabilities, they are inappropriate for evaluating the quality of association rules. This follows from the fact that there is no class in association rule mining, and the consequent parts of two different association rules might not have any correlation at all. This chapter presents an extension of ROC graphs, named QROC (for Quality ROC), which can be used in the association rule context. Furthermore, QROC can help analysts evaluate the relative interestingness of different association rules in different cost scenarios.


2014 ◽  
Vol 8 (3) ◽  
pp. 39-62 ◽  
Author(s):  
Janakiramaiah Bonam ◽  
Ramamohan Reddy

Privacy-preserving association rule mining protects the sensitive association rules specified by the owner of the data by sanitizing the original database so that the sensitive rules are hidden. In this paper, the authors study the problem of hiding sensitive association rules by carefully modifying the transactions in the database. The BHPSP algorithm calculates the impact factor of the items in the sensitive association rules. It then selects a rule that contains an item with minimum impact factor. The algorithm alters the transactions of the database to hide the sensitive association rule while reducing the loss of other, non-sensitive association rules. The quality of the database can be well maintained by greedily selecting the alterations that have negligible side effects. The BHPSP algorithm is experimentally compared with the HCSRIL algorithm with respect to two performance measures: misses cost and the difference between the original and sanitized databases. Experimental results demonstrating the effectiveness of the proposed approach are also reported.

