EGEA : A New Hybrid Approach Towards Extracting Reduced Generic Association Rule Set (Application to AML Blood Cancer Therapy)

Author(s):  
M. A. Esseghir ◽  
G. Gasmi ◽  
S. Ben Yahia ◽  
Y. Slimani
Author(s):  
Loan T. T. Nguyen ◽  
Hai T. Nguyen ◽  
Bay Vo ◽  
Ngoc-Thanh Nguyen

2017 ◽  
Vol 26 ◽  
pp. S135-S136
Author(s):  
J. Franzon ◽  
N. Berry ◽  
S. Ullah ◽  
V. Versace ◽  
A. McCarthy ◽  
...  

2021 ◽  
pp. 13-26
Author(s):  
Felix Kruse ◽  
Jan-Philipp Awick ◽  
Jorge Marx Gómez ◽  
Peter Loos

This paper explores record linkage, a step in the data integration process, with a focus on the entity company. For the integration of company data, the company name is a crucial attribute, and it often includes the legal form. This legal form is not represented concisely or consistently across data sources, which leads to considerable data quality problems in the subsequent steps of record linkage. To solve these problems, we classify and extract the legal form from the company name attribute. For this purpose, we iteratively developed four different approaches and compared them in a benchmark. The best approach is a hybrid approach combining a rule set and a supervised machine learning model. With our hybrid approach, company data sets from research or business can be processed, so the data quality for subsequent processing steps such as record linkage can be improved. Furthermore, our approach can be adapted to solve the same data quality problems in other attributes.
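
As a rough illustration of the rule-set half of such a hybrid approach, the sketch below strips a trailing legal form from a company name and maps it to a canonical label. The lookup table `LEGAL_FORMS` and the function `split_legal_form` are hypothetical names introduced here for illustration; they are not taken from the paper, and the supervised model the abstract mentions is not shown.

```python
# Hypothetical, deliberately tiny lookup: normalized spelling -> canonical legal form.
LEGAL_FORMS = {
    "gmbh": "GmbH",
    "ag": "AG",
    "ltd": "Ltd",
    "limited": "Ltd",
    "inc": "Inc",
    "incorporated": "Inc",
    "llc": "LLC",
}


def split_legal_form(company_name):
    """Return (name_without_legal_form, canonical_legal_form or None).

    Only the last whitespace-separated token is inspected, which is an
    assumption of this sketch; real data may carry the legal form elsewhere.
    """
    tokens = company_name.split()
    if tokens:
        # Normalize the last token: drop punctuation and dots, lowercase.
        key = tokens[-1].strip(".,").lower().replace(".", "")
        if key in LEGAL_FORMS:
            rest = " ".join(tokens[:-1]).rstrip(",")
            return rest, LEGAL_FORMS[key]
    return company_name.strip(), None
```

Names whose last token is not in the table are returned unchanged with `None` as the legal form; in a hybrid setup, those are the cases one might hand to a trained classifier.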


2014 ◽  
Vol 8 (1) ◽  
pp. 303-307 ◽  
Author(s):  
Zhonglin Zhang ◽  
Zongcheng Liu ◽  
Chongyu Qiao

A method for tendency mining in dynamic association rules, based on an SVM classifier with compatibility feature vectors, is proposed. First, the set of class association rules (CARs) is mined using tendency mining in dynamic association rules. Second, an SVM is used to construct a classifier based on compatibility feature vectors to classify the obtained CARs, which is advantageous when dealing with highly complex data. The model is constructed by judging the weights of the rules. Finally, the method is compared with traditional methods with respect to mining accuracy. The method addresses the problem of high time complexity and achieves higher accuracy than the traditional methods, which helps make the mining of dynamic association rules more accurate and effective. Analysis of the final results shows that the method has lower complexity and higher classification accuracy.


Author(s):  
Suma B. ◽  
Shobha G.

Association rule mining is a well-known data mining technique used for extracting hidden correlations between data items in large databases. In most situations, data mining results contain sensitive information about individuals, and publishing such data will violate individual privacy. The challenge of association rule mining is to preserve the confidentiality of sensitive rules when releasing the database to external parties. The association rule hiding technique conceals the knowledge extracted by the sensitive association rules by modifying the database. In this paper, we introduce a border-based algorithm for hiding sensitive association rules. The main purpose of this approach is to conceal the sensitive rule set while maintaining the utility of the database and association rule mining results at the highest level. The performance of the algorithm in terms of side effects is demonstrated using experiments conducted on two real datasets. The results show that the information loss is minimized without sacrificing accuracy.


Author(s):  
Kaoru Shimada ◽  
Hisae Aoki ◽  
Keiko Kubota ◽  
Satoru Haresaku ◽  
Shinsuke Mizutani ◽  
...  

Author(s):  
YUE XU ◽  
YUEFENG LI

Association rule mining has many achievements in the area of knowledge discovery. However, the quality of the extracted association rules has not drawn adequate attention from researchers in the data mining community. One big concern with the quality of association rule mining is the size of the extracted rule set. Very often, tens of thousands of association rules are extracted, many of which are redundant and thus useless. In this paper, we first analyze the redundancy problem in association rules and then propose a reliable exact association rule basis from which more concise nonredundant rules can be extracted. We prove that the redundancy eliminated using the proposed reliable association rule basis does not reduce the belief in the extracted rules. Moreover, this paper proposes a level-wise approach for efficiently extracting closed itemsets and minimal generators, a key issue in closure-based association rule mining.
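
To make the terms concrete: an itemset is closed if it equals the intersection of all transactions that contain it, and it is a minimal generator if every proper subset has strictly higher support. The sketch below finds both by brute-force enumeration over a toy database. It is not the level-wise algorithm the paper proposes, and the function names are chosen here for illustration only.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset` (None if none do)."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None
    return frozenset.intersection(*covering)


def closed_and_generators(transactions, min_support=1):
    """Brute-force enumeration of closed itemsets and minimal generators.

    Exponential in the number of items; intended for toy inputs only.
    The empty itemset is not enumerated.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    closed, generators = set(), set()
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            x = frozenset(combo)
            s = support(x, transactions)
            if s < min_support:
                continue
            if closure(x, transactions) == x:
                closed.add(x)
            # Support is anti-monotone, so checking the immediate subsets
            # is enough to rule out any proper subset with equal support.
            if all(support(x - {i}, transactions) > s for i in x):
                generators.add(x)
    return closed, generators
```

For the transactions `{a,b,c}`, `{a,b}`, `{a,c}`, `{b}`, the itemset `{c}` is a minimal generator but not closed, because every transaction containing `c` also contains `a`, so its closure is `{a,c}`.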


2012 ◽  
Vol 2012 ◽  
pp. 1-12 ◽  
Author(s):  
Ferdinando Di Martino ◽  
Salvatore Sessa

We present a new method based on fuzzy transforms for detecting coarse-grained association rules in datasets. The fuzzy association rules are represented in the form of linguistic expressions, and we introduce a pre-processing phase to determine the optimal fuzzy partition of the domains of the quantitative attributes. In the extraction of the fuzzy association rules, we use the AprioriGen algorithm and a confidence index calculated via the inverse fuzzy transform. Our method is applied to datasets of the 2001 census database of the district of Naples (Italy); the results show that the extracted fuzzy association rules provide a correct coarse-grained view of the data association rule set.
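
For readers unfamiliar with fuzzy transforms, the sketch below shows a discrete direct and inverse fuzzy transform in one dimension over a uniform triangular fuzzy partition. It illustrates the general technique only: the partition shape, the function names, and the one-dimensional setting are assumptions made here, and the paper's confidence index and partition-selection step are not reproduced.

```python
def triangular_partition(a, b, n):
    """Uniform triangular fuzzy partition of [a, b] with n >= 2 nodes."""
    h = (b - a) / (n - 1)
    nodes = [a + k * h for k in range(n)]

    def membership(k, x):
        # Triangle centred on node k with half-width h, clipped at zero.
        return max(0.0, 1.0 - abs(x - nodes[k]) / h)

    return nodes, membership


def direct_ft(points, values, n, a, b):
    """Direct transform: membership-weighted mean of the data for each basic function."""
    _, A = triangular_partition(a, b, n)
    components = []
    for k in range(n):
        weights = [A(k, p) for p in points]
        total = sum(weights)
        if total == 0:
            raise ValueError("data not dense enough for this partition")
        components.append(sum(w * v for w, v in zip(weights, values)) / total)
    return components


def inverse_ft(components, x, a, b):
    """Inverse transform: blend the components by membership degree at x."""
    _, A = triangular_partition(a, b, len(components))
    return sum(c * A(k, x) for k, c in enumerate(components))
```

Because the triangular memberships sum to one at every point of `[a, b]`, a constant function is reproduced exactly by the inverse transform; for other functions the result is a smoothed approximation whose fidelity depends on the number of nodes.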


2013 ◽  
Vol 2013 ◽  
pp. 1-13 ◽  
Author(s):  
Amir Hossein Azadnia ◽  
Shahrooz Taheri ◽  
Pezhman Ghadimi ◽  
Muhamad Zameri Mat Saman ◽  
Kuan Yew Wong

One of the cost-intensive issues in managing warehouses is the order picking problem, which deals with the retrieval of items from their storage locations to meet customer requests. Many solution approaches have been proposed to minimize traveling distance in the process of order picking. In practice, however, customer orders have to be completed by certain due dates to avoid tardiness, which is neglected in most of the related scientific papers. Consequently, we propose a novel four-phase solution approach to minimize tardiness. First, weighted association rule mining is used to calculate associations between orders with respect to their due dates. Next, a batching model based on binary integer programming is formulated to maximize the associations between orders within each batch. Subsequently, in the order picking phase, a Genetic Algorithm integrated with the Traveling Salesman Problem is used to identify the most suitable travel path. Finally, the Genetic Algorithm is applied to sequence the constructed batches so as to minimize tardiness. Illustrative examples and comparisons are presented to demonstrate the proficiency and solution quality of the proposed approach.

