IMIDB: An Algorithm for Indexed Mining of Incremental Databases

2017 ◽  
Vol 26 (1) ◽  
pp. 69-85
Author(s):  
Mohammed M. Fouad ◽  
Mostafa G.M. Mostafa ◽  
Abdulfattah S. Mashat ◽  
Tarek F. Gharib

AbstractAssociation rules provide important knowledge that can be extracted from transactional databases. Owing to the massive exchange of information nowadays, databases become dynamic and change rapidly and periodically: new transactions are added to the database and/or old transactions are updated or removed from the database. Incremental mining was introduced to overcome the problem of maintaining previously generated association rules in dynamic databases. In this paper, we propose an efficient algorithm (IMIDB) for incremental itemset mining in large databases. The algorithm utilizes the trie data structure for indexing dynamic database transactions. Performance comparison of the proposed algorithm to recently cited algorithms shows that a significant improvement of about two orders of magnitude is achieved by our algorithm. Also, the proposed algorithm exhibits linear scalability with respect to database size.

2010 ◽  
Vol 69 (8) ◽  
pp. 800-815 ◽  
Author(s):  
Tarek F. Gharib ◽  
Hamed Nassar ◽  
Mohamed Taha ◽  
Ajith Abraham

2005 ◽  
Vol 04 (04) ◽  
pp. 257-267
Author(s):  
Kyong Rok Han ◽  
Jae Yearn Kim

The problem of discovering association rules between items in a database is an emerging area of research. Its goal is to extract significant patterns or interesting rules from large databases. Recent studies of mining association rules have proposed a closure mechanism. It is no longer necessary to mine the set of all of the frequent itemsets and their association rules. Rather, it is sufficient to mine the frequent closed itemsets and their corresponding rules. In the past, a number of algorithms for mining frequent closed itemsets have been based on items. In this paper, we use the transaction itself for mining frequent closed itemsets. An efficient algorithm called FCILINK is proposed that is based on a link structure between transactions. A given database is scanned once and then a much smaller sub-database is scanned twice. Our experimental results show that our algorithm is faster than previously proposed methods. Furthermore, our approach is significantly more efficient for dense databases.


1997 ◽  
Vol 06 (02) ◽  
pp. 273-290 ◽  
Author(s):  
David W. Cheung ◽  
Vincent T. Ng ◽  
Benjamin W. Tam

Update of the single- and multi-level association rules discovered in large databases is inherently costly. The straight forward approach of re-running the discovery algorithm on the entire updated database to re-discover the association rules is not cost-effective. An incremental algorithm FUP have been proposed for the update of discovered single-level association rules. In this study, we have shown that the incremental technique in FUP can be generalized to other data mining systems. An efficient algorithm MLUp has been proposed for the updating of discovered multi-level association rules. Our performance study shows that MLUp has a superior performance over the representative mining algorithm such as ML-T2 in updating discovered multi-level association rules.


2018 ◽  
Vol 7 (3.34) ◽  
pp. 309
Author(s):  
UMohan Srinivas ◽  
Ch Anuradha ◽  
Dr P. Sri Rama Chandra Murty

Frequent Itemset Mining become so popular in extracting hidden patterns from transactional databases. Among the several approaches, Apriori algorithm is known to be a basic approach which follows candidate generate and test based strategy. Although it is efficient level-wise approach, it has two limitations, (i) several passes are required to check the support of candidate itemsets. (ii) Towards more candidate itemsets and minimum threshold variations. A novel approach is proposed to tackle the above limitations. The proposed approach is one pass Hash-based Frequent Itemset Mining to derive frequent patterns. HFIM has feature that maintains candidate itemsets dynamically which are independent on minimum threshold. This feature allows to limit the number of scans over the database to one. In this paper, HFIM is compared with the Apriori to show the performance on standard datasets. The result section shows that HFIM outperforms Apriori over large databases. 


Author(s):  
Jimmy Ming-Tai Wu ◽  
Qian Teng ◽  
Shahab Tayeb ◽  
Jerry Chun-Wei Lin

AbstractThe high average-utility itemset mining (HAUIM) was established to provide a fair measure instead of genetic high-utility itemset mining (HUIM) for revealing the satisfied and interesting patterns. In practical applications, the database is dynamically changed when insertion/deletion operations are performed on databases. Several works were designed to handle the insertion process but fewer studies focused on processing the deletion process for knowledge maintenance. In this paper, we then develop a PRE-HAUI-DEL algorithm that utilizes the pre-large concept on HAUIM for handling transaction deletion in the dynamic databases. The pre-large concept is served as the buffer on HAUIM that reduces the number of database scans while the database is updated particularly in transaction deletion. Two upper-bound values are also established here to reduce the unpromising candidates early which can speed up the computational cost. From the experimental results, the designed PRE-HAUI-DEL algorithm is well performed compared to the Apriori-like model in terms of runtime, memory, and scalability in dynamic databases.


2013 ◽  
Vol 23 (04n05) ◽  
pp. 335-355 ◽  
Author(s):  
HAIM KAPLAN ◽  
MICHA SHARIR

Let P be a set of n points in the plane. We present an efficient algorithm for preprocessing P, so that, for a given query point q, we can quickly report the largest disk that contains q but its interior is disjoint from P. The storage required by the data structure is O(n log n), the preprocessing cost is O(n log 2 n), and a query takes O( log 2 n) time. We also present an alternative solution with an improved query cost and with slightly worse storage and preprocessing requirements.


2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Zhicong Kou ◽  
Lifeng Xi

An effective data mining method to automatically extract association rules between manufacturing capabilities and product features from the available historical data is essential for an efficient and cost-effective product development and production. This paper proposes a new binary particle swarm optimization- (BPSO-) based association rule mining (BPSO-ARM) method for discovering the hidden relationships between machine capabilities and product features. In particular, BPSO-ARM does not need to predefine thresholds of minimum support and confidence, which improves its applicability in real-world industrial cases. Moreover, a novel overlapping measure indication is further proposed to eliminate those lower quality rules to further improve the applicability of BPSO-ARM. The effectiveness of BPSO-ARM is demonstrated on a benchmark case and an industrial case about the automotive part manufacturing. The performance comparison indicates that BPSO-ARM outperforms other regular methods (e.g., Apriori) for ARM. The experimental results indicate that BPSO-ARM is capable of discovering important association rules between machine capabilities and product features. This will help support planners and engineers for the new product design and manufacturing.


2012 ◽  
Vol 24 (06) ◽  
pp. 513-524
Author(s):  
Mohsen Alavash Shooshtari ◽  
Keivan Maghooli ◽  
Kambiz Badie

One of the main objectives of data mining as a promising multidisciplinary field in computer science is to provide a classification model to be used for decision support purposes. In the medical imaging domain, mammograms classification is a difficult diagnostic task which calls for development of automated classification systems. Associative classification, as a special case of association rules mining, has been adopted in classification problems for years. In this paper, an associative classification framework based on parallel mining of image blocks is proposed to be used for mammograms discrimination. Indeed, association rules mining is applied to a commonly used mammography image database to classify digital mammograms into three categories, namely normal, benign and malign. In order to do so, first images are preprocessed and then features are extracted from non-overlapping image blocks and discretized for rule discovery. Association rules are then discovered through parallel mining of transactional databases which correspond to the image blocks, and finally are used within a unique decision-making scheme to predict the class of unknown samples. Finally, experiments are conducted to assess the effectiveness of the proposed framework. Results show that the proposed framework proved successful in terms of accuracy, precision, and recall, and suggest that the framework could be used as the core of any future associative classifier to support mammograms discrimination.


Sign in / Sign up

Export Citation Format

Share Document