IMIDB: An Algorithm for Indexed Mining of Incremental Databases

Mohammed M. Fouad; Mostafa G.M. Mostafa; Abdulfattah S. Mashat; Tarek F. Gharib

doi:10.1515/jisys-2015-0107

Journal of Intelligent Systems ◽

10.1515/jisys-2015-0107 ◽

2017 ◽

Vol 26 (1) ◽

pp. 69-85

Author(s):

Mohammed M. Fouad ◽

Mostafa G.M. Mostafa ◽

Abdulfattah S. Mashat ◽

Tarek F. Gharib

Keyword(s):

Data Structure ◽

Association Rules ◽

Efficient Algorithm ◽

Performance Comparison ◽

Incremental Mining ◽

Itemset Mining ◽

Large Databases ◽

Transactional Databases ◽

Dynamic Databases ◽

Database Size

Abstract: Association rules provide important knowledge that can be extracted from transactional databases. Owing to the massive exchange of information, databases have become dynamic and change rapidly and periodically: new transactions are added, and old transactions are updated or removed. Incremental mining was introduced to address the problem of maintaining previously generated association rules in dynamic databases. In this paper, we propose an efficient algorithm (IMIDB) for incremental itemset mining in large databases. The algorithm uses a trie data structure to index dynamic database transactions. A performance comparison with recently cited algorithms shows an improvement of about two orders of magnitude. The proposed algorithm also exhibits linear scalability with respect to database size.
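To make the idea of indexing transactions with a trie concrete, here is a minimal Python sketch. It is not the IMIDB data structure itself, whose details the abstract does not give. The class names `TrieNode` and `TransactionTrie`, the per-node counters, and the sorted-item ordering are all assumptions made for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # item -> TrieNode
        self.count = 0      # transactions whose sorted prefix reaches this node


class TransactionTrie:
    """Hypothetical trie index over transactions (illustrative only)."""

    def __init__(self):
        self.root = TrieNode()

    def _walk(self, transaction, delta):
        # Each transaction is stored as one path of its sorted, distinct items.
        node = self.root
        node.count += delta
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, TrieNode())
            node.count += delta

    def add(self, transaction):
        self._walk(transaction, +1)

    def remove(self, transaction):
        # Assumes the transaction was added earlier; empty nodes are not pruned.
        self._walk(transaction, -1)

    def support(self, itemset):
        return self._count(self.root, sorted(set(itemset)), 0)

    def _count(self, node, target, i):
        if i == len(target):
            return node.count  # every transaction through here contains the target
        total = 0
        for item, child in node.children.items():
            if item == target[i]:
                total += self._count(child, target, i + 1)
            elif item < target[i]:
                total += self._count(child, target, i)
            # item > target[i]: paths are sorted, so target[i] cannot appear below
        return total
```

Because additions and removals only touch one path, the index can follow a changing database without being rebuilt, which is the property an incremental miner relies on.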

An efficient algorithm for incremental mining of temporal association rules

Data & Knowledge Engineering ◽

10.1016/j.datak.2010.03.002 ◽

2010 ◽

Vol 69 (8) ◽

pp. 800-815 ◽

Cited By ~ 44

Author(s):

Tarek F. Gharib ◽

Hamed Nassar ◽

Mohamed Taha ◽

Ajith Abraham

Keyword(s):

Association Rules ◽

Efficient Algorithm ◽

Temporal Association ◽

Incremental Mining

An Efficient Algorithm for Incremental Mining of Association Rules

15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications (RIDE-SDMA'05) ◽

10.1109/ride.2005.6 ◽

2005 ◽

Cited By ~ 11

Author(s):

Chin-Chen Chang ◽

Yu-Chiang Li ◽

Jung-San Lee

Keyword(s):

Association Rules ◽

Efficient Algorithm ◽

Incremental Mining

FCILINK: Mining Frequent Closed Itemsets Based on a Link Structure between Transactions

Journal of Information & Knowledge Management ◽

10.1142/s0219649205001213 ◽

2005 ◽

Vol 04 (04) ◽

pp. 257-267

Author(s):

Kyong Rok Han ◽

Jae Yearn Kim

Keyword(s):

Association Rules ◽

Efficient Algorithm ◽

Frequent Itemsets ◽

Experimental Results ◽

Link Structure ◽

The Past ◽

Large Databases ◽

Closure Mechanism ◽

Closed Itemsets ◽

Significant Patterns

The problem of discovering association rules between items in a database is an emerging area of research. Its goal is to extract significant patterns or interesting rules from large databases. Recent studies of mining association rules have proposed a closure mechanism. It is no longer necessary to mine the set of all of the frequent itemsets and their association rules. Rather, it is sufficient to mine the frequent closed itemsets and their corresponding rules. In the past, a number of algorithms for mining frequent closed itemsets have been based on items. In this paper, we use the transaction itself for mining frequent closed itemsets. An efficient algorithm called FCILINK is proposed that is based on a link structure between transactions. A given database is scanned once and then a much smaller sub-database is scanned twice. Our experimental results show that our algorithm is faster than previously proposed methods. Furthermore, our approach is significantly more efficient for dense databases.
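As a small illustration of the closure idea, a frequent itemset is closed when no proper superset has the same support. The brute-force Python sketch below is not FCILINK and does not use a link structure. The function names and the absolute `min_support` count are assumptions, and the enumeration is exponential, so it is only suitable for toy data.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force support counting; transactions is a list of sets."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            sup = sum(1 for t in transactions if candidate <= t)
            if sup >= min_support:
                result[candidate] = sup
    return result


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with equal support."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < other and sup == other_sup
                   for other, other_sup in frequent.items())
    }
```

On the transactions `{a,b,c}`, `{a,b}`, `{a,b,c}`, `{c}` with a minimum support of 2, seven itemsets are frequent but only three are closed, and the supports of the other four can be recovered from those three.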

Incremental Updates of Discovered Multi-Level Association Rules

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213097000153 ◽

1997 ◽

Vol 06 (02) ◽

pp. 273-290 ◽

Cited By ~ 2

Author(s):

David W. Cheung ◽

Vincent T. Ng ◽

Benjamin W. Tam

Keyword(s):

Data Mining ◽

Association Rules ◽

Efficient Algorithm ◽

Cost Effective ◽

Incremental Algorithm ◽

Superior Performance ◽

Performance Study ◽

Mining Algorithm ◽

Large Databases ◽

Multi Level

Updating the single- and multi-level association rules discovered in large databases is inherently costly. The straightforward approach of re-running the discovery algorithm on the entire updated database to rediscover the association rules is not cost-effective. An incremental algorithm, FUP, has been proposed for updating discovered single-level association rules. In this study, we show that the incremental technique in FUP can be generalized to other data mining systems. An efficient algorithm, MLUp, is proposed for updating discovered multi-level association rules. Our performance study shows that MLUp performs better than representative mining algorithms such as ML-T2 in updating discovered multi-level association rules.
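The saving from an incremental update can be sketched for single items. This is a simplified illustration in the spirit of FUP, not the FUP or MLUp algorithm. The function name `update_frequent_items`, its parameters, and the relative threshold `min_ratio` are assumptions. The pruning rule it relies on is that an item infrequent in the old database can become frequent overall only if it is frequent in the increment.

```python
from collections import Counter


def update_frequent_items(old_db, old_counts, increment, min_ratio):
    """old_counts holds supports of items that were frequent in old_db."""
    total = len(old_db) + len(increment)
    inc_counts = Counter(i for t in increment for i in set(t))
    updated = {}

    # Previously frequent items: add the increment's count, no rescan of old_db.
    for item, count in old_counts.items():
        new_count = count + inc_counts.get(item, 0)
        if new_count >= min_ratio * total:
            updated[item] = new_count

    # Previously infrequent items: rescan old_db only if frequent in the increment.
    for item, count in inc_counts.items():
        if item in old_counts or count < min_ratio * len(increment):
            continue
        new_count = count + sum(1 for t in old_db if item in t)
        if new_count >= min_ratio * total:
            updated[item] = new_count
    return updated
```

Only the second loop touches the old database, and only for the few items that pass the increment-level filter.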

Hash based Approach for Mining Frequent Item Sets from Transactional Databases

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.34.19214 ◽

2018 ◽

Vol 7 (3.34) ◽

pp. 309

Author(s):

U Mohan Srinivas ◽

Ch Anuradha ◽

Dr P. Sri Rama Chandra Murty

Keyword(s):

Frequent Itemset ◽

Frequent Itemset Mining ◽

Minimum Threshold ◽

Result Section ◽

Itemset Mining ◽

Novel Approach ◽

Large Databases ◽

Transactional Databases ◽

Efficient Level ◽

Frequent Item Sets

Frequent itemset mining has become popular for extracting hidden patterns from transactional databases. Among the several approaches, the Apriori algorithm is a basic approach that follows a candidate generate-and-test strategy. Although it is an efficient level-wise approach, it has two limitations: (i) several passes are required to check the support of candidate itemsets, and (ii) it is sensitive to the number of candidate itemsets and to variations in the minimum threshold. A novel approach is proposed to tackle these limitations: one-pass Hash-based Frequent Itemset Mining (HFIM) to derive frequent patterns. HFIM maintains candidate itemsets dynamically, independent of the minimum threshold. This feature limits the number of scans over the database to one. In this paper, HFIM is compared with Apriori to show its performance on standard datasets. The results show that HFIM outperforms Apriori on large databases.
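The one-pass idea can be sketched with a hash table of itemset counts that is filled in a single scan and filtered by threshold afterwards. This is not the HFIM algorithm, whose internals the abstract does not describe. The function names and the `max_size` bound (added here to keep the number of stored itemsets manageable) are assumptions.

```python
from collections import Counter
from itertools import combinations


def count_itemsets_one_pass(transactions, max_size=2):
    """Single scan: hash every itemset of up to max_size items into a counter."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return counts


def frequent(counts, min_support):
    """Apply the threshold after counting, so it can change without a rescan."""
    return {s: c for s, c in counts.items() if c >= min_support}
```

Since the counts do not depend on the threshold, calling `frequent` again with a different `min_support` needs no further pass over the data. The cost is memory, because infrequent itemsets are stored as well.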

Improved approaches to mine rare association rules in transactional databases

Proceedings of the Fourth SIGMOD PhD Workshop on Innovative Database Research - IDAR '10 ◽

10.1145/1811136.1811140 ◽

2010 ◽

Cited By ~ 2

Author(s):

R. Uday Kiran ◽

P. Krishna Reddy

Keyword(s):

Association Rules ◽

Rare Association ◽

Transactional Databases

Dynamic maintenance model for high average-utility pattern mining with deletion operation

Applied Intelligence ◽

10.1007/s10489-021-02539-4 ◽

2021 ◽

Author(s):

Jimmy Ming-Tai Wu ◽

Qian Teng ◽

Shahab Tayeb ◽

Jerry Chun-Wei Lin

Keyword(s):

Pattern Mining ◽

Computational Cost ◽

Practical Applications ◽

Itemset Mining ◽

Dynamic Databases ◽

Speed Up ◽

Dynamic Maintenance ◽

Average Utility ◽

High Utility ◽

Maintenance Model

Abstract: High average-utility itemset mining (HAUIM) was established to provide a fairer measure than generic high-utility itemset mining (HUIM) for revealing interesting patterns. In practical applications, the database changes dynamically as insertion and deletion operations are performed. Several works were designed to handle insertion, but fewer studies have focused on deletion for knowledge maintenance. In this paper, we develop the PRE-HAUI-DEL algorithm, which applies the pre-large concept to HAUIM to handle transaction deletion in dynamic databases. The pre-large concept serves as a buffer in HAUIM that reduces the number of database scans when the database is updated, particularly under transaction deletion. Two upper-bound values are also established to prune unpromising candidates early, which reduces the computational cost. In the experimental results, the PRE-HAUI-DEL algorithm performs well compared with the Apriori-like model in terms of runtime, memory, and scalability in dynamic databases.
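The abstract does not define average utility, so the sketch below assumes the definition commonly used in the HAUIM literature: in each transaction that contains the itemset, the summed utility of its items is divided by the number of items in the itemset, and these values are added over all such transactions. The function name and the quantity and profit tables are hypothetical. The pre-large buffering and the upper bounds of PRE-HAUI-DEL are not shown.

```python
def average_utility(itemset, transactions, unit_profit):
    """transactions: list of dicts mapping item -> purchased quantity."""
    items = list(itemset)
    total = 0.0
    for t in transactions:
        if all(i in t for i in items):
            utility = sum(t[i] * unit_profit[i] for i in items)
            total += utility / len(items)  # divide by itemset length
    return total
```

Dividing by the itemset length is what makes the measure fairer for longer itemsets, whose raw utility otherwise tends to grow simply because they contain more items.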

FINDING THE LARGEST EMPTY DISK CONTAINING A QUERY POINT

International Journal of Computational Geometry & Applications ◽

10.1142/s021819591360008x ◽

2013 ◽

Vol 23 (04n05) ◽

pp. 335-355 ◽

Cited By ~ 1

Author(s):

HAIM KAPLAN ◽

MICHA SHARIR

Keyword(s):

Data Structure ◽

Efficient Algorithm ◽

Alternative Solution ◽

Query Point

Let P be a set of n points in the plane. We present an efficient algorithm for preprocessing P so that, for a given query point q, we can quickly report the largest disk that contains q and whose interior is disjoint from P. The storage required by the data structure is $O(n \log n)$, the preprocessing cost is $O(n \log^2 n)$, and a query takes $O(\log^2 n)$ time. We also present an alternative solution with an improved query cost and with slightly worse storage and preprocessing requirements.

Binary Particle Swarm Optimization-Based Association Rule Mining for Discovering Relationships between Machine Capabilities and Product Features

Mathematical Problems in Engineering ◽

10.1155/2018/2456010 ◽

2018 ◽

Vol 2018 ◽

pp. 1-16 ◽

Cited By ~ 2

Author(s):

Zhicong Kou ◽

Lifeng Xi

Keyword(s):

Particle Swarm Optimization ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Particle Swarm ◽

Performance Comparison ◽

Binary Particle Swarm Optimization ◽

Rule Mining ◽

Swarm Optimization ◽

Product Features

An effective data mining method that automatically extracts association rules between manufacturing capabilities and product features from available historical data is essential for efficient and cost-effective product development and production. This paper proposes a new binary particle swarm optimization (BPSO)-based association rule mining (BPSO-ARM) method for discovering the hidden relationships between machine capabilities and product features. In particular, BPSO-ARM does not need predefined thresholds of minimum support and confidence, which improves its applicability in real-world industrial cases. Moreover, a novel overlapping measure indication is proposed to eliminate lower-quality rules, which further improves the applicability of BPSO-ARM. The effectiveness of BPSO-ARM is demonstrated on a benchmark case and on an industrial case in automotive part manufacturing. The performance comparison indicates that BPSO-ARM outperforms other regular methods (e.g., Apriori) for ARM. The experimental results indicate that BPSO-ARM is capable of discovering important association rules between machine capabilities and product features. This will support planners and engineers in new product design and manufacturing.
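For readers unfamiliar with the two measures whose thresholds BPSO-ARM avoids fixing in advance, the following Python sketch computes the standard support and confidence of a rule. It does not show the particle swarm search itself. The function names and the example item labels are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    target = set(itemset)
    return sum(1 for t in transactions if target <= set(t)) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    ante = support(antecedent, transactions)
    if ante == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), transactions)
    return both / ante
```

With threshold-based miners such as Apriori, a rule is kept only if both values exceed user-chosen minimums, and choosing those minimums well is the step that the abstract says BPSO-ARM removes.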

ASSOCIATIVE CLASSIFICATION OF MAMMOGRAMS BASED ON PARALLEL MINING OF IMAGE BLOCKS

Biomedical Engineering Applications Basis and Communications ◽

10.4015/s1016237212500470 ◽

2012 ◽

Vol 24 (06) ◽

pp. 513-524

Author(s):

Mohsen Alavash Shooshtari ◽

Keivan Maghooli ◽

Kambiz Badie

Keyword(s):

Association Rules ◽

Classification Systems ◽

Classification Model ◽

Automated Classification ◽

Associative Classification ◽

Classification Problems ◽

Association Rules Mining ◽

Parallel Mining ◽

Transactional Databases ◽

Unique Decision

One of the main objectives of data mining, as a promising multidisciplinary field in computer science, is to provide a classification model to be used for decision support purposes. In the medical imaging domain, mammogram classification is a difficult diagnostic task that calls for the development of automated classification systems. Associative classification, as a special case of association rules mining, has been adopted in classification problems for years. In this paper, an associative classification framework based on parallel mining of image blocks is proposed for mammogram discrimination. Association rules mining is applied to a commonly used mammography image database to classify digital mammograms into three categories, namely normal, benign, and malignant. To do so, images are first preprocessed, and then features are extracted from non-overlapping image blocks and discretized for rule discovery. Association rules are then discovered through parallel mining of the transactional databases that correspond to the image blocks, and are used within a unique decision-making scheme to predict the class of unknown samples. Finally, experiments are conducted to assess the effectiveness of the proposed framework. Results show that the proposed framework is successful in terms of accuracy, precision, and recall, and suggest that the framework could be used as the core of any future associative classifier to support mammogram discrimination.