Correlated High Average-Utility Itemset Mining

AbstractThe high average-utility itemset mining (HAUIM) was established to provide a fair measure instead of genetic high-utility itemset mining (HUIM) for revealing the satisfied and interesting patterns. In practical applications, the database is dynamically changed when insertion/deletion operations are performed on databases. Several works were designed to handle the insertion process but fewer studies focused on processing the deletion process for knowledge maintenance. In this paper, we then develop a PRE-HAUI-DEL algorithm that utilizes the pre-large concept on HAUIM for handling transaction deletion in the dynamic databases. The pre-large concept is served as the buffer on HAUIM that reduces the number of database scans while the database is updated particularly in transaction deletion. Two upper-bound values are also established here to reduce the unpromising candidates early which can speed up the computational cost. From the experimental results, the designed PRE-HAUI-DEL algorithm is well performed compared to the Apriori-like model in terms of runtime, memory, and scalability in dynamic databases.

Download Full-text

Mining of High Average-Utility Pattern Using Multiple Minimum Thresholds in Big Data

Asian Journal of Computer Science and Technology ◽

10.51983/ajcst-2019.8.s2.2024 ◽

2019 ◽

Vol 8 (S2) ◽

pp. 57-60

Author(s):

R. Vasumathi ◽

S. Murugan

Keyword(s):

Big Data ◽

Single Machine ◽

New Method ◽

Map Reduce ◽

Itemset Mining ◽

The Past ◽

Pruning Method ◽

Average Utility ◽

Pruning Methods

In the past years most of the research have been conducted on high average-utility itemset mining (HAUIM) with wide applications. However, most of the methods are used for centralized databases with a single machine performing the mining job. Existing algorithms cannot be applied for big data. We try to solve this issue, by developing a new method for mining high average-utility itemset mining in big data. Map Reduce also used in this paper. Many algorithms were proposed only mine HAUIs using a single minimum high average-utility threshold. In this paper we also try solve this by mining HAUIs multiple minimum high average-utility thresholds. We have developed two pruning methods namely Reduction of utility co-occurrence pruning Method (RUCPM) and Pruning without Scanning Database (PWSD).

Download Full-text

A Systematic Survey on High Utility Itemset Mining

International Journal of Information Technology & Decision Making ◽

10.1142/s0219622019300027 ◽

2019 ◽

Vol 18 (04) ◽

pp. 1113-1185 ◽

Cited By ~ 2

Author(s):

Bahareh Rahmati ◽

Mohammad Karim Sohrabi

Keyword(s):

Data Structures ◽

Search Space ◽

Frequent Itemset ◽

Itemset Mining ◽

Efficient Data ◽

Average Utility ◽

High Utility ◽

High Utility Itemsets ◽

Downward Closure ◽

Efficient Data Structures

High utility itemset mining considers unit profits and quantities of items in a transaction database to extract more applicable and more useful association rules. Downward closure property, which causes significant pruning in frequent itemset mining, is not established in the utility of itemsets and so the mining problem will require alternative solutions to reduce its search space and to enhance its efficiency. Using an anti-monotonic upper bound of the utility function and exploiting efficient data structures for storing and compacting the dataset to perform efficient pruning strategies are the main solutions to address high utility itemset mining problem. Different mining methods and techniques have attempted to improve performance of extracting high utility itemsets and their several variants, including high-average utility itemsets, top-k high utility itemsets, and high utility itemsets with negative values, using more efficient data structures, more appropriate anti-monotonic upper bounds, and stronger pruning strategies. This paper aims to represent a comprehensive systematic review for high utility itemset mining techniques and to classify them based on their problem-solving approaches.

Download Full-text

An efficient algorithm for high average utility itemset mining with buffered average utility-list

10.1109/icisce50968.2020.00056 ◽

2020 ◽

Author(s):

Min-Jie Gao ◽

Jia-Xiang Lin ◽

Jian-Wei Wu

Keyword(s):

Efficient Algorithm ◽

Itemset Mining ◽

Average Utility

Download Full-text

Incrementally updating the high average-utility patterns with pre-large concept

Applied Intelligence ◽

10.1007/s10489-020-01743-y ◽

2020 ◽

Vol 50 (11) ◽

pp. 3788-3807

Author(s):

Jerry Chun-Wei Lin ◽

Matin Pirouz ◽

Youcef Djenouri ◽

Chien-Fu Cheng ◽

Usman Ahmed

Keyword(s):

State Of The Art ◽

The State ◽

Batch Mode ◽

Itemset Mining ◽

The Past ◽

Dynamic Databases ◽

Speed Up ◽

Average Utility ◽

High Utility ◽

High Utility Patterns

Abstract High-utility itemset mining (HUIM) is considered as an emerging approach to detect the high-utility patterns from databases. Most existing algorithms of HUIM only consider the itemset utility regardless of the length. This limitation raises the utility as a result of a growing itemset size. High average-utility itemset mining (HAUIM) considers the size of the itemset, thus providing a more balanced scale to measure the average-utility for decision-making. Several algorithms were presented to efficiently mine the set of high average-utility itemsets (HAUIs) but most of them focus on handling static databases. In the past, a fast-updated (FUP)-based algorithm was developed to efficiently handle the incremental problem but it still has to re-scan the database when the itemset in the original database is small but there is a high average-utility upper-bound itemset (HAUUBI) in the newly inserted transactions. In this paper, an efficient framework called PRE-HAUIMI for transaction insertion in dynamic databases is developed, which relies on the average-utility-list (AUL) structures. Moreover, we apply the pre-large concept on HAUIM. A pre-large concept is used to speed up the mining performance, which can ensure that if the total utility in the newly inserted transaction is within the safety bound, the small itemsets in the original database could not be the large ones after the database is updated. This, in turn, reduces the recurring database scans and obtains the correct HAUIs. Experiments demonstrate that the PRE-HAUIMI outperforms the state-of-the-art batch mode HAUI-Miner, and the state-of-the-art incremental IHAUPM and FUP-based algorithms in terms of runtime, memory, number of assessed patterns and scalability.

Download Full-text

Analytics of high average-utility patterns in the industrial internet of things

Applied Intelligence ◽

10.1007/s10489-021-02751-2 ◽

2021 ◽

Author(s):

Jimmy Ming-Tai Wu ◽

Zhongcui Li ◽

Gautam Srivastava ◽

Unil Yun ◽

Jerry Chun-Wei Lin

Keyword(s):

High Performance ◽

Pattern Mining ◽

Search Space ◽

Research Field ◽

Industrial Internet Of Things ◽

Utility Measure ◽

Itemset Mining ◽

Industrial Internet ◽

Average Utility ◽

High Utility

AbstractRecently, revealing more valuable information except for quantity value for a database is an essential research field. High utility itemset mining (HAUIM) was suggested to reveal useful patterns by average-utility measure for pattern analytics and evaluations. HAUIM provides a more fair assessment than generic high utility itemset mining and ignores the influence of the length of itemsets. There are several high-performance HAUIM algorithms proposed to gain knowledge from a disorganized database. However, most existing works do not concern the uncertainty factor, which is one of the characteristics of data gathered from IoT equipment. In this work, an efficient algorithm for HAUIM to handle the uncertainty databases in IoTs is presented. Two upper-bound values are estimated to early diminish the search space for discovering meaningful patterns that greatly solve the limitations of pattern mining in IoTs. Experimental results showed several evaluations of the proposed approach compared to the existing algorithms, and the results are acceptable to state that the designed approach efficiently reveals high average utility itemsets from an uncertain situation.

Download Full-text