A Scalable Approach for Data Mining – AHUIM

Webology ◽  
2021 ◽  
Vol 18 (1) ◽  
pp. 92-103
Author(s):  
Vandna Dahiya ◽  
Sandeep Dalal

Utility itemset mining, which finds itemsets based on utility factors, has established itself as an essential form of data mining. Utility is defined in terms of quantity and some interest factor. Researchers have developed various methods to mine these itemsets, but most of them do not scale. A scalable approach is now required to meet the growing needs of data mining. This paper recommends a novel Spark-based technique, called Absolute High Utility Itemset Mining (AHUIM), for mining data in a distributed way. The technique is suitable for small as well as large datasets, and its performance is measured on parameters such as speed, scalability, and accuracy.
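The abstract includes no code; the following is a minimal PySpark sketch, in the spirit of a Spark-based approach such as AHUIM, of the distributed first phase of utility mining: summing each item's utility across partitioned transactions and pruning items below a minimum utility. The transaction layout, threshold, and names are illustrative assumptions, not the paper's actual design.

# Minimal PySpark sketch of distributed per-item utility counting.
# The data layout and min_utility value are illustrative assumptions.
from pyspark import SparkContext

sc = SparkContext("local[*]", "utility-mining-sketch")

# Each transaction maps an item to its utility (quantity x unit interest).
transactions = sc.parallelize([
    {"a": 5, "b": 2, "c": 1},
    {"a": 3, "c": 6},
    {"b": 4, "c": 2, "d": 8},
])

min_utility = 5

item_utilities = (
    transactions
    .flatMap(lambda t: t.items())     # emit (item, utility) pairs
    .reduceByKey(lambda x, y: x + y)  # total utility per item across partitions
)

high_utility_items = item_utilities.filter(lambda kv: kv[1] >= min_utility)
print(high_utility_items.collect())   # e.g. [('a', 8), ('c', 9), ('b', 6), ('d', 8)]
sc.stop()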

2020 ◽  
pp. 1-16
Author(s):  
Rui Sun ◽  
Meng Han ◽  
Chunyan Zhang ◽  
Mingyao Shen ◽  
Shiyu Du

High utility itemset mining (HUIM) with negative utility is an emerging data mining task. However, setting the minimum utility threshold is a persistent challenge when mining high utility itemsets (HUIs) that contain negative items. Although the top-k HUIM approach is common, existing methods mine only itemsets with positive items, and itemsets are missed when negative items are involved. To solve this problem, we propose an effective algorithm called THN (Top-k High Utility Itemset Mining with Negative Utility). THN introduces a strategy for automatically raising the minimum utility threshold, uses transaction merging and dataset projection to avoid multiple scans of the database, and prunes the search space with redefined sub-tree utility and local utility values. Experimental results on real datasets show that THN is efficient in terms of runtime and memory usage and has excellent scalability; moreover, it performs particularly well on dense datasets.
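As a rough illustration of the automatic threshold-raising idea the abstract describes (a hedged sketch, not the authors' THN implementation): keep the k largest utilities seen so far in a min-heap; the heap's root is the current minimum utility threshold, and it only rises as better itemsets are found.

# Sketch of top-k threshold raising with a min-heap; names are illustrative.
import heapq

def raise_threshold(utilities, k):
    """Return the running minimum utility threshold after scanning candidates."""
    heap = []  # min-heap holding the k largest utilities seen so far
    for u in utilities:
        if len(heap) < k:
            heapq.heappush(heap, u)
        elif u > heap[0]:
            heapq.heapreplace(heap, u)  # evict the smallest of the current top k
    # With fewer than k candidates the threshold stays at its floor (0).
    return heap[0] if len(heap) == k else 0

# Candidate itemset utilities; negative items can drag a utility below zero.
print(raise_threshold([12, -3, 7, 25, 4, 9], k=3))  # -> 9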


2020 ◽  
Vol 32 ◽  
pp. 03012
Author(s):  
Harshal Bhope ◽  
Yash Mahajan ◽  
Swapnil Deore ◽  
Vimla Jethani

High utility itemset (HUI) mining is an evolving field in data mining that centers on finding all itemsets whose utility meets a user-specified minimum utility. Setting the minimum utility exactly is difficult for users: if it is set too low, too many itemsets are generated and the mining process becomes quite inefficient; if it is set too high, no itemsets are found. This research focuses on generating high utility itemsets using the transaction-weighted utility of each product. The UP-Growth methodology for discovering high utility items from large datasets takes more time and consumes more memory, which makes it less efficient. To overcome these drawbacks, we use a Top-K algorithm, which does not require a minimum threshold and makes mining more scalable and efficient.
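The transaction-weighted utility (TWU) measure the abstract relies on can be sketched briefly; this minimal Python example (with an illustrative data layout, not the paper's code) credits every item with the full utility of each transaction containing it, yielding an anti-monotone upper bound that is safe for pruning.

# Sketch of transaction-weighted utility (TWU) computation.
from collections import defaultdict

transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 3, "c": 6},
    {"b": 4, "c": 2, "d": 8},
]

twu = defaultdict(int)
for t in transactions:
    tu = sum(t.values())   # total utility of the whole transaction
    for item in t:
        twu[item] += tu    # each member item is credited with the full tu

print(dict(twu))  # -> {'a': 17, 'b': 22, 'c': 31, 'd': 14}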


2021 ◽  
pp. 107422
Author(s):  
Jerry Chun-Wei Lin ◽  
Youcef Djenouri ◽  
Gautam Srivastava ◽  
Unil Yun ◽  
Philippe Fournier-Viger

Author(s):  
Amit Verma ◽  
Siddharth Dawar ◽  
Raman Kumar ◽  
Shamkant Navathe ◽  
Vikram Goyal

Author(s):  
Jimmy Ming-Tai Wu ◽  
Qian Teng ◽  
Shahab Tayeb ◽  
Jerry Chun-Wei Lin

High average-utility itemset mining (HAUIM) was established to provide a fairer measure than generic high-utility itemset mining (HUIM) for revealing satisfying and interesting patterns. In practical applications, databases change dynamically as insertion and deletion operations are performed. Several works were designed to handle the insertion process, but fewer studies focus on the deletion process for knowledge maintenance. In this paper, we develop a PRE-HAUI-DEL algorithm that applies the pre-large concept to HAUIM to handle transaction deletion in dynamic databases. The pre-large concept serves as a buffer on HAUIM that reduces the number of database scans when the database is updated, particularly under transaction deletion. Two upper-bound values are also established to prune unpromising candidates early, which reduces the computational cost. Experimental results show that the designed PRE-HAUI-DEL algorithm performs well compared to an Apriori-like model in terms of runtime, memory, and scalability on dynamic databases.
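A minimal sketch, under assumed names and thresholds rather than the authors' PRE-HAUI-DEL code, of the two ideas the abstract combines: the average-utility measure, which divides an itemset's utility by its length, and the pre-large classification against an upper and a lower threshold, which buffers patterns so that a small batch of deletions does not force a full database rescan.

# Sketch of the average-utility measure and pre-large classification.

def average_utility(utility, itemset_size):
    # HAUIM divides an itemset's utility by its length for a fairer measure.
    return utility / itemset_size

def classify(avg_utility, upper, lower):
    """Pre-large classification against two thresholds (values assumed)."""
    if avg_utility >= upper:
        return "large"      # a high average-utility pattern
    if avg_utility >= lower:
        return "pre-large"  # buffered: may become large/small after updates
    return "small"          # safely ignored until a threshold is crossed

print(classify(average_utility(18, 3), upper=5.0, lower=3.0))  # -> large
print(classify(average_utility(10, 3), upper=5.0, lower=3.0))  # -> pre-large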

