A two-phase approach to mine short-period high-utility itemsets in transactional databases

2017 ◽  
Vol 33 ◽  
pp. 29-43 ◽  
Author(s):  
Jerry Chun-Wei Lin ◽  
Jiexiong Zhang ◽  
Philippe Fournier-Viger ◽  
Tzung-Pei Hong ◽  
Ji Zhang
2020 ◽  
Vol 1 (2) ◽  
pp. 44-47
Author(s):  
Tung N.T ◽  
Nguyen Le Van ◽  
Trinh Cong Nhut ◽  
Tran Van Sang

The goal of high-utility itemset mining (HUIM) is to discover combinations of items that yield high profits in transactional databases. HUIM is a useful tool for retail stores to analyze customer behavior. However, in the real world, items can have both positive and negative utility values. To address this issue, we propose an algorithm named Modified Efficient High-utility Itemsets mining with Negative utility (MEHIN) to find all high-utility itemsets (HUIs) with negative utility. This algorithm is an improved version of the EHIN algorithm. MEHIN uses two new upper bounds for pruning, named revised subtree and revised local utility. To reduce dataset scans, the proposed algorithm uses transaction merging and dataset projection techniques. An array-based utility-counting technique is also used to calculate the upper bounds efficiently. MEHIN employs a novel structure called P-set to reduce the number of transaction scans and to speed up the mining process. Experimental results show that the proposed algorithm considerably outperforms state-of-the-art HUI-mining algorithms for negative utility on retail databases in terms of runtime.


2013 ◽  
Vol 760-762 ◽  
pp. 1713-1717
Author(s):  
Yi Pan ◽  
Bo Zhang

Owing to their major contribution to total sales profits, increasing importance has been attached to high utility itemsets mining. This paper proposes a TIFF-tree based algorithm that scans the database twice to obtain the transaction utility information. A conditional matrix of potential high utility is adopted so that, through row-column operations, the calculation of transaction utility can be simplified. Experimental analysis shows that, as the user-defined threshold decreases, the TIFP-Growth algorithm performs much better than the two-phase algorithm.


2013 ◽  
Vol 25 (8) ◽  
pp. 1772-1786 ◽  
Author(s):  
Vincent S. Tseng ◽  
Bai-En Shie ◽  
Cheng-Wei Wu ◽  
Philip S. Yu

Author(s):  
R. Uday Kiran ◽  
Pradeep Pallikila ◽  
J. M. Luna ◽  
Philippe Fournier-Viger ◽  
Masashi Toyoda ◽  
...  

2019 ◽  
Vol 15 (1) ◽  
pp. 58-79 ◽  
Author(s):  
P. Lalitha Kumari ◽  
S. G. Sanjeevi ◽  
T.V. Madhusudhana Rao

Mining high-utility itemsets is an important task in data mining. It involves an exponential search space and returns a very large number of high-utility itemsets. In real-world scenarios, it is often sufficient to mine a small number of high-utility itemsets based on user-specified interestingness. Recently, the temporal regularity of an itemset has come to be considered an important interestingness criterion for many applications. Methods for finding regular high-utility itemsets suffer from the difficulty of setting the threshold value. To address this problem, a novel algorithm called TKRHU (Top k Regular High Utility Itemset) Miner is proposed to mine the top-k high-utility itemsets that appear regularly, where k represents the desired number of regular high-utility itemsets. A novel list structure, RUL, and efficient pruning techniques that reduce the search space are developed to discover the top-k regular itemsets with high profit. Experimental results show that the proposed algorithm, using the novel list structure, achieves high efficiency in terms of runtime and space.
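
The idea of ranking regularly occurring itemsets by utility can be illustrated with a small brute-force sketch. It is not the TKRHU Miner and does not use the RUL structure or any pruning. It assumes, as is common in regular-pattern mining but not stated in the abstract, that regularity is the maximum gap between consecutive transactions containing the itemset. The toy data and the names `max_period`, `total_utility`, and `top_k_regular` are hypothetical.

```python
import heapq
from itertools import combinations

# Hypothetical toy database: transaction i (1-based) maps item -> quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 2},
    {"a": 1, "b": 1, "c": 1},
    {"a": 3},
]
profit = {"a": 4, "b": 3, "c": 2}  # assumed unit profit per item


def contains(t, itemset):
    return all(i in t for i in itemset)


def max_period(itemset):
    """Largest gap between consecutive occurrences (assumed regularity measure).

    The database start (0) and end (n) are treated as boundary points.
    """
    tids = [tid for tid, t in enumerate(transactions, start=1) if contains(t, itemset)]
    points = [0] + tids + [len(transactions)]
    return max(b - a for a, b in zip(points, points[1:]))


def total_utility(itemset):
    """Sum of quantity * profit over all transactions containing the itemset."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in transactions
        if contains(t, itemset)
    )


def top_k_regular(k, max_regularity):
    """Enumerate every itemset, keep the regular ones, return the k most profitable."""
    items = sorted(profit)
    scored = [
        (total_utility(c), c)
        for size in range(1, len(items) + 1)
        for c in combinations(items, size)
        if max_period(c) <= max_regularity
    ]
    return heapq.nlargest(k, scored)


print(top_k_regular(k=2, max_regularity=2))
```

Because the user asks for k results rather than a minimum utility, no utility threshold has to be guessed in advance, which is the motivation the abstract gives. Exhaustive enumeration is exponential in the number of items, so this sketch is only suitable for tiny examples.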


2010 ◽  
Vol 09 (06) ◽  
pp. 905-934 ◽  
Author(s):  
YING LIU ◽  
JIANWEI LI ◽  
WEI-KENG LIAO ◽  
ALOK CHOUDHARY ◽  
YONG SHI

High utility itemsets mining identifies itemsets whose utility satisfies a given threshold. It allows users to quantify the usefulness or preferences of items using different values. Thus, it reflects the impact of different items. High utility itemsets mining is useful in the decision-making process of many applications, such as retail marketing and Web service, since items are actually different in many aspects in real applications. However, due to the lack of a "downward closure property", the cost of candidate generation in high utility itemsets mining is intolerable in terms of time and memory space. This paper presents a Two-Phase algorithm which can efficiently prune the number of candidates and precisely obtain the complete set of high utility itemsets. The performance of our algorithm is evaluated by applying it to synthetic databases and two real-world applications. It performs very efficiently in terms of speed and memory cost on large databases composed of short transactions, which are difficult for existing high utility itemsets mining algorithms to handle. Experiments on real-world applications demonstrate the significance of high utility itemsets in business decision-making, as well as the difference between frequent itemsets and high utility itemsets.
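
A minimal sketch of the two-phase idea follows. It assumes the usual formulation in which phase 1 prunes with transaction-weighted utility (TWU), an upper bound on utility that is downward closed, and phase 2 rescans the database to compute exact utilities. The abstract itself does not name the bound, so treat that as an assumption. The toy data and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]
profit = {"a": 5, "b": 2, "c": 1, "d": 10}  # assumed unit profit per item


def contains(t, itemset):
    return all(i in t for i in itemset)


def transaction_utility(t):
    """Total utility of every item in one transaction."""
    return sum(q * profit[i] for i, q in t.items())


def twu(itemset):
    """Sum of transaction utilities over transactions containing the itemset.

    This overestimates the itemset's utility and can only shrink when items
    are added, so it supports Apriori-style pruning.
    """
    return sum(transaction_utility(t) for t in transactions if contains(t, itemset))


def utility(itemset):
    """Exact utility: only the itemset's own items are counted."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in transactions
        if contains(t, itemset)
    )


def two_phase(min_util):
    items = sorted(profit)
    # Phase 1: level-wise candidate generation, pruned by TWU.
    level = [(i,) for i in items if twu((i,)) >= min_util]
    candidates = list(level)
    while level:
        prev = set(level)
        next_level = []
        for x in level:
            for i in items:
                if i <= x[-1]:
                    continue
                c = x + (i,)
                # Keep c only if all (k-1)-subsets survived and its TWU is high enough.
                if all(s in prev for s in combinations(c, len(c) - 1)) and twu(c) >= min_util:
                    next_level.append(c)
        level = next_level
        candidates.extend(level)
    # Phase 2: compute exact utilities for the surviving candidates.
    return {c: utility(c) for c in candidates if utility(c) >= min_util}


print(two_phase(min_util=25))
```

On this toy data, phase 1 keeps nine candidates out of fifteen possible itemsets, and phase 2 confirms only `("a", "c")` with utility 25. Note that neither `a` (utility 20) nor `c` (utility 5) is a high utility itemset on its own, which is the lack of downward closure that the abstract refers to.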

