BAHUI

2014 ◽  
Vol 10 (1) ◽  
pp. 1-15 ◽  
Author(s):  
Wei Song ◽  
Yu Liu ◽  
Jinhong Li

Mining high utility itemsets is one of the most important research issues in data mining owing to its ability to consider nonbinary frequency values of items in transactions and different profit values for each item. Although a number of relevant approaches have been proposed in recent years, they suffer from generating a large number of candidate itemsets for high utility itemsets. In this paper, the authors propose an efficient algorithm, namely BAHUI (Bitmap-based Algorithm for High Utility Itemsets), for mining high utility itemsets with a bitmap database representation. In BAHUI, bitmaps are used both vertically and horizontally. On the one hand, BAHUI exploits a divide-and-conquer approach to visit the itemset lattice by using bitmaps vertically. On the other hand, BAHUI uses bitmaps horizontally to calculate the real utilities of candidates. Using a bitmap compression scheme, BAHUI reduces memory usage and makes use of efficient bitwise operations. Furthermore, BAHUI only records candidate high utility itemsets with maximal length, and inherits the pruning and searching strategies from the maximal itemset mining problem. Extensive experimental results show that the BAHUI algorithm is both efficient and scalable.
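As a rough illustration of the vertical bitmap idea described in this abstract, the sketch below stores one bitmap per item (bit t is set when transaction t contains the item), intersects bitmaps with a bitwise AND to find the transactions covering an itemset, and then sums the utility over those transactions. It is not the BAHUI implementation: the toy database, the profit table, and the names `build_bitmaps`, `cover`, and `utility` are assumptions made for this example, and compression and maximal-itemset pruning are omitted.

```python
# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Assumed unit profits (illustrative values only).
profit = {"a": 5, "b": 2, "c": 1}


def build_bitmaps(db):
    """Vertical view: bit t of bitmaps[item] is set when transaction t contains item."""
    bitmaps = {}
    for tid, tx in enumerate(db):
        for item in tx:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def cover(itemset, bitmaps):
    """AND the item bitmaps of a non-empty itemset to get its covering transactions."""
    items = list(itemset)
    bits = bitmaps.get(items[0], 0)
    for item in items[1:]:
        bits &= bitmaps.get(item, 0)
    return bits


def utility(itemset, db, bitmaps, profit):
    """Sum quantity * unit profit over the transactions selected by the cover bitmap."""
    bits = cover(itemset, bitmaps)
    total = 0
    tid = 0
    while bits:
        if bits & 1:
            total += sum(db[tid][i] * profit[i] for i in itemset)
        bits >>= 1
        tid += 1
    return total


bitmaps = build_bitmaps(transactions)
print(bin(cover(("a", "b"), bitmaps)))                     # transactions 0 and 2
print(utility(("a", "b"), transactions, bitmaps, profit))  # 9 + 7 = 16
```

Python integers serve as arbitrary-length bitmaps here, so the intersection is a single `&` per item regardless of database size.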

2020 ◽  
pp. 1-16
Author(s):  
Rui Sun ◽  
Meng Han ◽  
Chunyan Zhang ◽  
Mingyao Shen ◽  
Shiyu Du

High utility itemset mining (HUIM) with negative utility is an emerging data mining task. However, setting the minimum utility threshold is always a challenge when mining high utility itemsets (HUIs) with negative items. Although the top-k HUIM method is very common, this method can only mine itemsets with positive items, and the problem of missing itemsets occurs when mining itemsets with negative items. To solve this problem, we first propose an effective algorithm called THN (Top-k High Utility Itemset Mining with Negative Utility). It proposes a strategy for automatically increasing the minimum utility threshold. To avoid multiple scans of the database, it uses transaction merging and dataset projection technology. It uses a redefined sub-tree utility value and a redefined local utility value to prune the search space. Experimental results on real datasets show that THN is efficient in terms of runtime and memory usage, and has excellent scalability. Moreover, experiments show that THN performs particularly well on dense datasets.
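The idea of automatically raising the minimum utility threshold can be sketched with a generic top-k tracker: a min-heap keeps the k best itemsets seen so far, and once it is full the threshold becomes the smallest utility in the heap. This is a common top-k pattern and not necessarily THN's specific strategy; the class name `TopKThreshold`, the method `offer`, and the starting threshold of negative infinity are assumptions made for this example.

```python
import heapq


class TopKThreshold:
    """Tracks the k highest-utility itemsets and the threshold they imply (illustrative)."""

    def __init__(self, k):
        self.k = k
        self.heap = []                   # min-heap of (utility, itemset)
        self.min_util = float("-inf")    # assumed start: no pruning until k itemsets are known

    def offer(self, itemset, util):
        """Try to add an itemset; return True if it enters the current top-k."""
        entry = (util, tuple(sorted(itemset)))
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, entry)
        elif util > self.heap[0][0]:
            # Replace the current weakest member of the top-k.
            heapq.heapreplace(self.heap, entry)
        else:
            return False
        if len(self.heap) == self.k:
            # The threshold rises to the k-th best utility seen so far.
            self.min_util = self.heap[0][0]
        return True


tracker = TopKThreshold(k=2)
for itemset, util in [(("a",), 20), (("a", "b"), 16), (("b",), 6), (("a", "c"), 19)]:
    accepted = tracker.offer(itemset, util)
    print(itemset, util, accepted, tracker.min_util)
```

A miner would compare each candidate's upper bound against `min_util` and skip the branch when the bound falls below it, so the search space shrinks as better itemsets are found.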


2008 ◽  
pp. 47-52
Author(s):  
Zoltán Magyar

In Hungary the operating environment of game management and the guided hunting sector is undergoing such a radical change nowadays that on the one hand it considerably influences the profit-producing ability of the sector, and on the other hand it sets the actors of the industry new challenges and opportunities. If the Hungarian hunting industry, which has a traditional past, also wishes to preserve its position in this changed business environment, it is essential that the new situation be thoroughly assessed, and the value-oriented marketing attitude be adopted. The phenomena presented in this essay concern, on the supply side, the causes and consequences of the appearance of new service providers and, on the demand side, the changed composition of the consumer group and the modification of earlier consumption preferences and their causes. The changes on the two sides jointly call for the adoption of value-oriented service-marketing concepts, by means of which the areas to be developed, regarded as the bottleneck of consumer decisions, can be determined. After specifying the target-group-specific marketing properties of the above-mentioned prestige service, services of high utility content can be established successfully and positioned as a proper alternative for the new consumer group with higher value expectations. In addition, the employment and profitability indexes related to this sector may be considerably improved.


2019 ◽  
Vol 18 (04) ◽  
pp. 1113-1185 ◽  
Author(s):  
Bahareh Rahmati ◽  
Mohammad Karim Sohrabi

High utility itemset mining considers unit profits and quantities of items in a transaction database to extract more applicable and more useful association rules. The downward closure property, which enables significant pruning in frequent itemset mining, does not hold for the utility of itemsets, so the mining problem requires alternative solutions to reduce its search space and to enhance its efficiency. Using an anti-monotonic upper bound of the utility function and exploiting efficient data structures for storing and compacting the dataset to perform efficient pruning strategies are the main solutions to the high utility itemset mining problem. Different mining methods and techniques have attempted to improve the performance of extracting high utility itemsets and their several variants, including high-average utility itemsets, top-k high utility itemsets, and high utility itemsets with negative values, using more efficient data structures, more appropriate anti-monotonic upper bounds, and stronger pruning strategies. This paper aims to present a comprehensive systematic review of high utility itemset mining techniques and to classify them based on their problem-solving approaches.
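A small worked example may make the missing downward closure property concrete. The sketch below computes the utility of an itemset and its transaction-weighted utility (TWU), a widely used anti-monotonic upper bound. The toy database, the profit values, and the function names are assumptions made for this example and are not taken from the reviewed paper.

```python
# Hypothetical toy data: item -> quantity per transaction, and assumed unit profits.
db = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
profit = {"a": 5, "b": 2, "c": 1}


def transaction_utility(tx, profit):
    """Utility of a whole transaction: sum of quantity * unit profit over its items."""
    return sum(qty * profit[item] for item, qty in tx.items())


def utility(itemset, db, profit):
    """Utility of an itemset: its own contribution in every transaction containing it."""
    items = set(itemset)
    return sum(
        sum(tx[i] * profit[i] for i in items)
        for tx in db
        if items.issubset(tx)
    )


def twu(itemset, db, profit):
    """TWU: total utility of the transactions containing the itemset (an upper bound)."""
    items = set(itemset)
    return sum(transaction_utility(tx, profit) for tx in db if items.issubset(tx))


# Utility is not anti-monotonic: the superset {a, b} has higher utility than {b}.
print(utility({"b"}, db, profit), utility({"a", "b"}, db, profit))  # 6 16
# TWU never increases when an itemset grows, and it bounds the utility from above.
print(twu({"a"}, db, profit), twu({"a", "b"}, db, profit), twu({"a", "b", "c"}, db, profit))  # 30 19 10
```

Because TWU can only shrink as items are added, any itemset whose TWU falls below the minimum utility threshold can be pruned together with all of its supersets.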


2020 ◽  
Vol 1 (2) ◽  
pp. 44-47
Author(s):  
Tung N.T ◽  
Nguyen Le Van ◽  
Trinh Cong Nhut ◽  
Tran Van Sang

The goal of the high-utility itemset mining (HUIM) task is to discover combinations of items that yield high profits from transactional databases. HUIM is a useful tool for retail stores to analyze customer behaviors. However, in the real world, items are found with both positive and negative utility values. To address this issue, we propose an algorithm named Modified Efficient High-utility Itemsets mining with Negative utility (MEHIN) to find all high-utility itemsets (HUIs) with negative utility. This algorithm is an improved version of the EHIN algorithm. MEHIN utilizes two new upper bounds for pruning, named revised subtree utility and revised local utility. To reduce dataset scans, the proposed algorithm uses transaction merging and dataset projection techniques. An array-based utility-counting technique is also utilized to calculate the upper bounds efficiently. MEHIN employs a novel structure called P-set to reduce the number of transaction scans and to speed up the mining process. Experimental results show that the proposed algorithm considerably outperforms the state-of-the-art HUI-mining algorithms on negative utility in retail databases in terms of runtime.


2013 ◽  
Vol 760-762 ◽  
pp. 1713-1717
Author(s):  
Yi Pan ◽  
Bo Zhang

Owing to their major contribution to total sales profits, increasing importance has been attached to high utility itemset mining. This paper proposes a TIFF-tree based algorithm that takes a two-pass database scan to obtain the transaction utility information. A conditional matrix of potential high utility is adopted, and row-column operations on it simplify the calculation of transaction utility. Analysis of the experimental results shows that, as the user-defined threshold decreases, the TIFP-Growth algorithm performs much better than the two-phase algorithm.


2013 ◽  
Vol 385-386 ◽  
pp. 1362-1365
Author(s):  
Wei Min Ouyang ◽  
Qin Hua Huang

Sequential pattern mining is an important research topic in data mining and knowledge discovery. Traditional algorithms for mining sequential patterns focus on frequent sequences and consider neither infrequent sequences nor the lifespan of each sequence. On the one hand, some infrequent patterns can provide very useful insight into the data set. On the other hand, without taking the lifespan of each sequence into account, some discovered patterns may be invalid and some useful patterns may not be discovered. We therefore extend sequential patterns to indirect temporal sequential patterns and put forward an algorithm to discover them in this paper.


2012 ◽  
Vol 616-618 ◽  
pp. 1478-1483
Author(s):  
You Pei Hu

There is a close connection between urban forms and microclimates. Shaping an urban form with good climate performance is meaningful for sustainable development. However, there is a professional gap between the field of microclimate and urban form studies and the practice of urban form design, which impedes the transfer of research achievements from the former to the latter and affects the orientation of the research issues. This paper adopts a perspective of design-oriented formalism to construct a knowledge and research framework for the field. On the one hand, it presents the research and knowledge in this field in a form that is easy for designers to understand; on the other hand, it intends to reveal the design-oriented research path and issues.


2019 ◽  
Vol 15 (3) ◽  
pp. 1-27
Author(s):  
Kuldeep Singh ◽  
Bhaskar Biswas

High utility itemset (HUI) mining is one of the popular and important data mining tasks. Several studies have been carried out on this topic, but they often discover a very large number of itemsets and rules, which reduces not only the efficiency but also the effectiveness of HUI mining. Constraint-based mining plays an important role in increasing the efficiency and discovering more interesting HUIs. To address this issue, the authors propose an algorithm named EHIL (Efficient High utility Itemsets with Length constraints), which discovers HUIs with length constraints and decreases the number of HUIs by removing tiny itemsets. EHIL adopts two new upper bounds for pruning, named sub-tree utility and local utility, and modifies them by incorporating length constraints. To reduce the dataset scans, the proposed algorithm uses transaction merging and dataset projection techniques. The execution time improvements ranged from a modest five percent to two orders of magnitude across benchmark datasets. The memory usage is up to twenty-eight times less than that of the state-of-the-art algorithm FHM+.


2021 ◽  
pp. 1-26
Author(s):  
Haodong Cheng ◽  
Meng Han ◽  
Ni Zhang ◽  
Xiaojuan Li ◽  
Le Wang

Traditional association rule mining has been widely studied, but this is not applicable to practical applications that must consider factors such as the unit profit of the item and the purchase quantity. High-utility itemset mining (HUIM) aims to find high-utility patterns by considering the number of items purchased and the unit profit. However, most high-utility itemset mining algorithms are designed for static databases. In real-world applications (such as market analysis and business decisions), databases are usually updated by inserting new data dynamically. Some researchers have proposed algorithms for finding high-utility itemsets in dynamically updated databases. Different from the batch processing algorithms that always process the databases from scratch, the incremental HUIM algorithms update and output high-utility itemsets in an incremental manner, thereby reducing the cost of finding high-utility itemsets. This paper provides the latest research on incremental high-utility itemset mining algorithms, including methods of storing itemsets and utilities based on tree, list, array and hash set storage structures. It also points out several important derivative algorithms and research challenges for incremental high-utility itemset mining.
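To illustrate the contrast between batch and incremental processing drawn in this abstract, the sketch below keeps per-item transaction-weighted utility (TWU) totals and updates them as each new transaction arrives, without rescanning earlier data. It is a simplified illustration and not one of the surveyed algorithms: the class `IncrementalTWU`, its methods, the profit table, and the convention of expressing the threshold as a ratio of total database utility are assumptions made for this example.

```python
from collections import defaultdict


class IncrementalTWU:
    """Maintains per-item TWU under transaction insertions (illustrative only)."""

    def __init__(self, profit):
        self.profit = profit            # assumed unit profit per item
        self.twu = defaultdict(int)     # item -> sum of utilities of transactions containing it
        self.total_utility = 0          # utility of the whole database so far

    def insert(self, tx):
        """Fold one new transaction (item -> quantity) into the running totals."""
        tu = sum(qty * self.profit[item] for item, qty in tx.items())
        self.total_utility += tu
        for item in tx:
            self.twu[item] += tu

    def promising_items(self, ratio):
        """Items whose TWU reaches ratio * total utility (assumed relative threshold)."""
        threshold = ratio * self.total_utility
        return sorted(item for item, w in self.twu.items() if w >= threshold)


state = IncrementalTWU({"a": 5, "b": 2, "c": 1})
state.insert({"a": 1, "b": 2})
state.insert({"a": 2, "c": 1})
print(state.promising_items(0.5))   # ['a', 'c']

# A later insertion only touches the items it contains.
state.insert({"a": 1, "b": 1, "c": 3})
print(state.promising_items(0.5))   # ['a', 'b', 'c']
```

Note that item `b` becomes promising only after the third insertion, which is the kind of status change an incremental algorithm has to detect without reprocessing the original transactions.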

