Mining correlated high-utility itemsets using various measures

Abstract: Discovering high-utility itemsets (HUIs) consists of finding sets of items that yield a high profit in customer transaction databases. An important limitation of traditional high-utility itemset mining (HUIM) is that only the utility measure is used to assess the interestingness of patterns. As a result, many of the itemsets found have a high profit but contain items that are weakly correlated. To address this issue, this paper integrates the concept of correlation into HUIM to find profitable itemsets that are highly correlated, using the all-confidence and bond measures. An algorithm named FCHM (fast correlated high-utility itemset miner) is proposed to efficiently discover correlated high-utility itemsets (CHIs). Two versions of the algorithm are proposed: FCHM$_{all\text{-}confidence}$ and FCHM$_{bond}$, which are based on the all-confidence and bond measures, respectively. An experimental evaluation was carried out on four real-life benchmark datasets from the HUIM literature: mushroom, retail, kosarak and foodmart. Results show that FCHM is efficient and can prune a huge number of weakly correlated HUIs.
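The abstract names the all-confidence and bond measures without defining them. The sketch below assumes their usual definitions from the pattern-mining literature: all-confidence is the support of the itemset divided by the largest support of any single item in it, and bond is the support of the itemset divided by the number of transactions containing at least one of its items. The function names and the toy transactions are illustrative only and are not taken from FCHM.

```python
def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def all_confidence(itemset, transactions):
    """Support of the itemset over the largest single-item support (assumed definition)."""
    max_single = max(support([i], transactions) for i in itemset)
    return support(itemset, transactions) / max_single if max_single else 0.0


def bond(itemset, transactions):
    """Conjunctive support over disjunctive support (assumed definition)."""
    items = set(itemset)
    # Disjunctive support: transactions sharing at least one item with the itemset.
    disjunctive = sum(1 for t in transactions if items & set(t))
    return support(itemset, transactions) / disjunctive if disjunctive else 0.0


# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "d"},
]

print(all_confidence(["a", "b"], transactions))
print(bond(["a", "b"], transactions))
```

Both measures lie in [0, 1], so a correlated high-utility miner could keep an itemset only if its utility and its chosen correlation measure both reach user-given thresholds.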

Efficient Algorithm for Mining High Utility Pattern Considering Length Constraints

International Journal of Data Warehousing and Mining ◽ 10.4018/ijdwm.2019070101 ◽ 2019 ◽ Vol 15 (3) ◽ pp. 1-27

Author(s): Kuldeep Singh ◽ Bhaskar Biswas

Keyword(s): Data Mining ◽ Upper Bound ◽ Efficient Algorithm ◽ State Of The Art ◽ Memory Usage ◽ Important Data ◽ Benchmark Datasets ◽ Projection Techniques ◽ High Utility ◽ High Utility Itemsets

High utility itemset (HUI) mining is a popular and important data mining task. Several studies have been carried out on this topic, but existing approaches often discover a very large number of itemsets and rules, which reduces not only the efficiency but also the effectiveness of HUI mining. Constraint-based mining plays an important role in increasing efficiency and discovering more interesting HUIs. To address this issue, the authors propose an algorithm named EHIL (Efficient High utility Itemsets with Length constraints), which discovers HUIs under length constraints and decreases the number of HUIs by removing tiny itemsets. EHIL adopts two new upper bounds, named sub-tree utility and local utility, for pruning, and modifies them by incorporating length constraints. To reduce the number of dataset scans, the proposed algorithm uses transaction merging and dataset projection techniques. The execution time improvements ranged from a modest five percent to two orders of magnitude across benchmark datasets. Memory usage is up to twenty-eight times lower than that of the state-of-the-art algorithm FHM+.
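To make the effect of length constraints concrete, here is a brute-force sketch that enumerates only itemsets whose size lies between a minimum and a maximum length and keeps those reaching a utility threshold. It is not EHIL: it uses no upper bounds, transaction merging, or projection, and the names `mine_with_length`, `db`, and `profit` are hypothetical.

```python
from itertools import combinations


def utility_in_transaction(itemset, transaction, profit):
    """Utility of `itemset` in one transaction (0 if any item is missing)."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * profit[item] for item in itemset)


def mine_with_length(db, profit, min_util, min_len, max_len):
    """Brute-force search for high-utility itemsets with min_len <= size <= max_len."""
    items = sorted({item for t in db for item in t})
    result = {}
    for size in range(min_len, max_len + 1):
        for candidate in combinations(items, size):
            total = sum(utility_in_transaction(candidate, t, profit) for t in db)
            if total >= min_util:
                result[candidate] = total
    return result


# Hypothetical toy data: each transaction maps an item to its purchased quantity.
db = [{"a": 1, "b": 2}, {"a": 2, "c": 1}, {"b": 1, "c": 3}]
profit = {"a": 5, "b": 2, "c": 1}

print(mine_with_length(db, profit, min_util=9, min_len=2, max_len=3))
```

With a minimum length of 2, the single item `a` (utility 15) is excluded even though it passes the utility threshold, which is the "removing tiny itemsets" effect the abstract describes.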

Mining of top-k high utility itemsets with negative utility

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-201357 ◽ 2020 ◽ pp. 1-16

Author(s): Rui Sun ◽ Meng Han ◽ Chunyan Zhang ◽ Mingyao Shen ◽ Shiyu Du

Keyword(s): Data Mining ◽ Search Space ◽ Experimental Results ◽ Effective Algorithm ◽ Memory Usage ◽ Utility Value ◽ Itemset Mining ◽ High Utility ◽ High Utility Itemsets

High utility itemset mining (HUIM) with negative utility is an emerging data mining task. However, setting the minimum utility threshold is always a challenge when mining high utility itemsets (HUIs) with negative items. Although top-k HUIM is very common, existing top-k methods can only mine itemsets with positive items, and itemsets are missed when they are applied to data with negative items. To solve this problem, we propose an effective algorithm called THN (Top-k High Utility Itemset Mining with Negative Utility). THN uses a strategy that automatically raises the minimum utility threshold. To avoid multiple scans of the database, it uses transaction merging and dataset projection. It prunes the search space with a redefined sub-tree utility and a redefined local utility. Experimental results on real datasets show that THN is efficient in terms of runtime and memory usage and has excellent scalability. Moreover, experiments show that THN performs particularly well on dense datasets.
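The idea of automatically raising the minimum utility threshold can be sketched with a min-heap that holds the k best itemsets seen so far: once the heap is full, its smallest utility becomes the new threshold. This is a generic top-k sketch and not THN's actual strategy; the starting threshold of 0, the function name `top_k_itemsets`, and the candidate list are assumptions made for illustration.

```python
import heapq


def top_k_itemsets(candidates, k):
    """Keep the k highest-utility itemsets, raising the threshold as the heap fills."""
    heap = []      # min-heap of (utility, itemset)
    min_util = 0   # assumed starting threshold
    for itemset, util in candidates:
        if util < min_util:
            continue  # cannot enter the current top-k
        heapq.heappush(heap, (util, itemset))
        if len(heap) > k:
            heapq.heappop(heap)      # drop the weakest itemset
        if len(heap) == k:
            min_util = heap[0][0]    # raise the threshold to the k-th best utility
    return sorted(heap, reverse=True), min_util


# Hypothetical candidates with precomputed utilities; ("b",) has a negative utility.
candidates = [
    (("a",), 15),
    (("b",), -4),
    (("a", "b"), 11),
    (("c",), 6),
    (("a", "c"), 21),
]

best, threshold = top_k_itemsets(candidates, k=2)
print(best, threshold)
```

In a real miner the rising threshold would also feed the pruning bounds, so that whole branches of the search space are skipped rather than single candidates.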

FHN: An efficient algorithm for mining high-utility itemsets with negative unit profits

Knowledge-Based Systems ◽ 10.1016/j.knosys.2016.08.022 ◽ 2016 ◽ Vol 111 ◽ pp. 283-298 ◽ Cited By ~ 27

Author(s): Jerry Chun-Wei Lin ◽ Philippe Fournier-Viger ◽ Wensheng Gan

Keyword(s): Efficient Algorithm ◽ High Utility ◽ High Utility Itemsets

An efficient algorithm for hiding sensitive-high utility itemsets

Intelligent Data Analysis ◽ 10.3233/ida-194697 ◽ 2020 ◽ Vol 24 (4) ◽ pp. 831-845

Author(s): Vy Huynh Trieu ◽ Hai Le Quoc ◽ Chau Truong Ngoc

Keyword(s): Efficient Algorithm ◽ High Utility ◽ High Utility Itemsets

EHNL: An efficient algorithm for mining high utility itemsets with negative utility value and length constraints

Information Sciences ◽ 10.1016/j.ins.2019.01.056 ◽ 2019 ◽ Vol 484 ◽ pp. 44-70 ◽ Cited By ~ 2

Author(s): Kuldeep Singh ◽ Ajay Kumar ◽ Shashank Sheshar Singh ◽ Harish Kumar Shakya ◽ Bhaskar Biswas

Keyword(s): Efficient Algorithm ◽ Utility Value ◽ High Utility ◽ High Utility Itemsets

EFIM: a fast and memory efficient algorithm for high-utility itemset mining

Knowledge and Information Systems ◽ 10.1007/s10115-016-0986-0 ◽ 2016 ◽ Vol 51 (2) ◽ pp. 595-625 ◽ Cited By ~ 69

Author(s): Souleymane Zida ◽ Philippe Fournier-Viger ◽ Jerry Chun-Wei Lin ◽ Cheng-Wei Wu ◽ Vincent S. Tseng

Keyword(s): Efficient Algorithm ◽ Itemset Mining ◽ High Utility ◽ Memory Efficient

An efficient algorithm for mining temporal high utility itemsets from data streams

Journal of Systems and Software ◽ 10.1016/j.jss.2007.07.026 ◽ 2008 ◽ Vol 81 (7) ◽ pp. 1105-1117 ◽ Cited By ~ 52

Author(s): Chun-Jung Chu ◽ Vincent S. Tseng ◽ Tyne Liang

Keyword(s): Data Streams ◽ Efficient Algorithm ◽ High Utility ◽ High Utility Itemsets

Efficient Algorithm for Mining Non-Redundant High-Utility Association Rules

Sensors ◽ 10.3390/s20041078 ◽ 2020 ◽ Vol 20 (4) ◽ pp. 1078 ◽ Cited By ~ 7

Author(s): Thang Mai ◽ Loan T.T. Nguyen ◽ Bay Vo ◽ Unil Yun ◽ Tzung-Pei Hong

Keyword(s): Association Rules ◽ Business Strategy ◽ Efficient Algorithm ◽ Business Managers ◽ Competitive Strategies ◽ Computing Systems ◽ Other Information ◽ High Utility ◽ High Utility Itemsets ◽ The Internet Of Things

In business, managers may use the association information among products to define promotion and competitive strategies. The mining of high-utility association rules (HARs) from high-utility itemsets enables users to select their own weights for rules, based on either the utility or the confidence values. This approach also provides more information, which can help managers make better decisions. Some efficient methods for mining HARs have been developed in recent years. However, in some decision-support systems, users only need to mine the smallest set of HARs for efficient use. Therefore, this paper proposes a method for the efficient mining of non-redundant high-utility association rules (NR-HARs). We first build a semi-lattice of mined high-utility itemsets, and then identify closed and generator itemsets within it. An efficient algorithm is then developed for generating rules from the built lattice. The new approach was evaluated on different types of datasets to demonstrate that it has a faster runtime and does not require more memory than existing methods. The proposed algorithm can be integrated with a variety of applications and would combine well with external systems, such as the Internet of Things (IoT) and distributed computer systems. Many companies have been applying the IoT and such computing systems to their business activities, data monitoring, and decision-making. The data can be sent into the system continuously through the IoT or any other information system. Selecting an appropriate and fast approach helps management to visualize customer needs and to make more timely decisions on business strategy.

A Systematic Survey on High Utility Itemset Mining

International Journal of Information Technology & Decision Making ◽ 10.1142/s0219622019300027 ◽ 2019 ◽ Vol 18 (04) ◽ pp. 1113-1185 ◽ Cited By ~ 2

Author(s): Bahareh Rahmati ◽ Mohammad Karim Sohrabi

Keyword(s): Data Structures ◽ Search Space ◽ Frequent Itemset ◽ Itemset Mining ◽ Efficient Data ◽ Average Utility ◽ High Utility ◽ High Utility Itemsets ◽ Downward Closure ◽ Efficient Data Structures

High utility itemset mining considers the unit profits and quantities of items in a transaction database to extract more applicable and more useful association rules. The downward closure property, which enables significant pruning in frequent itemset mining, does not hold for the utility of itemsets, so the mining problem requires alternative solutions to reduce its search space and enhance its efficiency. The main solutions are to use an anti-monotonic upper bound of the utility function and to exploit efficient data structures that store and compact the dataset so that pruning strategies can be applied efficiently. Different mining methods and techniques have attempted to improve the performance of extracting high utility itemsets and their several variants, including high-average utility itemsets, top-k high utility itemsets, and high utility itemsets with negative values, using more efficient data structures, more appropriate anti-monotonic upper bounds, and stronger pruning strategies. This paper aims to present a comprehensive systematic review of high utility itemset mining techniques and to classify them based on their problem-solving approaches.
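A small example shows why downward closure fails for utility and how an anti-monotonic upper bound restores pruning. The sketch assumes the commonly used transaction-weighted utilization (TWU) as the upper bound, which the abstract does not name explicitly, and assumes non-negative unit profits; the toy database and function names are illustrative.

```python
def transaction_utility(transaction, profit):
    """Total utility of all items in one transaction."""
    return sum(qty * profit[item] for item, qty in transaction.items())


def utility(itemset, db, profit):
    """Sum of the itemset's utility over the transactions that contain it."""
    return sum(
        sum(t[item] * profit[item] for item in itemset)
        for t in db
        if all(item in t for item in itemset)
    )


def twu(itemset, db, profit):
    """Transaction-weighted utilization: total utility of the containing transactions."""
    return sum(
        transaction_utility(t, profit)
        for t in db
        if all(item in t for item in itemset)
    )


# Hypothetical toy data: item -> quantity per transaction, item -> unit profit.
db = [{"a": 1, "b": 5}, {"a": 1}, {"b": 1, "c": 1}]
profit = {"a": 1, "b": 2, "c": 3}

# Utility is not anti-monotonic: the superset {a, b} has a higher utility than {a}.
print(utility({"a"}, db, profit), utility({"a", "b"}, db, profit))

# TWU never increases when items are added, and it bounds the utility from above.
print(twu({"a"}, db, profit), twu({"a", "b"}, db, profit))
```

Because the TWU of an itemset is at least the utility of every superset, any itemset whose TWU falls below the minimum utility threshold can be discarded together with all of its supersets.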

CHN: an efficient algorithm for mining closed high utility itemsets with negative utility

IEEE Transactions on Knowledge and Data Engineering ◽ 10.1109/tkde.2018.2882421 ◽ 2018 ◽ pp. 1-1 ◽ Cited By ~ 2

Author(s): Kuldeep Singh ◽ Shashank Sheshar Singh ◽ Ajay Kumar ◽ Harish Kumar Shakya ◽ Bhaskar Biswas

Keyword(s): Efficient Algorithm ◽ High Utility ◽ High Utility Itemsets
