ETKDS: An efficient algorithm of Top-K high utility itemsets mining over data streams under sliding window model

Researchers have proposed Top-K high-utility itemset mining over data streams, in which users directly specify the number K of high-utility itemsets they want instead of setting a minimum utility threshold. Current Top-K high-utility itemset mining algorithms over data streams still have several problems, including a complex construction process for the storage structure, inefficient threshold raising and utility pruning strategies, and a large search space, so they cannot meet the requirement of real-time processing of data streams under limited time and memory. To solve these problems, this paper proposes an efficient algorithm based on dataset projection for mining Top-K high-utility itemsets from a data stream. A data structure, CIUDataListSW, is also proposed; it stores the position of each item in each transaction so that the initial projected dataset of an item can be obtained efficiently. To improve projection efficiency, the paper introduces a new reorganization technique for projected transactions in common batches, which maintains the sort order of transactions during dataset projection. A dual pruning strategy and a transaction merging mechanism are also used to further reduce the search space and dataset scanning costs. In addition, based on the proposed CUDHSW structure, an efficient threshold raising strategy, CUD, is used, and a new threshold raising strategy, CUDCB, is designed to further shorten the mining time. Experimental results show that the algorithm has great advantages in running time and memory consumption, and that it is especially suitable for mining high-utility itemsets from dense datasets.
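The Top-K idea in this abstract can be illustrated with a deliberately naive sketch. It is not the ETKDS algorithm: it enumerates every itemset by brute force and uses none of the paper's structures (CIUDataListSW, CUD, CUDCB). The function names `itemset_utility` and `top_k_hui`, the representation of a transaction as a dict from item to its utility in that transaction, and the assumption that all utilities are positive are illustrative choices, not taken from the paper. The sketch only shows how an internal threshold can start at zero and be raised to the K-th best utility seen so far, while a `deque` with `maxlen` stands in for the sliding window.

```python
import heapq
from collections import deque
from itertools import combinations


def itemset_utility(itemset, window):
    """Sum the utility of `itemset` over the transactions that contain all of its items."""
    return sum(
        sum(tx[item] for item in itemset)
        for tx in window
        if all(item in tx for item in itemset)
    )


def top_k_hui(window, k):
    """Return the k best (utility, itemset) pairs, best first, and the final threshold."""
    items = sorted({item for tx in window for item in tx})
    heap = []       # min-heap holding the best k (utility, itemset) pairs seen so far
    threshold = 0   # internal minimum utility; starts at 0 and is only ever raised
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = itemset_utility(itemset, window)
            if u <= threshold:
                continue  # cannot enter the top-k (also skips itemsets that never occur)
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            else:
                heapq.heapreplace(heap, (u, itemset))
            if len(heap) == k:
                threshold = heap[0][0]  # raise the threshold to the k-th best utility
    return sorted(heap, reverse=True), threshold


# Hypothetical stream: each transaction maps an item to its utility in that transaction.
stream = [
    {"a": 5, "b": 2},
    {"a": 5, "c": 1},
    {"b": 4, "c": 4},
    {"a": 10, "b": 2, "c": 1},
]
window = deque(maxlen=3)  # sliding window over the 3 most recent transactions
for tx in stream:
    window.append(tx)
    if len(window) == window.maxlen:
        print(top_k_hui(window, k=2))
```

A practical stream miner would avoid the exhaustive enumeration by pruning with upper bounds and by projecting the dataset, which is where the techniques named in the abstract come in.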

An efficient algorithm for mining temporal high utility itemsets from data streams

Journal of Systems and Software, 2008, Vol 81 (7), pp. 1105-1117. DOI: 10.1016/j.jss.2007.07.026. Cited By ~ 52

Author(s): Chun-Jung Chu, Vincent S. Tseng, Tyne Liang

Keyword(s): Data Streams, Efficient Algorithm, High Utility, High Utility Itemsets

MHUI-max: An efficient algorithm for discovering high-utility itemsets from data streams

Journal of Information Science, 2011, Vol 37 (5), pp. 532-545. DOI: 10.1177/0165551511416436. Cited By ~ 6

Author(s): Hua-Fu Li

Keyword(s): Data Streams, Efficient Algorithm, High Utility, High Utility Itemsets

Mining of top-k high utility itemsets with negative utility

Journal of Intelligent & Fuzzy Systems, 2020, pp. 1-16. DOI: 10.3233/jifs-201357

Author(s): Rui Sun, Meng Han, Chunyan Zhang, Mingyao Shen, Shiyu Du

Keyword(s): Data Mining, Search Space, Experimental Results, Effective Algorithm, Memory Usage, Utility Value, Itemset Mining, High Utility, High Utility Itemsets

High utility itemset mining (HUIM) with negative utility is an emerging data mining task. However, setting the minimum utility threshold is always a challenge when mining high utility itemsets (HUIs) with negative items. Although top-k HUIM methods are very common, they can only mine itemsets with positive items, and they miss itemsets when negative items are present. To solve this problem, we propose an effective algorithm called THN (Top-k High Utility Itemset Mining with Negative Utility). THN uses a strategy for automatically raising the minimum utility threshold. To avoid multiple scans of the database, it uses transaction merging and dataset projection. To prune the search space, it uses a redefined sub-tree utility value and a redefined local utility value. Experimental results on real datasets show that THN is efficient in terms of runtime and memory usage, and has excellent scalability. Moreover, experiments show that THN performs particularly well on dense datasets.
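As a rough illustration of the transaction merging mentioned in this abstract, the sketch below merges transactions that contain exactly the same items by summing their per-item utilities, including negative ones. It is an assumption about the general technique, not THN's actual procedure, which the abstract does not spell out. The names `merge_identical_transactions` and `itemset_utility` and the dict-based transaction format are hypothetical.

```python
def merge_identical_transactions(transactions):
    """Merge transactions that contain exactly the same items, summing per-item utilities."""
    merged = {}
    for tx in transactions:
        key = frozenset(tx)  # the set of items identifies a mergeable group
        if key not in merged:
            merged[key] = dict(tx)  # copy so the input is not modified
        else:
            for item, u in tx.items():
                merged[key][item] += u
    return list(merged.values())


def itemset_utility(itemset, transactions):
    """Sum the utility of `itemset` over the transactions that contain all of its items."""
    return sum(
        sum(tx[item] for item in itemset)
        for tx in transactions
        if all(item in tx for item in itemset)
    )


# Hypothetical database: item "b" has a negative utility (for example, sold at a loss).
db = [{"a": 4, "b": -2}, {"a": 6, "b": -1}, {"c": 3}]
compact = merge_identical_transactions(db)

# Merging shrinks the database without changing any itemset's utility.
assert len(compact) == 2
assert itemset_utility(("a", "b"), db) == itemset_utility(("a", "b"), compact)
```

The point of merging is that later scans touch fewer transactions while every itemset utility stays the same.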

Optimization of Evolutionary Algorithm Using Machine Learning Techniques for Pattern Mining in Transactional Database

Handbook of Research on Applications and Implementations of Machine Learning Techniques - Advances in Computational Intelligence and Robotics, 2020, pp. 173-200. DOI: 10.4018/978-1-5225-9902-9.ch010

Author(s): Logeswaran K., Suresh P., Savitha S., Prasanna Kumar K. R.

Keyword(s): Evolutionary Algorithm, Pattern Mining, Fitness Function, Search Space, Machine Learning Techniques, Dynamic Selection, Learning Techniques, Optimal Function, High Utility, Mining Algorithms

In recent years, data analysts have faced many challenges in mining high utility itemsets (HUIs) from a given transactional database using existing traditional techniques. The main challenges in utility mining algorithms are an exponentially growing search space and the choice of a minimum utility threshold appropriate to the given database. To overcome these challenges, evolutionary algorithm-based techniques can be used to mine HUIs from a transactional database. However, testing each of the supporting functions in the optimization problem is very inefficient and increases the time complexity of the algorithm. To overcome this drawback, a reinforcement learning-based approach is proposed to improve the efficiency of the algorithm, so that the most appropriate fitness function for evaluation can be selected automatically during execution. Furthermore, when distinct functions are skillful during the optimization process, the currently optimal function is selected dynamically.

FHN: An efficient algorithm for mining high-utility itemsets with negative unit profits

Knowledge-Based Systems, 2016, Vol 111, pp. 283-298. DOI: 10.1016/j.knosys.2016.08.022. Cited By ~ 27

Author(s): Jerry Chun-Wei Lin, Philippe Fournier-Viger, Wensheng Gan

Keyword(s): Efficient Algorithm, High Utility, High Utility Itemsets

An efficient algorithm for hiding sensitive-high utility itemsets

Intelligent Data Analysis, 2020, Vol 24 (4), pp. 831-845. DOI: 10.3233/ida-194697

Author(s): Vy Huynh Trieu, Hai Le Quoc, Chau Truong Ngoc

Keyword(s): Efficient Algorithm, High Utility, High Utility Itemsets

EHNL: An efficient algorithm for mining high utility itemsets with negative utility value and length constraints

Information Sciences, 2019, Vol 484, pp. 44-70. DOI: 10.1016/j.ins.2019.01.056. Cited By ~ 2

Author(s): Kuldeep Singh, Ajay Kumar, Shashank Sheshar Singh, Harish Kumar Shakya, Bhaskar Biswas

Keyword(s): Efficient Algorithm, Utility Value, High Utility, High Utility Itemsets

Efficient Algorithm for Mining Non-Redundant High-Utility Association Rules

Sensors, 2020, Vol 20 (4), pp. 1078. DOI: 10.3390/s20041078. Cited By ~ 7

Author(s): Thang Mai, Loan T.T. Nguyen, Bay Vo, Unil Yun, Tzung-Pei Hong

Keyword(s): Association Rules, Business Strategy, Efficient Algorithm, Business Managers, Competitive Strategies, Computing Systems, Other Information, High Utility, High Utility Itemsets, The Internet Of Things

In business, managers may use the association information among products to define promotion and competitive strategies. The mining of high-utility association rules (HARs) from high-utility itemsets enables users to select their own weights for rules, based on either the utility or the confidence values. This approach also provides more information, which can help managers to make better decisions. Some efficient methods for mining HARs have been developed in recent years. However, in some decision-support systems, users only need to mine the smallest set of HARs for efficient use. Therefore, this paper proposes a method for the efficient mining of non-redundant high-utility association rules (NR-HARs). We first build a semi-lattice of mined high-utility itemsets, and then identify closed and generator itemsets within it. Following this, an efficient algorithm is developed for generating rules from the built lattice. This new approach was verified on different types of datasets to demonstrate that it has a faster runtime and does not require more memory than existing methods. The proposed algorithm can be integrated with a variety of applications and would combine well with external systems, such as the Internet of Things (IoT) and distributed computer systems. Many companies have been applying IoT and such computing systems to their business activities, data monitoring, or decision-making. Data can be sent into the system continuously through the IoT or any other information system. Selecting an appropriate and fast approach helps management to visualize customer needs and to make more timely decisions on business strategy.

A Systematic Survey on High Utility Itemset Mining

International Journal of Information Technology & Decision Making, 2019, Vol 18 (04), pp. 1113-1185. DOI: 10.1142/s0219622019300027. Cited By ~ 2

Author(s): Bahareh Rahmati, Mohammad Karim Sohrabi

Keyword(s): Data Structures, Search Space, Frequent Itemset, Itemset Mining, Efficient Data, Average Utility, High Utility, High Utility Itemsets, Downward Closure, Efficient Data Structures

High utility itemset mining considers the unit profits and quantities of items in a transaction database to extract more applicable and more useful association rules. The downward closure property, which enables significant pruning in frequent itemset mining, does not hold for the utility of itemsets, so the mining problem requires alternative solutions to reduce its search space and enhance its efficiency. The main solutions are to use an anti-monotonic upper bound of the utility function and to exploit efficient data structures that store and compact the dataset so that pruning strategies can be applied efficiently. Different mining methods and techniques have attempted to improve the performance of extracting high utility itemsets and their variants, including high-average utility itemsets, top-k high utility itemsets, and high utility itemsets with negative values, by using more efficient data structures, more appropriate anti-monotonic upper bounds, and stronger pruning strategies. This paper aims to present a comprehensive systematic review of high utility itemset mining techniques and to classify them based on their problem-solving approaches.
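A small worked example may make the point about downward closure concrete. The sketch below uses transaction-weighted utilization (TWU), one widely used anti-monotonic upper bound; the abstract does not name a specific bound, so choosing TWU here is an assumption. The database, the function names `utility` and `twu`, and the dict-based transaction format are hypothetical, and all utilities are assumed non-negative, which TWU requires in order to be a valid bound.

```python
def contains(tx, itemset):
    """True if transaction `tx` holds every item of `itemset`."""
    return all(item in tx for item in itemset)


def utility(itemset, db):
    """Exact utility: sum of the itemset's own item utilities in supporting transactions."""
    return sum(sum(tx[i] for i in itemset) for tx in db if contains(tx, itemset))


def twu(itemset, db):
    """Transaction-weighted utilization: total utility of every supporting transaction."""
    return sum(sum(tx.values()) for tx in db if contains(tx, itemset))


# Hypothetical database: each transaction maps an item to its utility there.
db = [
    {"a": 1, "b": 6},
    {"a": 1, "c": 2},
    {"a": 1, "b": 5, "c": 1},
]

# Utility is not anti-monotone: the superset {a, b} outscores its subset {a}.
assert utility(("a",), db) == 3
assert utility(("a", "b"), db) == 13

# TWU never increases when an item is added, and it bounds the exact utility from above.
assert twu(("a",), db) >= twu(("a", "b"), db) >= twu(("a", "b", "c"), db)
assert twu(("a", "b"), db) >= utility(("a", "b"), db)
```

Because TWU can only shrink as an itemset grows, any itemset whose TWU falls below the minimum utility threshold can be pruned together with all of its supersets.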

CHN: an efficient algorithm for mining closed high utility itemsets with negative utility

IEEE Transactions on Knowledge and Data Engineering, 2018, pp. 1-1. DOI: 10.1109/tkde.2018.2882421. Cited By ~ 2

Author(s): Kuldeep Singh, Shashank Sheshar Singh, Ajay Kumar, Harish Kumar Shakya, Bhaskar Biswas

Keyword(s): Efficient Algorithm, High Utility, High Utility Itemsets
