High Utility Item Interval Sequential Pattern Mining Algorithm

High utility sequential pattern mining is a popular topic in data mining whose main purpose is to extract sequential patterns with high utility from a sequence database. Many recent works have proposed methods to solve this problem. However, most of them do not consider the item intervals of sequential patterns, which can lead to the extraction of patterns whose item intervals are too long to be meaningful. In this paper, we propose a High Utility Item Interval Sequential Pattern (HUISP) algorithm to solve this problem. Our algorithm uses a pattern growth approach together with several techniques to improve its performance.
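To make the item-interval idea concrete, the following is a minimal sketch and not the HUISP algorithm itself. It assumes each sequence is a time-ordered list of `(item, timestamp, utility)` events with one item per event, that the interval between consecutive matched items must be positive and at most `max_gap`, and that a pattern's utility in a sequence is the maximum over its valid occurrences. The names `best_utility`, `pattern_utility`, and `max_gap` are hypothetical.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, int, int]  # (item, timestamp, utility); assumed layout


def best_utility(seq: List[Event], pattern: List[str], max_gap: int) -> Optional[int]:
    """Highest utility of any occurrence of `pattern` in `seq` whose
    consecutive matched items are at most `max_gap` time units apart.
    Returns None when no occurrence satisfies the interval constraint."""
    prev: List[Optional[int]] = [None] * len(seq)
    for k, wanted in enumerate(pattern):
        # cur[i] = best utility of matching pattern[:k+1] with its last item at event i
        cur: List[Optional[int]] = [None] * len(seq)
        for i, (item, t, u) in enumerate(seq):
            if item != wanted:
                continue
            if k == 0:
                cur[i] = u
                continue
            candidates = [
                prev[j]
                for j in range(i)
                if prev[j] is not None and 0 < t - seq[j][1] <= max_gap
            ]
            if candidates:
                cur[i] = max(candidates) + u
        prev = cur
    values = [v for v in prev if v is not None]
    return max(values) if values else None


def pattern_utility(db: List[List[Event]], pattern: List[str], max_gap: int) -> int:
    """Sum of the per-sequence best utilities over sequences that contain the pattern."""
    total = 0
    for seq in db:
        u = best_utility(seq, pattern, max_gap)
        if u is not None:
            total += u
    return total


# Toy database (made-up values, for illustration only).
db = [
    [("a", 1, 5), ("b", 2, 3), ("a", 6, 2), ("b", 9, 4)],
    [("a", 1, 1), ("b", 8, 10)],
]
tight = pattern_utility(db, ["a", "b"], max_gap=3)
loose = pattern_utility(db, ["a", "b"], max_gap=10)
```

With the tight interval, the second sequence contributes nothing because its only occurrence of a followed by b spans 7 time units; relaxing the interval admits that occurrence and raises the total utility.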

The Sequential Pattern Mining Algorithm MHSP Based on MH

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.63-64.425 ◽ 2011 ◽ Vol 63-64 ◽ pp. 425-430

Author(s): Jun Wang ◽ Ya Qiong Jiang

Keyword(s): Pattern Mining ◽ Sequential Pattern Mining ◽ Experimental Results ◽ Sequential Pattern ◽ Sequential Patterns ◽ Important Method ◽ The Real ◽ Mining Algorithm ◽ Large Projection ◽ Growth Approach

The pattern growth approach is an important method in sequential pattern mining. PrefixSpan introduced projected databases based on this approach and can solve the problem of mining sequential patterns. However, its performance degrades when the projected databases are large. Inspired by the prefix-divide method and the MH structure, this paper proposes a new algorithm, MHSP, for sequential pattern mining. Experimental results on real datasets show that MHSP is more than twice as fast as PrefixSpan.
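For readers unfamiliar with projected databases, here is a minimal sketch of PrefixSpan-style pattern growth. It is not MHSP, whose MH structure is not described in the abstract. It assumes every sequence is a list of single items (no itemsets) and represents each projected database by `(sequence index, start offset)` pairs instead of copying suffixes; the function name `prefixspan` and the toy data are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def prefixspan(db: List[List[str]], min_support: int) -> Dict[Tuple[str, ...], int]:
    """Return every sequential pattern with support >= min_support."""
    results: Dict[Tuple[str, ...], int] = {}

    def grow(prefix: Tuple[str, ...], projected: List[Tuple[int, int]]) -> None:
        # Count, per item, how many projected suffixes contain it.
        counts: Dict[str, int] = defaultdict(int)
        for sid, start in projected:
            for item in set(db[sid][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            new_prefix = prefix + (item,)
            results[new_prefix] = support
            # Project on the first occurrence of `item` in each suffix.
            new_projected = []
            for sid, start in projected:
                try:
                    pos = db[sid].index(item, start)
                except ValueError:
                    continue  # this suffix does not contain the item
                new_projected.append((sid, pos + 1))
            grow(new_prefix, new_projected)

    grow((), [(sid, 0) for sid in range(len(db))])
    return results


# Toy database, for illustration only.
db = [list("abcb"), list("acb"), list("bca")]
patterns = prefixspan(db, min_support=2)
```

Each recursive call scans only the suffixes that follow the current prefix, which is why the size of the projected databases dominates the cost of the approach.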

Mining of Sequential Patterns using Directed Graphs

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.k2242.0981119 ◽ 2019 ◽ Vol 8 (11) ◽ pp. 4002-4007

Keyword(s): Pattern Mining ◽ Directed Graphs ◽ Real Life ◽ Sequential Pattern Mining ◽ Sequential Pattern ◽ Sequential Patterns ◽ Sequential Data ◽ Sequence Database ◽ Directed Paths ◽ Digraph Model

Sequential pattern mining is one of the important functionalities of data mining. It analyzes a sequence database to discover sequential patterns, focusing on the extraction of interesting subsequences from a set of sequences. Various factors, such as frequency of occurrence, length, and profit, are used to define the interestingness of a subsequence derived from the sequence database. Sequential pattern mining has abundant real-life applications, since data is naturally encoded as sequences of symbols in many fields such as bioinformatics, e-learning, market basket analysis, text, and webpage click-stream analysis. A wide variety of efficient algorithms, such as PrefixSpan, GSP, and FreeSpan, have been proposed during the past few years. In this paper we propose a data model for organizing the sequence database, which consists of a directed graph DGS (cycles and multiple edges are allowed) and an organization of directed paths in DGS that represents the sequential data for discovering sequential patterns from a sequence database. Efficient algorithms for constructing the digraph model (DGS), for extracting all sequential patterns, and for mining association rules are proposed. A number of theoretical parameters of the digraph model are also introduced, which lead to a better understanding of the problem.
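The abstract does not specify how DGS is built, so the following is only one plausible reading, sketched under stated assumptions: each consecutive pair of items in a sequence adds one directed edge labelled with `(sequence id, position)`, which naturally allows cycles and multiple edges between the same two vertices. The helper names `build_digraph` and `path_occurrences` are hypothetical, and the sketch handles contiguous paths only, not subsequences with gaps.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]
Label = Tuple[int, int]  # (sequence id, position of the edge's source item)


def build_digraph(db: List[List[str]]) -> Dict[Edge, List[Label]]:
    """One labelled edge per consecutive item pair; parallel edges are kept."""
    edges: Dict[Edge, List[Label]] = defaultdict(list)
    for sid, seq in enumerate(db):
        for pos, (u, v) in enumerate(zip(seq, seq[1:])):
            edges[(u, v)].append((sid, pos))
    return dict(edges)


def path_occurrences(edges: Dict[Edge, List[Label]], path: List[str]) -> Set[Label]:
    """Start labels (sid, pos) at which `path` occurs contiguously in the database."""
    if len(path) < 2:
        raise ValueError("a path needs at least two vertices")
    current = set(edges.get((path[0], path[1]), ()))
    # The edge taken at step s must sit exactly s positions after the start.
    for step, (u, v) in enumerate(zip(path[1:], path[2:]), start=1):
        labels = set(edges.get((u, v), ()))
        current = {(sid, pos) for (sid, pos) in current if (sid, pos + step) in labels}
    return current


# Toy database, for illustration only; the first sequence creates the cycle a -> b -> a.
db = [list("abab"), list("bab"), list("abc")]
graph = build_digraph(db)
bab = path_occurrences(graph, list("bab"))
support_bab = len({sid for sid, _ in bab})
```

Keeping the position in each edge label is a design choice made here so that a path is matched in the right order within a single sequence; intersecting plain sequence-id sets would only give a necessary condition.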

Scalable Mining of High-Utility Sequential Patterns With Three-Tier MapReduce Model

ACM Transactions on Knowledge Discovery from Data ◽ 10.1145/3487046 ◽ 2022 ◽ Vol 16 (3) ◽ pp. 1-26

Author(s): Jerry Chun-Wei Lin ◽ Youcef Djenouri ◽ Gautam Srivastava ◽ Yuanfa Li ◽ Philip S. Yu

Keyword(s): Large Scale ◽ Pattern Mining ◽ Sequential Pattern Mining ◽ Main Memory ◽ Frequent Itemset ◽ Sequential Pattern ◽ Sequential Patterns ◽ Speed Up ◽ Mapreduce Model ◽ High Utility

High-utility sequential pattern mining (HUSPM) has been a hot research topic in recent decades, since it combines sequential and utility properties to reveal more information and knowledge than traditional frequent itemset mining or sequential pattern mining. Several HUSPM approaches have been presented, but most of them rely on main memory to speed up mining. This assumption is not realistic in large-scale environments: in real industry settings, the collected data is very large and cannot fit into the main memory of a single machine. In this article, we first develop a parallel and distributed three-stage MapReduce model for mining high-utility sequential patterns from large-scale databases. Two properties are then developed to guarantee the correctness and completeness of the patterns discovered in the developed framework. In addition, two data structures, called sidset and utility-linked list, are used in the framework to accelerate the computation of the required patterns. The results show that the designed model performs well on large-scale datasets in terms of runtime, memory, efficiency with respect to the number of distributed nodes, and scalability, compared to the serial HUSP-Span approach.

Mining Time-Interval Sequential Patterns with High Utility from Transaction Databases

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽ 10.20965/jaciii.2016.p1018 ◽ 2016 ◽ Vol 20 (6) ◽ pp. 1018-1026 ◽ Cited By ~ 1

Author(s): Wen-Yen Wang ◽ Anna Y.-Q. Huang

Keyword(s): Pattern Mining ◽ Sequential Pattern Mining ◽ Business Practice ◽ Sequential Pattern ◽ Sequential Patterns ◽ Time Interval ◽ Business Managers ◽ Time Intervals ◽ High Utility ◽ Product Sales

The purpose of time-interval sequential pattern mining is to help superstore business managers promote product sales. Sequential pattern mining discovers the time interval patterns for items: for example, if most customers purchase product item A, and then buy items B and C after r to s and t to u days respectively, the time interval between r to s and t to u days can be provided to business managers to facilitate informed marketing decisions. We treat these time intervals as patterns to be mined, to predict the purchasing time intervals between A and B, as well as B and C. Nevertheless, little work considers the significance of product items while mining these time-interval sequential patterns. This work extends previous work and retains high-utility time interval patterns during pattern mining. This type of mining is meant to more closely reflect actual business practice. Experimental results show the differences between three mining approaches when jointly considering item utility and time intervals for purchased items. In addition to yielding more accurate patterns than the other two methods, the proposed UTMining_A method shortens execution times by delaying join processing and removing unnecessary records.

An Efficient Parallel High Utility Sequential Pattern Mining Algorithm

2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) ◽ 10.1109/hpcc/smartcity/dss.2019.00392 ◽ 2019

Author(s): Chunkai Zhang ◽ Yiwen Zu

Keyword(s): Pattern Mining ◽ Sequential Pattern Mining ◽ Sequential Pattern ◽ Mining Algorithm ◽ High Utility

From sequential pattern mining to structured pattern mining: A pattern-growth approach

Journal of Computer Science and Technology ◽ 10.1007/bf02944897 ◽ 2004 ◽ Vol 19 (3) ◽ pp. 257-279 ◽ Cited By ~ 27

Author(s): Jia-Wei Han ◽ Jian Pei ◽ Xi-Feng Yan

Keyword(s): Pattern Mining ◽ Sequential Pattern Mining ◽ Sequential Pattern ◽ Pattern Growth ◽ Growth Approach

Dramatically Reducing Search for High Utility Sequential Patterns by Maintaining Candidate Lists

Information ◽ 10.3390/info11010044 ◽ 2020 ◽ Vol 11 (1) ◽ pp. 44

Author(s): Scott Buffett

Keyword(s): Upper Bound ◽ Pattern Mining ◽ Computational Cost ◽ Search Space ◽ Sequential Pattern Mining ◽ Sequential Pattern ◽ Sequential Patterns ◽ Frequent Patterns ◽ Run Time ◽ High Utility

A ubiquitous challenge throughout all areas of data mining, particularly in the mining of frequent patterns in large databases, is centered on the necessity to reduce the time and space required to perform the search. The extent of this reduction proportionally facilitates the ability to identify patterns of interest. High utility sequential pattern mining (HUSPM) seeks to identify frequent patterns that are (1) sequential in nature and (2) hold a significant magnitude of utility in a sequence database, by considering the aspect of item value or importance. While traditional sequential pattern mining relies on the downward closure property to significantly reduce the required search space, with HUSPM, this property does not hold. To address this drawback, an approach is proposed that establishes a tight upper bound on the utility of future candidate sequential patterns by maintaining a list of items that are deemed potential candidates for concatenation. Such candidates are provably the only items that are ever needed for any extension of a given sequential pattern or its descendants in the search tree. This list is then exploited to significantly further tighten the upper bound on the utilities of descendent patterns. An extension of this work is then proposed that significantly reduces the computational cost of updating database utilities each time a candidate item is removed from the list, resulting in a massive reduction in the number of candidate sequential patterns that need to be generated in the search. Sequential pattern mining methods implementing these new techniques for bound reduction and further candidate list reduction are demonstrated via the introduction of the CRUSP and CRUSPPivot algorithms, respectively. Validation of the techniques was conducted on six public datasets. 
Tests show that use of the CRUSP algorithm results in a significant reduction in the overall number of candidate sequential patterns that need to be considered, and subsequently a significant reduction in run time, when compared to the current state of the art in bounding techniques. When employing the CRUSPPivot algorithm, the further reduction in the size of the search space was found to be dramatic, with the reduction in run time found to be dramatic to moderate, depending on the dataset. Demonstrating the practical significance of the work, experiments showed that time required for one particularly complex dataset was reduced from many hours to less than one minute.
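The claim that downward closure fails for utility can be checked on a toy example. The sketch below is not CRUSP or CRUSPPivot; it only illustrates why an upper bound is needed, using sequence-weighted utilization (SWU), a classic and much looser bound than the one the abstract describes. It assumes each sequence is a list of `(item, utility)` events and that a pattern's utility in a sequence is the maximum over its occurrences; all function names and values are illustrative.

```python
from typing import List, Optional, Tuple

Seq = List[Tuple[str, int]]  # (item, utility); assumed layout


def utility_in_seq(seq: Seq, pattern: List[str]) -> Optional[int]:
    """Best utility of `pattern` as a subsequence of `seq`, or None if absent."""
    # best[k] = best utility of matching pattern[:k] with the events seen so far
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for item, u in seq:
        # Go right to left so one event extends each prefix at most once.
        for k in range(len(pattern), 0, -1):
            if pattern[k - 1] == item and best[k - 1] is not None:
                candidate = best[k - 1] + u
                if best[k] is None or candidate > best[k]:
                    best[k] = candidate
    return best[len(pattern)]


def utility(db: List[Seq], pattern: List[str]) -> int:
    """Total utility of the pattern over all sequences that contain it."""
    found = (utility_in_seq(seq, pattern) for seq in db)
    return sum(u for u in found if u is not None)


def swu(db: List[Seq], pattern: List[str]) -> int:
    """Sum of the full utilities of every sequence that contains the pattern."""
    return sum(
        sum(u for _, u in seq)
        for seq in db
        if utility_in_seq(seq, pattern) is not None
    )


# Toy database (made-up values, for illustration only).
db = [
    [("a", 1), ("b", 9)],
    [("a", 2), ("b", 8)],
    [("a", 1), ("c", 1)],
]
u_a = utility(db, ["a"])
u_ab = utility(db, ["a", "b"])
bound_a = swu(db, ["a"])
```

Here the pattern a has a lower utility than its extension a, b, so pruning on utility alone would wrongly discard a under a threshold of, say, 10. SWU is anti-monotone and never underestimates the utility of any extension, which makes it safe for pruning, though far from tight.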

An Asynchronous Periodic Sequential Pattern Mining Algorithm with Multiple Minimum Item Supports for Ad Hoc Networking

Journal of Sensors ◽ 10.1155/2015/461659 ◽ 2015 ◽ Vol 2015 ◽ pp. 1-13 ◽ Cited By ~ 1

Author(s): Xiangzhan Yu ◽ Zhaoxin Zhang ◽ Haining Yu ◽ Feng Jiang ◽ Wen Ji

Keyword(s): Ad Hoc ◽ Pattern Mining ◽ Sequential Pattern Mining ◽ Experimental Results ◽ Sequential Pattern ◽ Divide And Conquer ◽ Sequential Patterns ◽ Ad Hoc Networking ◽ Mining Algorithm ◽ Mining Model

The original sequential pattern mining model only considers occurrence frequencies of sequential patterns, disregarding their occurrence periodicity. We propose an asynchronous periodic sequential pattern mining model to discover the sequential patterns that not only occur frequently but also appear periodically. For this mining model, we propose a pattern-growth mining algorithm to mine asynchronous periodic sequential patterns with multiple minimum item supports. This algorithm employs a divide-and-conquer strategy to mine asynchronous periodic sequential patterns in a depth-first manner recursively. We describe the process of algorithm realization and demonstrate the efficiency and stability of the algorithm through experimental results.

A review on sequential pattern mining using pattern growth approach

2016 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET) ◽ 10.1109/wispnet.2016.7566371 ◽ 2016 ◽ Cited By ~ 3

Author(s): Roshani Patel ◽ Tarunika Chaudhari

Keyword(s): Pattern Mining ◽ Sequential Pattern Mining ◽ Sequential Pattern ◽ Pattern Growth ◽ Growth Approach
