Mining weighted sequential patterns in a sequence database with a time-interval weight

2011, Vol 24 (1), pp. 1-9
Author(s): Joong Hyuk Chang

2019, Vol 19 (4), pp. 3-16
Author(s): Tran Huy Duong, Demetrovics Janos, Vu Duc Thi, Nguyen Truong Thang, Tran The Anh

Abstract: Mining high utility sequential patterns (HUSP) is an emerging topic in data mining that has attracted many researchers. HUSP mining algorithms extract sequential patterns with high utility (importance) from a quantitative sequence database. In real-world applications, the time intervals between elements are also very important. However, recent HUSP mining algorithms cannot extract sequential patterns together with the time intervals between their elements. In this paper, we therefore propose an algorithm for mining high utility sequential patterns with time intervals. We consider not only the utilities of sequential patterns but also their time intervals. The sequence weight utility value is used to preserve the downward closure property, which is needed for pruning. In addition, we use four time constraints on the time intervals in a sequence in order to extract more meaningful patterns. Experimental results show that the proposed method is efficient and effective in mining high utility sequential patterns with time intervals.
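The abstract does not name its four time constraints. As a minimal sketch, assume they are a minimum and maximum gap between adjacent elements plus a minimum and maximum span for the whole occurrence. The function name and the parameters `min_gap`, `max_gap`, `min_span` and `max_span` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def satisfies_time_constraints(
    timestamps: Sequence[float],
    min_gap: float = 0.0,
    max_gap: float = float("inf"),
    min_span: float = 0.0,
    max_span: float = float("inf"),
) -> bool:
    """Check one occurrence of a pattern against four assumed time constraints.

    `timestamps` holds the occurrence times of the pattern's elements in
    ascending order. All four constraint names are illustrative assumptions.
    """
    # An occurrence with fewer than two elements has no interval to check.
    if len(timestamps) < 2:
        return True

    # Gap constraints apply to every pair of adjacent elements.
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap = later - earlier
        if gap < min_gap or gap > max_gap:
            return False

    # Span constraints apply to the distance between first and last element.
    span = timestamps[-1] - timestamps[0]
    return min_span <= span <= max_span
```

In a miner, a check of this kind would typically be applied to each candidate occurrence before its utility is accumulated, so that occurrences violating the constraints do not contribute to a pattern's utility.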


Sequential pattern mining is one of the important functionalities of data mining. It analyzes a sequential database to discover sequential patterns, that is, interesting subsequences extracted from a set of sequences. Factors such as frequency of occurrence, length, and profit are used to define the interestingness of a subsequence derived from the sequence database. Sequential pattern mining has abundant real-life applications, since data is naturally encoded as sequences of symbols in many fields such as bioinformatics, e-learning, market basket analysis, text analysis, and webpage click-stream analysis. A wide variety of efficient algorithms, such as PrefixSpan, GSP and FreeSpan, have been proposed during the past few years. In this paper we propose a data model for organizing the sequential database. It consists of a directed graph DGS (cycles and multiple edges are allowed) and an organization of directed paths in DGS that represents the sequential data, for discovering sequential patterns from a sequence database. Efficient algorithms are proposed for constructing the digraph model (DGS), extracting all sequential patterns, and mining association rules. A number of theoretical parameters of the digraph model are also introduced, which lead to a better understanding of the problem.
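The abstract gives no construction details for DGS. The sketch below is one possible reading only: each sequence becomes a directed path, and parallel edges between the same pair of items are kept apart by recording which sequence and position produced them. The function `build_sequence_digraph` and this edge representation are assumptions, not the authors' model.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str]        # (source item, target item)
Occurrence = Tuple[int, int]  # (sequence id, position of the source item)


def build_sequence_digraph(sequences: List[List[str]]) -> Dict[Edge, List[Occurrence]]:
    """Store each sequence as a directed path in a multigraph.

    Every consecutive pair of items adds one edge occurrence, so repeated
    pairs yield parallel edges and repeated items can form cycles.
    """
    edges: Dict[Edge, List[Occurrence]] = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        for pos in range(len(seq) - 1):
            edges[(seq[pos], seq[pos + 1])].append((seq_id, pos))
    return dict(edges)
```

Under this reading, the number of distinct sequence ids attached to an edge gives the support of the corresponding contiguous two-item pattern.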


Author(s): Wen-Yen Wang, Anna Y.-Q. Huang

The purpose of time-interval sequential pattern mining is to help superstore business managers promote product sales. Sequential pattern mining discovers the time-interval patterns for items. For example, if most customers purchase item A and then buy items B and C after r to s days and t to u days respectively, the intervals of r to s days and t to u days can be provided to business managers to support informed marketing decisions. We treat these time intervals as patterns to be mined, in order to predict the purchasing time intervals between A and B as well as between B and C. However, little work considers the significance of product items while mining these time-interval sequential patterns. This work extends previous work and retains high-utility time-interval patterns during pattern mining, which is meant to reflect actual business practice more closely. Experimental results show the differences between three mining approaches when item utility and time intervals for purchased items are considered jointly. In addition to yielding more accurate patterns than the other two methods, the proposed UTMining_A method shortens execution times by delaying join processing and removing unnecessary records.
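The A-then-B example above can be illustrated with a small sketch that measures, for each customer, the number of days between the first purchase of one item and the next later purchase of another, and reports the observed range. This is not the UTMining_A method and it ignores item utility. The function `interval_range` and the `(item, day)` record layout are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Purchase = Tuple[str, int]  # (item, day of purchase); layout is an assumption


def interval_range(
    sequences: List[List[Purchase]], first: str, second: str
) -> Optional[Tuple[int, int]]:
    """Return the (shortest, longest) gap in days from `first` to `second`.

    Each customer sequence is assumed to be sorted by day. Customers who
    never buy `second` after `first` are skipped. Returns None if no
    customer shows the pattern.
    """
    gaps = []
    for seq in sequences:
        # Day of the first purchase of `first`, if any.
        start = next((day for item, day in seq if item == first), None)
        if start is None:
            continue
        # Day of the earliest later purchase of `second`, if any.
        follow = next(
            (day for item, day in seq if item == second and day > start), None
        )
        if follow is not None:
            gaps.append(follow - start)
    return (min(gaps), max(gaps)) if gaps else None
```

With three customers whose purchases are `[("A", 1), ("B", 4), ("C", 10)]`, `[("A", 2), ("B", 9)]` and `[("B", 1), ("C", 3)]`, the observed range for A followed by B is 3 to 7 days, which plays the role of the "r to s days" interval in the example.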

