Pre-processing time constraints for efficiently mining generalized sequential patterns

Author(s):  
F. Masseglia ◽  
P. Poncelet ◽  
M. Teisseire
2016 ◽  
Vol 29 (10) ◽  
pp. 1105-1127 ◽  
Author(s):  
Grzegorz Bocewicz ◽  
Izabela Ewa Nielsen ◽  
Zbigniew Antoni Banaszak

2009 ◽  
Vol 36 (2) ◽  
pp. 2677-2690 ◽  
Author(s):  
F. Masseglia ◽  
P. Poncelet ◽  
M. Teisseire

2020 ◽  
Vol 16 (1) ◽  
pp. 1-21
Author(s):  
Christie I. Ezeife ◽  
Vignesh Aravindan ◽  
Ritu Chaturvedi

Existing work on sequential pattern mining over multiple databases (MDBs) cannot mine frequent sequences to answer exact and historical queries from MDBs that have different table structures. This article proposes the transaction id frequent sequence pattern (TidFSeq) algorithm to handle the difficult problem of mining frequent sequences from diverse MDBs. The TidFSeq algorithm transforms each candidate 1-sequence into the transaction subsequences in which it occurs, stored as a (1-sequence, its subsequence id list) tuple or a (1-sequence, position id list) tuple. Subsequent frequent i-sequences are computed using the counts of the sequence ids in each candidate i-sequence's position id list tuple. An extended, generalized sequential pattern (GSP)-like candidate-generation and frequency-count approach is used to compute the supports of itemset (I-step) and separate (S-step) sequences without repeated database scans, relying on transaction ids instead. The generated patterns answer complex queries from MDBs. The TidFSeq algorithm has a faster processing time than existing algorithms.
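
The id-list idea described in this abstract can be illustrated with a minimal sketch. It is not the TidFSeq algorithm itself: the toy database `db`, the function names, and the simplified join rules are all assumptions made for illustration. It only shows how supports can be counted from per-item position id lists, built in one pass, without rescanning the database.

```python
from collections import defaultdict

# Hypothetical toy database: sequence id -> ordered list of itemsets.
db = {
    1: [{"a"}, {"b", "c"}, {"d"}],
    2: [{"a", "b"}, {"c"}],
    3: [{"b"}, {"a"}, {"c"}],
}

def build_position_id_lists(database):
    """One pass over the data: item -> {sequence id: [itemset positions]}."""
    idlists = defaultdict(lambda: defaultdict(list))
    for sid, sequence in database.items():
        for pos, itemset in enumerate(sequence):
            for item in itemset:
                idlists[item][sid].append(pos)
    return idlists

def s_step(left, right):
    """Sequence extension: keep positions of `right` that come strictly
    after the earliest occurrence of `left` in the same sequence."""
    joined = {}
    for sid, left_positions in left.items():
        if sid not in right:
            continue
        first_left = min(left_positions)
        later = [p for p in right[sid] if p > first_left]
        if later:
            joined[sid] = later
    return joined

def i_step(left, right):
    """Itemset extension: keep positions where both occur in the same itemset."""
    joined = {}
    for sid, left_positions in left.items():
        common = sorted(set(left_positions) & set(right.get(sid, [])))
        if common:
            joined[sid] = common
    return joined

def support(idlist):
    """Support is the number of distinct sequence ids in the id list."""
    return len(idlist)

idlists = build_position_id_lists(db)
a_then_c = s_step(idlists["a"], idlists["c"])   # pattern <(a)(c)>
a_with_b = i_step(idlists["a"], idlists["b"])   # pattern <(a b)>
```

Here `support(a_then_c)` counts the sequences in which `a` is followed by `c`, and `support(a_with_b)` counts those in which `a` and `b` share an itemset; both are derived from the id lists alone.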


Author(s):  
Shigeaki Sakurai

This article proposes a method for discovering characteristic sequential patterns from sequential data by using background knowledge. In tabular structured data, each item is composed of an attribute and an attribute value. The article focuses on two types of constraints that describe background knowledge. The first is time constraints, which flexibly describe time relationships between items. The second is item constraints, which select the items included in sequential patterns. Together, these constraints represent background knowledge that reflects the interests of analysts, so analysts can easily discover sequential patterns that coincide with those interests as characteristic sequential patterns. Lastly, the article verifies the effect of the pattern discovery method based on both the evaluation criteria of sequential patterns and the background knowledge. The method can be applied to the analysis of healthcare data.
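
As a rough illustration of the two constraint types, the sketch below checks a pattern against one timestamped sequence. The abstract does not give the exact form of the constraints, so everything here is an assumption: the time constraint is modelled as a minimum and maximum gap between consecutive pattern items, the item constraint as a set of allowed items, and the events and names are hypothetical.

```python
def occurs_with_gaps(events, pattern, min_gap, max_gap):
    """Return True if the pattern items occur in order in `events`
    (a list of (timestamp, item) pairs sorted by timestamp) with every
    consecutive gap inside [min_gap, max_gap]. Backtracks over candidates."""
    def search(start, k, prev_time):
        if k == len(pattern):
            return True
        for i in range(start, len(events)):
            t, item = events[i]
            if item != pattern[k]:
                continue
            if prev_time is not None:
                gap = t - prev_time
                if gap > max_gap:
                    break      # sorted input: later matches are even further away
                if gap < min_gap:
                    continue   # too close, try a later occurrence
            if search(i + 1, k + 1, t):
                return True
        return False
    return search(0, 0, None)

def satisfies_item_constraint(pattern, allowed_items):
    """Assumed item constraint: every pattern item must be in the allowed set."""
    return all(item in allowed_items for item in pattern)

# Hypothetical attribute=value items with integer timestamps (e.g. days).
events = [(0, "bp=high"), (1, "drug=A"), (5, "drug=A"), (8, "bp=normal")]
pattern = ["bp=high", "drug=A", "bp=normal"]

loose = occurs_with_gaps(events, pattern, min_gap=1, max_gap=5)
tight = occurs_with_gaps(events, pattern, min_gap=1, max_gap=4)
allowed = satisfies_item_constraint(pattern, {"bp=high", "bp=normal", "drug=A"})
```

With `max_gap=5` the match has to use the second `drug=A` event, which is why the search backtracks instead of taking the first occurrence greedily.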


2015 ◽  
Vol 21 (4) ◽  
pp. 523-547 ◽  
Author(s):  
Lucio Grandinetti ◽  
Francesca Guerriero ◽  
Luigi Di Puglia Pugliese ◽  
Mehdi Sheikhalishahi

2011 ◽  
Vol 341-342 ◽  
pp. 530-534
Author(s):  
Zai Ping Tao

In this paper, a new algorithm named TCSP is proposed to mine sequential patterns with different time constraints. It reads the database into memory and constructs time-index sets for efficient processing. It mines the desired sequential patterns without generating any candidates. We have evaluated the new algorithm against the well-known GSP algorithm and the DELISP algorithm on various datasets and constraints. The comprehensive experiments show that the TCSP algorithm performs better and scales well.


2019 ◽  
Vol 19 (4) ◽  
pp. 3-16
Author(s):  
Tran Huy Duong ◽  
Demetrovics Janos ◽  
Vu Duc Thi ◽  
Nguyen Truong Thang ◽  
Tran The Anh

Mining High Utility Sequential Patterns (HUSP) is an emerging topic in data mining that has attracted many researchers. HUSP mining algorithms extract sequential patterns with high utility (importance) from a quantitative sequence database. In real-world applications, the time intervals between elements are also very important. However, recent HUSP mining algorithms cannot extract sequential patterns with time intervals between elements. In this paper, we therefore propose an algorithm for mining high utility sequential patterns with time intervals. We consider not only the utilities of sequential patterns but also their time intervals. The sequence weight utility value is used to ensure the important downward closure property. In addition, we use four time constraints on the intervals within a sequence to extract more meaningful patterns. Experimental results show that the proposed method is efficient and effective in mining high utility sequential patterns with time intervals.
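
The role of the sequence weight utility value can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: it uses the common definition in which a pattern's sequence weight utility is the sum of the total utilities of all sequences that contain it, it ignores timestamps and the four time constraints (which the abstract does not spell out), and the toy database and function names are hypothetical.

```python
# Hypothetical quantitative sequence database: each sequence is a list of
# itemsets, and each itemset maps an item to its utility in that itemset.
qdb = [
    [{"a": 2, "b": 3}, {"c": 4}],
    [{"a": 1}, {"b": 5}, {"c": 2}],
    [{"b": 6}, {"a": 3}],
]

def sequence_utility(qseq):
    """Total utility of a sequence: the sum over all items in all itemsets."""
    return sum(sum(itemset.values()) for itemset in qseq)

def contains(qseq, pattern):
    """Order-preserving containment: each pattern itemset must be a subset of
    a distinct, later itemset of the sequence (greedy earliest match)."""
    k = 0
    for itemset in qseq:
        if k < len(pattern) and pattern[k].issubset(itemset):
            k += 1
    return k == len(pattern)

def swu(database, pattern):
    """Sequence weight utility: summed utility of the sequences containing the pattern."""
    return sum(sequence_utility(s) for s in database if contains(s, pattern))

swu_a = swu(qdb, [{"a"}])            # every sequence contains <(a)>
swu_a_c = swu(qdb, [{"a"}, {"c"}])   # only sequences with a followed by c
```

Because any sequence containing `<(a)(c)>` also contains `<(a)>`, `swu_a_c` can never exceed `swu_a`. That monotonicity is the downward closure property that allows a pattern to be pruned once its sequence weight utility falls below the threshold.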

