Maintaining the discovered sequential patterns for sequence insertion in dynamic databases

2014 ◽  
Vol 35 ◽  
pp. 131-142 ◽  
Author(s):  
Binbin Zhang ◽  
Chun-Wei Lin ◽  
Wensheng Gan ◽  
Tzung-Pei Hong


2015 ◽  
Vol 11 (1) ◽  
pp. 1-22 ◽  
Author(s):  
Jerry Chun-Wei Lin ◽  
Wensheng Gan ◽  
Tzung-Pei Hong ◽  
Jingliang Zhang

Mining useful information or knowledge from very large databases to help managers and decision makers make appropriate decisions has become a critical issue in recent years. Sequential patterns can be used to discover the purchasing behavior of customers or the usage behavior of users from Web log data. Most approaches process a static database to discover sequential patterns in batch mode. In real-world applications, however, transactions or sequences in databases change frequently. Previously, a fast updated sequential pattern (FUSP)-tree was proposed to handle dynamic databases for sequence insertion, deletion, or modification based on FUP concepts. The original database must still be rescanned, however, whenever small sequences that were not kept in the FUSP tree need to be maintained. In this paper, the prelarge concept is adopted to maintain and update the built prelarge FUSP tree for sequence modification. A prelarge FUSP tree is a modified FUSP tree that preserves not only the frequent 1-sequences but also the prelarge 1-sequences in the tree structure. The PRELARGE-FUSP-TREE-MOD maintenance algorithm is proposed to reduce rescans of the original database by exploiting the pruning properties of the prelarge concept. When the number of modified sequences is smaller than the safety bound of the prelarge concept, the proposed PRELARGE-FUSP-TREE-MOD maintenance algorithm obtains better results for sequence modification in dynamic databases.


Author(s):  
Jerry Chun-Wei Lin ◽  
Wensheng Gan ◽  
Philippe Fournier-Viger ◽  
Tzung-Pei Hong

Mining sequential patterns (SPs) is a popular data mining task, which consists of finding interesting, unexpected, and useful patterns in sequence databases. It has applications in many domains. However, most sequential pattern mining algorithms assume that databases are static, i.e. that they do not change over time. In real-world applications, sequences are often modified, so designing algorithms for updating SPs in a dynamic database environment is an important issue. Although some algorithms have been proposed to maintain SPs in dynamic databases, these algorithms may perform poorly, especially when databases contain long sequences or a large number of sequences. This paper proposes a novel dynamic mining approach named PreFUSP-TREE-MOD to address the problem of maintaining and updating discovered SPs when sequences in a database are modified. The proposed approach adopts the previously proposed pre-large concept, which uses two support thresholds, to avoid scanning the database when possible while updating the set of discovered patterns. Due to the pruning properties of the pre-large concept, the PreFUSP-TREE-MOD maintenance algorithm can effectively reduce the cost of database scans needed to maintain and update the built FUSP-tree for sequence modification. When the number of modified sequences is less than the safety bound of the pre-large concept, the proposed maintenance algorithm outperforms traditional SPM algorithms in batch mode, as well as the state-of-the-art maintenance algorithm, in terms of execution time and number of tree nodes.
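
The two-threshold idea in the abstract above can be illustrated with a minimal Python sketch. It is not the PreFUSP-TREE-MOD algorithm itself. The function names (`classify`, `safety_bound`, `rescan_may_be_needed`) are hypothetical, and the bound formula is an assumption that the abstract does not state: since modification leaves the database size unchanged, a sequence whose count is below the lower threshold can gain at most one occurrence per modified sequence, so it cannot reach the upper threshold while the number of modified sequences stays at or below floor((upper - lower) * database size).

```python
from fractions import Fraction
from math import floor


def classify(count: int, db_size: int, upper: Fraction, lower: Fraction) -> str:
    """Label a 1-sequence by comparing its support count with both thresholds."""
    if count >= upper * db_size:
        return "large"      # frequent: kept in the tree
    if count >= lower * db_size:
        return "prelarge"   # nearly frequent: kept as a buffer
    return "small"          # infrequent: not kept


def safety_bound(db_size: int, upper: Fraction, lower: Fraction) -> int:
    """Assumed bound for sequence modification: floor((upper - lower) * db_size).

    Exact fractions are used so the floor is not disturbed by float rounding.
    """
    return floor((upper - lower) * db_size)


def rescan_may_be_needed(modified_so_far: int, bound: int) -> bool:
    """True once the accumulated number of modified sequences exceeds the bound."""
    return modified_so_far > bound
```

For example, with 100 sequences, an upper threshold of 1/2 and a lower threshold of 2/5, `safety_bound(100, Fraction(1, 2), Fraction(2, 5))` evaluates to 10, so under this assumption up to 10 modified sequences could be processed before a small sequence could become large.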


2013 ◽  
Vol 12 (03) ◽  
pp. 1350024
Author(s):  
R. B. V. Subramanyam ◽  
A. Suresh Rao ◽  
Ramesh Karnati ◽  
Somaraju Suvvari ◽  
D. V. L. N. Somayajulu

Previous studies of closed sequential pattern mining suggested several heuristics and proposed computationally effective techniques, such as bidirectional extension with closure-checking schemes, BackScan search-space pruning, and the ScanSkip optimization used in the BIDE (BI-Directional Extension) algorithm. Inspired by the efficiency of BIDE, many researchers have tried to apply its techniques to various kinds of databases; we too felt that it could be applied to progressive databases. Without tailoring, however, BIDE cannot be applied to dynamic databases. The concept of progressive databases captures the nature of incremental databases by defining parameters such as the Period of Interest (POI) and a user-defined minimum support. The PISA (Progressive mIning Sequential pAttern mining) algorithm was proposed by Huang et al. for finding all sequential patterns over progressive databases. The structure of PISA helps space utilization by limiting the height of the tree to the length of the POI, which also motivates further improvement in this work. In this paper, a tree structure called LCT (Label, Customer-id, and Time stamp) is proposed, together with an approach for mining closed sequential patterns using closure-checking schemes under the progressive database concept. The significance of the LCT structure is that its height is confined to a maximum of two levels. In the algorithmic approach, the window size can be increased by one unit of time. The complexity of the proposed approach is also analysed. The approach is validated using synthetic data sets available on the Internet and shows better performance than existing methods.


2010 ◽  
Vol 58 (Supplement 1) ◽  
pp. 1-5 ◽  
Author(s):  
M. Jolánkai ◽  
F. Nyárai ◽  
K. Kassai

Long-term trials have a twofold role in life sciences, acting as both live laboratories and public collections. Long-term trials are not simply scientific curios or the honoured relics of a museum, but highly valuable live ecological models that can never be replaced or restarted if once terminated or suspended. These trials provide valuable and dynamic databases for solving scientific problems. The present paper is intended to give a brief summary of the crop production aspects of long-term trials.


Author(s):  
Yuri Rogozov ◽  
Alexander Sviridov ◽  
Sergey Kucherov

Author(s):  
Xinming Gao ◽  
Yongshun Gong ◽  
Tiantian Xu ◽  
Jinhu Lu ◽  
Yuhai Zhao ◽  
...  

Author(s):  
Jimmy Ming-Tai Wu ◽  
Qian Teng ◽  
Shahab Tayeb ◽  
Jerry Chun-Wei Lin

High average-utility itemset mining (HAUIM) was established to provide a fairer measure than generic high-utility itemset mining (HUIM) for revealing satisfactory and interesting patterns. In practical applications, databases change dynamically as insertion and deletion operations are performed. Several works have been designed to handle insertion, but fewer studies have focused on deletion for knowledge maintenance. In this paper, we develop the PRE-HAUI-DEL algorithm, which applies the pre-large concept to HAUIM to handle transaction deletion in dynamic databases. The pre-large concept serves as a buffer in HAUIM that reduces the number of database scans when the database is updated, particularly under transaction deletion. Two upper-bound values are also established to prune unpromising candidates early, which reduces the computational cost. The experimental results show that the designed PRE-HAUI-DEL algorithm performs well compared to the Apriori-like model in terms of runtime, memory, and scalability in dynamic databases.

