HOVA-FPPM: Flexible Periodic Pattern Mining in Time Series Databases Using Hashed Occurrence Vectors and Apriori Approach

Finding flexible periodic patterns in a time series database is nontrivial because unimportant events occur irregularly, which makes mining intractable or computationally intensive for large datasets. Various solutions based on Apriori, projection, tree, and other techniques exist to mine these patterns. However, they suffer from unacceptable space and time complexity for several reasons: a constant-size tree structure (i.e., a suffix tree) kept in memory with extra information throughout the mining process, generation of redundant and invalid patterns, a limited range of flexible periodic pattern types that can be mined, and repeated traversal of the tree data structure for pattern discovery. To overcome these issues, we introduce an efficient approach called HOVA-FPPM, based on the Apriori approach with hashed occurrence vectors, to find all types of flexible periodic patterns. We do not rely on a complex tree structure but instead manage the necessary information in a hash table for efficient lookup during the mining process. We measured the performance of our proposed approach and compared the results with the baseline approach, i.e., FPPM. The results show that our approach requires less time and space, regardless of the data size or period value.
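
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of keeping occurrence positions in a hash table instead of a tree. It is not the authors' HOVA-FPPM implementation. The function names, the confidence measure, and the restriction to single-event patterns are all assumptions made for illustration.

```python
from collections import defaultdict


def occurrence_vectors(series):
    """Map each event to the sorted list of positions where it occurs.

    Illustrative only: a plain dict stands in for the hash table
    mentioned in the abstract.
    """
    table = defaultdict(list)
    for pos, event in enumerate(series):
        table[event].append(pos)
    return dict(table)


def single_event_patterns(series, period, min_conf):
    """Return {(event, offset): confidence} for single-event patterns.

    Confidence (an assumed measure) is the fraction of slots
    offset, offset + period, offset + 2 * period, ... that hold the event.
    """
    n = len(series)
    result = {}
    for event, positions in occurrence_vectors(series).items():
        # Count occurrences per offset within the period.
        counts = defaultdict(int)
        for pos in positions:
            counts[pos % period] += 1
        for offset, count in counts.items():
            # Number of positions < n that are congruent to offset mod period.
            slots = (n - offset + period - 1) // period
            conf = count / slots
            if conf >= min_conf:
                result[(event, offset)] = conf
    return result


if __name__ == "__main__":
    print(single_event_patterns("abcabdabc", period=3, min_conf=0.6))
```

In an Apriori-style miner, single-event patterns like these would be the seeds from which longer candidates are joined. That step is omitted here because the abstract gives no details about it.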

Mining Dense Periodic Patterns in Time Series Databases

Temporal and Spatio-Temporal Data Mining ◽

10.4018/978-1-59904-387-6.ch003 ◽

2008 ◽

pp. 44-62

Author(s):

Wynne Hsu ◽

Mong Li Lee ◽

Junmei Wang

Keyword(s):

Time Series ◽

Pattern Mining ◽

Real Life ◽

Search Space ◽

Detection Algorithm ◽

Limited Range ◽

Periodic Pattern ◽

Periodic Patterns ◽

Pruning Strategy ◽

Synthetic Datasets

In this chapter, we describe a new periodicity detection algorithm to efficiently discover short period patterns that may exist in only a limited range of the time series. We refer to these patterns as the dense periodic patterns, where the periodicity is focused on part of the time series. We present a dense periodic pattern mining algorithm called DPMiner to find dense periodic patterns, and design a pruning strategy to limit the search space to the feasible periods. Experimental results on both real-life and synthetic datasets indicate that DPMiner is both scalable and efficient.

Mining Weighted Periodic Patterns by a Weighted Direction Graph Based Approach for Time-Series Databases

Journal of Software ◽

10.17706/jsw.16.6.267-284 ◽

2021 ◽

pp. 267-284

Author(s):

Ye-In Chang ◽

Cheng-An Fu ◽

Jia-Zhen Que

Keyword(s):

Time Series ◽

Data Structure ◽

Data Structures ◽

Processing Time ◽

Pattern Mining ◽

Periodic Pattern ◽

Performance Study ◽

Periodic Patterns ◽

Memory Space ◽

Suffix Trie

Periodic pattern mining in time series databases plays an important part in data mining. However, most existing algorithms consider only the count of each item and ignore its value. To take the value of each item into account when mining periodic patterns in time series databases, Chanda et al. proposed an algorithm called WPPM. Their algorithm first constructs a suffix trie to store the candidate patterns. However, the suffix trie uses too much storage space. In order to decrease the processing time for constructing the data structure, in this paper we propose two data structures to store the candidates. The first data structure is the Weighted Paired Matrix: after scanning the database, we transform it into matrix form, which is then used to build the second data structure. As a result, our algorithm decreases not only the memory usage but also the processing time, because we do not need to spend time constructing so many nodes and edges. Moreover, we also consider the case of incremental mining as the data length increases. Our performance study shows that the proposed algorithm, based on the Weighted Direction Graph, is more efficient than the WPPM algorithm.

FREQUENT CORRELATED PERIODIC PATTERN MINING FOR LARGE VOLUME SET USING TIME SERIES DATA

Journal of Computer Science ◽

10.3844/jcssp.2014.2105.2116 ◽

2014 ◽

Vol 10 (10) ◽

pp. 2105-2116 ◽

Cited By ~ 2

Author(s):

G. M. Karthik ◽

S. Karthik

Keyword(s):

Time Series ◽

Large Volume ◽

Pattern Mining ◽

Time Series Data ◽

Series Data ◽

Periodic Pattern

A FAST ALGORITHM FOR COMPUTING SAMPLE ENTROPY

Advances in Adaptive Data Analysis ◽

10.1142/s1793536911000775 ◽

2011 ◽

Vol 03 (01n02) ◽

pp. 167-186 ◽

Cited By ~ 18

Author(s):

YING JIANG ◽

DONG MAO ◽

YUESHENG XU

Keyword(s):

Time Series ◽

Data Structure ◽

Fast Algorithm ◽

Time Complexity ◽

Computing Time ◽

Sample Entropy ◽

Computational Costs ◽

Tree Data ◽

Input Time ◽

Tree Data Structure

Sample entropy is a widely used tool for quantifying complexity of a biological system. Computing sample entropy directly using its definition requires large computational costs. We propose a fast algorithm based on a k-d tree data structure for computing sample entropy. We prove that the time complexity of the proposed algorithm is [Formula: see text] and its space complexity is O(N log N), where N is the length of the input time series and m is the length of its pattern templates. We present a numerical experiment that demonstrates significant improvement of the proposed algorithm in computing time.
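
For context, the costly direct computation that the abstract refers to can be sketched as follows. This is the straightforward quadratic-time calculation from the usual definition, SampEn = -ln(A / B), and not the k-d tree algorithm proposed in the paper. The function name, the Chebyshev distance, and the choice of a non-strict tolerance test (`<= r`) are assumptions; some formulations use a strict inequality.

```python
import math


def sample_entropy(x, m, r):
    """Direct O(N^2) sample entropy of sequence x.

    m is the template length and r the tolerance. Self-matches are
    excluded by only comparing pairs with i < j.
    """
    n = len(x)

    def count_matches(length):
        # The same n - m starting points are used for both template
        # lengths so that the two counts are comparable.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                dist = max(abs(x[i + k] - x[j + k]) for k in range(length))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching template pairs of length m
    a = count_matches(m + 1)  # matching template pairs of length m + 1
    if a == 0 or b == 0:
        # The ratio is undefined; infinity is returned here by convention.
        return math.inf
    return -math.log(a / b)
```

The nested loop over template pairs is what makes this version expensive for long series.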

Discovering Periodic Patterns in Time Series from Twitter Data Set

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.a2014.119420 ◽

2020 ◽

Vol 9 (4) ◽

pp. 92-102

Keyword(s):

Time Series ◽

Periodic Pattern ◽

Periodic Patterns ◽

Time Intervals ◽

Data Set ◽

Periodic Behavior ◽

Expensive Process ◽

Monotonic Property ◽

Pruning Technique ◽

Regular Patterns

An important class of regularities in a time series is partial periodic patterns. These patterns have key properties such as starting, stopping, and restarting anywhere within a series. Partial periodic patterns are classified into two types: (i) regular patterns, which exhibit periodic behavior throughout a series with some exceptions, and (ii) periodic patterns, which exhibit periodic behavior only for particular time intervals within a series. Past studies on partial periodic search have focused primarily on finding regular patterns. The knowledge pertaining to periodic patterns cannot be ignored, because they provide useful information about seasonal or time-based associations between events. Finding periodic patterns is a non-trivial task for two main reasons. (i) Each periodic pattern is associated with time-based information pertaining to the durations of its periodic appearances in a series. Since this information can vary within and across patterns, obtaining it is challenging. (ii) Because periodic patterns do not satisfy the anti-monotonic property, finding all of them is a computationally expensive process. In this paper, a periodic pattern model is proposed to address the above issues. A Periodic Pattern growth algorithm with an efficient pruning technique is also proposed to discover these patterns. Experimental results show that periodic patterns can be really useful and that our algorithm is noteworthy.
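
To make the notion of behavior that is periodic "only for particular time intervals" concrete, here is a small sketch under assumed definitions. It is not the model or algorithm from the paper. An event is treated as periodic within a run of occurrences whose consecutive gaps do not exceed `max_gap`, and a run is kept only if it has at least `min_occurrences` occurrences. Both parameter names and the function name are hypothetical.

```python
def periodic_intervals(timestamps, max_gap, min_occurrences):
    """Return (start, end, count) for each qualifying run of occurrences.

    A run is a maximal sequence of timestamps in which every consecutive
    gap is at most max_gap (an assumed periodicity criterion).
    """
    intervals = []
    if not timestamps:
        return intervals
    ts = sorted(timestamps)
    start = prev = ts[0]
    count = 1
    for t in ts[1:]:
        if t - prev <= max_gap:
            count += 1
        else:
            # The gap is too large: close the current run and start a new one.
            if count >= min_occurrences:
                intervals.append((start, prev, count))
            start = t
            count = 1
        prev = t
    if count >= min_occurrences:
        intervals.append((start, prev, count))
    return intervals


if __name__ == "__main__":
    print(periodic_intervals([1, 2, 3, 4, 10, 20, 21, 22],
                             max_gap=1, min_occurrences=3))
```

The durations of such runs are the kind of time-based information that can differ within and across patterns.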

The Benefits of Using Prefix Tree Data Structure in Multi-Level Frequent Pattern Mining

2007 2nd International Workshop on Soft Computing Applications ◽

10.1109/sofa.2007.4318326 ◽

2007 ◽

Author(s):

Mirela Pater ◽

Daniela E. Popescu

Keyword(s):

Data Structure ◽

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Prefix Tree ◽

Tree Data ◽

Multi Level ◽

Tree Data Structure

An intelligent prediction system for time series data using periodic pattern mining in temporal databases

Proceedings of the First International Conference on Intelligent Interactive Technologies and Multimedia - IITM '10 ◽

10.1145/1963564.1963592 ◽

2010 ◽

Cited By ~ 4

Author(s):

S. Sridevi ◽

S. Rajaram ◽

C. Swadhikar

Keyword(s):

Time Series ◽

Pattern Mining ◽

Time Series Data ◽

Temporal Databases ◽

Series Data ◽

Periodic Pattern ◽

Prediction System

Effective periodic pattern mining in time series databases

Expert Systems with Applications ◽

10.1016/j.eswa.2012.12.017 ◽

2013 ◽

Vol 40 (8) ◽

pp. 3015-3027 ◽

Cited By ~ 28

Author(s):

Manziba Akanda Nishi ◽

Chowdhury Farhan Ahmed ◽

Md. Samiullah ◽

Byeong-Soo Jeong

Keyword(s):

Time Series ◽

Pattern Mining ◽

Periodic Pattern

Mining Fuzzy Time Interval Periodic Patterns in Smart Home Data

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v8i5.pp3374-3385 ◽

2018 ◽

Vol 8 (5) ◽

pp. 3374

Author(s):

Imam Mukhlash ◽

Desna Yuanda ◽

Mohammad Iqbal

Keyword(s):

Data Mining ◽

Smart Home ◽

Pattern Mining ◽

Technology Development ◽

Sensor Data ◽

Mining Machine ◽

Periodic Pattern ◽

Time Interval ◽

Periodic Patterns ◽

Using Data

A convergence of technologies in data mining, machine learning, and pervasive computing has led to an interest in the development of smart environments that help humans with functions such as monitoring and remote health interventions, activity recognition, and energy saving. The need for technology development is further confirmed by the aging population and the importance of individuals living independently in their own homes. Pattern mining on sensor data from smart homes is widely applied in research that uses data mining. In this paper, we propose a periodic pattern mining method for smart home data that integrates the FP-Growth PrefixSpan algorithm with a fuzzy approach, which we call fuzzy-time-interval periodic pattern mining. Our purpose is to obtain the periodic patterns of activity at various time intervals. The simulation results show that resident activities can be recognized by analyzing the triggered sensor patterns, and they show the impact of minimum support values on the number of fuzzy-time-interval periodic patterns generated. Moreover, the generated fuzzy-time-interval periodic patterns help to find residents' daily habits or anomalies.

Massive Parallelization of the Global Hydrological Model mHM

10.5194/egusphere-egu2020-14396 ◽

2020 ◽

Author(s):

Maren Kaluza ◽

Luis Samaniego ◽

Stephan Thober ◽

Robert Schweppe ◽

Rohini Kumar ◽

...

Keyword(s):

Time Series ◽

High Performance ◽

Hydrological Model ◽

Graph Algorithm ◽

Global Scale ◽

Communication Time ◽

Tree Data ◽

Tree Data Structure ◽

Trivial Graph ◽

Massive Parallelization

Parameter estimation of a global-scale, high-resolution hydrological model requires a powerful supercomputer and an optimized parallelization algorithm. Improving the efficiency of such an implementation is essential to advance hydrological science and to minimize the uncertainty of the major hydrologic fluxes and storages at continental and global scales. Within the ESM project [1], the main transfer-function parameters of the mHM model will be estimated by jointly assimilating evapotranspiration (ET) from FLUXNET, the TWS anomaly from GRACE (NASA) and streamflow time series from 5500 GRDC gauges to achieve this goal. For the parallelization of the objective functions, a hybrid MPI-OpenMP scheme is implemented. While the parallelization into equally sized subdomains for cell-wise computations of fluxes (e.g., ET, TWS) is trivial, cell-to-cell fluxes need to be computed for streamflow routing. For time series datasets, the advanced parallelization algorithm MPI parallelized Decomposition of Forest (MDF) will be used. In this study, we go beyond the standard approach which decomposes the river into tributaries (e.g. the Pfaffenstetter System [2]). We apply a non-trivial graph algorithm to decompose each river-network into a tree data structure with nodes representing subbasin domains of almost equal size [3]. We analyze several aspects affecting the MDF parallelization: (1) the communication time between nodes; (2) buffering data before sending; (3) optimizing total node idle time and total run time; (4) memory imbalance between master processes and other processes. We run the mHM model on the high-performance JUWELS supercomputer at Jülich Supercomputing Center (JSC) where the (routing) code efficiently scales up to ~180 nodes with 96 CPUs each. We discuss different parallelization aspects, including the effect of parameters onto the scaling of MDF and we show the benefits of MDF over a non-parallelized routing module.

[1] https://www.esm-project.net/
[2] http://proceedings.esri.com/library/userconf/proc01/professional/papers/pap1008/p1008.htm
[3] https://meetingorganizer.copernicus.org/EGU2019/EGU2019-8129-1.pdf
