Exploring Decomposition for Solving Pattern Mining Problems

Youcef Djenouri; Jerry Chun-Wei Lin; Kjetil Nørvåg; Heri Ramampiaro; Philip S. Yu

doi:10.1145/3439771

ACM Transactions on Management Information Systems ◽

10.1145/3439771 ◽

2021 ◽

Vol 12 (2) ◽

pp. 1-36

Author(s):

Youcef Djenouri ◽

Jerry Chun-Wei Lin ◽

Kjetil Nørvåg ◽

Heri Ramampiaro ◽

Philip S. Yu

Keyword(s):

Pattern Mining ◽

Good Accuracy ◽

Memory Usage ◽

Clustering Techniques ◽

Mining Technique ◽

Mining Algorithm ◽

Transaction Database ◽

Highly Correlated ◽

Mining Algorithms ◽

Gpu Implementation

This article introduces a highly efficient pattern mining technique called Clustering-based Pattern Mining (CBPM). This technique discovers relevant patterns by studying the correlation between transactions in the transaction database based on clustering techniques. The set of transactions is first clustered, such that highly correlated transactions are grouped together. Next, we derive the relevant patterns by applying a pattern mining algorithm to each cluster. We present two different pattern mining algorithms, one applying an approximation-based strategy and another based on an exact strategy. The approximation-based strategy takes into account only the clusters, whereas the exact strategy takes into account both clusters and shared items between clusters. To boost the performance of CBPM, a GPU-based implementation is investigated. To evaluate the CBPM framework, we perform extensive experiments on several pattern mining problems. The results from the experimental evaluation show that CBPM reduces both runtime and memory usage. Also, CBPM based on the approximate strategy provides good accuracy, demonstrating its effectiveness and feasibility. Our GPU implementation achieves a significant speedup of up to 552× on a single GPU using big transaction databases.
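The cluster-then-mine idea described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the greedy Jaccard-based clustering (`leader_cluster`), the brute-force miner (`frequent_itemsets`), the per-cluster absolute threshold `min_count`, and the function names themselves. It mimics only the approximation-based strategy, in which each cluster is mined on its own and items shared between clusters are ignored.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two item sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def leader_cluster(transactions, threshold):
    """Greedy one-pass clustering (a stand-in, not the paper's method):
    a transaction joins the first cluster whose leader is similar enough,
    otherwise it starts a new cluster."""
    clusters = []  # list of (leader, members) pairs
    for t in transactions:
        for leader, members in clusters:
            if jaccard(leader, t) >= threshold:
                members.append(t)
                break
        else:
            clusters.append((t, [t]))
    return [members for _, members in clusters]


def frequent_itemsets(transactions, min_count):
    """Brute-force frequent itemset mining; suitable for tiny inputs only."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for cand in combinations(items, k):
            itemset = frozenset(cand)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_count:
                result[itemset] = count
                found = True
        if not found:
            break  # no frequent k-itemset means no frequent (k+1)-itemset
    return result


def mine_by_cluster(transactions, threshold, min_count):
    """Mine each cluster independently and sum the per-cluster counts."""
    merged = {}
    for cluster in leader_cluster(transactions, threshold):
        for itemset, count in frequent_itemsets(cluster, min_count).items():
            merged[itemset] = merged.get(itemset, 0) + count
    return merged
```

Because an itemset is counted only in clusters where it is locally frequent, the merged counts in this sketch are lower bounds on the true support, which is one way to see why a cluster-only strategy is approximate.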

Improved Strategy for High-Utility Pattern Mining Algorithm

Mathematical Problems in Engineering ◽

10.1155/2020/1971805 ◽

2020 ◽

Vol 2020 ◽

pp. 1-11

Author(s):

Le Wang ◽

Shui Wang ◽

Haiyan Li ◽

Chunliang Zhou

Keyword(s):

Pattern Mining ◽

State Of The Art ◽

Search Space ◽

Research Topics ◽

Main Research ◽

Mining Algorithm ◽

Temporal Efficiency ◽

High Utility ◽

High Utility Patterns ◽

Mining Algorithms

High-utility pattern mining is a research hotspot in the field of pattern mining, and one of its main research topics is how to improve the efficiency of the mining algorithm. Based on a study of state-of-the-art high-utility pattern mining algorithms, this paper proposes an improved strategy that removes noncandidate items from the global header table and local header table as early as possible, thus reducing the search space and improving the efficiency of the algorithm. The proposed strategy is applied to the algorithm EFIM (EFficient high-utility Itemset Mining). Experimental verification was carried out on nine typical datasets (including two large datasets); the results show that our strategy can effectively improve temporal efficiency for mining high-utility patterns.
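As a rough illustration of what removing noncandidate items early can look like, the sketch below applies the standard transaction-weighted utility (TWU) bound: an item whose TWU is below the minimum utility cannot belong to any high-utility itemset, so it can be dropped before the search starts. This is a generic pruning step, not the header-table strategy proposed in the paper, and the function names and the dictionary-based transaction format (item mapped to its utility in that transaction) are assumptions.

```python
def transaction_utility(tx):
    """Total utility of one transaction (sum of its item utilities)."""
    return sum(tx.values())


def twu(db):
    """Transaction-weighted utility of each item: the summed utility of
    every transaction that contains the item."""
    totals = {}
    for tx in db:
        tu = transaction_utility(tx)
        for item in tx:
            totals[item] = totals.get(item, 0) + tu
    return totals


def prune_low_twu(db, min_util):
    """Drop items whose TWU is below min_util from every transaction."""
    totals = twu(db)
    keep = {item for item, value in totals.items() if value >= min_util}
    return [{i: u for i, u in tx.items() if i in keep} for tx in db]
```

Pruning before the search shrinks both the transactions and the set of items that the miner has to extend, which is where the reduction in search space comes from.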

An Application of Improved Gap-BIDE Algorithm for Discovering Access Patterns

Applied Computational Intelligence and Soft Computing ◽

10.1155/2012/593147 ◽

2012 ◽

Vol 2012 ◽

pp. 1-7 ◽

Cited By ~ 1

Author(s):

Xiuming Yu ◽

Meijing Li ◽

Taewook Kim ◽

Seon-phil Jeong ◽

Keun Ho Ryu

Keyword(s):

Pattern Mining ◽

Sequential Pattern ◽

Access Pattern ◽

Large Database ◽

Log Data ◽

Web Log ◽

Mining Algorithm ◽

User Access ◽

Access Patterns ◽

Mining Algorithms

Discovering access patterns from web log data is a typical sequential pattern mining application, and many access pattern mining algorithms have been proposed. In this paper, we propose an improved version of the Gap-BIDE algorithm to extract user access patterns from web log data. Compared with the previous Gap-BIDE algorithm, the proposed algorithm adds a step that builds a large event set: before generating candidate patterns, it finds the frequent events by discarding the infrequent events that do not occur continuously within an access time. In the experiment, we compare the previous access pattern mining algorithm with the proposed one; the results show that our approach is very efficient at discovering access patterns in large databases.

An Efficient Incremental Mining Algorithm for Discovering Sequential Pattern in Wireless Sensor Network Environments

Sensors ◽

10.3390/s19010029 ◽

2018 ◽

Vol 19 (1) ◽

pp. 29 ◽

Cited By ~ 2

Author(s):

Xin Lyu ◽

Hongxu Ma

Keyword(s):

Pattern Mining ◽

Sequential Pattern ◽

Sensor Data ◽

Incremental Algorithm ◽

Wireless Sensor ◽

Incremental Mining ◽

Mining Algorithm ◽

Intelligent Decision ◽

Mining Algorithms

Wireless sensor networks (WSNs) are an important type of network for sensing the environment and collecting information. They can be deployed in almost every type of environment in the real world, providing a reliable and low-cost solution for management. Huge amounts of data are produced by WSNs all the time, and it is important to process and analyze these data effectively to support intelligent decision-making and management. However, the new characteristics of sensor data, such as rapid growth and frequent updates, bring new challenges to the mining algorithms, especially given the time constraints for intelligent decision-making. In this work, an efficient incremental mining algorithm for discovering sequential patterns (novel incremental algorithm, NIA) is proposed, in order to enhance the efficiency of the whole mining process. First, a reasoned proof is given to demonstrate how to update the frequent sequences incrementally, and the mining space is greatly narrowed based on the proof. Second, an improvement is made on PrefixSpan, which is a classic sequential pattern mining algorithm with a high-complexity recursive process. The improved algorithm, named PrefixSpan+, utilizes a mapping structure to extend the prefixes to sequential patterns, making the mining step more efficient. Third, a fast support-counting algorithm is presented to choose frequent sequences from the potential frequent sequences. A reticular tree is constructed to store all the potential frequent sequences according to the subordinate relations between them, and the support degree can then be calculated efficiently without scanning the original database repeatedly. NIA is compared with various kinds of mining algorithms via intensive experiments on real monitoring datasets, benchmarking datasets, and synthetic datasets, covering time cost, sensitivity to factors, and space cost. The results show that NIA performs better than the existing methods.
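For readers unfamiliar with the baseline that this abstract builds on, the following is a minimal sketch of classic PrefixSpan for sequences of single items. It is not PrefixSpan+ or NIA: it has no mapping structure, no incremental update, and no reticular tree. The function name `prefixspan` and the representation of a projected database as (sequence index, start offset) pairs are choices made here for illustration.

```python
def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for sequences of single items."""
    results = {}

    def mine(prefix, projected):
        # Count each item at most once per projected suffix.
        counts = {}
        for sid, start in projected:
            seen = set()
            for item in sequences[sid][start:]:
                if item not in seen:
                    seen.add(item)
                    counts[item] = counts.get(item, 0) + 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Project each suffix past the first occurrence of the item.
            next_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        next_projected.append((sid, pos + 1))
                        break
            mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results
```

The recursion grows one prefix at a time and re-scans the projected suffixes at every level, which is the recursive cost that the abstract says PrefixSpan+ aims to reduce.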

Supervised sequential pattern mining of event sequences in sport to identify important patterns of play: An application to rugby union

PLoS ONE ◽

10.1371/journal.pone.0256329 ◽

2021 ◽

Vol 16 (9) ◽

pp. e0256329

Author(s):

Rory Bunker ◽

Keisuke Fujii ◽

Hiroyuki Hanada ◽

Ichiro Takeuchi

Keyword(s):

Pattern Mining ◽

Sequential Pattern Mining ◽

Rugby Union ◽

Sequential Pattern ◽

Frequent Patterns ◽

Event Sequences ◽

Mining Algorithm ◽

Original Dataset ◽

Mining Algorithms

Given a set of sequences comprised of time-ordered events, sequential pattern mining is useful to identify frequent subsequences from different sequences or within the same sequence. However, in sport, these techniques cannot determine the importance of particular patterns of play to good or bad outcomes, which is often of greater interest to coaches and performance analysts. In this study, we apply a recently proposed supervised sequential pattern mining algorithm called safe pattern pruning (SPP) to 490 labelled event sequences representing passages of play from one rugby team’s matches in the 2018 Japan Top League season. We obtain patterns that are the most discriminative between scoring and non-scoring outcomes from both the team’s and opposition teams’ perspectives using SPP, and compare these with the most frequent patterns obtained with well-known unsupervised sequential pattern mining algorithms when applied to subsets of the original dataset, split on the label. From our obtained results, line breaks, successful line-outs, regained kicks in play, repeated phase-breakdown play, and failed exit plays by the opposition team were found to be the patterns that discriminated most between the team scoring and not scoring. Opposition team line breaks, errors made by the team, opposition team line-outs, and repeated phase-breakdown play by the opposition team were found to be the patterns that discriminated most between the opposition team scoring and not scoring. It was also found that, probably because of the supervised nature and pruning/safe-screening mechanisms of SPP, compared to the patterns obtained by the unsupervised methods, those obtained by SPP were more sophisticated in terms of containing a greater variety of events, and when interpreted, the SPP-obtained patterns would also be more useful for coaches and performance analysts.

RAKING: An Efficient K-Maximal Frequent Pattern Mining Algorithm on Uncertain Graph Database

Chinese Journal of Computers ◽

10.3724/sp.j.1016.2010.01387 ◽

2010 ◽

Vol 33 (8) ◽

pp. 1387-1395 ◽

Cited By ~ 4

Author(s):

Meng HAN ◽

Wei ZHANG ◽

Jian-Zhong LI

Keyword(s):

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Graph Database ◽

Uncertain Graph ◽

Mining Algorithm ◽

Maximal Frequent Pattern

Customized frequent patterns mining algorithms for enhanced Top-Rank-K frequent pattern mining

Expert Systems with Applications ◽

10.1016/j.eswa.2020.114530 ◽

2021 ◽

Vol 169 ◽

pp. 114530

Author(s):

Areej Ahmad Abdelaal ◽

Sa'ed Abed ◽

Mohammad Al-Shayeji ◽

Mohammad Allaho

Keyword(s):

Pattern Mining ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Frequent Patterns ◽

Mining Algorithms

An Efficient Weighted Negative Sequence Pattern Mining Algorithm with Multiple Minimum Support

Advances in Intelligent Systems and Computing - International Conference on Applications and Techniques in Cyber Intelligence ATCI 2019 ◽

10.1007/978-3-030-25128-4_58 ◽

2019 ◽

pp. 460-469

Author(s):

Dongyuan Wang ◽

He Jiang ◽

Aixin Yang

Keyword(s):

Pattern Mining ◽

Sequence Pattern ◽

Negative Sequence ◽

Minimum Support ◽

Mining Algorithm

Network Alarm Flood Pattern Mining Algorithm Based on Multi-dimensional Association

Proceedings of the 21st ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems - MSWIM '18 ◽

10.1145/3242102.3242130 ◽

2018 ◽

Author(s):

Xudong Zhang ◽

Yuebin Bai ◽

Peng Feng ◽

Weitao Wang ◽

Shuai Liu ◽

...

Keyword(s):

Pattern Mining ◽

Mining Algorithm

Sequential Pattern Mining Algorithm Based on Interestingness

2018 1st International Cognitive Cities Conference (IC3) ◽

10.1109/ic3.2018.00024 ◽

2018 ◽

Author(s):

Tao Li ◽

Shuaichi Zhang ◽

Hui Chen ◽

Yongjun Ren ◽

Xiang Li ◽

...

Keyword(s):

Pattern Mining ◽

Sequential Pattern Mining ◽

Sequential Pattern ◽

Mining Algorithm

Attack pattern mining algorithm based on security log

2017 IEEE International Conference on Intelligence and Security Informatics (ISI) ◽

10.1109/isi.2017.8004918 ◽

2017 ◽

Cited By ~ 4

Author(s):

Keyi Li ◽

Yang Li ◽

Jianyi Liu ◽

Ru Zhang ◽

Xi Duan

Keyword(s):

Pattern Mining ◽

Attack Pattern ◽

Mining Algorithm