sequential pattern mining
Recently Published Documents


2022 ◽  
Vol 16 (3) ◽  
pp. 1-26
Author(s):  
Jerry Chun-Wei Lin ◽  
Youcef Djenouri ◽  
Gautam Srivastava ◽  
Yuanfa Li ◽  
Philip S. Yu

High-utility sequential pattern mining (HUSPM) has been an active research topic in recent decades because it combines sequential and utility properties to reveal more information and knowledge than traditional frequent itemset mining or sequential pattern mining. Several HUSPM algorithms have been presented, but most of them hold the data in main memory to speed up mining. This assumption is unrealistic in large-scale environments: in real industrial settings the collected data are very large and cannot fit into the main memory of a single machine. In this article, we first develop a parallel and distributed three-stage MapReduce model for mining high-utility sequential patterns from large-scale databases. Two properties are then developed to guarantee the correctness and completeness of the patterns discovered in the developed framework. In addition, two data structures called sidset and utility-linked list are utilized in the framework to accelerate the computation for mining the required patterns. The results show that the designed model performs well on large-scale datasets in terms of runtime, memory usage, efficiency with respect to the number of distributed nodes, and scalability compared to the serial HUSP-Span approach.
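To make the notion of "utility" in the abstract above concrete, here is a minimal Python sketch of how the utility of a single pattern might be computed over a small quantitative sequence database. It is not the three-stage MapReduce model from the article and does not use the sidset or utility-linked list structures. The function names, the one-item-per-event simplification, and the sample data are assumptions made only for illustration. The utility of a pattern in a sequence is taken as the maximum, over all of its occurrences, of the summed quantity times unit profit.

```python
from typing import Dict, List, Optional, Tuple

Event = Tuple[str, int]  # (item, purchased quantity); one item per event (assumption)


def pattern_utility(sequence: List[Event], pattern: List[str],
                    profit: Dict[str, int]) -> Optional[int]:
    """Maximum utility of `pattern` over all its occurrences in `sequence`.

    Returns None when the pattern does not occur as a subsequence.
    """
    # best[j] = best utility found so far for matching the first j pattern items
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for item, qty in sequence:
        # Walk j downwards so one event is never used for two pattern positions
        for j in range(len(pattern), 0, -1):
            if item == pattern[j - 1] and best[j - 1] is not None:
                candidate = best[j - 1] + qty * profit[item]
                if best[j] is None or candidate > best[j]:
                    best[j] = candidate
    return best[-1]


def database_utility(db: List[List[Event]], pattern: List[str],
                     profit: Dict[str, int]) -> int:
    """Sum of per-sequence utilities over the sequences that contain the pattern."""
    utilities = (pattern_utility(seq, pattern, profit) for seq in db)
    return sum(u for u in utilities if u is not None)


# Hypothetical unit profits and sequences, for illustration only
profit = {"a": 2, "b": 3, "c": 1}
db = [
    [("a", 1), ("b", 2), ("a", 3), ("b", 1)],
    [("b", 1), ("a", 2)],
    [("a", 2), ("c", 1), ("b", 1)],
]
total = database_utility(db, ["a", "b"], profit)
print(total)  # 16: the pattern is "high-utility" for any threshold up to 16
```

In a distributed setting, the per-sequence utilities could be computed independently on each partition and then summed, which is the general reason this kind of computation suits a map and reduce split.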


Author(s):  
Yan Li ◽  
Shuai Zhang ◽  
Lei Guo ◽  
Jing Liu ◽  
Youxi Wu ◽  
...  

Author(s):  
Youxi Wu ◽  
Zhu Yuan ◽  
Yan Li ◽  
Lei Guo ◽  
Philippe Fournier-Viger ◽  
...  

Author(s):  
S Imavathy ◽  
M. Chinnadurai

Nowadays, pattern recognition is a major challenge in the field of data mining. Researchers apply data mining to a wide variety of applications such as market basket analysis, advertising, and medicine. Conventional algorithms all use a transactional database, which is based on the daily usage of objects and/or the performance of patients. The proposed research work uses a sequential pattern mining approach with a classification technique, the Threshold-based Support Vector Machine (T-SVM) learning algorithm. Pattern mining provides variables according to the user’s interest by means of a statistical model. The proposed work is used to analyze gene sequence datasets. Further, the T-SVM technique is used to classify the dataset based on the sequential pattern mining approach. In particular, the threshold-based model is used to predict the upcoming state of interest from sequential patterns, because this gives a deeper understanding of sequential input data and classifies the result using threshold values. The proposed method is therefore reported to be more efficient than the conventional method in terms of achievable classification accuracy, precision, false positive rate, and true positive rate, and it also reduces operating time. The proposed model is implemented in MATLAB R2018a.


2021 ◽  
Vol 11 (22) ◽  
pp. 10683
Author(s):  
Jakkrit Kaewyotha ◽  
Wararat Songpan

Product layout significantly impacts consumer purchase demand in supermarkets. Product shelf renovation is a crucial process that can increase supermarket efficiency. The goals of this research were to develop a sequential pattern mining algorithm for investigating the correlation patterns of product layouts, to solve the numerous problems of shelf design, and to develop an algorithm that considers in-store purchase and shelf profit data in order to improve supermarket efficiency and, consequently, profitability. The authors developed two types of algorithms to reach these goals. The first was a PrefixSpan algorithm, which was used to optimize sequential pattern mining (the PrefixSpan mining approach). The second was a new multi-objective design that considered the objective functions of profit volume and closeness rating using the mutation-based harmony search (MBHS) optimization algorithm, which was used to evaluate the performance of the first algorithm. The experimental results demonstrated that the PrefixSpan algorithm determined correlation rules more efficiently and more accurately than the other algorithms used in the study. Additionally, the authors found that MBHS with the new multi-objective design can effectively find product layout solutions for supermarkets. Finally, the proposed product layout algorithm was found to lead to higher profit volumes and closeness ratings than traditional shelf layouts, and to be more efficient than the other algorithms.
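For readers unfamiliar with PrefixSpan, the following is a minimal Python sketch of the textbook prefix-projection idea, simplified to sequences of single items. It is not the authors' implementation, and the basket data and the `min_support` value are made up for illustration. Each frequent item extends the current prefix, and the search recurses on the suffixes that follow the first occurrence of that item.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def prefixspan(db: List[List[str]], min_support: int) -> Dict[Tuple[str, ...], int]:
    """Return every sequential pattern with support >= min_support.

    Support is the number of sequences that contain the pattern as a subsequence.
    """
    results: Dict[Tuple[str, ...], int] = {}

    def mine(prefix: Tuple[str, ...], projected: List[Tuple[int, int]]) -> None:
        # `projected` holds (sequence index, start offset) pairs: the suffixes
        # that remain after matching `prefix` as early as possible.
        counts: Dict[str, int] = defaultdict(int)
        for sid, start in projected:
            for item in set(db[sid][start:]):  # count each item once per sequence
                counts[item] += 1
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            new_prefix = prefix + (item,)
            results[new_prefix] = support
            new_projected = []
            for sid, start in projected:
                try:
                    pos = db[sid].index(item, start)  # first occurrence in the suffix
                except ValueError:
                    continue  # item absent from this suffix
                new_projected.append((sid, pos + 1))
            mine(new_prefix, new_projected)

    mine((), [(i, 0) for i in range(len(db))])
    return results


# Hypothetical purchase sequences, for illustration only
baskets = [
    ["milk", "bread", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk", "eggs"],
    ["milk", "bread"],
]
patterns = prefixspan(baskets, min_support=3)
print(patterns)
```

With this data, the only frequent pattern of length two is ("milk", "eggs"), which occurs in three of the four sequences. In a shelf layout setting, such a pattern could hint that the two products are often bought in that order.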


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0256329
Author(s):  
Rory Bunker ◽  
Keisuke Fujii ◽  
Hiroyuki Hanada ◽  
Ichiro Takeuchi

Given a set of sequences comprised of time-ordered events, sequential pattern mining is useful to identify frequent subsequences from different sequences or within the same sequence. However, in sport, these techniques cannot determine the importance of particular patterns of play to good or bad outcomes, which is often of greater interest to coaches and performance analysts. In this study, we apply a recently proposed supervised sequential pattern mining algorithm called safe pattern pruning (SPP) to 490 labelled event sequences representing passages of play from one rugby team’s matches in the 2018 Japan Top League season. We obtain patterns that are the most discriminative between scoring and non-scoring outcomes from both the team’s and opposition teams’ perspectives using SPP, and compare these with the most frequent patterns obtained with well-known unsupervised sequential pattern mining algorithms when applied to subsets of the original dataset, split on the label. From our obtained results, line breaks, successful line-outs, regained kicks in play, repeated phase-breakdown play, and failed exit plays by the opposition team were found to be the patterns that discriminated most between the team scoring and not scoring. Opposition team line breaks, errors made by the team, opposition team line-outs, and repeated phase-breakdown play by the opposition team were found to be the patterns that discriminated most between the opposition team scoring and not scoring. It was also found that, probably because of the supervised nature and pruning/safe-screening mechanisms of SPP, compared to the patterns obtained by the unsupervised methods, those obtained by SPP were more sophisticated in terms of containing a greater variety of events, and when interpreted, the SPP-obtained patterns would also be more useful for coaches and performance analysts.
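The difference between frequent and discriminative patterns described above can be illustrated with a short Python sketch. This is not safe pattern pruning (SPP). It only compares how often a candidate pattern occurs in scoring versus non-scoring sequences. The event names and the labelled sequences are invented for illustration.

```python
from typing import List, Sequence, Tuple

LabelledPlay = Tuple[List[str], bool]  # (time-ordered events, did the team score?)


def is_subsequence(pattern: Sequence[str], events: Sequence[str]) -> bool:
    """True if `pattern` occurs in `events` in order, gaps allowed."""
    remaining = iter(events)
    # `event in remaining` consumes the iterator up to the first match,
    # so later pattern events must appear after earlier ones.
    return all(event in remaining for event in pattern)


def support_by_label(pattern: Sequence[str],
                     plays: List[LabelledPlay]) -> Tuple[float, float]:
    """Relative support of `pattern` in (scoring, non-scoring) sequences."""
    hits = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for events, scored in plays:
        totals[scored] += 1
        if is_subsequence(pattern, events):
            hits[scored] += 1
    pos = hits[True] / totals[True] if totals[True] else 0.0
    neg = hits[False] / totals[False] if totals[False] else 0.0
    return pos, neg


# Hypothetical labelled passages of play, for illustration only
plays = [
    (["lineout", "line_break", "phase"], True),
    (["scrum", "line_break"], True),
    (["lineout", "kick", "error"], False),
    (["phase", "phase", "error"], False),
]
print(support_by_label(["line_break"], plays))  # (1.0, 0.0)
print(support_by_label(["lineout"], plays))     # (0.5, 0.5)
```

Here "lineout" is as frequent as any other event overall but does not separate the two outcomes, while "line_break" appears only in scoring sequences. A supervised method looks for patterns of the second kind.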


Author(s):  
Yuehua Wang ◽  
Youxi Wu ◽  
Yan Li ◽  
Fang Yao ◽  
Philippe Fournier-Viger ◽  
...  
