Highly Scalable Sequential Pattern Mining Based on MapReduce Model on the Cloud

Author(s):  
Chun-Chieh Chen ◽  
Chi-Yao Tseng ◽  
Ming-Syan Chen
2022 ◽  
Vol 16 (3) ◽  
pp. 1-26
Author(s):  
Jerry Chun-Wei Lin ◽  
Youcef Djenouri ◽  
Gautam Srivastava ◽  
Yuanfa Li ◽  
Philip S. Yu

High-utility sequential pattern mining (HUSPM) is a hot research topic in recent decades since it combines both sequential and utility properties to reveal more information and knowledge rather than the traditional frequent itemset mining or sequential pattern mining. Several works of HUSPM have been presented but most of them are based on main memory to speed up mining performance. However, this assumption is not realistic and not suitable in large-scale environments since in real industry, the size of the collected data is very huge and it is impossible to fit the data into the main memory of a single machine. In this article, we first develop a parallel and distributed three-stage MapReduce model for mining high-utility sequential patterns based on large-scale databases. Two properties are then developed to hold the correctness and completeness of the discovered patterns in the developed framework. In addition, two data structures called sidset and utility-linked list are utilized in the developed framework to accelerate the computation for mining the required patterns. From the results, we can observe that the designed model has good performance in large-scale datasets in terms of runtime, memory, efficiency of the number of distributed nodes, and scalability compared to the serial HUSP-Span approach.


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1525
Author(s):  
Felipe Vieira ◽  
Cristian Cechinel ◽  
Vinicius Ramos ◽  
Fabián Riquelme ◽  
Rene Noel ◽  
...  

Communicating in social and public environments are considered professional skills that can strongly influence career development. Therefore, it is important to proper train and evaluate students in this kind of abilities so that they can better interact in their professional relationships, during the resolution of problems, negotiations and conflict management. This is a complex problem as it involves corporal analysis and the assessment of aspects that until recently were almost impossible to quantitatively measure. Nowadays, a number of new technologies and sensors have being developed for the capture of different kinds of contextual and personal information, but these technologies were not yet fully integrated inside learning settings. In this context, this paper presents a framework to facilitate the analysis and detection of patterns of students in oral presentations. Four steps are proposed for the given framework: Data collection, Statistical Analysis, Clustering, and Sequential Pattern Mining. Data Collection step is responsible for the collection of students interactions during presentations and the arrangement of data for further analysis. Statistical Analysis provides a general understanding of the data collected by showing the differences and similarities of the presentations along the semester. The Clustering stage segments students into groups according to well-defined attributes helping to observe different corporal patterns of the students. Finally, Sequential Pattern Mining step complements the previous stages allowing the identification of sequential patterns of postures in the different groups. The framework was tested in a case study with data collected from 222 freshman students of Computer Engineering (CE) course at three different times during two different years. The analysis made it possible to segment the presenters into three distinct groups according to their corporal postures. The statistical analysis helped to assess how the postures of the students evolved throughout each year. The sequential pattern mining provided a complementary perspective for data evaluation and helped to observe the most frequent postural sequences of the students. Results show the framework could be used as a guidance to provide students automated feedback throughout their presentations and can serve as background information for future comparisons of students presentations from different undergraduate courses.


2021 ◽  
pp. 1-15
Author(s):  
Youxi Wu ◽  
Yuehua Wang ◽  
Yan Li ◽  
Xingquan Zhu ◽  
Xindong Wu

Author(s):  
Tao Li ◽  
Shuaichi Zhang ◽  
Hui Chen ◽  
Yongjun Ren ◽  
Xiang Li ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document