A fast and parallel algorithm for frequent pattern mining from big data in many-task environments

Wei Tee Lin; Chih Ping Chu

doi:10.1504/ijhpcn.2017.084244

A fast and parallel algorithm for frequent pattern mining from big data in many-task environments

International Journal of High Performance Computing and Networking, 2017, Vol 10 (3), pp. 157. doi:10.1504/ijhpcn.2017.084244. Cited By ~ 1

Author(s): Wei Tee Lin; Chih Ping Chu

Keyword(s): Big Data; Parallel Algorithm; Pattern Mining; Frequent Pattern Mining; Frequent Pattern; Task Environments


A fast and parallel algorithm for frequent pattern mining from big data in many-task environments

International Journal of High Performance Computing and Networking, 2017, Vol 10 (3), pp. 157. doi:10.1504/ijhpcn.2017.10005138

Author(s): Wei Tee Lin; Chih Ping Chu

Keyword(s): Big Data; Parallel Algorithm; Pattern Mining; Frequent Pattern Mining; Frequent Pattern; Task Environments


A parallel approach for high utility-based frequent pattern mining in a big data environment

Iran Journal of Computer Science, 2021. doi:10.1007/s42044-021-00083-5

Author(s): Krishna Kumar Mohbey; Sunil Kumar

Keyword(s): Big Data; Pattern Mining; Frequent Pattern Mining; Frequent Pattern; Data Environment; High Utility


Swarm intelligent based online feature selection (OFS) and weighted entropy frequent pattern mining (WEFPM) algorithm for big data analysis

Cluster Computing, 2017, Vol 22 (S5), pp. 11791-11803. doi:10.1007/s10586-017-1489-9

Author(s): S. Gayathri Devi; M. Sabrigiriraj

Keyword(s): Feature Selection; Big Data; Data Analysis; Pattern Mining; Frequent Pattern Mining; Big Data Analysis; Frequent Pattern; Online Feature Selection; Weighted Entropy


A New Parallel Algorithm for Frequent Pattern Mining

Journal of Computational Intelligence and Electronic Systems, 2013, Vol 2 (1), pp. 55-59. doi:10.1166/jcies.2013.1048. Cited By ~ 1

Author(s): Saeid Masoumi; Raziyeh Tabatabaei; Mohammad-Reza Feizi-Derakhshi; Khatereh Tabatabaei

Keyword(s): Parallel Algorithm; Pattern Mining; Frequent Pattern Mining; Frequent Pattern


Big Data Frequent Pattern Mining

Frequent Pattern Mining, 2014, pp. 225-259. doi:10.1007/978-3-319-07821-2_10. Cited By ~ 8

Author(s): David C. Anastasiu; Jeremy Iverson; Shaden Smith; George Karypis

Keyword(s): Big Data; Pattern Mining; Frequent Pattern Mining; Frequent Pattern


An Innovative Framework for Supporting Cognitive-Based Big Data Analytics for Frequent Pattern Mining

2018 IEEE International Conference on Cognitive Computing (ICCC), 2018. doi:10.1109/iccc.2018.00014. Cited By ~ 3

Author(s): Deyu Deng; Carson K. Leung; Bryan H. Wodi; Jialiang Yu; Hao Zhang; ...

Keyword(s): Big Data; Data Analytics; Pattern Mining; Big Data Analytics; Frequent Pattern Mining; Frequent Pattern


Constrained Frequent Pattern Mining from Big Data Via Crowdsourcing

Big Data Applications and Services 2017 - Advances in Intelligent Systems and Computing, 2018, pp. 69-79. doi:10.1007/978-981-13-0695-2_9. Cited By ~ 2

Author(s): Calvin S. H. Hoi; Daniyal Khowaja; Carson K. Leung

Keyword(s): Big Data; Pattern Mining; Frequent Pattern Mining; Frequent Pattern


Frequent Pattern Mining over Unstructured Data using Semi-Structured Doc-Model and Pattern Ranking

International Journal of Scientific Research in Computer Science Engineering and Information Technology, 2020, pp. 36-42. doi:10.32628/cseit206216

Author(s): Sudhir Tirumalasetty; A. Divya; D. Rahitya Lakshmi; Ch. Durga Bhavani; D. Anusha

Keyword(s): Data Mining; Big Data; Pattern Mining; Frequent Pattern Mining; Unstructured Data; Frequent Pattern; Frequent Patterns; Innovative Methods; Mining Algorithms; Doc Model

Frequent pattern mining is an essential data-mining task whose goal is to discover knowledge in the form of repeated patterns. Many efficient pattern-mining algorithms have been developed over the last two decades, yet most do not scale to the data volumes encountered today, the so-called “Big Data”. Scalable parallel algorithms hold the key to solving the problem in this context. This paper reviews recent advances in parallel frequent pattern mining, analysing them through the Big Data lens. Load balancing and work partitioning are the major challenges to be conquered. These challenges continually call for innovative methods, as Big Data keeps growing without limit. A greater challenge than before is finding frequent patterns in unstructured data. To accomplish this, a Semi-Structured Doc-Model and pattern ranking are used.


Load Balancing Approach Parallel Algorithm for Frequent Pattern Mining

Lecture Notes in Computer Science - Parallel Computing Technologies, 2007, pp. 623-631. doi:10.1007/978-3-540-73940-1_63. Cited By ~ 8

Author(s): Kun-Ming Yu; Jiayi Zhou; Wei Chen Hsiao

Keyword(s): Load Balancing; Parallel Algorithm; Pattern Mining; Frequent Pattern Mining; Frequent Pattern


A MapReduce-Based Parallel Frequent Pattern Growth Algorithm for Spatiotemporal Association Analysis of Mobile Trajectory Big Data

Complexity, 2018, Vol 2018, pp. 1-16. doi:10.1155/2018/2818251. Cited By ~ 7

Author(s): Dawen Xia; Xiaonan Lu; Huaqing Li; Wendong Wang; Yantao Li; ...

Keyword(s): Big Data; Association Analysis; Intelligent Transportation Systems; Large Scale; Pattern Mining; Frequent Pattern Mining; Transportation Systems; Frequent Pattern; Trajectory Data; Pattern Growth

Frequent pattern mining is an effective approach for spatiotemporal association analysis of mobile trajectory big data in data-driven intelligent transportation systems. While existing parallel algorithms have been successfully applied to frequent pattern mining of large-scale trajectory data, two major challenges remain: how to overcome the inherent defects of Hadoop when handling taxi trajectory big data made up of massive numbers of small files, and how to discover implicit spatiotemporal frequent patterns with MapReduce. To address these challenges, this paper presents a MapReduce-based Parallel Frequent Pattern growth (MR-PFP) algorithm that analyzes the spatiotemporal characteristics of taxi operations using large-scale taxi trajectories, with strategies for processing massive small files on a Hadoop platform. More specifically, we first implement three methods, that is, Hadoop Archives (HAR), CombineFileInputFormat (CFIF), and Sequence Files (SF), to overcome the existing defects of Hadoop and then propose two strategies based on their performance evaluations. Next, we incorporate SF into the Frequent Pattern growth (FP-growth) algorithm and then implement the optimized FP-growth algorithm on a MapReduce framework. Finally, we analyze the characteristics of taxi operations in both spatial and temporal dimensions with MR-PFP in parallel. The results demonstrate that MR-PFP is superior to the existing Parallel FP-growth (PFP) algorithm in efficiency and scalability.
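
As a rough illustration of how a frequent-pattern step maps onto the map/reduce model mentioned in this abstract, the sketch below counts item supports over transaction shards in plain Python. It is not the authors' MR-PFP implementation and does not use Hadoop. The function names (`map_count`, `reduce_counts`, `frequent_items`), the toy shards, and the `min_support` value are all assumptions made for the example. It covers only the support-counting pass that parallel FP-growth variants typically run before building trees.

```python
from collections import Counter
from functools import reduce

def map_count(shard):
    """Map step: count, per shard, how many transactions contain each item."""
    counts = Counter()
    for transaction in shard:
        counts.update(set(transaction))  # an item counts once per transaction
    return counts

def reduce_counts(left, right):
    """Reduce step: merge two partial count tables."""
    return left + right

def frequent_items(shards, min_support):
    """Return (item, support) pairs with support >= min_support, most frequent first."""
    total = reduce(reduce_counts, map(map_count, shards), Counter())
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(item, n) for item, n in ranked if n >= min_support]

# Hypothetical toy data: two shards of transactions.
shards = [
    [["a", "b", "c"], ["a", "c"]],
    [["a", "d"], ["b", "c"], ["a", "b", "c"]],
]
print(frequent_items(shards, min_support=3))
```

In a real MapReduce job the shards would be input splits and the merge would happen in reducers. Here the built-in `map` and `functools.reduce` stand in for both phases so the data flow stays visible.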
