Load Balancing Approach Parallel Algorithm for Frequent Pattern Mining

Author(s):  
Kun-Ming Yu ◽  
Jiayi Zhou ◽  
Wei Chen Hsiao
2013 ◽  
Vol 2 (1) ◽  
pp. 55-59 ◽  
Author(s):  
Saeid Masoumi ◽  
Raziyeh Tabatabaei ◽  
Mohammad-Reza Feizi-Derakhshi ◽  
Khatereh Tabatabaei

Author(s):  
Kun-Ming Yu ◽  
Sheng-Hui Liu ◽  
Li-Wei Zhou ◽  
Shu-Hao Wu

Frequent pattern mining has been playing an essential role in knowledge discovery and data mining tasks that try to find usable patterns from databases. Efficiency is especially crucial for an algorithm in order to find frequent itemsets from a large database. Numerous methods have been proposed to solve this problem, such as Apriori and FP-growth. These are regarded as fundamental frequent pattern mining methods. In addition, parallel computing architectures, such as an on-cloud platform, a grid system, multi-core and GPU platform, have been popular in data mining. However, most of the algorithms have been proposed without considering the prevalent multi-core architectures. In this study, multi-core architectures were used as well as two high efficiency load balancing parallel data mining methods based on the Apriori algorithm. The main goal of the proposed algorithms was to reduce the massive number of duplicate candidates generated using previous methods. This goal was achieved for, in this detailed experimental study the algorithms performed better than the previous methods. The experimental results demonstrated that the proposed algorithms had dramatically reduced computation time when using more threads. Moreover, the observations showed that the workload was equally balanced among the computing units.


2019 ◽  
Vol 9 (9) ◽  
pp. 1859 ◽  
Author(s):  
Chun-Cheng Lin ◽  
Wei-Ching Li ◽  
Ju-Chin Chen ◽  
Wen-Yu Chung ◽  
Sheng-Hao Chung ◽  
...  

Data mining is a set of methods used to mine hidden information from data. It mainly includes frequent pattern mining, sequential pattern mining, classification, and clustering. Frequent pattern mining is used to discover the correlation among various sets of items within large databases. The rapid upward trend in data size slows the mining of frequent patterns. Numerous studies have attempted to develop algorithms that operate in distributed computing environments to accelerate the mining process. FLR-mining (Fast, Load balancing and Resource efficient mining algorithm) is one of the fastest methods of mining with efficient consideration of load balancing and resources. FLR-mining can automatically determine the appropriate number of computing nodes. However, FLR-mining and existing methods assume that the network bandwidth is constant. In practical distributed and many-task computing systems, this assumption fails because there are packet collisions caused by many mining tasks that run in a simultaneous manner. Therefore, a method that can consider the varying network bandwidth is necessary. In this study, we propose a method that can rapidly mine frequent patterns under the varying network bandwidth. The proposed method can also determine the appropriate number of computing nodes to efficiently utilize computing resources and achieve load balancing. Through empirical evaluation, the proposed method is shown to deliver excellent performance in terms of execution efficiency and load balancing.


Information sharing among the associations is a general development in a couple of zones like business headway and exhibiting. As bit of the touchy principles that ought to be kept private may be uncovered and such disclosure of delicate examples may impacts the advantages of the association that have the data. Subsequently the standards which are delicate must be secured before sharing the data. In this paper to give secure information sharing delicate guidelines are bothered first which was found by incessant example tree. Here touchy arrangement of principles are bothered by substitution. This kind of substitution diminishes the hazard and increment the utility of the dataset when contrasted with different techniques. Examination is done on certifiable dataset. Results shows that proposed work is better as appear differently in relation to various past strategies on the introduce of evaluation parameters.


2011 ◽  
Vol 22 (8) ◽  
pp. 1749-1760
Author(s):  
Yu-Hong GUO ◽  
Yun-Hai TONG ◽  
Shi-Wei TANG ◽  
Leng-Dong WU

Sign in / Sign up

Export Citation Format

Share Document