Equivalence Class Based Parallel Algorithm for Mining MFI

2015 ◽  
Vol 713-715 ◽  
pp. 1712-1715
Author(s):  
Hui Wang

We present PMFI, a parallel algorithm for mining all maximal frequent itemsets from a large database. PMFI uses several techniques to reduce I/O overhead drastically. The key principle is to decompose the search space into prefix-based equivalence classes and to distribute the work among processors according to equivalence class weights. The database is re-represented in vertical format, so frequency counting reduces to simple tid-list intersections. PMFI builds on a serial algorithm, MaxMining, which uses a multiple-level backtrack pruning strategy; by selectively duplicating pieces of the database, each processor can compute maximal frequent itemsets independently. Together these techniques eliminate the need for synchronization. PMFI also applies a dynamic load-balancing scheme, which is expected to improve performance further.
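The ideas named in this abstract (vertical tid-lists, prefix-based equivalence classes, weight-based distribution) can be illustrated with a small sketch. This is not the PMFI implementation: the function names are hypothetical, the class weight is assumed to be the number of class members, and the greedy assignment stands in for whatever scheduling the paper actually uses.

```python
from collections import defaultdict
from itertools import combinations

def to_vertical(transactions):
    """Map each item to the set of transaction ids (tids) that contain it."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)
    return dict(tidlists)

def support(itemset, tidlists):
    """Support of a non-empty itemset = size of the intersection of its tid-lists."""
    return len(set.intersection(*(tidlists[i] for i in itemset)))

def prefix_classes(tidlists, minsup):
    """Group frequent 2-itemsets by their first item (the class prefix)."""
    frequent = sorted(i for i, t in tidlists.items() if len(t) >= minsup)
    classes = defaultdict(list)
    for a, b in combinations(frequent, 2):
        if len(tidlists[a] & tidlists[b]) >= minsup:
            classes[a].append(b)
    return dict(classes)

def assign(classes, n_procs):
    """Greedy schedule: heaviest class first, onto the least-loaded processor.

    Assumption: weight = number of members in the class.
    """
    loads = [0] * n_procs
    plan = [[] for _ in range(n_procs)]
    for prefix, members in sorted(classes.items(), key=lambda kv: (-len(kv[1]), kv[0])):
        p = loads.index(min(loads))
        plan[p].append(prefix)
        loads[p] += len(members)
    return plan

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
tidlists = to_vertical(transactions)
classes = prefix_classes(tidlists, minsup=2)
plan = assign(classes, n_procs=2)
```

Because each class only needs the tid-lists of its own prefix and members, a processor that receives a class can work on it without consulting the others, which is the property the abstract relies on to avoid synchronization.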

2010 ◽  
Vol 26-28 ◽  
pp. 118-122
Author(s):  
Chong Huan Xu ◽  
Chun Hua Ju

Based on the characteristics of data streams and on the sliding window model, this paper proposes A-MFI, a new algorithm for mining maximal frequent itemsets in data streams that uses a self-adjusting, orderly-compound policy. The algorithm works on basic windows: it updates its information from data stream fragments and scans the stream only once, collecting the information and storing it in a frequent itemsets list as the stream flows. Its core idea is to construct a self-adjusting, orderly-compound FP-tree, use mixed subset pruning to reduce the search space, and merge nodes that have equal minsup in the same branch, compressing them to generate the orderly-compound FP-tree so that superset checking is avoided when mining maximal frequent itemsets. The experimental results show that the algorithm has higher efficiency in time and space and scales well.
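The window bookkeeping described here (a sliding window made of basic windows, each stream fragment read once) can be sketched as follows. This covers only single-item counts, not the FP-tree or the pruning of A-MFI; the class name `SlidingWindowCounts` and its methods are hypothetical.

```python
from collections import Counter, deque

class SlidingWindowCounts:
    """Keep per-item counts over the most recent `n_basic_windows` basic windows."""

    def __init__(self, n_basic_windows):
        self.n = n_basic_windows
        self.windows = deque()   # one Counter summary per basic window
        self.counts = Counter()  # item counts over the whole sliding window

    def push(self, basic_window):
        """Add one basic window (a list of transactions), expiring the oldest if full."""
        summary = Counter(item for t in basic_window for item in set(t))
        self.windows.append(summary)
        self.counts.update(summary)
        if len(self.windows) > self.n:
            expired = self.windows.popleft()
            self.counts -= expired  # in-place subtract drops non-positive counts

    def frequent_items(self, minsup):
        return {item for item, c in self.counts.items() if c >= minsup}

w = SlidingWindowCounts(n_basic_windows=2)
w.push([{"a", "b"}, {"a"}])
w.push([{"b"}])
before = w.frequent_items(minsup=2)
w.push([{"b", "c"}])  # the first basic window expires here
after = w.frequent_items(minsup=2)
```

Each transaction is read once when its basic window arrives; expiry only subtracts the stored summary, so the stream itself is never rescanned.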


2018 ◽  
Vol 7 (3.12) ◽  
pp. 157
Author(s):  
D Srinivasa Rao ◽  
V Sucharitha ◽  
K V.V Satyanarayana

Frequent pattern mining is widely used in applications such as supermarkets, diagnostics, and other real-time applications. The performance of an algorithm is measured by its computation cost, and computing frequent patterns is expensive. Many algorithms and techniques have been implemented and studied to obtain high performance, such as Prepost+, which employs the N-list to represent itemsets and discovers frequent itemsets directly using a set-enumeration search tree. Because of its pruning strategy, however, it is known to spend more computation time processing the search space. It enumerates all itemsets in a dataset exhaustively and does not sort them by utility; it only gives statistical evidence of the most recurring itemsets. This paper proposes the Enhanced Ontologies based Alignment Algorithm (EOBAA) to identify, extract, and sort high-utility itemsets (HUIs) from frequent itemsets (FIs). To improve the similarity measure, the proposed system adopts cosine similarity. Experiments conducted on 1 real dataset show the performance of EOBAA in terms of computation time and accuracy.
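For reference, cosine similarity between two vectors is their dot product divided by the product of their Euclidean norms. The sketch below shows only that standard formula; how EOBAA builds the vectors it compares is not stated in the abstract, so the binary item-indicator vectors here are an assumption made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)

# Assumed encoding: one position per item, 1 if the itemset contains it.
itemset_1 = [1, 0, 1]
itemset_2 = [1, 1, 0]
similarity = cosine_similarity(itemset_1, itemset_2)
```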


2015 ◽  
Vol 713-715 ◽  
pp. 1765-1768
Author(s):  
Hui Wang

We present MaxMining, a new algorithm for mining maximal frequent itemsets from large transaction databases. MaxMining combines depth-first traversal with an iterative method. It re-represents the transaction database in vertical tidset format and traverses the search space with effective pruning strategies that reduce the search space dramatically. MaxMining removes all non-maximal frequent itemsets to obtain the exact set of maximal frequent itemsets directly, without enumerating all frequent itemsets step by step from smaller ones. It backtracks directly to the proper ancestor rather than level by level, skipping redundant frequent itemsets. We found that MaxMining finds all maximal frequent itemsets in large databases more effectively than many previously proposed algorithms with ordinary pruning strategies.
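A generic depth-first miner over vertical tidsets gives a feel for the approach. It is a minimal sketch, not MaxMining itself: the function name `mine_maximal` is hypothetical, it recurses rather than iterates, and its only pruning is a subsumption check against maximal itemsets already found, which is simpler than the multi-level backtracking the abstract describes.

```python
def mine_maximal(transactions, minsup):
    """Return the maximal frequent itemsets as a list of frozensets."""
    # Vertical format: item -> set of transaction ids containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    items = sorted(i for i, t in tidsets.items() if len(t) >= minsup)
    maximal = []

    def is_subsumed(itemset):
        return any(itemset <= m for m in maximal)

    def dfs(prefix, prefix_tids, candidates):
        # Prune: every itemset in this subtree is a subset of prefix + candidates,
        # so nothing new can be maximal if a known maximal set already covers it.
        if is_subsumed(prefix | set(candidates)):
            return
        extended = False
        for idx, item in enumerate(candidates):
            new_tids = prefix_tids & tidsets[item]  # tidset intersection
            if len(new_tids) >= minsup:
                extended = True
                dfs(prefix | {item}, new_tids, candidates[idx + 1:])
        # A node with no frequent extension is maximal unless already covered.
        if not extended and prefix and not is_subsumed(prefix):
            maximal.append(prefix)

    dfs(frozenset(), set(range(len(transactions))), items)
    return maximal

transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "d"}, {"a", "d"}, {"b", "d"}]
result = mine_maximal(transactions, minsup=2)
```

Because the traversal visits itemsets in lexicographic order, any frequent superset of a leaf is reached before the leaf, so the subsumption check is enough to keep non-maximal itemsets out of the result.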


2020 ◽  
pp. 1-12
Author(s):  
Wang Li

Linguistics teaching is constrained by various factors, which leads to poor teaching results and makes the teaching process difficult to evaluate. To improve the efficiency of linguistics teaching, this paper uses improved machine learning algorithms to construct an artificial intelligence teaching model for linguistics. Based on the teaching needs of linguistics, the model improves the efficiency of the teaching process and supports teaching evaluation, and the root cause analysis algorithm based on MCTS is optimized. Moreover, drawing on frequent itemset algorithms in data mining, a layered pruning strategy is proposed to further reduce the search space and improve the efficiency of the model. In addition, a comparative teaching experiment is used to study the efficiency of artificial intelligence models in linguistics teaching. The statistical results show that the proposed model has a certain effect.


2014 ◽  
Vol 610 ◽  
pp. 291-295
Author(s):  
Qiang Wu ◽  
Ding We Wu ◽  
Qin Wang ◽  
Shao Min Wen ◽  
Rong Tu

This paper presents a novel algorithm for mining maximal frequent itemsets. In a pre-processing phase, it constructs a digraph representing the frequent 2-itemsets, which play an important role in the performance of data mining. The search for maximal frequent itemsets is then carried out in the digraph. Experiments show that the proposed algorithm is efficient for all types of data.
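The pre-processing phase can be sketched as below. The abstract does not say how edges are oriented or how the search proceeds, so this sketch assumes an edge a -> b for each frequent pair with a < b, and shows only a necessary condition that the digraph provides: an itemset can be frequent only if all of its pairs are edges. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_digraph(transactions, minsup):
    """Adjacency sets with an edge a -> b (a < b) for each frequent 2-itemset {a, b}."""
    pair_counts = Counter()
    for t in transactions:
        for a, b in combinations(sorted(t), 2):
            pair_counts[(a, b)] += 1
    graph = {}
    for (a, b), n in pair_counts.items():
        if n >= minsup:
            graph.setdefault(a, set()).add(b)
    return graph

def passes_pair_check(itemset, graph):
    """Necessary (not sufficient) condition: every pair in the itemset is an edge."""
    return all(b in graph.get(a, ()) for a, b in combinations(sorted(itemset), 2))

transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "d"}, {"a", "d"}, {"b", "d"}]
graph = build_digraph(transactions, minsup=2)
```

Itemsets that fail the pair check can be discarded without counting their support, which is why the frequent 2-itemsets have such an effect on overall mining cost.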

