Research on mining global maximal frequent itemsets for health big data

In association rule mining, all frequent itemsets are found first, and the confidence of each association rule is then calculated from the support of the frequent itemsets. Because every non-empty subset of a frequent itemset is itself frequent, all frequent itemsets can be recovered by finding only the maximal frequent itemsets (MFIs), that is, the frequent itemsets none of whose proper supersets are frequent. In this study, an algorithm named right-hand side expanding (RHSE), which can accurately find all MFIs, was proposed. First, an Expanding Operation was designed which, starting from any given frequent itemset, adds items according to certain rules and forms supersets of the given itemset, all of which are MFIs. Next, this operation was applied with each frequent 1-itemset in turn as the starting point, so that all MFIs were found in the end. Owing to the design of the Expanding Operation, every MFI is found, and the path by which it is found is unique, which avoids redundancy in both time and space complexity. Because it avoids redundant computation while still finding all MFIs, the algorithm runs fast and is applicable to high-dimensional big data with very large numbers of transactions. Finally, a detailed experimental report on 10 open standard transaction sets was given, including results on big data with millions of transactions.
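
The definitions used in this abstract (support, confidence, frequent itemset, MFI) can be illustrated with a minimal sketch. This is not the RHSE algorithm, whose expansion rules are not given here; it is a plain level-wise (Apriori-style) enumeration followed by a superset filter, and the function names, the toy transactions and the absolute support threshold are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support count}."""
    tx = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in tx if itemset <= t)

    items = {i for t in tx for i in t}
    current = [s for s in (frozenset([i]) for i in items)
               if support(s) >= min_support]
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        # Join frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Downward closure: keep a candidate only if all its k-subsets are frequent.
        current = [
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k))
            and support(c) >= min_support
        ]
        k += 1
    return frequent


def maximal_frequent_itemsets(frequent):
    """Frequent itemsets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


def confidence(frequent, antecedent, consequent):
    """conf(A -> B) = support(A union B) / support(A)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical toy data: five transactions over the items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
freq = frequent_itemsets(transactions, min_support=3)
mfis = maximal_frequent_itemsets(freq)
```

With a threshold of 3, the three pairs are the MFIs, and every other frequent itemset (the three single items) is a subset of one of them, which is why storing only the MFIs is enough to recover the full set.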

A novel process-based association rule approach through maximal frequent itemsets for big data processing

Future Generation Computer Systems ◽

10.1016/j.future.2017.08.017 ◽

2018 ◽

Vol 81 ◽

pp. 414-424 ◽

Cited By ~ 3

Author(s):

Zelei Liu ◽

Liang Hu ◽

Chunyi Wu ◽

Yan Ding ◽

Quangang Wen ◽

...

Keyword(s):

Big Data ◽

Data Processing ◽

Association Rule ◽

Frequent Itemsets ◽

Big Data Processing ◽

Rule Approach ◽

Maximal Frequent Itemsets

Explore maximal frequent itemsets for big data pre-processing based on small sample in cloud computing

2016 8th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT) ◽

10.1109/icumt.2016.7765363 ◽

2016 ◽

Author(s):

Gaochao Xu ◽

Yan Ding ◽

Chunyi Wu ◽

Yunan Zhai ◽

Jia Zhao

Keyword(s):

Cloud Computing ◽

Big Data ◽

Frequent Itemsets ◽

Small Sample ◽

Maximal Frequent Itemsets

A mining algorithm for distributed global maximal frequent itemsets based on Sorted SCan-Tree

2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC) ◽

10.1109/imcec.2016.7867533 ◽

2016 ◽

Author(s):

Yulei Huang ◽

Jinhuan Wang ◽

Yan Li ◽

Qing Lin

Keyword(s):

Frequent Itemsets ◽

Mining Algorithm ◽

Maximal Frequent Itemsets

New Policy of Maximal Frequent Itemsets in Data Stream Mining

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.26-28.118 ◽

2010 ◽

Vol 26-28 ◽

pp. 118-122

Author(s):

Chong Huan Xu ◽

Chun Hua Ju

Keyword(s):

Stream Flow ◽

Data Stream ◽

Search Space ◽

Frequent Itemsets ◽

Stream Flows ◽

Algorithm Construct ◽

Maximal Frequent Itemsets ◽

Core Idea ◽

Basic Window ◽

Pruning Techniques

Based on the characteristics of data streams and combined with a sliding window, a new algorithm, A-MFI, is proposed for mining maximal frequent itemsets in data streams; it follows a self-adjusting and orderly-compound policy. Working on a basic window, the algorithm updates its information from fragments of the data stream and scans the stream only once, storing what it gathers in a frequent itemsets list as the stream flows. The core idea of the algorithm is to construct a self-adjusting and orderly-compound FP-tree, use mixed subset pruning techniques to reduce the search space, and merge nodes that have equal minsup in the same branch, compressing them to generate the orderly-compound FP-tree so that superset checking is avoided when mining maximal frequent itemsets. The experimental results show that the algorithm is more efficient in time and space and also scales well.

Mining Maximal Frequent Itemsets for Intrusion Detection

Grid and Cooperative Computing - GCC 2004 Workshops - Lecture Notes in Computer Science ◽

10.1007/978-3-540-30207-0_53 ◽

2004 ◽

pp. 422-429 ◽

Cited By ~ 3

Author(s):

Hui Wang ◽

Qing-Hua Li ◽

Huanyu Xiong ◽

Sheng-Yi Jiang

Keyword(s):

Intrusion Detection ◽

Frequent Itemsets ◽

Maximal Frequent Itemsets

A Hybrid Method for Discovering Maximal Frequent Itemsets

2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery ◽

10.1109/fskd.2008.347 ◽

2008 ◽

Cited By ~ 1

Author(s):

Fu-zan Chen ◽

Min-qiang Li

Keyword(s):

Hybrid Method ◽

Frequent Itemsets ◽

Maximal Frequent Itemsets

Data Mining Itemset of Big Data Using Pre-Processing Based on Mapreduce FrameWork with ETL Tools

APTIKOM Journal on Computer Science and Information Technologies ◽

10.11591/aptikom.j.csit.103 ◽

2017 ◽

Vol 2 (2) ◽

pp. 57-62

Author(s):

Padmanathan Anantharaman ◽

H.V. Ramakrishan

Keyword(s):

Big Data ◽

Clustering Algorithm ◽

Programming Model ◽

Hybrid Approach ◽

Processing Technique ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Frequent Itemset Mining ◽

Itemset Mining ◽

Dataset Size

As data volumes continue to grow, they quickly consume the capacity of data warehouses and application databases, forcing IT organizations into costly upgrades of expensive databases and data warehouse hardware appliances. An enormous amount of data is also being generated through the Internet of Things (IoT) as technologies advance and people use them in day-to-day activities; this data is termed Big Data and has its own characteristics and challenges. Frequent itemset mining algorithms aim to discover frequent itemsets in a transactional database, but as the dataset size increases, traditional frequent itemset mining can no longer handle it. The MapReduce programming model solves the problem of large datasets, but it has a large communication cost, which reduces execution efficiency. This work proposes a new pre-processing technique in which k-means is applied before the BigFIM algorithm. The resulting ClustBigFIM uses a hybrid approach: it clusters huge datasets with the k-means algorithm and then mines frequent itemsets from the generated clusters with Apriori and Eclat under the MapReduce programming model. The results show that the execution efficiency of ClustBigFIM is increased by applying k-means clustering as a pre-processing step before BigFIM.
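
Of the building blocks named in this abstract, Eclat is the easiest to show in a few lines. The following is a minimal single-machine sketch of Eclat's vertical (transaction-id list) intersection, not the ClustBigFIM pipeline itself: the k-means step and the MapReduce distribution are omitted, and the function name, the toy data and the absolute support threshold are assumptions.

```python
def eclat(transactions, min_support):
    """Depth-first Eclat: returns {itemset: support count}."""
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidlists = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidlists.setdefault(item, set()).add(tid)

    frequent = {}

    def recurse(prefix, items):
        # items: list of (item, tidlist of prefix + item), all already frequent.
        for idx, (item, tids) in enumerate(items):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            suffix = []
            for other, other_tids in items[idx + 1:]:
                # Support of the extended itemset is the size of the intersection.
                common = tids & other_tids
                if len(common) >= min_support:
                    suffix.append((other, common))
            recurse(itemset, suffix)

    start = sorted(
        ((i, t) for i, t in tidlists.items() if len(t) >= min_support),
        key=lambda pair: pair[0],
    )
    recurse(frozenset(), start)
    return frequent


# Hypothetical toy data: five transactions over the items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = eclat(transactions, min_support=3)
```

Because support is computed by intersecting id sets rather than rescanning the transactions, each partition (or cluster) of the data can be mined independently, which is presumably what makes the approach attractive in a MapReduce setting.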

Efficiently Mining Maximal Frequent Itemsets Based on Digraph

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.610.291 ◽

2014 ◽

Vol 610 ◽

pp. 291-295

Author(s):

Qiang Wu ◽

Ding We Wu ◽

Qin Wang ◽

Shao Min Wen ◽

Rong Tu

Keyword(s):

Data Mining ◽

Frequent Itemsets ◽

Maximal Frequent Itemsets ◽

Novel Algorithm

In this paper, a novel algorithm for mining maximal frequent itemsets is presented. It has a pre-processing phase in which a digraph is constructed; the digraph represents the frequent 2-itemsets, which play an important role in the performance of data mining. The search for maximal frequent itemsets is then carried out in the digraph. Experiments show that the proposed algorithm is efficient for all types of data.
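
One way to picture the idea of searching a digraph of frequent 2-itemsets is sketched below. The abstract does not describe the authors' search procedure, so this is only an assumed illustration: edges run from the smaller to the larger item of each frequent pair, a depth-first search extends an itemset only along edges shared by all of its members, and a final superset filter keeps the maximal results. All names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_digraph(transactions, min_support):
    """Nodes are frequent items; edge i -> j (i < j) for each frequent pair."""
    single, pair = Counter(), Counter()
    for t in transactions:
        items = sorted(set(t))
        single.update(items)
        pair.update(combinations(items, 2))
    graph = {i: set() for i, c in single.items() if c >= min_support}
    for (i, j), c in pair.items():
        if c >= min_support:
            graph[i].add(j)
    return graph


def mine_maximal(transactions, min_support):
    tx = [frozenset(t) for t in transactions]
    graph = build_digraph(tx, min_support)
    found = []

    def support(itemset):
        return sum(1 for t in tx if itemset <= t)

    def extend(itemset, candidates):
        # candidates: successors shared by every item already in the itemset.
        extended = False
        for j in sorted(candidates):
            bigger = itemset | {j}
            if support(bigger) >= min_support:
                extended = True
                extend(bigger, candidates & graph[j])
        if not extended:
            found.append(itemset)

    for i in sorted(graph):
        extend(frozenset([i]), set(graph[i]))
    # Drop itemsets that are contained in a larger one found on another path.
    return {s for s in found if not any(s < other for other in found)}


# Hypothetical toy data: five transactions over the items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
maximal = mine_maximal(transactions, min_support=3)
```

The graph prunes the search because an item can join an itemset only if it forms a frequent pair with every member, so most infrequent candidates are never counted against the transactions.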

An Efficient Algorithm for Privacy Preserving Maximal Frequent Itemsets Mining

2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming ◽

10.1109/paap.2011.62 ◽

2011 ◽

Author(s):

Yuqing Miao ◽

Xiaohua Zhang ◽

Kongling Wu ◽

Jie Su

Keyword(s):

Efficient Algorithm ◽

Privacy Preserving ◽

Frequent Itemsets ◽

Frequent Itemsets Mining ◽

Maximal Frequent Itemsets
