Large-Scale Loop Detector Troubleshooting Using Clustering and Association Rule Mining

Data mining is a broad term for methods that discover patterns in data. One commonly studied kind of pattern is the association rule: a probabilistic rule stating that objects satisfying description A also satisfy description B with a certain support and confidence. In this study, we first use first-order predicate logic to represent knowledge derived from celestial spectra data. Next, we propose the concept of a constrained frequent pattern tree (CFP), along with an algorithm for constructing CFPs, with the aim of improving the efficiency and pertinence of association rule mining. The experimental results show that applying this method to association rule mining is feasible and valuable, and that the improved algorithm can greatly reduce the amount of related computation and improve efficiency. Finally, simulation results for knowledge acquisition in fault diagnosis also show the validity of the CFP algorithm.
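As an illustration of the support and confidence measures mentioned in the abstract above, here is a minimal sketch in Python. It is not the CFP algorithm from the paper; the helper names (`count`, `support`, `confidence`) and the toy transactions are assumptions made for this example only.

```python
from typing import Iterable, List, Set


def count(transactions: List[Set[str]], itemset: Iterable[str]) -> int:
    """Number of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of all transactions that contain `itemset`."""
    return count(transactions, itemset) / len(transactions)


def confidence(transactions: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Among transactions containing A, the fraction that also contain B."""
    a = frozenset(antecedent)
    n_a = count(transactions, a)
    if n_a == 0:
        return 0.0  # rule A -> B is undefined when A never occurs
    return count(transactions, a | frozenset(consequent)) / n_a


# Hypothetical toy data, invented for this example.
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "D"},
]

rule_support = support(transactions, {"A", "B"})          # 3 of 5 transactions
rule_confidence = confidence(transactions, {"A"}, {"B"})  # 3 of the 4 with A
```

In this toy data the rule A -> B holds in 3 of 5 transactions (support 0.6) and in 3 of the 4 transactions that contain A (confidence 0.75).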


Research on Association Rule Mining Algorithm Based on Distributed Data

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.998-999.899 ◽ 2014 ◽ Vol 998-999 ◽ pp. 899-902 ◽ Cited By ~ 1

Author(s): Cheng Luo ◽ Ying Chen

Keyword(s): Data Mining ◽ Association Rule ◽ Association Rule Mining ◽ Large Scale ◽ Frequent Itemsets ◽ Network Communication ◽ Data Mining Algorithm ◽ Distributed Data ◽ Rule Mining ◽ Mining Algorithm

Most existing data mining algorithms operate in a centralized environment, but large-scale databases usually exist in distributed form. The distributed data mining algorithm FDM and its improved variants suffer from two problems: frequent itemsets can be lost, and network communication costs too much. This paper proposes an association rule mining algorithm based on distributed data (ARADD). ARADD includes a mapping mark array mechanism, which both preserves the integrity of the frequent itemsets and reduces the cost of network communication. Experiments demonstrate the efficiency of the algorithm.
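The general idea of mining frequent itemsets over distributed data can be sketched as follows. This is a naive count-exchange baseline, not ARADD and not FDM: each site counts itemsets locally, and a coordinator sums the counts and applies a global threshold. The abstract does not describe the mapping mark array mechanism, so it is not reproduced here. The function names and the two toy sites are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set, Tuple


def local_counts(transactions: List[Set[str]], k: int) -> Counter:
    """Count every k-itemset that occurs in one site's transactions."""
    counts: Counter = Counter()
    for t in transactions:
        for combo in combinations(sorted(t), k):
            counts[combo] += 1
    return counts


def global_frequent(sites: List[List[Set[str]]], k: int,
                    min_support: float) -> Dict[Tuple[str, ...], int]:
    """Sum local counts from all sites and keep globally frequent k-itemsets."""
    total: Counter = Counter()
    n_transactions = 0
    for site in sites:
        # In a real deployment, only these counts would cross the network.
        total.update(local_counts(site, k))
        n_transactions += len(site)
    threshold = min_support * n_transactions
    return {items: c for items, c in total.items() if c >= threshold}


# Hypothetical data held at two sites.
site_a = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}]
site_b = [{"milk", "butter"}, {"bread", "milk"}, {"milk"}]

frequent = global_frequent([site_a, site_b], k=2, min_support=0.5)
```

Because every site ships the count of every itemset it sees, this baseline never loses a frequent itemset, but it pays the highest possible communication cost. Reducing that cost without losing itemsets is the trade-off the abstract says ARADD addresses.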


Dynamic load balancing of large-scale distributed association rule mining

2011 IEEE International Conference on Computer Applications and Industrial Electronics (ICCAIE) ◽ 10.1109/iccaie.2011.6162196 ◽ 2011

Author(s): Raja Tlili ◽ Yahya Slimani

Keyword(s): Load Balancing ◽ Dynamic Load ◽ Association Rule ◽ Association Rule Mining ◽ Large Scale ◽ Dynamic Load Balancing ◽ Rule Mining ◽ Distributed Association


Human Resource Allocation Based on Fuzzy Data Mining Algorithm

Complexity ◽ 10.1155/2021/9489114 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-11

Author(s): You Wu ◽ Zheng Wang ◽ Shengqi Wang

Keyword(s): Data Mining ◽ Human Resource ◽ Business Process ◽ Association Rule ◽ Association Rule Mining ◽ Large Scale ◽ Data Mining Algorithm ◽ Rule Mining ◽ Mining Algorithm ◽ Database Technology

Data mining is currently a frontier research topic in information and database technology, and it is recognized as one of the most promising key technologies. It draws on multiple fields, such as mathematical statistics, fuzzy theory, neural networks, and artificial intelligence, so it is technically demanding and difficult to implement. In this article, we study the basic concepts, processes, and algorithms of association rule mining. To improve mining efficiency in large-scale database applications, we propose an incremental association rule mining algorithm based on clustering, that is, using fast clustering. First, the feasibility of mining performance appraisal data is studied. Then, the business process that the information system requires is analyzed, and the related process links and corresponding data input interfaces are designed. Next, the data flow for processing is designed, including the data foundation and the database model. With efficient large-scale database mining as the goal, database development tools are used to implement the specific system settings and program design of the algorithm. Incorporated into the human resource management system of colleges and universities, the algorithm carried out association broadcasting successfully, supported visualization, and ultimately discovered valuable information.


Privacy-preserving association rule mining in large-scale distributed system

IEEE International Symposium on Cluster Computing and the Grid, 2004. CCGrid 2004. ◽ 10.1109/ccgrid.2004.1336595 ◽ 2004 ◽ Cited By ~ 9

Author(s): A. Schuster ◽ R. Wolff ◽ B. Gilburd

Keyword(s): Distributed System ◽ Association Rule ◽ Association Rule Mining ◽ Large Scale ◽ Privacy Preserving ◽ Rule Mining


MR-ARM: A Map-Reduce Association Rule Mining Framework

Parallel Processing Letters ◽ 10.1142/s0129626413500126 ◽ 2013 ◽ Vol 23 (03) ◽ pp. 1350012 ◽ Cited By ~ 8

Author(s): Fadi Thabtah ◽ Suhel Hammoud

Keyword(s): Association Rule ◽ Association Rule Mining ◽ Large Scale ◽ Large Data ◽ Distributed Applications ◽ Ease Of Use ◽ Map Reduce ◽ Rule Mining ◽ Training Time ◽ Frequent Items

Association rule mining is one of the primary tasks in data mining; it discovers correlations among items in a transactional database. Most vertical and horizontal association rule mining algorithms have been developed to improve the frequent item discovery step, which places high demands on training time and memory usage, particularly when the input database is very large. In this paper, we address the problem of mining very large data by proposing a new parallel Map-Reduce (MR) association rule mining technique called MR-ARM, which uses a hybrid data transformation format to quickly find frequent items and generate rules. The MR programming paradigm is becoming popular for large-scale, data-intensive distributed applications because of its efficiency, simplicity, and ease of use, and the proposed algorithm therefore develops a fast, parallel, distributed batch set intersection method for finding frequent items. Two implementations (Weka, Hadoop) of the proposed MR association rule algorithm have been developed, and a number of experiments on small, medium, and large data collections have been conducted. The comparisons are based on the time the algorithm requires for data initialisation, frequent item discovery, rule generation, etc. The results show that MR-ARM is a very useful tool for mining association rules from large datasets in a distributed environment.
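The set intersection idea behind vertical mining can be sketched in a few lines of Python. This is a single-process illustration of intersecting transaction-ID sets, not the MR-ARM implementation: the abstract does not specify the hybrid data format or the Map-Reduce job structure, so neither is reproduced. The function names and toy data are assumptions for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def to_vertical(transactions: List[Set[str]]) -> Dict[str, Set[int]]:
    """Convert a horizontal database into item -> set of transaction IDs."""
    tidsets: Dict[str, Set[int]] = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def frequent_pairs(tidsets: Dict[str, Set[int]],
                   min_count: int) -> Dict[Tuple[str, str], int]:
    """Find frequent 2-itemsets by intersecting the tidsets of frequent items."""
    # Only frequent single items can be part of a frequent pair.
    items = sorted(i for i, tids in tidsets.items() if len(tids) >= min_count)
    result: Dict[Tuple[str, str], int] = {}
    for idx, a in enumerate(items):
        for b in items[idx + 1:]:
            common = tidsets[a] & tidsets[b]  # transactions containing both
            if len(common) >= min_count:
                result[(a, b)] = len(common)
    return result


# Hypothetical toy data, invented for this example.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"b", "c"},
    {"a", "b", "c"},
]

vertical = to_vertical(transactions)
pairs = frequent_pairs(vertical, min_count=3)
```

The support of a pair is simply the size of the intersection, so no second pass over the original transactions is needed. In a Map-Reduce setting, the intersections for different item pairs are independent of one another and could in principle be distributed across workers.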


Rule-Based Semantic Concept Classification from Large-Scale Video Collections

International Journal of Multimedia Data Engineering and Management ◽ 10.4018/jmdem.2013010103 ◽ 2013 ◽ Vol 4 (1) ◽ pp. 46-67 ◽ Cited By ~ 3

Author(s): Lin Lin ◽ Mei-Ling Shyu ◽ Shu-Ching Chen

Keyword(s): Data Mining ◽ Information Retrieval ◽ Association Rule ◽ Association Rule Mining ◽ Large Scale ◽ Multimedia Data ◽ Semantic Concept ◽ Rule Mining ◽ Rule Based ◽ Concept Classification

The explosive growth and increasing complexity of multimedia data have created a high demand for multimedia services and applications in various areas, so that people can access and distribute the data easily. Unfortunately, traditional keyword-based information retrieval is no longer suitable. Instead, multimedia data mining and content-based multimedia information retrieval have become key technologies in modern societies. Among the many data mining techniques, association rule mining (ARM) is considered one of the most popular approaches for extracting useful information from multimedia data in terms of relationships between variables. In this paper, a novel rule-based semantic concept classification framework is proposed to deal with major issues and challenges in large-scale video semantic concept classification. The framework uses weighted association rule mining (WARM), which captures the significance degrees of the feature-value pairs to improve the applicability of ARM. Unlike traditional ARM, in which rules are generated by frequency count and the items in a rule are treated as equally important, the proposed WARM algorithm uses multiple correspondence analysis (MCA) to explore the relationships among features and concepts and to signify the different contributions of the features in rule generation. To the best of the authors' knowledge, this is one of the first WARM-based classifiers in the field of multimedia concept retrieval. Experimental results on the benchmark TRECVID data demonstrate that the proposed framework can handle large-scale and imbalanced video data with promising classification and retrieval performance.
