Large-Scale Loop Detector Troubleshooting Using Clustering and Association Rule Mining

2020 ◽  
Vol 146 (7) ◽  
pp. 04020064 ◽  
Author(s):  
Amin Ariannezhad ◽  
Yao-Jan Wu
2010 ◽  
Vol 39 ◽  
pp. 449-454
Author(s):  
Jiang Hui Cai ◽  
Wen Jun Meng ◽  
Zhi Mei Chen

Data mining is a broad term used to describe various methods for discovering patterns in data. One kind of pattern often considered is the association rule: a probabilistic rule stating that objects satisfying description A also satisfy description B with a certain support and confidence. In this study, we first use first-order predicate logic to represent knowledge derived from celestial spectra data. Next, we propose the concept of constrained frequent pattern trees (CFP), along with an algorithm for constructing CFPs, aiming to improve the efficiency and pertinence of association rule mining. The experimental results show that applying this method to association rule mining is feasible and valuable, and that the improved algorithm substantially reduces the amount of computation and so improves efficiency. Finally, simulation results of knowledge acquisition for fault diagnosis also show the validity of the CFP algorithm.
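The support and confidence mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard definitions (support as the fraction of transactions containing an itemset, confidence as support of A and B together divided by support of A); the function names and the toy grocery transactions are hypothetical and do not come from the paper, and the sketch does not implement the CFP tree.

```python
from fractions import Fraction


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(transactions, antecedent, consequent):
    """Support of (A and B) divided by support of A, for the rule A -> B.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    return support(transactions, both) / support(transactions, a)


# Hypothetical toy data, for illustration only.
transactions = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]

rule_support = support(transactions, {"bread", "milk"})          # 1/2
rule_confidence = confidence(transactions, {"bread"}, {"milk"})  # 2/3
```

Here the rule "bread implies milk" holds in two of the four transactions (support 1/2) and in two of the three transactions that contain bread (confidence 2/3).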


2014 ◽  
Vol 998-999 ◽  
pp. 899-902 ◽  
Author(s):  
Cheng Luo ◽  
Ying Chen

Existing data mining algorithms have mostly been implemented for centralized environments, but large-scale databases exist in distributed form. The distributed data mining algorithm FDM and its improved variants share two problems: frequent itemsets can be lost, and network communication costs are too high. This paper proposes an association rule mining algorithm based on distributed data (ARADD). ARADD includes a mapping mark array mechanism, which not only preserves the integrity of the frequent itemsets but also reduces the cost of network communication. Experiments demonstrate the efficiency of the algorithm.
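As a rough illustration of the distributed setting described above, the sketch below counts candidate itemsets separately on each site and merges the local counts into global ones, so that only counts (not raw transactions) would need to cross the network. This is a generic count-merging scheme written under that assumption; it is not ARADD, it does not implement the mapping mark array mechanism, and the site data and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, size):
    """Count every itemset of the given size within one site's transactions."""
    counts = Counter()
    for transaction in partition:
        for combo in combinations(sorted(transaction), size):
            counts[combo] += 1
    return counts


def global_frequent(partitions, size, min_support):
    """Merge per-site counts and keep itemsets meeting the global threshold."""
    total = Counter()
    n_transactions = 0
    for partition in partitions:
        total.update(local_counts(partition, size))  # adds counts per itemset
        n_transactions += len(partition)
    threshold = min_support * n_transactions
    return {items: c for items, c in total.items() if c >= threshold}


# Hypothetical data held at two sites.
site_a = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
site_b = [{"b", "c"}, {"a", "b"}]

frequent_pairs = global_frequent([site_a, site_b], size=2, min_support=0.5)
```

With five transactions in total and a minimum support of 0.5, only the pair (a, b), which occurs three times across the two sites, passes the threshold. Because every local count is merged before filtering, no globally frequent itemset is dropped, at the price of sending counts for all candidates.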


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
You Wu ◽  
Zheng Wang ◽  
Shengqi Wang

Data mining is currently a frontier research topic in the field of information and database technology, and it is recognized as one of the most promising key technologies. It draws on multiple technologies, such as mathematical statistics, fuzzy theory, neural networks, and artificial intelligence, so it is technically demanding and difficult to implement. In this article, we study the basic concepts, processes, and algorithms of association rule mining. To improve the efficiency of data mining in large-scale database applications, we propose an incremental association rule mining algorithm based on fast clustering. First, the feasibility of mining performance appraisal data is studied. Then, the business process required by the information system is analyzed, and the related process links and the corresponding data input interfaces are designed. Next, the data processing flow is designed, including the data foundation and the database model. Database development tools are used to implement the specific system settings and program design of the algorithm, with efficient large-scale database mining as the goal. Incorporated into the human resource management system of colleges and universities, the algorithm carried out association broadcasting successfully, supported visualization, and uncovered valuable information.


2013 ◽  
Vol 23 (03) ◽  
pp. 1350012 ◽  
Author(s):  
FADI THABTAH ◽  
SUHEL HAMMOUD

Association rule mining is one of the primary tasks in data mining; it discovers correlations among items in a transactional database. Most vertical and horizontal association rule mining algorithms have been developed to improve the frequent item discovery step, which places high demands on training time and memory usage, particularly when the input database is very large. In this paper, we address the problem of mining very large data by proposing a new parallel Map-Reduce (MR) association rule mining technique called MR-ARM, which uses a hybrid data transformation format to quickly find frequent items and generate rules. The MR programming paradigm is becoming popular for large-scale, data-intensive distributed applications due to its efficiency, simplicity, and ease of use, and the proposed algorithm therefore develops a fast, parallel, distributed batch set intersection method for finding frequent items. Two implementations (Weka, Hadoop) of the proposed MR association rule algorithm have been developed, and a number of experiments on small, medium, and large data collections have been conducted. The comparisons are based on the time the algorithm requires for data initialisation, frequent item discovery, rule generation, etc. The results show that MR-ARM is a very useful tool for mining association rules from large datasets in a distributed environment.
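The set intersection idea mentioned above can be sketched in plain Python. The example assumes a simple vertical layout in which each item maps to the set of transaction ids that contain it, with a map step emitting (item, tid) pairs and a reduce step grouping them. It only simulates the two phases in a single process; it is not MR-ARM, it does not reproduce the paper's hybrid data format, and it does not use the Hadoop or Weka APIs. All names and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transactions):
    """Emit (item, transaction_id) pairs, as a mapper would."""
    for tid, items in enumerate(transactions):
        for item in items:
            yield item, tid


def reduce_phase(pairs):
    """Group transaction ids by item, producing the vertical layout."""
    tidsets = defaultdict(set)
    for item, tid in pairs:
        tidsets[item].add(tid)
    return dict(tidsets)


def frequent_pairs(tidsets, min_count):
    """Find frequent 2-itemsets by intersecting the tidsets of frequent items."""
    frequent = {i: t for i, t in tidsets.items() if len(t) >= min_count}
    result = {}
    for a, b in combinations(sorted(frequent), 2):
        common = frequent[a] & frequent[b]  # transactions containing both items
        if len(common) >= min_count:
            result[(a, b)] = common
    return result


# Hypothetical toy database.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]

tidsets = reduce_phase(map_phase(transactions))
pairs = frequent_pairs(tidsets, min_count=2)
```

The support count of a pair is simply the size of the intersection of its two tidsets, so no further pass over the original transactions is needed once the vertical layout exists.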


Author(s):  
Lin Lin ◽  
Mei-Ling Shyu ◽  
Shu-Ching Chen

The explosive growth and increasing complexity of multimedia data have created a high demand for multimedia services and applications in various areas, so that people can access and distribute the data easily. Unfortunately, traditional keyword-based information retrieval is no longer suitable. Instead, multimedia data mining and content-based multimedia information retrieval have become key technologies in modern societies. Among the many data mining techniques, association rule mining (ARM) is considered one of the most popular approaches for extracting useful information from multimedia data in terms of relationships between variables. In this paper, a novel rule-based semantic concept classification framework is proposed to deal with major issues and challenges in large-scale video semantic concept classification. The framework uses weighted association rule mining (WARM), which captures the significance degrees of the feature-value pairs to improve the applicability of ARM. Unlike traditional ARM, in which rules are generated by frequency count and the items in a rule are equally important, our proposed WARM algorithm uses multiple correspondence analysis (MCA) to explore the relationships among features and concepts and to reflect the different contributions of the features in rule generation. To the authors' best knowledge, this is one of the first WARM-based classifiers in the field of multimedia concept retrieval. The experimental results on the benchmark TRECVID data demonstrate that the proposed framework is able to handle large-scale and imbalanced video data with promising classification and retrieval performance.

