An Enhanced Apriori with Interestingness of Patterns using cSupport and rSupport

Abstract: The demand for information is sometimes not matched by the supply of adequate information, so the information must be dug out again from large data sets. Association rule techniques can extract information from large data such as university records. The purposes of this research are to determine patterns in the study duration of students at F-MIPA UNSRAT using the association rule method of data mining, and to compare the Apriori algorithm with a hash-based algorithm. The student master data of F-MIPA UNSRAT were processed with association rule mining using the Apriori algorithm and a hash-based algorithm, with minimum support and minimum confidence both set to 1%. The Apriori algorithm and the hash-based algorithm produced the same result: 49 combinations of 2-itemsets. Among the patterns found, 7.5% of graduates came from the mathematics major and studied for more than 5 years, with a confidence of 38.5%.

Keywords: Apriori algorithm, hash-based algorithm, association rule, data mining.
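
To make the support and confidence figures in this abstract concrete, here is a minimal Python sketch that scores every 2-itemset in a list of records. The function name `pair_rules` and the toy records are hypothetical and are not the paper's data or implementation; support is taken as the fraction of records containing both items, and confidence as that count divided by the count of the antecedent.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.01, min_confidence=0.01):
    """Return (antecedent, consequent, support, confidence) for each 2-itemset rule."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))          # de-duplicate and fix an order for pairing
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n             # fraction of records containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):   # each pair yields two candidate rules
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules

# Hypothetical toy records of (major, study-duration band); illustration only.
records = [
    ("mathematics", ">5 years"),
    ("mathematics", "<=5 years"),
    ("physics", ">5 years"),
    ("mathematics", ">5 years"),
]
rules = pair_rules(records)
```

Under the same definitions, and assuming the reported pattern reads "mathematics major implies more than 5 years", a support of 7.5% with a confidence of 38.5% would imply that roughly 0.075 / 0.385 ≈ 19.5% of the records are mathematics graduates.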

Environmental Examination of Data Mining Algorithms Based on Cloud-Computing

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.380-384.2911 ◽ 2013 ◽ Vol 380-384 ◽ pp. 2911-2914

Author(s): Yi Zhuo Guo ◽ Tao Dai

Keyword(s): Data Mining ◽ Cloud Computing ◽ Cluster Computing ◽ Programming Model ◽ Test System ◽ Parallel Execution ◽ Apriori Algorithm ◽ Data Mining Algorithms ◽ Parallel Programming Model ◽ Mining Algorithms

This article studies cloud computing and data mining together. It introduces the concepts of cloud computing and data mining and points out that, when traditional data mining techniques are applied to the massive data of a network test system, processing is slow, load is unbalanced, and node efficiency is low. It then presents an Apriori algorithm based on the Map/Reduce parallel programming model, which exploits the distributed nature of cloud computing environments and makes full use of cluster computing resources to execute the algorithm in parallel. Examples indicate that the Apriori algorithm running in a cloud computing environment mines frequent itemsets more efficiently than traditional data mining.
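
The Map/Reduce idea behind one counting pass can be sketched in plain Python, without any cluster framework. This is only an illustration under stated assumptions: the names `map_phase`, `reduce_phase`, and `frequent_itemsets` are hypothetical, the partitions stand in for data blocks held on different nodes, and the mapper emits every k-subset of a transaction rather than only Apriori-pruned candidates.

```python
from collections import defaultdict
from itertools import combinations

def map_phase(partition, k):
    """Mapper: emit (k-itemset, 1) for each transaction in one data partition."""
    for transaction in partition:
        for itemset in combinations(sorted(set(transaction)), k):
            yield itemset, 1

def reduce_phase(pairs, min_count):
    """Reducer: sum the counts per itemset and keep those reaching min_count."""
    totals = defaultdict(int)
    for itemset, count in pairs:
        totals[itemset] += count
    return {itemset: c for itemset, c in totals.items() if c >= min_count}

def frequent_itemsets(partitions, k, min_count):
    """Run the mappers over every partition, then reduce the merged output."""
    emitted = (pair for part in partitions for pair in map_phase(part, k))
    return reduce_phase(emitted, min_count)

# Two hypothetical partitions, as if stored on two different nodes.
partitions = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "d"], ["a", "b", "c"]],
]
frequent_pairs = frequent_itemsets(partitions, k=2, min_count=3)
```

Because each mapper touches only its own partition, the map calls are independent and could run in parallel; only the reducer needs the merged counts.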

Preference-Based Frequent Pattern Mining

Data Warehousing and Mining ◽ 10.4018/978-1-59904-951-9.ch073 ◽ 2008 ◽ pp. 1280-1299

Author(s): Moonjung Cho ◽ Jian Pei ◽ Haixun Wang ◽ Wei Wang

Keyword(s): Data Mining ◽ General Framework ◽ Pattern Mining ◽ Frequent Pattern Mining ◽ Frequent Pattern ◽ Frequent Patterns ◽ Performance Study ◽ Important Data ◽ Mining Algorithms ◽ Extensive Performance

Frequent pattern mining is an important data-mining problem with broad applications. Although there are many in-depth studies on efficient frequent pattern mining algorithms and constraint pushing techniques, the effectiveness of frequent pattern mining remains a serious concern: it is non-trivial and often tricky to specify appropriate support thresholds and proper constraints. In this paper, we propose a novel theme of preference-based frequent pattern mining. A user can simply specify a preference instead of setting detailed parameters in constraints. We identify the problem of preference-based frequent pattern mining and formulate the preferences for mining. We develop an efficient framework to mine frequent patterns with preferences. Interestingly, many preferences can be pushed deep into the mining by properly employing the existing efficient frequent pattern mining techniques. We conduct an extensive performance study to examine our method. The results indicate that preference-based frequent pattern mining is effective and efficient. Furthermore, we extend our discussion from pattern-based frequent pattern mining to preference-based data mining in principle and draw a general framework.

Data Warehousing for Association Mining

Encyclopedia of Data Warehousing and Mining, Second Edition ◽ 10.4018/978-1-60566-010-3.ch093 ◽ 2010 ◽ pp. 592-597

Author(s): Yuefeng Li

Keyword(s): Data Mining ◽ Association Rules ◽ Data Warehousing ◽ Association Mining ◽ Second Phase ◽ Frequent Patterns ◽ Search Spaces ◽ Decision Attributes ◽ Long Time ◽ Two Phases

With the phenomenal growth of electronic data and information, there is strong demand for efficient and effective systems (tools) that perform data mining tasks on data warehouses or multidimensional databases. Association rules describe associations between itemsets (i.e., sets of data items, or granules). Association mining (also called association rule mining) finds interesting or useful association rules in databases, and is a crucial technique in the development of data mining. Association mining can be used in many application areas, for example, the discovery of associations between customers' locations and shopping behaviours in market basket analysis. Association mining includes two phases. The first phase, called pattern mining, is the discovery of frequent patterns. The second phase, called rule generation, is the discovery of interesting and useful association rules in the discovered patterns. The first phase, however, often takes a long time to find all frequent patterns, which also include much noise (Pei and Han, 2002). The second phase is also a time-consuming activity (Han and Kamber, 2000) and can generate many redundant rules (Zaki, 2004) (Xu and Li, 2007). To reduce search spaces, user constraint-based techniques attempt to find knowledge that meets certain constraints. Two interesting concepts have been used in user constraint-based techniques: meta-rules (Han and Kamber, 2000) and granule mining (Li et al., 2006). The aim of this chapter is to present the latest research results on data warehousing techniques that can be used to improve the performance of association mining. The chapter introduces two important approaches based on user constraint-based techniques. The first approach asks users to input meta-rules that describe their desires for certain data dimensions. It then creates data cubes based on these meta-rules and provides interesting association rules. The second approach first asks users to provide condition and decision attributes, which are used to describe the antecedent and consequent of rules, respectively. It then finds all possible data granules based on the condition attributes and decision attributes. It also creates a multi-tier structure to store the associations between granules, and association mappings to provide interesting rules.
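
The second phase described above, rule generation, can be sketched as follows. This is a generic illustration, not the chapter's meta-rule or granule-mining method: the function name `generate_rules` and the support values are hypothetical, and the sketch assumes the first phase has already produced the support of every frequent itemset together with all of its subsets.

```python
from itertools import combinations

def generate_rules(supports, min_confidence):
    """Derive rules antecedent -> consequent from precomputed itemset supports.

    supports maps frozenset itemsets to their support (a fraction in [0, 1]).
    """
    rules = []
    for itemset, supp in supports.items():
        if len(itemset) < 2:
            continue                      # a rule needs both sides non-empty
        for r in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), r):
                antecedent = frozenset(subset)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = supp / supports[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules

# Hypothetical output of the pattern-mining phase.
supports = {
    frozenset({"bread"}): 0.6,
    frozenset({"milk"}): 0.5,
    frozenset({"bread", "milk"}): 0.4,
}
rules = generate_rules(supports, min_confidence=0.7)
```

With these numbers only the rule from milk to bread survives the threshold, since 0.4 / 0.5 exceeds 0.7 while 0.4 / 0.6 does not. Each frequent itemset can yield many such splits, which is one way to see why the second phase can generate many redundant rules.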

IMPROVED DATA MINING ALGORITHMS FOR FREQUENT PATTERNS WITH COMPOSITE ITEMS

Challenges in Information Technology Management ◽ 10.1142/9789812819079_0002 ◽ 2008

Author(s): KE WANG ◽ JAMES N. K. LIU ◽ WEI-MIN MA

Keyword(s): Data Mining ◽ Frequent Patterns ◽ Data Mining Algorithms ◽ Mining Algorithms

Frequent Pattern Mining over Unstructured Data using Semi-Structured Doc-Model and Pattern Ranking

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽ 10.32628/cseit206216 ◽ 2020 ◽ pp. 36-42

Author(s): Sudhir Tirumalasetty ◽ A. Divya ◽ D. Rahitya Lakshmi ◽ Ch. Durga Bhavani ◽ D. Anusha

Keyword(s): Data Mining ◽ Big Data ◽ Pattern Mining ◽ Frequent Pattern Mining ◽ Unstructured Data ◽ Frequent Pattern ◽ Frequent Patterns ◽ Innovative Methods ◽ Mining Algorithms ◽ Doc Model

Frequent pattern mining is an essential data-mining task, with the goal of discovering knowledge in the form of repeated patterns. Many efficient pattern-mining algorithms have been developed in the last two decades, yet most do not scale to the type of data we are presented with today, the so-called "Big Data". Scalable parallel algorithms hold the key to solving the problem in this context. This paper reviews recent advances in parallel frequent pattern mining, analysing them through the Big Data lens. Load balancing and work partitioning are the major challenges to be overcome. These challenges continually call for innovative methods, as Big Data evolves without limit. A greater challenge than before is handling unstructured data when finding frequent patterns. To accomplish this, a semi-structured Doc-Model and pattern ranking are used.

Algoritma Apriori Untuk Penentuan Assosiasi Penjualan Barang

Jurnal Teknologi Informasi dan Komunikasi (TIKomSiN) ◽ 10.30646/tikomsin.v9i1.538 ◽ 2021 ◽ Vol 9 (1) ◽ pp. 7

Author(s): Calvin Ivan Wiryawan ◽ Yustina Retno Wahyu Utami ◽ Didik Nugroho

Keyword(s): Data Mining ◽ Association Rules ◽ A Priori ◽ Basic Needs ◽ Marketing Strategies ◽ Apriori Algorithm ◽ Growing Up ◽ Minimum Support ◽ Data Mining Algorithms ◽ Mining Algorithms

Rising sales of basic necessities require the company to stock a large quantity of goods, and the data grows as the number of transactions at the Sari Bumi store increases. Until now, sales data for basic necessities at the Sari Bumi store has not been well structured, so an application is needed that produces information to support marketing strategy decisions. In this research, the Apriori algorithm is used to determine association rules. This method was chosen because it is one of the classic data mining algorithms for finding patterns of relationships among items in a dataset. The Apriori algorithm can help companies develop marketing strategies. The result of this research is a combination of 4-itemsets with a minimum support of 30% and a minimum confidence of 60%.

Keywords: sale, staple, apriori algorithm
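
As a rough illustration of how the Apriori algorithm reaches larger itemsets under a minimum support such as 30%, the sketch below grows frequent itemsets level by level. It is a textbook-style sketch, not the authors' application: the function name `apriori` and the baskets are hypothetical, and support is recounted by scanning all baskets for each candidate, which is simple but not efficient.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def keep_frequent(candidates):
        # Support = fraction of baskets that contain the whole candidate.
        scored = {c: sum(c <= b for b in baskets) / n for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    level = keep_frequent({frozenset([i]) for b in baskets for i in b})
    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        # Join: unite pairs of frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = keep_frequent(candidates)
    return frequent

# Hypothetical baskets of staple goods; illustration only.
baskets = [
    ["rice", "oil", "sugar"],
    ["rice", "oil"],
    ["rice", "sugar"],
    ["oil", "sugar"],
    ["rice", "oil", "sugar", "eggs"],
]
frequent = apriori(baskets, min_support=0.3)
```

The prune step is what keeps the search manageable: an itemset is only counted if all of its smaller subsets already passed the support threshold, so "eggs", which appears in one basket out of five, never enters a larger candidate.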

Data Warehousing for Association Mining

Business Information Systems ◽ 10.4018/978-1-61520-969-9.ch054 ◽ 2010 ◽ pp. 887-893

Author(s): Yuefeng Li

Keyword(s): Data Mining ◽ Association Rules ◽ Data Warehousing ◽ Association Mining ◽ Second Phase ◽ Frequent Patterns ◽ Search Spaces ◽ Decision Attributes ◽ Long Time ◽ Two Phases

With the phenomenal growth of electronic data and information, there is strong demand for efficient and effective systems (tools) that perform data mining tasks on data warehouses or multidimensional databases. Association rules describe associations between itemsets (i.e., sets of data items, or granules). Association mining (also called association rule mining) finds interesting or useful association rules in databases, and is a crucial technique in the development of data mining. Association mining can be used in many application areas, for example, the discovery of associations between customers' locations and shopping behaviours in market basket analysis. Association mining includes two phases. The first phase, called pattern mining, is the discovery of frequent patterns. The second phase, called rule generation, is the discovery of interesting and useful association rules in the discovered patterns. The first phase, however, often takes a long time to find all frequent patterns, which also include much noise (Pei and Han, 2002). The second phase is also a time-consuming activity (Han and Kamber, 2000) and can generate many redundant rules (Zaki, 2004) (Xu and Li, 2007). To reduce search spaces, user constraint-based techniques attempt to find knowledge that meets certain constraints. Two interesting concepts have been used in user constraint-based techniques: meta-rules (Han and Kamber, 2000) and granule mining (Li et al., 2006). The aim of this chapter is to present the latest research results on data warehousing techniques that can be used to improve the performance of association mining. The chapter introduces two important approaches based on user constraint-based techniques. The first approach asks users to input meta-rules that describe their desires for certain data dimensions. It then creates data cubes based on these meta-rules and provides interesting association rules. The second approach first asks users to provide condition and decision attributes, which are used to describe the antecedent and consequent of rules, respectively. It then finds all possible data granules based on the condition attributes and decision attributes. It also creates a multi-tier structure to store the associations between granules, and association mappings to provide interesting rules.

Evaluation of Frequent Itemset Mining Algorithms-Apriori and FP Growth

International Journal of Engineering Technology and Management Sciences ◽ 10.46647/ijetms.2020.v04i06.001 ◽ 2020 ◽ Vol 4 (6) ◽ pp. 1-4

Author(s): Jismy Joseph ◽ Kesavaraj G

Keyword(s): Pattern Mining ◽ Frequent Pattern Mining ◽ Frequent Itemset ◽ Frequent Pattern ◽ Frequent Itemset Mining ◽ Frequent Patterns ◽ Apriori Algorithm ◽ Itemset Mining ◽ Mining Algorithms ◽ Time And Space Complexity

Frequent itemset mining (FIM) is an essential task for retrieving frequently occurring patterns, correlations, events, or associations in a transactional database. Understanding such frequent patterns helps in making substantial decisions in decisive situations. Multiple algorithms have been proposed for finding such patterns; however, the time and space complexity of these algorithms increases rapidly with the number of items in a dataset. It is therefore necessary to analyze the efficiency of these algorithms using different datasets. The aim of this paper is to evaluate the performance of two frequent itemset mining algorithms, Apriori and Frequent Pattern (FP) growth, by comparing their features. This study shows that the FP-growth algorithm is more efficient than the Apriori algorithm for generating rules and for frequent pattern mining.
