Algoritma Apriori untuk Pencarian Frequent itemset dalam Association Rule Mining [Apriori Algorithm for Finding Frequent Itemsets in Association Rule Mining]

Abstract: For decades, retail chains and department stores sold their products without using the transactional data generated by their sales as a source of knowledge. Abundant data, the need for information (or knowledge) to support decision making and create business solutions, and supporting information technology infrastructure together gave rise to data mining technology. Data mining discovers interesting patterns in databases, such as association rules, correlations, sequences, and classifiers, among which association rules are one of the most popular problems. Association rule mining is a data mining method used to extract useful patterns among data items. In this research, the Apriori algorithm was applied to find frequent itemsets in association rule mining. The data were processed with the Tanagra tool. The dataset used was the Supermarket dataset, consisting of 12 attributes and 108,131 transactions. The experiments yielded an association rule combining the itemsets beer wine spirit-frozen foods and snack foods as a frequent itemset, with a support of 15.489% and a confidence of 83.719%. The lift ratio obtained was 2.47766, a value above 1, which indicates that the association rule is beneficial. Keywords: Apriori, Association Rule Mining.
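
The abstract above reports a rule in terms of support, confidence, and lift. As a minimal sketch of how these three measures are computed, the following uses a small made-up transaction list; the baskets and the helper names `support` and `rule_metrics` are assumptions for illustration and are not taken from the Supermarket dataset or from Tanagra.

```python
# Toy transactions; item names are illustrative only, not the Supermarket dataset.
transactions = [
    {"beer", "frozen foods", "snack foods"},
    {"beer", "frozen foods", "snack foods"},
    {"beer", "frozen foods"},
    {"snack foods", "bread"},
    {"bread", "milk"},
    {"milk"},
    {"beer", "frozen foods", "snack foods", "milk"},
    {"bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_rule = support(set(antecedent) | set(consequent), transactions)
    s_ante = support(antecedent, transactions)
    s_cons = support(consequent, transactions)
    confidence = s_rule / s_ante   # P(consequent | antecedent)
    lift = confidence / s_cons     # confidence relative to the consequent's base rate
    return s_rule, confidence, lift


s, c, l = rule_metrics({"beer", "frozen foods"}, {"snack foods"}, transactions)
print(f"support={s:.3f} confidence={c:.3f} lift={l:.3f}")
# prints: support=0.375 confidence=0.750 lift=1.500
```

A lift above 1 means the antecedent and consequent occur together more often than they would if they were independent, which is the sense in which a lift of 2.47766 marks a useful rule.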

Comparative Analysis of Association Rule Mining Algorithms in Market Basket Analysis Using Transactional Data

Journal of Computer Science and Its Application ◽

10.4314/jcsia.v27i1.8 ◽

2020 ◽

Vol 27 (1) ◽

Author(s):

AA Izang ◽

SO Kuyoro ◽

OD Alao ◽

RU Okoro ◽

OA Adesegun

Keyword(s):

Data Mining ◽

Association Rule ◽

Association Rule Mining ◽

Market Basket Analysis ◽

Rule Mining ◽

Business Decisions ◽

Market Basket ◽

Minimum Support ◽

Support Threshold ◽

Transactional Data

Association rule mining (ARM) is an aspect of data mining that has revolutionized predictive modelling, paving the way for data mining techniques to become the recommended method for business owners to evaluate organizational performance. Market basket analysis (MBA), a useful modelling technique in data mining, is often used to analyze customer buying patterns. Choosing the right ARM algorithm for MBA is somewhat difficult, as most algorithms' performance is determined by characteristics such as the amount of data used, application domain, time variation, and customers' preferences. Hence this study examines four ARM algorithms used in MBA systems for improved business decisions. One million, one hundred and twelve thousand (1,112,000) transactional records were extracted from Babcock University Superstore. The Frequent Pattern Growth (FP Growth), Apriori, Association Outliers, and Supervised Association Rule ARM algorithms were applied to the dataset. The outputs were compared using minimum support threshold, confidence level, and execution time as metrics. The results showed that FP Growth has a minimum support threshold of 0.011 and a confidence level of 0.013, Apriori 0.019 and 0.022, Association Outliers 0.026 and 0.294, and Supervised Association Rule 0.032 and 0.212, respectively. The FP Growth and Apriori ARM algorithms performed better than Association Outliers and Supervised Association Rule when the minimum support and confidence thresholds were both set to 0.1. The study concluded by recommending a hybrid ARM algorithm for building MBA applications. The outcome of this study, when adopted by business ventures, will lead to improved business decisions, thereby helping to achieve customer retention. Keywords: Association rule mining, Business ventures, Data mining, Market basket analysis, Transactional data.

An Enhanced Approach to Mine Maximal Frequent Itemset using Maximal Frequent Itemset Prima Algorithm (MFIPA)

Asian Journal of Computer Science and Technology ◽

10.51983/ajcst-2019.8.s2.2035 ◽

2019 ◽

Vol 8 (S2) ◽

pp. 9-12

Author(s):

R. Smeeta Mary ◽

K. Perumal

Keyword(s):

Data Mining ◽

Association Rule ◽

Association Rule Mining ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Decision Makers ◽

New Method ◽

Rule Mining

In data mining, finding frequent itemsets is one of the most essential topics. Data mining helps identify the best knowledge for different decision makers. Frequent itemset generation is the precondition for, and the most time-consuming step of, association rule mining. In this paper we propose a new algorithm for frequent itemset detection that works with datasets in a distributed manner. The proposed algorithm introduces a new method for finding frequent itemsets without the need to generate candidate itemsets. The approach can be implemented using a horizontal representation of the transaction dataset and by allocating prime values. It explores all the frequent itemsets present in the input and, according to the support, identifies the maximal frequent itemset. It was applied to different transaction databases and compared with two well-known algorithms, FP-Growth and Parallel Apriori, at different support levels. The experiments showed that the proposed algorithm attains a major time improvement over both algorithms.
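
The abstract above mentions a horizontal transaction representation with prime values allocated to items but does not give the details. One generic way such an encoding can work is sketched below, purely as an assumption and not as the MFIPA algorithm itself: each item gets a distinct prime, each transaction is stored as the product of its items' primes, and an itemset is contained in a transaction exactly when the itemset's product divides the transaction's product. The names `first_primes`, `prime_of`, and `support_count` are hypothetical.

```python
from math import prod


def first_primes(n):
    """Return the first n primes by trial division (fine for small item catalogs)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# Toy horizontal dataset: one set of items per transaction.
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]

items = sorted(set().union(*transactions))
prime_of = dict(zip(items, first_primes(len(items))))  # a->2, b->3, c->5, d->7

# Each transaction becomes the product of its items' primes.
encoded = [prod(prime_of[i] for i in t) for t in transactions]


def support_count(itemset):
    """Count transactions whose encoded product is divisible by the itemset's product."""
    key = prod(prime_of[i] for i in itemset)
    return sum(code % key == 0 for code in encoded)


print(encoded)                     # [30, 10, 15, 210]
print(support_count({"a", "c"}))   # 3
```

Because prime factorization is unique, the divisibility test is equivalent to a subset test. The products grow quickly with catalog size, so a practical implementation would need arbitrary-precision integers (which Python provides) or a different layout.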

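Penerapan Association Rule Mining Berbasis Algoritma Frequent Pattern Growth untuk Rekomendasi Penjualan [Applying Association Rule Mining Based on the Frequent Pattern Growth Algorithm for Sales Recommendations]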

JATISI (Jurnal Teknik Informatika dan Sistem Informasi) ◽

10.35957/jatisi.v7i2.339 ◽

2020 ◽

Vol 7 (2) ◽

pp. 135-148

Author(s):

Didi Supriyadi

Keyword(s):

Data Mining ◽

Association Rule ◽

Association Rule Mining ◽

Frequent Itemset ◽

Frequent Pattern ◽

Rule Mining ◽

Minimum Support ◽

Pattern Growth ◽

Mining Association Rule

The level of competition and the complexity of sales problems in the retail sector require every retail company to be able to compete with others. One way to do so is to make sales-related decisions that are more accurate and effective. The large volume of sales transaction data held by retail companies can be mined for useful information, and association rule mining is one method for doing this. Association rule mining is a data mining method that focuses on transaction patterns by extracting associations or relationships between events. Shopping-basket data in computerized retail companies are the best basis for scientifically supported decision recommendations, obtained by determining the relationships between items purchased together in each transaction. The FP-growth algorithm is used to determine the sets of items that occur most frequently (frequent itemsets) in a group of data. With a minimum support of 0.1%, this study produced 116,457 rules at a minimum confidence of 60%, 84,086 rules at a minimum confidence of 70%, and 48,623 rules at a minimum confidence of 80%, from 22,191 processed records. These rules can be used for product marketing strategy. At a minimum support of 0.1%, the larger the minimum confidence, the fewer rules are produced.
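
The abstract above reports that the number of rules falls as the minimum confidence rises from 60% to 80% at a fixed minimum support. The reason is that the rules passing a stricter threshold are always a subset of those passing a looser one. A minimal sketch with made-up rules (the triples below are hypothetical and unrelated to the study's data):

```python
# Hypothetical (antecedent, consequent, confidence) triples; not from the study.
rules = [
    (("bread",), ("butter",), 0.62),
    (("tea",), ("sugar",), 0.71),
    (("rice",), ("oil",), 0.85),
    (("soap",), ("shampoo",), 0.78),
    (("coffee",), ("milk",), 0.91),
]


def count_rules(rules, min_confidence):
    """Number of rules whose confidence meets the threshold."""
    return sum(conf >= min_confidence for _, _, conf in rules)


counts = {t: count_rules(rules, t) for t in (0.6, 0.7, 0.8)}
print(counts)  # {0.6: 5, 0.7: 4, 0.8: 2}
```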

Present State-of-The-Art of Association Rule Mining Algorithms

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a2202.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 6398-6405

Keyword(s):

Data Mining ◽

Association Rule ◽

Association Rule Mining ◽

State Of The Art ◽

Synthetic Data ◽

Data Sets ◽

Evolutionary Analysis ◽

Rule Mining ◽

Transaction Database ◽

Mining Algorithms

Data mining is the process of extracting useful information from various repositories such as relational databases, transaction databases, spatial databases, temporal and time-series databases, data warehouses, and the World Wide Web. The functionalities of data mining include characterization and discrimination, classification and prediction, association rule mining, cluster analysis, and evolutionary analysis. Association rule mining is one of the most important data mining techniques; it aims to extract interesting relationships within the data. In this paper we study various association rule mining algorithms, compare them using synthetic data sets, and report the results of the experimental analysis.

ASSOCIATION RULE MINING UNTUK MENINGKATKAN PROMOSI PRODUK (STUDI KASUS PADA PD. XYZ) [Association Rule Mining to Improve Product Promotion: A Case Study at PD. XYZ]

JURNAL FASILKOM ◽

10.37859/jf.v7i2.789 ◽

2018 ◽

Vol 7 (2) ◽

pp. 284-288

Author(s):

Doni Winarso ◽

Anwar Karnaidi

Keyword(s):

Data Mining ◽

Association Rule ◽

Association Rule Mining ◽

Rule Mining

Association rule analysis is a data mining technique used to find associative rules between combinations of items. This study uses the Apriori algorithm, which searches for the frequency of items and for the items that appear most often. The results show that the Apriori algorithm can be used to analyze transaction data to identify which products should be promoted. The Apriori calculation yields the purchasing patterns that occur at PD. XYZ. Analysis of these patterns leads to the conclusion that the products to promote are economy wall paint and painting equipment in the form of hand brushes, with a support of 11% and a confidence of 75%.
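
The abstract above applies the Apriori algorithm to find the items and item combinations that appear most often. As a rough sketch of the level-wise idea (join frequent itemsets into larger candidates, prune any candidate with an infrequent subset, then count support), the following may help; the baskets are hypothetical, and the function `apriori` is an illustrative simplification, not the authors' implementation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support} for all itemsets with support >= min_support."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]

    def frequent(candidates):
        """Keep only the candidates whose support meets the threshold."""
        result = {}
        for c in candidates:
            s = sum(c <= t for t in transactions) / n
            if s >= min_support:
                result[c] = s
        return result

    # Level 1: frequent single items.
    level = frequent({frozenset([i]) for t in transactions for i in t})
    all_frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        all_frequent.update(level)
        k += 1
    return all_frequent


# Hypothetical hardware-store baskets, loosely echoing the products in the abstract.
baskets = [
    {"wall paint", "hand brush"},
    {"wall paint", "hand brush", "thinner"},
    {"wall paint", "thinner"},
    {"hand brush"},
    {"wall paint", "hand brush"},
]
result = apriori(baskets, min_support=0.6)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
```

With these baskets, "thinner" falls below the 60% threshold at level 1, so no candidate containing it is ever counted; this pruning is what keeps the search tractable on larger data.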

Artificial Bee Colony-Based Associative Classifier for Healthcare Data Diagnosis

Handbook of Research on Disease Prediction Through Data Analytics and Machine Learning - Advances in Medical Diagnosis, Treatment, and Care ◽

10.4018/978-1-7998-2742-9.ch012 ◽

2021 ◽

pp. 237-253

Author(s):

M. Nandhini ◽

S. N. Sivanandam ◽

S. Renugadevi

Keyword(s):

Data Mining ◽

Association Rule ◽

Association Rule Mining ◽

Artificial Bee Colony ◽

Optimization Technique ◽

Rule Mining ◽

Healthcare Data ◽

Bee Colony ◽

Small Set ◽

Significant Class

Data mining explores hidden patterns in huge quantities of data and provides a way of analyzing and categorizing the data. Associative classification (AC) integrates two data mining tasks, association rule mining and classification, and is used to classify unknown data. Though association rule mining techniques have been used successfully to construct classifiers, they fall short in generating a small set of significant class association rules (CARs) from which to build an accurate associative classifier. In this work, an attempt is made to generate significant CARs using the Artificial Bee Colony (ABC) algorithm, an optimization technique, to construct an efficient associative classifier. The associative classifier built from the ABC-discovered CARs achieves high predictive accuracy and interestingness. The ABC-based AC gave promising results in experiments on health care datasets from the UCI machine learning repository.

Constraint-Based Association Rule Mining

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch049 ◽

2011 ◽

pp. 307-312 ◽

Cited By ~ 10

Author(s):

Carson Kai-Sang Leung

Keyword(s):

Data Mining ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Computational Cost ◽

Knowledge Discovery In Databases ◽

Rule Mining ◽

The Subject ◽

User Focus ◽

High Computational Cost

The problem of association rule mining was introduced in 1993 (Agrawal et al., 1993). Since then, it has been the subject of numerous studies. Most of these studies focused on either performance issues or functionality issues. The former considered how to compute association rules efficiently, whereas the latter considered what kinds of rules to compute. Examples of the former include the Apriori-based mining framework (Agrawal & Srikant, 1994), its performance enhancements (Park et al., 1997; Leung et al., 2002), and the tree-based mining framework (Han et al., 2000); examples of the latter include extensions of the initial notion of association rules to other rules such as dependence rules (Silverstein et al., 1998) and ratio rules (Korn et al., 1998). In general, most of these studies basically considered the data mining exercise in isolation. They did not explore how data mining can interact with the human user, which is a key component in the broader picture of knowledge discovery in databases. Hence, they provided little or no support for user focus. Consequently, the user usually needs to wait for a long period of time to get numerous association rules, out of which only a small fraction may be interesting to the user. In other words, the user often incurs a high computational cost that is disproportionate to what he wants to get. This calls for constraint-based association rule mining.

Association Rule Mining of Relational Data

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch015 ◽

2011 ◽

pp. 87-93

Author(s):

Anne Denton

Keyword(s):

Data Mining ◽

Data Structures ◽

Association Rule ◽

Association Rule Mining ◽

Relational Data ◽

Rule Mining ◽

Data Mining Algorithms ◽

Mining Algorithms ◽

Relational Database Management ◽

Relational Database Management Systems

Most data of practical relevance are structured in more complex ways than is assumed in traditional data mining algorithms, which are based on a single table. The concept of relations allows for discussing many data structures such as trees and graphs. Relational data have much generality and are of significant importance, as demonstrated by the ubiquity of relational database management systems. It is, therefore, not surprising that popular data mining techniques, such as association rule mining, have been generalized to relational data. An important aspect of the generalization process is the identification of challenges that are new to the generalized setting.

On Association Rule Mining for the QSAR Problem

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch014 ◽

2011 ◽

pp. 83-86

Author(s):

Luminita Dumitriu

Keyword(s):

Data Mining ◽

Association Rule ◽

Association Rule Mining ◽

Predictive Ability ◽

Quantitative Structure Activity Relationship ◽

Rule Mining ◽

Data Mining Techniques ◽

Neuro Fuzzy ◽

The 1960S ◽

New Compounds

The concept of Quantitative Structure-Activity Relationship (QSAR), introduced by Hansch and co-workers in the 1960s, attempts to discover the relationship between the structure and the activity of chemical compounds (SAR), in order to allow the prediction of the activity of new compounds based on knowledge of their chemical structure alone. These predictions can be achieved by quantifying the SAR. Initially, statistical methods were applied to solve the QSAR problem. For example, pattern recognition techniques facilitate data dimension reduction and transformation techniques from multiple experiments to the underlying patterns of information. Partial least squares (PLS) is used for performing the same operations on the target properties. The predictive ability of this method can be tested using cross-validation on the test set of compounds. Later, data mining techniques were considered for this prediction problem. Among data mining techniques, the most popular ones are based on neural networks (Wang, Durst, Eberhart, Boyd, & Ben-Miled, 2004), on neuro-fuzzy approaches (Neagu, Benfenati, Gini, Mazzatorta, & Roncaglioni, 2002), or on genetic programming (Langdon & Barrett, 2004). All these approaches predict the activity of a chemical compound without being able to explain the predicted value. To increase understanding of the prediction process, descriptive data mining techniques have started to be used for the QSAR problem. These techniques are based on association rule mining. In this chapter, we describe the use of association rule-based approaches related to the QSAR problem.

Association Rule and Quantitative Association Rule Mining among Infrequent Items

Rare Association Rule Mining and Knowledge Discovery ◽

10.4018/978-1-60566-754-6.ch002 ◽

2010 ◽

pp. 15-32 ◽

Cited By ~ 1

Author(s):

Ling Zhou ◽

Stephen Yau

Keyword(s):

Data Mining ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Rule Mining ◽

Transactional Databases ◽

Frequent Items ◽

Increasing Demand ◽

Quantitative Association Rule

Association rule mining among frequent items has been extensively studied in data mining research. However, in recent years, there is an increasing demand for mining infrequent items (such as rare but expensive items). Since exploring interesting relationships among infrequent items has not been discussed much in the literature, in this chapter, the authors propose two simple, practical and effective schemes to mine association rules among rare items. Their algorithms can also be applied to frequent items with bounded length. Experiments are performed on the well-known IBM synthetic database. The authors’ schemes compare favorably to Apriori and FP-growth under the situation being evaluated. In addition, they explore quantitative association rule mining in transactional databases among infrequent items by associating quantities of items: some interesting examples are drawn to illustrate the significance of such mining.
