Query Expansion in Information Retrieval using Frequent Pattern (FP) Growth Algorithm for Frequent Itemset Search and Association Rules Mining

Author(s):  
Lasmedi Afuan ◽  
Ahmad Ashari ◽  
Yohanes Suyanto
2014 ◽  
Vol 602-605 ◽  
pp. 3536-3539
Author(s):  
Yu Fu ◽  
Jun Rui Yang

Frequent pattern mining has been an important research direction in association rules. This paper preprocesses the original dataset with fuzzy clustering, which maps quantitative datasets into linguistic datasets. We then propose an algorithm based on a fuzzy frequent pattern tree for extracting fuzzy frequent itemsets from the mapped linguistic datasets. Experimental results show that our algorithm requires less computing time than F-Apriori on very large databases. For large databases, the algorithm presented in this paper shows good prospects.


2017 ◽  
Author(s):  
Andysah Putera Utama Siahaan ◽  
Mesran Mesran ◽  
Andre Hasudungan Lubis ◽  
Ali Ikhwan ◽  
Supiyandi

Sales transaction data in a company continues to grow day by day. Large amounts of data can become a problem for a company if they are not managed properly. Data mining is a field that unifies techniques from machine learning, pattern processing, statistics, databases, and visualization to handle the problem of retrieving information from large databases. The relationship sought in data mining can be a relationship between two or more items in one dimension. Among the association rule algorithms in data mining, the Frequent Pattern Growth (FP-Growth) algorithm is one alternative that can be used to determine the most frequent itemsets in a data set.
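As a rough illustration of what "most frequent itemset" means here, the sketch below counts itemset support by brute-force enumeration. It is not FP-Growth, which avoids enumerating candidates by building a prefix tree; the function name, the transactions, and the thresholds are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support_count} for itemsets that meet min_support.

    Brute-force enumeration, NOT FP-Growth: every subset of each
    transaction (up to max_size items) is counted directly.
    """
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicate items in a basket
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical sales transactions, one list of items per sale.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
frequent = frequent_itemsets(transactions, min_support=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a minimum support count of 2, every single item and every pair qualifies in this toy data, while the three-item set appears only once and is dropped.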


2012 ◽  
Vol 263-266 ◽  
pp. 3060-3063 ◽  
Author(s):  
Yi Tao Zhang ◽  
Wen Liang Tang ◽  
Cheng Wang Xie ◽  
Ji Qiang Xiong

A VPA algorithm is proposed for mining association rules in privacy-preserving data mining where the data is vertically partitioned. The VSS protocol is used to encrypt the vertically partitioned data, which is owned by different parties, and a private comparison protocol is adopted to generate the frequent itemsets. In VPA, the ID numbers of the records, which are saved in an ID index array, are used to keep the data consistent among the different parties. The VPA algorithm can generate association rules without violating privacy. The performance of the scheme is validated against representative real and synthetic datasets. The results reveal that, on vertically partitioned and fully encrypted data, the VPA algorithm finds the same frequent itemsets and generates rules consistent with those of the Apriori algorithm.


Author(s):  
Ahmed Abbache ◽  
Farid Meziane ◽  
Ghalem Belalem ◽  
Fatma Zohra Belkredim

Query expansion is the process of adding relevant terms to the original queries to improve the performance of information retrieval systems. However, previous studies showed that automatic query expansion using WordNet does not lead to an improvement in performance. One of the main challenges of query expansion is the selection of appropriate terms. In this paper, the authors review this problem using Arabic WordNet and Association Rules within the context of the Arabic language. The results obtained confirmed that, with an appropriate selection method, the authors are able to exploit Arabic WordNet to improve retrieval performance. Their empirical results on a sub-corpus from the Xinhua collection showed that their automatic selection method achieved a significant performance improvement in terms of MAP and recall, and better precision among the top retrieved documents.
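For readers unfamiliar with the measures named above, the following is a minimal sketch of precision at k, recall, and (mean) average precision using their standard textbook definitions. The document IDs and relevance judgments are made up for illustration and are not taken from the paper.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k


def recall(ranked, relevant):
    """Fraction of all relevant documents that were retrieved."""
    return sum(doc in relevant for doc in ranked) / len(relevant)


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant doc appears."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """MAP over (ranked_list, relevant_set) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical ranking and relevance judgments for one query.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d9"}
print(precision_at_k(ranked, relevant, 2))
print(recall(ranked, relevant))
print(average_precision(ranked, relevant))
```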


2014 ◽  
Vol 543-547 ◽  
pp. 3625-3631
Author(s):  
Shao Rong Feng ◽  
Lin Bao Ye ◽  
Zi Yu Lin

The purpose of association rules mining is to find rules that meet the minimum support and minimum confidence in a large quantity of data. To find valid association rules efficiently, we comprehensively analyze some well-known parallel association rules mining algorithms and propose a new parallel association rules mining algorithm (Array Based on Hadoop, ABH for short) based on the cloud computing platform. ABH scans the database only once and uses a 0/1 array to represent a transaction and to record the frequency of identical transactions. Moreover, by exploiting the random-access characteristics of the array and the special nature of frequent itemsets, ABH effectively reduces the number of candidate frequent itemsets and finds the frequent itemsets quickly. We compare ABH experimentally with two classical algorithms, CD and DD, and find that ABH outperforms both.
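The minimum support and minimum confidence thresholds mentioned above can be sketched as follows for single-item rules. This is a plain in-memory illustration under assumed toy data, not the ABH algorithm and not a parallel implementation; the function names and thresholds are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules lhs -> rhs.

    Only one-item antecedents and consequents are considered, to keep
    the sketch short.
    """
    items = sorted({item for t in transactions for item in t})
    found = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair is not frequent, so no rule is built from it
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                found.append((lhs, rhs, pair_support, confidence))
    return found


# Hypothetical transactions.
transactions = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b"]]
mined = mine_rules(transactions, min_support=0.5, min_confidence=0.8)
print(mined)
```

In this toy data only the rule c -> a survives: the pair {a, c} occurs in half of the transactions, and every transaction containing c also contains a.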


2015 ◽  
Vol 2015 ◽  
pp. 1-6 ◽  
Author(s):  
Yi Zeng ◽  
Shiqun Yin ◽  
Jiangyue Liu ◽  
Miao Zhang

Association rules mining is an important technology in data mining. FP-Growth (frequent-pattern growth) is a classical algorithm in association rules mining. However, FP-Growth needs to scan the database twice, which reduces the efficiency of the algorithm. Through the study of association rules mining and the FP-Growth algorithm, we worked out two improved algorithms: the Painting-Growth algorithm and the N (not) Painting-Growth algorithm, which removes the painting steps and uses another approach in their place. We compared the two improved algorithms with the FP-Growth algorithm. Experimental results show that Painting-Growth algorithm is more than 1050 and N Painting-Growth algorithm is less than 10000 in data volume; the performance of the two improved algorithms is better than that of the FP-Growth algorithm.


2021 ◽  
Vol 9 (1) ◽  
pp. 93
Author(s):  
Lasmedi Afuan ◽  
Ahmad Ashari ◽  
Yohanes Suyanto

This research develops a new approach to query expansion by integrating Association Rules (AR) and Ontology. The proposed approach expands the query in three steps: (1) document retrieval; (2) query expansion using AR; (3) query expansion using Ontology. In the initial step, the system retrieves the top documents for the user's initial query. Next comes the initial processing step (stopword removal, POS tagging, TF-IDF). A Frequent Itemset (FI) search is then run with FP-Growth on the list of terms generated by the previous step, and association rules are derived from the FI results. The output of the AR step is expanded using Ontology, and the results of this expansion are used as new queries. The dataset used is a collection of learning documents. Ten queries were used for testing, and the test results were measured with three measures: recall, precision, and f-measure. Based on the testing and analysis results, integrating AR and Ontology can increase the relevance of the retrieved documents, with recall, precision, and f-measure values of 87.28, 79.07, and 82.85, respectively.
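The association-rule part of this pipeline can be sketched roughly as below. This is a simplified illustration, not the authors' implementation: it counts term pairs directly instead of running FP-Growth, skips POS tagging, TF-IDF, and the Ontology step, and uses a hypothetical stopword list, documents, and thresholds.

```python
from collections import Counter
from itertools import permutations

# Tiny illustrative stopword list (an assumption, not the paper's list).
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is", "for"}


def expand_query(query, top_docs, min_support=2, min_confidence=0.6):
    """Add terms t such that a rule (query term -> t) holds in the top documents."""
    # One set of non-stopword terms per retrieved document.
    doc_terms = [
        {w for w in doc.lower().split() if w not in STOPWORDS}
        for doc in top_docs
    ]
    term_count = Counter(t for terms in doc_terms for t in terms)
    # Ordered pairs (lhs, rhs), counted once per document containing both.
    pair_count = Counter(
        pair for terms in doc_terms for pair in permutations(sorted(terms), 2)
    )
    query_terms = [w for w in query.lower().split() if w not in STOPWORDS]
    expansion = []
    for (lhs, rhs), count in sorted(pair_count.items()):
        if lhs not in query_terms or rhs in query_terms or rhs in expansion:
            continue
        # Rule lhs -> rhs: support is the co-occurrence count,
        # confidence is that count divided by the document count of lhs.
        if count >= min_support and count / term_count[lhs] >= min_confidence:
            expansion.append(rhs)
    return query_terms + expansion


# Hypothetical top-ranked documents for the initial query.
top_docs = [
    "mining association rules in data",
    "association rules and frequent itemset mining",
    "frequent itemset search for query expansion",
]
expanded = expand_query("association rules", top_docs)
print(expanded)
```

Here "mining" co-occurs with "association" in two of the three documents with full confidence, so it is the only term added to the query.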

