Query Expansion in Information Retrieval using Frequent Pattern (FP) Growth Algorithm for Frequent Itemset Search and Association Rules Mining

Frequent pattern mining has been an important research direction in association rules. This paper preprocesses the original dataset using fuzzy clustering, which maps quantitative datasets into linguistic datasets. We then propose an algorithm based on a fuzzy frequent pattern tree for extracting fuzzy frequent itemsets from the mapped linguistic datasets. Experimental results show that our algorithm takes less computing time than F-Apriori on huge databases. For large databases, the algorithm presented in this paper shows good prospects.

Association Rules Analysis on FP-Growth Method in Predicting Sales

10.31227/osf.io/8m57c ◽

2017 ◽

Author(s):

Andysah Putera Utama Siahaan ◽

Mesran Mesran ◽

Andre Hasudungan Lubis ◽

Ali Ikhwan ◽

Supiyandi

Keyword(s):

Data Mining ◽

Association Rules ◽

Frequent Itemset ◽

Frequent Pattern ◽

Data Set ◽

Pattern Processing ◽

Large Databases ◽

Growth Method ◽

Association Rules Analysis ◽

A Company

Sales transaction data in a company continues to increase day by day. Large amounts of data can be problematic for a company if they are not managed properly. Data mining is a field of science that unifies techniques from machine learning, pattern processing, statistics, databases, and visualization to handle the problem of retrieving information from large databases. The relationship sought in data mining can be a relationship between two or more items in one dimension. The Frequent Pattern Growth (FP-Growth) algorithm, one of the association rule algorithms in data mining, is an alternative that can be used to determine the most frequent itemsets in a data set.

Privacy Preserving in Association Rules Mining with VPA Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.263-266.3060 ◽

2012 ◽

Vol 263-266 ◽

pp. 3060-3063 ◽

Cited By ~ 1

Author(s):

Yi Tao Zhang ◽

Wen Liang Tang ◽

Cheng Wang Xie ◽

Ji Qiang Xiong

Keyword(s):

Data Mining ◽

Association Rules ◽

Privacy Preserving ◽

Frequent Itemset ◽

Apriori Algorithm ◽

Privacy Preserving Data Mining ◽

Association Rules Mining ◽

Synthetic Datasets

A VPA algorithm is proposed for mining association rules in privacy-preserving data mining, where the data is vertically partitioned. The VSS protocol is used to encrypt the vertically partitioned data, which is owned by different parties, and a private comparison protocol is adopted to generate the frequent itemsets. In VPA, the ID numbers of the records, saved in an ID index array, are used to keep the data consistent among the different parties. The VPA algorithm can generate association rules without violating privacy. The performance of the scheme is validated against representative real and synthetic datasets. The results reveal that the VPA algorithm, working on vertically partitioned and fully encrypted data, finds the same frequent itemsets and generates rules consistent with those of the Apriori algorithm.

Semantic Query Expansion Combining Association Rules with Ontologies and Information Retrieval Techniques

Data Warehousing and Knowledge Discovery - Lecture Notes in Computer Science ◽

10.1007/11546849_32 ◽

2005 ◽

pp. 326-335 ◽

Cited By ~ 3

Author(s):

Min Song ◽

Il-Yeol Song ◽

Xiaohua Hu ◽

Robert Allen

Keyword(s):

Information Retrieval ◽

Association Rules ◽

Query Expansion ◽

Semantic Query

Query Expansion of Pseudo Relevance Feedback Based on Matrix-Weighted Association Rules Mining

Journal of Software ◽

10.3724/sp.j.1001.2009.03368 ◽

2010 ◽

Vol 20 (7) ◽

pp. 1854-1865 ◽

Cited By ~ 5

Author(s):

Ming-Xuan HUANG ◽

Xiao-Wei YAN ◽

Shi-Chao ZHANG

Keyword(s):

Association Rules ◽

Relevance Feedback ◽

Query Expansion ◽

Association Rules Mining ◽

Weighted Association Rules ◽

Pseudo Relevance Feedback

Arabic Query Expansion Using WordNet and Association Rules

Information Retrieval and Management ◽

10.4018/978-1-5225-5191-1.ch054 ◽

2018 ◽

pp. 1239-1254 ◽

Cited By ~ 1

Author(s):

Ahmed Abbache ◽

Farid Meziane ◽

Ghalem Belalem ◽

Fatma Zohra Belkredim

Keyword(s):

Information Retrieval ◽

Association Rules ◽

Query Expansion ◽

Arabic Language ◽

Selection Method ◽

Empirical Results ◽

Retrieval Systems ◽

Information Retrieval Systems ◽

Significant Performance ◽

Selection Of

Query expansion is the process of adding relevant terms to the original queries to improve the performance of information retrieval systems. However, previous studies showed that automatic query expansion using WordNet does not lead to an improvement in performance. One of the main challenges of query expansion is the selection of appropriate terms. In this paper, the authors review this problem using Arabic WordNet and Association Rules within the context of the Arabic language. The results obtained confirmed that, with an appropriate selection method, the authors are able to exploit Arabic WordNet to improve retrieval performance. Their empirical results on a sub-corpus from the Xinhua collection showed that their automatic selection method achieved a significant performance improvement in terms of MAP and recall, and better precision for the top retrieved documents.

Research on Parallel Association Rules Mining Algorithm Based on Hadoop

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.543-547.3625 ◽

2014 ◽

Vol 543-547 ◽

pp. 3625-3631

Author(s):

Shao Rong Feng ◽

Lin Bao Ye ◽

Zi Yu Lin

Keyword(s):

Cloud Computing ◽

Association Rules ◽

Random Access ◽

Comprehensive Analysis ◽

Frequent Itemset ◽

Association Rules Mining ◽

Computing Platform ◽

Mining Algorithm ◽

Cloud Computing Platform ◽

Mining Algorithms

The purpose of association rules mining is to find rules that meet the minimum support and minimum confidence in a large quantity of data. To find valid association rules efficiently, we carried out a comprehensive analysis of some well-known parallel association rules mining algorithms and propose a new parallel association rules mining algorithm (Array Based on Hadoop, ABH for short) built on the cloud computing platform. ABH scans the database only once, uses a 0/1 array to represent each transaction, and records the frequency of identical transactions. Moreover, by utilizing the random access characteristics of the array and the special nature of frequent itemsets, ABH can effectively reduce the number of candidate frequent itemsets and find the frequent itemsets quickly. Experiments comparing ABH with two classical algorithms, CD and DD, show that ABH outperforms both.
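The 0/1 array idea in this abstract can be illustrated with a minimal single-machine sketch. It is not the authors' ABH implementation and does not use Hadoop; the function names (`encode`, `support`, `confidence`) and the toy transactions are assumptions made for illustration only.

```python
from collections import Counter

def encode(transactions, items):
    """One pass over the data: map each transaction to a 0/1 tuple over a
    fixed item order and count how often each identical tuple occurs."""
    index = {item: i for i, item in enumerate(items)}
    counts = Counter()
    for t in transactions:
        row = [0] * len(items)
        for item in t:
            row[index[item]] = 1
        counts[tuple(row)] += 1
    return counts

def support(itemset, counts, items):
    """Fraction of transactions whose 0/1 row has a 1 for every item in itemset."""
    index = {item: i for i, item in enumerate(items)}
    positions = [index[i] for i in itemset]
    total = sum(counts.values())
    hits = sum(c for row, c in counts.items() if all(row[p] for p in positions))
    return hits / total

def confidence(antecedent, consequent, counts, items):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = list(antecedent) + list(consequent)
    return support(both, counts, items) / support(antecedent, counts, items)

# Hypothetical toy data, not taken from the paper.
items = ["bread", "butter", "milk"]
transactions = [{"bread", "milk"}, {"bread", "milk"}, {"bread", "butter"}, {"milk"}]
counts = encode(transactions, items)
```

Because identical transactions collapse into a single counted row, the support of an itemset is computed over distinct rows rather than over every raw transaction. A rule is kept only if its support and confidence reach the chosen minimum thresholds.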

Research of Improved FP-Growth Algorithm in Association Rules Mining

Scientific Programming ◽

10.1155/2015/910281 ◽

2015 ◽

Vol 2015 ◽

pp. 1-6 ◽

Cited By ~ 10

Author(s):

Yi Zeng ◽

Shiqun Yin ◽

Jiangyue Liu ◽

Miao Zhang

Keyword(s):

Data Mining ◽

Association Rules ◽

Experimental Results ◽

Frequent Pattern ◽

Association Rules Mining ◽

Classical Algorithm ◽

Pattern Growth ◽

Data Volume ◽

Better Than

Association rules mining is an important technology in data mining. The FP-Growth (frequent-pattern growth) algorithm is a classical algorithm in association rules mining. However, FP-Growth needs to scan the database twice, which reduces the efficiency of the algorithm. Through the study of association rules mining and the FP-Growth algorithm, we worked out two improved versions of FP-Growth: the Painting-Growth algorithm and the N (not) Painting-Growth algorithm (which removes the painting steps and achieves the same result in another way). We compared the two improved algorithms with the FP-Growth algorithm. Experimental results show that, with a data volume of more than 1050 for the Painting-Growth algorithm and less than 10000 for the N Painting-Growth algorithm, the two improved algorithms perform better than the FP-Growth algorithm.
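The two database scans of classical FP-Growth mentioned in this abstract can be sketched as follows. This covers only FP-tree construction (the mining step and the header table are omitted), and it is not the Painting-Growth variant; the names `Node` and `build_fp_tree` and the sample transactions are hypothetical.

```python
from collections import Counter

class Node:
    """One FP-tree node: an item, its count, and its children keyed by item."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_count):
    # Scan 1: count how many transactions contain each item.
    freq = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in freq.items() if c >= min_count}

    root = Node(None)
    # Scan 2: insert the frequent items of each transaction, ordered by
    # descending frequency (ties broken alphabetically), as a path in the tree.
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent

# Hypothetical toy data.
transactions = [["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]]
root, frequent = build_fp_tree(transactions, min_count=2)
```

Here every transaction contains `b`, so all paths share the single prefix node `b`. This prefix sharing is what keeps the tree compact, and the second pass over the data is the cost that the improved algorithms described above aim to avoid.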

A New Approach in Query Expansion Methods for Improving Information Retrieval

JUITA Jurnal Informatika ◽

10.30595/juita.v9i1.9657 ◽

2021 ◽

Vol 9 (1) ◽

pp. 93

Author(s):

Lasmedi Afuan ◽

Ahmad Ashari ◽

Yohanes Suyanto

Keyword(s):

Association Rules ◽

Query Expansion ◽

Initial Step ◽

Document Retrieval ◽

Frequent Itemset ◽

New Approach ◽

Pos Tagging ◽

Measuring Devices ◽

Initial Processing ◽

F Measure

This research develops a new approach to query expansion by integrating Association Rules (AR) and Ontology. The proposed approach expands the query in several steps: (1) the document retrieval step; (2) query expansion using AR; (3) query expansion using Ontology. In the initial step, the system retrieves the top documents for the user's initial query. Next comes the initial processing step (stopword removal, POS tagging, TF-IDF). A Frequent Itemset (FI) search is then run with FP-Growth on the list of terms generated by the previous step, and association rules are derived from the FI results. The output of the AR step is expanded using Ontology, and the results of the Ontology expansion are used as new queries. The dataset used is a collection of learning documents. Ten queries were used for testing, and the test results were measured with three metrics: recall, precision, and f-measure. Based on the testing and analysis results, integrating AR and Ontology can increase the relevance of documents, with recall, precision, and f-measure values of 87.28, 79.07, and 82.85.
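The AR step of this pipeline can be sketched roughly as below. The sketch makes several simplifying assumptions: it counts co-occurring term pairs directly instead of running FP-Growth, it omits the preprocessing and Ontology steps, and the function name `expand_query`, the thresholds, and the sample documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def expand_query(query_terms, top_docs, min_support=0.5, min_confidence=0.6):
    """Add a term to the query when a rule (query term -> term), mined from
    the top retrieved documents, meets the support and confidence thresholds."""
    n = len(top_docs)
    doc_sets = [set(d) for d in top_docs]
    single = Counter(t for d in doc_sets for t in d)
    pair = Counter(frozenset(p) for d in doc_sets
                   for p in combinations(sorted(d), 2))

    expansion = set()
    for p, c in pair.items():
        if c / n < min_support:          # the pair is not a frequent itemset
            continue
        a, b = tuple(p)
        for lhs, rhs in ((a, b), (b, a)):
            rule_confidence = c / single[lhs]
            if (lhs in query_terms and rhs not in query_terms
                    and rule_confidence >= min_confidence):
                expansion.add(rhs)
    return list(query_terms) + sorted(expansion)

# Hypothetical top documents, already reduced to term lists.
top_docs = [
    ["data", "mining", "rules"],
    ["mining", "rules", "pattern"],
    ["mining", "data"],
    ["query", "rules"],
]
expanded = expand_query(["mining"], top_docs)
```

In this toy case both `data` and `rules` co-occur with `mining` in half of the documents and follow it with confidence 2/3, so both are appended to the query. In the approach described above, these expanded terms would then be passed to the Ontology step.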
