A New Approach in Query Expansion Methods for Improving Information Retrieval

This research develops a new approach to query expansion that integrates Association Rules (AR) and Ontology. The proposed approach expands the query in three steps: (1) document retrieval; (2) query expansion using AR; and (3) query expansion using Ontology. In the first step, the system retrieves the top-ranked documents for the user's initial query. These documents are then preprocessed (stopword removal, POS tagging, TF-IDF weighting). Frequent Itemsets (FI) are mined from the resulting list of terms using FP-Growth, and association rules are derived from the frequent itemsets. The output of the AR step is then expanded using the Ontology, and the result of this expansion is used as the new query. The dataset is a collection of learning documents. Ten queries were used for testing, and the results were measured with three metrics: recall, precision, and F-measure. Based on the testing and analysis, integrating AR and Ontology increases document relevance, with recall, precision, and F-measure values of 87.28, 79.07, and 82.85, respectively.
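The three evaluation measures named in this abstract can be illustrated with a minimal Python sketch. The function name `precision_recall_f` and the document identifiers are hypothetical and chosen only for illustration; the abstract does not say how the authors computed or averaged their scores over the ten test queries.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F-measure) for a single query.

    `retrieved` is the collection of document ids returned by the system and
    `relevant` is the collection of ids judged relevant (both hypothetical).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # F-measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical example: 4 documents retrieved, 3 relevant overall, 2 in common.
p, r, f = precision_recall_f(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
print(p, r, f)
```

In this toy case two of the four retrieved documents are relevant, so precision is 2/4, recall is 2/3, and the F-measure is their harmonic mean.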

Optimasi Pembobotan pada Query Expansion dengan Term Relatedness to Query-Entropy based (TRQE)

Jurnal Buana Informatika ◽

10.24002/jbi.v6i3.433 ◽

2015 ◽

Vol 6 (3) ◽

Author(s):

Resti Ludviani ◽

Khadijah F. Hayati ◽

Agus Zainal Arifin ◽

Diana Purwitasari

Keyword(s):

Query Expansion ◽

Retrieval System ◽

Document Retrieval ◽

Retrieval Performance ◽

Term Weighting ◽

New Approach ◽

Term Selection ◽

Relevance Evaluation ◽

Feedback Module ◽

Pseudo Feedback

Abstract. Selecting appropriate terms for expanding a query is very important in query expansion. Therefore, term selection optimization is added to improve query expansion performance in a document retrieval system. This study proposes a new approach, Term Relatedness to Query-Entropy based (TRQE), which optimizes term weights in query expansion by considering the semantic and statistical aspects of the relevance evaluation of pseudo feedback, in order to improve document retrieval performance. The proposed method has three main modules: relevance feedback, pseudo feedback, and document retrieval. TRQE is implemented in the pseudo feedback module to optimize term weighting in query expansion. The evaluation shows that TRQE retrieves documents with a best result of 100% precision and 22.22% recall. TRQE for weighting optimization in query expansion is shown to improve document retrieval.

Keywords: TRQE, query expansion, term weighting, term relatedness to query, relevance feedback

Query Expansion in Information Retrieval using Frequent Pattern (FP) Growth Algorithm for Frequent Itemset Search and Association Rules Mining

International Journal of Advanced Computer Science and Applications ◽

10.14569/ijacsa.2019.0100235 ◽

2019 ◽

Vol 10 (2) ◽

Author(s):

Lasmedi Afuan ◽

Ahmad Ashari ◽

Yohanes Suyanto

Keyword(s):

Information Retrieval ◽

Association Rules ◽

Query Expansion ◽

Frequent Itemset ◽

Frequent Pattern ◽

Association Rules Mining

A personalized query expansion approach for engineering document retrieval

Advanced Engineering Informatics ◽

10.1016/j.aei.2014.04.002 ◽

2014 ◽

Vol 28 (4) ◽

pp. 344-359 ◽

Cited By ~ 20

Author(s):

Gyeong June Hahm ◽

Mun Yong Yi ◽

Jae Hyun Lee ◽

Hyo Won Suh

Keyword(s):

Query Expansion ◽

Document Retrieval

Open-vocabulary spoken-document retrieval based on query expansion using related web documents

10.21437/interspeech.2008-568 ◽

2008 ◽

Author(s):

Makoto Terao ◽

Takafumi Koshinaka ◽

Shinichi Ando ◽

Ryosuke Isotani ◽

Akitoshi Okumura

Keyword(s):

Query Expansion ◽

Document Retrieval ◽

Spoken Document Retrieval ◽

Web Documents

Efficient Association Rules Selecting for Automatic Query Expansion

Computational Linguistics and Intelligent Text Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-319-77116-8_42 ◽

2018 ◽

pp. 563-574

Author(s):

Ahlem Bouziri ◽

Chiraz Latiri ◽

Eric Gaussier

Keyword(s):

Association Rules ◽

Query Expansion

Proposal of a composite indicator for measuring social media presence in the wine market

CARMA 2020 - 3rd International Conference on Advanced Research Methods and Analytics ◽

10.4995/carma2020.2020.11647 ◽

2020 ◽

Author(s):

Andrea Conchado Peiró ◽

José Miguel Carot Sierra ◽

Elena Vázquez Barrachina ◽

Enrique Orduña Malea

Keyword(s):

Social Media ◽

Market Research ◽

Initial Step ◽

Composite Indicators ◽

New Approach ◽

Misleading Information ◽

Social Media Networks ◽

Wine Market ◽

Definition Of ◽

Research Analysis

The field of cybermetrics is attracting considerable interest due to its utility as a data-oriented technique for research, though it may provide misleading information when used in complex systems. This paper outlines a new approach to market research analysis through the definition of composite indicators for cybermetrics, applied to the Spanish wine market. Our findings show that the majority of cellars were present in only one or two social media networks: Facebook, Twitter, or both. In addition, presence on the Web can be summarized in three principal components: website quality, presence on Facebook, and presence on Twitter. Three groups of cellars were identified according to their position on these components: cellars with a high number of errors on their website and no information at all in social media, cellars with a strong presence in social media, and cellars in an intermediate position. Our results constitute an excellent initial step towards the definition of a methodology for building composite indicators in cybermetrics. From a practical standpoint, these indicators may encourage cellar managers to make better decisions in their transition to the digital market.

Efficient Implementations for UWEP Incremental Frequent Itemset Mining Algorithm

International Journal of Applied Logistics ◽

10.4018/ijal.2021010102 ◽

2021 ◽

Vol 11 (1) ◽

pp. 18-37

Author(s):

Mehmet Bicer ◽

Daniel Indictor ◽

Ryan Yang ◽

Xiaowen Zhang

Keyword(s):

Association Rules ◽

Association Rule ◽

Search Space ◽

Frequent Itemset ◽

Incremental Algorithm ◽

Frequent Patterns ◽

Lazy Evaluation ◽

Rule Mining ◽

Implementation Techniques ◽

Common Technique

Association rule mining is a common technique for discovering interesting frequent patterns in data acquired in various application domains. The search space explodes combinatorially as the size of the data increases. Furthermore, the introduction of new data can invalidate old frequent patterns and introduce new ones. Hence, while finding association rules efficiently is an important problem, maintaining and updating them is also crucial. Several algorithms have been introduced to find association rules efficiently; one of them is Apriori. There are also algorithms written to update or maintain existing association rules; update with early pruning (UWEP) is one such algorithm. In this paper, the authors propose that under certain conditions it is preferable to use an incremental algorithm rather than the classic Apriori algorithm. They also propose new implementation techniques and improvements to the original UWEP paper in an algorithm they call UWEP2. These include the use of memoization and lazy evaluation to reduce scans of the dataset.
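The idea of reducing dataset scans through caching and on-demand evaluation can be sketched in a few lines of Python. This is not UWEP or UWEP2, whose details the abstract does not give; the class `SupportCounter` and its methods are hypothetical names. It only shows the general pattern: a support count is computed the first time it is requested (lazy evaluation), cached afterwards (memoization), and updated from the new transactions alone when data is added.

```python
class SupportCounter:
    """Illustrative support counter with cached, lazily computed counts."""

    def __init__(self, transactions):
        self.transactions = [frozenset(t) for t in transactions]
        self._cache = {}  # itemset -> absolute support count
        self.scans = 0    # number of full passes over the dataset

    def support(self, itemset):
        key = frozenset(itemset)
        if key not in self._cache:
            # Lazy: a full scan happens only on the first request for this itemset.
            self.scans += 1
            self._cache[key] = sum(1 for t in self.transactions if key <= t)
        return self._cache[key]

    def add(self, new_transactions):
        """Incremental update: cached counts are refreshed from the new data only."""
        new = [frozenset(t) for t in new_transactions]
        for key in self._cache:
            self._cache[key] += sum(1 for t in new if key <= t)
        self.transactions.extend(new)


counter = SupportCounter([{"a", "b"}, {"a", "c"}, {"a", "b", "c"}])
first = counter.support({"a", "b"})   # triggers one full scan
again = counter.support({"a", "b"})   # served from the cache
counter.add([{"a", "b"}])             # touches only the new transaction
updated = counter.support({"a", "b"})
print(first, again, updated, counter.scans)
```

The `scans` attribute exists only to make the saving visible: repeated and post-update queries for the same itemset do not trigger another pass over the full dataset.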

ATURAN REKOMENDASI BARANG MENGGUNAKAN MULTI LEVEL ASSOCIATION RULES MINING (ML-ARM)

Transmisi ◽

10.14710/transmisi.20.2.49-56 ◽

2018 ◽

Vol 20 (2) ◽

pp. 49

Author(s):

Zahra Arwananing Tyas

Keyword(s):

Association Rules ◽

Association Rules Mining ◽

Multi Level ◽

F Measure

A recommender system can generate recommendations in many ways and with many methods. One of them is to exploit an archive of past cases or past transaction data, from which information or rules can be derived using Association Rules Mining (ARM). The rules are formed with the multi-level ARM method, which yields 5 rules that are matched against the user's input. When a rule matches, its consequent is returned as the recommendation. In testing, the generated rules achieved an accuracy of 94.12%, and the precision, recall, and F-measure of the rule-based recommendation process were 0.475, 0.513, and 0.25, respectively.
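The matching step described in this abstract, where the consequent of a matching rule becomes the recommendation, can be sketched as follows. The rules, item names, and the function `recommend` are hypothetical; the paper's five actual rules and its multi-level item hierarchy are not given in the abstract.

```python
# Each rule is (antecedent, consequent); the contents are made up for illustration.
RULES = [
    (frozenset({"bread", "butter"}), frozenset({"jam"})),
    (frozenset({"coffee"}), frozenset({"sugar", "milk"})),
    (frozenset({"tea", "sugar"}), frozenset({"biscuit"})),
]


def recommend(basket, rules):
    """Return consequent items of every rule whose antecedent is contained in the basket."""
    basket = set(basket)
    recommendations = []
    for antecedent, consequent in rules:
        if antecedent <= basket:  # the rule matches the user's input
            for item in sorted(consequent - basket):
                if item not in recommendations:
                    recommendations.append(item)
    return recommendations


print(recommend({"bread", "butter", "coffee"}, RULES))
```

Items already in the basket are skipped so that the system does not recommend something the user has already chosen; that filtering is a design choice of this sketch, not something the abstract states.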

A hybrid evolutionary algorithm based automatic query expansion for enhancing document retrieval system

Journal of Ambient Intelligence and Humanized Computing ◽

10.1007/s12652-019-01247-9 ◽

2019 ◽

Cited By ~ 7

Author(s):

Dilip Kumar Sharma ◽

Rajendra Pamula ◽

D. S. Chauhan

Keyword(s):

Evolutionary Algorithm ◽

Query Expansion ◽

Retrieval System ◽

Document Retrieval ◽

Hybrid Evolutionary Algorithm

Closed-Itemset Incremental-Mining Problem

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch029 ◽

2011 ◽

pp. 150-153

Author(s):

Luminita Dumitriu

Keyword(s):

Association Rules ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Incremental Mining ◽

Minimum Support ◽

Confidence Threshold ◽

Support Threshold ◽

Mining Association Rules

Association rules, introduced by Agrawal, Imielinski and Swami (1993), provide useful means to discover associations in data. The problem of mining association rules in a database is defined as finding all the association rules that hold with more than a user-given minimum support threshold and a user-given minimum confidence threshold. According to Agrawal, Imielinski and Swami, this problem is solved in two steps: 1. Find all frequent itemsets in the database. 2. For each frequent itemset $I$, generate all the association rules $I' \Rightarrow I \setminus I'$, where $I' \subset I$.
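The two-step formulation can be illustrated with a deliberately naive Python sketch. It enumerates candidate itemsets by brute force rather than using Apriori or FP-Growth, so it is suitable only for tiny examples; the function names, the toy transactions, and the threshold values are assumptions made for illustration. Thresholds are applied here as "greater than or equal to".

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: return {itemset: support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                frequent[candidate] = support
                found = True
        if not found:
            # No frequent itemset of this size means none of any larger size.
            break
    return frequent


def association_rules(frequent, min_confidence):
    """Step 2: for each frequent itemset I, emit rules I' => I \\ I' with I' a proper subset."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is frequent, so the lookup is safe.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, support, confidence))
    return rules


# Toy transactions (hypothetical): each set is the terms of one document.
docs = [
    {"query", "expansion", "retrieval"},
    {"query", "expansion"},
    {"query", "retrieval"},
    {"expansion", "ontology"},
]
frequent = frequent_itemsets(docs, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.8)
for antecedent, consequent, support, confidence in rules:
    print(set(antecedent), "=>", set(consequent), support, confidence)
```

With these toy inputs the only rule that clears both thresholds is `retrieval => query`, since every transaction containing `retrieval` also contains `query`.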
