A New Approach in Query Expansion Methods for Improving Information Retrieval

2021 ◽  
Vol 9 (1) ◽  
pp. 93
Author(s):  
Lasmedi Afuan ◽  
Ahmad Ashari ◽  
Yohanes Suyanto

This research develops a new approach to query expansion that integrates Association Rules (AR) and Ontology. The proposed approach expands the query in three steps: (1) document retrieval; (2) query expansion using AR; and (3) query expansion using Ontology. In the first step, the system retrieves the top documents for the user's initial query. These documents then undergo initial processing (stopword removal, POS tagging, TF-IDF). Next, Frequent Itemsets (FI) are mined with FP-Growth from the list of terms produced by the previous step, and association rules are derived from the FI. The output of the AR step is then expanded using Ontology, and the results of that expansion are used as new queries. The dataset is a collection of learning documents. Ten queries were used for testing, and the results were measured with three metrics: recall, precision, and f-measure. Based on the testing and analysis, integrating AR and Ontology increases the relevance of the retrieved documents, reaching recall, precision, and f-measure values of 87.28, 79.07, and 82.85, respectively.
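The two expansion steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it mines only pairwise rules by simple co-occurrence counting rather than FP-Growth, and the function names, thresholds, and the dictionary standing in for an ontology are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(docs, min_support=0.5, min_confidence=0.6):
    """Mine rules term_a -> term_b from the term lists of the top documents.

    Simplification (assumption): only 2-itemsets are considered, counted
    directly instead of with FP-Growth.
    """
    term_sets = [set(d) for d in docs]
    n = len(term_sets)
    item_count = Counter(t for s in term_sets for t in s)
    pair_count = Counter(
        frozenset(p) for s in term_sets for p in combinations(sorted(s), 2)
    )
    rules = {}
    for pair, count in pair_count.items():
        if count / n < min_support:
            continue  # the pair is not a frequent itemset
        a, b = tuple(pair)
        for antecedent, consequent in ((a, b), (b, a)):
            if count / item_count[antecedent] >= min_confidence:
                rules.setdefault(antecedent, set()).add(consequent)
    return rules


def expand_query(query_terms, rules, ontology):
    """Expand a query with rule consequents, then with ontology neighbours."""
    expanded = list(query_terms)
    # AR step: add the consequents of rules whose antecedent is a query term.
    for term in query_terms:
        for consequent in sorted(rules.get(term, ())):
            if consequent not in expanded:
                expanded.append(consequent)
    # Ontology step: add terms related to anything collected so far.
    for term in list(expanded):
        for related in ontology.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


# Hypothetical term lists from the top-ranked documents, after preprocessing.
docs = [
    ["query", "expansion", "retrieval"],
    ["query", "expansion", "ontology"],
    ["query", "retrieval"],
    ["ranking", "retrieval"],
]
toy_ontology = {"retrieval": ["search"]}  # stand-in for a real ontology
rules = mine_pair_rules(docs)
new_query = expand_query(["query"], rules, toy_ontology)
```

With these toy inputs the expanded query contains the original term, the two rule consequents, and one ontology neighbour.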

2015 ◽  
Vol 6 (3) ◽  
Author(s):  
Resti Ludviani ◽  
Khadijah F. Hayati ◽  
Agus Zainal Arifin ◽  
Diana Purwitasari

Abstract. Selecting appropriate terms for expanding a query is very important in query expansion. Term selection optimization is therefore added to improve query expansion performance in a document retrieval system. This study proposes a new approach named Term Relatedness to Query-Entropy based (TRQE), which optimizes term weights in query expansion by considering semantic and statistical aspects of the relevance evaluation of pseudo feedback in order to improve document retrieval performance. The proposed method has three main modules: relevance feedback, pseudo feedback, and document retrieval. TRQE is implemented in the pseudo feedback module to optimize term weighting in query expansion. The evaluation shows that TRQE retrieves documents with a best result of 100% precision and 22.22% recall. TRQE for weighting optimization in query expansion is shown to improve document retrieval.
Keywords: TRQE, query expansion, term weighting, term relatedness to query, relevance feedback


2014 ◽  
Vol 28 (4) ◽  
pp. 344-359 ◽  
Author(s):  
Gyeong June Hahm ◽  
Mun Yong Yi ◽  
Jae Hyun Lee ◽  
Hyo Won Suh

2008 ◽  
Author(s):  
Makoto Terao ◽  
Takafumi Koshinaka ◽  
Shinichi Ando ◽  
Ryosuke Isotani ◽  
Akitoshi Okumura

Author(s):  
Andrea Conchado Peiró ◽  
José Miguel Carot Sierra ◽  
Elena Vázquez Barrachina ◽  
Enrique Orduña Malea

The field of cybermetrics is attracting considerable interest due to its utility as a data-oriented technique for research, though it may provide misleading information when used in complex systems. This paper outlines a new approach to market research analysis through the definition of composite indicators for cybermetrics, applied to the Spanish wine market. Our findings show that the majority of cellars were present in only one or two social media networks: Facebook, Twitter, or both. In addition, presence on the Web can be summarized in three principal components: website quality, presence on Facebook, and presence on Twitter. Three groups of cellars were identified according to their position on these components: cellars with a high number of errors on their website and a complete absence of information in social media, cellars with a strong presence in social media, and cellars in an intermediate position. Our results constitute an excellent initial step towards the definition of a methodology for building composite indicators in cybermetrics. From a practical standpoint, these indicators may encourage cellar managers to make better decisions in their transition to the digital market.


2021 ◽  
Vol 11 (1) ◽  
pp. 18-37
Author(s):  
Mehmet Bicer ◽  
Daniel Indictor ◽  
Ryan Yang ◽  
Xiaowen Zhang

Association rule mining is a common technique for discovering interesting frequent patterns in data acquired in various application domains. The search space explodes combinatorially as the size of the data increases. Furthermore, the introduction of new data can invalidate old frequent patterns and introduce new ones. Hence, while finding association rules efficiently is an important problem, maintaining and updating them is also crucial. Several algorithms have been introduced to find association rules efficiently; one of them is Apriori. There are also algorithms for updating or maintaining existing association rules; Update With Early Pruning (UWEP) is one such algorithm. In this paper, the authors propose that under certain conditions it is preferable to use an incremental algorithm rather than the classic Apriori algorithm. They also propose new implementation techniques and improvements to the original UWEP paper in an algorithm they call UWEP2. These include the use of memoization and lazy evaluation to reduce scans of the dataset.
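For orientation, the classic Apriori baseline mentioned above can be sketched as below. This is not UWEP or UWEP2. It is a minimal level-wise Apriori in which support counts are memoized and a candidate's support is only computed once subset pruning has passed, which loosely illustrates how caching and lazy evaluation can reduce dataset scans. The function name and the use of an absolute support count as the threshold are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count of transactions (assumption).
    """
    transactions = [frozenset(t) for t in transactions]
    support_cache = {}  # memoized support counts: one scan per distinct itemset

    def support(itemset):
        if itemset not in support_cache:
            support_cache[itemset] = sum(1 for t in transactions if itemset <= t)
        return support_cache[itemset]

    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must already be frequent. The `and`
        # short-circuits, so the dataset is scanned only for surviving candidates.
        level = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent_itemsets = apriori(baskets, min_support=3)
```

In this toy data every single item and every pair is frequent, while the triple {a, b, c} appears in only two baskets and is dropped.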


Transmisi ◽  
2018 ◽  
Vol 20 (2) ◽  
pp. 49
Author(s):  
Zahra Arwananing Tyas

Recommender systems can generate recommendations in many ways and with many different methods. One of them is to use an accumulated store of old cases or old transaction data, from which information or rules can be extracted with the Association Rules Mining (ARM) method. Here, rules are formed with the multi-level ARM method, yielding 5 rules that are matched against the user's input. When a rule is found to match, the consequent of that rule becomes the recommendation. In testing, the rules formed have an accuracy of 94.12%, and the precision, recall, and F-measure of the recommender system for the rule-based recommendation process are 0.475, 0.513, and 0.25, respectively.
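The matching step described above can be sketched in a few lines. This is a hedged illustration only: the rule format, the `recommend` function, the example rules, and the use of confidence to order the results are assumptions, not details taken from the paper.

```python
def recommend(user_items, rules):
    """Return the consequents of every rule whose antecedent matches the input.

    rules: iterable of (antecedent, consequent, confidence) tuples, where
    antecedent and consequent are collections of items (hypothetical format).
    """
    user_items = set(user_items)
    scores = {}
    for antecedent, consequent, confidence in rules:
        # A rule matches when all of its antecedent items are in the input.
        if set(antecedent) <= user_items:
            for item in set(consequent) - user_items:
                scores[item] = max(scores.get(item, 0.0), confidence)
    # Order by confidence (assumption), breaking ties alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical rules for illustration.
example_rules = [
    ({"bread"}, {"butter"}, 0.8),
    ({"bread", "milk"}, {"eggs"}, 0.9),
    ({"tea"}, {"sugar"}, 0.7),
]
suggestions = recommend({"bread", "milk"}, example_rules)
```

Items the user already has are not recommended back, and a rule whose antecedent is only partly present contributes nothing.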


Author(s):  
Luminita Dumitriu

Association rules, introduced by Agrawal, Imielinski and Swami (1993), provide useful means to discover associations in data. The problem of mining association rules in a database is defined as finding all the association rules that hold with more than a user-given minimum support threshold and a user-given minimum confidence threshold. According to Agrawal, Imielinski and Swami, this problem is solved in two steps: 1. Find all frequent itemsets in the database. 2. For each frequent itemset $I$, generate all the association rules $I' \Rightarrow I \setminus I'$, where $I' \subset I$.
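The second step can be sketched as follows, assuming the frequent itemsets and their support counts from the first step are already available. The confidence of a rule I' ⇒ I \ I' is computed in the standard way as supp(I) / supp(I'). The function name and input format are assumptions made for the example.

```python
from itertools import combinations


def generate_rules(support, min_confidence):
    """Generate rules antecedent => itemset - antecedent from frequent itemsets.

    support: {frozenset: support count} covering every frequent itemset, so
    that each non-empty proper subset of a frequent itemset is also a key.
    """
    rules = []
    for itemset, count in support.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), size):
                antecedent = frozenset(subset)
                confidence = count / support[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical support counts for illustration.
support_counts = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 3,
    frozenset({"a", "b"}): 3,
}
strong_rules = generate_rules(support_counts, min_confidence=0.8)
```

Here the rule {b} ⇒ {a} has confidence 3/3 and is kept, while {a} ⇒ {b} has confidence 3/4 and falls below the threshold.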

