Arabic Query Expansion Using WordNet and Association Rules

Author(s):  
Ahmed Abbache ◽  
Farid Meziane ◽  
Ghalem Belalem ◽  
Fatma Zohra Belkredim

Query expansion is the process of adding relevant terms to the original query to improve the performance of information retrieval systems. However, previous studies showed that automatic query expansion using WordNet does not lead to improved performance. One of the main challenges of query expansion is the selection of appropriate terms. In this paper, the authors review this problem using Arabic WordNet and association rules within the context of the Arabic language. The results confirm that, with an appropriate selection method, Arabic WordNet can be exploited to improve retrieval performance. Empirical results on a sub-corpus from the Xinhua collection show that the automatic selection method achieves a significant improvement in MAP and recall, and better precision among the top retrieved documents.


Author(s):  
Siham Jabri ◽  
Azzeddine Dahbi ◽  
Taoufiq Gadi

Pseudo-relevance feedback is a query expansion approach in which expansion terms are selected from the top-ranked documents retrieved for the original query. However, the selected terms will not be related to the query if the top retrieved documents are irrelevant, so the expanded query does not improve retrieval performance over the original one. This paper suggests using the documents selected by pseudo-relevance feedback to generate association rules, applying an algorithm based on dominance relations. Strong correlations between query terms and other terms are then detected, and an oriented, weighted graph called Pseudo-Graph Feedback is constructed. This graph serves to expand original queries with semantically related terms selected by the user. Experiments on a Text Retrieval Conference (TREC) collection show significant gains, with the proposed approach outperforming both the baseline system and an existing technique.
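The basic pseudo-relevance feedback loop described above can be illustrated with a minimal sketch. It assumes a plain term-frequency selection over whitespace-tokenized text; it is not the dominance-based algorithm or the Pseudo-Graph Feedback of the paper. The function name `expand_query` and its parameters are hypothetical.

```python
from collections import Counter

def expand_query(query_terms, ranked_docs, top_k=2, n_terms=2):
    """Append the n_terms most frequent non-query terms found in the top_k documents.

    Assumption: query_terms are lowercase and ranked_docs is already ordered
    by the initial retrieval run (best first).
    """
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        # Count every token of the pseudo-relevant document that is not already in the query.
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    candidates = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in candidates[:n_terms]]
```

If the top-ranked documents are irrelevant, the added terms drift away from the query, which is the weakness the abstract points out.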


Author(s):  
Dr. V. Suma

Recent technological developments have drawn people towards information and its services. Managing personal and public data is a perennial research topic. Information retrieval in particular has gained attention, as it is as important as data storage. Clustering-based, similarity-based, and graph-based information retrieval systems have evolved to reduce the issues in conventional information retrieval systems. Learning-based information retrieval is the present trend, and deep neural networks in particular are widely adopted due to their retrieval performance. However, the similarity between pieces of information carries uncertainty due to the way it is measured. To address these issues and improve retrieval performance, a hybrid deep fuzzy hashing algorithm is introduced in this research work. Hashing retrieves information efficiently by mapping similar information to correlated binary codes; this mapping is trained using a deep neural network and fuzzy logic to retrieve the necessary information from a distributed cloud. Experimental results show that the proposed model attains better retrieval accuracy than conventional models such as support vector machines and deep neural networks.
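The retrieval step of a hashing-based system can be sketched as a nearest-neighbour search in Hamming space. The sketch below assumes the binary codes are already given as integers; learning them with a deep network and fuzzy logic, as the abstract describes, is out of scope here. The names `hamming` and `retrieve` are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")

def retrieve(query_code: int, index: dict, k: int = 2) -> list:
    """Return the ids of the k items whose codes are closest to query_code.

    Assumption: index maps an item id to its precomputed binary code.
    """
    ranked = sorted(
        index.items(),
        key=lambda kv: (hamming(query_code, kv[1]), kv[0]),  # distance, then id for ties
    )
    return [item_id for item_id, _ in ranked[:k]]
```

Similar items receive codes that differ in few bits, so a small Hamming distance stands in for semantic similarity.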


2018 ◽  
Vol 2 (4) ◽  
pp. 140 ◽  
Author(s):  
Ramadhana Rosyadi ◽  
Said Al-Faraby ◽  
Adiwijaya Adiwijaya

Islam has 25 prophets as guides for human life, and documents exist that recount the stories of the prophets' lives. This study aims to build a more specific question answering system that generates relevant answers rather than returning whole documents. A question answering system overcomes a limitation of information retrieval systems: it returns the answer to the submitted request directly, not a set of documents that may contain the answer. This study uses a pattern-based method to extract the sentence fragments that match predefined answer patterns. The choice of dataset limits the questions that can be submitted to the information stored in the data itself. In addition, questions are limited to factoid question words, namely who, when, where, what, and how. The accuracy obtained using the pattern-based method in the question answering system is 39.36%.


Author(s):  
Fabrizio Sebastiani

The categorization of documents into subject-specific categories is a useful enhancement for large document collections addressed by information retrieval systems, as a user can first browse a category tree in search of the category that best matches her interests and then issue a query for more specific documents “from within the category.” This approach combines two modalities in information seeking that are most popular in Web-based search engines, i.e., category-based site browsing (as exemplified by, e.g., Yahoo™) and keyword-based document querying (as exemplified by, e.g., AltaVista™). Appropriate query expansion tools need to be provided, though, in order to allow the user to incrementally refine her query through further retrieval passes, thus allowing the system to produce a series of subsequent document rankings that hopefully converge to the user’s expected ranking. In this work we propose that automatically generated, category-specific “associative” thesauri be used for such purpose. We discuss a method for their generation and discuss how the thesaurus specific to a given category may usefully be endowed with “gateways” to the thesauri specific to its parent and children categories.


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
A. R. Rivas ◽  
E. L. Iglesias ◽  
L. Borrajo

Information retrieval focuses on finding, in a large document collection, the documents whose content matches a user query. As formulating well-designed queries is difficult for most users, query expansion is needed to retrieve relevant information. Query expansion techniques are widely applied to improve the efficiency of textual information retrieval systems. These techniques help to overcome vocabulary mismatch by expanding the original query with additional relevant terms and reweighting the terms in the expanded query. In this paper, different text preprocessing and query expansion approaches are combined to improve the documents initially retrieved by a query in a scientific document database. A corpus belonging to MEDLINE, called Cystic Fibrosis, is used as a knowledge source. Experimental results show that the proposed combinations of techniques greatly enhance the efficiency obtained by traditional queries.


Author(s):  
Omar El Midaoui ◽  
Btihal El Ghali ◽  
Abderrahim El Qadi

Geographical queries need a special reformulation process in information retrieval systems (IRS) because of their specificities and hierarchical structure. This fact is ignored by most web search engines. In this paper, we propose an automatic approach for building a spatial taxonomy that models the notion of adjacency and is used to reformulate the spatial part of a geographical query. The approach exploits the documents at the top of the list retrieved when submitting a spatial entity, which is composed of a spatial relation and the name of a city. A transactional database is then constructed, treating each extracted document as a transaction that contains the names of the cities sharing the country of the submitted query's city. The frequent pattern growth (FP-growth) algorithm is applied to this database in its parallel version (parallel FP-growth, PFP) to generate association rules, which form the country's taxonomy in a Big Data context. Experiments were conducted on Spark, and their results show that query reformulation using the taxonomy built with our proposed approach improves the precision and the effectiveness of the IRS.
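To make the transaction-to-rule step concrete, here is a minimal sketch that derives pairwise association rules by brute-force counting. It is a stand-in for illustration only: it is neither FP-growth nor its parallel Spark version, and it only handles rules between two single items. The function name `pair_rules` and the thresholds are assumptions.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return {(antecedent, consequent): confidence} for frequent item pairs.

    Assumption: each transaction is the list of city names found in one document.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))            # ignore repeats inside a document
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))  # each unordered pair counted once
    rules = {}
    for (a, b), count in pair_counts.items():
        if count / n < min_support:                 # support = share of transactions with both
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]     # confidence of the rule x -> y
            if confidence >= min_confidence:
                rules[(x, y)] = confidence
    return rules
```

With made-up transactions such as `[["Rabat", "Casablanca"], ["Rabat", "Casablanca", "Fes"], ["Rabat", "Fes"], ["Casablanca"]]`, the rule Fes -> Rabat holds in every document that mentions Fes, so the two cities would be linked in the resulting taxonomy.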


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Falah Al-akashi ◽  
Diana Inkpen

What is a real-time agent, how does it remedy ongoing daily frustrations for users, and how does it improve retrieval performance on the World Wide Web? These are the main questions we focus on in this manuscript. In many distributed information retrieval systems, information in agents should be ranked based on a combination of multiple criteria. Linear combination of ranks has been the dominant approach due to its simplicity and effectiveness. In a distributed infrastructure, such a combination scheme requires that the ranks in resources or agents be comparable to each other before they are combined. The main challenge is transforming the raw rank values of different criteria appropriately to make them comparable before any combination. The different ways in which agents rank make this strategy difficult. In this research, we demonstrate how to rank Web documents based on resource-provided information and how to combine the ranking schemas of several resources at once. The proposed system was implemented specifically on data provided by agents to create a comparable combination of different attributes. The proposed approach was tested on the queries provided by the Text Retrieval Conference (TREC). Experimental results showed that our approach is effective and robust compared with offline search platforms.
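The need to make raw values comparable before a linear combination can be shown with a short sketch. It assumes min-max normalization and fixed weights, which is one common choice and not necessarily the transformation used by the authors. The names `min_max` and `combine` are hypothetical.

```python
def min_max(scores: dict) -> dict:
    """Rescale raw scores to [0, 1] so that different sources become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: the source carries no ranking information.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def combine(score_lists, weights):
    """Rank documents by the weighted sum of their normalized scores."""
    combined = {}
    for scores, weight in zip(score_lists, weights):
        for doc, s in min_max(scores).items():
            combined[doc] = combined.get(doc, 0.0) + weight * s
    # Highest combined score first; document id breaks ties deterministically.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))
```

Without the normalization step, a source that scores on a 0 to 10 scale would dominate one that scores on a 0 to 1 scale regardless of the weights.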

