The Use of Arabic WordNet in Arabic Information Retrieval

2014 ◽  
Vol 4 (3) ◽  
pp. 54-65 ◽  
Author(s):  
Ahmed Abbache ◽  
Fatiha Barigou ◽  
Fatma Zohra Belkredim ◽  
Ghalem Belalem

Research and experimentation using Arabic WordNet in the field of information retrieval are relatively new and remain limited compared to the research that has been done using Princeton WordNet. This work studies the impact of Arabic WordNet on the performance of Arabic information retrieval. We extend Lucene with Arabic WordNet to expand users' queries. The major contribution of this study is an interactive query expansion (IQE) methodology that uses each word's part of speech, according to the role it plays in the query. First, the user selects the appropriate part of speech for each term in the original query, and then selects the appropriate synonyms. Experimental results show that our IQE strategy improves Mean Average Precision (MAP) by 12.6%, whereas no variant of the automatic query expansion (AQE) strategies did. Nevertheless, the experiments allow us to conclude that an appropriate use of Arabic WordNet as a source of linguistic information for AQE can improve the effectiveness of Arabic information retrieval.
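The two-step interaction described in this abstract (pick a part of speech per term, then pick synonyms) can be sketched roughly as follows. This is a minimal illustration, not the authors' Lucene extension: the lexicon is a hypothetical in-memory dictionary standing in for Arabic WordNet, the English entries are placeholders, and the names `TOY_LEXICON`, `candidate_synonyms` and `expand_query` are invented for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon standing in for Arabic WordNet lookups:
# (term, part of speech) -> synonyms. Entries are placeholders only.
TOY_LEXICON: Dict[Tuple[str, str], List[str]] = {
    ("book", "noun"): ["volume", "publication"],
    ("book", "verb"): ["reserve"],
    ("flight", "noun"): ["trip"],
}


def candidate_synonyms(term: str, pos: str) -> List[str]:
    """Return the synonyms offered to the user for a term under one part of speech."""
    return TOY_LEXICON.get((term, pos), [])


def expand_query(
    terms: List[str],
    chosen_pos: Dict[str, str],
    chosen_synonyms: Dict[str, List[str]],
) -> List[str]:
    """Build the expanded query from the user's two selections.

    chosen_pos: the part of speech the user picked for each term (step 1).
    chosen_synonyms: the synonyms the user picked for each term (step 2).
    A synonym is kept only if it belongs to the part of speech chosen in step 1.
    """
    expanded: List[str] = []
    for term in terms:
        expanded.append(term)
        pos = chosen_pos.get(term)
        allowed = candidate_synonyms(term, pos) if pos else []
        for synonym in chosen_synonyms.get(term, []):
            if synonym in allowed and synonym not in expanded:
                expanded.append(synonym)
    return expanded


if __name__ == "__main__":
    query = ["book", "flight"]
    pos_choice = {"book": "verb", "flight": "noun"}
    synonym_choice = {"book": ["reserve", "volume"], "flight": ["trip"]}
    print(expand_query(query, pos_choice, synonym_choice))
```

In this sketch, "volume" is dropped for "book" because it belongs to the noun sense while the user chose the verb sense. Restricting candidates by part of speech in this way is what the sketch assumes IQE uses to avoid adding off-sense synonyms.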

2016 ◽  
pp. 773-783 ◽  
Author(s):  
Ahmed Abbache ◽  
Fatiha Barigou ◽  
Fatma Zohra Belkredim ◽  
Ghalem Belalem

Research and experimentation using Arabic WordNet in the field of information retrieval are relatively new and remain limited compared to the research that has been done using Princeton WordNet. This work studies the impact of Arabic WordNet on the performance of Arabic information retrieval. We extend Lucene with Arabic WordNet to expand users' queries. The major contribution of this study is an interactive query expansion (IQE) methodology that uses each word's part of speech, according to the role it plays in the query. First, the user selects the appropriate part of speech for each term in the original query, and then selects the appropriate synonyms. Experimental results show that our IQE strategy improves Mean Average Precision (MAP) by 12.6%, whereas no variant of the automatic query expansion (AQE) strategies did. Nevertheless, the experiments allow us to conclude that an appropriate use of Arabic WordNet as a source of linguistic information for AQE can improve the effectiveness of Arabic information retrieval.


2018 ◽  
Vol 45 (4) ◽  
pp. 429-442 ◽  
Author(s):  
Abdelkader El Mahdaouy ◽  
Saïd Ouatik El Alaoui ◽  
Eric Gaussier

Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from the top-k pseudo-relevant documents. Although standard PRF models have proven effective at dealing with vocabulary mismatch between users’ queries and relevant documents, expansion terms are selected without considering their similarity to the original query terms. In this article, we propose a method to incorporate word embedding (WE) similarity into PRF models for Arabic information retrieval (IR). The main idea is to select expansion terms using their distribution in the set of top pseudo-relevant documents along with their similarity to the original query terms. Experiments are conducted on the standard Arabic TREC 2001/2002 collection using three neural WE models. The obtained results show that our PRF extensions significantly outperform their baseline PRF models. Moreover, they enhanced the baseline IR model by 22% and 68% for the mean average precision (MAP) and the robustness index (RI), respectively.
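The idea of combining a term's distribution in the feedback documents with its embedding similarity to the query can be illustrated with a small sketch. The scoring rule below (relative frequency multiplied by mean cosine similarity, clipped at zero) is an assumption chosen for simplicity and is not the weighting used in the article; the vectors and the function names `cosine` and `select_expansion_terms` are likewise made up for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_expansion_terms(
    query_terms: List[str],
    feedback_docs: List[List[str]],
    embeddings: Dict[str, Sequence[float]],
    n_terms: int = 2,
) -> List[str]:
    """Rank candidate terms from the top-k feedback documents.

    Assumed score: (share of all feedback tokens) * (mean cosine similarity
    to the query terms, clipped at 0). Query terms themselves and terms
    without an embedding are skipped.
    """
    counts = Counter(token for doc in feedback_docs for token in doc)
    total = sum(counts.values())
    scores: Dict[str, float] = {}
    for term, count in counts.items():
        if term in query_terms or term not in embeddings:
            continue
        sims = [
            cosine(embeddings[term], embeddings[q])
            for q in query_terms
            if q in embeddings
        ]
        if not sims:
            continue
        similarity = max(0.0, sum(sims) / len(sims))
        scores[term] = (count / total) * similarity
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:n_terms]


if __name__ == "__main__":
    toy_embeddings = {
        "bank": [1.0, 0.0],
        "money": [0.9, 0.1],
        "loan": [0.8, 0.2],
        "river": [0.0, 1.0],
    }
    top_docs = [["bank", "money", "river"], ["bank", "loan", "money"]]
    print(select_expansion_terms(["bank"], top_docs, toy_embeddings))
```

With these toy vectors, "river" appears in the feedback documents but is orthogonal to the query term, so it scores zero and is not selected. A purely frequency-based PRF model would not make that distinction.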


Author(s):  
Daniel Crabtree

Web search engines help users find relevant web pages by returning a result set containing the pages that best match the user’s query. When the identified pages have low relevance, the query must be refined to capture the search goal more effectively. However, finding appropriate refinement terms is difficult and time consuming for users, so researchers developed query expansion approaches to identify refinement terms automatically. There are two broad approaches to query expansion, automatic query expansion (AQE) and interactive query expansion (IQE) (Ruthven et al., 2003). AQE has no user involvement, which is simpler for the user, but limits its performance. IQE has user involvement, which is more complex for the user, but means it can tackle more problems such as ambiguous queries. Searches fail by finding too many irrelevant pages (low precision) or by finding too few relevant pages (low recall). AQE has a long history in the field of information retrieval, where the focus has been on improving recall (Velez et al., 1997). Unfortunately, AQE often decreased precision as the terms used to expand a query often changed the query’s meaning (Croft and Harper (1979) identified this effect and named it query drift). The problem is that users typically consider just the first few results (Jansen et al., 2005), which makes precision vital to web search performance. In contrast, IQE has historically balanced precision and recall, leading to an earlier uptake within web search. However, like AQE, the precision of IQE approaches needs improvement. Most recently, approaches have started to improve precision by incorporating semantic knowledge.
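The precision and recall trade-off described above can be made concrete with a short sketch. The document identifiers and result sets are invented for illustration, and `precision_recall` is a name chosen here, not one taken from the cited work.

```python
from typing import Iterable, Set, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one result set.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    relevant_docs = {"d1", "d3", "d7"}
    # Hypothetical results before and after expanding the query.
    original_results = ["d1", "d2"]
    expanded_results = ["d1", "d3", "d4", "d5", "d6", "d8"]
    print("original:", precision_recall(original_results, relevant_docs))
    print("expanded:", precision_recall(expanded_results, relevant_docs))
```

In this made-up case the expanded query finds one more relevant page, so recall rises, but it also pulls in several irrelevant pages, so precision falls. That is the pattern the text attributes to query drift.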


Music is a combination of melody, linguistic information, and the singer’s mental realm. As the popularity of music increases, listeners’ choice of songs varies according to their mental condition. Depending on the musical notes, that condition can range from supreme bliss to a melancholy strain. Most listeners prefer songs that suit their current state of mind. Pragmatic analysis of music by computer is a difficult task, as emotion is very complex and camouflages the real situation. Hence, this paper attempts to classify songs based on musical features, which makes the emotion easier to classify. Music feature extraction is done using the Music Information Retrieval (MIR) toolbox. The dataset consists of 100 Hindi songs, each a 30-second clip, and emotion is classified with the Naïve Bayes method using the Weka API.


Author(s):  
Fabrizio Sebastiani

The categorization of documents into subject-specific categories is a useful enhancement for large document collections addressed by information retrieval systems, as a user can first browse a category tree in search of the category that best matches her interests and then issue a query for more specific documents “from within the category.” This approach combines two modalities in information seeking that are most popular in Web-based search engines, i.e., category-based site browsing (as exemplified by, e.g., Yahoo™) and keyword-based document querying (as exemplified by, e.g., AltaVista™). Appropriate query expansion tools need to be provided, though, in order to allow the user to incrementally refine her query through further retrieval passes, thus allowing the system to produce a series of subsequent document rankings that hopefully converge to the user’s expected ranking. In this work we propose that automatically generated, category-specific “associative” thesauri be used for such purpose. We discuss a method for their generation and discuss how the thesaurus specific to a given category may usefully be endowed with “gateways” to the thesauri specific to its parent and children categories.


2022 ◽  
Vol 12 (1)

In this paper, the authors propose and adapt a new concept-based approach to query expansion in the context of Arabic information retrieval. The purpose is to represent the query by a set of weighted concepts in order to better identify the user's information need. First, concepts are extracted from the initially retrieved documents by the Pseudo-Relevance Feedback method; they are then integrated into a semantic weighted tree in order to detect further information contained in the related concepts that are connected to the primary concepts by semantic relations. The authors use Arabic WordNet as a resource to extract and disambiguate concepts and to build the semantic tree. Experimental results show an improvement of about 10% in MAP (Mean Average Precision), using the open-source Lucene as the IR system on a collection built from Arabic BBC news.
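For readers unfamiliar with the MAP measure reported in this abstract, the following sketch shows the standard computation of average precision per query and its mean over queries. The rankings and scores are made up. The `relative_improvement` helper treats the reported gain as relative to the baseline, which the abstract does not state, so that is an assumption.

```python
from typing import List, Set, Tuple


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average of precision@rank taken at each rank where a relevant document appears."""
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(runs: List[Tuple[List[str], Set[str]]]) -> float:
    """Mean of average precision over (ranking, relevant set) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(ranking, rel) for ranking, rel in runs) / len(runs)


def relative_improvement(baseline: float, improved: float) -> float:
    """Percentage gain of `improved` over a non-zero `baseline` (assumed relative)."""
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    toy_runs = [
        (["d1", "d2", "d3"], {"d1", "d3"}),  # relevant at ranks 1 and 3
        (["d2", "d1"], {"d1"}),              # relevant at rank 2
    ]
    print("MAP:", mean_average_precision(toy_runs))
    # Hypothetical baseline and expanded-query MAP values.
    print("gain (%):", relative_improvement(0.40, 0.44))
```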


2014 ◽  
Author(s):  
Ashraf Mahgoub ◽  
Mohsen Rashwan ◽  
Hazem Raafat ◽  
Mohamed Zahran ◽  
Magda Fayek
