The Use of Arabic WordNet in Arabic Information Retrieval

Research and experimentation using Arabic WordNet in the field of information retrieval are relatively new and limited compared with the research that has been done using Princeton WordNet. This work studies the impact of Arabic WordNet on the performance of Arabic information retrieval. We extend Lucene with Arabic WordNet to expand users' queries. The major contribution of this study is an interactive query expansion (IQE) methodology that uses each word's part of speech, according to the role it plays in the query. First, the user selects the appropriate part of speech for each term in the original query, and then selects the appropriate synonyms. Experimental results show that our IQE strategy achieves a good Mean Average Precision (MAP), improving it by 12.6%, whereas no variant of the automatic query expansion (AQE) strategies improved it. Nevertheless, the experiments allow us to conclude that an appropriate use of Arabic WordNet as a source of linguistic information for AQE can improve the effectiveness of Arabic information retrieval.
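
The two-step interaction described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `TOY_WORDNET` dictionary merely stands in for Arabic WordNet, its English entries are invented, and `candidate_synonyms` and `expand_query` are hypothetical helper names. The resulting string uses the grouping and `OR` operators of Lucene's classic query syntax.

```python
# Toy stand-in for Arabic WordNet: (word, part of speech) -> synonyms.
# All entries are invented for illustration only.
TOY_WORDNET = {
    ("book", "noun"): ["volume", "publication"],
    ("book", "verb"): ["reserve"],
    ("flight", "noun"): ["trip"],
}


def candidate_synonyms(term, pos):
    """Step 1: once the user has picked a part of speech, list synonyms for it."""
    return TOY_WORDNET.get((term, pos), [])


def expand_query(selections):
    """Step 2: build the expanded query from the synonyms the user kept.

    `selections` is a list of (term, chosen_synonyms) pairs in query order.
    Each term becomes an OR-group of itself plus its chosen synonyms.
    """
    groups = []
    for term, chosen in selections:
        alternatives = [term] + list(chosen)
        if len(alternatives) == 1:
            groups.append(term)
        else:
            groups.append("(" + " OR ".join(alternatives) + ")")
    return " ".join(groups)


# Simulated session for the query "book flight": the user marks "book" as a
# verb, keeps the offered synonym, and adds nothing for "flight".
offered = candidate_synonyms("book", "verb")
query = expand_query([("book", offered), ("flight", [])])
print(query)  # (book OR reserve) flight
```

Choosing the part of speech first keeps the noun senses ("volume", "publication") out of the candidate list, which is the kind of disambiguation a fully automatic expansion cannot rely on.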

Word-embedding-based pseudo-relevance feedback for Arabic information retrieval

Journal of Information Science ◽

10.1177/0165551518792210 ◽

2018 ◽

Vol 45 (4) ◽

pp. 429-442 ◽

Cited By ~ 5

Author(s):

Abdelkader El Mahdaouy ◽

Saïd Ouatik El Alaoui ◽

Eric Gaussier

Keyword(s):

Information Retrieval ◽

Relevance Feedback ◽

Query Expansion ◽

Main Idea ◽

Word Embedding ◽

Average Precision ◽

Standard Arabic ◽

Arabic Information Retrieval ◽

The Mean ◽

Pseudo Relevance Feedback

Pseudo-relevance feedback (PRF) is a very effective query expansion approach, which reformulates queries by selecting expansion terms from the top-k pseudo-relevant documents. Although standard PRF models have proven effective in dealing with vocabulary mismatch between users’ queries and relevant documents, expansion terms are selected without considering their similarity to the original query terms. In this article, we propose a method to incorporate word embedding (WE) similarity into PRF models for Arabic information retrieval (IR). The main idea is to select expansion terms using their distribution in the set of top pseudo-relevant documents along with their similarity to the original query terms. Experiments are conducted on the standard Arabic TREC 2001/2002 collection using three neural WE models. The obtained results show that our PRF extensions significantly outperform their baseline PRF models. Moreover, they enhance the baseline IR model by 22% and 68% for the mean average precision (MAP) and the robustness index (RI), respectively.
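
The main idea, combining a term's distribution in the pseudo-relevant set with its embedding similarity to the query, can be illustrated with a small sketch. The scoring rule below (relative frequency multiplied by mean cosine similarity) is an assumption chosen for simplicity and is not the formula from the article; the two-dimensional vectors, the English tokens, and the name `select_expansion_terms` are likewise hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_expansion_terms(query_terms, top_docs, embeddings, n_terms=2):
    """Rank candidate terms drawn from the top-k pseudo-relevant documents.

    score = relative frequency in the pseudo-relevant set
            * mean cosine similarity to the query terms (clipped at 0).
    This product is an illustrative choice, not the article's model.
    """
    counts = Counter(term for doc in top_docs for term in doc)
    total = sum(counts.values())
    scores = {}
    for term, count in counts.items():
        if term in query_terms or term not in embeddings:
            continue
        sims = [cosine(embeddings[term], embeddings[q])
                for q in query_terms if q in embeddings]
        if not sims:
            continue
        similarity = max(0.0, sum(sims) / len(sims))
        scores[term] = (count / total) * similarity
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:n_terms]


# Toy data: "banana" is the most frequent term in the top documents,
# but its vector is orthogonal to the query term "car".
embeddings = {
    "car": [1.0, 0.0],
    "vehicle": [0.9, 0.1],
    "engine": [0.8, 0.3],
    "banana": [0.0, 1.0],
}
top_docs = [
    ["car", "vehicle", "banana"],
    ["vehicle", "engine", "banana"],
    ["car", "banana", "engine"],
]
print(select_expansion_terms(["car"], top_docs, embeddings))
```

A purely frequency-based selection would pick "banana" first here; weighting by similarity to the query pushes it below "vehicle" and "engine".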

A hybrid semantic query expansion approach for Arabic information retrieval

Journal Of Big Data ◽

10.1186/s40537-020-00310-z ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Hiba ALMarwi ◽

Mossa Ghurab ◽

Ibrahim Al-Baltah

Keyword(s):

Information Retrieval ◽

Query Expansion ◽

Semantic Query ◽

Arabic Information Retrieval

Enhancing Web Search through Query Expansion

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch116 ◽

2011 ◽

pp. 752-757 ◽

Cited By ~ 2

Author(s):

Daniel Crabtree

Keyword(s):

Information Retrieval ◽

Search Engines ◽

Query Expansion ◽

Web Search ◽

User Involvement ◽

Semantic Knowledge ◽

Web Pages ◽

Search Performance ◽

Interactive Query ◽

Web Search Engines

Web search engines help users find relevant web pages by returning a result set containing the pages that best match the user’s query. When the identified pages have low relevance, the query must be refined to capture the search goal more effectively. However, finding appropriate refinement terms is difficult and time consuming for users, so researchers developed query expansion approaches to identify refinement terms automatically. There are two broad approaches to query expansion, automatic query expansion (AQE) and interactive query expansion (IQE) (Ruthven et al., 2003). AQE has no user involvement, which is simpler for the user, but limits its performance. IQE has user involvement, which is more complex for the user, but means it can tackle more problems such as ambiguous queries. Searches fail by finding too many irrelevant pages (low precision) or by finding too few relevant pages (low recall). AQE has a long history in the field of information retrieval, where the focus has been on improving recall (Velez et al., 1997). Unfortunately, AQE often decreased precision as the terms used to expand a query often changed the query’s meaning (Croft and Harper (1979) identified this effect and named it query drift). The problem is that users typically consider just the first few results (Jansen et al., 2005), which makes precision vital to web search performance. In contrast, IQE has historically balanced precision and recall, leading to an earlier uptake within web search. However, like AQE, the precision of IQE approaches needs improvement. Most recently, approaches have started to improve precision by incorporating semantic knowledge.
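
The two failure modes named in this abstract, low precision and low recall, can be made concrete with a short sketch. The document identifiers and the helper name `precision_recall` are hypothetical and exist only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


relevant = {"d1", "d2", "d3", "d4"}

# A narrow query: everything returned is relevant, but half is missed.
narrow = ["d1", "d2"]
# An expanded query: more relevant pages found, along with irrelevant ones.
broad = ["d1", "d2", "d3", "d5", "d6", "d7"]

print(precision_recall(narrow, relevant))  # (1.0, 0.5)
print(precision_recall(broad, relevant))   # (0.5, 0.75)
```

In this toy case expansion raises recall from 0.5 to 0.75 while precision falls from 1.0 to 0.5, which is the trade-off the abstract attributes to expansion terms that shift a query's meaning.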

The Impact of Online Indexing in Improving Arabic Information Retrieval Systems

Informatica ◽

10.31449/inf.v42i4.2297 ◽

2018 ◽

Vol 42 (4) ◽

Author(s):

Tahar Dilekh ◽

Saber Benharzallah ◽

Ali Behloul

Keyword(s):

Information Retrieval ◽

Retrieval Systems ◽

Arabic Information Retrieval ◽

Information Retrieval Systems ◽

The Impact

An Empirical Prediction Methodology for the Emotional Behaviors with the Impact of Musical Features

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f3336.049620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 646-649

Keyword(s):

Feature Extraction ◽

Information Retrieval ◽

Real Situation ◽

Linguistic Information ◽

Empirical Prediction ◽

Current State ◽

Pragmatic Analysis ◽

State Of Mind ◽

Music Information ◽

The Impact

Music is a combination of melody, linguistic information and the singer’s mental realm. As the popularity of music increases, listeners' choice of songs also varies according to their mental condition. Depending on the musical notes, that condition can range from supreme bliss to melancholy. Most listeners prefer songs that suit their current state of mind. Pragmatic analysis of music by computer is a difficult task, as emotion is very complex and camouflages the real situation. Hence, this paper classifies songs based on musical features, which makes the emotion easier to classify. Music feature extraction is done using the Music Information Retrieval (MIR) toolbox. The dataset consists of 100 Hindi songs, each a 30-second clip, and emotions are then classified with the Naïve Bayes method using the Weka API.
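
The classification step can be sketched in a few lines. The paper uses the Weka API; the code below is instead a from-scratch Gaussian Naïve Bayes in Python, a swapped-in illustration rather than the authors' pipeline. The abstract does not say which Naïve Bayes variant or which features were used, so the two numeric features (a tempo value and a brightness value), the data, and the labels are all invented.

```python
import math
from collections import defaultdict


def train_gaussian_nb(samples, labels):
    """Estimate a log prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for features, label in zip(samples, labels):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (math.log(len(rows) / len(samples)), stats)
    return model


def predict(model, features):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, stats) in model.items():
        score = log_prior
        for value, (mean, var) in zip(features, stats):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training data: [tempo, brightness] per clip.
samples = [[140, 0.8], [150, 0.9], [135, 0.7], [70, 0.2], [60, 0.3], [75, 0.25]]
labels = ["happy", "happy", "happy", "sad", "sad", "sad"]

model = train_gaussian_nb(samples, labels)
print(predict(model, [145, 0.85]))  # happy
print(predict(model, [65, 0.2]))    # sad
```

The "naïve" assumption is visible in `predict`: each feature contributes an independent log-likelihood term, so the per-class score is a simple sum.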

Interactive Query Expansion with Automatically Generated Category-Specific Thesauri

Text Databases and Document Management ◽

10.4018/978-1-878289-93-3.ch005 ◽

2011 ◽

pp. 103-117

Author(s):

Fabrizio Sebastiani

Keyword(s):

Information Retrieval ◽

Search Engines ◽

Information Seeking ◽

Query Expansion ◽

Document Collections ◽

Web Based ◽

Interactive Query ◽

Retrieval Systems ◽

Subject Specific ◽

Information Retrieval Systems

The categorization of documents into subject-specific categories is a useful enhancement for large document collections addressed by information retrieval systems, as a user can first browse a category tree in search of the category that best matches her interests and then issue a query for more specific documents “from within the category.” This approach combines two modalities in information seeking that are most popular in Web-based search engines, i.e., category-based site browsing (as exemplified by, e.g., Yahoo™) and keyword-based document querying (as exemplified by, e.g., AltaVista™). Appropriate query expansion tools need to be provided, though, in order to allow the user to incrementally refine her query through further retrieval passes, thus allowing the system to produce a series of subsequent document rankings that hopefully converge to the user’s expected ranking. In this work we propose that automatically generated, category-specific “associative” thesauri be used for such purpose. We discuss a method for their generation and discuss how the thesaurus specific to a given category may usefully be endowed with “gateways” to the thesauri specific to its parent and children categories.

Hybrid Query Expansion Model Based on Pseudo Relevance Feedback and Semantic Tree for Arabic IR

International Journal of Information Retrieval Research ◽

10.4018/ijirr.289949 ◽

2022 ◽

Vol 12 (1) ◽

pp. 0-0

Keyword(s):

Relevance Feedback ◽

Query Expansion ◽

Semantic Relations ◽

Information Need ◽

Weighted Tree ◽

Average Precision ◽

Arabic Information Retrieval ◽

Semantic Tree ◽

Expansion Model ◽

Pseudo Relevance Feedback

In this paper, the authors propose and readapt a new concept-based approach to query expansion in the context of Arabic information retrieval. The purpose is to represent the query by a set of weighted concepts in order to better identify the user's information need. First, concepts are extracted from the initially retrieved documents by the Pseudo-Relevance Feedback method; they are then integrated into a semantic weighted tree in order to detect more information contained in the related concepts, which are connected to the primary concepts by semantic relations. The authors use the “Arabic WordNet” as a resource to extract and disambiguate concepts and to build the semantic tree. Experimental results show an improvement of about 10% in Mean Average Precision (MAP), using the open-source Lucene as the IR system on a collection built from Arabic BBC news.
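
For readers unfamiliar with the evaluation measure, Mean Average Precision can be computed as in the sketch below. It follows the standard definition (precision averaged over the ranks at which relevant documents appear, then averaged over queries); the rankings and document identifiers are invented and do not reproduce any figure from the paper.

```python
def average_precision(ranking, relevant):
    """Average of precision@rank taken at each rank holding a relevant document.

    Relevant documents that are never retrieved contribute zero.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranking, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two toy queries, before and after a hypothetical expansion step.
baseline = [
    (["d1", "x", "d2", "y"], {"d1", "d2"}),
    (["x", "d3"], {"d3"}),
]
expanded = [
    (["d1", "d2", "x", "y"], {"d1", "d2"}),
    (["d3", "x"], {"d3"}),
]

print(mean_average_precision(baseline))
print(mean_average_precision(expanded))
```

Because each relevant document is weighted by the precision at its rank, moving relevant documents toward the top of the list raises MAP even when the set of retrieved documents is unchanged.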
