RETRIEVAL PERFORMANCE OF ARABIC LIGHT STEMMERS

2019 ◽  
Vol 2 (10) ◽  
pp. 81-90
Author(s):  
Ouahiba Saoudi ◽  
Roslina Othman

Although stemming greatly improves Arabic information retrieval performance, no standard stemmer has emerged in the field of Arabic IR because of limitations and shortcomings in existing approaches. Among the recurring problems is that a stemmer can reduce unrelated words to the same stem and can fail to reduce related words to a common stem. Many studies have proposed Arabic stemming algorithms to address these problems. This paper reviews the retrieval performance of Arabic light stemmers in terms of the main objectives achieved, the causes of retrieval success and failure, the retrieval measures used, the affixes handled, and the methodologies. The results show that Light10 has better retrieval performance than the other reviewed Arabic light stemmers.
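To make the idea of light stemming concrete, the following is a minimal sketch of affix stripping in Python. It is not the Light10 implementation reviewed in the paper: the affix lists, the `min_len` threshold, and the function name `light_stem` are illustrative assumptions, and the character normalization that real stemmers apply is omitted.

```python
# Simplified light-stemming sketch. The affix lists below are illustrative
# assumptions, not lists taken from the abstract above.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"]


def light_stem(word, min_len=2):
    """Strip at most one prefix, then suffixes, keeping at least min_len characters."""
    # Try longer prefixes first so that "وال" wins over "و".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break

    # Strip suffixes repeatedly until none applies.
    changed = True
    while changed:
        changed = False
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
                word = word[: -len(suffix)]
                changed = True
                break
    return word


if __name__ == "__main__":
    # Two inflected forms of the same word should share a stem.
    for w in ["الكتاب", "والمعلمون", "المعلمين"]:
        print(w, "->", light_stem(w))
```

Because the rules are purely orthographic, a sketch like this can also conflate unrelated words that happen to share surface affixes, which is the failure mode the abstract describes.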

2019 ◽  
Vol 13 (6) ◽  
pp. 24
Author(s):  
Mustafa Abdel-Kareem Ababneh ◽  
Ghassan Kanaan ◽  
Ayat Amin Al-Jarrah

Slang has become the most widely used form of language in most countries. It has almost become the first language of social media, websites, and daily conversation, and it is even used at many conferences to clarify information and get the intended point across. This wide spread of slang around the world, and in Jordan in particular, makes it important to know the meanings of Jordanian slang vocabulary. In this research, we created a system framework that allows users to retrieve Arabic information using queries written in slang. The framework is built on a context-free grammar that converts slang to classical Arabic and vice versa. We apply it to the colloquial slang of northern Jordan, specifically Irbid, Ajloun, Jerash, Mafraq, and AlRamtha, and we maintain separate files for non-Arabic words and for stop words. We evaluated the system using recall, precision, and F-measure; precision was about 0.63 for both slang and classical queries, which indicates that the system supports searching in Jordanian slang. The purpose of this research is to enhance Arabic information retrieval, and it should be a significant resource for researchers interested in slang languages. It also helps tie communities together.
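For readers unfamiliar with the evaluation measures named in the abstract, here is a short sketch of set-based precision, recall, and F-measure in Python. The function name and the document identifiers are hypothetical, and the balanced F1 form is an assumption, since the abstract does not say which F-measure weighting was used.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and balanced F-measure (F1) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical document ids, for illustration only.
    p, r, f = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```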


Patents are critical intellectual assets for any competitive business. With ever-increasing patent filings, effective patent prior art search has become an important task in patent retrieval, which is a subfield of information retrieval (IR). The goal of prior art search is to find and rank documents related to a query patent. Query formulation is a key step in prior art search, in which patent structure is exploited to generate queries from the various fields available in the patent text. As a patent encodes multiple technical domains, this work argues that technical domains and patent structure have a combined effect on the effectiveness of patent retrieval. The study uses International Patent Classification (IPC) codes to categorize query patents into eight technical domains and explores eighteen different combinations of patent fields to generate search queries. A total of 144 extensive retrieval experiments have been carried out using the BM25 ranking algorithm. Retrieval performance is evaluated in terms of the recall score of the top 1000 records. Empirical results support our assumption. A two-way analysis of variance is also conducted to validate the hypotheses. The findings of this work may be helpful for patent information retrieval professionals in developing domain-specific patent retrieval systems that exploit the patent structure.
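The ranking and evaluation steps mentioned above can be sketched as follows. This is a generic Okapi BM25 scorer and a recall-at-k helper, not the authors' experimental setup: the parameter values `k1=1.2` and `b=0.75`, the smoothed IDF variant, the function names, and the toy documents are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def recall_at_k(ranked_ids, relevant_ids, k=1000):
    """Fraction of relevant documents found in the top k of a ranking."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


if __name__ == "__main__":
    # Toy corpus, for illustration only.
    docs = [["battery", "electrode", "cell"], ["engine", "valve"], ["battery", "charger"]]
    print(bm25_scores(["battery"], docs))
    print(recall_at_k(["a", "b", "c"], ["a", "c", "z"], k=2))
```

In a setup like the one described, a query would be built from one combination of patent fields, scored against the collection, and evaluated with recall over the top 1000 results.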


1988 ◽  
Vol 32 (5) ◽  
pp. 301-305
Author(s):  
Robert D. Peters ◽  
Gloria T. Yastrop ◽  
Deborah A. Boehm-Davis

This research examined the effects of two different cognitive individual differences (perceptual speed and spatial scanning) on information retrieval performance under two matched and two mismatched database format/query conditions. A graphic and a tabular form of an airline database were constructed, along with questions that required users to search through the database to determine the correct response. Two types of questions were designed: graphic and tabular. The data indicate that users are faster when the format of the information in the database matches the type of information needed to answer the question, and that cognitive individual differences are differentially predictive of performance in the matched and mismatched conditions. Recommendations for database design are presented.

