A DISCOURSE-BASED INFORMATION RETRIEVAL FOR TAMIL LITERARY TEXTS

Journal of Information and Communication Technology ◽

10.32890/jict2021.20.3.4 ◽

2021 ◽

Vol 20 (Number 3) ◽

pp. 353-389

Author(s):

Anita Ramalingam ◽

Subalalitha Chinnaudayar Navaneethakrish

Keyword(s):

Information Retrieval ◽

Retrieval System ◽

Semantic Analysis ◽

Literary Text ◽

Information Retrieval System ◽

Discourse Processing ◽

Search Performance ◽

Literary Works ◽

Happy Life ◽

Processing Techniques

Tamil literature contains many valuable thoughts that can help people lead a successful and happy life. Tamil literary works are abundantly available and widely searched on the World Wide Web (WWW), but existing search systems follow a keyword-based matching strategy that fails to satisfy user needs. This calls for a focused Information Retrieval system that semantically analyses Tamil literary text and thereby improves search performance. This paper proposes a novel Information Retrieval framework that uses discourse processing techniques to aid the semantic analysis and representation of Tamil literary text. The proposed framework has been tested using two ancient literary works, the Thirukkural and the Naladiyar, which were written during 300 BCE. The Thirukkural comprises 1330 couplets, each 7 words long, while the Naladiyar consists of 400 quatrains, each 15 words long. The proposed system, tested with all 1330 Thirukkural couplets and 400 Naladiyar quatrains, achieved a mean average precision (MAP) score of 89%. Its performance has been compared with Google Tamil search and with a keyword-based search, which is a substandard version of the proposed framework. Google Tamil search achieved a MAP score of 56% and the keyword-based method achieved a MAP score of 62%, which shows that discourse processing techniques improve the search performance of an Information Retrieval system.
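
The MAP scores reported in this abstract can be read against the standard definition of mean average precision. The following is a minimal sketch of that definition, not the authors' evaluation code; the function names and the couplet and quatrain identifiers are hypothetical.

```python
from typing import Hashable, Iterable, Sequence, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Iterable[Hashable]) -> float:
    """Average precision of one ranked result list (assumed free of duplicates)."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant_set:
            hits += 1
            precision_sum += hits / rank  # precision at the rank of this hit
    return precision_sum / len(relevant_set)


def mean_average_precision(runs: Iterable[Tuple[Sequence[Hashable], Iterable[Hashable]]]) -> float:
    """Mean of the per-query average precision values."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Hypothetical identifiers: each run is (ranked results, relevant items) for one query.
example_runs = [
    (["kural_1", "kural_7", "kural_3"], {"kural_1", "kural_3"}),
    (["nal_2", "nal_9"], {"nal_9"}),
]
map_score = mean_average_precision(example_runs)
```

In this toy example the first query has an average precision of 5/6 and the second 1/2, so `map_score` is 2/3.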

Semantic Search on Unstructured Data

Semantic-Enabled Advancements on the Web ◽

10.4018/978-1-4666-0185-7.ch009 ◽

2012 ◽

pp. 194-213

Author(s):

Alex Kohn ◽

François Bry ◽

Alexander Manta

Keyword(s):

Information Retrieval ◽

Retrieval System ◽

Pharmaceutical Research ◽

Information Retrieval System ◽

Semantic Search ◽

Unstructured Data ◽

Search Performance ◽

Retrieval Performance ◽

Enterprise Search ◽

Existing Data

Studies agree that searchers are often not satisfied with the performance of current enterprise search engines. As a consequence, more scientists worldwide are actively investigating new avenues for searching to improve retrieval performance. This paper presents YASA (Your Adaptive Search Agent), a fully implemented and thoroughly evaluated ontology-based information retrieval system for the enterprise. A salient feature of YASA is that large parts of the ontology are automatically filled with facts by recycling and transforming existing data. YASA offers context-based personalization, faceted navigation, and semantic search capabilities. YASA has been deployed and evaluated in the pharmaceutical research department of Roche, Penzberg, and the results show that even semantically simple ontologies suffice to considerably improve search performance.

Method of Lexical Enrichment in Information Retrieval System in Arabic

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2013100103 ◽

2013 ◽

Vol 3 (4) ◽

pp. 35-51 ◽

Cited By ~ 3

Author(s):

Souheyl Mallat ◽

Anis Zouaghi ◽

Emna Hkiri ◽

Mounir Zrigui

Keyword(s):

Information Retrieval ◽

Retrieval System ◽

Semantic Analysis ◽

Contextual Information ◽

Information Retrieval System ◽

Enrichment Method ◽

Weighting Functions ◽

Retrieval Systems ◽

Significant Term ◽

Information Retrieval Systems

In this paper, the authors propose a method for lexical enrichment of Arabic queries in order to improve the performance of information retrieval systems (SRI). The method applies two types of enrichment: linguistic and contextual. The first is based on linguistic analysis (lemmatization and morphological, syntactic, and semantic analysis), whose goal is to generate a descriptive list (list-desc). This list contains the linguistic lexicon assigned to each significant term in the query. The second enrichment integrates contextual information derived from the corpus documents. It is based on statistical analysis using Salton's weighting functions, TF-IDF and TF-IEF. The TF-IDF function is applied to the list-desc and the documents in the corpus in order to identify relevant documents. The TF-IEF function is applied between the list-desc and the sentences belonging to the relevant documents in order to identify relevant sentences. The terms in these sentences are then weighted, and those with the highest weights, considered rich in informative and contextual importance, are added to the original query. The authors' lexical enrichment method was evaluated on a corpus of documents belonging to a specialized domain, and the results show its benefit in terms of precision and recall.
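
The document-level TF-IDF step described in this abstract can be illustrated with a minimal sketch. It assumes the textbook weighting (term frequency times the logarithm of inverse document frequency); the exact variant used by the authors is not stated here, and the function name and the placeholder English tokens are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def tf_idf_scores(query_terms: Sequence[str], documents: Sequence[Sequence[str]]) -> List[float]:
    """Score each tokenized document by the summed TF-IDF weight of the query terms."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        score = 0.0
        for term in set(query_terms):
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(doc)
            idf = math.log(n_docs / doc_freq[term])
            score += tf * idf
        scores.append(score)
    return scores


# Placeholder tokens standing in for the enriched query list and the corpus.
corpus = [["drug", "dose", "drug"], ["law", "court"], ["drug", "law"]]
scores = tf_idf_scores(["drug", "dose"], corpus)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
```

Calling the same function with sentences in place of documents is one possible reading of the sentence-level step, but that is an assumption rather than something the abstract specifies.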

CONCEPT OF INFORMATION-SEARCH SYSTEM (ON THE BASIS OF ONTOLOGIES) FOR THE DOMAIN OF MEDICAL LAW

Computer Systems and Information Technologies ◽

10.31891/csit-2020-2-1 ◽

2020 ◽

Vol 2 (2) ◽

pp. 6-9

Author(s):

T. HOVORUSHCHENKO ◽

Y. HNATCHUK ◽

O. SAVCHUK ◽

...

Keyword(s):

Information Retrieval ◽

Information Search ◽

Retrieval System ◽

Semantic Analysis ◽

Information Retrieval System ◽

Information Resources ◽

Semantic Search ◽

Medical Law ◽

Retrieval Systems ◽

Information Retrieval Systems

The search for information is one of the main components of human activity. An ideal information retrieval system should return only documents that are relevant to the request. Today, real information retrieval systems provide a completeness factor of 70%, while search accuracy is sometimes as low as 10%. Thus, well-known information retrieval systems are currently unable to meet the needs of modern users. The global trend in the processing of large arrays of information, which makes it possible to solve new classes of problems based on available information resources, is the intellectualization of information and data processing. As a knowledge engineering standard in the development of information retrieval systems, it is worthwhile to use ontologies, which are widely used in search engines and information retrieval systems and are an effective tool for organizing semantic search. The use of ontologies as part of information retrieval systems helps to solve a number of methodological and technological problems that arise during the development of such systems. An important and pressing task is to develop an effective information retrieval system for the field of medical law. The purpose of this study is to develop the concept of an effective information retrieval system (based on ontologies) for the field of medical law.
The paper proposes the concept of an information retrieval system (based on ontologies) for the field of medical law, which consists of: an internal ontology of semantic search, which will contain knowledge about the basic elements of the search process; a taxonomy of the information objects that the user is looking for (this taxonomy will integrate existing ontologies of multimedia information resources, Web services, and organizational structures); ontologies of the subject area, which will be used for the accumulation of knowledge as well as for the construction of thesauri, dictionaries, and taxonomies; and linguistic ontologies designed for the semantic analysis of natural-language information resources.
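
The completeness and accuracy figures quoted in this abstract can be made concrete with a small sketch. It assumes that "completeness" and "search accuracy" correspond to the standard recall and precision measures, which the abstract does not state explicitly; the function name and the document identifiers are hypothetical.

```python
from typing import Hashable, Iterable, Tuple


def precision_recall(retrieved: Iterable[Hashable], relevant: Iterable[Hashable]) -> Tuple[float, float]:
    """Return (precision, recall) for one query's retrieved and relevant sets."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Precision: share of retrieved documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Toy query: 10 relevant documents exist; the system returns 70 documents,
# of which only 7 are relevant.
relevant_docs = [f"rel_{i}" for i in range(10)]
retrieved_docs = relevant_docs[:7] + [f"noise_{i}" for i in range(63)]
precision, recall = precision_recall(retrieved_docs, relevant_docs)
```

Under this reading, the toy query reproduces the situation the abstract describes: most relevant documents are found (recall of 0.7), but they are buried among irrelevant results (precision of 0.1).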
