Semantic Search on Unstructured Data

Author(s):  
Alex Kohn ◽  
François Bry ◽  
Alexander Manta

Studies agree that searchers are often not satisfied with the performance of current enterprise search engines. As a consequence, more scientists worldwide are actively investigating new avenues for searching to improve retrieval performance. This paper contributes to YASA (Your Adaptive Search Agent), a fully implemented and thoroughly evaluated ontology-based information retrieval system for the enterprise. A salient particularity of YASA is that large parts of the ontology are automatically filled with facts by recycling and transforming existing data. YASA offers context-based personalization, faceted navigation, as well as semantic search capabilities. YASA has been deployed and evaluated in the pharmaceutical research department of Roche, Penzberg, and results show that already semantically simple ontologies suffice to considerably improve search performance.


2017 ◽  
Vol 7 (3) ◽  
pp. 38-61 ◽  
Author(s):  
Ameni Yengui ◽  
Mahmoud Neji

In this article, the authors introduce their OSSVIRI information retrieval system, which is composed of three modules. In the analysis module, they propose a statistical technique that exploits word frequency to extract simple, compound and specific terms from the documents. In the indexing module, the authors use an ontology to associate the terms with their concepts, retrieve the relations between them, and disambiguate the concepts in order to improve the semantic content of the documents. The concepts and relations are represented as a conceptual graph. In the search module, the authors propose a query reformulation technique based on external resources and user profiles, together with a matching technique based on the combined expansion of queries and documents, guided by the context of the information need and by the document contents. The system is validated using standard information retrieval metrics and comparisons with an existing statistical approach. The authors show that their approach achieves good results.
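
The query reformulation step described above can be illustrated with a minimal sketch. It is not the OSSVIRI implementation: the toy ontology (`TERM_TO_CONCEPT`, `RELATED`), the function `expand_query`, and the rule that related concepts are added only when they appear in the user's profile are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical toy ontology: surface terms mapped to concepts,
# plus concept-to-concept relations. Not taken from the paper.
TERM_TO_CONCEPT: Dict[str, str] = {
    "car": "Vehicle",
    "automobile": "Vehicle",
    "engine": "Engine",
    "motor": "Engine",
    "road": "Road",
}
RELATED: Dict[str, Set[str]] = {"Vehicle": {"Engine", "Road"}}


def terms_for(concept: str) -> Set[str]:
    """All surface terms that the toy ontology attaches to a concept."""
    return {t for t, c in TERM_TO_CONCEPT.items() if c == concept}


def expand_query(terms: Iterable[str], profile_concepts: Set[str]) -> List[str]:
    """Expand a query with terms sharing a concept, then with terms of
    related concepts that also occur in the user's profile (assumed rule)."""
    expanded: Set[str] = set()
    for term in terms:
        term = term.lower()
        expanded.add(term)
        concept = TERM_TO_CONCEPT.get(term)
        if concept is None:
            continue  # unknown term: keep it, but nothing to expand
        expanded |= terms_for(concept)
        for related in RELATED.get(concept, set()):
            if related in profile_concepts:
                expanded |= terms_for(related)
    return sorted(expanded)


with_profile = expand_query(["car"], profile_concepts={"Engine"})
without_profile = expand_query(["car"], profile_concepts=set())
```

With the profile `{"Engine"}`, the query `car` gains `automobile`, `engine` and `motor`; with an empty profile it gains only `automobile`. Filtering related concepts through the profile is one simple way to keep the expansion tied to the user's context.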


2013 ◽  
Vol 278-280 ◽  
pp. 2069-2072
Author(s):  
Jin Xing Shen

To achieve semantic retrieval of scientific research information on the WWW, this paper applies an ontology-based framework to the information retrieval system of a management information system. After analyzing the limitations of traditional methods, it puts forward a semantic search approach and introduces the idea of semantic retrieval, the way ontology entities are constructed, and the language used to describe them. An ontology-based semantic retrieval system is also presented. Its application to the retrieval of project information shows that the framework can overcome the limitations of other ontology models, and that this research facilitates the semantic retrieval of management information through semantic retrieval concepts on the Semantic Web.


2021 ◽  
Vol 20 (Number 3) ◽  
pp. 353-389
Author(s):  
Anita Ramalingam ◽  
Subalalitha Chinnaudayar Navaneethakrish

Tamil literature contains many valuable thoughts that can help the human community lead a successful and happy life. Tamil literary works are abundantly available and frequently searched on the World Wide Web (WWW), but existing search systems follow a keyword-based matching strategy that fails to satisfy user needs. This calls for a focused Information Retrieval System that semantically analyses Tamil literary text and thereby improves search performance. This paper proposes a novel Information Retrieval framework that uses discourse processing techniques to aid the semantic analysis and representation of Tamil literary text. The proposed framework has been tested using two ancient literary works, the Thirukkural and the Naladiyar, which were written during 300 BCE. The Thirukkural comprises 1330 couplets, each 7 words long, while the Naladiyar consists of 400 quatrains, each 15 words long. The proposed system, tested with all 1330 Thirukkural couplets and 400 Naladiyar quatrains, achieved a mean average precision (MAP) score of 89%. The performance of the proposed framework has been compared with Google Tamil search and with a keyword-based search that is a reduced version of the proposed framework. Google Tamil search achieved a MAP score of 56% and the keyword-based method achieved a MAP score of 62%, which shows that discourse processing techniques improve the search performance of an Information Retrieval system.
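
For readers unfamiliar with the MAP scores quoted above, the following is a minimal sketch of how mean average precision is commonly computed. It uses the standard textbook definition, not the authors' evaluation code, and the document identifiers and relevance judgments are made up for illustration.

```python
from typing import List, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision for one query: the mean of precision@k taken at
    every rank k where a relevant document appears, divided by the total
    number of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """MAP: the arithmetic mean of average precision over all queries."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical rankings and relevance judgments for two queries.
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6", "d7"], {"d6"}),              # AP = 1/2
]
map_score = mean_average_precision(runs)
```

For these two made-up queries the average precisions are 5/6 and 1/2, so the MAP is 2/3. A score of 89% therefore means that, averaged over queries, relevant results appear very near the top of the ranking.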


2016 ◽  
Author(s):  
Jan Werrmann

Obtaining the right information at the right time is one of the main challenges for modern societies. This holds especially for companies that must handle complex business processes requiring case-dependent information. Unfortunately, relevant case-dependent information is often spread across different document systems. Users must interact with various applications and search for semantically related (and helpful) documents with little or no support from the disparate retrieval systems. In this work, a system called Advanced ontology-based Information Retrieval System (AIRS) is introduced that includes methods of state-of-the-art enterprise search technology and combines them with an ontology called AIRS Knowledge Base (AIRSKB). AIRS is deeply integrated with advanced information retrieval technologies to make search processes in large heterogeneous document landscapes more effective and increase the quality of search results. ...


2020 ◽  
Vol 2 (2) ◽  
pp. 6-9
Author(s):  
T. HOVORUSHCHENKO ◽  
Y. HNATCHUK ◽  
O. SAVCHUK ◽  
...  

The search for information is one of the main components of human activity. An ideal information retrieval system should return only documents that are relevant to the request. Today, real information retrieval systems provide a completeness factor of 70%, while their search accuracy factor is sometimes as low as 10%. Thus, well-known information retrieval systems are currently unable to meet the needs of modern users. The global trend in the processing of large volumes of information, which makes it possible to solve new classes of problems using available information resources, is the intellectualization of information and data processing. As a knowledge engineering standard in the development of information retrieval systems, it is worthwhile to use ontologies, which are widely used in search engines and information retrieval systems and are an effective tool for organizing semantic search. The use of ontologies as part of information retrieval systems helps to solve a number of methodological and technological problems that arise during the development of such systems. An important and pressing task is to develop an effective information retrieval system for the field of medical law. The purpose of this study is to develop the concept of an effective ontology-based information retrieval system for the field of medical law.
The paper proposes the concept of an ontology-based information retrieval system for the field of medical law, which consists of: an internal ontology of semantic search, which will contain knowledge about the basic elements of the search process; taxonomies of the information objects that the user is looking for (these taxonomies will integrate existing ontologies of multimedia information resources, Web services, and organizational structures); ontologies of the subject area, which will be used for the accumulation of knowledge as well as for the construction of thesauri, dictionaries and taxonomies; and linguistic ontologies designed for the semantic analysis of natural information resources.
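
The completeness and accuracy factors mentioned in this abstract can be illustrated with a small sketch, under the assumption that they correspond to the usual recall and precision measures (the abstract does not define them). The document identifiers and counts below are hypothetical and chosen only to reproduce the 70% and 10% figures.

```python
from typing import Iterable, Set, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    precision = relevant retrieved / all retrieved  (assumed "accuracy")
    recall    = relevant retrieved / all relevant   (assumed "completeness")
    """
    retrieved_set = set(retrieved)
    true_positives = len(retrieved_set & relevant)
    precision = true_positives / len(retrieved_set) if retrieved_set else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 10 relevant documents exist; the system returns
# 70 documents, 7 of which are relevant.
relevant_docs = {f"rel{i}" for i in range(10)}
returned_docs = [f"rel{i}" for i in range(7)] + [f"noise{i}" for i in range(63)]
precision, recall = precision_recall(returned_docs, relevant_docs)
```

In this made-up case the system finds 7 of the 10 relevant documents (recall 0.7) but buries them among 63 irrelevant ones (precision 0.1), which is the kind of imbalance the abstract describes.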

