Implementation of Weighted Tree Similarity and Cosine Sorensen-Dice Algorithms for Semantic Search in Document Repository Information System

The more documents we manage, the harder searching becomes, and the more important information retrieval is. An information retrieval system helps find documents that match the user's keywords by similarity. Document searches usually consider only the name of the document (file) the user is looking for, without paying attention to the document's content or metadata, and therefore cannot meet the user's information needs. Document search has several approaches, including full-text search, plain metadata search, and semantic search. This study combines the Weighted Tree Similarity algorithm with the Cosine Sorensen-Dice algorithm to calculate similarity for semantic search. In this study, document metadata is represented as a tree with labeled nodes, labeled branches, and weighted branches. Similarity on the subtree edge labels is calculated with Cosine Sorensen-Dice, while the total similarity of a document is calculated with Weighted Tree Similarity. The document metadata structure uses the taxonomy owner, description, title, disposition content, and type. The result of this research is a document search application with taxonomic weights on file storage.
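The approach described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how cosine and Sorensen-Dice are combined, so the sketch simply averages the two; the metadata tree is flattened to one level; arc weights are averaged between the two trees in the style of weighted tree similarity; and all function names, field weights, and sample values are hypothetical.

```python
from collections import Counter
from math import sqrt


def tokens(text):
    """Lowercase whitespace tokenization (a simplifying assumption)."""
    return text.lower().split()


def dice(a, b):
    """Sorensen-Dice coefficient on the token sets of two strings."""
    sa, sb = set(tokens(a)), set(tokens(b))
    denom = len(sa) + len(sb)
    return 2 * len(sa & sb) / denom if denom else 0.0


def cosine(a, b):
    """Cosine similarity on term-frequency vectors of two strings."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def leaf_similarity(a, b):
    """Assumed combination: plain average of cosine and Dice."""
    return (cosine(a, b) + dice(a, b)) / 2


def tree_similarity(query, doc, query_w, doc_w):
    """Sum of leaf similarities, each scaled by the averaged arc weight."""
    total = 0.0
    for field, w in query_w.items():
        avg_w = (w + doc_w.get(field, 0.0)) / 2
        total += leaf_similarity(query.get(field, ""), doc.get(field, "")) * avg_w
    return total


# Hypothetical taxonomy weights for the five metadata fields (sum to 1).
WEIGHTS = {"owner": 0.1, "description": 0.2, "title": 0.4,
           "disposition_content": 0.2, "type": 0.1}

doc = {"owner": "finance", "description": "yearly budget summary",
       "title": "annual budget report", "disposition_content": "for review",
       "type": "report"}
query = dict(doc, title="budget report")

score = tree_similarity(query, doc, WEIGHTS, WEIGHTS)
print(f"similarity = {score:.3f}")
```

Because the weights sum to 1, a document compared with itself scores 1, and changing only the title lowers the score in proportion to the title's weight.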

Literature Review in Computational Linguistics Issues in the Developing Field of Consumer Informatics

Health Information Systems ◽

10.4018/978-1-60566-988-5.ch016 ◽

2011 ◽

pp. 226-232

Author(s):

Ki Jung Lee

Keyword(s):

Information Retrieval ◽

Computational Linguistics ◽

Information Structure ◽

Information Needs ◽

Retrieval System ◽

Medical Information ◽

Information Retrieval System ◽

Search Results ◽

Medical Information Retrieval ◽

On Line

With the increased use of the Internet, a large number of consumers first consult online resources for their healthcare decisions. The problem with the existing information structure lies primarily in the fact that the vocabulary used in consumer queries is intrinsically different from the vocabulary represented in medical literature. Consequently, medical information retrieval often provides poor search results. Since consumers make medical decisions based on the search results, building an effective information retrieval system becomes an essential issue. By reviewing the foundational concepts and application components of medical information retrieval, this paper will contribute to a body of research that seeks appropriate answers to a question like “How can we design a medical information retrieval system that can satisfy consumers’ information needs?”

A Weighted-Tree Similarity Algorithm for Multi-Agent Systems in E-Business Environments

Computational Intelligence ◽

10.1111/j.0824-7935.2004.00255.x ◽

2004 ◽

Vol 20 (4) ◽

pp. 584-602 ◽

Cited By ~ 37

Author(s):

Virendrakumar C. Bhavsar ◽

Harold Boley ◽

Lu Yang

Keyword(s):

Multi Agent Systems ◽

Weighted Tree ◽

Agent Systems ◽

Business Environments ◽

Similarity Algorithm ◽

Multi Agent ◽

Tree Similarity

PERSONALIZING THE SOURCE SELECTION AND THE RESULT MERGING PROCESS

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213009000159 ◽

2009 ◽

Vol 18 (02) ◽

pp. 331-354 ◽

Cited By ~ 7

Author(s):

SAMIR KECHID ◽

HABIBA DRIAS

Keyword(s):

Information Retrieval ◽

Information Needs ◽

Retrieval System ◽

Information Source ◽

Information Access ◽

Information Retrieval System ◽

Specific Information ◽

Distributed Environment ◽

Source Selection ◽

User Query

The World Wide Web is developing incessantly and very fast. Currently, finding useful information on the Web is a time-consuming process. In this paper, we present PIRS, a personalized Information Retrieval System for a distributed environment. Most prior research in distributed information access focused on selecting and merging information that has the most relevant content according to the query but ignored the user's specific needs. The underlying idea is that different users have different backgrounds, goals, and interests when seeking information, and thus the same query may cover different specific information needs depending on who issued it. However, with the ever-expanding Web, users are faced with a huge number of information resources. Consequently, such query-based information access strategies lead to inaccurate query results. PIRS extends the state of the art in Web-based information retrieval systems in distributed environments. First, it develops models for representing both the user and the information source using feature-based profiles. Second, PIRS expands a user query according to the user's profile. Third, it develops algorithms for source selection and result merging that personalize the computation of the relevance score of a document in response to the user's query. PIRS has been tested with several known information sources. The experimental results show the effectiveness of our approach.

Literature Review in Computational Linguistics Issues in the Developing Field of Consumer Informatics

Handbook of Research on Text and Web Mining Technologies ◽

10.4018/978-1-59904-990-8.ch043 ◽

2010 ◽

pp. 758-765 ◽

Cited By ~ 1

Author(s):

Ki Jung Lee

Keyword(s):

Information Retrieval ◽

Computational Linguistics ◽

Information Structure ◽

Information Needs ◽

Retrieval System ◽

Medical Information ◽

Information Retrieval System ◽

Search Results ◽

Medical Information Retrieval ◽

On Line

Semantic Search on Unstructured Data

Semantic-Enabled Advancements on the Web ◽

10.4018/978-1-4666-0185-7.ch009 ◽

2012 ◽

pp. 194-213

Author(s):

Alex Kohn ◽

François Bry ◽

Alexander Manta

Keyword(s):

Information Retrieval ◽

Retrieval System ◽

Pharmaceutical Research ◽

Information Retrieval System ◽

Semantic Search ◽

Unstructured Data ◽

Search Performance ◽

Retrieval Performance ◽

Enterprise Search ◽

Existing Data

Studies agree that searchers are often not satisfied with the performance of current enterprise search engines. As a consequence, more scientists worldwide are actively investigating new avenues for searching to improve retrieval performance. This paper contributes to YASA (Your Adaptive Search Agent), a fully implemented and thoroughly evaluated ontology-based information retrieval system for the enterprise. A salient feature of YASA is that large parts of the ontology are automatically filled with facts by recycling and transforming existing data. YASA offers context-based personalization, faceted navigation, and semantic search capabilities. YASA has been deployed and evaluated in the pharmaceutical research department of Roche, Penzberg, and results show that even semantically simple ontologies suffice to considerably improve search performance.

UcEF for Semantic IR

Advanced Concepts, Methods, and Applications in Semantic Computing - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-7998-6697-8.ch010 ◽

2021 ◽

pp. 190-217

Author(s):

Bernard Ijesunor Akhigbe

Keyword(s):

Systematic Review ◽

Information Retrieval ◽

Information Needs ◽

Semantic Search ◽

Related Data ◽

Multiple Contexts ◽

Descriptive Approach ◽

The Impact ◽

Cognitive User ◽

Data Analytic

At present, keyword-based techniques allow information retrieval (IR) but are unable to capture the conceptualizations in users' information needs and contents. The response to this has been semantic search computing, with commendable success. Surprisingly, it is still difficult to evaluate Semantic IR (SIR) and understand the user contexts. The absence of a standardized cognitive user-centred evaluative paradigm (CUcEP) further exacerbates these challenges. This chapter presents the state of the art in IR and SIR evaluation and a systematic review of contexts. Appropriate user-centred theories and the proposed evaluative framework with its integrated-context, web analytic conception, and related data analytic technique are presented. A descriptive approach is adopted, with the conclusion that multiple contexts are essential in SIR evaluation since “searching by meaning” is a multi-dimensional cognitive conception, hence the need to consider the impact of context dynamicity. Finally, the foregrounded semantic items will be applied to standardize the CUcEP in the future.

Document Search in Information Retrieval System Using Vector Space Model

10.1109/iceeie52663.2021.9616735 ◽

2021 ◽

Author(s):

Yusrandi ◽

Muladi ◽

Harits Ar Rosyid ◽

Abd Kadir Mahamad

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Retrieval System ◽

Vector Space Model ◽

Information Retrieval System ◽

Space Model ◽

Document Search

Towards a Statistical Approach to the Analysis, the Indexing, and the Semantic Search of Medical Videoconferences

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2017070103 ◽

2017 ◽

Vol 7 (3) ◽

pp. 38-61 ◽

Cited By ~ 1

Author(s):

Ameni Yengui ◽

Mahmoud Neji

Keyword(s):

Information Retrieval ◽

Statistical Approach ◽

Retrieval System ◽

Statistical Technique ◽

Information Retrieval System ◽

Semantic Search ◽

Query Reformulation ◽

Research Information ◽

External Resources ◽

Conceptual Graph

In this article, the authors introduce their OSSVIRI information retrieval system, which is composed of three modules. In the analysis module, they propose a statistical technique that exploits word frequency to extract the simple, compound, and specific terms from the documents. In the indexing module, the authors use the ontology to associate the terms with their concepts, retrieve the relations between them, and disambiguate the concepts to improve the semantic content of the documents. The concepts and relations are represented as a conceptual graph. In the research module, the authors propose a technique of user query reformulation based on external resources and users' profiles, and a pairing technique based on the combined expansion of the requests and the documents, guided by the context of the information need and the documentary contents. This system is validated using the metrics from the research information and comparisons with existing statistical approaches. The authors show that their approach achieves good results.

The Application of Extended Weighted Tree Similarity Algorithm for Similarity Searching

2019 International Conference on Information and Communications Technology (ICOIACT) ◽

10.1109/icoiact46704.2019.8938468 ◽

2019 ◽

Author(s):

Akrilvalerat Deainert Wierfi ◽

Ema Utami ◽

Andi Sunyoto

Keyword(s):

Similarity Searching ◽

Weighted Tree ◽

Similarity Algorithm ◽

Tree Similarity

Ontology-Based Semantic Retrieval for Management Information System

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.278-280.2069 ◽

2013 ◽

Vol 278-280 ◽

pp. 2069-2072

Author(s):

Jin Xing Shen

Keyword(s):

Information System ◽

Information Retrieval ◽

Management Information System ◽

Traditional Method ◽

Retrieval System ◽

Information Retrieval System ◽

Semantic Search ◽

Semantic Retrieval ◽

Management Information ◽

Research Information

To achieve semantic retrieval of scientific research information on the WWW, this paper applies an ontology-based framework to the information retrieval system of a management information system. After analyzing the limitations of the traditional method, it puts forward a semantic search approach and introduces the idea of semantic retrieval, as well as the way to construct ontology entities and the language that describes them. Moreover, a semantic retrieval system based on ontology is also given. The application to retrieving project information shows that the framework can overcome the limitations of other ontology models, and this research facilitates the semantic retrieval of management information through semantic retrieval concepts on the Semantic Web.
