Smart combination of web measures for solving semantic similarity problems

2017
Author(s): Jorge Martinez‐Gil, José F. Aldana‐Montes

Semantic similarity measures are very important in many computer‐related fields. Applications such as data integration, query expansion, tag refactoring, and text clustering have long relied on them. Despite this usefulness, the problem of measuring the similarity between two text expressions remains a key challenge, which this paper aims to address.
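
The title refers to combining web measures, i.e. similarity scores derived from search-engine statistics. As an illustration of that family (not necessarily one of the measures the authors combine), the sketch below computes the Normalized Web Distance of Cilibrasi and Vitányi from hypothetical hit counts and turns it into a similarity score:

```python
import math

def normalized_web_distance(hits_x: float, hits_y: float,
                            hits_xy: float, total_pages: float) -> float:
    """Normalized Web Distance (Cilibrasi & Vitanyi): 0 for terms that
    always co-occur; larger values mean weaker association."""
    fx, fy, fxy = math.log(hits_x), math.log(hits_y), math.log(hits_xy)
    n = math.log(total_pages)
    return (max(fx, fy) - fxy) / (n - min(fx, fy))

def web_similarity(hits_x, hits_y, hits_xy, total_pages):
    # Map the unbounded distance into a (0, 1] similarity score.
    return math.exp(-normalized_web_distance(hits_x, hits_y, hits_xy, total_pages))

# Hypothetical hit counts for two terms and their co-occurrence
# (illustrative numbers only, not real search-engine data).
print(web_similarity(hits_x=1.2e7, hits_y=9.0e6, hits_xy=2.5e6, total_pages=5e10))
```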

Author(s): David Sánchez, Montserrat Batet

The Information Content (IC) of a concept quantifies the amount of information it provides when it appears in a context. Traditionally, IC was computed from concept occurrence probabilities in corpora, but corpus dependency and data sparseness hampered the results. More recently, other authors have tried to overcome these limitations by estimating IC from the knowledge modeled in an ontology. In this paper, the authors develop this idea, proposing a new model that computes the IC of a concept by exploiting the taxonomic knowledge modeled in an ontology. Compared with related works, their proposal aims to better capture the semantic evidence found in the ontology. To test the approach, the authors applied it to well-known semantic similarity measures, which were evaluated using standard benchmarks. Results show that their model produces, in most cases, more accurate similarity estimations than related works.
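
A well-known intrinsic-IC formulation in this line of work is the one by Seco et al., which scores a concept by how many hyponyms (descendants) it has in the taxonomy; the authors' model refines this idea with further taxonomic evidence. A minimal sketch of the Seco-style formula:

```python
import math

def intrinsic_ic(num_hyponyms: int, total_concepts: int) -> float:
    """Intrinsic IC in the style of Seco et al.: leaf concepts (no hyponyms)
    are maximally informative (IC = 1); the root tends toward IC = 0."""
    return 1.0 - math.log(num_hyponyms + 1) / math.log(total_concepts)

# Toy taxonomy sizes (hypothetical, WordNet-like figures).
total = 82115                          # concepts in the taxonomy
print(intrinsic_ic(0, total))          # a leaf concept -> 1.0
print(intrinsic_ic(1200, total))       # a mid-level concept
print(intrinsic_ic(total - 1, total))  # the root -> 0.0
```

Such IC values plug directly into IC-based measures like Lin's similarity, sim(c1, c2) = 2·IC(lcs) / (IC(c1) + IC(c2)), where lcs is the least common subsumer of the two concepts.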


2021, Vol 177, pp. 114922
Author(s): Mehdi Jabalameli, Mohammadali Nematbakhsh, Reza Ramezani

2020, Vol 7 (1)
Author(s): Ali A. Amer, Hassan I. Abdalla

Similarity measures have long been utilized in information retrieval and machine learning for many purposes, including text retrieval, text clustering, text summarization, plagiarism detection, and several other text-processing applications. However, until recently no single measure had been recorded as both highly effective and efficient, so the quest for such a similarity measure remains an open challenge. This study therefore introduces a new, highly effective and time-efficient similarity measure for text clustering and classification. It also provides a comprehensive examination of seven of the most widely used similarity measures, focusing on their effectiveness and efficiency. Using the K-nearest neighbor (KNN) algorithm for classification, the K-means algorithm for clustering, and the bag-of-words (BoW) model for feature selection, all similarity measures are examined in detail. The experimental evaluation was carried out on two popular datasets, Reuters-21 and Web-KB. The results confirm that the proposed set theory-based similarity measure (STB-SM) significantly outperforms all state-of-the-art measures with regard to both effectiveness and efficiency.
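
The abstract does not give the STB-SM formula itself, but the measure's family is easy to illustrate: set-theoretic measures score two documents by the overlap of their term sets. The sketch below uses plain Jaccard similarity over bag-of-words term sets as a stand-in, not the authors' actual STB-SM:

```python
def set_similarity(doc_a: str, doc_b: str) -> float:
    """Generic set-theoretic similarity (Jaccard) over bag-of-words term
    sets -- an illustration of the family STB-SM belongs to, not the
    authors' actual formula."""
    terms_a = set(doc_a.lower().split())
    terms_b = set(doc_b.lower().split())
    if not terms_a and not terms_b:
        return 1.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)

print(set_similarity("semantic similarity of short texts",
                     "measuring similarity between texts"))  # -> 2/7
```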


2021, Vol 54 (2), pp. 1-37
Author(s): Dhivya Chandrasekaran, Vijay Mago

Estimating the semantic similarity between text data is one of the challenging open research problems in Natural Language Processing (NLP). The versatility of natural language makes it difficult to define rule-based methods for determining semantic similarity. To address this issue, various semantic similarity methods have been proposed over the years. This survey article traces the evolution of such methods, from traditional NLP techniques such as kernel-based methods to the most recent work on transformer-based models, categorizing them by their underlying principles as knowledge-based, corpus-based, deep neural network–based, and hybrid methods. By discussing the strengths and weaknesses of each method, the survey provides a comprehensive view of existing systems so that new researchers can experiment and develop innovative ideas to address the problem of semantic similarity.
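
Of the categories surveyed, corpus-based and deep-neural-network methods typically reduce similarity to the cosine between vector representations. A minimal sketch with hypothetical low-dimensional embeddings (real models such as word2vec or transformer encoders use hundreds of dimensions):

```python
import math

def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine similarity: the standard scoring function for embedding-based
    semantic similarity methods."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 4-dimensional embeddings (illustrative values only).
v_car = [0.8, 0.1, 0.3, 0.0]
v_automobile = [0.7, 0.2, 0.4, 0.1]
print(cosine_similarity(v_car, v_automobile))  # close to 1.0 for near-synonyms
```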


2018, Vol 14 (2), pp. 16-36
Author(s): Carlos Ramón Rangel, Junior Altamiranda, Mariela Cerrada, Jose Aguilar

Procedures for merging two ontologies are mostly limited to enriching one of the input ontologies: the knowledge of the aligned concepts from one ontology is copied into the other. As a consequence, the resulting ontology extends the original knowledge of the base ontology, but the unaligned concepts of the other ontology are not carried over into the new, extended ontology. On the other hand, there are expert-aided semi-automatic approaches for including the knowledge that is left out of the merged ontology and for debugging possible concept redundancy. To include all the knowledge of the ontologies to be merged without introducing redundancy, this article proposes an automatic merging approach based on semantic similarity measures and an exhaustive search for the closest concepts. The authors' approach was compared with other merging algorithms and obtained good results in terms of completeness, relationships, and properties, without creating redundancy.
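
A minimal sketch of the merging idea described above: every concept of the second ontology is exhaustively compared against the base ontology, aligned concepts are kept once (avoiding redundancy), and unaligned concepts are copied in so no knowledge is lost. The similarity function here is a simple string-overlap stand-in; the authors' approach uses proper semantic similarity measures, which would also align synonymous labels such as "Car" and "Automobile":

```python
from difflib import SequenceMatcher

def label_similarity(a: str, b: str) -> float:
    # Stand-in for a real semantic similarity measure (string overlap only).
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def merge_concepts(base: set[str], other: set[str],
                   threshold: float = 0.85) -> set[str]:
    """Exhaustively search `other` for the closest concept in `base`:
    close matches are treated as already present (no redundancy);
    everything else is copied in so no knowledge is left out."""
    merged = set(base)
    for concept in other:
        best = max((label_similarity(concept, c) for c in base), default=0.0)
        if best < threshold:  # no close match -> genuinely new concept
            merged.add(concept)
    return merged

print(merge_concepts({"Vehicle", "Car", "Engine"},
                     {"Automobile", "Engine", "Wheel"}))
```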

