CoTO: A Novel Approach for Fuzzy Aggregation of Semantic Similarity Measures

2017
Author(s):  
Jorge Martinez-Gil

Semantic similarity measurement aims to determine the likeness between two text expressions that use different lexicographies to represent the same real-world object or idea. Many semantic similarity measures have been proposed to address this problem, but the best results have been achieved by aggregating a number of simple measures: after the individual similarity values have been calculated, the overall similarity for a pair of text expressions is computed with an aggregation function over those values. This aggregation is usually performed by means of statistical functions. In this work, we present CoTO (Consensus or Trade-Off), a solution based on fuzzy logic that is able to outperform these traditional approaches.
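
To make the aggregation step concrete: the sketch below contrasts a statistical aggregation (the arithmetic mean) with a simple fuzzy consensus-or-trade-off rule built from the standard t-conorm/t-norm pair max and min. This is only an illustration of the general idea, not the published CoTO algorithm; the rule and the agreement threshold are assumptions made for the sketch.

```python
# Minimal sketch: aggregating individual semantic similarity scores.
# NOT the published CoTO algorithm; it only illustrates the difference
# between statistical and fuzzy-logic aggregation.

def mean_aggregation(scores):
    """Traditional statistical aggregation: arithmetic mean."""
    return sum(scores) / len(scores)

def fuzzy_consensus_or_tradeoff(scores, agreement_threshold=0.2):
    """Illustrative fuzzy rule (assumed for this sketch):
    - if the individual measures agree (small spread), take their
      consensus via the t-conorm max (an optimistic OR);
    - otherwise trade off via the t-norm min (a pessimistic AND).
    """
    spread = max(scores) - min(scores)
    if spread <= agreement_threshold:    # measures agree -> consensus
        return max(scores)
    return min(scores)                   # measures disagree -> trade-off

# Example: three simple measures score the pair ("car", "automobile").
scores = [0.85, 0.90, 0.80]
print(mean_aggregation(scores))             # 0.85
print(fuzzy_consensus_or_tradeoff(scores))  # 0.90 (spread 0.10 <= 0.2)
```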

Author(s):  
Jorge Martinez-Gil

Semantic similarity measurement of biomedical nomenclature aims to determine the likeness between two biomedical expressions that use different lexicographies to represent the same biomedical concept. Many semantic similarity measures have been proposed to address this issue, and most of them represent an incremental improvement over their predecessors. In this work, we present a further incremental solution that outperforms existing approaches by using a sophisticated aggregation method based on fuzzy logic. Results show that our strategy consistently beats existing approaches on well-known biomedical benchmark data sets.
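
Such benchmarks are typically solved by correlating the measure's scores with expert similarity ratings for a fixed list of term pairs. The sketch below shows that evaluation pattern; the term pairs, ratings, and the token-overlap stub standing in for the aggregated measure are all hypothetical, not taken from the paper or any real biomedical benchmark.

```python
# Sketch of a typical benchmark evaluation: correlate a similarity
# measure's output with expert ratings. All data below is made up.
from statistics import correlation  # Pearson's r (Python 3.10+)

benchmark = [  # (term_1, term_2, expert_rating in [0, 1]) -- hypothetical
    ("myocardial infarction", "heart attack", 0.95),
    ("renal failure", "kidney failure", 0.90),
    ("diarrhea", "stomach cramps", 0.40),
]

def my_similarity(a, b):
    """Placeholder for the measure under evaluation: token overlap."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb)

ratings = [r for _, _, r in benchmark]
scores = [my_similarity(a, b) for a, b, _ in benchmark]
print(correlation(ratings, scores))  # higher r = better measure
```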


2017
Author(s):  
Jorge Martinez-Gil

Computing the semantic similarity between terms (or short text expressions) that have the same meaning but are not lexicographically similar is a key challenge in many computer-related fields. The problem is that traditional approaches to semantic similarity measurement are not suitable for all situations: many of them fail on terms not covered by synonym dictionaries, or cannot cope with acronyms, abbreviations, buzzwords, brand names, proper nouns, and so on. In this paper, we present and evaluate a collection of emerging techniques developed to avoid this problem. These techniques use various kinds of web intelligence to determine the degree of similarity between text expressions, implementing paradigms that include the study of co-occurrence, text snippet comparison, frequent pattern finding, and search log analysis. The goal is to replace the traditional techniques where necessary.
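
The co-occurrence paradigm mentioned above can be illustrated with the Normalized Google Distance of Cilibrasi and Vitányi, which estimates relatedness from search-engine hit counts. In the sketch below, the hit_count() helper and the index size N are assumptions; real search APIs and their result counts vary.

```python
# Sketch of a co-occurrence web measure: Normalized Google Distance
# (Cilibrasi & Vitanyi). hit_count() is a hypothetical stub; a real
# system would query a web search API.
from math import log

N = 5e10  # assumed number of indexed pages

def hit_count(query: str) -> float:
    """Hypothetical stub with made-up counts."""
    fake_counts = {"rock": 9.0e6, "roll": 7.0e6, "rock roll": 3.0e6}
    return fake_counts.get(query, 1.0)

def ngd(x: str, y: str) -> float:
    fx, fy, fxy = (log(hit_count(q)) for q in (x, y, f"{x} {y}"))
    return (max(fx, fy) - fxy) / (log(N) - min(fx, fy))

def web_similarity(x: str, y: str) -> float:
    """Map the distance to a similarity in [0, 1] (one common convention)."""
    return max(0.0, 1.0 - ngd(x, y))

print(web_similarity("rock", "roll"))  # ~0.88 with the stub counts
```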


Author(s):  
Bojan Furlan ◽  
Vladimir Sivački ◽  
Davor Jovanović ◽  
Boško Nikolić

This paper presents methods for measuring the semantic similarity of texts, evaluating different approaches based on existing similarity measures. On one side, word similarity was calculated by processing large text corpora; on the other, a commonsense knowledge base was used. Given that a large fraction of the information available today, on the Web and elsewhere, consists of short text snippets (e.g., abstracts of scientific documents, image captions, or product descriptions) where commonsense knowledge plays an important role, we focus on computing the similarity between two sentences or two short paragraphs by extending existing measures with information from the ConceptNet knowledge base. Since extensive research has also been done in the field of corpus-based semantic similarity, we additionally evaluated existing solutions with some modifications. Through experiments performed on a paraphrase data set, we demonstrate that some of the proposed approaches can improve the semantic similarity measurement of short texts.
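
A common way to lift word-level similarity to sentence level, in the spirit of the approach above, is greedy word alignment: each word in one sentence is matched to its most similar word in the other, and the best-match scores are averaged over both directions. In the sketch below, word_sim() stands in for a combined corpus-based plus knowledge-base (e.g., ConceptNet) word measure; the lookup table is a made-up stub, not the paper's method.

```python
# Sketch of sentence similarity via greedy word alignment. word_sim()
# is a placeholder for a corpus-based measure with a knowledge-base
# fallback; the table below is hypothetical.

WORD_SIM = {  # assumed word-level similarities, treated as symmetric
    ("car", "automobile"): 0.95,
    ("fast", "quick"): 0.85,
}

def word_sim(a: str, b: str) -> float:
    if a == b:
        return 1.0
    return WORD_SIM.get((a, b), WORD_SIM.get((b, a), 0.0))

def sentence_sim(s1: str, s2: str) -> float:
    """Average, over both directions, of each word's best match."""
    w1, w2 = s1.lower().split(), s2.lower().split()
    best1 = [max(word_sim(a, b) for b in w2) for a in w1]
    best2 = [max(word_sim(a, b) for a in w1) for b in w2]
    return (sum(best1) / len(best1) + sum(best2) / len(best2)) / 2

print(sentence_sim("the car is fast", "the automobile is quick"))  # 0.95
```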

