Smart combination of web measures for solving semantic similarity problems

2017
Author(s): Jorge Martinez‐Gil, José F. Aldana‐Montes

Semantic similarity measures are very important in many computer‐related fields. Applications such as data integration, query expansion, tag refactoring, and text clustering have long relied on them. Despite this usefulness, the problem of measuring the similarity between two text expressions remains a key challenge, which this paper aims to address.
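
The title refers to combining web measures, i.e. similarity scores derived from search-engine statistics. As an illustration of that family (not necessarily one of the measures the authors combine), the sketch below computes the Normalized Web Distance of Cilibrasi and Vitányi from hypothetical hit counts and turns it into a similarity score:

```python
import math

def normalized_web_distance(hits_x: float, hits_y: float,
                            hits_xy: float, total_pages: float) -> float:
    """Normalized Web Distance (Cilibrasi & Vitanyi): 0 for terms that
    always co-occur; larger values mean weaker association."""
    fx, fy, fxy = math.log(hits_x), math.log(hits_y), math.log(hits_xy)
    n = math.log(total_pages)
    return (max(fx, fy) - fxy) / (n - min(fx, fy))

def web_similarity(hits_x, hits_y, hits_xy, total_pages):
    # Map the unbounded distance into a (0, 1] similarity score.
    return math.exp(-normalized_web_distance(hits_x, hits_y, hits_xy, total_pages))

# Hypothetical hit counts for two terms and their co-occurrence
# (illustrative numbers only, not real search-engine data).
print(web_similarity(hits_x=1.2e7, hits_y=9.0e6, hits_xy=2.5e6, total_pages=5e10))
```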

Author(s): David Sánchez, Montserrat Batet

The Information Content (IC) of a concept quantifies the amount of information it provides when it appears in a context. Traditionally, IC was computed from concept occurrence probabilities in corpora, but corpus dependency and data sparseness hampered the results. More recently, other authors have tried to overcome these limitations by estimating IC from the knowledge modeled in an ontology. In this paper, the authors develop this idea, proposing a new model that computes the IC of a concept by exploiting the taxonomic knowledge modeled in an ontology. Compared with related works, their proposal aims to better capture the semantic evidence found in the ontology. To test the approach, the authors applied it to well-known semantic similarity measures, which were evaluated using standard benchmarks. Results show that their model produces, in most cases, more accurate similarity estimations than related works.
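
A well-known intrinsic-IC formulation in this line of work is the one by Seco et al., which scores a concept by how many hyponyms (descendants) it has in the taxonomy; the authors' model refines this idea with further taxonomic evidence. A minimal sketch of the Seco-style formula:

```python
import math

def intrinsic_ic(num_hyponyms: int, total_concepts: int) -> float:
    """Intrinsic IC in the style of Seco et al.: leaf concepts (no hyponyms)
    are maximally informative (IC = 1); the root tends toward IC = 0."""
    return 1.0 - math.log(num_hyponyms + 1) / math.log(total_concepts)

# Toy taxonomy sizes (hypothetical, WordNet-like figures).
total = 82115                          # concepts in the taxonomy
print(intrinsic_ic(0, total))          # a leaf concept -> 1.0
print(intrinsic_ic(1200, total))       # a mid-level concept
print(intrinsic_ic(total - 1, total))  # the root -> 0.0
```

Such IC values plug directly into IC-based measures like Lin's similarity, sim(c1, c2) = 2·IC(lcs) / (IC(c1) + IC(c2)), where lcs is the least common subsumer of the two concepts.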


2021, Vol 177, pp. 114922
Author(s): Mehdi Jabalameli, Mohammadali Nematbakhsh, Reza Ramezani

2020, Vol 7 (1)
Author(s): Ali A. Amer, Hassan I. Abdalla

Similarity measures have long been utilized in information retrieval and machine learning for many purposes, including text retrieval, text clustering, text summarization, plagiarism detection, and several other text-processing applications. However, until recently no single measure had been recorded as both highly effective and efficient, so the quest for such a similarity measure remains an open challenge. This study therefore introduces a new, highly effective and time-efficient similarity measure for text clustering and classification. It also provides a comprehensive examination of seven of the most widely used similarity measures, focusing on their effectiveness and efficiency. Using the K-nearest neighbor (KNN) algorithm for classification, the K-means algorithm for clustering, and the bag-of-words (BoW) model for feature selection, all similarity measures are examined in detail. The experimental evaluation was carried out on two popular datasets, Reuters-21 and Web-KB. The results confirm that the proposed set theory-based similarity measure (STB-SM) significantly outperforms all state-of-the-art measures with regard to both effectiveness and efficiency.
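
The abstract does not give the STB-SM formula itself, but the measure's family is easy to illustrate: set-theoretic measures score two documents by the overlap of their term sets. The sketch below uses plain Jaccard similarity over bag-of-words term sets as a stand-in, not the authors' actual STB-SM:

```python
def set_similarity(doc_a: str, doc_b: str) -> float:
    """Generic set-theoretic similarity (Jaccard) over bag-of-words term
    sets -- an illustration of the family STB-SM belongs to, not the
    authors' actual formula."""
    terms_a = set(doc_a.lower().split())
    terms_b = set(doc_b.lower().split())
    if not terms_a and not terms_b:
        return 1.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)

print(set_similarity("semantic similarity of short texts",
                     "measuring similarity between texts"))  # -> 2/7
```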


2021, Vol 54 (2), pp. 1-37
Author(s): Dhivya Chandrasekaran, Vijay Mago

Estimating the semantic similarity between text data is one of the challenging open research problems in Natural Language Processing (NLP). The versatility of natural language makes it difficult to define rule-based methods for determining semantic similarity. To address this issue, various semantic similarity methods have been proposed over the years. This survey article traces the evolution of such methods, from traditional NLP techniques such as kernel-based methods to the most recent work on transformer-based models, categorizing them by their underlying principles as knowledge-based, corpus-based, deep neural network–based, and hybrid methods. By discussing the strengths and weaknesses of each method, the survey provides a comprehensive view of existing systems so that new researchers can experiment and develop innovative ideas to address the problem of semantic similarity.
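
Of the categories surveyed, corpus-based and deep-neural-network methods typically reduce similarity to the cosine between vector representations. A minimal sketch with hypothetical low-dimensional embeddings (real models such as word2vec or transformer encoders use hundreds of dimensions):

```python
import math

def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine similarity: the standard scoring function for embedding-based
    semantic similarity methods."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 4-dimensional embeddings (illustrative values only).
v_car = [0.8, 0.1, 0.3, 0.0]
v_automobile = [0.7, 0.2, 0.4, 0.1]
print(cosine_similarity(v_car, v_automobile))  # close to 1.0 for near-synonyms
```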


2018, Vol 14 (2), pp. 16-36
Author(s): Carlos Ramón Rangel, Junior Altamiranda, Mariela Cerrada, Jose Aguilar

Procedures for merging two ontologies are mostly limited to enriching one of the input ontologies: the knowledge of the aligned concepts from one ontology is copied into the other. As a consequence, the resulting ontology extends the original knowledge of the base ontology, but the unaligned concepts of the other ontology are not carried over into the new, extended ontology. On the other hand, there are expert-aided semi-automatic approaches for including the knowledge that is left out of the merged ontology and for debugging possible concept redundancy. To include all the knowledge of the ontologies to be merged without introducing redundancy, this article proposes an automatic merging approach based on semantic similarity measures and an exhaustive search for the closest concepts. The authors' approach was compared with other merging algorithms and obtained good results in terms of completeness, relationships, and properties, without creating redundancy.
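
A minimal sketch of the merging idea described above: every concept of the second ontology is exhaustively compared against the base ontology, aligned concepts are kept once (avoiding redundancy), and unaligned concepts are copied in so no knowledge is lost. The similarity function here is a simple string-overlap stand-in; the authors' approach uses proper semantic similarity measures, which would also align synonymous labels such as "Car" and "Automobile":

```python
from difflib import SequenceMatcher

def label_similarity(a: str, b: str) -> float:
    # Stand-in for a real semantic similarity measure (string overlap only).
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def merge_concepts(base: set[str], other: set[str],
                   threshold: float = 0.85) -> set[str]:
    """Exhaustively search `other` for the closest concept in `base`:
    close matches are treated as already present (no redundancy);
    everything else is copied in so no knowledge is left out."""
    merged = set(base)
    for concept in other:
        best = max((label_similarity(concept, c) for c in base), default=0.0)
        if best < threshold:  # no close match -> genuinely new concept
            merged.add(concept)
    return merged

print(merge_concepts({"Vehicle", "Car", "Engine"},
                     {"Automobile", "Engine", "Wheel"}))
```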

