Enhancing collective entity resolution utilizing Quasi-Clique similarity measure

Errors with names occur frequently. “California” and “CA” refer to the same state of the USA; however, they may both appear as records in a database at the same time. Several techniques need to be proposed to solve these problems. In this chapter, the authors introduce the methods of entity resolution on names. They propose three methods. Similarity measure between names is a kind of fundamental techniques; it makes a significant contribution to the textual similarity. The method of string transformations can handle some situations beyond textual similarity. Recently, learning algorithms on string transformations have been proposed to make matching robust to such variations. Examples illustrate the benefits of each approach.

Download Full-text

A novel similarity measure for spatial entity resolution based on data granularity model: Managing inconsistencies in place descriptions

Applied Intelligence ◽

10.1007/s10489-020-01959-y ◽

2021 ◽

Author(s):

Mohammad Khodizadeh-Nahari ◽

Nasser Ghadiri ◽

Ahmad Baraani-Dastjerdi ◽

Jörg-Rüdiger Sack

Keyword(s):

Similarity Measure ◽

Entity Resolution ◽

Data Granularity ◽

Spatial Entity ◽

Place Descriptions

Download Full-text

Handling WSD using Hierarchical Clustering Algorithm with sentences

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset1841120 ◽

2018 ◽

pp. 83-88

Author(s):

Mohana Priya K ◽

Pooja Ragavi S ◽

Krishna Priya G

Keyword(s):

Hierarchical Clustering ◽

Similarity Measure ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Cosine Similarity Measure ◽

Hierarchical Clustering Algorithm ◽

Multiple Levels ◽

Pos Tagger ◽

Sentence Clustering ◽

The Right

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used based on different applications. Sentence clustering is one of best clustering technique. Hierarchical Clustering Algorithm is applied for multiple levels for accuracy. For tagging purpose POS tagger, porter stemmer is used. WordNet dictionary is utilized for determining the similarity by invoking the Jiang Conrath and Cosine similarity measure. Grouping is performed with respect to the highest similarity measure value with a mean threshold. This paper incorporates many parameters for finding similarity between words. In order to identify the disambiguated words, the sense identification is performed for the adjectives and comparison is performed. semcor and machine learning datasets are employed. On comparing with previous results for WSD, our work has improvised a lot which gives a percentage of 91.2%

Download Full-text