record similarity
Recently Published Documents


2019 ◽  
pp. 10-30
Author(s):  
Ying Zhang ◽  
Puhai Yang ◽  
Chaopeng Li ◽  
Gengrui Zhang ◽  
Cheng Wang ◽  
...  

This article describes how geographic information systems (GISs) can enable, enrich and enhance geospatial applications and services. Accurately calculating the similarity among geospatial entities that belong to different data sources is of great importance for geospatial data linking. At present, most research uses the name or category of an entity to measure the similarity of geographic information. Although geospatial relationships are significant for measuring geographic similarity, most previous work has ignored them. This article introduces geospatial relationships and topology, and proposes an approach that computes geospatial record similarity from multiple features, including geospatial relationships, category and name tags. To improve flexibility and operability, supervised machine learning such as a support vector machine (SVM) is used to classify pairs of mapping records. The authors test their approach using three sources: OpenStreetMap, Google and Wikimapia. The results showed that the proposed approach correlated highly with human judgements.
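
As a rough illustration of computing record similarity from multiple features, the sketch below builds a feature vector for a pair of place records. It is not the authors' method: the record fields (`name`, `category`, `lat`, `lon`), the use of `difflib` for name similarity, and great-circle distance as a stand-in for the geospatial relationship features are all assumptions made for this example. A vector like this could then be passed to a supervised classifier such as an SVM.

```python
import math
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (assumed name feature)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(h))


def pair_features(rec_a: dict, rec_b: dict) -> list:
    """Feature vector for a record pair: [name similarity, category match, distance in km].

    The field names and the choice of features are hypothetical.
    """
    return [
        name_similarity(rec_a["name"], rec_b["name"]),
        1.0 if rec_a["category"] == rec_b["category"] else 0.0,
        haversine_km(rec_a["lat"], rec_a["lon"], rec_b["lat"], rec_b["lon"]),
    ]
```

Distance alone does not capture topology (containment, adjacency and so on); it is used here only because it needs no external geometry library.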



A basic task of entity resolution is to detect duplicate records in a single relation. Many approaches have been proposed to address this problem in different application areas. The basic step of entity resolution is attribute similarity computation, and many techniques build on attribute similarity methods to carry out entity resolution. The rule-based approach is one of the main techniques. To speed up duplicate record detection, the authors use techniques such as canopy clustering and blocking. This chapter focuses on record similarity computation, the rule-based approach, similarity threshold computation, and blocking.
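
The combination of blocking, attribute similarity, and a threshold rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's algorithm: the record fields (`name`, `city`), the blocking key (first letter of the name), the rule itself, and the 0.85 threshold are all hypothetical choices.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block_key(record: dict) -> str:
    """Hypothetical blocking key: lowercased first letter of the name."""
    return record["name"][:1].lower()


def attribute_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_duplicate(r1: dict, r2: dict, threshold: float = 0.85) -> bool:
    """Example rule: names similar at or above the threshold AND same city."""
    return (
        attribute_similarity(r1["name"], r2["name"]) >= threshold
        and r1["city"].lower() == r2["city"].lower()
    )


def find_duplicates(records: list, threshold: float = 0.85) -> list:
    """Return index pairs judged duplicates, comparing only within blocks."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[block_key(rec)].append(idx)

    pairs = []
    for ids in blocks.values():
        # Only records that share a blocking key are compared.
        for i, j in combinations(ids, 2):
            if is_duplicate(records[i], records[j], threshold):
                pairs.append((i, j))
    return sorted(pairs)
```

Blocking trades recall for speed: records whose keys differ are never compared, so a typo in the first letter of a name would hide a true duplicate under this particular key.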


2004 ◽  
Vol 36 (5) ◽  
pp. 365-370 ◽  
Author(s):  
Shun-Liang Cao ◽  
Lei Qin ◽  
Wei-Zhong He ◽  
Yang Zhong ◽  
Yang-Yong Zhu ◽  
...  

Semantic search is a key issue in the integration of heterogeneous biological databases. In this paper, we present a methodology for implementing semantic search in BioDW, an integrated biological data warehouse. Two tables are presented: the DB2GO table, which correlates Gene Ontology (GO) annotated entries from BioDW data sources with GO, and the semantic similarity table, which records similarity scores derived from any pair of GO terms. Based on the two tables, several ways of performing semantic search are provided, so that corresponding entries in heterogeneous biological databases can be searched efficiently in semantic terms.
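
One way to picture how the two tables might work together is the small in-memory sketch below. The actual BioDW schema is not given in the abstract, so everything here is assumed for illustration: the dictionary layout, the placeholder term identifiers (`GO:X1` and so on are not real GO terms), the entry names, the scores, and the `min_score` cutoff.

```python
def pair_key(term_a: str, term_b: str) -> tuple:
    """Order-independent key so each pair of terms is stored once."""
    return tuple(sorted((term_a, term_b)))


# Hypothetical DB2GO table: database entry -> set of annotated GO terms.
DB2GO = {
    "entryA": {"GO:X1"},
    "entryB": {"GO:X2"},
    "entryC": {"GO:X3"},
}

# Hypothetical semantic similarity table: term pair -> score (made-up values).
SIMILARITY = {
    pair_key("GO:X1", "GO:X2"): 0.8,
    pair_key("GO:X1", "GO:X3"): 0.2,
}


def term_similarity(term_a: str, term_b: str, table: dict) -> float:
    """Look up a pair score; identical terms score 1.0, unknown pairs 0.0."""
    if term_a == term_b:
        return 1.0
    return table.get(pair_key(term_a, term_b), 0.0)


def semantic_search(query_term: str, db2go: dict, table: dict, min_score: float = 0.5) -> list:
    """Return (entry, best score) for entries annotated with a sufficiently similar term."""
    hits = []
    for entry, terms in db2go.items():
        best = max((term_similarity(query_term, t, table) for t in terms), default=0.0)
        if best >= min_score:
            hits.append((entry, best))
    # Highest score first, ties broken by entry name.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

Precomputing pair scores in a table, as the abstract describes, turns each semantic comparison into a lookup rather than a traversal of the ontology at query time.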

