An Approach to Probabilistic Data Integration for the Semantic Web

Author(s):  
Andrea Calì ◽  
Thomas Lukasiewicz
Author(s):  
Tapio Niemi ◽  
Santtu Toivonen ◽  
Marko Niinimaki ◽  
Jyrki Nummenmaa

2019 ◽  
pp. 230-253
Author(s):  
Ying Zhang ◽  
Chaopeng Li ◽  
Na Chen ◽  
Shaowen Liu ◽  
Liming Du ◽  
...  

Because large amounts of geospatial data are produced by various sources and stored in incompatible formats, geospatial data integration is difficult owing to the shortage of semantics. Although standardised data formats and data access protocols, such as Web Feature Service (WFS), give end users access to heterogeneous data stored in different formats from various sources, integration remains time-consuming and ineffective because of the lack of semantics. To solve this problem, a prototype implementing geospatial data integration is proposed that addresses four problems: geospatial data retrieving, modeling, linking and integrating. First, we provide a uniform integration paradigm for users to retrieve geospatial data. Then, in the modeling process, we align the retrieved geospatial data with the help of Karma to eliminate heterogeneity. Our main contribution addresses the third problem. Previous work performs the linking process by defining a set of semantic rules. However, geospatial data has specific geospatial relationships, which are significant for linking but cannot be handled directly by Semantic Web techniques. We take advantage of these unique features of geospatial data to implement the linking process. In addition, previous work runs into complications when the geospatial data sources are in different languages. In contrast, our proposed linking algorithms include a translation function, which saves the cost of translating among geospatial sources in different languages. Finally, the geospatial data is integrated by eliminating data redundancy and combining the complementary properties of the linked records. We mainly use four geospatial data sources, namely OpenStreetMap (OSM), Wikimapia, USGS and EPA, to evaluate the performance of the proposed approach. The experimental results show that the proposed linking method achieves high performance in generating matched candidate record pairs in terms of Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ) and F-score. The integration results indicate that each data source gains substantially in Complementary Completeness (CC) and Increased Completeness (IC).
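The abstract names RR, PC, PQ and F-score without defining them. As a rough illustration only, the sketch below uses the definitions commonly found in the record-linkage literature; the paper itself may define them differently, and in particular taking the F-score as the harmonic mean of PC and PQ is an assumption. The function name `blocking_metrics` and the toy record identifiers are hypothetical. CC and IC are specific to the paper and are not reproduced here.

```python
def blocking_metrics(candidates, true_matches, n_a, n_b):
    """Common blocking metrics for linking two sources (assumed definitions).

    candidates   -- set of (a_id, b_id) pairs proposed by the linking step
    true_matches -- set of (a_id, b_id) pairs known to be the same entity
    n_a, n_b     -- number of records in source A and source B
    """
    total_pairs = n_a * n_b                # size of the full comparison space
    found = candidates & true_matches      # true matches kept as candidates

    # Reduction Ratio: share of the comparison space that was pruned away.
    rr = 1 - len(candidates) / total_pairs if total_pairs else 0.0
    # Pairs Completeness: share of true matches that survived (recall-like).
    pc = len(found) / len(true_matches) if true_matches else 0.0
    # Pairs Quality: share of candidates that are true matches (precision-like).
    pq = len(found) / len(candidates) if candidates else 0.0
    # F-score: harmonic mean of PC and PQ (assumption, see lead-in).
    f = 2 * pc * pq / (pc + pq) if (pc + pq) else 0.0
    return {"RR": rr, "PC": pc, "PQ": pq, "F": f}


# Toy data: 4 records in source A, 5 in source B (hypothetical identifiers).
candidates = {(0, 0), (1, 1), (2, 3), (3, 4)}
true_matches = {(0, 0), (1, 1), (2, 2)}
metrics = blocking_metrics(candidates, true_matches, n_a=4, n_b=5)
print(metrics)
```

In this toy case 4 of 20 possible pairs are kept and 2 of the 3 true matches are among them, so a high RR comes at the cost of one missed match; that trade-off between RR and PC is what the F-score is meant to summarise.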


Author(s):  
César J. Acuña ◽  
Mariano Minoli ◽  
Esperanza Marcos

Several systems integration proposals have been suggested over the years. However, these proposals have mainly focused on data integration and do not allow users to take advantage of the services offered by Web portals. Most of them provide only a set of design principles for building integrated systems and do not suggest a systematic way to develop systems based on the integration architecture they propose. In previous work we developed PISA (Web Portal Integration Architecture), a Web portal integration architecture for data and services, and MIDAS-S, a methodological approach for the development of integrated Web portals built according to PISA. This work shows, by means of a case study, how the two proposals fit together to integrate Web portals.

