On Demand ETL of RDB to RDF Mapping for Linked Enterprise Data

Author(s):  
Lehireche Nesrine ◽  
Malki Mimoun ◽  
Lehireche Ahmed ◽  
Reda Mohamed Hamou

The purpose of the semantic web goes well beyond the simple provision of raw data: it is a matter of linking data together. This data-meshing approach, called linked data (LD), refers to a set of best practices for publishing and interlinking data on the web. Its principles gave rise to a new context called linked enterprise data (LED). LED is the application of linked data to the enterprise information system (IS) in order to address the challenges of an IS and obtain an agile, well-performing system in which internal data sources link to external data and information is easily accessible in reasonable time. This article focuses on using LED to support the challenges of database integration and reviews the state of the art in mapping RDB to RDF based on LD. The authors then propose an on-demand extract, transform, load (ETL) approach to RDB-to-RDF mapping, described through algorithms. Finally, they present a conclusion and discuss their perspectives for implementing the solution.
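To make the idea of on-demand RDB-to-RDF mapping concrete, the following is a minimal sketch, not the authors' algorithms. It assumes a simple convention (one subject IRI per row, one predicate IRI per column, all values as plain literals) and a hypothetical base IRI `http://example.org/`; the table and function names are illustrative only. The generator produces triples lazily, so rows are only converted when a consumer asks for them.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base IRI, an assumption of this sketch


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(conn, table, key_column):
    """Lazily yield one N-Triples line per non-null, non-key cell of `table`."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        # Row -> subject IRI built from the table name and the key value.
        subject = f"<{BASE}{table}/{record[key_column]}>"
        for column, value in record.items():
            if column == key_column or value is None:
                continue  # NULL cells produce no triple
            # Column -> predicate IRI.
            predicate = f"<{BASE}{table}#{column}>"
            yield f'{subject} {predicate} "{escape_literal(value)}" .'


# Illustrative in-memory database (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Ada', 'R&D')")
conn.execute("INSERT INTO employee VALUES (2, 'Bob', NULL)")

triples = list(rows_to_ntriples(conn, "employee", "id"))
```

Foreign keys, datatypes, and links to external datasets are deliberately left out; a real mapping would need to handle all three.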

Author(s):  
Heiko Paulheim ◽  
Christian Bizer

Linked Data on the Web is either created from structured data sources (such as relational databases), from semi-structured sources (such as Wikipedia), or from unstructured sources (such as text). In the latter two cases, the generated Linked Data will likely be noisy and incomplete. In this paper, we present two algorithms that exploit statistical distributions of properties and types for enhancing the quality of incomplete and noisy Linked Data sets: SDType adds missing type statements, and SDValidate identifies faulty statements. Neither of the algorithms uses external knowledge, i.e., they operate only on the data itself. We evaluate the algorithms on the DBpedia and NELL knowledge bases, showing that they are both accurate as well as scalable. Both algorithms have been used for building the DBpedia 3.9 release: With SDType, 3.4 million missing type statements have been added, while using SDValidate, 13,000 erroneous RDF statements have been removed from the knowledge base.
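The idea of inferring missing types from the statistical distribution of properties can be sketched as follows. This is a simplified illustration, not the published SDType scoring: it assumes an unweighted average over the properties a resource uses as subject, a hypothetical confidence threshold of 0.5, and made-up toy data.

```python
from collections import Counter, defaultdict


def learn_type_distributions(triples, types):
    """For each property, estimate P(type | subject uses that property)."""
    counts = defaultdict(Counter)
    totals = Counter()
    for subject, prop, _ in triples:
        if subject in types:  # only typed subjects contribute evidence
            totals[prop] += 1
            for t in types[subject]:
                counts[prop][t] += 1
    return {p: {t: c / totals[p] for t, c in counts[p].items()} for p in counts}


def predict_types(resource, triples, distributions, threshold=0.5):
    """Average the per-property type distributions and keep confident types."""
    props = [p for s, p, _ in triples if s == resource and p in distributions]
    if not props:
        return {}
    scores = Counter()
    for p in props:
        for t, prob in distributions[p].items():
            scores[t] += prob / len(props)
    return {t: s for t, s in scores.items() if s >= threshold}


# Toy data: Hamburg has no type statement.
triples = [
    ("Berlin", "population", "3.6M"), ("Berlin", "mayor", "X"),
    ("Paris", "population", "2.1M"),
    ("Einstein", "birthPlace", "Ulm"),
    ("Hamburg", "population", "1.8M"), ("Hamburg", "mayor", "Y"),
]
types = {"Berlin": {"City"}, "Paris": {"City"}, "Einstein": {"Person"}}

distributions = learn_type_distributions(triples, types)
predicted = predict_types("Hamburg", triples, distributions)
```

As in the abstract, nothing outside the data itself is consulted: the distributions are learned from the typed resources and applied to the untyped one.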


2017 ◽  
Vol 22 (1) ◽  
pp. 21-37 ◽  
Author(s):  
Matthew T. Mccarthy

The web of linked data, otherwise known as the semantic web, is a system in which information is structured and interlinked to provide meaningful content to artificial intelligence (AI) algorithms. As the complex interactions between digital personae and these algorithms mediate access to information, it becomes necessary to understand how these classification and knowledge systems are developed. What are the processes by which those systems come to represent the world, and how are the controversies that arise in their creation overcome? As a global form, the semantic web is an assemblage of many interlinked classification and knowledge systems, which are themselves assemblages. Through the perspectives of global assemblage theory, critical code studies and practice theory, I analyse netnographic data of one such assemblage. Schema.org is but one component of the larger global assemblage of the semantic web, and as such is an emergent articulation of different knowledges, interests and networks of actors. This articulation comes together to tame the profusion of things, seeking stability in representation, but in the process, it faces and produces more instability. Furthermore, this production of instability contributes to the emergence of new assemblages that have similar aims.


Author(s):  
Marlene Goncalves ◽  
Alberto Gobbi

Location-based Skyline queries select the nearest objects to a point that best meet the user's preferences. In particular, this chapter focuses on location-based Skyline queries over web-accessible data. Web-accessible data may have a geographical location and be geotagged with documents containing ratings by web users. Location-based Skyline queries may express preferences based on dynamic features such as distance and changeable ratings. In this context, distance must be recalculated whenever a user changes position, while ratings must be extracted from external data sources that are updated each time a user scores an item on the Web. This chapter describes and empirically studies four solutions capable of answering location-based Skyline queries, taking into account changes in the user's position and information extraction from the Web within a search area around the user. They are based on an M-Tree index and the Divide & Conquer principle.
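A location-based Skyline query can be illustrated with a naive sketch. It is not one of the four solutions studied in the chapter: it uses a quadratic pairwise comparison instead of an M-Tree index or Divide & Conquer, assumes Euclidean distance on planar coordinates, and uses made-up places and ratings. A place belongs to the skyline if no other place is both at least as close and at least as well rated, and strictly better on one of the two.

```python
from math import hypot


def dominates(a, b):
    """a dominates b if it is no worse on both criteria and better on one."""
    return (a["distance"] <= b["distance"] and a["rating"] >= b["rating"]
            and (a["distance"] < b["distance"] or a["rating"] > b["rating"]))


def location_skyline(places, user_xy):
    """Return the names of non-dominated places for the given user position."""
    ux, uy = user_xy
    # Distance is a dynamic feature: it is recomputed for every user position.
    scored = [{"name": name, "distance": hypot(x - ux, y - uy), "rating": rating}
              for name, (x, y), rating in places]
    return sorted(p["name"] for p in scored
                  if not any(dominates(q, p) for q in scored))


# (name, (x, y), rating) with illustrative values.
places = [("A", (1, 0), 3.0), ("B", (2, 0), 4.5), ("C", (3, 0), 4.0), ("D", (5, 0), 5.0)]

print(location_skyline(places, (0, 0)))  # ['A', 'B', 'D']
print(location_skyline(places, (6, 0)))  # ['D']
```

Moving the user from (0, 0) to (6, 0) changes the result even though the ratings are unchanged, which is why the skyline has to be re-evaluated when the position changes.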


2018 ◽  
Vol 14 (3) ◽  
pp. 134-166 ◽  
Author(s):  
Amit Singh ◽  
Aditi Sharan

This article describes how semantic web data sources follow linked data principles to facilitate efficient information retrieval and knowledge sharing. These data sources may provide complementary, overlapping or contradictory information. In order to integrate these data sources, the authors perform entity linking: the task of identifying and linking entities across data sources that refer to the same real-world entity. In this work, they propose a genetic fuzzy approach to learning linkage rules for entity linking. The method is domain independent, automatic and scalable. Their approach uses fuzzy logic to adapt the mutation and crossover rates of genetic programming to ensure guided convergence. The authors' experimental evaluation demonstrates that their approach is competitive and makes significant improvements over state-of-the-art methods.
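To show what a linkage rule might look like once it has been learned, here is a minimal sketch. It does not reproduce the genetic programming or the fuzzy adaptation of mutation and crossover rates; it only applies a fixed rule. The rule shape (per-field weights plus a threshold), the field names, and the records are all assumptions made for illustration, and string similarity comes from the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(rule, left, right):
    """Apply a linkage rule: weighted field similarities compared to a threshold."""
    score = sum(weight * similarity(left[field], right[field])
                for field, weight in rule["weights"].items())
    return score >= rule["threshold"]


# A hypothetical rule of the kind a learner could output.
rule = {"weights": {"name": 0.7, "city": 0.3}, "threshold": 0.8}

# Made-up records from two data sources.
a = {"name": "Acme Corp.", "city": "Paris"}
b = {"name": "ACME Corp", "city": "paris"}
c = {"name": "Globex Ltd", "city": "Berlin"}

same_entity = is_match(rule, a, b)
different_entity = is_match(rule, a, c)
```

In a learning setting, the weights, the threshold, and the choice of similarity functions would be the parts of the rule that the search procedure evolves.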


Author(s):  
Christian Bizer ◽  
Tom Heath ◽  
Tim Berners-Lee

The term “Linked Data” refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions: the Web of Data. In this article, the authors present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. They describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.


2018 ◽  
Vol 14 (3) ◽  
pp. 167-183
Author(s):  
Ahmed Ktob ◽  
Zhoujun Li

This article describes how many new technologies have recently been introduced to the web, of which linked data is probably the most important. Individuals and organizations have started publishing their data on the web, adhering to a set of best practices. This data is published mostly in English; hence, only English agents can consume it. Meanwhile, although the number of Arabic users on the web is immense, few Arabic datasets are published. Publication catalogs are one of the primary sources of Arabic data that are not being exploited. Arabic catalogs provide a significant amount of meaningful data and metadata, commonly stored in Excel sheets. In this article, an effort has been made to help publishers easily and efficiently share their catalogs' data as linked data. Marefa is the first tool implemented that automatically extracts RDF triples from Arabic catalogs, aligns them to the BIBO ontology and links them with the Arabic chapter of DBpedia. An evaluation of the framework was conducted, and some statistical measures were generated during the different phases of the extraction process.


2014 ◽  
Vol 519-520 ◽  
pp. 1568-1571
Author(s):  
Su Yan Wu ◽  
Wen Bo Li

A knowledge-service-oriented expert management system is characterized by instant, on-demand, accurate, comprehensive, and personalized service. To meet the requirements for establishing such a system, this paper studies the construction of expert resources and expert retrieval services: building an open expert body based on crowdsourcing technology, access to expert information from heterogeneous data sources, and an expert retrieval method based on expert body interaction semantics.


Author(s):  
Kurt Cagle

Resource Oriented Architectures use the fundamental characteristics of the web itself in order to provide and update content on the web. While much of the philosophy concerning REST has been around since the early 1990s, the tools for turning these philosophies into working systems are only now becoming feasible. The fundamental tenets of ROA (that the web itself is primarily a giant database, that resources are abstractions that can be manifest in different representations, that a query-oriented resource architecture is more robust than a verb-oriented services architecture, and that a common publishing and syndication protocol is necessary to make such an architectural system work) are being adopted increasingly by people who realize that service-oriented architectures are not effective at getting data from users or providing it to them in an easy-to-use way, but that ROA can do precisely that. Such a shift in perception is necessary but will nonetheless take a while to happen. It's necessary because the amount of information on the web is piling up faster than it can be indexed, and because under the current architectures the cost of developing "editors" for that data is prohibitive compared to the value of that information. It's necessary because the data within organizations is getting more complex than can be readily handled with a name/value approach to application development, and is increasingly contained within non-traditional data sources, such as Excel spreadsheets, Microsoft Word documents, or external data streams. Adoption will take time, however, because such an approach reduces the competitive barriers that corporations can utilize to sell services, because it will take time to educate people in the underlying technologies and because there is a long-standing belief that ROA and SOA systems are incompatible.
The rapidity at which companies lined up behind AtomPub, on the other hand, points to the fact that many IT organizations recognize the value to themselves that an AtomPub-type architecture opens up, while the educational curve is frankly true of most technologies — it will happen, slower than its proponents may hope but faster than its critics anticipate.


2021 ◽  
Vol 10 (10) ◽  
pp. 712
Author(s):  
Christian Zinke-Wehlmann ◽  
Amit Kirschenbaum

Geospatial linked data are an emerging domain, with growing interest in research and industry. There is an increasing number of publicly available geospatial linked data resources, which can also be interlinked and easily integrated with private and industrial linked data on the web. The present paper introduces Geo-L, a system for the discovery of RDF spatial links based on topological relations. Experiments show that the proposed system improves state-of-the-art spatial linking processes in terms of mapping time and accuracy, as well as resource retrieval efficiency and robustness.
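Discovering spatial links from topological relations can be sketched in a few lines. This is not Geo-L's implementation: it assumes every geometry is reduced to an axis-aligned bounding box, uses a small hypothetical set of relation names, and works on made-up features. Real systems evaluate relations on exact geometries, typically with a spatial index.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box (an assumption standing in for a real geometry)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def inside(a, b):
    """True if box a lies entirely within box b."""
    return (a.min_x >= b.min_x and a.max_x <= b.max_x
            and a.min_y >= b.min_y and a.max_y <= b.max_y)


def relation(a, b):
    """Classify the topological relation of a with respect to b."""
    if a == b:
        return "equals"
    if (a.min_x > b.max_x or b.min_x > a.max_x
            or a.min_y > b.max_y or b.min_y > a.max_y):
        return "disjoint"
    if inside(a, b):
        return "within"
    if inside(b, a):
        return "contains"
    return "intersects"


def discover_links(sources, targets):
    """Return (source, relation, target) links for every non-disjoint pair."""
    return [(s, relation(sb, tb), t)
            for s, sb in sources.items()
            for t, tb in targets.items()
            if relation(sb, tb) != "disjoint"]


# Illustrative features from two hypothetical datasets.
sources = {"bench": Box(2, 2, 3, 3), "road": Box(8, -5, 9, 20), "lake": Box(20, 20, 30, 30)}
targets = {"park": Box(0, 0, 10, 10)}

links = discover_links(sources, targets)
```

Each resulting tuple could then be serialized as an RDF statement linking the two resources with a predicate that names the relation.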


Author(s):  
B. Margan ◽  
F. Hakimpour

Abstract. Linked Data is data available on the web in a standard format that supports content inspection and deriving insights from data through semantic queries. Using Linked Data facilitates querying and exploring the spatial and temporal features of various data sources. In this paper, an application is presented for linking transport data on the web. Data from the Google Maps API and OpenStreetMap were linked and published on the web. Spatio-temporal queries were executed over the linked transport data and returned network and traffic information according to the user’s position. The client side of this application consists of a web application and a mobile application, which present a user interface for accessing this information. The results of the experiment show that by using the intrinsic potential of Linked Data we have tackled the challenges of using heterogeneous data sources and have provided information that could be used for discovering new patterns. The mobile GIS application makes it possible to assess the benefits of these technologies in an easy, user-friendly way.

