Creating Linked Data from Relational Databases

Author(s):  
Nikolaos Konstantinou ◽  
Dimitrios-Emmanuel Spanos
Author(s):  
Heiko Paulheim ◽  
Christian Bizer

Linked Data on the Web is created either from structured data sources (such as relational databases), from semi-structured sources (such as Wikipedia), or from unstructured sources (such as text). In the latter two cases, the generated Linked Data will likely be noisy and incomplete. In this paper, we present two algorithms that exploit statistical distributions of properties and types to enhance the quality of incomplete and noisy Linked Data sets: SDType adds missing type statements, and SDValidate identifies faulty statements. Neither algorithm uses external knowledge, i.e., both operate only on the data itself. We evaluate the algorithms on the DBpedia and NELL knowledge bases, showing that they are both accurate and scalable. Both algorithms were used in building the DBpedia 3.9 release: with SDType, 3.4 million missing type statements were added, and with SDValidate, 13,000 erroneous RDF statements were removed from the knowledge base.
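
The idea of inferring missing types from the statistical distribution of properties and types can be illustrated with a small sketch. This is not the authors' SDType implementation: it is a simplified, unweighted vote, and the function names, the toy triples, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def learn_type_distributions(triples, types):
    """Estimate, per property, how often its subjects carry each type."""
    counts = defaultdict(Counter)
    totals = Counter()
    for subject, prop, _obj in triples:
        if subject in types:  # only typed resources contribute statistics
            totals[prop] += 1
            for t in types[subject]:
                counts[prop][t] += 1
    return {p: {t: c / totals[p] for t, c in counts[p].items()} for p in counts}


def predict_types(resource, triples, dists, threshold=0.5):
    """Average the per-property type distributions of an untyped resource."""
    props = [p for s, p, _o in triples if s == resource and p in dists]
    if not props:
        return {}
    scores = Counter()
    for p in props:
        for t, prob in dists[p].items():
            scores[t] += prob / len(props)  # unweighted average (an assumption)
    return {t: score for t, score in scores.items() if score >= threshold}


# Hypothetical toy data: "Hamburg" has no type statement.
triples = [
    ("Berlin", "population", "3.6M"), ("Berlin", "country", "Germany"),
    ("Paris", "population", "2.1M"), ("Paris", "country", "France"),
    ("Einstein", "birthPlace", "Ulm"), ("Einstein", "country", "Germany"),
    ("Hamburg", "population", "1.8M"), ("Hamburg", "country", "Germany"),
]
types = {"Berlin": {"City"}, "Paris": {"City"}, "Einstein": {"Person"}}

dists = learn_type_distributions(triples, types)
predicted = predict_types("Hamburg", triples, dists)
print(predicted)
```

In this toy data only the type "City" passes the threshold for "Hamburg". A faithful implementation would, among other things, weight each property by how discriminative it is and also consider the properties for which the resource appears as an object.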


Author(s):  
Wei Lijun ◽  
Pan Yang ◽  
Wang Hao ◽  
Wang Xianchao ◽  
Zhang Yan

To compensate for the shortcomings of Linked Data in expressing semantics, this paper proposes a method for expressing the semantics of associated entities based on relationship diagrams, so that the associated semantics in relational databases can be expressed and recognized by machines. Starting from the structure and relationships of the relational schema, the paper analyzes the rich semantics of associated entities and presents a semantic parsing method based on traversal paths, together with its formal expression; an instance database is also analyzed. The results show that the method can comprehensively parse and express the associated semantics of entities. The work is a useful reference for research on intelligent semantic synthesis and on semantics-oriented intelligent query.


Author(s):  
Fernando-Luis Álvarez ◽  
Joaquín-L. Gómez-Pantoja ◽  
Elena García-Barriocanal

2010 ◽  
Vol 15 (6) ◽  
pp. 642-649 ◽  
Author(s):  
Jing Zhang ◽  
Chune Ma ◽  
Chenting Zhao ◽  
Jun Zhang ◽  
Li Yi ◽  
...  

2018 ◽  
Vol 7 (3.3) ◽  
pp. 84
Author(s):  
Ju Ri Kim ◽  
Zhanfang Zhao ◽  
Sung Kook Han

Background/Objectives: Mapping RDB to RDF has become important for populating Linked Data more efficiently. This paper shows how to implement a SPARQL endpoint in an RDB using a conceptual-level mapping approach. Methods/Statistical analysis: Many diverse approaches and related languages for mapping RDB to RDF have been proposed. The most prominent results are the two standard drafts, Direct Mapping and R2RML, proposed by the W3C RDB2RDF Working Group. This paper analyzes these conventional mapping approaches and proposes a new approach based on schema mapping. The paper also presents SPARQL query processing in RDB. Findings: There are distinct differences between instance-level mapping and conceptual-level mapping for RDB2RDF. The data redundancy of instance-level mapping causes many unavoidable problems during the mapping procedure, whereas conceptual-level mapping offers a straightforward and efficient alternative. The ER model in RDB and the RDF model in Linked Data are clearly similar. The ER model, which is the conceptual schema of an RDB, describes entities and relationships. The RDF model, the standard model for data interchange on the Web, consists of three parts: subject, predicate, and object. Entities in the ER model and subjects in the RDF model can both be anything in the real world, and both relationships in the ER model and predicates in the RDF model describe relations between things. Since RDB and RDF share a similar modeling approach at the schema level, it is reasonable to base the mapping approach on the RDB schema. This kind of conceptual-level mapping can also provide efficient SPARQL query processing in RDB. Improvements/Applications: The paper realizes SPARQL query processing in RDB based on conceptual-level mapping. The query experiments show that it is a concise and efficient way to populate Linked Data.
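
To make the instance-level mapping discussed in the abstract above concrete, the following sketch turns every row of a table into RDF triples, loosely in the style of the W3C Direct Mapping. It is an illustration only, not the paper's method: the base IRI, the `direct_map` helper, and the sample table are hypothetical, the table is assumed to have a primary key, foreign keys are ignored, and literals are emitted as plain strings without escaping or datatypes.

```python
import sqlite3

BASE = "http://example.org/db/"  # hypothetical base IRI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_map(conn, table):
    """Emit one N-Triples-like line per row type and per non-NULL cell."""
    cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    names = [c[1] for c in cols]             # column names
    pk = [c[1] for c in cols if c[5] > 0]    # primary-key columns
    triples = []
    for row in conn.execute(f'SELECT * FROM "{table}"'):
        record = dict(zip(names, row))
        key = ";".join(f"{k}={record[k]}" for k in pk)
        subject = f"<{BASE}{table}/{key}>"   # row IRI built from the key
        triples.append(f"{subject} <{RDF_TYPE}> <{BASE}{table}> .")
        for name, value in record.items():
            if value is None:                # NULL cells produce no triple
                continue
            triples.append(f'{subject} <{BASE}{table}#{name}> "{value}" .')
    return triples


# Hypothetical sample table with a single row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (ID INTEGER PRIMARY KEY, fname TEXT)")
conn.execute("INSERT INTO People VALUES (7, 'Bob')")

triples = direct_map(conn, "People")
for line in triples:
    print(line)
```

Because every cell of every row becomes its own triple, the output grows with the data volume, which is the kind of redundancy the abstract attributes to instance-level mapping; a conceptual-level approach maps the schema instead.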


Author(s):  
Benamar Bouougada ◽  
Djelloul Bouchiha ◽  
Abdelghani Bouziane ◽  
Mimoun Malki

One of the fundamental problems in the development of the Semantic Web is ontology authoring. This process allows domain experts to create ontologies and their instances from relational databases and/or web applications using dedicated tools. This article presents an approach for building OWL ontologies and RDF instances from web applications. The proposed approach starts with a reverse engineering process that recovers the original design from the web application source code using program understanding techniques. A forward engineering process is then applied to create an OWL ontology from the recovered diagrams, based on a set of mapping rules. The approach is realized in the PHP2OWLGen tool and evaluated on a set of web applications. The results were encouraging and showed the efficiency of the proposed approach.


Author(s):  
D. Ulutaş Karakol ◽  
G. Kara ◽  
C. Yılmaz ◽  
Ç. Cömert

Abstract. Large amounts of spatial data are held in relational databases. Spatial data in relational databases must be converted to RDF for Semantic Web applications. Spatial data is a key factor in creating spatial RDF data. Linked Data is the way users most prefer to publish and share relational data on the Web. To define the semantics of the data, links are provided to vocabularies (ontologies or other external web resources) that are common conceptualizations for a domain. Linking the data of a resource vocabulary with globally published concepts of domain resources combines different data sources and datasets; makes data more understandable, discoverable, and usable; improves data interoperability and integration; enables automatic reasoning; and prevents data duplication. The need to convert relational data to RDF is becoming apparent owing to the semantic expressiveness of Semantic Web technologies. Ontologies are one of the key factors of the Semantic Web. An ontology is an “explicit specification of a conceptualization”. The semantics of spatial data relies on ontologies. Linking spatial data from relational databases to web data sources, in order to share machine-readable interlinked data on the Web, is not an easy task. Tim Berners-Lee, the inventor of the World Wide Web and an advocate of the Semantic Web and Linked Data, laid down the Linked Data design principles. Based on these rules, firstly, spatial data in relational databases must be converted to RDF with the help of supporting tools. Secondly, spatial RDF data must be linked to upper-level domain ontologies and related web data sources. Thirdly, external data sources (ontologies and web data sources) must be determined and the spatial RDF data must be linked to the related data sources. Finally, the spatial linked data must be published on the Web.
The main contribution of this study is to determine the requirements for finding RDF links and to identify the deficiencies in creating or publishing linked spatial data. To this end, the study examines existing approaches, conversion tools, and web data sources for converting relational data to spatial RDF. We have investigated the current state of spatial RDF data, standards, open source platforms (particularly D2RQ, Geometry2RDF, TripleGeo, GeoTriples, Ontop, etc.), and web data sources. Moreover, the process of converting spatial data to RDF and linking it to web data sources is described. Linking spatial RDF data to web data sources is demonstrated with an example use case, in which road data has been linked to one of the popular related web data sources, DBpedia. Silk, a tool for discovering relationships between data items within different Linked Data sources, is used as the link discovery framework. We also evaluated other link discovery tools, e.g. LIMES and Silk, and compared the results for the matching/linking task. As a result, the linked road data is shared and represented as an information resource on the Web and is enriched with definitions from different related resources. In this way, road datasets are also linked through the related classes, individuals, spatial relations, and properties they cover, such as construction date, road length, coordinates, etc.
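
The link discovery step described in the abstract above can be sketched as label matching between a local dataset and an external one. This is a hypothetical illustration and not the API or configuration of Silk or LIMES: the `find_links` helper, the example.org IRIs, the 0.9 threshold, and the choice of `owl:sameAs` as the link predicate are all assumptions.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def find_links(local, remote, threshold=0.9):
    """Link each local resource to its most similar remote label, if close enough."""
    links = []
    for local_iri, local_label in local.items():
        best_iri, best_score = None, 0.0
        for remote_iri, remote_label in remote.items():
            # Case-insensitive string similarity in the range [0, 1].
            score = SequenceMatcher(
                None, local_label.lower(), remote_label.lower()
            ).ratio()
            if score > best_score:
                best_iri, best_score = remote_iri, score
        if best_iri is not None and best_score >= threshold:
            links.append((local_iri, OWL_SAME_AS, best_iri))
    return links


# Hypothetical road resources and candidate targets (not real DBpedia IRIs).
local_roads = {"http://example.org/road/1": "Route 66"}
remote_resources = {
    "http://example.org/remote/Route_66": "Route 66",
    "http://example.org/remote/Route_99": "Route 99",
}

links = find_links(local_roads, remote_resources)
print(links)
```

Real link discovery frameworks typically combine several comparisons (for spatial data, for example, label similarity together with geometric distance) instead of relying on a single string measure.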

