SESLDS: An Extension Scheme for Linked Data Sources Based on Semantically Enhanced Annotation and Reasoning

2017 ◽  
Vol 33 (2) ◽  
pp. 233-258 ◽  
Author(s):  
Pu Li ◽  
Bao Xiao ◽  
Aftab Akram ◽  
Yuncheng Jiang ◽  
Zhifeng Zhang


Author(s):
Pu Li ◽  
Zhifeng Zhang ◽  
Lujuan Deng ◽  
Junxia Ma ◽  
Fenglong Wu ◽  
...  

Linked Data, a new form of knowledge representation and publishing described by RDF, can provide more precise and comprehensible semantic structures. However, the current RDF Schema (RDFS) and SPARQL-based query strategy cannot fully express the semantics of RDF: they fail to surface the implicit semantics between linked entities, and so leave the potential of Linked Data untapped. To fill this gap, this chapter first defines a new semantic annotation and reasoning method that extends the implicit semantics carried by different properties, and proposes a novel general Semantically-Extended Scheme for Linked Data Sources (SESLDS) to realize semantic extension over the target Linked Data source. Moreover, in order to return more information in the process of semantic data retrieval, we then design a new querying model that extends the SPARQL pattern. Lastly, experimental results show that our proposal has advantages over the initial Linked Data source and returns more valid results than some of the most representative similarity search methods.
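The gap the chapter targets can be seen in a few lines of SPARQL. The sketch below is a minimal illustration, not the authors' SESLDS implementation; the ex: vocabulary is invented. It uses rdflib to show how a plain triple pattern misses a relation that only becomes visible once the property hierarchy is taken into account:

```python
# Minimal sketch (pip install rdflib): a plain SPARQL pattern misses
# implicit semantics that a property-hierarchy-aware pattern surfaces.
from rdflib import Graph

DATA = """
@prefix ex:   <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:capitalOf rdfs:subPropertyOf ex:locatedIn .
ex:Paris     ex:capitalOf ex:France .
ex:Lyon      ex:locatedIn ex:France .
"""

g = Graph()
g.parse(data=DATA, format="turtle")

# Plain pattern: only entities with an explicit ex:locatedIn triple match.
plain = """
SELECT ?s WHERE { ?s <http://example.org/locatedIn> <http://example.org/France> }
"""
print([str(row.s) for row in g.query(plain)])     # -> only ex:Lyon

# Extended pattern: follow the property hierarchy with a property path,
# surfacing the implicit ex:locatedIn relation of ex:Paris.
extended = """
PREFIX ex:   <http://example.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s WHERE { ?p rdfs:subPropertyOf* ex:locatedIn . ?s ?p ex:France . }
"""
print([str(row.s) for row in g.query(extended)])  # -> ex:Lyon and ex:Paris
```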


Author(s):  
Heiko Paulheim ◽  
Christian Bizer

Linked Data on the Web is created from structured data sources (such as relational databases), from semi-structured sources (such as Wikipedia), or from unstructured sources (such as text). In the latter two cases, the generated Linked Data will likely be noisy and incomplete. In this paper, we present two algorithms that exploit statistical distributions of properties and types to enhance the quality of incomplete and noisy Linked Data sets: SDType adds missing type statements, and SDValidate identifies faulty statements. Neither algorithm uses external knowledge; they operate only on the data itself. We evaluate the algorithms on the DBpedia and NELL knowledge bases, showing that they are both accurate and scalable. Both algorithms were used in building the DBpedia 3.9 release: with SDType, 3.4 million missing type statements were added, while with SDValidate, 13,000 erroneous RDF statements were removed from the knowledge base.
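As a rough illustration of the SDType idea (this is a simplified sketch, not the authors' code, and the toy triples are invented), one can estimate the type distribution of each property from the typed entities and let an untyped entity's properties vote for its type:

```python
# Sketch of SDType-style type inference from property/type statistics.
from collections import Counter, defaultdict

# (subject, property, object) triples; types are given as (s, "type", T).
triples = [
    ("Berlin", "type", "City"),
    ("Berlin", "mayor", "MayorA"),
    ("Paris", "type", "City"),
    ("Paris", "mayor", "MayorB"),
    ("Rhine", "type", "River"),
    ("Rhine", "mouth", "NorthSea"),
    ("Hamburg", "mayor", "MayorC"),   # untyped entity
]

types = {s: o for s, p, o in triples if p == "type"}

# Estimate P(type | subject uses property p) from the typed entities.
dist = defaultdict(Counter)
for s, p, o in triples:
    if p != "type" and s in types:
        dist[p][types[s]] += 1

def predict_type(entity):
    votes = Counter()
    for s, p, o in triples:
        if s == entity and p != "type":
            total = sum(dist[p].values())
            for t, n in dist[p].items():
                votes[t] += n / total   # each property votes with its distribution
    return votes.most_common(1)[0] if votes else None

print(predict_type("Hamburg"))  # -> ('City', 1.0): 'mayor' is only used by cities here
```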


Author(s):  
Jonathan Bishop

The current phenomenon of Big Data – the use of datasets that are too big for the traditional business analysis tools used in industry – is driving a shift in how social and economic problems are understood and analysed. This chapter explores the role Big Data can play in analysing the effectiveness of crowd-funding projects, using the data from one such project, which aimed to fund the development of a software plug-in called 'QPress'. The data analysed included the website metrics of impressions, clicks, and average position, which an ANOVA found to be significantly connected with geographical factors. These were combined with other country data to perform t-tests in order to form a geo-demographic understanding of those who are shown advertisements inviting participation in crowd-funding. The chapter concludes that there are a number of interacting variables and that, for Big Data studies to be effective, their amalgamation with other data sources, including linked data, is essential to providing an overall picture of the social phenomenon being studied.
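A hedged sketch of the kind of test described (the figures below are invented; the study used real website metrics per country) might look like this with SciPy:

```python
# One-way ANOVA testing whether ad impressions differ across regions.
# Requires: pip install scipy
from scipy.stats import f_oneway

# Hypothetical daily impression counts for the 'QPress' campaign,
# grouped by viewer region (the real study grouped by country).
europe   = [120, 135, 128, 140, 150]
americas = [ 90,  85, 100,  95,  88]
asia     = [ 60,  72,  65,  70,  68]

stat, p = f_oneway(europe, americas, asia)
print(f"F = {stat:.2f}, p = {p:.4f}")  # a small p suggests a geographic effect
```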


Semantic Web ◽  
2014 ◽  
Vol 5 (2) ◽  
pp. 127-142 ◽  
Author(s):  
Dimitris Zeginis ◽  
Ali Hasnain ◽  
Nikolaos Loutas ◽  
Helena Futscher Deus ◽  
Ronan Fox ◽  
...  

2018 ◽  
Vol 14 (3) ◽  
pp. 134-166 ◽  
Author(s):  
Amit Singh ◽  
Aditi Sharan

This article describes how semantic web data sources follow linked data principles to facilitate efficient information retrieval and knowledge sharing. These data sources may provide complementary, overlapping, or contradicting information. To integrate them, the authors perform entity linking: the task of identifying and linking entities across data sources that refer to the same real-world entity. In this work, they propose a genetic fuzzy approach to learning linkage rules for entity linking. The method is domain-independent, automatic, and scalable. It uses fuzzy logic to adapt the mutation and crossover rates of genetic programming to ensure guided convergence. The experimental evaluation demonstrates that the approach is competitive and makes significant improvements over state-of-the-art methods.
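The core mechanism, fuzzy adaptation of genetic operator rates, can be sketched as follows. The membership functions and rule weights are illustrative assumptions, not the authors' published parameters:

```python
# Fuzzy rules that adapt mutation/crossover rates from a population
# diversity statistic, to counter premature convergence.
def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def adapt_rates(diversity):
    """Map fitness diversity in [0, 1] to (mutation, crossover) rates."""
    low  = tri(diversity, -0.5, 0.0, 0.5)   # "low diversity" membership
    med  = tri(diversity,  0.0, 0.5, 1.0)
    high = tri(diversity,  0.5, 1.0, 1.5)
    weight = low + med + high
    # Rule base: low diversity -> mutate more; high diversity -> recombine more.
    mutation  = (low * 0.30 + med * 0.10 + high * 0.02) / weight
    crossover = (low * 0.60 + med * 0.80 + high * 0.95) / weight
    return mutation, crossover

for d in (0.1, 0.5, 0.9):
    m, c = adapt_rates(d)
    print(f"diversity={d:.1f}  mutation={m:.3f}  crossover={c:.3f}")
```

As the population converges (diversity drops), the mutation rate rises smoothly, which is the "guided convergence" behaviour the abstract describes.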


2018 ◽  
Vol 10 (8) ◽  
pp. 2613
Author(s):  
Dandan He ◽  
Zhongfu Li ◽  
Chunlin Wu ◽  
Xin Ning

Industrialized construction has raised the requirements of procurement methods used in the construction industry. The rapid development of e-commerce offers efficient and effective solutions; however, the large number of participants in the construction industry means that the data involved are complex, with problems of volume, heterogeneity, and fragmentation. Thus, the sector lags behind others in the adoption of e-commerce. In particular, data integration has become a barrier preventing further development. Traditional e-commerce platforms, which consider data integration only for common product data, cannot meet the requirements of construction product data integration. This study aimed to build an information-integrated e-commerce platform for industrialized construction procurement (ICP) that overcomes some of the shortcomings of existing platforms. We propose a platform based on Building Information Modelling (BIM) and linked data, taking an innovative approach to data integration: industrialized construction technology supports product standardization, BIM supports the procurement process, and linked data connect the different data sources. The platform was validated using a case study. With the development of an e-commerce ontology, industrialized construction component information was extracted from BIM models and converted to Resource Description Framework (RDF) format. Related information from different data sources was also converted to RDF, and SPARQL (SPARQL Protocol and RDF Query Language) queries were implemented. The platform provides a solution for the development of e-commerce platforms in the construction industry.
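The conversion-and-query step can be illustrated with rdflib. This is a minimal sketch: the ex: vocabulary and the hard-coded component attributes stand in for data the platform extracts from BIM models:

```python
# Convert extracted component attributes to RDF and join them with a
# second data source via SPARQL. Requires: pip install rdflib
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/icp#")
g = Graph()
g.bind("ex", EX)

# One precast wall panel as it might come out of a BIM model.
panel = EX.Panel_001
g.add((panel, RDF.type, EX.PrecastWallPanel))
g.add((panel, EX.widthMm, Literal(3000)))
g.add((panel, EX.supplier, EX.Supplier_A))

# A supplier record from a second data source, linked via the same URI.
g.add((EX.Supplier_A, EX.locatedIn, Literal("Dalian")))

q = """
PREFIX ex: <http://example.org/icp#>
SELECT ?panel ?city WHERE {
    ?panel a ex:PrecastWallPanel ;
           ex:supplier ?s .
    ?s ex:locatedIn ?city .
}"""
for row in g.query(q):
    print(row.panel, row.city)   # joins component data with supplier data
```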


2015 ◽  
Vol 31 (3) ◽  
pp. 415-429 ◽  
Author(s):  
Loredana Di Consiglio ◽  
Tiziana Tuoto

The capture-recapture method is a well-known solution for evaluating the unknown size of a population. Administrative data represent sources of independent counts of a population and can be jointly exploited to apply the capture-recapture method. Of course, administrative sources are affected by over- or undercoverage when considered separately. The standard Petersen approach is based on strong assumptions, including perfect record linkage between lists. In reality, record linkage results can be affected by errors. A simple method for achieving linkage-error-unbiased population total estimates is proposed in Ding and Fienberg (1994). In this article, an extension of the Ding and Fienberg model that relaxes their conditions is proposed. The procedures are illustrated by estimating the total number of road casualties on the basis of a probabilistic record linkage between two administrative data sources. Moreover, a simulation study is developed, providing evidence that the adjusted estimator always performs better than the Petersen estimator.
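The estimators at issue can be written compactly. The first formula is the standard Petersen estimator; the adjusted form is a simplified illustration in the spirit of Ding and Fienberg, under the assumption of no false links and a known true-link rate:

```latex
% n_1, n_2: sizes of the two lists; m: number of records linked across both.
\[
  \hat{N}_{\mathrm{Petersen}} = \frac{n_1\, n_2}{m}
\]
% Linkage errors bias m: missed links shrink it and inflate \hat{N},
% false links inflate it and do the opposite. Assuming no false links
% and a known true-link rate \alpha (so that E[m] = \alpha\, m_{\mathrm{true}}),
% an adjusted estimator in the spirit of Ding and Fienberg (1994) is
\[
  \hat{N}_{\mathrm{adj}} = \frac{\alpha\, n_1\, n_2}{m}
\]
```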


Author(s):  
Nadine Bachbauer

Background: NEPS-SC6-ADIAB is a new linked data product containing survey data from Starting Cohort 6 of the German National Educational Panel Study (NEPS) and administrative employment data from the Institute for Employment Research (IAB), the research institute of the Federal Employment Agency. NEPS is provided by the Leibniz Institute for Educational Trajectories (LIfBi). Starting Cohort 6 of this panel survey covers adults in their professional life; the survey focuses on education in adulthood and lifelong learning. The administrative data in NEPS-SC6-ADIAB consist of comprehensive information on employment histories.
Objectives: Combining these two data sources increases, for example, the information available on individual employment histories. Overall, the linkage between the survey data and the administrative data increases the data volume.
Methods: A record linkage process was used to link the two data sources. Data access is free for the whole scientific community. In addition to a large number of on-site access locations within Germany, there are also international on-site access locations, including London and Colchester. Remote data access is also offered.
Conclusions: This data linkage project is highly innovative and creates an extensive database with extensive analytical potential. A short application example illustrates the comprehensive analytical potential of NEPS-SC6-ADIAB. This ongoing project deals with nonresponse in survey data. The linked data contain a variety of variables collected in both data sources, administratively and through the NEPS survey, allowing comparative analyses. As one example, nonresponse in income data can be compensated with administrative data.
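The application example can be pictured with a small pandas sketch. All column and identifier names here are hypothetical, not taken from the NEPS-SC6-ADIAB documentation:

```python
# Fill income nonresponse in survey data from linked administrative records.
# Requires: pip install pandas
import pandas as pd

survey = pd.DataFrame({
    "person_id": [1, 2, 3],
    "income_survey": [32000, None, 41000],   # person 2 did not respond
})
admin = pd.DataFrame({
    "person_id": [1, 2, 3],
    "income_admin": [31500, 28000, 40800],
})

linked = survey.merge(admin, on="person_id", how="left")
# Prefer the survey answer; fall back to the administrative value.
linked["income"] = linked["income_survey"].fillna(linked["income_admin"])
print(linked[["person_id", "income"]])
```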

