SESLDS: An Extension Scheme for Linked Data Sources Based on Semantically Enhanced Annotation and Reasoning

2017 ◽  
Vol 33 (2) ◽  
pp. 233-258 ◽  
Author(s):  
Pu Li ◽  
Bao Xiao ◽  
Aftab Akram ◽  
Yuncheng Jiang ◽  
Zhifeng Zhang


Author(s):
Pu Li ◽  
Zhifeng Zhang ◽  
Lujuan Deng ◽  
Junxia Ma ◽  
Fenglong Wu ◽  
...  

Linked Data, a new form of knowledge representation and publishing described by RDF, can provide more precise and comprehensible semantic structures. However, the current RDF Schema (RDFS) and SPARQL-based query strategy cannot fully express the semantics of RDF: they fail to surface the implicit semantics between linked entities, and so leave the potential of Linked Data untapped. To fill this gap, this chapter first defines a new semantic annotation and reasoning method that extends the implicit semantics carried by different properties, and proposes a novel general Semantically-Extended Scheme for Linked Data Sources (SESLDS) to realize semantic extension over the target Linked Data source. Moreover, in order to return more information in the process of semantic data retrieval, we then design a new querying model that extends the SPARQL pattern. Lastly, experimental results show that our proposal has advantages over the initial Linked Data source and returns more valid results than some of the most representative similarity search methods.
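The gap the chapter targets can be seen in a few lines of SPARQL. The sketch below is a minimal illustration, not the authors' SESLDS implementation; the ex: vocabulary is invented. It uses rdflib to show how a plain triple pattern misses a relation that only becomes visible once the property hierarchy is taken into account:

```python
# Minimal sketch (pip install rdflib): a plain SPARQL pattern misses
# implicit semantics that a property-hierarchy-aware pattern surfaces.
from rdflib import Graph

DATA = """
@prefix ex:   <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:capitalOf rdfs:subPropertyOf ex:locatedIn .
ex:Paris     ex:capitalOf ex:France .
ex:Lyon      ex:locatedIn ex:France .
"""

g = Graph()
g.parse(data=DATA, format="turtle")

# Plain pattern: only entities with an explicit ex:locatedIn triple match.
plain = """
SELECT ?s WHERE { ?s <http://example.org/locatedIn> <http://example.org/France> }
"""
print([str(row.s) for row in g.query(plain)])     # -> only ex:Lyon

# Extended pattern: follow the property hierarchy with a property path,
# surfacing the implicit ex:locatedIn relation of ex:Paris.
extended = """
PREFIX ex:   <http://example.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s WHERE { ?p rdfs:subPropertyOf* ex:locatedIn . ?s ?p ex:France . }
"""
print([str(row.s) for row in g.query(extended)])  # -> ex:Lyon and ex:Paris
```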


Author(s):  
Heiko Paulheim ◽  
Christian Bizer

Linked Data on the Web is created from structured data sources (such as relational databases), from semi-structured sources (such as Wikipedia), or from unstructured sources (such as text). In the latter two cases, the generated Linked Data will likely be noisy and incomplete. In this paper, we present two algorithms that exploit statistical distributions of properties and types to enhance the quality of incomplete and noisy Linked Data sets: SDType adds missing type statements, and SDValidate identifies faulty statements. Neither algorithm uses external knowledge; they operate only on the data itself. We evaluate the algorithms on the DBpedia and NELL knowledge bases, showing that they are both accurate and scalable. Both algorithms were used in building the DBpedia 3.9 release: with SDType, 3.4 million missing type statements were added, while with SDValidate, 13,000 erroneous RDF statements were removed from the knowledge base.
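As a rough illustration of the SDType idea (this is a simplified sketch, not the authors' code, and the toy triples are invented), one can estimate the type distribution of each property from the typed entities and let an untyped entity's properties vote for its type:

```python
# Sketch of SDType-style type inference from property/type statistics.
from collections import Counter, defaultdict

# (subject, property, object) triples; types are given as (s, "type", T).
triples = [
    ("Berlin", "type", "City"),
    ("Berlin", "mayor", "MayorA"),
    ("Paris", "type", "City"),
    ("Paris", "mayor", "MayorB"),
    ("Rhine", "type", "River"),
    ("Rhine", "mouth", "NorthSea"),
    ("Hamburg", "mayor", "MayorC"),   # untyped entity
]

types = {s: o for s, p, o in triples if p == "type"}

# Estimate P(type | subject uses property p) from the typed entities.
dist = defaultdict(Counter)
for s, p, o in triples:
    if p != "type" and s in types:
        dist[p][types[s]] += 1

def predict_type(entity):
    votes = Counter()
    for s, p, o in triples:
        if s == entity and p != "type":
            total = sum(dist[p].values())
            for t, n in dist[p].items():
                votes[t] += n / total   # each property votes with its distribution
    return votes.most_common(1)[0] if votes else None

print(predict_type("Hamburg"))  # -> ('City', 1.0): 'mayor' is only used by cities here
```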


Author(s):  
Jonathan Bishop

The current phenomenon of Big Data – the use of datasets that are too big for the traditional business analysis tools used in industry – is driving a shift in how social and economic problems are understood and analysed. This chapter explores the role Big Data can play in analysing the effectiveness of crowd-funding projects, using the data from one such project, which aimed to fund the development of a software plug-in called 'QPress'. The data analysed included the website metrics of impressions, clicks, and average position, which an ANOVA found to be significantly connected with geographical factors. These were combined with other country data to perform t-tests in order to form a geo-demographic understanding of those who are shown advertisements inviting participation in crowd-funding. The chapter concludes that there are a number of interacting variables and that, for Big Data studies to be effective, their amalgamation with other data sources, including linked data, is essential to providing an overall picture of the social phenomenon being studied.
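A hedged sketch of the kind of test described (the figures below are invented; the study used real website metrics per country) might look like this with SciPy:

```python
# One-way ANOVA testing whether ad impressions differ across regions.
# Requires: pip install scipy
from scipy.stats import f_oneway

# Hypothetical daily impression counts for the 'QPress' campaign,
# grouped by viewer region (the real study grouped by country).
europe   = [120, 135, 128, 140, 150]
americas = [ 90,  85, 100,  95,  88]
asia     = [ 60,  72,  65,  70,  68]

stat, p = f_oneway(europe, americas, asia)
print(f"F = {stat:.2f}, p = {p:.4f}")  # a small p suggests a geographic effect
```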


Semantic Web ◽  
2014 ◽  
Vol 5 (2) ◽  
pp. 127-142 ◽  
Author(s):  
Dimitris Zeginis ◽  
Ali Hasnain ◽  
Nikolaos Loutas ◽  
Helena Futscher Deus ◽  
Ronan Fox ◽  
...  

2018 ◽  
Vol 14 (3) ◽  
pp. 134-166 ◽  
Author(s):  
Amit Singh ◽  
Aditi Sharan

This article describes how semantic web data sources follow linked data principles to facilitate efficient information retrieval and knowledge sharing. These data sources may provide complementary, overlapping, or contradicting information. To integrate them, the authors perform entity linking: the task of identifying and linking entities across data sources that refer to the same real-world entity. In this work, they propose a genetic fuzzy approach to learning linkage rules for entity linking. The method is domain-independent, automatic, and scalable. It uses fuzzy logic to adapt the mutation and crossover rates of genetic programming to ensure guided convergence. The experimental evaluation demonstrates that the approach is competitive and makes significant improvements over state-of-the-art methods.
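The core mechanism, fuzzy adaptation of genetic operator rates, can be sketched as follows. The membership functions and rule weights are illustrative assumptions, not the authors' published parameters:

```python
# Fuzzy rules that adapt mutation/crossover rates from a population
# diversity statistic, to counter premature convergence.
def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def adapt_rates(diversity):
    """Map fitness diversity in [0, 1] to (mutation, crossover) rates."""
    low  = tri(diversity, -0.5, 0.0, 0.5)   # "low diversity" membership
    med  = tri(diversity,  0.0, 0.5, 1.0)
    high = tri(diversity,  0.5, 1.0, 1.5)
    weight = low + med + high
    # Rule base: low diversity -> mutate more; high diversity -> recombine more.
    mutation  = (low * 0.30 + med * 0.10 + high * 0.02) / weight
    crossover = (low * 0.60 + med * 0.80 + high * 0.95) / weight
    return mutation, crossover

for d in (0.1, 0.5, 0.9):
    m, c = adapt_rates(d)
    print(f"diversity={d:.1f}  mutation={m:.3f}  crossover={c:.3f}")
```

As the population converges (diversity drops), the mutation rate rises smoothly, which is the "guided convergence" behaviour the abstract describes.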


2018 ◽  
Vol 10 (8) ◽  
pp. 2613
Author(s):  
Dandan He ◽  
Zhongfu Li ◽  
Chunlin Wu ◽  
Xin Ning

Industrialized construction has raised the requirements of procurement methods used in the construction industry. The rapid development of e-commerce offers efficient and effective solutions; however, the large number of participants in the construction industry means that the data involved are complex, with problems of volume, heterogeneity, and fragmentation. Thus, the sector lags behind others in the adoption of e-commerce. In particular, data integration has become a barrier preventing further development. Traditional e-commerce platforms, which consider data integration only for common product data, cannot meet the requirements of construction product data integration. This study aimed to build an information-integrated e-commerce platform for industrialized construction procurement (ICP) that overcomes some of the shortcomings of existing platforms. We propose a platform based on Building Information Modelling (BIM) and linked data, taking an innovative approach to data integration: industrialized construction technology supports product standardization, BIM supports the procurement process, and linked data connect the different data sources. The platform was validated using a case study. With the development of an e-commerce ontology, industrialized construction component information was extracted from BIM models and converted to Resource Description Framework (RDF) format. Related information from different data sources was also converted to RDF, and SPARQL (SPARQL Protocol and RDF Query Language) queries were implemented. The platform provides a solution for the development of e-commerce platforms in the construction industry.
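The conversion-and-query step can be illustrated with rdflib. This is a minimal sketch: the ex: vocabulary and the hard-coded component attributes stand in for data the platform extracts from BIM models:

```python
# Convert extracted component attributes to RDF and join them with a
# second data source via SPARQL. Requires: pip install rdflib
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/icp#")
g = Graph()
g.bind("ex", EX)

# One precast wall panel as it might come out of a BIM model.
panel = EX.Panel_001
g.add((panel, RDF.type, EX.PrecastWallPanel))
g.add((panel, EX.widthMm, Literal(3000)))
g.add((panel, EX.supplier, EX.Supplier_A))

# A supplier record from a second data source, linked via the same URI.
g.add((EX.Supplier_A, EX.locatedIn, Literal("Dalian")))

q = """
PREFIX ex: <http://example.org/icp#>
SELECT ?panel ?city WHERE {
    ?panel a ex:PrecastWallPanel ;
           ex:supplier ?s .
    ?s ex:locatedIn ?city .
}"""
for row in g.query(q):
    print(row.panel, row.city)   # joins component data with supplier data
```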


2015 ◽  
Vol 31 (3) ◽  
pp. 415-429 ◽  
Author(s):  
Loredana Di Consiglio ◽  
Tiziana Tuoto

The capture-recapture method is a well-known solution for evaluating the unknown size of a population. Administrative data represent sources of independent counts of a population and can be jointly exploited to apply the capture-recapture method. Of course, administrative sources are affected by over- or undercoverage when considered separately. The standard Petersen approach is based on strong assumptions, including perfect record linkage between lists. In reality, record linkage results can be affected by errors. A simple method for achieving linkage-error-unbiased population total estimates is proposed in Ding and Fienberg (1994). In this article, an extension of the Ding and Fienberg model that relaxes their conditions is proposed. The procedures are illustrated by estimating the total number of road casualties on the basis of a probabilistic record linkage between two administrative data sources. Moreover, a simulation study is developed, providing evidence that the adjusted estimator always performs better than the Petersen estimator.
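The estimators at issue can be written compactly. The first formula is the standard Petersen estimator; the adjusted form is a simplified illustration in the spirit of Ding and Fienberg, under the assumption of no false links and a known true-link rate:

```latex
% n_1, n_2: sizes of the two lists; m: number of records linked across both.
\[
  \hat{N}_{\mathrm{Petersen}} = \frac{n_1\, n_2}{m}
\]
% Linkage errors bias m: missed links shrink it and inflate \hat{N},
% false links inflate it and do the opposite. Assuming no false links
% and a known true-link rate \alpha (so that E[m] = \alpha\, m_{\mathrm{true}}),
% an adjusted estimator in the spirit of Ding and Fienberg (1994) is
\[
  \hat{N}_{\mathrm{adj}} = \frac{\alpha\, n_1\, n_2}{m}
\]
```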


Author(s):  
Nadine Bachbauer

Background: NEPS-SC6-ADIAB is a new linked data product containing survey data from Starting Cohort 6 of the German National Educational Panel Study (NEPS) and administrative employment data from the Institute for Employment Research (IAB), the research institute of the Federal Employment Agency. NEPS is provided by the Leibniz Institute for Educational Trajectories (LIfBi). Starting Cohort 6 of this panel survey covers adults in their professional life; the survey focuses on education in adulthood and lifelong learning. The administrative data in NEPS-SC6-ADIAB consist of comprehensive information on employment histories.
Objectives: Combining these two data sources increases, for example, the information available on individual employment histories. Overall, the linkage between the survey data and the administrative data increases the data volume.
Methods: A record linkage process was used to link the two data sources. Data access is free for the whole scientific community. In addition to a large number of on-site access locations within Germany, there are also international on-site access locations, including London and Colchester. Remote data access is also offered.
Conclusions: This data linkage project is highly innovative and creates an extensive database with extensive analytical potential. A short application example illustrates the comprehensive analytical potential of NEPS-SC6-ADIAB. This ongoing project deals with nonresponse in survey data. The linked data contain a variety of variables collected in both data sources, administratively and through the NEPS survey, allowing comparative analyses. As one example, nonresponse in income data can be compensated with administrative data.
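The application example can be pictured with a small pandas sketch. All column and identifier names here are hypothetical, not taken from the NEPS-SC6-ADIAB documentation:

```python
# Fill income nonresponse in survey data from linked administrative records.
# Requires: pip install pandas
import pandas as pd

survey = pd.DataFrame({
    "person_id": [1, 2, 3],
    "income_survey": [32000, None, 41000],   # person 2 did not respond
})
admin = pd.DataFrame({
    "person_id": [1, 2, 3],
    "income_admin": [31500, 28000, 40800],
})

linked = survey.merge(admin, on="person_id", how="left")
# Prefer the survey answer; fall back to the administrative value.
linked["income"] = linked["income_survey"].fillna(linked["income_admin"])
print(linked[["person_id", "income"]])
```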

