Semantic Extension for the Linked Data Based on Semantically Enhanced Annotation and Reasoning

Author(s):  
Pu Li ◽  
Zhifeng Zhang ◽  
Lujuan Deng ◽  
Junxia Ma ◽  
Fenglong Wu ◽  
...  

Linked Data, a new form of knowledge representation and publishing described by RDF, can provide more precise and comprehensible semantic structures. However, the current query strategy based on RDF Schema (RDFS) and SPARQL cannot fully express the semantics of RDF: it does not expose the implicit semantics between linked entities and therefore cannot unleash the potential of Linked Data. To fill this gap, this chapter first defines a new semantic annotating and reasoning method that derives more implicit semantics from different properties, and proposes a novel general Semantically-Extended Scheme for Linked Data Sources to realize the semantic extension over the target Linked Data source. Moreover, to return more information during semantic data retrieval, we design a new querying model that extends the SPARQL pattern. Lastly, experimental results show that our proposal has advantages over the initial Linked Data source and can return more valid results than some of the most representative similarity search methods.
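
As a rough illustration of what "implicit semantics between linked entities" can mean, the sketch below materializes the triples implied by symmetric and transitive properties. It is not the chapter's annotation scheme or querying model, which the abstract does not specify; the property names, the `SYMMETRIC` and `TRANSITIVE` sets, and the `infer` function are all hypothetical.

```python
# Hypothetical property annotations; not taken from the chapter.
SYMMETRIC = {"ex:spouseOf"}
TRANSITIVE = {"ex:locatedIn"}


def infer(triples):
    """Return the input triples plus those implied by the annotations above."""
    closed = set(triples)
    changed = True
    while changed:  # repeat until no new triple is derived (fixpoint)
        changed = False
        new = set()
        for s, p, o in closed:
            if p in SYMMETRIC:
                # (s p o) implies (o p s)
                new.add((o, p, s))
            if p in TRANSITIVE:
                # (s p o) and (o p o2) imply (s p o2)
                for s2, p2, o2 in closed:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if not new <= closed:
            closed |= new
            changed = True
    return closed


data = {
    ("ex:Zhengzhou", "ex:locatedIn", "ex:Henan"),
    ("ex:Henan", "ex:locatedIn", "ex:China"),
    ("ex:Alice", "ex:spouseOf", "ex:Bob"),
}
extended = infer(data)
```

A plain pattern match over `data` would not find that Zhengzhou is located in China; the same match over `extended` does, which is the kind of additional result a semantically extended source could return.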

2019 ◽  
pp. 254-277 ◽  
Author(s):  
Ying Zhang ◽  
Chaopeng Li ◽  
Na Chen ◽  
Shaowen Liu ◽  
Liming Du ◽  
...  

Because large amounts of geospatial data are produced by various sources, geospatial data integration is difficult owing to the shortage of semantics. Although standardised data formats and data access protocols, such as Web Feature Service (WFS), give end-users access to heterogeneous data stored in different formats from various sources, integration is still time-consuming and ineffective due to the lack of semantics. To solve this problem, a prototype implementing geospatial data integration is proposed that addresses four problems: geospatial data retrieving, modeling, linking and integrating. We mainly adopt four kinds of geospatial data sources to evaluate the performance of the proposed approach. The experimental results illustrate that the proposed linking method achieves high performance in generating matched candidate record pairs in terms of Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ) and F-score. The integration results indicate that each data source gains substantial Complementary Completeness (CC) and Increased Completeness (IC).
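
The abstract does not define the four linking measures. The sketch below uses the definitions common in the record-linkage literature, which is an assumption about what the authors computed: RR is the fraction of the full comparison space that candidate generation avoids, PC is the share of true matches retained among the candidates, PQ is the share of candidates that are true matches, and the F-score is taken here as the harmonic mean of PC and PQ. The function name and the toy data are hypothetical.

```python
def blocking_metrics(candidates, true_matches, n_a, n_b):
    """Evaluate candidate record pairs between two sources of sizes n_a and n_b."""
    candidates = set(candidates)
    true_matches = set(true_matches)
    found = candidates & true_matches  # true matches that survived as candidates
    total_pairs = n_a * n_b            # size of the full comparison space
    rr = 1 - len(candidates) / total_pairs
    pc = len(found) / len(true_matches) if true_matches else 0.0
    pq = len(found) / len(candidates) if candidates else 0.0
    f = 2 * pc * pq / (pc + pq) if pc + pq else 0.0
    return {"RR": rr, "PC": pc, "PQ": pq, "F": f}


m = blocking_metrics(
    candidates=[("a1", "b1"), ("a2", "b2"), ("a3", "b9"), ("a4", "b7")],
    true_matches=[("a1", "b1"), ("a2", "b2"), ("a5", "b5")],
    n_a=10,
    n_b=10,
)
```

In this toy case four candidates out of 100 possible pairs give RR = 0.96, while two of the three true matches are kept (PC = 2/3) and half of the candidates are correct (PQ = 0.5).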


Computers ◽  
2019 ◽  
Vol 8 (2) ◽  
pp. 49 ◽  
Author(s):  
Angela Di Iorio ◽  
Marco Schaerf

Library organizations have enthusiastically undertaken semantic web initiatives, in particular the publishing of data as linked data. Nevertheless, different surveys report the experimental nature of these initiatives and the difficulty consumers have in re-using data. These barriers hinder the use of linked datasets as an infrastructure that enhances library and related information services. This paper presents an approach for encoding, as a Linked Vocabulary, the “tacit” knowledge of the information system that manages the data source. The objective is to improve the process of interpreting the meaning of the linked data in published datasets. We analyzed a digital library system, as a case study, for prototyping the “semantic data management” method, where data and its knowledge are natively managed, taking into account the linked data pillars. The ultimate objective of semantic data management is to ensure that consumers interpret the data correctly and to facilitate proper re-use. The prototype defines the ontological entities representing the knowledge of the digital library system that is stored neither in the data source nor in the existing ontologies related to the system’s semantics. Thus we present the local ontology and its matching with existing ontologies, Preservation Metadata Implementation Strategies (PREMIS) and Metadata Objects Description Schema (MODS), and we discuss linked data triples prototyped from the legacy relational database by using the local ontology. We show how semantic data management can deal with the inconsistency of system data, and we conclude that a specific change in the system developer mindset is necessary for extracting and “codifying” the tacit knowledge needed to improve the data interpretation process.


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Baocheng Huang ◽  
Guang Yu ◽  
Hamid Reza Karimi

Finding opinion leaders has practical value in the real world. Because different data sources usually have different characteristics, there is no standard algorithm for detecting opinion leaders across data sources. Every data source has its own structural characteristics and its own detection algorithm for finding opinion leaders. Experimental results show that opinion leaders and their characteristics can be found among the comments on Weibo, a Chinese social network similar to Facebook or Twitter in the USA.


2017 ◽  
Vol 33 (2) ◽  
pp. 233-258 ◽  
Author(s):  
Pu Li ◽  
Bao Xiao ◽  
Aftab Akram ◽  
Yuncheng Jiang ◽  
Zhifeng Zhang

2016 ◽  
Vol 34 (3) ◽  
pp. 405-418
Author(s):  
Wei Lu ◽  
Xinghu Yue ◽  
Qikai Cheng ◽  
Rui Meng

Purpose: The purpose of this paper is to explore the use of inverse local context analysis (ILCA) to obtain data from limited-access data sources. Design/methodology/approach: The experimental results show that the method the authors proposed can obtain all retrieved documents from the limited-access data source using the least number of queries. Findings: The experimental results show that the method the authors proposed can obtain all retrieved documents from the limited-access data source using the least number of queries. Originality/value: To the best of the authors’ knowledge, this paper provides the first attempt to gather all the retrieved documents from a limited-access data source, and the efficiency and ease of implementation of the proposed solution make it feasible for practical applications. The method the authors proposed can also benefit the construction of web corpora.


Author(s):  
Vânia M. P. Vidal ◽  
José A. F. de Macêdo ◽  
João C. Pinheiro ◽  
Marco A. Casanova ◽  
Fábio Porto

In this paper, the authors present a three-level mediator based framework for linked data integration. In the approach, the mediated schema is represented by a domain ontology, which provides a conceptual representation of the application. Each relevant data source is described by a source ontology, published on the Web according to the Linked Data principles. Each source ontology is rewritten as an application ontology, whose vocabulary is restricted to be a subset of the vocabulary of the domain ontology. The main contribution of the paper is an algorithm for reformulating a user query into sub-queries over the data sources. The reformulation algorithm exploits inter-ontology links to return more complete query results. The approach is illustrated by an example of a virtual store mediating access to online booksellers.
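
The general idea of rewriting a query over a mediated vocabulary into per-source sub-queries can be sketched as follows. This is a deliberately simplified illustration and not the paper's reformulation algorithm: it handles a single-property query only and does not model inter-ontology links. The `MAPPINGS` and `SOURCES` tables, the `dom:`, `a:` and `b:` property names, and both functions are hypothetical.

```python
# Hypothetical mappings from domain-ontology properties to each source's own
# vocabulary; a property missing from a mapping is not covered by that source.
MAPPINGS = {
    "storeA": {"dom:title": "a:name", "dom:price": "a:cost"},
    "storeB": {"dom:title": "b:bookTitle"},
}

# Toy source data as (subject, property, value) triples.
SOURCES = {
    "storeA": [("bk1", "a:name", "Dune"), ("bk1", "a:cost", "9.99")],
    "storeB": [("bk2", "b:bookTitle", "Emma")],
}


def reformulate(domain_property):
    """Rewrite a one-property domain query into per-source sub-queries."""
    return {
        source: mapping[domain_property]
        for source, mapping in MAPPINGS.items()
        if domain_property in mapping
    }


def answer(domain_property):
    """Run each sub-query on its source and merge the results."""
    results = []
    for source, source_property in reformulate(domain_property).items():
        for s, p, v in SOURCES[source]:
            if p == source_property:
                results.append((source, s, v))
    return sorted(results)


titles = answer("dom:title")
prices = answer("dom:price")
```

Here a query for `dom:title` is sent to both stores under their own property names, while a query for `dom:price` goes only to the store whose mapping covers it.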


Author(s):  
Cordell Golden ◽  
Lisa Mirel

Introduction: The linkage of survey data with administrative data enhances the scientific value and analytic potential of both sources of information. Combining multiple data sources facilitates richer analyses and allows data users to answer research questions that cannot be addressed easily using a single data source. Objectives and Approach: Recently, the United States National Center for Health Statistics (NCHS) and Department of Housing and Urban Development (HUD) collaborated to link two population health surveys conducted by NCHS with housing assistance program data maintained by HUD. The resulting linked data files enable researchers to examine relationships between the receipt of federal housing assistance and health. In this talk, we will describe some of the challenges faced when initiating a data sharing agreement between two federal agencies governed by distinct legislative authorities, particularly issues related to legal requirements and data access. Results: We will describe each of the data sources used in the linkage as well as the methodology used to combine the data. Lastly, the discussion will focus on the inter-agency collaboration that led to the production of the supporting technical documentation developed to assist researchers using the linked data files. The linkage of NCHS survey data and HUD administrative data serves as an example of how two agencies were able to overcome challenges to successfully form a data sharing partnership as a cost-effective means to develop a robust data source that benefits the collaborating agencies as well as policy makers and outside researchers. Conclusion/Implications: Both agencies anticipate that this partnership will continue as additional survey and administrative data are collected.


Epidemiologia ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 315-324
Author(s):  
Juan M. Banda ◽  
Ramya Tekumalla ◽  
Guanyu Wang ◽  
Jingyuan Yu ◽  
Tuo Liu ◽  
...  

As the COVID-19 pandemic continues to spread worldwide, an unprecedented amount of open data is being generated for medical, genetics, and epidemiological research. The unparalleled rate at which many research groups around the world are releasing data and publications on the ongoing pandemic is allowing other scientists to learn from local experiences and data generated on the front lines of the COVID-19 pandemic. However, there is a need to integrate additional data sources that map and measure the role of social dynamics of such a unique worldwide event in biomedical, biological, and epidemiological analyses. For this purpose, we present a large-scale curated dataset of over 1.12 billion tweets, growing daily, related to COVID-19 chatter generated from 1 January 2020 to 27 June 2021 at the time of writing. This dataset gives researchers worldwide a freely available additional data source for a wide and diverse range of research projects, such as epidemiological analyses, studies of emotional and mental responses to social distancing measures, the identification of sources of misinformation, and stratified measurement of sentiment towards the pandemic in near real time, among many others.

