Role of Vocabularies for Semantic Interoperability in Enabling Linked Open Data Publishing

Author(s):  
Ahsan Morshed

In spite of the explosive growth of the Internet, information relevant to users is often unavailable even with the latest browsers. At the same time, there is an ever-increasing number of documents that vary widely in content, format, and quality. These documents often change in content and location because they are not under any kind of centralized control. On the other hand, there is a huge number of unknown users with extremely diverse needs, skills, education, and cultural and language backgrounds. One solution to these problems might be to use standard terms with well-defined meanings; such a set of terms is called a controlled vocabulary (CV). Although there is no single definition of a CV, we can define it as a set of concepts, or preferred terms, and the relations that exist among them. These vocabularies play a very important role in classifying information. In this chapter, we focus on the role of CVs in publishing the Web of Data.
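The chapter's working definition of a CV, a set of preferred terms plus the relations among them, can be sketched in a few lines of Python. This is a toy model only; the class name, the crop terms, and the labels below are invented for illustration, not taken from any real vocabulary.

```python
# Minimal sketch of a controlled vocabulary: preferred terms, alternative
# labels that normalize to them, and broader/narrower relations among terms.

class ControlledVocabulary:
    def __init__(self):
        self.preferred = {}   # any known label (lowercased) -> preferred term
        self.broader = {}     # term -> set of broader terms

    def add_concept(self, preferred_term, alt_labels=()):
        self.preferred[preferred_term.lower()] = preferred_term
        for alt in alt_labels:
            self.preferred[alt.lower()] = preferred_term

    def relate(self, narrower, broader):
        self.broader.setdefault(narrower, set()).add(broader)

    def normalize(self, label):
        """Map any known label to its preferred term, or None if unknown."""
        return self.preferred.get(label.lower())

cv = ControlledVocabulary()
cv.add_concept("Maize", alt_labels=["corn", "Zea mays"])
cv.add_concept("Cereals")
cv.relate("Maize", "Cereals")

print(cv.normalize("corn"))   # -> Maize
print(cv.broader["Maize"])    # -> {'Cereals'}
```

Because every alternative label resolves to one preferred term, documents indexed with any of the synonyms end up classified under the same concept, which is exactly the interoperability role the chapter attributes to CVs.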

Author(s):  
Josep Maria Brunetti ◽  
Rosa Gil ◽  
Juan Manuel Gimeno ◽  
Roberto Garcia

Thanks to Open Data initiatives, the amount of data available on the Web is rapidly increasing. Unfortunately, most of these initiatives only publish raw tabular data, which makes its analysis and reuse very difficult. The Linked Data principles allow for a more sophisticated approach by making both the structure and the semantics of the data explicit. However, from the user-experience viewpoint, published datasets continue to be monolithic files that are completely opaque, or that can only be explored by writing complex semantic queries. Our objective is to help users grasp what kinds of entities are in a dataset, how they are interrelated, what their main properties and values are, and so on. Rhizomer is a data publishing tool whose interface provides a set of components borrowed from Information Architecture (IA) that help users gain insight into the dataset at hand. Rhizomer automatically generates navigation menus and facets based on the kinds of things in the dataset and how they are described through metadata properties and values. The tool is currently being evaluated with end users, who discover a whole new perspective on the Web of Data.
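The core idea of deriving menus and facets from the data itself can be illustrated with a small stdlib-Python sketch over a toy triple set. This is not Rhizomer's actual implementation; the prefixes, resources, and properties below are invented examples of the general technique the abstract describes.

```python
# Derive a navigation menu (classes ranked by instance count) and facets
# (properties used to describe each class) from a list of RDF-like triples.
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"

triples = [
    ("ex:rome",  RDF_TYPE,        "ex:City"),
    ("ex:rome",  "ex:population", "2870500"),
    ("ex:rome",  "ex:country",    "ex:Italy"),
    ("ex:paris", RDF_TYPE,        "ex:City"),
    ("ex:paris", "ex:population", "2148000"),
    ("ex:seine", RDF_TYPE,        "ex:River"),
]

# Navigation menu: classes ordered by how many instances they have.
menu = Counter(o for _, p, o in triples if p == RDF_TYPE)

# Group instances by class, then collect the properties describing each class.
instances_of = defaultdict(set)
for s, p, o in triples:
    if p == RDF_TYPE:
        instances_of[o].add(s)

facets = defaultdict(set)
for s, p, o in triples:
    if p == RDF_TYPE:
        continue
    for cls, members in instances_of.items():
        if s in members:
            facets[cls].add(p)

print(menu.most_common())        # [('ex:City', 2), ('ex:River', 1)]
print(sorted(facets["ex:City"])) # ['ex:country', 'ex:population']
```

A user thus sees "City (2)" and "River (1)" as menu entries without knowing the schema in advance, and selecting "City" exposes population and country as filterable facets.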


2017 ◽  
Vol 13 (1) ◽  
pp. 128-147 ◽  
Author(s):  
Muhammad Ahtisham Aslam ◽  
Naif Radi Aljohani

Producing Linked Open Data (LOD) has great potential for publishing high-quality interlinked data. Publishing such data facilitates intelligent searching over the Web of Data. In the context of scientific publications, data about millions of scientific documents, published by hundreds of thousands of publishers, lies dormant because it is not published as open data and is therefore not linked to other datasets. In this paper the authors present SPedia, a semantically enriched knowledge base of data about scientific documents. The SPedia knowledge base provides information on more than nine million scientific documents, comprising more than three hundred million RDF triples. These extracted datasets allow users to pose sophisticated queries using Semantic Web techniques instead of relying on keyword-based searches. The paper also demonstrates the quality of the extracted data by running sample queries through the SPedia SPARQL endpoint and analyzing the results. Finally, the authors describe how SPedia can serve as a central hub for the LOD cloud of scientific publications.
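The real SPedia endpoint answers SPARQL queries, but the advantage of structured queries over keyword search can be shown with a toy basic-graph-pattern matcher in plain Python. The sample triples and the `sp:` terms below are invented for illustration and are not taken from SPedia.

```python
# Match SPARQL-style basic graph patterns (variables start with '?') against
# a small in-memory triple store, joining patterns via shared variables.

triples = [
    ("sp:paper1", "sp:author", "sp:Smith"),
    ("sp:paper1", "sp:year",   "2015"),
    ("sp:paper2", "sp:author", "sp:Smith"),
    ("sp:paper2", "sp:year",   "2012"),
    ("sp:paper3", "sp:author", "sp:Jones"),
    ("sp:paper3", "sp:year",   "2015"),
]

def match(pattern, binding, triples):
    """Yield variable bindings extending `binding` for one triple pattern."""
    for triple in triples:
        b = dict(binding)
        for pat, val in zip(pattern, triple):
            if pat.startswith("?"):
                if b.get(pat, val) != val:   # conflicts with earlier binding
                    break
                b[pat] = val
            elif pat != val:                 # constant term must match exactly
                break
        else:
            yield b

def query(patterns, triples):
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, b, triples)]
    return bindings

# "Papers by Smith published in 2015": a join a keyword search cannot express.
results = query([("?p", "sp:author", "sp:Smith"),
                 ("?p", "sp:year",   "2015")], triples)
print(results)  # [{'?p': 'sp:paper1'}]
```

The join on `?p` is what filters out Smith's 2012 paper and Jones's 2015 paper, precisely the kind of precision the abstract contrasts with keyword-based search.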


Author(s):  
Axel Polleres ◽  
Simon Steyskal

The World Wide Web Consortium (W3C), as the main standardization body for Web standards, has set a particular focus on publishing and integrating Open Data. In this chapter, the authors explain various standards from the W3C's Semantic Web activity and the potential role they play in the context of Open Data: RDF, as a standard data format for publishing and consuming structured information on the Web; the Linked Data principles for interlinking RDF data published across the Web and leveraging a Web of Data; and RDFS and OWL for describing the vocabularies used in RDF and the mappings between such vocabularies. The authors conclude with a review of current deployments of these standards on the Web, particularly within public Open Data initiatives, and discuss potential risks and challenges.


Author(s):  
Albert Meroño-Peñuela ◽  
Ashkan Ashkpour ◽  
Valentijn Gilissen ◽  
Jan Jonker ◽  
Tom Vreugdenhil ◽  
...  

The Dutch Historical Censuses (1795–1971) contain statistics that describe almost two centuries of history in the Netherlands. These censuses were conducted roughly once every ten years from 1795 to 1971, with some exceptions. Researchers have used their wealth of demographic, occupational, and housing information to answer fundamental questions in socioeconomic history. However, accessing these data has traditionally been a time-consuming and knowledge-intensive task. In this paper, we describe the outcomes of the CEDAR project, which makes access to the digitized assets of the Dutch Historical Censuses easier, faster, and more reliable. This is achieved by using the Linked Data publishing paradigm from the Semantic Web. We use a digitized sample of 2,288 census tables to produce a linked dataset of more than 6.8 million statistical observations. The dataset is modeled using the RDF Data Cube, Open Annotation, and PROV vocabularies. The contributions of representing this dataset as Linked Data are: (1) a uniform database interface for efficient querying of census data; (2) a standardized and reproducible data harmonization workflow; and (3) an augmentation of the dataset through richer connections to related resources on the Web.
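In the RDF Data Cube model, each statistical observation carries dimension values (such as year and municipality) and a measure. The following stdlib-Python sketch mimics that structure with plain dicts; the dimension names and all figures are invented for illustration, not taken from the actual censuses.

```python
# Toy Data Cube: each observation is a dict of dimensions plus a measure.
observations = [
    {"year": 1899, "municipality": "Amsterdam", "houses": 1000},
    {"year": 1899, "municipality": "Utrecht",   "houses": 400},
    {"year": 1909, "municipality": "Amsterdam", "houses": 1200},
]

def slice_cube(obs, **dims):
    """Select observations matching the given dimension values,
    analogous to fixing dimensions in a qb:Slice or a SPARQL filter."""
    return [o for o in obs if all(o[d] == v for d, v in dims.items())]

# All observations for the (hypothetical) 1899 census year:
for o in slice_cube(observations, year=1899):
    print(o["municipality"], o["houses"])
```

Because every observation in the cube shares the same dimension structure, one query interface covers all census years, which is contribution (1) of the paper: a uniform interface over data that previously lived in heterogeneous tables.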




2019 ◽  
Vol 15 (2) ◽  
Author(s):  
Eduardo Couto Dalcin ◽  
João Lanna ◽  
Natália Queiroz ◽  
Rafaela Campostrini Forzza

ABSTRACT Since the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities, published in 2003, the demand for an "open science", whose primary concern is to make research activity more transparent, more collaborative, and more efficient, has grown in the academic community. In addition, the perception that access to and sharing of research data contribute significantly to advancing science and maximizing the investments made in research programs has been gaining ground. In this sense, this study presents a proposal composed of digital repositories and computational tools aimed at publishing and sharing information resources in research institutes. The proposed architecture, based on free and open-source tools, proved adequate for the management and publication of information resources in research institutions. However, this approach pointed to the need for a search tool that integrates the different tools, as well as for a controlled vocabulary capable of indexing the resources in their different contexts.

Keywords: Open Data; Open Science; Scientific Data Publishing.
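The missing piece the study identifies, a search tool spanning several repositories backed by a controlled vocabulary, can be sketched as a single inverted index whose keys are CV terms. The repository names, record identifiers, and vocabulary entries below are hypothetical, chosen only to illustrate the technique.

```python
# Build one inverted index over several repositories, normalizing each
# record's keywords through a (hypothetical) controlled vocabulary first.
from collections import defaultdict

vocabulary = {           # alternative label -> preferred CV term
    "flora": "Botany",
    "plants": "Botany",
    "herbarium": "Botany",
}

repositories = {         # repo name -> list of (record id, keywords)
    "dataverse":     [("dataset-12", ["flora", "open data"])],
    "institutional": [("report-7",   ["herbarium"])],
}

index = defaultdict(list)
for repo, records in repositories.items():
    for rec_id, keywords in records:
        for kw in keywords:
            term = vocabulary.get(kw.lower(), kw)  # normalize to CV term
            index[term].append((repo, rec_id))

# One query now reaches both repositories under a single concept:
print(index["Botany"])
# [('dataverse', 'dataset-12'), ('institutional', 'report-7')]
```

Records tagged "flora" and "herbarium" in different systems surface under the same concept, which is exactly the cross-context indexing role the study assigns to the controlled vocabulary.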

