Evaluating the quality of linked open data in digital libraries

2020 ◽  
pp. 016555152093095
Author(s):  
Gustavo Candela ◽  
Pilar Escobar ◽  
Rafael C Carrasco ◽  
Manuel Marco-Such

Cultural heritage institutions have recently started to share their metadata as Linked Open Data (LOD) in order to disseminate and enrich them. The publication of large bibliographic data sets as LOD is a challenge that requires the design and implementation of custom methods for the transformation, management, querying and enrichment of the data. In this report, the methodology defined by previous research for the evaluation of the quality of LOD is analysed and adapted to the specific case of Resource Description Framework (RDF) triples containing standard bibliographic information. The specified quality measures are reported in the case of four highly relevant libraries.

2017 ◽  
Vol 44 (2) ◽  
pp. 203-229 ◽  
Author(s):  
Javier D Fernández ◽  
Miguel A Martínez-Prieto ◽  
Pablo de la Fuente Redondo ◽  
Claudio Gutiérrez

The publication of semantic web data, commonly represented in Resource Description Framework (RDF), has experienced outstanding growth over the last few years. Data from all fields of knowledge are shared publicly and interconnected in active initiatives such as Linked Open Data. However, despite the increasing availability of applications managing large-scale RDF information such as RDF stores and reasoning tools, little attention has been given to the structural features emerging in real-world RDF data. Our work addresses this issue by proposing specific metrics to characterise RDF data. We specifically focus on revealing the redundancy of each data set, as well as common structural patterns. We evaluate the proposed metrics on several data sets, which cover a wide range of designs and models. Our findings provide a basis for more efficient RDF data structures, indexes and compressors.
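
As an illustration of what structural metrics over RDF data can look like, the following minimal sketch computes three simple measures on a toy set of triples: predicate usage, mean out-degree per subject, and groups of subjects that share the same set of predicates. These are assumed, illustrative measures, not the metrics defined in the paper (the abstract does not name them), and all identifiers in `TRIPLES` are made up.

```python
from collections import Counter, defaultdict

# Toy (subject, predicate, object) triples; every identifier is hypothetical.
TRIPLES = [
    ("ex:book1", "dc:title", "Don Quixote"),
    ("ex:book1", "dc:creator", "ex:cervantes"),
    ("ex:book2", "dc:title", "Novelas ejemplares"),
    ("ex:book2", "dc:creator", "ex:cervantes"),
    ("ex:cervantes", "foaf:name", "Miguel de Cervantes"),
]


def predicate_usage(triples):
    """Count how many triples use each predicate."""
    return Counter(p for _, p, _ in triples)


def mean_out_degree(triples):
    """Average number of triples per distinct subject."""
    per_subject = Counter(s for s, _, _ in triples)
    return sum(per_subject.values()) / len(per_subject)


def predicate_signatures(triples):
    """Group subjects by the exact set of predicates that describe them.

    Subjects sharing a signature follow the same structural pattern,
    which is one simple indicator of redundancy in a data set.
    """
    preds = defaultdict(set)
    for s, p, _ in triples:
        preds[s].add(p)
    groups = defaultdict(list)
    for s, ps in preds.items():
        groups[frozenset(ps)].append(s)
    return dict(groups)
```

In this toy data set both books share the signature `{dc:title, dc:creator}`, the kind of repeated pattern that a compressor or index could exploit.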


Author(s):  
Mariana Baptista Brandt ◽  
Silvana Aparecida Borsetti Gregorio Vidotti ◽  
José Eduardo Santarem Segundo

This research aims to propose a linked open data (LOD) model for a legislative open data set of the Câmara dos Deputados (Chamber of Deputies). To that end, it reviews the literature on the concepts of open data, open government data, linked data and linked open data, followed by applied research in which legislative data are modelled as LOD. The data set selected for this research was "Deputados", which contains information about members of parliament such as political party, federative unit, e-mail and legislature, among others. The results show that the data set can be structured in RDF (Resource Description Framework) by reusing vocabularies and standards already established in the Semantic Web, such as Dublin Core, Friend of a Friend (FOAF), RDF and RDF Schema, together with vocabularies from related domains, such as the ontology of the Italian Chamber of Deputies and that of the French National Assembly. As recommended by the Linked Data standard, the resources were also linked to other LOD data sets, such as Geonames and DBpedia, for semantic enrichment. The study concludes that government data, and legislative data in particular, can be published following the recommendations of the W3C (World Wide Web Consortium), thereby integrating legislative data into the Web of Data and broadening the possibilities for reuse and application of the data in transparency and oversight initiatives, bringing citizens closer to Congress and their representatives.


Information ◽  
2020 ◽  
Vol 11 (7) ◽  
pp. 361 ◽  
Author(s):  
Raji Ghawi ◽  
Jürgen Pfeffer

Linked Open Data (LOD) refers to freely available data on the World Wide Web that are typically represented using the Resource Description Framework (RDF) and standards built on it. LOD is an invaluable resource of information due to its richness and openness, which create new opportunities for many areas of application. In this paper, we address the exploitation of LOD by utilizing SPARQL queries in order to extract social networks among entities. This enables the application of de-facto techniques from Social Network Analysis (SNA) to study social relations and interactions among entities, providing deep insights into their latent social structure.
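
The idea of deriving a social network from LOD can be sketched as follows. The query text and the `ex:memberOf` predicate are hypothetical and are not taken from the paper. The Python function performs a comparable projection in memory, linking two entities whenever they share a group and using the number of shared groups as the edge weight.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical SPARQL query of the kind described: pair up entities that
# belong to the same group. Shown for illustration only; it is not executed.
QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?a ?b WHERE {
  ?a ex:memberOf ?g .
  ?b ex:memberOf ?g .
  FILTER (STR(?a) < STR(?b))
}
"""


def co_membership_edges(triples, predicate="ex:memberOf"):
    """Project entity-group triples onto a weighted entity-entity network.

    Returns a dict mapping (entity_a, entity_b) pairs, with a < b, to the
    number of groups the two entities share.
    """
    members = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            members[o].add(s)
    weights = defaultdict(int)
    for group in members.values():
        for a, b in combinations(sorted(group), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Made-up example data.
TRIPLES = [
    ("ex:alice", "ex:memberOf", "ex:g1"),
    ("ex:bob", "ex:memberOf", "ex:g1"),
    ("ex:alice", "ex:memberOf", "ex:g2"),
    ("ex:bob", "ex:memberOf", "ex:g2"),
    ("ex:carol", "ex:memberOf", "ex:g2"),
    ("ex:carol", "ex:name", "Carol"),
]
EDGES = co_membership_edges(TRIPLES)
```

The resulting weighted edge list can then be passed to any network analysis library to compute measures such as centrality or community structure.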


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Chuming Chen ◽  
Hongzhan Huang ◽  
Karen E. Ross ◽  
Julie E. Cowart ◽  
Cecilia N. Arighi ◽  
...  

The Protein Ontology (PRO) provides an ontological representation of protein-related entities, ranging from protein families to proteoforms to complexes. Protein Ontology Linked Open Data (LOD) exposes, shares, and connects knowledge about protein-related entities on the Semantic Web using Resource Description Framework (RDF), thus enabling integration with other Linked Open Data for biological knowledge discovery. For example, proteins (or variants thereof) can be retrieved on the basis of specific disease associations. As a community resource, we strive to follow the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles, disseminate regular updates of our data, support multiple methods for accessing, querying and downloading data in various formats, and provide documentation both for scientists and programmers. PRO Linked Open Data can be browsed via a faceted browser interface and queried using SPARQL via YASGUI. RDF data dumps are also available for download. Additionally, we developed RESTful APIs to support programmatic data access. We also provide W3C HCLS specification compliant metadata description for our data. The PRO Linked Open Data is available at https://lod.proconsortium.org/.


2021 ◽  
Author(s):  
Santoshi Laxmi Reddy Ellanki ◽  
John Jenq

In this report, a system was developed that can predict the outcome of opening a restaurant in NYC based on various NYC open data sets, such as 311 calls, New York Police crime records and restaurant rating data. The data sets were preprocessed and cleaned before analysis to improve the quality of our results.


Author(s):  
Franck Cotton ◽  
Daniel Gillman

Linked Open Statistical Metadata (LOSM) is Linked Open Data (LOD) applied to statistical metadata. LOD is a model for identifying, structuring, interlinking, and querying data published directly on the web. It builds on the standards of the semantic web defined by the W3C. LOD uses the Resource Description Framework (RDF), a simple data model that expresses content as predicates linking resources to one another or to literal values. The simplicity of the model makes it able to represent any data, including metadata. We define statistical data as data produced through some statistical process or intended for statistical analyses, and statistical metadata as metadata describing statistical data. LOSM promotes the automated discovery of statistical data and of their meaning and structure. Consequently, it helps with understanding and interpreting data and with preventing inadequate or flawed visualizations of statistical data. This enhances statistical literacy and efforts at visualizing statistics.
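
The RDF data model described above can be sketched in a few lines: a statement is a triple whose predicate links a subject resource either to another resource or to a literal. The class names and the `http://example.org/` identifiers below are hypothetical, and the serializer emits a simplified N-Triples-style line (plain string literals only, no language tags or datatypes).

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Resource:
    """A resource identified by an IRI."""
    iri: str


@dataclass(frozen=True)
class Literal:
    """A plain string value."""
    value: str


@dataclass(frozen=True)
class Triple:
    """One statement: the predicate links the subject to an object."""
    subject: Resource
    predicate: Resource
    obj: Union[Resource, Literal]


def to_ntriples(triple: Triple) -> str:
    """Serialize a triple as a simplified N-Triples-style line."""
    def term(t: Union[Resource, Literal]) -> str:
        if isinstance(t, Resource):
            return f"<{t.iri}>"
        # Escape backslashes first, then quotes and line breaks.
        escaped = (
            t.value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'

    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."


# A made-up piece of statistical metadata: a data set and its title.
DATASET = Resource("http://example.org/dataset/1")
TITLE = Triple(DATASET, Resource("http://purl.org/dc/terms/title"), Literal("Unemployment rate"))
```

Because metadata are expressed with the same triple structure as data, a statement about a data set (its title here) needs no separate model.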


Author(s):  
Liah Shonhe

The main focus of the study was to explore the practices of open data sharing in the agricultural sector, including establishing the research outputs concerning open data in agriculture. The study adopted a desktop research methodology based on a literature review and bibliographic data from the WoS database. Bibliometric indicators discussed include yearly productivity, most prolific authors, and enhanced countries. Study findings revealed that research activity in the field of agriculture and open access is very low. There were 36 OA articles and only 6 publications had an open data badge. Most researchers do not yet embrace the need to openly publish their data sets despite the availability of numerous open data repositories. Unfortunately, most African countries are still lagging behind in the management of agricultural open data. The study therefore recommends that researchers publish their research data sets as OA. African countries need to put more effort into establishing open data repositories and implementing the necessary policies to facilitate OA.


2016 ◽  
Vol 35 (2) ◽  
pp. 19 ◽  
Author(s):  
Manolis Peponakis

The aim of this study is to contribute to the field of machine-processable bibliographic data that is suitable for the Semantic Web. We examine the Entity Relationship (ER) model, which has been selected by IFLA as a “conceptual framework” in order to model the FR family (FRBR, FRAD and RDA), and the problems ER causes as we move towards the Semantic Web. Subsequently, while maintaining the semantics of the aforementioned standards but rejecting the ER as a conceptual framework for bibliographic data, this paper builds on the Resource Description Framework (RDF) potential and documents how both the RDF and Linked Data’s rationale can affect the way we model bibliographic data. In this way, a new approach to bibliographic data emerges where the distinction between description and authorities is obsolete. Instead, the integration of the authorities with descriptive information becomes fundamental so that a network of correlations can be established between the entities and the names by which the entities are known. Naming is a vital issue for human cultures because names are not random sequences of characters or sounds which stand just as identifiers for the entities - they also have socio-cultural meanings and interpretations. Thus, instead of describing indivisible resources, we could describe entities that appear in a variety of names on various resources. In this study, a method is proposed to connect the names with the entities they represent and, in this way, to document the provenance of these names by connecting specific resources with specific names.


2020 ◽  
Vol 1 (1) ◽  
pp. 428-444 ◽  
Author(s):  
Silvio Peroni ◽  
David Shotton

OpenCitations is an infrastructure organization for open scholarship dedicated to the publication of open citation data as Linked Open Data using Semantic Web technologies, thereby providing a disruptive alternative to traditional proprietary citation indexes. Open citation data are valuable for bibliometric analysis, increasing the reproducibility of large-scale analyses by enabling publication of the source data. Following brief introductions to the development and benefits of open scholarship and to Semantic Web technologies, this paper describes OpenCitations and its data sets, tools, services, and activities. These include the OpenCitations Data Model; the SPAR (Semantic Publishing and Referencing) Ontologies; OpenCitations’ open software of generic applicability for searching, browsing, and providing REST APIs over resource description framework (RDF) triplestores; Open Citation Identifiers (OCIs) and the OpenCitations OCI Resolution Service; the OpenCitations Corpus (OCC), a database of open downloadable bibliographic and citation data made available in RDF under a Creative Commons public domain dedication; and the OpenCitations Indexes of open citation data, of which the first and largest is COCI, the OpenCitations Index of Crossref Open DOI-to-DOI Citations, which currently contains over 624 million bibliographic citations and is receiving considerable usage by the scholarly community.


2017 ◽  
Vol 35 (1) ◽  
pp. 159-178
Author(s):  
Timothy W. Cole ◽  
Myung-Ja K. Han ◽  
Maria Janina Sarol ◽  
Monika Biel ◽  
David Maus

Purpose: Early Modern emblem books are primary sources for scholars studying the European Renaissance. Linked Open Data (LOD) is an approach for organizing and modeling information in a data-centric manner compatible with the emerging Semantic Web. The purpose of this paper is to examine ways in which LOD methods can be applied to facilitate emblem resource discovery, better reveal the structure and connectedness of digitized emblem resources, and enhance scholar interactions with digitized emblem resources.
Design/methodology/approach: This research encompasses an analysis of the existing XML-based Spine (emblem-specific) metadata schema; the design of a new, domain-specific, Resource Description Framework compatible ontology; the mapping and transformation of metadata from Spine to both the new ontology and (separately) to the pre-existing Schema.org ontology; and the (experimental) modification of the Emblematica Online portal as a proof of concept to illustrate enhancements supported by LOD.
Findings: LOD is viable as an approach for facilitating discovery and enhancing the value to scholars of digitized emblem books; however, metadata must first be enriched with additional uniform resource identifiers and the workflow upgrades required to normalize and transform existing emblem metadata are substantial and still to be fully worked out.
Practical implications: The research described demonstrates the feasibility of transforming existing, special collections metadata to LOD. Although considerable work and further study will be required, preliminary findings suggest potential benefits of LOD for both users and libraries.
Originality/value: This research is unique in the context of emblem studies and adds to the emerging body of work examining the application of LOD best practices to library special collections.

