Library Linked Data Models: Library Data in the Semantic Web

2019 ◽  
Vol 57 (5) ◽  
pp. 261-277
Author(s):  
Hyoungjoo Park ◽  
Margaret Kipp
Keyword(s):  
Semantic Web ◽

2021 ◽  
pp. 1-36
Author(s):  
Enrico Daga ◽  
Albert Meroño-Peñuela ◽  
Enrico Motta

Sequences are among the most important data structures in computer science. In the Semantic Web, however, little attention has been paid to Sequential Linked Data. In previous work, we discussed the data models that Knowledge Graphs commonly use for representing sequences and showed that these models have an impact on query performance, and that this impact is invariant across triplestore implementations. However, the specific list operations that managing Sequential Linked Data requires beyond the simple retrieval of an entire list or a range of its elements (e.g., adding or removing elements), and their impact on the various list data models, remain unclear. Closing this knowledge gap would be a significant step towards the realization of a Semantic Web list Application Programming Interface (API) that standardizes list manipulation and generalizes beyond specific data models. To address these challenges, we build on our previous work on the effects of various sequential data models for Knowledge Graphs, extending our benchmark and proposing a set of read-write Semantic Web list operations in SPARQL, with insert, update and delete support. To do so, we identify five classic sequential data structures from computer science (linked list, doubly linked list, stack, queue and array), from which we derive nine atomic read-write operations for Semantic Web lists. We propose a SPARQL implementation of these operations for five typical RDF data models and compare their performance by executing them against six increasing dataset sizes and four different triplestores. In light of our results, we discuss the feasibility of the devised API and reflect on the state of affairs of Sequential Linked Data.
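As a rough illustration of why the data model matters for read-write list operations, the sketch below (assumed helper names and blank-node labels, not the paper's benchmark code) encodes a sequence in the rdf:List linked-list model as a plain set of triples and inserts an element by relinking a single rdf:rest arc; in an index-based model such as rdf:Seq, the same insert would instead renumber every subsequent rdf:_n predicate.

```python
# Minimal sketch of the rdf:List (linked-list) model over an in-memory
# triple set. Node labels like "_:b0" are illustrative assumptions.

RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

def rdf_list_triples(items):
    """Encode a Python list as rdf:List (linked-list) triples."""
    triples, nodes = set(), [f"_:b{i}" for i in range(len(items))]
    for i, item in enumerate(items):
        nxt = nodes[i + 1] if i + 1 < len(items) else RDF_NIL
        triples.add((nodes[i], RDF_FIRST, item))
        triples.add((nodes[i], RDF_REST, nxt))
    return triples, (nodes[0] if nodes else RDF_NIL)

def rdf_list_to_python(triples, head):
    """Follow rdf:rest links from the head to rebuild the list."""
    firsts = {s: o for s, p, o in triples if p == RDF_FIRST}
    rests = {s: o for s, p, o in triples if p == RDF_REST}
    out, node = [], head
    while node != RDF_NIL:
        out.append(firsts[node])
        node = rests[node]
    return out

def insert_after(triples, node, new_node, value):
    """Atomic insert: relink one rdf:rest arc and add the new cell."""
    old_rest = next(o for s, p, o in triples if s == node and p == RDF_REST)
    triples.discard((node, RDF_REST, old_rest))
    triples.update({(node, RDF_REST, new_node),
                    (new_node, RDF_FIRST, value),
                    (new_node, RDF_REST, old_rest)})
```

The contrast this sketch hints at is the kind of cost asymmetry the benchmark measures: a linked-list insert touches a constant number of triples, while index-based models pay a cost proportional to the list tail.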


Author(s):  
Andra Waagmeester ◽  
Paul Braun ◽  
Manoj Karingamadathil ◽  
Jose Emilio Labra Gayo ◽  
Siobhan Leachman ◽  
...  

Moths form a diverse group of species that are predominantly active at night. They are colourful and play an ecological role, yet are less well described than their closest relatives, the butterflies. Much remains to be understood about moths, as shown by the many issues within their taxonomy, including their status as a paraphyletic group and the inability to clearly distinguish them from butterflies (Fig. 1). We present the Wikimedia architecture as a hub of knowledge on moths. This ecosystem consists of 312 language editions of Wikipedia and sister projects such as Wikimedia Commons (a multimedia repository) and Wikidata (a public knowledge graph). Through Wikidata, external data repositories can be integrated into this knowledge landscape on moths. Wikidata contains links to (open) biodiversity data repositories such as iNaturalist, the Global Biodiversity Information Facility (GBIF) and the Biodiversity Heritage Library (BHL), which in turn contain detailed content such as species occurrence data, images and publications on moths. We present a workflow that integrates crowd-sourced information and images from iNaturalist with content from GBIF and BHL into the different language editions of Wikipedia. The Wikipedia articles in turn feed information to other sources. Taxon pages on iNaturalist, for example, have an "About" tab fed by the Wikipedia article describing the respective taxon, where the current language of the (iNaturalist) interface fetches the appropriate language version from Wikipedia. This is a good example of data reuse, which is one of the pillars of FAIR (Findable, Accessible, Interoperable and Reusable) (Wilkinson et al. 2016). Wikidata provides the linked data hub in this flow of knowledge. Since Wikidata is available in RDF, it aligns well with the data model of the Semantic Web.
This allows rapid integration with other linked data sources, and provides an intuitive portal for non-linked data to be integrated as linked data with the Semantic Web. Wikidata includes information on all sorts of things (e.g., people, species, locations, events), which is why it can structure data in a multitude of ways, leading to 9000+ properties. Because all those different domains and communities use the same source for different things, it is important to have good structure and documentation for a specific topic so that we and others can interpret the data. We present a schema that describes data about moth taxa on Wikidata. Since 2019, Wikidata has had an EntitySchema namespace that allows contributors to specify applicable linked-data schemas. The schemas are expressed using Shape Expressions (ShEx) (Thornton et al. 2019), a formal modelling language for RDF, one of the data formats used on the Semantic Web. Since Wikidata is also rendered as RDF, it is possible to use ShEx to describe data models and user expectations in Wikidata (Waagmeester et al. 2021). These schemas can then be used to verify whether a subset of Wikidata conforms to an expected or described data model. Starting from a document that describes an expected schema on moths, we have developed an EntitySchema (E321) for moths in Wikidata. This schema provides unambiguous guidance for contributors who have data they are unsure how to model. For example, a user with data about a particular species of moth may be working from a scientific article stating that the species is only found in New Zealand, and may be unsure how to model that fact as a statement in Wikidata.
After consulting Schema E321, the user will find Property P183 "endemic_to" and can use that property to state that the species is endemic to New Zealand. As more contributors follow the data model expressed in Schema E321, there will be structural consistency across items for moths in Wikidata. This reduces the risk of contributors using different combinations of properties and qualifiers to express the same meaning. If contributors need to express something that is not yet represented in Schema E321, they can extend the schema itself, as each schema can be edited. The multilingual affordances of the Wikidata platform allow users to edit in over 300 languages. In this way, contributors edit in their preferred language and see the structure of the data, as well as the schemas, in their language of choice. This broadens the range of people who can contribute to these data models and reduces the dominance of English. There are an estimated 160K+ moth species. This number is equal to the number of moths described in iNaturalist, while Wikidata contains 220K items on moths. The English Wikipedia, as the biggest language edition, contains 65K moth articles; other language editions contain far fewer. The higher number of moth items in Wikidata can be partly explained by Wikidata treating taxon synonyms as distinct taxa. Wikidata, as a proxy of knowledge on moths, is instrumental in getting them better described in Wikipedia and other (FAIR) sources, while in turn, curation in Wikidata is carried out by a large community. This approach to data modelling has the advantage of allowing multilingual collaboration and iterative extension and improvement over time.
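The conformance checking described above can be sketched in miniature. The snippet below is a hypothetical, simplified stand-in for ShEx validation (real EntitySchemas such as E321 are expressed in ShEx, not Python); the shape here only checks that a Wikidata-style item uses the expected properties, using the real property identifiers P31 (instance of), P225 (taxon name) and P183 (endemic to).

```python
# Hypothetical sketch of EntitySchema-style conformance checking:
# an item is a dict of property -> list of values.

MOTH_SHAPE = {
    "P31": {"Q16521"},   # instance of: must include "taxon"
    "P225": None,        # taxon name: any value accepted
    "P183": None,        # endemic to: optional, any value accepted
}
REQUIRED = ["P31", "P225"]

def conforms(item):
    """Return a list of violations; an empty list means conformant."""
    problems = []
    for prop in REQUIRED:
        if prop not in item:
            problems.append(f"missing required property {prop}")
    for prop, allowed in MOTH_SHAPE.items():
        # When the shape constrains values, at least one must match.
        if allowed and prop in item and not set(item[prop]) & allowed:
            problems.append(f"{prop} has unexpected value(s) {item[prop]}")
    return problems
```

A conformant item such as `{"P31": ["Q16521"], "P225": ["Aoraia dinodes"], "P183": ["Q664"]}` yields no violations, while an item missing P31 is reported; real ShEx adds cardinalities, datatypes and nested shapes on top of this idea.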


Semantic Web ◽  
2020 ◽  
pp. 1-29
Author(s):  
Bettina Klimek ◽  
Markus Ackermann ◽  
Martin Brümmer ◽  
Sebastian Hellmann

In recent years, lexical resources have emerged rapidly in the Semantic Web. While most of this linguistic information is already machine-readable, we found that morphological information is mostly absent or only contained in semi-structured strings. An integration of morphemic data has not yet been undertaken due to the lack of existing domain-specific ontologies and explicit morphemic data. In this paper, we present the Multilingual Morpheme Ontology, MMoOn Core, which can be regarded as the first comprehensive ontology for the linguistic domain of morphological language data. We describe how crucial concepts like morphs, morphemes, word forms and meanings are represented and interrelated, and how language-specific morpheme inventories can be created as a new kind of morphological dataset. The aim of the MMoOn Core ontology is to serve as a shared semantic model for linguists and NLP researchers alike, enabling the creation, conversion, exchange, reuse and enrichment of morphological language data across different data-dependent language sciences. Various use cases are therefore illustrated to draw attention to the cross-disciplinary potential that can be realized with the MMoOn Core ontology in the context of the existing Linguistic Linked Data research landscape.


Author(s):  
Yusuke Tagawa ◽  
Arata Tanaka ◽  
Yuya Minami ◽  
Daichi Namikawa ◽  
Michio Simomura ◽  
...  

Author(s):  
Georg Neubauer

The main subject of the work is the visualization of typed links in Linked Data. The academic subjects relevant to the paper in general are the Semantic Web, the Web of Data and information visualization. The Semantic Web, proposed by Tim Berners-Lee in 2001, was announced as an extension of the World Wide Web. The actual area of investigation concerns the connectivity of information on the World Wide Web. To be able to explore such interconnections, visualizations are both a critical requirement and a major part of data processing in themselves. In the context of the Semantic Web, representation of information interrelations can be achieved using graphs. The primary aim of the article is to describe the arrangement of Linked Data visualization concepts by establishing their principles in a theoretical approach. Putting design restrictions into context leads to practical guidelines. By describing the creation of two alternative visualizations of a commonly used web application representing Linked Data as network visualization, their compatibility was tested. The application-oriented part covers the design phase, its results, and the future requirements of the project that can be derived from this test.


2020 ◽  
Vol 32 ◽  
Author(s):  
Adriano de Oliveira GONÇALVES ◽  
Mark Douglas de Azevedo JACYNTHO
Keyword(s):  

Abstract: Using an applied, experimental, quali-quantitative research methodology, this work proposes a method for the systematic mapping and publication of an existing relational database according to Linked Data principles, based on a case study of academic papers from the Semana de Integração Acadêmica, an internal conference of a Brazilian federal public university. The proposed method results from mapping the studied knowledge domain onto reference Linked Data ontologies (Schema.org, Friend of a Friend, Bibliographic Ontology, Semantic Web Conference Ontology, among others). The method was applied to the conference's relational database in order to make it available on the Web in a machine-intelligible format, also establishing semantic links with the well-known DBpedia data source through an automated mashup process. The results obtained with the method were quite satisfactory, fully achieving the goal of publishing a Linked Data view over the relational data without altering them. This work is expected to foster the publication of semantic data on the Web in accordance with Linked Data principles, thereby contributing to the broad dissemination of knowledge, driven by the capacity the Semantic Web gives machines to interlink, understand and discover information.
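The core of such a mapping can be sketched as a simple row-to-triples transformation. The snippet below is an illustrative assumption, not the authors' actual method: it maps one dict-shaped relational row describing a conference paper to Schema.org-style triples, with column names, the base URI and the property choices all hypothetical.

```python
# Illustrative sketch: publish a Linked Data view over a relational row
# without altering the underlying data.

def row_to_triples(row, base_uri="http://example.org/paper/"):
    """Map a dict-shaped relational row to (s, p, o) triples."""
    s = base_uri + str(row["id"])          # mint a subject URI per row
    mapping = {                            # column -> RDF property
        "title": "schema:name",
        "author": "schema:author",
        "year": "schema:datePublished",
    }
    triples = [(s, "rdf:type", "schema:ScholarlyArticle")]
    for col, prop in mapping.items():
        if row.get(col) is not None:       # skip NULL columns
            triples.append((s, prop, row[col]))
    return triples
```

Because the triples are derived on demand from the rows, the relational source remains the system of record, which matches the paper's goal of publishing a Linked Data view without modifying the data.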


2017 ◽  
Vol 22 (1) ◽  
pp. 21-37 ◽  
Author(s):  
Matthew T. Mccarthy

The web of linked data, otherwise known as the semantic web, is a system in which information is structured and interlinked to provide meaningful content to artificial intelligence (AI) algorithms. As the complex interactions between digital personae and these algorithms mediate access to information, it becomes necessary to understand how these classification and knowledge systems are developed. What are the processes by which those systems come to represent the world, and how are the controversies that arise in their creation overcome? As a global form, the semantic web is an assemblage of many interlinked classification and knowledge systems, which are themselves assemblages. Through the perspectives of global assemblage theory, critical code studies and practice theory, I analyse netnographic data of one such assemblage. Schema.org is but one component of the larger global assemblage of the semantic web, and as such is an emergent articulation of different knowledges, interests and networks of actors. This articulation comes together to tame the profusion of things, seeking stability in representation, but in the process it faces and produces more instability. Furthermore, this production of instability contributes to the emergence of new assemblages with similar aims.


Author(s):  
Jose María Alvarez Rodríguez ◽  
José Emilio Labra Gayo ◽  
Patricia Ordoñez de Pablos

The aim of this chapter is to present a proposal and a case study for describing information about organizations in a standard way using the Linked Data approach. Several models and ontologies have been provided to formalize the data, structure and behaviour of organizations. Nevertheless, these attempts have not been widely accepted, due to several factors: (1) missing pieces to define the status of the organization; (2) tangled parts to specify the structure (concepts and relations) between the elements of the organization; (3) lack of text properties; and other factors. These divergences imply a set of incomplete approaches to formalizing data and information about organizations. Taking into account the current trends of applying semantic web technologies and Linked Data to formalize, aggregate, and share domain-specific information, a new model for organizations taking advantage of these initiatives is required in order to overcome existing barriers and exploit corporate information in a standard way. This work is relevant in several respects: (1) it unifies existing models to provide a common specification; (2) it applies semantic web technologies and the Linked Data approach; (3) it provides access to the information via standard protocols; and (4) it offers new services that can exploit this information to trace the evolution and behaviour of the organization over time. Finally, this work is interesting for improving the clarity and transparency of some scenarios in which organizations play a key role, such as e-procurement, e-health, or financial transactions.
