Where the Semantic Web and Web 2.0 Meet Format Risk Management: P2 Registry

2011, Vol 6 (1), pp. 165-182
Author(s): David Tarrant, Steve Hitchcock, Leslie Carr

The Web is increasingly becoming a platform for linked data. This means making connections and adding value to data on the Web. As more data becomes openly available and more people are able to use it, the data becomes more powerful. An example is file format registries and the evaluation of format risks. Here the demand for information now exceeds the effort that any single institution can put into gathering and collating it. Recognising that more is better, the creators of PRONOM, JHOVE, GDFR and others are joining to lead a new initiative: the Unified Digital Format Registry. Ahead of this effort, a new RDF-based framework for structuring and aggregating file format data from multiple sources, including PRONOM, has demonstrated that it can produce more links, and thus provide more answers to digital preservation questions (about format risks, applications, viewers and transformations) than the native data alone. This paper describes this registry, P2, and its services, shows how it can be used, and provides examples where it delivers more answers than the contributing resources. The P2 Registry is a reference platform to allow and encourage publication of preservation data, and also an exemplar of what can be achieved if more data is published openly online as simple machine-readable documents. This approach calls for the active participation of the digital preservation community, which can contribute simply by publishing data openly on the Web as linked data.
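To make the merge-and-query idea concrete, here is a minimal sketch using Python's rdflib: format facts from two hypothetical sources are merged into one RDF graph, and a SPARQL join answers a question neither source answers alone. The vocabulary and URIs are illustrative placeholders, not the actual P2 or PRONOM namespaces, which the abstract does not spell out.

```python
# Minimal sketch of the merge-and-query idea behind P2, using rdflib.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/format/")  # placeholder vocabulary

g = Graph()
# Triples that might come from two different registries; once merged into
# one graph, links across sources answer questions neither source can alone.
g.add((EX.PDF_1_4, RDF.type, EX.FileFormat))
g.add((EX.PDF_1_4, RDFS.label, Literal("PDF 1.4")))
g.add((EX.PDF_1_4, EX.rendersWith, EX.AcrobatReader))  # from source A
g.add((EX.AcrobatReader, EX.runsOn, EX.Windows10))     # from source B

# SPARQL join across the merged data: which viewers for PDF 1.4 run on
# which platforms?
q = """
SELECT ?viewer ?platform WHERE {
    ?fmt rdfs:label "PDF 1.4" .
    ?fmt ex:rendersWith ?viewer .
    ?viewer ex:runsOn ?platform .
}
"""
for viewer, platform in g.query(q, initNs={"rdfs": RDFS, "ex": EX}):
    print(viewer, platform)
```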

2021
Author(s): Gillian Byrne, Lisa Goddard

Since 1999 the W3C has been working on a set of Semantic Web standards that have the potential to revolutionize web search. Also known as Linked Data, the Machine-Readable Web, the Web of Data, or Web 3.0, the Semantic Web relies on highly structured metadata that allow computers to understand the relationships between objects. Semantic Web standards are complex and difficult to conceptualize, but they offer solutions to many of the issues that plague libraries, including precise web search, authority control, classification, data portability, and disambiguation. This article outlines some of the benefits that linked data could have for libraries, discusses some of the non-technical obstacles that we face in moving forward, and offers suggestions for practical ways in which libraries can participate in the development of the Semantic Web.
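As a toy illustration of such structured metadata, the sketch below (Python, rdflib) identifies an author by URI rather than by a name string, which is the general mechanism behind authority control and disambiguation. The record is invented for illustration; only the pattern is the point.

```python
# Sketch: a catalogue entry as linked data. The author is a URI (VIAF-style),
# not a text string, so two authors named "J. Smith" can never be conflated.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS, RDFS

g = Graph()
book = URIRef("http://example.org/book/moby-dick")       # illustrative URI
author = URIRef("http://viaf.org/viaf/27068555")         # a VIAF identifier

g.add((book, DCTERMS.title, Literal("Moby Dick")))
g.add((book, DCTERMS.creator, author))                   # a link, not a name
g.add((author, RDFS.label, Literal("Melville, Herman, 1819-1891")))

print(g.serialize(format="turtle"))
```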


Semantic Web, 2020, pp. 1-29
Author(s): Bettina Klimek, Markus Ackermann, Martin Brümmer, Sebastian Hellmann

In recent years, lexical resources have emerged rapidly in the Semantic Web. Whereas most of the linguistic information in them is already machine-readable, we found that morphological information is mostly absent or only contained in semi-structured strings. An integration of morphemic data has not yet been undertaken due to the lack of domain-specific ontologies and explicit morphemic data. In this paper, we present the Multilingual Morpheme Ontology, MMoOn Core, which can be regarded as the first comprehensive ontology for the linguistic domain of morphological language data. We describe how crucial concepts such as morphs, morphemes, word forms and meanings are represented and interrelated, and how language-specific morpheme inventories can be created as a new kind of morphological dataset. The aim of the MMoOn Core ontology is to serve as a shared semantic model for linguists and NLP researchers alike, enabling the creation, conversion, exchange, reuse and enrichment of morphological language data across different data-dependent language sciences. Various use cases are therefore illustrated to draw attention to the cross-disciplinary potential that can be realized with the MMoOn Core ontology in the context of the existing Linguistic Linked Data research landscape.
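A minimal sketch of what an explicit morphemic representation might look like as linked data, in the spirit of MMoOn Core. The class and property names below are invented placeholders, not the actual MMoOn Core vocabulary, which should be consulted directly.

```python
# Sketch: "unhappiness" decomposed into morphs, each linked to a meaning,
# instead of being stored as an opaque semi-structured string.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

MM = Namespace("http://example.org/mmoon/")  # placeholder namespace
g = Graph()

g.add((MM.unhappiness, RDF.type, MM.Wordform))
for morph, gloss in [(MM.un, "negation"),
                     (MM.happy, "happy"),
                     (MM.ness, "nominalizer")]:
    g.add((MM.unhappiness, MM.consistsOfMorph, morph))
    g.add((morph, RDF.type, MM.Morph))
    g.add((morph, MM.hasMeaning, Literal(gloss)))
```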


2021, Vol 81 (3-4), pp. 318-358
Author(s): Sander Stolk

This article provides an introduction to the web application Evoke, which offers functionality to navigate, view, extend, and analyse thesaurus content. The thesauri that can be navigated in Evoke are expressed in Linguistic Linked Data, an interoperable data form that enables the extension of thesaurus content with custom labels and allows thesaurus content to be linked to other digital resources. As such, Evoke is a powerful research tool that enables its users to perform novel cultural linguistic analyses over multiple sources. The article further demonstrates the potential of Evoke by discussing how A Thesaurus of Old English was made available in the application and how it has already been adopted in the field of Old English studies. Lastly, the author situates Evoke within a number of recent developments in the field of Digital Humanities and its applications for onomasiological research.
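The sketch below shows how thesaurus content expressed as linked data can be navigated programmatically, assuming a SKOS-style broader/narrower hierarchy (the abstract does not name the exact model Evoke uses). The URIs and labels are invented for illustration.

```python
# Sketch: navigating a toy thesaurus fragment expressed in SKOS.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import SKOS

EX = Namespace("http://example.org/thesaurus/")
g = Graph()
g.add((EX.animal, SKOS.prefLabel, Literal("animal", lang="en")))
g.add((EX.horse, SKOS.prefLabel, Literal("hors", lang="ang")))  # Old English
g.add((EX.horse, SKOS.broader, EX.animal))

# One small onomasiological query: all concepts narrower than "animal".
for concept in g.subjects(SKOS.broader, EX.animal):
    print(g.value(concept, SKOS.prefLabel))
```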


2017, Vol 22 (1), pp. 21-37
Author(s): Matthew T. McCarthy

The web of linked data, otherwise known as the semantic web, is a system in which information is structured and interlinked to provide meaningful content to artificial intelligence (AI) algorithms. As the complex interactions between digital personae and these algorithms mediate access to information, it becomes necessary to understand how these classification and knowledge systems are developed. What are the processes by which such systems come to represent the world, and how are the controversies that arise in their creation overcome? As a global form, the semantic web is an assemblage of many interlinked classification and knowledge systems, which are themselves assemblages. Through the perspectives of global assemblage theory, critical code studies and practice theory, I analyse netnographic data of one such assemblage. Schema.org is but one component of the larger global assemblage of the semantic web, and as such is an emergent articulation of different knowledges, interests and networks of actors. This articulation comes together to tame the profusion of things, seeking stability in representation, but in the process it faces and produces more instability. Furthermore, this production of instability contributes to the emergence of new assemblages with similar aims.


Author(s): Leila Zemmouchi-Ghomari

Data play a central role in the effectiveness and efficiency of web applications such as the Semantic Web. However, data are distributed across a very large number of online sources, so significant effort is needed to integrate them for proper utilization. A promising solution to this issue is the linked data initiative, which is based on four principles for publishing web data that favour interlinked and structured online data over the existing web of documents. The basic ideas, techniques, and applications of the linked data initiative are surveyed in this paper. The authors also discuss open issues in Linked Data and potential directions for addressing these pending questions.
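The four principles can be seen in a few lines of code: name a thing with an HTTP URI, dereference that URI to obtain structured data, and follow the links it contains to other datasets. This sketch uses the public DBpedia resource for Berlin; it requires network access, and the endpoint's behaviour may change over time.

```python
# Sketch: the four linked data principles, end to end.
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

berlin = URIRef("http://dbpedia.org/resource/Berlin")  # principles 1 and 2

g = Graph()
g.parse(berlin)  # principle 3: dereferencing the URI yields useful RDF

# Principle 4: the returned data links to the same entity elsewhere.
for _, _, other in g.triples((berlin, OWL.sameAs, None)):
    print(other)
```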


Author(s): Amrapali Zaveri, Andrea Maurino, Laure Berti-Equille

The standardization and adoption of Semantic Web technologies has resulted in an unprecedented volume of data being published as Linked Data (LD). However, the "publish first, refine later" philosophy leads to various quality problems in the underlying data, such as incompleteness, inconsistency and semantic ambiguity. In this article, we describe the current state of data quality in the Web of Data, along with details of the three papers accepted for the International Journal on Semantic Web and Information Systems' (IJSWIS) Special Issue on Web Data Quality. Additionally, we identify new challenges that are specific to the Web of Data and provide insights into the current progress and future directions for each of those challenges.
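Two of the named quality problems are easy to check mechanically, as the sketch below illustrates: incompleteness (resources missing a label) and inconsistency (two values for a property expected to be single-valued). The test data and the single-valued assumption are illustrative, not drawn from the special issue.

```python
# Sketch: two simple data quality checks over an RDF graph.
from collections import defaultdict
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, RDFS

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.alice, FOAF.birthday, Literal("1990-01-01")))
g.add((EX.alice, FOAF.birthday, Literal("1991-05-05")))  # inconsistent

# Incompleteness: subjects with no rdfs:label at all.
unlabeled = {s for s in g.subjects() if g.value(s, RDFS.label) is None}

# Inconsistency: more than one birthday per resource.
birthdays = defaultdict(set)
for s, o in g.subject_objects(FOAF.birthday):
    birthdays[s].add(o)
conflicts = {s: vals for s, vals in birthdays.items() if len(vals) > 1}

print(unlabeled, conflicts)
```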


Author(s): Alfio Ferrara, Andriy Nikolov, François Scharffe

By specifying that published datasets must link to other existing datasets, the fourth linked data principle ensures a Web of data rather than just a set of unconnected data islands. The authors propose in this paper the term data linking to name the problem of finding equivalent resources on the Web of linked data. Many techniques have been developed to perform data linking, with roots in statistics, databases, natural language processing and graph theory. The authors begin this paper by providing background information and terminological clarifications related to data linking. They then provide a comprehensive survey of the various techniques available for data linking, classified along the three criteria of granularity, type of evidence, and source of the evidence. Finally, the authors survey eleven recent data linking tools and classify them according to the surveyed techniques.
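A minimal sketch of one family of techniques in this space, value-based instance matching: two resources from different datasets are linked with owl:sameAs when their labels match after normalisation. Real tools combine far richer evidence (structure, keys, graph context); the datasets below are invented.

```python
# Sketch: naive label-based instance matching producing owl:sameAs links.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDFS

A = Namespace("http://example.org/datasetA/")
B = Namespace("http://example.org/datasetB/")

g = Graph()
g.add((A.paris, RDFS.label, Literal("Paris")))
g.add((B.city42, RDFS.label, Literal("  paris ")))

def norm(lit):
    # Normalise a label before comparison.
    return str(lit).strip().lower()

links = Graph()
for s1, l1 in g.subject_objects(RDFS.label):
    for s2, l2 in g.subject_objects(RDFS.label):
        if s1 != s2 and norm(l1) == norm(l2):
            links.add((s1, OWL.sameAs, s2))
```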


2014, Vol 9 (1), pp. 331-342
Author(s): Herbert Van de Sompel, Robert Sanderson, Harihar Shankar, Martin Klein

Persistent IDentifiers (PIDs), such as DOIs, Handles and ARK identifiers, play a significant role in the identification of a wide variety of assets that are created and used in scholarly endeavours, including research papers, datasets, images, etc. Motivated by concerns about long-term persistence, among others, PIDs are minted outside the information access protocol of the day, HTTP. Yet value-added services targeted at both humans and machines routinely assume, or even require, resources identified by means of HTTP URIs in order to make use of off-the-shelf components like web browsers and servers. Hence, an unambiguous bridge is required between the PID-oriented paradigm that is widespread in research communication and the HTTP-oriented web, semantic web and linked data environment. This paper describes the problem and a possible solution for defining and deploying such an interoperable bridge.
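The simplest form of such a bridge is already familiar from DOIs: the doi.org proxy turns any DOI into a dereferenceable HTTP URI that off-the-shelf web machinery can follow. The sketch below shows that round trip (it requires network access; the DOI used is the DOI Handbook's own), though it is only one piece of the fuller solution the paper develops.

```python
# Sketch: bridging a DOI into the HTTP-oriented web via the doi.org proxy.
import urllib.request

def doi_to_http_uri(doi: str) -> str:
    # The well-known DOI proxy makes any DOI an HTTP URI.
    return "https://doi.org/" + doi

url = doi_to_http_uri("10.1000/182")  # the DOI of the DOI Handbook
with urllib.request.urlopen(url) as resp:
    print(resp.geturl())  # final landing URL after the resolver's redirects
```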


Author(s): B. Kamala, J. M. Nandhini

Ontologies have become an effective modeling approach for various applications, and significantly so in the semantic web. The difficulty of extracting information from the web, which was created mainly for visualising information, has driven the birth of the semantic web, which will contain far more resources than the web and will attach machine-readable semantic information to these resources. Ontological bootstrapping on a set of predefined sources, such as web services, must address the problem of multiple, largely unrelated concepts. Web services consist of two basic components: Web Services Description Language (WSDL) descriptors and free text descriptors. The WSDL descriptor is evaluated using two methods, namely Term Frequency/Inverse Document Frequency (TF/IDF) and web context generation. The proposed bootstrapping ontological process integrates TF/IDF and web context generation, and applies validation using the free text descriptor service, so that it offers a more accurate definition of ontologies. This paper uses a ranking adaptation model that predicts the rank for a collection of web service documents, leading to the automatic construction, enrichment and adaptation of ontologies.
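For readers unfamiliar with the scoring step, here is a minimal TF/IDF sketch: terms that are frequent in one descriptor but rare across the collection score highest, which is what makes them good candidate concepts. The token lists stand in for tokenised WSDL documents and are invented.

```python
# Sketch: TF/IDF scoring over a toy collection of tokenised descriptors.
import math

docs = [
    ["hotel", "booking", "reserve", "room"],
    ["flight", "booking", "ticket"],
    ["weather", "forecast", "city"],
]

def tf_idf(term, doc, docs):
    tf = doc.count(term) / len(doc)             # term frequency in this doc
    df = sum(1 for d in docs if term in d)      # docs containing the term
    idf = math.log(len(docs) / df)              # inverse document frequency
    return tf * idf

# "booking" appears in two of three docs, so it scores lower than "hotel".
for term in ("hotel", "booking"):
    print(term, round(tf_idf(term, docs[0], docs), 3))
```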


2011, pp. 1027-1049
Author(s): Danica Damljanovic, Vladan Devedžić

Traditional E-Tourism applications store data internally in a form that is not interoperable with similar systems. Hence, travel agents spend plenty of time updating data about vacation packages in order to provide good service to their clients. Their clients, in turn, spend plenty of time searching for the 'perfect' vacation package, as the data about tourist offers are not integrated and are available from different spots on the Web. We developed Travel Guides, a prototype system for tourism management, to illustrate how semantic web technologies combined with traditional E-Tourism applications (a) help integrate tourism sources dispersed on the Web and (b) enable the creation of sophisticated user profiles. Maintaining quality user profiles enables system personalization and adaptivity of the content shown to the user. The core of this system is in ontologies: they enable machine-readable and machine-understandable representation of the data and, more importantly, reasoning.
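The reasoning benefit can be shown in miniature: a class hierarchy lets the system match a spa hotel offer to a query for any accommodation, even though no offer is labelled "Accommodation" directly. The class and property names below are illustrative, not the actual Travel Guides ontology.

```python
# Sketch: subclass reasoning over a toy tourism ontology.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/tourism/")
g = Graph()
g.add((EX.SpaHotel, RDFS.subClassOf, EX.Hotel))
g.add((EX.Hotel, RDFS.subClassOf, EX.Accommodation))
g.add((EX.wellness_package, RDF.type, EX.SpaHotel))

# Walk the subclass hierarchy to answer a query for any Accommodation.
classes = set(g.transitive_subjects(RDFS.subClassOf, EX.Accommodation))
offers = [s for c in classes for s in g.subjects(RDF.type, c)]
print(offers)  # finds the wellness_package offer via SpaHotel -> Hotel
```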

