Measuring vocabulary use in the Linked Data Cloud

2017, Vol. 41 (2), pp. 252-271
Author(s): Alberto Nogales, Miguel Angel Sicilia-Urban, Elena García-Barriocanal

Purpose: This paper reports on a quantitative study of data gathered from the Linked Open Vocabularies (LOV) catalogue, including the use of network analysis and metrics. The purpose of this paper is to gain insights into the structure of LOV and the use of vocabularies in the Web of Data, noting that not all vocabularies used in the Web of Data are registered in LOV. Given the decentralised and collaborative nature of the use and adoption of these vocabularies, the results of the study can be used to identify emergent important vocabularies that are shaping the Web of Data.
Design/methodology/approach: The methodology is based on an analytical approach to a data set that captures a complete snapshot of the LOV catalogue dated April 2014. An initial analysis of the data is presented in order to obtain insights into the characteristics of the vocabularies found in LOV. This is followed by an analysis of the use of Vocabulary of a Friend (VOAF) properties that describe relations among vocabularies. Finally, the study is complemented with an analysis of the usage of the different vocabularies, and concludes by proposing a number of metrics.
Findings: The most relevant insight is that, unsurprisingly, the vocabularies with the greatest presence are those used to model Semantic Web data, such as the Resource Description Framework (RDF), RDF Schema and OWL, as well as broadly used standards such as the Simple Knowledge Organization System (SKOS), DCTERMS and DCE. It was also found that the most widely used language is English, that the vocabularies are generally not highly specialised in a single field, and that no single scope dominates. Regarding the structural analysis, it is concluded that LOV is a heterogeneous network.
Originality/value: The paper provides an empirical analysis of the structure of LOV and the relations between its vocabularies, together with some metrics that may help determine the important vocabularies from a practical perspective. The results are of interest for a better understanding of the evolution and dynamics of the Web of Data, and for applications that attempt to retrieve data in the Linked Data Cloud. These applications can benefit from the insights into which vocabularies are important to support and the value added when mapping between and using the vocabularies.
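
As a rough illustration of the kind of network analysis described above (a minimal sketch, not the authors' implementation), the snippet below builds a toy directed graph of vocabulary-reuse relations and computes in-degree and PageRank with networkx; the vocabulary names and edges are invented and do not reproduce the April 2014 LOV snapshot.

# A minimal sketch, assuming networkx is installed; the dependency
# edges below are invented toy data, not the LOV catalogue.
import networkx as nx

# Directed edge (A, B) means "vocabulary A reuses/relies on vocabulary B".
edges = [
    ("foaf", "rdfs"), ("foaf", "owl"),
    ("dcterms", "rdfs"), ("skos", "rdfs"),
    ("voaf", "dcterms"), ("voaf", "foaf"),
    ("schema", "rdfs"), ("schema", "owl"),
]
G = nx.DiGraph(edges)

# In-degree counts how many vocabularies depend on each vocabulary;
# PageRank weights that reuse by the importance of the dependants.
in_degree = dict(G.in_degree())
pagerank = nx.pagerank(G)

for vocab in sorted(G.nodes(), key=pagerank.get, reverse=True):
    print(f"{vocab:8s} in-degree={in_degree[vocab]}  pagerank={pagerank[vocab]:.3f}")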

Author(s): Leila Zemmouchi-Ghomari

The data on the web is heterogeneous and distributed, which makes its integration a sine qua non condition for its effective exploitation within the context of the semantic web, or the so-called web of data. A promising solution for web data integration is the linked data initiative, which is based on four principles that aim to standardize the publication of structured data on the web. The objective of this chapter is to provide an overview of the essential aspects of this fairly recent and exciting field, including the linked data model, the Resource Description Framework (RDF); its query language, SPARQL (the SPARQL Protocol and RDF Query Language); the available means of publishing and consuming linked data; and the existing applications and the issues not yet addressed in research.
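
To make the RDF/SPARQL pairing mentioned above concrete, here is a minimal, self-contained sketch using rdflib; the example triples are invented for illustration and any real linked data source would of course be larger and dereferenceable.

# A minimal sketch of RDF + SPARQL, assuming rdflib is installed;
# the example triples are invented for illustration.
from rdflib import Graph

turtle_data = """
@prefix ex:   <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:alice a foaf:Person ; foaf:name "Alice" ; foaf:knows ex:bob .
ex:bob   a foaf:Person ; foaf:name "Bob" .
"""

g = Graph()
g.parse(data=turtle_data, format="turtle")

# SPARQL: who does each person know?
query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?friendName WHERE {
    ?person foaf:name ?name ;
            foaf:knows ?friend .
    ?friend foaf:name ?friendName .
}
"""
for name, friend_name in g.query(query):
    print(f"{name} knows {friend_name}")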


2018, Vol. 52 (3), pp. 405-423
Author(s): Riccardo Albertoni, Monica De Martino, Paola Podestà

Purpose: The purpose of this paper is to focus on the quality of the connections (linksets) among thesauri published as Linked Data on the Web. It extends the cross-walking measures with two new measures able to evaluate the enrichment brought by the information reached through the linkset (lexical enrichment, browsing space enrichment). It fosters the adoption of cross-walking linkset quality measures alongside the well-known and widely deployed cardinality-based measures (linkset cardinality and linkset coverage).
Design/methodology/approach: The paper applies the linkset measures to the Linked Thesaurus fRamework for Environment (LusTRE). LusTRE is selected as a testbed because it is encoded using the Simple Knowledge Organisation System (SKOS), published as Linked Data, and it explicitly exploits the cross-walking measures on its validated linksets.
Findings: The application to LusTRE offers insight into the complementarities among the considered linkset measures. In particular, it shows that the cross-walking measures deepen the cardinality-based measures by analysing quality facets that were not previously considered. The actual value of LusTRE's linksets with regard to improving multilingualism and concept spaces is assessed.
Research limitations/implications: The paper considers skos:exactMatch linksets, which are a rather specific but quite common kind of linkset. The cross-walking measures explicitly assume correctness and completeness of linksets. Third-party approaches and tools can help to meet these assumptions.
Originality/value: This paper fulfils an identified need to study the quality of linksets. Several approaches formalise and evaluate Linked Data quality focusing on data set quality but disregarding the other essential component: the connections among data.
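
A hedged illustration of the cardinality-based measures named above (a sketch under my own reading of "cardinality" and "coverage", not the authors' code): the snippet counts skos:exactMatch links between two toy thesauri; the concept URIs and the total number of source concepts are invented.

# A minimal sketch of cardinality-based linkset measures, assuming rdflib;
# the thesauri, the exactMatch links and the concept count are invented.
from rdflib import Graph, Namespace

SKOS = Namespace("http://www.w3.org/2004/02/skos/core#")

linkset_ttl = """
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix a:    <http://example.org/thesaurusA/> .
@prefix b:    <http://example.org/thesaurusB/> .

a:water     skos:exactMatch b:water .
a:sea       skos:exactMatch b:ocean .
a:sediment  skos:exactMatch b:sediment .
"""

g = Graph()
g.parse(data=linkset_ttl, format="turtle")

links = list(g.subject_objects(SKOS.exactMatch))
source_concepts_linked = {s for s, _ in links}

# Linkset cardinality: number of skos:exactMatch statements.
cardinality = len(links)

# Linkset coverage: fraction of source-thesaurus concepts having at least
# one link (the total of 5 source concepts is an invented assumption).
TOTAL_SOURCE_CONCEPTS = 5
coverage = len(source_concepts_linked) / TOTAL_SOURCE_CONCEPTS

print(f"cardinality = {cardinality}, coverage = {coverage:.0%}")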


2019, Vol. 37 (3), pp. 513-524
Author(s): Thomas D. Steele

Purpose: The Bibliographic Framework Initiative (BIBFRAME) is a data model created by the Library of Congress with the long-term goal of replacing Machine-Readable Cataloging (MARC). The purpose of this paper is to inform catalogers and other library professionals why MARC no longer meets the needs of current users, and how BIBFRAME better meets those needs. It also explains linked data and the principles of the Resource Description Framework (RDF), so that catalogers will have a better understanding of BIBFRAME's basic goals.
Design/methodology/approach: A review of recent literature, in print and online, together with the use of the BIBFRAME editor to create a BIBFRAME record, forms the basis of this paper.
Findings: The paper concludes that the user experience with the library catalog has changed and requires more in-depth search capabilities using linked data, and that BIBFRAME is a first step toward meeting the user needs of the future.
Originality/value: The paper gives the reader an entry point into a complicated future that catalogers and other professionals may view with trepidation. With a systematic walkthrough of the creation of a BIBFRAME record, the reader should feel better informed about where cataloging is heading.
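
For readers who want a feel for what a BIBFRAME description looks like as RDF, here is a minimal hand-written sketch parsed with rdflib; the URIs and title are invented, and a record produced with the BIBFRAME editor would be considerably richer.

# A minimal sketch of a BIBFRAME Work/Instance pair, assuming rdflib;
# identifiers and the title are invented and far simpler than real records.
from rdflib import Graph

bibframe_ttl = """
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix ex: <http://example.org/> .

ex:work1 a bf:Work ;
    bf:title [ a bf:Title ; bf:mainTitle "An Invented Sample Title" ] .

ex:instance1 a bf:Instance ;
    bf:instanceOf ex:work1 .
"""

g = Graph()
g.parse(data=bibframe_ttl, format="turtle")

print(f"Parsed {len(g)} BIBFRAME triples:")
for s, p, o in g:
    print(s, p, o)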


2015, Vol. 8 (2)
Author(s): Jayalakshmi Srinivasan

In the last few years, the amount of structured data made available on the Web in semantic formats has grown by several orders of magnitude. On one side, the Linked Data effort has made hundreds of millions of entity descriptions, based on the Resource Description Framework (RDF), available online in data sets. On the other, the Web 2.0 community has increasingly embraced the idea of data portability, and the first efforts have already produced billions of RDF-equivalent triples, either embedded inside HTML pages using microformats or exposed directly using eRDF (embedded RDF) and RDFa (RDF in attributes). Meanwhile, cloud computing is offering utility-oriented IT services to users worldwide, enabling the hosting of applications from consumer, scientific and business domains. The beauty of cloud computing is its simplicity. This paper focuses on the process of transitioning from the IT architectures of today to a Semantic Cloud Architecture. The emphasis is on the collaborative work of business and enterprise architects to reduce operational costs and reach new heights.


Author(s): Alberto Nogales Moyano, Miguel Angel Sicilia, Elena Garcia Barriocanal

This article describes how the Web of Data has emerged as the realization of a machine-readable web, relying on the Resource Description Framework (RDF) language as a way to provide richer semantics to datasets. While the Web of Data is based on principles similar to those of the original Web, with interlinking being the principal mechanism for relating information, the differences in the structure of the information are evident. Several studies have analysed the graph structure of the Web, yielding important insights that were used in relevant applications. However, those findings cannot be transposed to the Web of Data, due to fundamental differences in how data is produced, linked and used. This article reports on a study of the graph structure of the Web of Data using methods and techniques from similar studies of the Web. Results show that the Web of Data also conforms to the bow-tie model. Other characteristics include the low distance between nodes and low closeness and degree centrality. Regarding the datasets, the largest one is Open Data Euskadi, but the one with the most connections to other datasets is DBpedia.
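
A rough sketch of the bow-tie decomposition mentioned in the findings, run on an invented toy graph rather than the Web of Data crawl studied in the article: the core is the largest strongly connected component, IN contains nodes that can reach the core, and OUT contains nodes reachable from it.

# A minimal sketch of a bow-tie decomposition, assuming networkx;
# the dataset-link edges below are invented toy data.
import networkx as nx

edges = [
    ("in1", "core1"), ("in2", "core2"),          # IN -> core
    ("core1", "core2"), ("core2", "core3"),      # core cycle
    ("core3", "core1"),
    ("core3", "out1"), ("core1", "out2"),        # core -> OUT
]
G = nx.DiGraph(edges)

# The core of the bow-tie is the largest strongly connected component.
core = max(nx.strongly_connected_components(G), key=len)
rep = next(iter(core))

# OUT: reachable from the core but not in it; IN: can reach the core.
out_part = (nx.descendants(G, rep) | {rep}) - core
in_part = (nx.ancestors(G, rep) | {rep}) - core
other = set(G) - core - out_part - in_part

print("core:", sorted(core))
print("IN:  ", sorted(in_part))
print("OUT: ", sorted(out_part))
print("other:", sorted(other))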


Author(s): Janailton Lopes Souza, Paulo George Miranda Martins, Rogério Aparecido Sá Ramalho

The term Big Data refers to the large volume of data produced and made available in digital environments. Over the last few years, new representation models have been proposed with the aim of improving the ways information is represented in digital environments. This work is part of an ongoing research project, funded by the FAPESP and CNPq agencies, and aims to analyse the principles underlying Big Data and their relationship with the new representation standards: the Resource Description Framework (RDF), the Simple Knowledge Organization System (SKOS) and the Web Ontology Language (OWL). The research is theoretical in nature and qualitative in approach, as it seeks to describe, understand and explain the relationships between Big Data and the new representation models. Based on the theoretical survey carried out, it was found that the representation models analysed contribute to interlinking large volumes of data without losing the context in which the data originate, favouring a better understanding of Big Data and of the new representation paradigms in digital environments.
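
To give a concrete, invented flavour of how the standards named above interlink data without losing context, the sketch below uses rdflib to describe one SKOS concept in a local dataset and link it to a concept published by another (hypothetical) dataset.

# A minimal sketch of interlinking with RDF/SKOS, assuming rdflib;
# the local concept and the external URI are invented examples.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://example.org/vocab/")

g = Graph()
g.bind("skos", SKOS)
g.bind("ex", EX)

# A local concept, with its label kept as context ...
g.add((EX.bigData, RDF.type, SKOS.Concept))
g.add((EX.bigData, SKOS.prefLabel, Literal("Big Data", lang="en")))

# ... interlinked with a concept from another (hypothetical) dataset.
g.add((EX.bigData, SKOS.exactMatch,
       URIRef("http://other.example.org/concepts/big-data")))

print(g.serialize(format="turtle"))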


2017, Vol. 44 (2), pp. 203-229
Author(s): Javier D Fernández, Miguel A Martínez-Prieto, Pablo de la Fuente Redondo, Claudio Gutiérrez

The publication of semantic web data, commonly represented in Resource Description Framework (RDF), has experienced outstanding growth over the last few years. Data from all fields of knowledge are shared publicly and interconnected in active initiatives such as Linked Open Data. However, despite the increasing availability of applications managing large-scale RDF information such as RDF stores and reasoning tools, little attention has been given to the structural features emerging in real-world RDF data. Our work addresses this issue by proposing specific metrics to characterise RDF data. We specifically focus on revealing the redundancy of each data set, as well as common structural patterns. We evaluate the proposed metrics on several data sets, which cover a wide range of designs and models. Our findings provide a basis for more efficient RDF data structures, indexes and compressors.
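
As an illustrative sketch of the kind of structural statistics such work considers (not the metrics actually defined in the paper), the snippet below computes per-subject out-degree and per-predicate usage counts over a tiny invented graph; on real data, these distributions are crude proxies for redundancy and common structural patterns.

# A minimal sketch of simple structural statistics over RDF, assuming rdflib;
# the triples are invented and the statistics are only illustrative proxies
# for the metrics proposed in the paper.
from collections import Counter
from rdflib import Graph

data = """
@prefix ex:   <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:a foaf:name "A" ; foaf:knows ex:b , ex:c .
ex:b foaf:name "B" ; foaf:knows ex:c .
ex:c foaf:name "C" .
"""

g = Graph()
g.parse(data=data, format="turtle")

subject_out_degree = Counter(s for s, _, _ in g)
predicate_usage = Counter(p for _, p, _ in g)

print("triples:", len(g))
print("distinct subjects:", len(subject_out_degree))
print("mean out-degree:", len(g) / len(subject_out_degree))
print("predicate usage:", dict(predicate_usage))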


2021, Vol. ahead-of-print (ahead-of-print)
Author(s): A. D'Amato

Purpose: The purpose of this paper is to analyze the relationship between intellectual capital and firm capital structure by exploring whether firm profitability and risk are drivers of this relationship.
Design/methodology/approach: Based on a comprehensive data set of Italian firms over the 2008–2017 period, this paper examines whether intellectual capital affects firm financial leverage. Moreover, it analyzes whether firm profitability and risk mediate the abovementioned relationship. Financial leverage is measured by the debt/equity ratio. Intellectual capital is measured via the value-added intellectual coefficient approach.
Findings: The findings show that firms with a high level of intellectual capital have lower financial leverage and are more profitable and riskier than firms with a low level of intellectual capital. Furthermore, this study finds that firm profitability and risk mediate the relationship between intellectual capital and financial leverage. Thus, the higher profitability and risk of intellectual capital-intensive firms help explain their lower financial leverage.
Research limitations/implications: The findings have several implications. From a theoretical standpoint, the paper presents and tests a mediating model of the relationship between intellectual capital and financial leverage and its underlying processes. In terms of the more general managerial implications, the results provide managers with a clear interpretation of the relationship between intellectual capital and financial leverage and point to the need to strengthen the capital structure of intangible-intensive firms.
Originality/value: Through a mediation framework, this study provides empirical evidence on the relationship between intellectual capital and firm financial leverage by exploring the underlying mechanisms behind that relationship, which is a novel approach in the literature.
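
A hedged numerical sketch of the two measures named above: the figures are invented, and the value-added intellectual coefficient is computed in its commonly cited form, which may differ in detail from the authors' operationalisation.

# A minimal sketch of the debt/equity ratio and a commonly cited form of the
# value-added intellectual coefficient (VAIC); all figures are invented and
# the formula may differ in detail from the paper's operationalisation.

total_debt = 400.0        # invented, in millions
total_equity = 600.0      # invented
financial_leverage = total_debt / total_equity

value_added = 250.0       # VA: output minus bought-in inputs (invented)
human_capital = 100.0     # HC: total employee costs (invented)
capital_employed = 800.0  # CE: book value of net assets (invented)

hce = value_added / human_capital                  # human capital efficiency
sce = (value_added - human_capital) / value_added  # structural capital efficiency
cee = value_added / capital_employed               # capital employed efficiency
vaic = hce + sce + cee

print(f"debt/equity = {financial_leverage:.2f}")
print(f"VAIC = {vaic:.2f} (HCE={hce:.2f}, SCE={sce:.2f}, CEE={cee:.2f})")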


2021, Vol. ahead-of-print (ahead-of-print)
Author(s): Ceri Binding, Claudio Gnoli, Douglas Tudhope

Purpose: The Integrative Levels Classification (ILC) is a comprehensive "freely faceted" knowledge organization system not previously expressed as SKOS (Simple Knowledge Organization System). This paper reports and reflects on work converting the ILC to a SKOS representation.
Design/methodology/approach: The design of the ILC representation and the various steps in the conversion to SKOS are described and located within the context of previous work on representing complex classification schemes in SKOS. Various issues and trade-offs emerging from the conversion are discussed. The conversion implementation employed the STELETO transformation tool.
Findings: The ILC conversion captures some of the ILC facet structure by a limited extension beyond the SKOS standard. SPARQL examples illustrate how this extension could be used to create faceted, compound descriptors when indexing or cataloguing. Basic query patterns are provided that might underpin search systems. Possible routes for reducing complexity are discussed.
Originality/value: Complex classification schemes, such as the ILC, have features which are not straightforward to represent in SKOS and which extend beyond the functionality of the SKOS standard. The ILC's facet indicators are modelled as rdf:Property sub-hierarchies that accompany the SKOS RDF statements. The ILC's top-level fundamental facet relationships are modelled as extensions of the associative relationship, i.e. specialised sub-properties of skos:related. An approach for representing faceted compound descriptions in ILC and other faceted classification schemes is proposed.
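
A minimal sketch of the modelling pattern described above, using invented placeholder property names rather than the actual ILC conversion: a facet indicator declared as an rdf:Property, and a fundamental facet relationship declared as a specialised sub-property of skos:related.

# A minimal sketch of the modelling pattern described above, assuming rdflib;
# the ILC-like property names are invented placeholders, not the real scheme.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS, SKOS

ILCX = Namespace("http://example.org/ilc-extension/")

g = Graph()
g.bind("skos", SKOS)
g.bind("ilcx", ILCX)

# A facet indicator modelled as an rdf:Property (hypothetical name).
g.add((ILCX.facetIndicator04, RDF.type, RDF.Property))

# A fundamental facet relationship modelled as a specialised
# sub-property of skos:related (hypothetical name).
g.add((ILCX.hasAgent, RDF.type, RDF.Property))
g.add((ILCX.hasAgent, RDFS.subPropertyOf, SKOS.related))

print(g.serialize(format="turtle"))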


Author(s): Zongmin Ma, Li Yan

The Resource Description Framework (RDF) is a model for representing information resources on the web. With the widespread acceptance of RDF as the de facto standard recommended by the W3C (World Wide Web Consortium) for the representation and exchange of information on the web, a huge amount of RDF data is proliferating and becoming available. RDF data management is therefore of increasing importance and has attracted attention in the database community as well as the Semantic Web community. Currently, much work has been devoted to proposing different solutions for storing large-scale RDF data efficiently. In order to manage massive RDF data, NoSQL ("not only SQL") databases have been used for scalable RDF data storage. This chapter focuses on using various NoSQL databases to store massive RDF data. An up-to-date overview of the current state of the art in RDF data storage in NoSQL databases is provided, and the chapter concludes with suggestions for future research.
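
As a rough, database-agnostic sketch of one storage idea surveyed in such work (an invented layout, not tied to any particular NoSQL product), the snippet below indexes triples under SPO/POS/OSP permutations the way a key-value or wide-column store might, then answers two simple lookups.

# A minimal, database-agnostic sketch of triple indexing in a key-value style
# (SPO / POS / OSP permutations); real NoSQL-backed RDF stores are far more
# elaborate, and this layout is only an invented illustration.
from collections import defaultdict

triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:bob", "foaf:name", '"Bob"'),
]

# Three permutation indexes, so any single bound term can drive a lookup.
spo = defaultdict(lambda: defaultdict(set))
pos = defaultdict(lambda: defaultdict(set))
osp = defaultdict(lambda: defaultdict(set))

for s, p, o in triples:
    spo[s][p].add(o)
    pos[p][o].add(s)
    osp[o][s].add(p)

# Lookup: everything known about ex:alice (uses the SPO index).
for p, objects in spo["ex:alice"].items():
    for o in objects:
        print("ex:alice", p, o)

# Lookup: who has the name "Bob"? (uses the POS index)
print(pos["foaf:name"].get('"Bob"'))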

