H-SPOOL

Purpose Linked data (LD) has promoted the publishing and interlinking of information. A growing number of LD data sets contain numerical data such as statistics, and analyzing numerical facts in LD has therefore attracted attention from diverse domains. This paper aims to support analytical processing of LD data. Design/methodology/approach This paper proposes a framework called H-SPOOL, which provides a series of SPARQL (SPARQL Protocol and RDF Query Language) queries that extract objects and attributes from LD data sets, converts them into star/snowflake schemas and materializes the relevant triples as fact and dimension tables for online analytical processing (OLAP). Findings The applicability of H-SPOOL is evaluated using existing LD data sets on the Web, and H-SPOOL successfully performs ETL (extract, transform and load) on the LD data sets for OLAP. In addition, experiments show that H-SPOOL reduces the number of downloaded triples compared with an existing approach. Originality/value H-SPOOL is the first work on extracting OLAP-related information from SPARQL endpoints, and it drastically reduces the number of downloaded triples.
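To make "materializing triples as fact and dimension tables" concrete, the following is a minimal sketch, not H-SPOOL's actual algorithm. It works on an in-memory list of triples rather than a SPARQL endpoint, and the predicate names (`ex:region`, `ex:year`, `ex:population`), the sample values and the function `build_star_schema` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); names and values are illustrative only.
TRIPLES = [
    ("obs1", "ex:region", "Tokyo"),
    ("obs1", "ex:year", "2015"),
    ("obs1", "ex:population", 13.5),
    ("obs2", "ex:region", "Osaka"),
    ("obs2", "ex:year", "2015"),
    ("obs2", "ex:population", 8.8),
    ("obs3", "ex:region", "Tokyo"),
    ("obs3", "ex:year", "2016"),
    ("obs3", "ex:population", 13.6),
]


def build_star_schema(triples, measure_predicates):
    """Split observations into one fact table and one dimension table per predicate."""
    # Group predicate/object pairs by observation subject.
    observations = defaultdict(dict)
    for subject, predicate, obj in triples:
        observations[subject][predicate] = obj

    dimensions = defaultdict(dict)  # predicate -> {dimension value: surrogate key}
    facts = []
    for subject in sorted(observations):
        row = {"observation": subject}
        for predicate, value in observations[subject].items():
            if predicate in measure_predicates:
                # Numeric measures stay in the fact table.
                row[predicate] = value
            else:
                # Assign a surrogate key the first time a dimension value is seen.
                keys = dimensions[predicate]
                row[predicate + "_key"] = keys.setdefault(value, len(keys) + 1)
        facts.append(row)
    return facts, {p: dict(k) for p, k in dimensions.items()}


facts, dimensions = build_star_schema(TRIPLES, {"ex:population"})
```

Each row of `facts` holds the measure plus surrogate keys into the per-predicate tables in `dimensions`, which is the basic shape of a star schema. A snowflake schema would further normalize the dimension tables.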


BOLD (Big and Open Linked Data): what’s next?

Library Hi Tech News ◽

10.1108/lhtn-04-2017-0020 ◽

2017 ◽

Vol 34 (5) ◽

pp. 10-13 ◽

Cited By ~ 1

Author(s):

Stuti Saxena

Keyword(s):

Open Source ◽

Linked Data ◽

Design Methodology ◽

Data Sets ◽

Content Type ◽

Data Architecture ◽

Different Sources ◽

Practical Implications

Purpose The purpose of this paper is to appreciate the futuristic trends of Big and Open Linked Data (BOLD). While designating the ongoing progress of BOLD as BOLD 0.0, the paper also identifies the trajectory of BOLD 0.0 as BOLD 1.0, BOLD 2.0 and BOLD 3.0 in terms of the complexity and management of data sets from different sources. Design/methodology/approach This is a viewpoint and the ideas presented here are personal. Findings The trajectory of BOLD will face ever-growing challenges as the nature and scope of data sets grow more complicated. The paper posits that by the time BOLD attains maturity, there will be a need for newer technologies and data architecture platforms which are relatively affordable and, if possible, available as “Open Source”. Research limitations/implications Being exploratory in approach, this viewpoint presents a futuristic trend, which may or may not be valid. Nevertheless, there are significant practical implications for academicians and practitioners in appreciating the likely challenges to the sustainability of BOLD in the coming times. Originality/value While there are a number of studies on BOLD, none proposes the possible trends in BOLD’s progress. This paper seeks to plug this gap.


Quality measures for skos

Data Technologies and Applications ◽

10.1108/dta-05-2017-0037 ◽

2018 ◽

Vol 52 (3) ◽

pp. 405-423 ◽

Cited By ~ 3

Author(s):

Riccardo Albertoni ◽

Monica De Martino ◽

Paola Podestà

Keyword(s):

Linked Data ◽

Design Methodology ◽

Quality Measures ◽

Third Party ◽

Data Set ◽

Content Type ◽

Knowledge Organisation ◽

The Cross ◽

The Web

Purpose The purpose of this paper is to focus on the quality of the connections (linksets) among thesauri published as Linked Data on the Web. It extends the cross-walking measures with two new measures able to evaluate the enrichment brought by the information reached through the linkset (lexical enrichment, browsing space enrichment). It fosters the adoption of cross-walking linkset quality measures alongside the well-known and widely deployed cardinality-based measures (linkset cardinality and linkset coverage). Design/methodology/approach The paper applies the linkset measures to the Linked Thesaurus fRamework for Environment (LusTRE). LusTRE is selected as a testbed because it is encoded using the Simple Knowledge Organisation System (SKOS) and published as Linked Data, and it explicitly exploits the cross-walking measures on its validated linksets. Findings The application to LusTRE offers an insight into the complementarities among the considered linkset measures. In particular, it shows that the cross-walking measures deepen the cardinality-based measures by analysing quality facets that were not previously considered. The actual value of LusTRE’s linksets regarding the improvement of multilingualism and concept spaces is assessed. Research limitations/implications The paper considers skos:exactMatch linksets, which are a rather specific but quite common kind of linkset. The cross-walking measures explicitly assume correctness and completeness of linksets. Third-party approaches and tools can help to meet these assumptions. Originality/value This paper fulfils an identified need to study the quality of linksets. Several approaches formalise and evaluate Linked Data quality by focusing on data set quality but disregard the other essential component: the connections among data.
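The two cardinality-based measures named in the abstract can be illustrated with a short sketch. The abstract does not give formal definitions, so the ones below are assumptions: cardinality is taken as the number of distinct links, and coverage as the fraction of source-thesaurus concepts that appear as the subject of at least one link. The concept identifiers are hypothetical.

```python
def linkset_cardinality(links):
    """Number of distinct (source concept, target concept) links in the linkset."""
    return len(set(links))


def linkset_coverage(links, source_concepts):
    """Fraction of source concepts that are the subject of at least one link."""
    if not source_concepts:
        return 0.0
    linked = {source for source, _ in links if source in source_concepts}
    return len(linked) / len(source_concepts)


# Hypothetical skos:exactMatch links from a source thesaurus to a target thesaurus.
source_concepts = {"src:water", "src:soil", "src:air", "src:noise"}
links = [
    ("src:water", "tgt:water"),
    ("src:soil", "tgt:ground"),
    ("src:water", "tgt:h2o"),
]

cardinality = linkset_cardinality(links)
coverage = linkset_coverage(links, source_concepts)
```

Both numbers describe only how many links exist. They say nothing about what the linked concepts contribute, which is the gap the cross-walking measures in the abstract are meant to fill.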


Ontology assessment based on linked data principles

International Journal of Web Information Systems ◽

10.1108/ijwis-01-2018-0003 ◽

2018 ◽

Vol 14 (4) ◽

pp. 453-479

Author(s):

Leila Zemmouchi-Ghomari ◽

Kaouther Mezaache ◽

Mounia Oumessad

Keyword(s):

Linked Data ◽

Design Methodology ◽

State Of The Art ◽

Experimental Results ◽

Data Sets ◽

Content Type ◽

Detailed Interpretation

Purpose The purpose of this paper is to evaluate ontologies with respect to the linked data principles. This paper presents a concrete interpretation of the four linked data principles applied to ontologies, along with an implementation that automatically detects violations of these principles and fixes them (semi-automatically). The implementation is applied to a number of state-of-the-art ontologies. Design/methodology/approach Based on a precise and detailed interpretation of the linked data principles in the context of ontologies (to make them as reusable as possible), the authors propose a set of algorithms to assess ontologies according to the four linked data principles, along with means to implement them using a Java/Jena framework. All ontology elements are extracted and examined, taking into account particular cases such as blank nodes and literals. The authors also provide propositions to fix some of the detected anomalies. Findings The experimental results are consistent with the proven quality of popular ontologies of the linked data cloud because these ontologies obtained good scores from the linked data validator tool. Originality/value The proposed approach and its implementation take into account the assessment of the four linked data principles and propose means to correct the detected anomalies in the assessed data sets, whereas most LD validator tools focus on the evaluation of principle 2 (URI dereferenceability) and principle 3 (RDF validation); additionally, they do not tackle the issue of fixing detected errors.


Improving recommender systems’ performance on cold-start users and controversial items by a new similarity model

International Journal of Web Information Systems ◽

10.1108/ijwis-07-2015-0024 ◽

2016 ◽

Vol 12 (2) ◽

pp. 126-149 ◽

Cited By ~ 4

Author(s):

Masoud Mansoury ◽

Mehdi Shajari

Keyword(s):

Real World ◽

Design Methodology ◽

Cold Start ◽

Selection Function ◽

Data Sets ◽

Real World Data ◽

Content Type ◽

User Similarity ◽

Active User ◽

Similarity Model

Purpose This paper aims to improve recommendation performance for cold-start users and controversial items. Collaborative filtering (CF) generates recommendations on the basis of similarity between users. It uses the opinions of similar users to generate the recommendation for an active user. As the similarity model or neighbor selection function is the key element for the effectiveness of CF, many variations of CF have been proposed. However, these methods are not very effective, especially for users who provide few ratings (i.e. cold-start users). Design/methodology/approach A new user similarity model is proposed that focuses on improving recommendation performance for cold-start users and controversial items. To show the validity of their similarity model, the authors conducted experiments and showed the effectiveness of this model in calculating similarity values between users even when only a few ratings are available. In addition, the authors applied their user similarity model to a recommender system and analyzed its results. Findings Experiments on two real-world data sets are implemented and compared with some other CF techniques. The results show that the authors’ approach outperforms previous CF techniques on the coverage metric while preserving accuracy for cold-start users and controversial items. Originality/value In the proposed approach, the conditions in which CF is unable to generate accurate recommendations are addressed. These conditions affect CF performance adversely, especially in the cold-start users’ condition. The authors show that their similarity model overcomes CF weaknesses effectively and improves its performance even in the cold-start users’ condition.
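The cold-start problem described above can be seen in a conventional baseline. The sketch below is not the authors' similarity model, which the abstract does not specify. It is the standard Pearson correlation over co-rated items, shown to illustrate why a user with very few ratings yields no usable similarity value. The function name and the rating dictionaries are hypothetical.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over co-rated items, or None when it is undefined."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        # Typical cold-start case: too few co-rated items to correlate.
        return None
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    numerator = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    spread_a = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    spread_b = sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    if spread_a == 0 or spread_b == 0:
        # A user who gave identical ratings to all co-rated items has no variance.
        return None
    return numerator / (spread_a * spread_b)


# Hypothetical ratings: item id -> rating.
active_user = {"i1": 5, "i2": 3, "i3": 1}
neighbor = {"i1": 4, "i2": 2, "i3": 0}
cold_start_user = {"i1": 5}

similar = pearson_similarity(active_user, neighbor)
undefined = pearson_similarity(active_user, cold_start_user)
```

Here `undefined` is `None` because only one item is co-rated, so the cold-start user can never be chosen as a neighbor under this baseline. That is the kind of coverage loss a similarity model designed for sparse ratings tries to avoid.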


Agricultural marketing and food safety in China: a utility perspective

Journal of Agribusiness in Developing and Emerging Economies ◽

10.1108/jadee-02-2013-0009 ◽

2014 ◽

Vol 4 (1) ◽

pp. 23-31 ◽

Cited By ~ 11

Author(s):

David L. Ortega ◽

Colin G. Brown ◽

Scott A. Waldron ◽

H. Holly Wang

Keyword(s):

Food Safety ◽

Design Methodology ◽

Numerical Data ◽

Marketing System ◽

Agricultural Marketing ◽

Content Type ◽

Safety Issues ◽

Time Form ◽

Made In ◽

Chinese Food

Purpose – The purpose of this paper is to explore Chinese food safety issues by analysing select incidents within the Chinese agricultural marketing system. Design/methodology/approach – A marketing utility framework is utilized to discuss some of the major food safety incidents in China and potential solutions are explored. Findings – The paper finds that food safety issues arise from problems of asymmetric information, which lead to the profit-seeking behaviour of agents distorting rather than enhancing the creation of one of the four types of marketing utility (time, form, place and possession). Additionally, structural causes found within the Chinese food marketing system have contributed to the food safety problems. Research limitations/implications – This is not empirical research based on numerical data. Originality/value – This study is one of the first to address Chinese food safety problems from an agricultural marketing utility perspective. Key anecdotes are used to support the claims made in this study.


The Google Knowledge Graph

Strategic Direction ◽

10.1108/sd-04-2014-0049 ◽

2014 ◽

Vol 30 (4) ◽

pp. 15-17 ◽

Cited By ~ 1

Keyword(s):

Case Studies ◽

Design Methodology ◽

Reading Time ◽

Social Impact ◽

Knowledge Graph ◽

Content Type ◽

Principal Source ◽

Pertinent Information ◽

Practical Implications ◽

The Web

Purpose – This paper aims to review the latest management developments across the globe and pinpoint practical implications from cutting-edge research and case studies. Design/methodology/approach – This briefing is prepared by an independent writer who adds their own impartial comments and places the articles in context. Findings – Becoming increasingly reliant on the web as a principal source of finding information is altering our brains and the way that we obtain and hold knowledge. We are becoming less reliant on our memories to hold knowledge, instead using technology – and search engines like Google in particular – to deposit and retrieve information. Practical implications – The paper provides strategic insights and practical thinking that have influenced some of the world's leading organizations. Social implications – The paper provides strategic insights and practical thinking that can have a broader social impact. Originality/value – The briefing saves busy executives and researchers hours of reading time by selecting only the very best, most pertinent information and presenting it in a condensed and easy-to-digest format.


Marketing and delivering information literacy on the web, yesterday and today, from 2009 to 2012

Library Hi Tech News ◽

10.1108/lhtn-03-2014-0019 ◽

2014 ◽

Vol 31 (4) ◽

pp. 10-13 ◽

Cited By ~ 1

Author(s):

Sharon Q. Yang

Keyword(s):

Information Literacy ◽

Web Sites ◽

Academic Libraries ◽

Design Methodology ◽

Status Quo ◽

Content Type ◽

The Status ◽

Practical Implications ◽

The Web ◽

Made In

Purpose – This study aims to ascertain the trends and changes of how academic libraries market and deliver information literacy (IL) on the web. Design/methodology/approach – The author compares the findings from two separate studies that scanned the Web sites for IL-related activities in 2009 and 2012, respectively. Findings – Academic libraries intensified their efforts to promote and deliver IL on the web between 2009 and 2012. There was a significant increase in IL-related activities on the web in the three-year period. Practical implications – The findings describe the status quo and changes in IL-related activities on the libraries’ Web sites. This information may help librarians to know what they have been doing and if there is space for improvement. Originality/value – This is the only study that spans three years in measuring the progress librarians made in marketing and delivering IL on the Web.


Website traffic measurement and rankings: competitive intelligence tools examination

International Journal of Web Information Systems ◽

10.1108/ijwis-01-2018-0001 ◽

2018 ◽

Vol 14 (4) ◽

pp. 423-437 ◽

Cited By ~ 1

Author(s):

David Prantl ◽

Martin Prantl

Keyword(s):

Design Methodology ◽

State Of The Art ◽

Competitive Intelligence ◽

Data Sets ◽

Traffic Data ◽

Content Type ◽

Traffic Measurement ◽

Data Estimation ◽

Research Studies

Purpose The purpose of this paper is to examine and verify the competitive intelligence tools Alexa and SimilarWeb, which are broadly used for website traffic data estimation. The tested tools represent the state of the art in this area. Design/methodology/approach The authors use a quantitative approach. Research was conducted on a sample of Czech websites for which accurate traffic data values are available, against which the less accurate data sets provided by Alexa and SimilarWeb are compared. Findings The results show that neither tool can accurately determine the ranking of websites on the internet. However, it is possible to approximately determine the significance of a particular website. These results are useful for other research studies which use data from Alexa or SimilarWeb. Moreover, the results show that it is still not possible to accurately estimate the traffic of any website in the world. Research limitations/implications The limitation of the research lies in the fact that it was conducted solely in the Czech market. Originality/value A significant number of research studies use data sets provided by Alexa and SimilarWeb. However, none of these research studies focus on the quality of the website traffic data acquired by Alexa or SimilarWeb, nor do any of them refer to other studies that deal with this issue. Furthermore, the authors describe approaches to measuring website traffic and, based on the analysis, discuss the possible usability of these methods.
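One common way to quantify how closely an estimated ranking follows a reference ranking is Spearman's rank correlation. The abstract does not say which statistic the authors used, so the sketch below is only an illustration of the general idea. The rankings are invented, and the formula used assumes there are no tied ranks.

```python
def spearman_rho(reference_ranks, estimated_ranks):
    """Spearman rank correlation for two rankings of the same items, assuming no ties."""
    if len(reference_ranks) != len(estimated_ranks):
        raise ValueError("both rankings must cover the same items")
    n = len(reference_ranks)
    if n < 2:
        raise ValueError("at least two items are required")
    # Sum of squared rank differences between the two rankings.
    squared_diffs = sum((r - e) ** 2 for r, e in zip(reference_ranks, estimated_ranks))
    return 1 - (6 * squared_diffs) / (n * (n ** 2 - 1))


# Hypothetical ranks of five websites: measured traffic vs. a tool's estimate.
measured = [1, 2, 3, 4, 5]
estimated = [2, 1, 3, 5, 4]

rho = spearman_rho(measured, estimated)
```

A value near 1 means the tool orders the sites almost as the measured traffic does. A high value can still coexist with wrong individual positions, as in this example where four of the five sites are misplaced by one rank.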


Improving the Quality of Linked Data Using Statistical Distributions

Information Retrieval and Management ◽

10.4018/978-1-5225-5191-1.ch074 ◽

2018 ◽

pp. 1638-1664 ◽

Cited By ~ 1

Author(s):

Heiko Paulheim ◽

Christian Bizer

Keyword(s):

Knowledge Base ◽

Linked Data ◽

Relational Databases ◽

Knowledge Bases ◽

Structured Data ◽

Data Sources ◽

Data Sets ◽

Statistical Distributions ◽

The Web

Linked Data on the Web is either created from structured data sources (such as relational databases), from semi-structured sources (such as Wikipedia), or from unstructured sources (such as text). In the latter two cases, the generated Linked Data will likely be noisy and incomplete. In this paper, we present two algorithms that exploit statistical distributions of properties and types for enhancing the quality of incomplete and noisy Linked Data sets: SDType adds missing type statements, and SDValidate identifies faulty statements. Neither of the algorithms uses external knowledge, i.e., they operate only on the data itself. We evaluate the algorithms on the DBpedia and NELL knowledge bases, showing that they are both accurate as well as scalable. Both algorithms have been used for building the DBpedia 3.9 release: With SDType, 3.4 million missing type statements have been added, while using SDValidate, 13,000 erroneous RDF statements have been removed from the knowledge base.
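The idea of using property statistics to infer missing types can be sketched as follows. This is a simplified illustration and not the published SDType algorithm: it looks only at the subject position of each property and averages the per-property type distributions without weighting them. The resources, properties, types and the `threshold` value are hypothetical.

```python
from collections import Counter, defaultdict


def type_distributions(triples, types):
    """For each property, estimate P(type | resource is a subject of that property)."""
    counts = defaultdict(Counter)
    totals = Counter()
    for subject, predicate, _ in triples:
        if subject in types:
            totals[predicate] += 1
            for resource_type in types[subject]:
                counts[predicate][resource_type] += 1
    return {p: {t: c / totals[p] for t, c in counts[p].items()} for p in counts}


def predict_types(resource, triples, distributions, threshold=0.5):
    """Average the type distributions of the resource's properties (unweighted vote)."""
    properties = [p for s, p, _ in triples if s == resource and p in distributions]
    if not properties:
        return {}
    scores = Counter()
    for predicate in properties:
        for resource_type, probability in distributions[predicate].items():
            scores[resource_type] += probability / len(properties)
    return {t: score for t, score in scores.items() if score >= threshold}


# Hypothetical data: "rome" has no type statement yet.
triples = [
    ("berlin", "ex:population", "3600000"),
    ("berlin", "ex:mayor", "person1"),
    ("paris", "ex:population", "2100000"),
    ("paris", "ex:mayor", "person2"),
    ("acme", "ex:founder", "person3"),
    ("rome", "ex:population", "2800000"),
    ("rome", "ex:mayor", "person4"),
]
types = {"berlin": {"City"}, "paris": {"City"}, "acme": {"Company"}}

distributions = type_distributions(triples, types)
predicted = predict_types("rome", triples, distributions)
```

Because every typed subject of `ex:population` and `ex:mayor` is a `City`, the untyped resource receives that type. As in the abstract, the sketch uses no external knowledge and relies only on statistics of the data itself.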


Application for online conferences and meetings

Library Hi Tech News ◽

10.1108/lhtn-07-2020-0068 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Mochammad Rifai ◽

Devi Fitrianah

Keyword(s):

Software Development ◽

Web Application ◽

Design Methodology ◽

Black Box ◽

Process Validation ◽

Content Type ◽

Registration Process ◽

Testing Method ◽

Black Box Testing ◽

The Web

Purpose This study aims to support institutions in holding online meetings or conferences while social distancing is in effect. Design/methodology/approach The application was developed using the rapid application design methodology. The implementation used HTML5 and PHP for the Web, MySQL for the database and the Agora Software Development Kit. To evaluate the application, the authors used a black box testing method. Findings The application supports participant registration, validation and payment; provides participants with a link to the workshop, a token and a room name for joining an online meeting or conference; and issues a digital attendance certificate to participants. Originality/value An integrated Web application provides full services, from the registration process and payment to the conference meeting itself and the certificate of attendance.
