System II: A Native RDF Repository Based on the Hypergraph Representation for RDF Data Model

Resource Description Framework (RDF) is a commonly used data model in the Semantic Web environment. Libraries and various other communities have been using the RDF data model to store valuable data after it is extracted from traditional storage systems. However, because of the large volume of the data, processing and storing it is becoming a nightmare for traditional data-management tools. This challenge demands a scalable and distributed system that can manage data in parallel. In this article, a distributed solution is proposed for efficiently processing and storing the large volume of library linked data stored in traditional storage systems. Apache Spark is used for parallel processing of large data sets and a column-oriented schema is proposed for storing RDF data. The storage system is built on top of Hadoop Distributed File Systems (HDFS) and uses the Apache Parquet format to store data in a compressed form. The experimental evaluation showed that storage requirements were reduced significantly as compared to Jena TDB, Sesame, RDF/XML, and N-Triples file formats. SPARQL queries are processed using Spark SQL to query the compressed data. The experimental evaluation showed a good query response time, which significantly reduces as the number of worker nodes increases.

Download Full-text

System Π: A Native RDF Repository Based on the Hypergraph Representation for RDF Data Model

Journal of Computer Science and Technology ◽

10.1007/s11390-009-9265-9 ◽

2009 ◽

Vol 24 (4) ◽

pp. 652-664 ◽

Cited By ~ 3

Author(s):

Gang Wu ◽

Juan-Zi Li ◽

Jian-Qiang Hu ◽

Ke-Hong Wang

Keyword(s):

Data Model ◽

Rdf Data

Download Full-text

A novel tool for standardizing clinical data in a realism-based common data model

10.1101/2020.05.12.091223 ◽

2020 ◽

Author(s):

Hayden G. Freedman ◽

Heather Williams ◽

Mark A. Miller ◽

David Birtwell ◽

Danielle L. Mowery ◽

...

Keyword(s):

Clinical Data ◽

Data Model ◽

Clinical Information ◽

Clinical Situation ◽

Common Data Model ◽

Full Potential ◽

Semantic Web Technologies ◽

High Quality Research ◽

Rdf Data ◽

Clinical Data Model

AbstractStandardizing clinical information in a common data model is important for promoting interoperability and facilitating high quality research. Semantic Web technologies such as Resource Description Framework can be utilized to their full potential when a clinical data model accurately reflects the reality of the clinical situation it describes. To this end, the Open Biomedical Ontologies Foundry provides a set of ontologies that conform to the principles of realism and can be used to create a realism-based clinical data model. However, the challenge of programmatically defining such a model and loading data from disparate sources into the model has not been addressed by pre-existing software solutions. The PennTURBO Semantic Engine is a tool developed at the University of Pennsylvania that works in conjunction with data aggregation software to transform source-specific RDF data into a source-independent, realism-based data model. This system sources classes from an application ontology and specifically defines how instances of those classes may relate to each other. Additionally, the system defines and executes RDF data transformations by launching dynamically generated SPARQL update statements. The Semantic Engine was designed as a generalizable RDF data standardization tool, and is able to work with various data models and incoming data sources. Its human-readable configuration files can easily be shared between institutions, providing the basis for collaboration on a standard realism-based clinical data model.

Download Full-text

N3Logic: A logical framework for the World Wide Web

Theory and Practice of Logic Programming ◽

10.1017/s1471068407003213 ◽

2008 ◽

Vol 8 (3) ◽

pp. 249-269 ◽

Cited By ~ 89

Author(s):

TIM BERNERS-LEE ◽

DAN CONNOLLY ◽

LALANA KAGAL ◽

YOSI SCHARF ◽

JIM HENDLER

Keyword(s):

Semantic Web ◽

Data Model ◽

Structured Data ◽

Minimal Extension ◽

Web Environment ◽

Knowledge Models ◽

Rdf Data ◽

Description Framework ◽

Resource Description ◽

The Web

AbstractThe Semantic Web drives toward the use of the Web for interacting with logically interconnected data. Through knowledge models such as Resource Description Framework (RDF), the Semantic Web provides a unifying representation of richly structured data. Adding logic to the Web implies the use of rules to make inferences, choose courses of action, and answer questions. This logic must be powerful enough to describe complex properties of objects but not so powerful that agents can be tricked by being asked to consider a paradox. The Web has several characteristics that can lead to problems when existing logics are used, in particular, the inconsistencies that inevitably arise due to the openness of the Web, where anyone can assert anything. N3Logic is a logic that allows rules to be expressed in a Web environment. It extends RDF with syntax for nested graphs and quantified variables and with predicates for implication and accessing resources on the Web, and functions including cryptographic, string, math. The main goal of N3Logic is to be a minimal extension to the RDF data model such that the same language can be used for logic and data. In this paper, we describe N3Logic and illustrate through examples why it is an appropriate logic for the Web.

Download Full-text

Mapping Spatiotemporal Data to RDF: A SPARQL Endpoint for Brussels

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi8080353 ◽

2019 ◽

Vol 8 (8) ◽

pp. 353 ◽

Cited By ~ 1

Author(s):

Alejandro Vaisman ◽

Kevin Chentout

Keyword(s):

Spatial Data ◽

Data Model ◽

Relational Databases ◽

Open Data ◽

Linked Open Data ◽

Sparql Endpoint ◽

External Data ◽

Triple Store ◽

Rdf Data ◽

Map Data

This paper describes how a platform for publishing and querying linked open data for the Brussels Capital region in Belgium is built. Data are provided as relational tables or XML documents and are mapped into the RDF data model using R2RML, a standard language that allows defining customized mappings from relational databases to RDF datasets. In this work, data are spatiotemporal in nature; therefore, R2RML must be adapted to allow producing spatiotemporal Linked Open Data.Data generated in this way are used to populate a SPARQL endpoint, where queries are submitted and the result can be displayed on a map. This endpoint is implemented using Strabon, a spatiotemporal RDF triple store built by extending the RDF store Sesame. The first part of the paper describes how R2RML is adapted to allow producing spatial RDF data and to support XML data sources. These techniques are then used to map data about cultural events and public transport in Brussels into RDF. Spatial data are stored in the form of stRDF triples, the format required by Strabon. In addition, the endpoint is enriched with external data obtained from the Linked Open Data Cloud, from sites like DBpedia, Geonames, and LinkedGeoData, to provide context for analysis. The second part of the paper shows, through a comprehensive set of the spatial extension to SPARQL (stSPARQL) queries, how the endpoint can be exploited.

Download Full-text

Towards the Representation of Etymological Data on the Semantic Web

Information ◽

10.3390/info9120304 ◽

2018 ◽

Vol 9 (12) ◽

pp. 304 ◽

Cited By ~ 1

Author(s):

Anas Khan

Keyword(s):

Semantic Web ◽

Data Model ◽

Linked Data ◽

New Model ◽

Data Framework ◽

Preceding Discussion ◽

Rdf Data ◽

Resource Data

In this article, we look at the potential for a wide-coverage modelling of etymological information as linked data using the Resource Data Framework (RDF) data model. We begin with a discussion of some of the most typical features of etymological data and the challenges that these might pose to an RDF-based modelling. We then propose a new vocabulary for representing etymological data, the Ontolex-lemon Etymological Extension (lemonETY), based on the ontolex-lemon model. Each of the main elements of our new model is motivated with reference to the preceding discussion.

Download Full-text

TSV2RDF: Generating RDF Data Model from TSV File Format Using Semantic Web Technologies

Journal of Digital Information Management ◽

10.6025/jdim/2021/19/1/10-24 ◽

2021 ◽

Vol 19 (1) ◽

pp. 10

Author(s):

Mammadov Hasan ◽

Yan Li ◽

Muhammad Waqas Ahmad

Keyword(s):

Semantic Web ◽

Data Model ◽

File Format ◽

Semantic Web Technologies ◽

Web Technologies ◽

Rdf Data

Download Full-text

A Lightweight RDF Data Model for Business Process Analysis

Lecture Notes in Business Information Processing - Data-Driven Process Discovery and Analysis ◽

10.1007/978-3-642-40919-6_1 ◽

2013 ◽

pp. 1-23 ◽

Cited By ~ 5

Author(s):

Marcello Leida ◽

Basim Majeed ◽

Maurizio Colombo ◽

Andrej Chu

Keyword(s):

Business Process ◽

Data Model ◽

Process Analysis ◽

Business Process Analysis ◽

Rdf Data

Download Full-text

A DISTRIBUTIONAL STRUCTURED SEMANTIC SPACE FOR QUERYING RDF GRAPH DATA

International Journal of Semantic Computing ◽

10.1142/s1793351x1100133x ◽

2011 ◽

Vol 05 (04) ◽

pp. 433-462 ◽

Cited By ~ 9

Author(s):

ANDRÉ FREITAS ◽

EDWARD CURRY ◽

JOÃO GABRIEL OLIVEIRA ◽

SEÁN O'RIAIN

Keyword(s):

Vector Space ◽

Data Model ◽

Linked Data ◽

Fundamental Problem ◽

Semantic Space ◽

Semantic Interpretation ◽

Rdf Graph ◽

Data Consumption ◽

Model Independent ◽

Rdf Data

The vision of creating a Linked Data Web brings together the challenge of allowing queries across highly heterogeneous and distributed datasets. In order to query Linked Data on the Web today, end users need to be aware of which datasets potentially contain the data and also which data model describes these datasets. The process of allowing users to expressively query relationships in RDF while abstracting them from the underlying data model represents a fundamental problem for Web-scale Linked Data consumption. This article introduces a distributional structured semantic space which enables data model independent natural language queries over RDF data. The center of the approach relies on the use of a distributional semantic model to address the level of semantic interpretation demanded to build the data model independent approach. The article analyzes the geometric aspects of the proposed space, providing its description as a distributional structured vector space, which is built upon the Generalized Vector Space Model (GVSM). The final semantic space proved to be flexible and precise under real-world query conditions achieving mean reciprocal rank = 0.516, avg. precision = 0.482 and avg. recall = 0.491.

Download Full-text

System II: A Native RDF Repository Based on the Hypergraph Representation for RDF Data Model

A Multidimensional Semantic Space for Data Model Independent Queries over RDF Data

Efficiently Processing and Storing Library Linked Data using Apache Spark and Parquet

System Π: A Native RDF Repository Based on the Hypergraph Representation for RDF Data Model

A novel tool for standardizing clinical data in a realism-based common data model

N3Logic: A logical framework for the World Wide Web

Mapping Spatiotemporal Data to RDF: A SPARQL Endpoint for Brussels

Towards the Representation of Etymological Data on the Semantic Web

TSV2RDF: Generating RDF Data Model from TSV File Format Using Semantic Web Technologies

A Lightweight RDF Data Model for Business Process Analysis

A DISTRIBUTIONAL STRUCTURED SEMANTIC SPACE FOR QUERYING RDF GRAPH DATA

Export Citation Format