Using superimposed multidimensional schemas and OLAP patterns for RDF data analysis

2018 ◽  
Vol 8 (1) ◽  
pp. 18-37 ◽  
Author(s):  
Median Hilal ◽  
Christoph G. Schuetz ◽  
Michael Schrefl

The foundations for traditional data analysis are Online Analytical Processing (OLAP) systems that operate on multidimensional (MD) data. The Resource Description Framework (RDF) serves as the foundation for the publication of a growing amount of semantic web data still largely untapped by companies for data analysis. Most RDF data sources, however, do not correspond to the MD modeling paradigm and, as a consequence, elude traditional OLAP. The complexity of RDF data in terms of structure, semantics, and query languages renders RDF data analysis challenging for a typical analyst not familiar with the underlying data model or the SPARQL query language. Hence, conducting RDF data analysis is not a straightforward task. We propose an approach for the definition of superimposed MD schemas over arbitrary RDF datasets and show how to represent the superimposed MD schemas using well-known semantic web technologies. On top of that, we introduce OLAP patterns for RDF data analysis, which are recurring, domain-independent elements of data analysis. Analysts may compose queries by instantiating a pattern using only the MD concepts and business terms. Upon pattern instantiation, the corresponding SPARQL query over the source data can be automatically generated, sparing analysts from technical details and fostering self-service capabilities.
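The idea of instantiating a pattern with business terms and generating SPARQL from it can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ex:` vocabulary, the `SCHEMA` dictionary, the `PATTERN` template, and the `instantiate` function are hypothetical and are not the representation or pattern language used by the authors.

```python
from string import Template

# Hypothetical superimposed MD schema: business terms mapped to RDF terms.
# The ex: vocabulary is invented for illustration only.
SCHEMA = {
    "fact_class": "ex:Sale",
    "measures": {"revenue": "ex:amount"},
    "levels": {"country": "ex:soldIn/ex:locatedIn"},  # SPARQL property path
}

ALLOWED_AGGREGATES = {"SUM", "AVG", "MIN", "MAX", "COUNT"}

# One illustrative pattern: aggregate a measure grouped by a dimension level.
PATTERN = Template(
    "PREFIX ex: <http://example.org/>\n"
    "SELECT ?$level ($agg(?m) AS ?$measure)\n"
    "WHERE {\n"
    "  ?fact a $fact_class ;\n"
    "        $measure_path ?m ;\n"
    "        $level_path ?$level .\n"
    "}\n"
    "GROUP BY ?$level"
)


def instantiate(measure, agg, level, schema=SCHEMA):
    """Turn business terms into a SPARQL string using the schema mapping."""
    if agg not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {agg}")
    return PATTERN.substitute(
        level=level,
        agg=agg,
        measure=measure,
        fact_class=schema["fact_class"],
        measure_path=schema["measures"][measure],  # KeyError if unknown term
        level_path=schema["levels"][level],
    )


if __name__ == "__main__":
    print(instantiate("revenue", "SUM", "country"))
```

The analyst-facing call names only `revenue`, `SUM`, and `country`; the property paths stay inside the schema mapping, which is the separation the abstract describes.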

Semantic Web ◽  
2021 ◽  
pp. 1-19
Author(s):  
Marilena Daquino ◽  
Ivan Heibi ◽  
Silvio Peroni ◽  
David Shotton

Semantic Web technologies are widely used for storing RDF data and making them available on the Web through SPARQL endpoints, queryable using the SPARQL query language. While the use of SPARQL endpoints is strongly supported by Semantic Web experts, it hinders broader use of RDF data by common Web users, engineers and developers unfamiliar with Semantic Web technologies, who normally rely on Web RESTful APIs for querying Web-available data and creating applications over them. To solve this problem, we have developed RAMOSE, a generic tool written in Python to create REST APIs over SPARQL endpoints. Through the creation of source-specific textual configuration files, RAMOSE enables the querying of SPARQL endpoints via simple Web RESTful API calls that return either JSON or CSV-formatted data, thus hiding all the intrinsic complexities of SPARQL and RDF from common Web users. We provide evidence that the use of RAMOSE to provide REST API access to RDF data within OpenCitations triplestores is beneficial in terms of the number of queries made by external users of such RDF data using the RAMOSE API, compared with the direct access via the SPARQL endpoint. Our findings show the importance for suppliers of RDF data of having an alternative API access service, which enables its use by those with no (or little) experience in Semantic Web technologies and the SPARQL query language. RAMOSE can be used both to query any SPARQL endpoint and to query any other Web API, and thus it represents an easy generic technical solution for service providers who wish to create an API service to access Linked Data stored as RDF in a triplestore.
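The general mechanism of a REST layer over a SPARQL endpoint can be sketched as follows. This is not RAMOSE's configuration format or code: the `ROUTES` table, the `http://example.org/cites` predicate, and both function names are hypothetical. The only external convention relied on is the standard SPARQL 1.1 Query Results JSON layout (`head.vars` and `results.bindings`). No network call is made; the sketch covers only the two translation steps.

```python
import csv
import io
import re

# Hypothetical route table: URL pattern -> SPARQL template.
# The predicate IRI is a placeholder, not a real vocabulary term.
ROUTES = {
    r"^/citations/(?P<doi>10\.[0-9]+/[A-Za-z0-9._()/-]+)$": (
        "SELECT ?citing WHERE {{ ?citing <http://example.org/cites> "
        "<http://dx.doi.org/{doi}> }}"
    ),
}


def route_to_sparql(path):
    """Map a REST-style path to the SPARQL query it stands for."""
    for pattern, template in ROUTES.items():
        match = re.match(pattern, path)
        if match:
            return template.format(**match.groupdict())
    raise LookupError(f"no route for {path}")


def bindings_to_csv(result):
    """Flatten a SPARQL 1.1 JSON result document into CSV text."""
    variables = result["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(variables)
    for binding in result["results"]["bindings"]:
        # Unbound variables are simply absent from a binding.
        writer.writerow([binding.get(v, {}).get("value", "") for v in variables])
    return buffer.getvalue()


if __name__ == "__main__":
    print(route_to_sparql("/citations/10.1000/xyz"))
```

Restricting the path parameter with a regular expression before it is interpolated keeps arbitrary text out of the generated query, which matters for any layer of this kind.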


2004 ◽  
Vol 1 (2) ◽  
pp. 127-151 ◽  
Author(s):  
Dragan Gasevic

This paper presents the Petri net ontology as the most important element in providing Petri net support for the Semantic Web. The available formal descriptions of Petri nets are metamodels, UML profiles, ontologies and syntax. Metamodels are useful, but their main purpose is for Petri net tools. Although the current Petri-net community effort, the Petri Net Markup Language (PNML), is XML-based, it lacks a precise definition of semantics. Existing Petri net ontologies are partial solutions specialized for a specific problem. To show current Petri net model sharing features, we use the P3 tool, which uses a PNML/XSLT-based approach for model sharing. This paper suggests developing the Petri net ontology to represent semantics appropriately. This Petri net ontology is described using UML, the Resource Description Framework (Schema), RDF(S), and the Web Ontology Language (OWL).


2008 ◽  
pp. 3309-3320
Author(s):  
Csilla Farkas

This chapter investigates the threat of unwanted Semantic Web inferences. We survey the current efforts to detect and remove unwanted inferences, identify research gaps, and recommend future research directions. We begin with a brief overview of Semantic Web technologies and reasoning methods, followed by a description of the inference problem in traditional databases. In the context of the Semantic Web, we study two types of inferences: (1) entailments defined by the formal semantics of the Resource Description Framework (RDF) and the RDF Schema (RDFS) and (2) inferences supported by semantic languages like the Web Ontology Language (OWL). We compare the Semantic Web inferences to the inferences studied in traditional databases. We show that the inference problem exists on the Semantic Web and that existing security methods do not fully prevent indirect data disclosure via inference channels.
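The first kind of inference mentioned above, RDFS entailment, can be made concrete with a small forward-chaining sketch. It applies only two standard RDFS rules (rdfs9, type propagation along `rdfs:subClassOf`, and rdfs11, transitivity of `rdfs:subClassOf`); the `ex:` data and the function name are made up for illustration, and the chapter's own detection and removal methods are not reproduced here.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Return the input triples plus everything rdfs9 and rdfs11 entail."""
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            if p != SUBCLASS:
                continue
            for (s2, p2, o2) in closure:
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))   # rdfs9
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))   # rdfs11
        if new <= closure:  # fixpoint reached: nothing new was derived
            return closure
        closure |= new


# Hypothetical published data: the triple (ex:alice rdf:type ex:Patient)
# was deliberately withheld, yet it follows from what was published.
PUBLISHED = {
    ("ex:alice", RDF_TYPE, "ex:OncologyPatient"),
    ("ex:OncologyPatient", SUBCLASS, "ex:Patient"),
    ("ex:Patient", SUBCLASS, "ex:Person"),
}

if __name__ == "__main__":
    for triple in sorted(rdfs_closure(PUBLISHED) - PUBLISHED):
        print(triple)
```

Withholding a single triple is therefore not enough when the published schema and instance data together entail it, which is the indirect disclosure the chapter is concerned with.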


Author(s):  
Dimitrios Tektonidis ◽  
Albert Bokma

Integrating enterprise systems has become an issue of sharing information rather than transforming information, owing to the increasing complexity and heterogeneity of the applications. The transition from application-centric to integration-centric enterprise application integration (EAI) requires methods and technologies that enable and facilitate the definition of shared information. The use of ontologies and Semantic Web technologies can improve existing EAI methods by providing a framework capable of defining shared information. The Ontologies-based Enterprise Application Integration (ONAR) framework utilizes Semantic Web technologies to define shared information among heterogeneous systems. This paper presents the use of ontologies in forming the ONAR framework and its application to service-oriented application integration (SOAI).


2013 ◽  
Vol 441 ◽  
pp. 970-973
Author(s):  
Yan Qin Zhang ◽  
Jing Bin Wang

With the development of the semantic web, RDF data sets have grown rapidly, making queries over massive RDF data a problem. Using distributed techniques to execute SPARQL (SPARQL Protocol and RDF Query Language) queries is a new way of solving the large-scale RDF query problem. At present, most Hadoop-based RDF query strategies need multiple MapReduce jobs to complete a query, which wastes time. To overcome this drawback, the paper proposes the MRQJ (using MapReduce to query and join) algorithm, which first uses a greedy strategy to generate a join plan, so that only one MapReduce job needs to be created to obtain the results of a SPARQL query. Finally, a comparative experiment on the LUBM (Lehigh University Benchmark) test data set shows that the MRQJ method has a great advantage when the query is more complicated.
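The abstract does not spell out the greedy strategy, so the following is only a generic sketch of one plausible greedy heuristic, not the MRQJ algorithm itself: at each step, pick the variable shared by the most remaining triple patterns and group those patterns into one join step. The function names and the example query (written with LUBM-style `ub:` terms) are assumptions.

```python
import re
from collections import Counter


def variables(triple_pattern):
    """Extract the SPARQL variables (?name) used in one triple pattern."""
    return set(re.findall(r"\?\w+", triple_pattern))


def greedy_join_plan(patterns):
    """Group triple patterns into join steps, most-shared variable first."""
    remaining = list(patterns)
    plan = []
    while remaining:
        # Count, for each variable, how many remaining patterns mention it.
        counts = Counter(v for tp in remaining for v in variables(tp))
        if not counts:  # only variable-free patterns are left
            plan.append((None, remaining))
            break
        # Ties are broken alphabetically so the plan is deterministic.
        join_var = max(sorted(counts), key=counts.get)
        group = [tp for tp in remaining if join_var in variables(tp)]
        plan.append((join_var, group))
        remaining = [tp for tp in remaining if tp not in group]
    return plan


QUERY = [
    "?x rdf:type ub:Student",
    "?x ub:takesCourse ?c",
    "?p ub:teacherOf ?c",
    "?x ub:advisor ?p",
]

if __name__ == "__main__":
    for join_var, group in greedy_join_plan(QUERY):
        print(join_var, group)
```

For `QUERY`, three patterns share `?x` and form the first step; the remaining pattern is attached in a second step. How such a plan is then packed into a single MapReduce job is specific to the paper and is not sketched here.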


2020 ◽  
Vol 1 (1) ◽  
pp. 428-444 ◽  
Author(s):  
Silvio Peroni ◽  
David Shotton

OpenCitations is an infrastructure organization for open scholarship dedicated to the publication of open citation data as Linked Open Data using Semantic Web technologies, thereby providing a disruptive alternative to traditional proprietary citation indexes. Open citation data are valuable for bibliometric analysis, increasing the reproducibility of large-scale analyses by enabling publication of the source data. Following brief introductions to the development and benefits of open scholarship and to Semantic Web technologies, this paper describes OpenCitations and its data sets, tools, services, and activities. These include the OpenCitations Data Model; the SPAR (Semantic Publishing and Referencing) Ontologies; OpenCitations’ open software of generic applicability for searching, browsing, and providing REST APIs over resource description framework (RDF) triplestores; Open Citation Identifiers (OCIs) and the OpenCitations OCI Resolution Service; the OpenCitations Corpus (OCC), a database of open downloadable bibliographic and citation data made available in RDF under a Creative Commons public domain dedication; and the OpenCitations Indexes of open citation data, of which the first and largest is COCI, the OpenCitations Index of Crossref Open DOI-to-DOI Citations, which currently contains over 624 million bibliographic citations and is receiving considerable usage by the scholarly community.


2010 ◽  
Vol 04 (04) ◽  
pp. 423-451 ◽  
Author(s):  
SUNITHA RAMANUJAM ◽  
VAIBHAV KHADILKAR ◽  
LATIFUR KHAN ◽  
MURAT KANTARCIOGLU ◽  
BHAVANI THURAISINGHAM ◽  
...  

The current buzzword in the Internet community is the Semantic Web initiative proposed by the W3C to yield a Web that is more flexible and self-adapting. However, for the Semantic Web initiative to become a reality, heterogeneous data sources need to be integrated in order to enable access to them in a homogeneous manner. Since a vast majority of data currently resides in relational databases, integrating relational data sources with semantic web technologies is at the top of the list of activities required to realize the semantic web vision. Several efforts exist that publish relational data as Resource Description Framework (RDF) triples; however almost all current work in this arena is uni-directional, presenting data from an underlying relational database into a corresponding virtual RDF store in a read-only manner. An enhancement over previous relational-to-RDF bridging work in the form of bi-directionality support is presented in this paper. The bi-directional bridge proposed here allows RDF data updates specified as triples to be propagated back into the underlying relational database as tuples. Towards this end, we present various algorithms to translate the triples to be updated/inserted/deleted into equivalent relational attributes/tuples whenever possible. Particular emphasis is laid, in this paper, on the translation and update propagation process for triples containing blank nodes and reification nodes, and a platform enhanced with our algorithms, called D2RQ++, through which bi-directional translation can be achieved, is presented.
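The core direction of the bridge, turning an RDF triple update into a relational tuple update, can be sketched for the simplest case. The mapping table, IRI scheme, table name, and function names below are hypothetical; this is not the D2RQ++ algorithm, and it handles neither blank nodes nor reification, which are the cases the paper emphasizes.

```python
import re
import sqlite3

# Hypothetical mapping: predicate IRI -> (table, column).
PREDICATE_TO_COLUMN = {
    "http://example.org/vocab/name": ("employee", "name"),
}
# Hypothetical subject IRI scheme that embeds the primary key.
SUBJECT_PATTERN = re.compile(r"^http://example\.org/employee/(\d+)$")


def triple_to_update(subject, predicate, obj):
    """Translate one (s, p, o) update into a parameterized SQL UPDATE."""
    table, column = PREDICATE_TO_COLUMN[predicate]  # KeyError if unmapped
    match = SUBJECT_PATTERN.match(subject)
    if match is None:
        raise ValueError(f"subject does not map to a row: {subject}")
    # Table and column names come only from the trusted mapping above;
    # the values travel as bound parameters.
    sql = f"UPDATE {table} SET {column} = ? WHERE id = ?"
    return sql, (obj, int(match.group(1)))


def demo():
    """Apply one translated triple to an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO employee VALUES (7, 'old name')")
    sql, params = triple_to_update(
        "http://example.org/employee/7",
        "http://example.org/vocab/name",
        "Ada",
    )
    conn.execute(sql, params)
    return conn.execute("SELECT name FROM employee WHERE id = 7").fetchone()[0]


if __name__ == "__main__":
    print(demo())
```

A triple whose subject or predicate falls outside the mapping raises an error instead of being silently dropped, in line with the abstract's "whenever possible" qualifier.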


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 1822 ◽  
Author(s):  
Ana Claudia Sima ◽  
Christophe Dessimoz ◽  
Kurt Stockinger ◽  
Monique Zahn-Zabal ◽  
Tarcisio Mendes de Farias

The increasing use of Semantic Web technologies in the life sciences, in particular the use of the Resource Description Framework (RDF) and the RDF query language SPARQL, opens the path for novel integrative analyses, combining information from multiple sources. However, analyzing evolutionary data in RDF is not trivial, due to the steep learning curve required to understand both the data models adopted by different RDF data sources and the SPARQL query language. In this article, we provide a hands-on introduction to querying evolutionary data across multiple sources that publish orthology information in RDF, namely: the Orthologous MAtrix (OMA), the European Bioinformatics Institute (EBI) RDF platform, the Database of Orthologous Groups (OrthoDB) and the Microbial Genome Database (MBGD). We present four protocols in increasing order of complexity. In these protocols, we demonstrate through SPARQL queries how to retrieve pairwise orthologs, homologous groups, and hierarchical orthologous groups. Finally, we show how orthology information in different sources can be compared, through the use of federated SPARQL queries.
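The shape of a federated query that compares sources can be sketched with the SPARQL 1.1 `SERVICE` keyword. The endpoint URLs and the `orthologOf` predicate below are placeholders, not the actual endpoints or vocabularies of OMA, EBI, OrthoDB or MBGD, and the helper function is hypothetical; the article's own protocols should be consulted for real queries.

```python
def federated_query(endpoints, pattern, variables):
    """Build one SPARQL query that runs `pattern` at each endpoint via SERVICE.

    `endpoints` maps a short source label to an endpoint URL; each result row
    is tagged with the label in ?source so the sources can be compared.
    """
    branches = []
    for label, url in endpoints.items():
        branches.append(
            "  {\n"
            f"    SERVICE <{url}> {{ {pattern} }}\n"
            f'    BIND("{label}" AS ?source)\n'
            "  }"
        )
    select = " ".join(["?source"] + list(variables))
    body = "\n  UNION\n".join(branches)
    return f"SELECT {select} WHERE {{\n{body}\n}}"


if __name__ == "__main__":
    # Placeholder endpoints and predicate, for illustration only.
    print(
        federated_query(
            {"A": "http://a.example/sparql", "B": "http://b.example/sparql"},
            "?g1 <http://example.org/orthologOf> ?g2 .",
            ["?g1", "?g2"],
        )
    )
```

In practice each source models orthology differently, so the pattern inside each `SERVICE` block would usually differ per endpoint; a single shared pattern is used here only to keep the sketch short.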

