SPARQL Query Language

2020 ◽  
pp. 323-448
Author(s):  
Aidan Hogan


2013 ◽  
Vol 441 ◽  
pp. 970-973
Author(s):  
Yan Qin Zhang ◽  
Jing Bin Wang

With the development of the Semantic Web, RDF datasets have grown rapidly, giving rise to the problem of querying massive RDF data. Using distributed techniques to execute SPARQL (SPARQL Protocol and RDF Query Language) queries is a new way of solving this problem. At present, most Hadoop-based RDF query strategies must use multiple MapReduce jobs to complete a task, which wastes time. To overcome this drawback, this paper proposes the MRQJ (using MapReduce to query and join) algorithm, which first uses a greedy strategy to generate a join plan and then creates only one MapReduce job to compute the query results during SPARQL query execution. Finally, a comparative experiment on the LUBM (Lehigh University Benchmark) dataset is conducted; its results show that the MRQJ method has a significant advantage when queries are more complicated.
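As a rough illustration of the single-job idea (not the authors' MRQJ implementation; the data, patterns, and names below are invented for the sketch), the map phase can tag each matching triple with the pattern it satisfies, keyed on the shared join variable, so that one reduce phase completes the join:

```python
# Minimal in-memory sketch of a one-pass MapReduce join over triple
# patterns that share a join variable. All names and data are hypothetical.

from collections import defaultdict
from itertools import product

# Toy RDF triples (subject, predicate, object).
TRIPLES = [
    ("alice", "advisor", "bob"),
    ("alice", "memberOf", "dept1"),
    ("carol", "advisor", "bob"),
    ("carol", "memberOf", "dept2"),
]

# Query: ?x advisor bob . ?x memberOf ?d  -- both patterns join on ?x.
PATTERNS = [
    {"id": 0, "p": "advisor", "o": "bob"},   # binds ?x from the subject
    {"id": 1, "p": "memberOf", "o": None},   # binds ?x and ?d
]

def map_phase(triple):
    """Emit (join_key, (pattern_id, bindings)) for each pattern the triple matches."""
    s, p, o = triple
    for pat in PATTERNS:
        if pat["p"] == p and (pat["o"] is None or pat["o"] == o):
            yield s, (pat["id"], {"?x": s, "?d": o} if pat["o"] is None else {"?x": s})

def reduce_phase(grouped):
    """Join all patterns that agree on the join key, in a single pass."""
    for key, values in grouped.items():
        by_pattern = defaultdict(list)
        for pid, bindings in values:
            by_pattern[pid].append(bindings)
        if len(by_pattern) == len(PATTERNS):  # every pattern matched this key
            for combo in product(*by_pattern.values()):
                merged = {}
                for b in combo:
                    merged.update(b)
                yield merged

grouped = defaultdict(list)
for t in TRIPLES:
    for key, value in map_phase(t):
        grouped[key].append(value)

for solution in reduce_phase(grouped):
    print(solution)  # e.g. {'?x': 'alice', '?d': 'dept1'}
```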


Author(s):  
Valeria Fionda ◽  
Giuseppe Pirrò

We tackle fact checking using Knowledge Graphs (KGs) as a source of background knowledge. Our approach leverages the KG schema to generate candidate evidence patterns, that is, schema-level paths that capture the semantics of a target fact in alternative ways. Patterns verified in the data are used both to assemble semantic evidence for a fact and to provide a numerical assessment of its truthfulness. We present efficient algorithms to generate and verify evidence patterns and to assemble evidence. We also provide a translation of the core of our algorithms into the SPARQL query language. Not only is our approach faster than the state of the art while offering comparable accuracy, but it can also use any SPARQL-enabled KG.
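A minimal sketch of how a candidate evidence pattern might be checked with a SPARQL ASK query, in the spirit of the approach described above (this is not the authors' code; the endpoint, entities, and the alternative predicate are illustrative assumptions):

```python
# Verify whether an alternative schema-level path connects the two
# entities of a target fact, using a SPARQL ASK query.

from SPARQLWrapper import SPARQLWrapper, JSON

def pattern_holds(endpoint_url: str, subject: str, obj: str, path: str) -> bool:
    """ASK whether the evidence pattern `path` links subject to object."""
    endpoint = SPARQLWrapper(endpoint_url)
    endpoint.setQuery(f"""
        PREFIX dbo: <http://dbpedia.org/ontology/>
        ASK {{ <{subject}> {path} <{obj}> . }}
    """)
    endpoint.setReturnFormat(JSON)
    return endpoint.query().convert()["boolean"]

# Hypothetical evidence pattern for a "director" fact: the person also
# appears as the writer of the film (a property path would work too).
holds = pattern_holds(
    "https://dbpedia.org/sparql",
    "http://dbpedia.org/resource/Inception",
    "http://dbpedia.org/resource/Christopher_Nolan",
    "dbo:writer",  # alternative predicate supporting the target fact
)
print(holds)
```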


Author(s):  
Michael Fellmann ◽  
Oliver Thomas ◽  
Frank Hogrebe

This chapter presents an ontology-driven approach that aims at supporting the semantic verification of semi-formal process models. The suggested approach consists of two steps. The first step is the development of a model for the ontology-based representation of process models. This representation allows process models to be enriched by annotating them with semantics specified in a formal ontology. In the second step, the authors use this model to support ontology-based semantic verification, which the representation makes possible in conjunction with machine reasoning. To implement the approach, the authors use the standardized Web Ontology Language (OWL) and the SPARQL query language. They demonstrate the approach using real-life administrative process models taken from a capital city.
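To make the idea concrete, here is a hedged sketch of such a verification check using rdflib: a SPARQL query flags annotated activities that violate a constraint. The vocabulary and the constraint are invented for illustration; the chapter's actual ontology and rules differ.

```python
# Ontology-based verification sketch: every archiving activity must
# follow an approval activity. Vocabulary is hypothetical.

from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/process#")

g = Graph()
# Two annotated activities: one archives after an approval, one archives
# without any prior approval (the modeling error to detect).
g.add((EX.approve, RDF.type, EX.ApprovalActivity))
g.add((EX.archive, RDF.type, EX.ArchivingActivity))
g.add((EX.archive, EX.follows, EX.approve))
g.add((EX.archive2, RDF.type, EX.ArchivingActivity))

violations = g.query("""
    PREFIX ex: <http://example.org/process#>
    SELECT ?activity WHERE {
        ?activity a ex:ArchivingActivity .
        FILTER NOT EXISTS {
            ?activity ex:follows ?prior .
            ?prior a ex:ApprovalActivity .
        }
    }
""")
for row in violations:
    print("violates the approval constraint:", row.activity)
```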


2018 ◽  
Vol 8 (1) ◽  
pp. 18-37 ◽  
Author(s):  
Median Hilal ◽  
Christoph G. Schuetz ◽  
Michael Schrefl

The foundations for traditional data analysis are Online Analytical Processing (OLAP) systems that operate on multidimensional (MD) data. The Resource Description Framework (RDF) serves as the foundation for the publication of a growing amount of semantic web data still largely untapped by companies for data analysis. Most RDF data sources, however, do not correspond to the MD modeling paradigm and, as a consequence, elude traditional OLAP. The complexity of RDF data in terms of structure, semantics, and query languages renders RDF data analysis challenging for a typical analyst not familiar with the underlying data model or the SPARQL query language. Hence, conducting RDF data analysis is not a straightforward task. We propose an approach for the definition of superimposed MD schemas over arbitrary RDF datasets and show how to represent the superimposed MD schemas using well-known semantic web technologies. On top of that, we introduce OLAP patterns for RDF data analysis, which are recurring, domain-independent elements of data analysis. Analysts may compose queries by instantiating a pattern using only the MD concepts and business terms. Upon pattern instantiation, the corresponding SPARQL query over the source data can be automatically generated, sparing analysts from technical details and fostering self-service capabilities.
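A minimal sketch of the pattern-instantiation idea: an analyst fills a domain-independent "aggregate a measure by a dimension" pattern with business terms, and a SPARQL query is generated from it. The RDF Data Cube vocabulary and the schema IRIs below are illustrative assumptions; the paper's superimposed MD schemas are richer than this.

```python
# Generate a SPARQL aggregation query from an OLAP pattern template.

AGGREGATION_PATTERN = """
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT ?{dimension} (SUM(?value) AS ?total)
WHERE {{
    ?obs a qb:Observation ;
         <{dimension_prop}> ?{dimension} ;
         <{measure_prop}> ?value .
}}
GROUP BY ?{dimension}
"""

def instantiate(dimension: str, dimension_prop: str, measure_prop: str) -> str:
    """Fill the OLAP pattern with the MD concepts chosen by the analyst."""
    return AGGREGATION_PATTERN.format(
        dimension=dimension,
        dimension_prop=dimension_prop,
        measure_prop=measure_prop,
    )

# Hypothetical business terms: total sales per region.
print(instantiate(
    "region",
    "http://example.org/schema#region",
    "http://example.org/schema#sales",
))
```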


Semantic Web ◽  
2021 ◽  
pp. 1-19
Author(s):  
Marilena Daquino ◽  
Ivan Heibi ◽  
Silvio Peroni ◽  
David Shotton

Semantic Web technologies are widely used for storing RDF data and making them available on the Web through SPARQL endpoints, queryable using the SPARQL query language. While the use of SPARQL endpoints is strongly supported by Semantic Web experts, it hinders broader use of RDF data by common Web users, engineers and developers unfamiliar with Semantic Web technologies, who normally rely on Web RESTful APIs for querying Web-available data and creating applications over them. To solve this problem, we have developed RAMOSE, a generic tool written in Python for creating REST APIs over SPARQL endpoints. Through the creation of source-specific textual configuration files, RAMOSE enables the querying of SPARQL endpoints via simple Web RESTful API calls that return either JSON- or CSV-formatted data, thus hiding all the intrinsic complexities of SPARQL and RDF from common Web users. We provide evidence that using RAMOSE to provide REST API access to the RDF data within OpenCitations triplestores is beneficial in terms of the number of queries made by external users of such RDF data through the RAMOSE API, compared with direct access via the SPARQL endpoint. Our findings show how important it is for suppliers of RDF data to offer an alternative API access service that enables use by those with little or no experience in Semantic Web technologies and the SPARQL query language. RAMOSE can be used both to query any SPARQL endpoint and to query any other Web API, and it thus represents an easy, generic technical solution for service providers who wish to create an API service for accessing Linked Data stored as RDF in a triplestore.
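A hedged sketch of what this looks like from the API consumer's side: a plain HTTP GET instead of authoring SPARQL. The API host and route below are hypothetical stand-ins for a RAMOSE-configured service, not an actual OpenCitations URL.

```python
# One templated REST call; a RAMOSE-style service translates it into a
# SPARQL query against the underlying endpoint and returns JSON (or CSV).

import requests

response = requests.get(
    "https://api.example.org/ramose/v1/citations/10.1000/xyz123",  # hypothetical
    headers={"Accept": "application/json"},
    timeout=30,
)
response.raise_for_status()

for record in response.json():
    print(record)
```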


Author(s):  
Christian Bizer ◽  
Andreas Schultz

The SPARQL Query Language for RDF and the SPARQL Protocol for RDF are implemented by a growing number of storage systems and are used within enterprise and open Web settings. As SPARQL is taken up by the community, there is a growing need for benchmarks to compare the performance of storage systems that expose SPARQL endpoints via the SPARQL protocol. Such systems include native RDF stores as well as systems that rewrite SPARQL queries to SQL queries against non-RDF relational databases. This article introduces the Berlin SPARQL Benchmark (BSBM) for comparing the performance of native RDF stores with the performance of SPARQL-to-SQL rewriters across architectures. The benchmark is built around an e-commerce use case in which a set of products is offered by different vendors and consumers have posted reviews about products. The benchmark query mix emulates the search and navigation pattern of a consumer looking for a product. The article discusses the design of the BSBM benchmark and presents the results of a benchmark experiment comparing the performance of four popular RDF stores (Sesame, Virtuoso, Jena TDB, and Jena SDB) with the performance of two SPARQL-to-SQL rewriters (D2R Server and Virtuoso RDF Views) as well as the performance of two relational database management systems (MySQL and Virtuoso RDBMS).
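For flavor, here is an illustrative product-search query in the spirit of the BSBM e-commerce query mix (this is not one of the benchmark's official query templates; the vocabulary is invented for the sketch):

```python
# A consumer-style product search: products with a given feature,
# filtered by price and ordered cheapest first.

PRODUCT_SEARCH = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://example.org/bsbm-like#>

SELECT ?product ?label ?price
WHERE {
    ?product rdfs:label ?label ;
             ex:hasFeature ex:Feature42 ;
             ex:price ?price .
    FILTER (?price < 100)
}
ORDER BY ?price
LIMIT 10
"""
print(PRODUCT_SEARCH)
```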


Author(s):  
Maarten Trekels ◽  
Matt Woodburn ◽  
Deborah L Paul ◽  
Sharon Grant ◽  
Kate Webbink ◽  
...  

Data standards allow us to aggregate, compare, compute and communicate data from a wide variety of origins. However, for historical reasons, data are most likely to be stored in many different formats and conform to different models. Every data set might contain a huge amount of information, but it becomes tremendously difficult to compare data sets without a common way to represent the data. That is where standards development comes in. Developing a standard is a formidable process, often involving many stakeholders. Typically the initial blueprint of a standard is created by a limited number of people who have a clear view of their use cases. However, as development continues, additional stakeholders participate in the process. As a result, conflicting opinions and interests influence the development of the standard. Compromises need to be made, and the standard might end up looking very different from the initial concept. In order to address the needs of the community, a high level of engagement in the development process is encouraged. However, this does not necessarily increase the usability of the standard. To mitigate this, there is a need to test the standard during the early stages of development.

To facilitate such testing, we explored the use of Wikibase to create an initial implementation of the standard. Wikibase is the underlying technology that drives Wikidata. The software is open source and can be customized for creating collaborative knowledge bases. In addition to containing an RDF (Resource Description Framework) triple store under the hood, it provides users with an easy-to-use graphical user interface (see Fig. 1), which makes an implementation of a standard usable by non-technical users. The Wikibase remains fully flexible in the way data are represented, and no data model is enforced, allowing users to map their data onto the standard without any restrictions. Information can be retrieved from the RDF data through the SPARQL query language (W3C 2020). The software package also has a built-in SPARQL endpoint, allowing users to extract the relevant information: Does the standard cover all use cases envisioned? Are parts of the standard underdeveloped? Are the controlled vocabularies sufficient to describe the data?

This strategy was applied during the development of the TDWG Collection Description standard. After completing a rough version of the standard, the terms defined in the first version were transferred to a Wikibase instance running on WBStack (Addshore 2020). Initially, collection data were entered manually, which revealed several issues. The Wikibase allowed us to easily define controlled vocabularies and expand them as needed. The feedback reported by users then flowed back into the further development of the standard. Currently we envisage creating automated scripts that will import data en masse from collections. Using the SPARQL query interface, it will then be straightforward to ensure that data can be extracted from the Wikibase to support the envisaged use cases.
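A hedged sketch of the kind of coverage check described above: ask the Wikibase SPARQL endpoint how often each property (that is, each standard term) is actually used. The endpoint URL is a hypothetical WBStack-style instance, not the project's real one.

```python
# Count how many items use each Wikibase property; rarely used terms may
# point at underdeveloped parts of the standard.

import requests

QUERY = """
SELECT ?property (COUNT(?item) AS ?uses)
WHERE {
    ?item ?property ?value .
    FILTER (STRSTARTS(STR(?property), "https://example.wbstack.com/prop/direct/"))
}
GROUP BY ?property
ORDER BY ASC(?uses)
"""

response = requests.get(
    "https://example.wbstack.com/query/sparql",  # hypothetical endpoint
    params={"query": QUERY, "format": "json"},
    timeout=60,
)
response.raise_for_status()

for binding in response.json()["results"]["bindings"]:
    print(binding["property"]["value"], binding["uses"]["value"])
```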


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e4201 ◽  
Author(s):  
Alexander Garcia ◽  
Federico Lopez ◽  
Leyla Garcia ◽  
Olga Giraldo ◽  
Victor Bucheli ◽  
...  

A significant portion of biomedical literature is represented in a manner that makes it difficult for consumers to find or aggregate content through a computational query. One approach to facilitate reuse of the scientific literature is to structure this information as linked data using standardized web technologies. In this paper we present the second version of Biotea, a semantic, linked data version of the open-access subset of PubMed Central that has been enhanced with specialized annotation pipelines that use existing infrastructure from the National Center for Biomedical Ontology. We expose our models, services, software and datasets. Our infrastructure enables manual and semi-automatic annotation; the resulting data are represented as RDF-based linked data and can be readily queried using the SPARQL query language. We illustrate the utility of our system with several use cases. Our datasets, methods and techniques are available at http://biotea.github.io.
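A minimal sketch of querying such linked-data article metadata with SPARQLWrapper. The endpoint URL and the vocabulary term are assumptions for illustration; consult http://biotea.github.io for the actual datasets and models.

```python
# Retrieve article titles from a Biotea-style RDF dataset via SPARQL.

from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://example.org/biotea/sparql")  # hypothetical endpoint
endpoint.setQuery("""
    PREFIX dcterms: <http://purl.org/dc/terms/>
    SELECT ?article ?title
    WHERE {
        ?article dcterms:title ?title .
    }
    LIMIT 10
""")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["article"]["value"], "-", row["title"]["value"])
```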

