A Subject Partitioning Based SPARQL Query Engine and Its NoSQL Implementation

Abstract: Ontology-mediated querying (OMQ) is a query answering paradigm in which users query not only the records stored in the database but also implicit information inferred from an ontology. A key challenge in OMQ is that the implicit information may be infinite, so it cannot be stored in the database and queried by an off-the-shelf query engine. The commonly adopted technique for dealing with infinite entailments is query rewriting, which, however, incurs the cost of rewriting queries at runtime. In this work, a partial materialization method is proposed that ensures the extension is always finite. Partial materialization does not rewrite the query; instead, it computes part of the consequences entailed by the ontology before the online query. In addition, a query analysis algorithm is designed to ensure the completeness of answering rooted and Boolean conjunctive queries over the partial materialization. We also extend our method, soundly but incompletely, to support the highly expressive ontology language OWL 2 DL. Finally, we further optimize materialization efficiency with a role rewriting algorithm and implement our approach as a prototype system, SUMA, by integrating an efficient off-the-shelf SPARQL query engine. The experiments show that SUMA is complete on each test ontology and each test query, which matches Pellet and outperforms PAGOdA. Besides, SUMA is highly scalable on large datasets.
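To make the finiteness problem concrete, the following toy sketch applies a single existential rule ("every Person has a parent who is a Person") only up to a fixed depth. The depth bound is a simplification chosen for illustration and is not claimed to be SUMA's actual materialization strategy; the function name `partially_materialize`, the `ex:` vocabulary, and the blank-node naming are all assumptions.

```python
from itertools import count


def partially_materialize(persons, max_depth):
    """Apply 'Person implies hasParent some Person' at most max_depth times.

    Hypothetical illustration: unrestricted application of this rule would
    create an endless chain of anonymous parents, so the expansion is cut
    off after max_depth levels to keep the stored extension finite.
    """
    facts = {(p, "rdf:type", "ex:Person") for p in persons}
    fresh = count(1)          # generator of fresh blank-node identifiers
    frontier = list(persons)  # individuals whose parent is not yet asserted
    for _ in range(max_depth):
        next_frontier = []
        for individual in frontier:
            parent = f"_:b{next(fresh)}"
            facts.add((individual, "ex:hasParent", parent))
            facts.add((parent, "rdf:type", "ex:Person"))
            next_frontier.append(parent)
        frontier = next_frontier
    return facts


facts = partially_materialize(["ex:alice"], max_depth=3)
print(len(facts))
```

A query whose pattern reaches further than the chosen depth could miss answers in such a truncated store, which is the kind of gap the query analysis step described in the abstract is meant to rule out.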

DESERT: A Continuous SPARQL Query Engine for On-Demand Query Answering

International Journal of Semantic Computing ◽

10.1142/s1793351x18400172 ◽

2018 ◽

Vol 12 (03) ◽

pp. 373-397 ◽

Cited By ~ 1

Author(s):

Farah Karim ◽

Ioanna Lytra ◽

Christian Mader ◽

Sören Auer ◽

Maria-Esther Vidal

Keyword(s):

Query Processing ◽

Window Size ◽

SPARQL Query ◽

Knowledge Graph ◽

Stream Data ◽

Industrial Manufacturing ◽

Query Engine ◽

Novel Method ◽

Continuous Query Processing ◽

IoT Devices

The Internet of Things (IoT) has been rapidly adopted in many domains, ranging from household appliances, e.g., ventilation, lighting, and heating, to industrial manufacturing and transport networks. Despite the enormous benefits of optimization, monitoring, and maintenance rendered by IoT devices, an ample amount of data is generated continuously. Semantically describing IoT-generated data using ontologies enables a precise interpretation of this data. However, ontology-based descriptions tremendously increase the size of IoT data and, in the presence of repeated sensor measurements, a large amount of the data consists of duplicates that do not contribute new insights during query processing or IoT data analytics. To ensure that only the required ontology-based descriptions are generated, we devise a knowledge-driven approach named DESERT that is able to on-Demand factorizE and Semantically Enrich stReam daTa. DESERT resorts to a knowledge graph to describe IoT stream data; it utilizes only the data that is required to answer an input continuous SPARQL query and applies a novel method of data factorization to reduce duplicated measurements in the knowledge graph. The performance of DESERT is empirically studied on a collection of continuous SPARQL queries from SRBench, a benchmark of IoT stream data and continuous SPARQL queries. Furthermore, data streams with various combinations of uniform and varying data stream speeds and streaming window sizes are considered in the study. Experimental results suggest that DESERT is capable of speeding up continuous query processing while creating knowledge graphs that include no replications.
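The idea of factorizing repeated measurements can be illustrated with a small sketch. It is not DESERT's actual algorithm: it simply groups the observations in one window that share the same sensor, property, and value, describes that combination once, and links each observation to the shared description. The `ex:` vocabulary and the function names are hypothetical.

```python
from collections import defaultdict


def describe_naively(observations):
    """Emit a full semantic description for every single observation."""
    triples = []
    for obs_id, sensor, prop, value in observations:
        triples.append((obs_id, "ex:observedBy", sensor))
        triples.append((obs_id, "ex:property", prop))
        triples.append((obs_id, "ex:value", value))
    return triples


def factorize(observations):
    """Describe each distinct (sensor, property, value) combination once.

    Observations that repeat the same measurement only add one link triple
    pointing at the shared description (assumed vocabulary, for illustration).
    """
    groups = defaultdict(list)
    for obs_id, sensor, prop, value in observations:
        groups[(sensor, prop, value)].append(obs_id)

    triples = []
    for i, ((sensor, prop, value), obs_ids) in enumerate(groups.items()):
        measurement = f"ex:measurement{i}"
        triples.append((measurement, "ex:observedBy", sensor))
        triples.append((measurement, "ex:property", prop))
        triples.append((measurement, "ex:value", value))
        for obs_id in obs_ids:
            triples.append((obs_id, "ex:hasMeasurement", measurement))
    return triples


# One stream window in which four of five readings repeat the same value.
window = [
    ("ex:o1", "ex:sensor1", "ex:temperature", "21.5"),
    ("ex:o2", "ex:sensor1", "ex:temperature", "21.5"),
    ("ex:o3", "ex:sensor1", "ex:temperature", "21.5"),
    ("ex:o4", "ex:sensor1", "ex:temperature", "22.0"),
    ("ex:o5", "ex:sensor1", "ex:temperature", "21.5"),
]
print(len(describe_naively(window)), len(factorize(window)))
```

The saving grows with the number of description triples per measurement and with the share of repeated readings in the window.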

Representing and querying multiple ontologies with Contextual Logic Programming

Computer Science and Information Systems ◽

10.2298/csis0802039l ◽

2008 ◽

Vol 5 (2) ◽

pp. 39-62

Author(s):

Nuno Lopes ◽

Cláudio Fernandes ◽

Salvador Abreu

Keyword(s):

Logic Programming ◽

Semantic Integration ◽

SPARQL Query ◽

Multiple Sources ◽

Data Repositories ◽

Query Engine ◽

Multiple Ontologies

The system presented in this paper uses Contextual Logic Programming as a computational hub for representing and reasoning over knowledge modeled by web ontologies, integrating the approach with similar mechanisms which we already developed. As a result of its Logic Programming heritage, the system may also recursively interrogate other ontologies or data repositories, providing a semantic integration of multiple sources. The components required to behave as a SPARQL query engine are explained and examples of the integration of different sources are shown; in particular, the case of multiple OWL ontologies is discussed.

Dataset Characteristics Identification for Federated SPARQL Query

Scientific Journal of Informatics ◽

10.15294/sji.v6i1.17258 ◽

2019 ◽

Vol 6 (1) ◽

pp. 23-33

Author(s):

Nur Aini Rakhmawati ◽

Lutfi Nur Fadzilah

Keyword(s):

Linked Data ◽

SPARQL Query ◽

Research Papers ◽

Spreading Factor ◽

Query Engine ◽

Number Of Classes

Nowadays, the amount of data published in the RDF format is increasing. Federated SPARQL query engines that can query multiple distributed SPARQL endpoints have been developed recently. Federated query engines usually differ from one another in performance. One of the factors that affect the performance of a query engine is the characteristics of the accessed RDF dataset, such as the number of triples, the number of classes, the number of properties, the number of subjects, the number of entities, the number of objects, and the spreading factor of the dataset. The aim of this work is to identify the characteristics of RDF datasets and create a query set for evaluating a federated engine. The study was conducted by identifying 16 datasets that were used by ten research papers in the Linked Data area.
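Several of the listed characteristics are simple counts over the triples of a dataset. The sketch below computes a subset of them from an in-memory list of triples; it treats a class as any distinct object of `rdf:type`, which is an assumption, and it omits the number of entities and the spreading factor because the abstract does not define them. `DatasetStats` and `dataset_stats` are hypothetical names.

```python
from typing import Iterable, NamedTuple, Tuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class DatasetStats(NamedTuple):
    triples: int
    classes: int
    properties: int
    subjects: int
    objects: int


def dataset_stats(triples: Iterable[Tuple[str, str, str]]) -> DatasetStats:
    """Count distinct triples, classes, properties, subjects, and objects."""
    unique = set(triples)  # duplicate triples are counted once
    subjects = {s for s, _, _ in unique}
    properties = {p for _, p, _ in unique}
    objects = {o for _, _, o in unique}
    # Assumption: a class is any resource used as the object of rdf:type.
    classes = {o for _, p, o in unique if p == RDF_TYPE}
    return DatasetStats(
        triples=len(unique),
        classes=len(classes),
        properties=len(properties),
        subjects=len(subjects),
        objects=len(objects),
    )


sample = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:acme", RDF_TYPE, "ex:Company"),
]
stats = dataset_stats(sample)
print(stats)
```

For a dataset behind a SPARQL endpoint, the same figures would normally be obtained with aggregate queries rather than by loading all triples into memory.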

Keyword search over schema-less RDF datasets by SPARQL query compilation

Information Systems ◽

10.1016/j.is.2021.101814 ◽

2021 ◽

pp. 101814

Author(s):

Yenier T. Izquierdo ◽

Grettel M. García ◽

Elisa Menendez ◽

Luiz André P.P. Leme ◽

Angelo Neves ◽

...

Keyword(s):

Keyword Search ◽

SPARQL Query

An empirical evaluation of cost-based federated SPARQL query processing engines

Semantic Web ◽

10.3233/sw-200420 ◽

2021 ◽

pp. 1-26

Author(s):

Umair Qudus ◽

Muhammad Saleem ◽

Axel-Cyrille Ngonga Ngomo ◽

Young-Koo Lee

Keyword(s):

Query Processing ◽

Detailed Analysis ◽

Performance Metrics ◽

Empirical Evaluation ◽

SPARQL Query ◽

Evaluation Metrics ◽

Future Cost ◽

Query Plan ◽

Fine Grained ◽

Runtime Performance

Finding a good query plan is key to the optimization of query runtime. This holds in particular for cost-based federation engines, which make use of cardinality estimations to achieve this goal. A number of studies compare SPARQL federation engines across different performance metrics, including query runtime, result set completeness and correctness, number of sources selected and number of requests sent. Albeit informative, these metrics are generic and unable to quantify and evaluate the accuracy of the cardinality estimators of cost-based federation engines. To thoroughly evaluate cost-based federation engines, the effect of estimated cardinality errors on the overall query runtime performance must be measured. In this paper, we address this challenge by presenting novel evaluation metrics targeted at a fine-grained benchmarking of cost-based federated SPARQL query engines. We evaluate five cost-based federated SPARQL query engines using existing as well as novel evaluation metrics by using LargeRDFBench queries. Our results provide a detailed analysis of the experimental outcomes that reveal novel insights, useful for the development of future cost-based federated SPARQL query processing engines.
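The abstract does not define the proposed metrics. As a generic illustration of how a cardinality estimation error can be quantified, the sketch below computes the q-error, a measure commonly used in the query optimization literature; it is not claimed to be one of the metrics introduced in the paper. Clamping both values to at least 1 is an assumed convention to avoid division by zero.

```python
def q_error(estimated: float, actual: float) -> float:
    """Return the factor by which an estimate deviates from the true cardinality.

    The result is 1.0 for a perfect estimate and grows symmetrically for
    over- and underestimation. Values below 1 are clamped to 1 (assumption).
    """
    est = max(estimated, 1.0)
    act = max(actual, 1.0)
    return max(est / act, act / est)


# Overestimating and underestimating by the same factor give the same error.
print(q_error(100, 10), q_error(10, 100))
```

Relating such per-operator errors to the measured runtime of the whole query is the kind of analysis the abstract calls for.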
