JOTR: Join-Optimistic Triple Reordering Approach for SPARQL Query Optimization on Big RDF Data

With the development of the Semantic Web, RDF data sets have grown rapidly, which makes querying massive RDF data a pressing problem. Using distributed techniques to execute SPARQL (SPARQL Protocol and RDF Query Language) queries is a new way to query large volumes of RDF data. At present, most Hadoop-based RDF query strategies need multiple MapReduce jobs to complete a query, which wastes time. To overcome this drawback, the paper proposes the MRQJ (using MapReduce to query and join) algorithm, which first uses a greedy strategy to generate a join plan and then creates only one MapReduce job to produce the results of the SPARQL query. Finally, a comparative experiment on the LUBM (Lehigh University Benchmark) test data set shows that MRQJ has a marked advantage when the query is more complicated.
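The abstract does not spell out how the greedy join plan is built, so the following is only a minimal sketch of one common greedy heuristic, not the MRQJ algorithm itself. It assumes triple patterns are `(subject, predicate, object)` tuples whose variables start with `?`, and that a cardinality estimate per pattern is already available; the function names and the example numbers are hypothetical.

```python
from typing import Dict, List, Set, Tuple

TriplePattern = Tuple[str, str, str]


def variables(pattern: TriplePattern) -> Set[str]:
    """Return the variables (terms starting with '?') of a triple pattern."""
    return {term for term in pattern if term.startswith("?")}


def greedy_join_order(
    patterns: List[TriplePattern],
    cardinality: Dict[TriplePattern, int],
) -> List[TriplePattern]:
    """Order patterns greedily: start with the cheapest pattern, then keep
    adding the cheapest remaining pattern that shares a variable with the
    patterns already chosen (assumed heuristic, not taken from the paper)."""
    remaining = list(patterns)
    first = min(remaining, key=lambda p: cardinality[p])
    order = [first]
    remaining.remove(first)
    bound = variables(first)

    while remaining:
        # Prefer patterns that join on an already bound variable; fall back
        # to any remaining pattern (a cross product) if none is connected.
        connected = [p for p in remaining if variables(p) & bound]
        candidates = connected or remaining
        nxt = min(candidates, key=lambda p: cardinality[p])
        order.append(nxt)
        remaining.remove(nxt)
        bound |= variables(nxt)
    return order


# Hypothetical LUBM-style patterns with made-up cardinality estimates.
student = ("?x", "rdf:type", "ub:GraduateStudent")
takes = ("?x", "ub:takesCourse", "?c")
course = ("?c", "rdf:type", "ub:Course")
teaches = ("?p", "ub:teacherOf", "?c")
estimates = {student: 1000, takes: 5000, course: 300, teaches: 800}

plan = greedy_join_order([student, takes, course, teaches], estimates)
```

With these made-up estimates the plan starts from the smallest pattern and only ever extends it along a shared variable, which keeps intermediate results connected and avoids cross products.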


Selectivity Estimation of Correlated Properties in RDF Data for SPARQL Query Optimization

2009 Fifth International Conference on Semantics, Knowledge and Grid, 2009

DOI: 10.1109/skg.2009.49

Author(s): Bin Lv, Xiaoyong Du, Yan Wang

Keyword(s): Query Optimization, SPARQL Query, Selectivity Estimation, RDF Data


SPARQL Query Optimization on Top of DHTs

Lecture Notes in Computer Science - The Semantic Web – ISWC 2010, pp. 418-435, 2010

DOI: 10.1007/978-3-642-17746-0_27

Cited By ~ 23

Author(s): Zoi Kaoudi, Kostis Kyzirakos, Manolis Koubarakis

Keyword(s): Query Optimization, SPARQL Query


Distance-Based Triple Reordering for SPARQL Query Optimization

2017 IEEE 33rd International Conference on Data Engineering (ICDE), 2017

DOI: 10.1109/icde.2017.227

Cited By ~ 3

Author(s): Marios Meimaris, George Papastefanatos

Keyword(s): Query Optimization, SPARQL Query


On the Expressivity of ASK Queries in SPARQL

Proceedings of the AAAI Conference on Artificial Intelligence, Vol 34 (03), pp. 3057-3064, 2020

DOI: 10.1609/aaai.v34i03.5700

Author(s): Xiaowang Zhang, Jan Van den Bussche, Kewen Wang, Heng Zhang, Xuanxing Yang, ...

Keyword(s): Query Optimization, Systematic Study, Expressive Power, SPARQL Query, Rich Picture, Boolean Queries, Query Type

As a major query type in SPARQL, ASK queries are boolean queries and have found applications in several domains such as semantic SPARQL optimization. This paper is a first systematic study of the relative expressive power of various fragments of ASK queries in SPARQL. Among many new results, a surprising one is that the operator UNION is redundant for ASK queries. The results in this paper as a whole paint a rich picture for the expressivity of fragments of ASK queries with the four basic operators of SPARQL 1.0 possibly together with a negation. The work in this paper provides a guideline for future SPARQL query optimization and implementation.


Map-Side Join Processing of SPARQL Queries Based on Abstract RDF Data Filtering

Journal of Database Management, Vol 30 (1), pp. 22-40, 2019

DOI: 10.4018/jdm.2019010102

Cited By ~ 2

Author(s): Minjae Song, Hyunsuk Oh, Seungmin Seo, Kyong-Ho Lee

Keyword(s): Query Processing, Input Data, General Trend, Processing System, SPARQL Query, Data Filtering, MapReduce Framework, Query Plan, Network Bandwidth, RDF Data

The amount of RDF data being published on the Web is increasing at a massive rate. MapReduce-based distributed frameworks have become the general trend in processing SPARQL queries against RDF data. Currently, query processing systems that use MapReduce have not been able to keep up with the growth of semantically annotated data, resulting in non-interactive SPARQL query processing. The principal reason is that intermediate query results from join operations in a MapReduce framework are so massive that they consume all available network bandwidth. In this article, the authors present an efficient SPARQL processing system that uses MapReduce and HBase. The system runs a job-optimized query plan that uses the authors' proposed abstract RDF data to reduce both the number of jobs and the amount of input data. The authors also present an efficient algorithm that uses map-side joins together with the abstract RDF data to filter out unneeded RDF data. Experimental results show that the proposed approach outperforms previous work when processing queries with a large amount of input data.
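The general idea of a map-side join with early filtering can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: the abstract does not define the abstract RDF data, so a plain in-memory hash table over the smaller input stands in for the filter, and the function names and sample triples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]


def build_hash_side(small: Iterable[Triple]) -> Dict[str, List[Triple]]:
    """Index the smaller input by subject so every mapper can hold it in memory."""
    table: Dict[str, List[Triple]] = defaultdict(list)
    for s, p, o in small:
        table[s].append((s, p, o))
    return table


def map_side_join(
    large: Iterable[Triple],
    table: Dict[str, List[Triple]],
) -> Iterator[Tuple[Triple, Triple]]:
    """Stream the larger input and join its object with the indexed subjects.
    Triples without a join partner are dropped inside the mapper, so they
    never reach a shuffle or reduce phase."""
    for s, p, o in large:
        matches = table.get(o)  # .get() does not insert into the defaultdict
        if matches is None:
            continue  # filtered out early: no partner on the small side
        for match in matches:
            yield (s, p, o), match


# Hypothetical sample data: only c1 is known to be a course.
courses = [("c1", "rdf:type", "ub:Course")]
enrolments = [
    ("s1", "ub:takesCourse", "c1"),
    ("s2", "ub:takesCourse", "c9"),
]

joined = list(map_side_join(enrolments, build_hash_side(courses)))
```

Because the join completes in the map phase, no intermediate join results are sent over the network for this step, which is the bandwidth cost the abstract identifies as the main bottleneck.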
