Extension of the one-shot semijoin strategy to minimize data transmission cost in distributed query processing

1999, Vol 114 (1-4), pp. 1-21
Author(s): Faïza Najjar, Yahya Slimani

2004, Vol 9 (5-6), pp. 205-234
Author(s): Haiwei Ye, Brigitte Kerhervé, Gregor V Bochmann

2014, Vol 10 (3), pp. 226-244
Author(s): Johannes Lorey

Purpose – The purpose of this study is to introduce several metrics that enable universal and fine-grained characterization of arbitrary Linked Data repositories. Publicly accessible SPARQL endpoints contain vast amounts of knowledge from a large variety of domains. However, these endpoints are often not configured to process specific workloads as efficiently as possible. Assisting users in leveraging SPARQL endpoints requires insight into functional and non-functional properties of these knowledge bases.

Design/methodology/approach – This study presents comprehensive approaches for deriving these metrics. More specifically, the study uses concrete SPARQL queries to determine the corresponding values. Furthermore, it validates and discusses the introduced metrics through extensive evaluation on real-world SPARQL endpoints.

Findings – The evaluation determined that endpoints exhibit different characteristics. While it comes as no surprise that latency and throughput are influenced by the network infrastructure, the costs of join operations depend on a number of factors that are not obvious to a data consumer. Moreover, comparing mean, median and upper quartile values revealed both endpoints that behave consistently and repositories that offer varying levels of performance.

Originality/value – On the one hand, the contribution of the author's work lies in assisting data consumers in evaluating the quality of service of publicly available SPARQL endpoints. On the other hand, the performance metrics introduced in this study can also be considered as additional input features for distributed query processing frameworks. Moreover, the author provides a universal means of discerning characteristics of different SPARQL endpoints without the need for (synthetic or real-world) query workloads.

