When is the Peak Performance Reached? An Analysis of RDF Triple Stores

2021
Author(s):
Hashim Khan
Manzoor Ali
Axel-Cyrille Ngonga Ngomo
Muhammad Saleem

With the significant growth in RDF datasets, application developers demand online availability of these datasets to meet end users' expectations. Various interfaces are available for querying RDF data using the SPARQL query language. Studies show that SPARQL endpoints may provide high query runtime performance at the cost of low availability: for example, it has been observed that only 32.2% of public endpoints have a monthly uptime of 99–100%. One possible reason for this low availability is the high workload experienced by these SPARQL endpoints. Because complete query execution is performed on the server side (i.e., at the SPARQL endpoint), this high query processing workload may result in performance degradation or even a service shutdown. We performed extensive experiments to measure the query processing capabilities of well-known triple stores via their SPARQL endpoints. In particular, we stressed these triple stores with multiple parallel requests from different querying agents. Our experiments reveal the maximum query processing capacity of these triple stores, beyond which they experience service shutdowns. We hope this analysis will help triple store developers design workload-aware RDF engines that improve the availability of their public endpoints while maintaining high throughput.
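The stress-testing setup described in the abstract can be approximated with a few lines of client code. The following Python sketch is only an illustration of the idea, not the authors' benchmark harness; the endpoint URL, query, agent count, and request count are placeholder assumptions.

```python
# Minimal sketch of stressing a SPARQL endpoint with parallel querying agents.
# The endpoint URL, query, agent count, and request count are placeholders.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

ENDPOINT = "http://localhost:8890/sparql"            # hypothetical endpoint
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 1000"
AGENTS = 32                                           # parallel querying agents
REQUESTS_PER_AGENT = 100


def agent(agent_id: int) -> tuple[int, int, float]:
    """Fire a fixed number of queries; return (succeeded, failed, elapsed seconds)."""
    ok = failed = 0
    start = time.perf_counter()
    for _ in range(REQUESTS_PER_AGENT):
        try:
            r = requests.post(
                ENDPOINT,
                data={"query": QUERY},
                headers={"Accept": "application/sparql-results+json"},
                timeout=30,
            )
            if r.status_code == 200:
                ok += 1
            else:
                failed += 1
        except requests.RequestException:
            failed += 1          # timeouts and refused connections count as failures
    return ok, failed, time.perf_counter() - start


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=AGENTS) as pool:
        results = list(pool.map(agent, range(AGENTS)))
    print("succeeded:", sum(r[0] for r in results),
          "failed:", sum(r[1] for r in results))
```

Raising the agent count until the endpoint starts returning errors or stops responding gives a rough, client-side view of the saturation point the paper investigates.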

Author(s):  
Kjetil Nørvåg

The amount of data available in XML is rapidly increasing, and at the same time the price of mass storage is rapidly decreasing, which makes it possible to store larger amounts of data. The contents of a database or data warehouse are seldom static. New documents are created, documents are deleted and, more importantly, documents are updated. In many cases, one wants to be able to search in historical (old) versions, retrieve documents that were valid at a certain time, query changes to documents, and so forth. (Note that although this process is somewhat similar to general document versioning maintenance, the aspect of time makes the possibilities and appropriate solutions different.) The "easiest" way to do this is to store all versions of all documents in the database and use a middleware layer to convert temporal query language statements into conventional statements, executed by an underlying database system (an example of such a system is TeXOR; Nørvåg, Limstrand, & Myklebust, 2003). Although this approach makes the introduction of temporal support easier, it can be difficult to achieve good performance: temporal query processing is in general costly, and the cost of storing complete document versions can be high. Thus, a temporal XML database system is necessary.
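The middleware approach mentioned above can be illustrated with a toy rewrite: a temporal predicate over document versions is translated into a conventional query against a version table. This is only a sketch of the general idea, not TeXOR's actual translation; the table and column names (doc_versions, valid_from, valid_to) are invented for the example.

```python
# Toy illustration of the middleware idea: a temporal predicate over document
# versions is rewritten into a conventional query against a version table.
# The table and column names (doc_versions, valid_from, valid_to) are invented.

def rewrite_valid_at(doc_id: str, timestamp: str) -> str:
    """Return a conventional SQL query for 'the version of doc_id valid at timestamp'."""
    return (
        "SELECT content FROM doc_versions "
        f"WHERE doc_id = '{doc_id}' "
        f"AND valid_from <= '{timestamp}' AND valid_to > '{timestamp}'"
    )

print(rewrite_valid_at("report.xml", "2003-06-01"))
```

The performance cost the abstract points to comes from exactly this kind of rewriting: every temporal predicate turns into extra range conditions and joins over full document versions, which a native temporal XML database can avoid.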


2021
Vol 48 (4)
pp. 3-3
Author(s):  
Ingo Weber

Blockchain is a novel distributed ledger technology. Through its features and smart contract capabilities, a wide range of application areas has opened up for blockchain-based innovation [5]. In order to analyse how concrete blockchain systems as well as blockchain applications are used, data must be extracted from these systems. Due to various complexities inherent in blockchain, the question of how to interpret such data is non-trivial. Such interpretation should often be shared among parties, e.g., if they collaborate via a blockchain. To this end, we devised an approach to codify the interpretation of blockchain data, to extract data from blockchains accordingly, and to output it in suitable formats [1, 2]. This work will be the main topic of the keynote. In addition, application developers and users of blockchain applications may want to estimate the cost of using or operating a blockchain application. In the keynote, I will also discuss our cost estimation method [3, 4]. This method was designed for the Ethereum blockchain platform, where cost also relates to transaction complexity, and therefore also to system throughput.
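As a rough illustration of the cost dimension on Ethereum, a transaction's fee is essentially the gas it consumes multiplied by the gas price, converted via the ETH exchange rate. The sketch below shows only this basic arithmetic with placeholder figures; it is not the cost estimation method of [3, 4].

```python
# Back-of-the-envelope Ethereum transaction cost arithmetic (not the method of [3, 4]).
# All figures below are placeholders.

GWEI_PER_ETH = 1_000_000_000

def tx_cost_usd(gas_used: int, gas_price_gwei: float, eth_price_usd: float) -> float:
    """Cost in USD = gas used * gas price (converted to ETH) * ETH/USD rate."""
    cost_eth = gas_used * gas_price_gwei / GWEI_PER_ETH
    return cost_eth * eth_price_usd

# Example: a contract call consuming 100,000 gas at 30 gwei, with ETH at $2,000.
print(f"{tx_cost_usd(100_000, 30, 2_000):.2f} USD")   # 6.00 USD
```

Because gas consumption grows with transaction complexity, and block gas limits cap how much gas fits in a block, the same quantity that drives cost also bounds system throughput, which is the link the keynote draws.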


Author(s):  
Omar Shehab
Ali Hussein Saleh Zolait

In this paper, the authors propose a semantic search engine that retrieves software components precisely and uses ontology technology to store these components in a database. The engine uses a semantic query language to retrieve these components semantically. The authors use an exploratory study in which the proposed method maps object-oriented concepts to the Web Ontology Language. Qualitative survey and interview techniques were used to collect data. The outcome of this research is a set of guidelines, a model, and a prototype describing the semantic search engine system. The guidelines help software developers and companies reduce the cost, time, and risks of software development.
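The mapping between object-oriented concepts and the Web Ontology Language can be sketched informally: classes map to owl:Class, inheritance to rdfs:subClassOf, and attributes to datatype properties. The snippet below, using the rdflib library, is an assumed illustration of such a mapping, not the authors' prototype; the namespace and component names are invented.

```python
# Illustrative mapping of object-oriented concepts to OWL using rdflib.
# The ontology namespace and component names are invented for this sketch.
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://example.org/components#")
g = Graph()
g.bind("ex", EX)

# OO class       -> owl:Class
g.add((EX.SortingComponent, RDF.type, OWL.Class))
# Inheritance    -> rdfs:subClassOf
g.add((EX.QuickSortComponent, RDF.type, OWL.Class))
g.add((EX.QuickSortComponent, RDFS.subClassOf, EX.SortingComponent))
# OO attribute   -> owl:DatatypeProperty with a domain
g.add((EX.implementationLanguage, RDF.type, OWL.DatatypeProperty))
g.add((EX.implementationLanguage, RDFS.domain, EX.SortingComponent))

print(g.serialize(format="turtle"))
```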


Author(s):  
Arijit Sengupta
Ramesh Venkataraman

This chapter introduces a complete storage and retrieval architecture for a database environment for XML documents. DocBase, a prototype system based on this architecture, uses a flexible storage and indexing technique to allow highly expressive queries without the need to map documents to other database formats. DocBase integrates several techniques: (i) a formal model called Heterogeneous Nested Relations (HNR), (ii) a conceptual model, XER (Extensible Entity Relationship), (iii) formal query languages (Document Algebra and Calculus), (iv) a practical query language (Document SQL, or DSQL), (v) a visual query formulation method, QBT (Query By Templates), and (vi) the DocBase query processing architecture. The chapter focuses on the overall architecture of DocBase, including implementation details, describes the query-processing framework, and presents results from various performance tests. It summarizes experimental and usability analyses to demonstrate its feasibility as a general architecture for native as well as embedded document manipulation methods.


2016
Vol 2016 (4)
pp. 202-218
Author(s):  
Ryan Henry

Private information retrieval (PIR) is a way for clients to query a remote database without the database holder learning the clients' query terms or the responses they generate. Compelling applications for PIR abound in the cryptographic and privacy research literature, yet existing PIR techniques are notoriously inefficient. Consequently, no PIR-based application to date has seen real-world at-scale deployment. This paper proposes new "batch coding" techniques to help address PIR's efficiency problem. The new techniques exploit the connection between ramp secret sharing schemes and efficient information-theoretically secure PIR (IT-PIR) protocols. This connection was previously observed by Henry, Huang, and Goldberg (NDSS 2013), who used ramp schemes to construct efficient "batch queries" with which clients can fetch several database records for the same cost as fetching a single record using a standard, non-batch query. The new techniques in this paper generalize and extend those of Henry et al. to construct "batch codes" with which clients can fetch several records for only a fraction of the cost of fetching a single record using a standard non-batch query over an unencoded database. The batch codes are highly tunable, providing a means to trade off (i) lower server-side computation cost, (ii) lower server-side storage cost, and/or (iii) lower uni- or bi-directional communication cost, in exchange for a comparatively modest decrease in resilience to Byzantine database servers.
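For readers unfamiliar with IT-PIR, the classic two-server XOR scheme, which batch queries and batch codes generalize, can be sketched in a few lines. The toy version below handles a single record per query and is not the ramp-scheme construction from the paper.

```python
# Toy two-server information-theoretic PIR over an n-record database of
# equal-length records. Each server sees a uniformly random query vector,
# so neither learns the requested index on its own. This is plain single-record
# IT-PIR, not the batch-code construction described in the paper.
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def client_queries(n: int, index: int) -> tuple[list[int], list[int]]:
    """Split the selection vector for `index` into two random-looking shares."""
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = q1.copy()
    q2[index] ^= 1                     # the shares differ only at the target index
    return q1, q2

def server_answer(db: list[bytes], q: list[int]) -> bytes:
    """XOR together the records selected by the query vector."""
    ans = bytes(len(db[0]))
    for record, bit in zip(db, q):
        if bit:
            ans = xor_bytes(ans, record)
    return ans

db = [b"rec0", b"rec1", b"rec2", b"rec3"]          # toy database
q1, q2 = client_queries(len(db), index=2)
print(xor_bytes(server_answer(db, q1), server_answer(db, q2)))   # b'rec2'
```

Each server touches the whole database to answer one query; batch queries and batch codes aim to amortize that per-query work across several fetched records.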


2010
Vol 08 (02)
pp. 247-293
Author(s):
Ali Cakmak
Gultekin Ozsoyoglu
Richard W. Hanson

Metabolism is a representation of the biochemical principles that govern the production, consumption, degradation, and biosynthesis of metabolites in living cells. Organisms respond to changes in their physiological conditions or environmental perturbations (i.e., constraints) via cooperative implementation of such principles. Querying the inner working principles of metabolism under different constraints provides invaluable insights for both researchers and educators. In this paper, we propose a metabolism query language (MQL) and discuss its query processing. MQL enables researchers to explore the behavior of metabolism with a wide range of predicates, including dietary and physiological condition specifications. The query results of MQL are enriched with both textual and visual representations, and its query processing is tailored to the underlying metabolic principles.


Author(s):  
Shi-Kuo Chang
Gennaro Costagliola
Erland Jungert
Karin Camara

Sensor data fusion imposes a number of novel requirements on query languages and query processing techniques. A spatial/temporal query language called ΣQL has been proposed to support the retrieval of multimedia information from multiple sources and databases. This chapter investigates intelligent querying techniques, including fusion techniques, multimedia data transformations, interactive progressive query building, and ΣQL query processing techniques using sensor data fusion. The authors illustrate and discuss tasks and query patterns for information fusion, provide a number of examples of iterative queries, and show the effectiveness of ΣQL in a command-action scenario.


Author(s):  
Rui Peng
Alex J. Aved
Kien A. Hua

With the proliferation of inexpensive cameras and the availability of high-speed wired and wireless networks, systems of distributed cameras are becoming an enabling technology for a broad range of interdisciplinary applications in domains such as public safety and security, manufacturing, transportation, and healthcare. Today’s live video processing systems on networks of distributed cameras, however, are designed for specific classes of applications. To provide a generic query processing platform for applications of distributed camera networks, the authors designed and implemented a new class of general purpose database management systems, the live video database management system (LVDBMS). The authors view networked video cameras as a special class of interconnected storage devices, and allow the user to formulate ad hoc queries expressed over real-time live video feeds. This paper introduces their system and presents the live video data model, the query language, and the query processing and optimization technique.
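To make the notion of an ad hoc query over live feeds concrete, the sketch below simulates a continuous query loop over incoming frames. It is an assumed illustration only, not the LVDBMS query language or engine; the frame source and the motion-score predicate are stand-ins for a real operator pipeline.

```python
# Illustrative continuous-query loop over a (simulated) live camera feed.
# This is not the LVDBMS language or engine; the frame source and the
# motion-score predicate are stand-ins for a real operator pipeline.
import random
import time
from typing import Iterator

def camera_feed(camera_id: str) -> Iterator[dict]:
    """Simulate a live feed: yield one frame descriptor per time step."""
    frame_no = 0
    while True:
        yield {"camera": camera_id, "frame": frame_no,
               "motion_score": random.random(), "ts": time.time()}
        frame_no += 1

def continuous_query(feed: Iterator[dict], threshold: float, max_results: int) -> list[dict]:
    """Conceptually: SELECT frame FROM feed WHERE motion_score > threshold."""
    hits = []
    for frame in feed:
        if frame["motion_score"] > threshold:
            hits.append(frame)
            if len(hits) >= max_results:
                return hits
    return hits

for hit in continuous_query(camera_feed("lobby-cam"), threshold=0.9, max_results=3):
    print(hit["camera"], hit["frame"], round(hit["motion_score"], 2))
```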


Author(s):  
Jingwei Cheng
Z. M. Ma
Qiang Tong

RDF plays an important role in representing Web resources in a natural and flexible way. As the amount of RDF data grows, storing and querying these data have attracted the attention of more and more researchers. In this chapter, we first review approaches for query processing of RDF datasets. We categorize existing methods into two classes: those that use an RDBMS to implement storage and retrieval, and those that devise their own native storage schemas, called relational RDF stores and native stores, respectively. Secondly, we survey some important extensions of SPARQL, the standard query language for RDF, which extend its expressive power with more sophisticated language constructs that meet the needs of various application scenarios.
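The relational RDF store category can be illustrated with the simplest possible schema, a single triples table, together with the SQL that a basic SPARQL graph pattern translates to. The sketch below uses SQLite for self-containment and is a deliberate simplification; production systems typically add dictionary encoding, property tables, and extensive indexing.

```python
# Minimal illustration of a relational RDF store: a single triples table and the
# SQL translation of a simple SPARQL basic graph pattern. Production systems add
# dictionary encoding, property tables, and extensive indexing; this is a toy.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:bob",   "foaf:name",  "Bob"),
        ("ex:alice", "foaf:name",  "Alice"),
    ],
)

# SPARQL:  SELECT ?name WHERE { ex:alice foaf:knows ?x . ?x foaf:name ?name }
# becomes a self-join over the triples table:
rows = conn.execute(
    """
    SELECT t2.o AS name
    FROM triples t1 JOIN triples t2 ON t1.o = t2.s
    WHERE t1.s = 'ex:alice' AND t1.p = 'foaf:knows' AND t2.p = 'foaf:name'
    """
).fetchall()
print(rows)   # [('Bob',)]
```

Each additional triple pattern in a SPARQL query adds another self-join, which is why the surveyed systems invest so heavily in alternative storage layouts and indexes.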

