Optimizing read convoys in main-memory query processing

Abstract: We propose a simple model for distributed query processing based on the concept of a distributed array. Such an array has fields of some data type whose values can be stored on different machines. It offers operations to manipulate all fields in parallel within the distributed algebra. The arrays considered are one-dimensional and just serve to model a partitioned and distributed data set. Distributed arrays rest on a given set of data types and operations, called the basic algebra, implemented by some piece of software called the basic engine. The basic engine provides a complete environment for query processing on a single machine. We assume this environment is extensible by types and operations. Operations on distributed arrays are implemented by one basic engine, called the master, which controls a set of basic engines called the workers. The master maps operations on distributed arrays to the respective operations on their fields, which are executed by the workers. The distributed algebra is completely generic: any type or operation added to the extensible basic engine is immediately available for distributed query processing. To demonstrate the use of the distributed algebra as a language for distributed query processing, we describe a fairly complex algorithm for distributed density-based similarity clustering. The algorithm is a novel contribution in itself. Its complete implementation is shown in terms of the distributed algebra and the basic algebra. As a basic engine the Secondo system is used, a rich environment for extensible query processing that provides useful tools such as main-memory M-trees, graphs, or a DBScan implementation.
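The master/worker mapping described in this abstract can be illustrated with a minimal sketch. It assumes that threads stand in for worker engines and that field i is assigned to worker i modulo the number of workers; the names `DArray`, `worker_of` and `map_fields` are hypothetical and are not taken from the paper or from Secondo.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class DArray(Generic[T]):
    """Toy one-dimensional distributed array (hypothetical, for illustration).

    Each field holds one partition of the data set. In a real system the
    fields would live on different machines; here they are kept in a list.
    """

    def __init__(self, fields: List[T], n_workers: int) -> None:
        if n_workers < 1:
            raise ValueError("n_workers must be at least 1")
        self.fields = list(fields)
        self.n_workers = n_workers

    def worker_of(self, field_index: int) -> int:
        # Assumed placement rule: round-robin assignment of fields to workers.
        return field_index % self.n_workers

    def map_fields(self, op: Callable[[T], U]) -> "DArray[U]":
        """Master side: apply a basic-algebra operation to every field.

        The thread pool plays the role of the workers; results come back
        in field order, so the output is again a distributed array.
        """
        with ThreadPoolExecutor(max_workers=self.n_workers) as pool:
            results = list(pool.map(op, self.fields))
        return DArray(results, self.n_workers)


# Usage: three partitions of integers, sorted in parallel by two "workers".
partitions = DArray([[3, 1, 2], [9, 8], [5]], n_workers=2)
sorted_partitions = partitions.map_fields(sorted)
```

Because `map_fields` accepts any callable, an operation added to the single-machine side is usable on the array without further changes, which mirrors the genericity claim in the abstract.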


Semantic Analytics in Intelligence

Advances in Database Research - Advanced Topics in Database Research, Volume 5 ◽

10.4018/978-1-59140-935-9.ch020 ◽

2011 ◽

pp. 401-419 ◽

Cited By ~ 3

Author(s):

Boanerges Aleman-Meza ◽

Amit P. Sheth ◽

Devanand Palaniswami ◽

Matthew Eavenson ◽

I. Budak Arpinar

Keyword(s):

Query Processing ◽

Main Memory ◽

Insider Threats ◽

Semantic Relationships ◽

Semantic Metadata ◽

Domain Specific ◽

Metadata Extraction ◽

Semantic Web Technology ◽

Ontological Approach ◽

Relevance Measure

We describe an ontological approach for determining the relevance of documents based on the underlying concept of exploiting complex semantic relationships among real-world entities. This research builds upon semantic metadata extraction and annotation, practical domain-specific ontology creation, main-memory query processing, and the notion of semantic association. A prototype application illustrates the approach by supporting the identification of insider threats for document access. In this scenario, we describe how investigative assignments performed by intelligence analysts are captured into a context of investigation by including concepts and relationships from the ontology. A relevance measure for documents is computed using semantic analytics techniques. Additionally, a graph-based visualization component allows exploration of potential document access beyond the ‘need to know’. We also discuss how a commercial product using Semantic Web technology, Semagix Freedom, is used for metadata extraction when designing and populating an ontology from heterogeneous sources.


Querying data warehouses efficiently using the Bitmap Join Index OLAP Tool

CLEI electronic journal ◽

10.19153/cleiej.15.2.7 ◽

2012 ◽

Vol 15 (2) ◽

Author(s):

Anderson Chaves Carniel ◽

Thiago Luís Lopes Siqueira

Keyword(s):

Query Processing ◽

Operating Systems ◽

Business Intelligence ◽

Main Memory ◽

Performance Gain ◽

Data Warehouses ◽

Speed Up ◽

Performance Results ◽

Reasonable Use ◽

A Performance

Data warehouses and OLAP are core aspects of business intelligence environments, since the former store integrated and time-variant data, while the latter enables multidimensional queries, visualization and analysis. The bitmap join index has been recognized as an efficient mechanism to speed up queries over data warehouses. However, existing OLAP tools do not rely strictly on this index to improve the performance of query processing. In this paper, we introduce the BJIn OLAP Tool, which employs the bitmap join index to efficiently perform OLAP operations over data warehouses, such as roll-up, drill-down, slice-and-dice and pivoting. The BJIn OLAP Tool was implemented and tested through a performance evaluation to assess its efficiency and to corroborate the feasibility of adopting the bitmap join index to execute OLAP queries. The performance results showed that the BJIn OLAP Tool provided a query-processing performance gain ranging from 31% to 97% compared with existing solutions. Our tool has proven not only to process queries efficiently, but also to process OLAP operations on both the server and client sides, for different volumes of data and on different operating systems. In addition, it makes reasonable use of main memory and enables new rows to be appended to bitmap join indices.
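To make the bitmap join index concrete, the following is a minimal sketch of the general technique, not the BJIn OLAP Tool's implementation. It assumes fact rows are held in a Python list and uses arbitrary-precision integers as bitmaps; the function names and the toy `stores`, `products` and `sales` data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def build_bitmap_join_index(fact_rows: List[dict], dim_table: Dict[int, dict],
                            foreign_key: str, attribute: str) -> Dict[str, int]:
    """Precompute the join: one bitmap per value of a dimension attribute.

    Bit i of a bitmap is set when fact row i joins a dimension row
    carrying that attribute value.
    """
    index: Dict[str, int] = defaultdict(int)
    for row_id, row in enumerate(fact_rows):
        value = dim_table[row[foreign_key]][attribute]
        index[value] |= 1 << row_id
    return dict(index)


def sum_measure(fact_rows: List[dict], bitmap: int, measure: str) -> int:
    """Aggregate a measure over the fact rows selected by a bitmap."""
    total = 0
    row_id = 0
    while bitmap:
        if bitmap & 1:
            total += fact_rows[row_id][measure]
        bitmap >>= 1
        row_id += 1
    return total


# Toy star schema (made-up data): two dimension tables and one fact table.
stores = {10: {"region": "north"}, 20: {"region": "south"}, 30: {"region": "north"}}
products = {1: {"category": "food"}, 2: {"category": "toys"}}
sales = [
    {"store_id": 10, "product_id": 1, "amount": 5},
    {"store_id": 20, "product_id": 1, "amount": 7},
    {"store_id": 30, "product_id": 2, "amount": 11},
    {"store_id": 10, "product_id": 2, "amount": 13},
]

by_region = build_bitmap_join_index(sales, stores, "store_id", "region")
by_category = build_bitmap_join_index(sales, products, "product_id", "category")

# Slice-and-dice: toys sold in the north, answered with a bitwise AND
# instead of joining the fact table with both dimension tables.
north_toys = by_region["north"] & by_category["toys"]
north_toys_amount = sum_measure(sales, north_toys, "amount")
```

The join work is paid once when the index is built; each query then reduces to bitwise operations over the bitmaps plus one pass over the selected fact rows.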
