Framework for GeoSpatial Query Processing by Integrating Cassandra With Hadoop

Author(s):  
S. Vasavi ◽  
Mallela Padma Priya ◽  
Anu A. Gokhale

We are moving towards digitization, connecting devices such as sensors and cameras to the internet and producing big data. This big data is highly varied and has paved the way for the emergence of NoSQL databases, such as Cassandra, that provide scalability and availability. The Hadoop framework was developed for storing and processing distributed data. In this chapter, the authors investigate the storage and retrieval of geospatial data by integrating Hadoop and Cassandra using two partitioning techniques: prefix-based partitioning and Cassandra's default partitioner (i.e., Murmur3Partitioner). A geohash value is generated, which acts as the partition key and also supports efficient search, reducing the time taken to retrieve data. When users issue spatial queries, such as finding the nearest locations, the Cassandra database is searched under each partitioning technique. Query response times are compared to determine which method is more effective. Results show that the prefix-based partitioning technique is more efficient than the Murmur3 partitioning technique.
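The abstract does not say how the geohash or the prefix-based partition key is computed, so the following is only a minimal sketch of the general idea, assuming the standard base-32 geohash encoding and a partition key formed from the first few geohash characters. The function names `geohash_encode` and `partition_by_prefix`, the prefix length, and the sample points are hypothetical and are not taken from the chapter.

```python
from collections import defaultdict

# Standard geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def partition_by_prefix(points, prefix_len=3, precision=8):
    """Group (name, lat, lon) points by a geohash prefix.

    Assumption: the prefix plays the role of the partition key, so
    points that are close together tend to land in the same bucket.
    """
    buckets = defaultdict(list)
    for name, lat, lon in points:
        gh = geohash_encode(lat, lon, precision)
        buckets[gh[:prefix_len]].append((name, gh))
    return dict(buckets)


if __name__ == "__main__":
    sample = [
        ("a", 42.6, -5.6),
        ("b", 42.61, -5.61),        # close to "a"
        ("c", 57.64911, 10.40744),  # far from both
    ]
    for key, members in partition_by_prefix(sample).items():
        print(key, members)
```

In this sketch a nearest-location query only needs to inspect the bucket whose prefix matches the query point (and, in practice, the neighbouring prefixes). A hash partitioner such as Murmur3 would instead spread nearby geohashes across unrelated partitions.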

2019 ◽  
pp. 353-388

2019 ◽  
Vol 8 (3) ◽  
pp. 1-25 ◽  
Author(s):  
S. Vasavi ◽  
V.N. Priyanka G ◽  
Anu A. Gokhale

We are moving towards digitization, and our devices now produce a wide variety of data. This has paved the way for the emergence of NoSQL databases such as Cassandra, MongoDB, and Redis. Big data such as geospatial data enables geospatial analytics in applications such as tourism, marketing, and rural development. Spark frameworks provide operators for the storage and processing of distributed data. This article proposes “GeoRediSpark” to integrate Redis with Spark. Redis is an in-memory key-value store, so integrating it with Spark can extend the real-time processing of geospatial data. The article investigates storage and retrieval using Redis's built-in geospatial queries and adds two new geospatial operators, GeoWithin and GeoIntersect, to enhance the capabilities of Redis. Hashed indexing is used to improve processing performance. Redis metrics are compared on three benchmark datasets. A hashset is used to display geographic data. The output of geospatial queries is visualized in Tableau according to the type of place and the nature of the query.
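The abstract does not define the semantics of GeoWithin or GeoIntersect, so the sketch below does not reproduce GeoRediSpark or Redis internals. It only illustrates, in plain Python, the kind of radius query that Redis offers natively through commands such as GEOADD and GEOSEARCH, using the haversine great-circle distance. The names `haversine_km` and `within_radius`, the Earth radius constant, and the sample places are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(places, center_lat, center_lon, radius_km):
    """Return (name, distance_km) for places inside the radius, nearest first.

    `places` maps a name to a (lat, lon) tuple. This is a linear scan,
    so it shows the query semantics only, not an indexed implementation.
    """
    hits = []
    for name, (lat, lon) in places.items():
        d = haversine_km(center_lat, center_lon, lat, lon)
        if d <= radius_km:
            hits.append((name, d))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    places = {"near": (0.5, 0.0), "far": (10.0, 0.0)}
    print(within_radius(places, 0.0, 0.0, 100.0))
```

A store with a geospatial index avoids the linear scan by examining only the index cells that overlap the search circle, which is where hashed indexing of the kind the abstract mentions would help.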


2019 ◽  
Vol 13 (2) ◽  
pp. 14-31
Author(s):  
Mamdouh Alenezi ◽  
Muhammad Usama ◽  
Khaled Almustafa ◽  
Waheed Iqbal ◽  
Muhammad Ali Raza ◽  
...  

NoSQL-based databases are attractive for storing and managing big data, mainly because of their high scalability and data modeling flexibility. However, security in NoSQL-based databases is weak, which raises concerns for users. Specifically, the security of data at rest is a major concern for users who have deployed their NoSQL-based solutions on the cloud, because unauthorized access to the servers will easily expose the data. There have been some efforts to enable encryption of data at rest for NoSQL databases. However, existing solutions do not support secure query processing, and their data communication over the Internet and overall performance are also inadequate. In this article, the authors address the security of NoSQL data at rest by introducing a system that can dynamically encrypt and decrypt data, support secure query processing, and integrate seamlessly with any NoSQL-based database. The proposed solution is based on a combination of chaotic encryption and Order Preserving Encryption (OPE). In the experimental evaluation, the solution showed excellent results when integrated with MongoDB and compared with the existing state-of-the-art work.


Author(s):  
A. K. Tripathi ◽  
S. Agrawal ◽  
R. D. Gupta

Abstract. Sharing and managing geospatial data among different communities and users is a challenge that is suitably addressed by Spatial Data Infrastructure (SDI). SDI helps people discover, edit, process, and visualize spatial data. Users can download data from an SDI and process it using local resources. However, the large volume and heterogeneity of the data make this processing difficult at the client end. This problem can be resolved by orchestrating the Web Processing Service (WPS) with SDI. WPS is a service interface through which geoprocessing can be done over the internet. In this paper, a WPS-enabled SDI framework with OGC-compliant services is conceptualized and developed. It is based on a three-tier client-server architecture. OGC services are provided through GeoServer. The WPS extension of GeoServer is used to perform geospatial data processing and analysis. The developed framework is used to create a public health SDI prototype using Open Source Software (OSS). The integration of WPS with SDI demonstrates how the various data analysis operations of WPS can be performed over the web on distributed data sources provided by SDI.


Author(s):  
Arijit Sengupta ◽  
Ramesh Venkataraman

This chapter introduces a complete storage and retrieval architecture for a database environment for XML documents. DocBase, a prototype system based on this architecture, uses a flexible storage and indexing technique to allow highly expressive queries without the need to map documents to other database formats. DocBase integrates several techniques: (i) a formal model called Heterogeneous Nested Relations (HNR), (ii) a conceptual model called XER (Extensible Entity Relationship), (iii) formal query languages (Document Algebra and Calculus), (iv) a practical query language (Document SQL, or DSQL), (v) a visual query formulation method, QBT (Query By Templates), and (vi) the DocBase query processing architecture. This paper focuses on the overall architecture of DocBase, including implementation details, describes the query-processing framework in detail, and presents results from various performance tests. The paper summarizes experimental and usability analyses to demonstrate its feasibility as a general architecture for native as well as embedded document manipulation methods.

