scholarly journals State-of-the-Art Geospatial Information Processing in NoSQL Databases

2020 ◽  
Vol 9 (5) ◽  
pp. 331
Author(s):  
Dongming Guo ◽  
Erling Onstein

Geospatial information has been indispensable for many application fields, including traffic planning, urban planning, and energy management. Geospatial data are mainly stored in relational databases that have been developed over several decades, and most geographic information applications are desktop applications. With the arrival of big data, geospatial information applications are also being modified into, e.g., mobile platforms and Geospatial Web Services, which require changeable data schemas, faster query response times, and more flexible scalability than traditional spatial relational databases currently have. To respond to these new requirements, NoSQL (Not only SQL) databases are now being adopted for geospatial data storage, management, and queries. This paper reviews state-of-the-art geospatial data processing in the 10 most popular NoSQL databases. We summarize the supported geometry objects, main geometry functions, spatial indexes, query languages, and data formats of these 10 NoSQL databases. Moreover, the pros and cons of these NoSQL databases are analyzed in terms of geospatial data processing. A literature review and analysis showed that current document databases may be more suitable for massive geospatial data processing than are other NoSQL databases due to their comprehensive support for geometry objects and data formats and their performance, geospatial functions, index methods, and academic development. However, depending on the application scenarios, graph databases, key-value, and wide column databases have their own advantages.

Author(s):  
Renu Chaudhary ◽  
Gagangeet Singh

NoSQL databases (commonly interpreted by developers as „not only SQL databases‟ and not „no SQL‟) is an emerging alternative to the most widely used relational databases. As the name suggests, it does not completely replace SQL but compliments it in such a way that they can co-exist. In this paper we will be discussing the NoSQL data model, types of NoSQL data stores, characteristics and features of each data store, query languages used in NoSQL, advantages and disadvantages of NoSQL over RDBMS and the future prospects of NoSQL. Motivation/Background:NoSQL systems exhibit the ability to store and index arbitrarily big data sets while enabling a large amount of concurrent user requests. Method:Many people think NoSQL is a derogatory term created to poke at SQL. In reality, the term means Not Only SQL. The idea is that both technologies can coexist and each has its place. Results:Large-scale data processing (parallel processing over distributed systems); Embedded IR (basic machine-to-machine information look-up & retrieval); Exploratory analytics on semi-structured data (expert level); Large volume data storage (unstructured, semi-structured, small-packet structured). Conclusions:This study report motivation to provide an independent understanding of the strengths and weaknesses of various NoSQL database approaches to supporting applications that process huge volumes of data; as well as to provide a global overview of this non-relational  NoSQL databases.


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255562
Author(s):  
Eman Khashan ◽  
Ali Eldesouky ◽  
Sally Elghamrawy

The growing popularity of big data analysis and cloud computing has created new big data management standards. Sometimes, programmers may interact with a number of heterogeneous data stores depending on the information they are responsible for: SQL and NoSQL data stores. Interacting with heterogeneous data models via numerous APIs and query languages imposes challenging tasks on multi-data processing developers. Indeed, complex queries concerning homogenous data structures cannot currently be performed in a declarative manner when found in single data storage applications and therefore require additional development efforts. Many models were presented in order to address complex queries Via multistore applications. Some of these models implemented a complex unified and fast model, while others’ efficiency is not good enough to solve this type of complex database queries. This paper provides an automated, fast and easy unified architecture to solve simple and complex SQL and NoSQL queries over heterogeneous data stores (CQNS). This proposed framework can be used in cloud environments or for any big data application to automatically help developers to manage basic and complicated database queries. CQNS consists of three layers: matching selector layer, processing layer, and query execution layer. The matching selector layer is the heart of this architecture in which five of the user queries are examined if they are matched with another five queries stored in a single engine stored in the architecture library. This is achieved through a proposed algorithm that directs the query to the right SQL or NoSQL database engine. Furthermore, CQNS deal with many NoSQL Databases like MongoDB, Cassandra, Riak, CouchDB, and NOE4J databases. This paper presents a spark framework that can handle both SQL and NoSQL Databases. Four scenarios’ benchmarks datasets are used to evaluate the proposed CQNS for querying different NoSQL Databases in terms of optimization process performance and query execution time. The results show that, the CQNS achieves best latency and throughput in less time among the compared systems.


Author(s):  
Sangeeta Gupta

The massive amount of data collected by various fields is a challenging aspect for analysis using the available storage technologies. Relational databases are a traditional approach of data storage more suitable for structured data formats and are constrained by ACID properties. As the modern world data in the form of word documents, pdf files, audio and video formats is unstructured, where tables and schema definition is not a major concern. Relational databases such as Mysql may not be suitable to serve such Bigdata. An alternate approach is to use the emerging Nosql databases. This paper presents a comparative analysis of Nosql types such as Hbase, Mongodb, Simple DB and Big Table with relational database like Mysql and specifies their limitations when applied to real world problems. It also proposes solution to overcome these limitations using an integrated data store which serve to be beneficial over the mentioned Nosql and Mysql stores in terms of efficiently implementing simple and complex queries yielding better performance.


Author(s):  
Ganesh Chandra Deka

NoSQL databases are designed to meet the huge data storage requirements of cloud computing and big data processing. NoSQL databases have lots of advanced features in addition to the conventional RDBMS features. Hence, the “NoSQL” databases are popularly known as “Not only SQL” databases. A variety of NoSQL databases having different features to deal with exponentially growing data-intensive applications are available with open source and proprietary option. This chapter discusses some of the popular NoSQL databases and their features on the light of CAP theorem.


Author(s):  
Berkay Aydin ◽  
Vijay Akkineni ◽  
Rafal A Angryk

With the ever-growing nature of spatiotemporal data, it is inevitable to use non-relational and distributed database systems for storing massive spatiotemporal datasets. In this chapter, the important aspects of non-relational (NoSQL) databases for storing large-scale spatiotemporal trajectory data are investigated. Mainly, two data storage schemata are proposed for storing trajectories, which are called traditional and partitioned data models. Additionally spatiotemporal and non-spatiotemporal indexing structures are designed for efficiently retrieving data under different usage scenarios. The results of the experiments exhibit the advantages of utilizing data models and indexing structures for various query types.


Author(s):  
Longgang Xiang ◽  
Xiaotian Shao ◽  
Dehao Wang

Supporting large amounts of spatial data is a significant characteristic of modern databases. However, unlike some mature relational databases, such as Oracle and PostgreSQL, most of current burgeoning NoSQL databases are not well designed for storing geospatial data, which is becoming increasingly important in various fields. In this paper, we propose a novel method to provide R-tree index, as well as corresponding spatial range query and nearest neighbour query functions, for MongoDB, one of the most prevalent NoSQL databases. First, after in-depth analysis of MongoDB’s features, we devise an efficient tabular document structure which flattens R-tree index into MongoDB collections. Further, relevant mechanisms of R-tree operations are issued, and then we discuss in detail how to integrate R-tree into MongoDB. Finally, we present the experimental results which show that our proposed method out-performs the built-in spatial index of MongoDB. Our research will greatly facilitate big data management issues with MongoDB in a variety of geospatial information applications.


2017 ◽  
Vol 2 (2) ◽  
pp. 103
Author(s):  
Danny Kriestanto ◽  
Alif Benden Arnado

The new technology of database has moved forward the relational databases. Now, the massive and unstructured data encourage experts to create a new type of database without using query. One of this technology is called NoSQL (Not Only SQL). One of the developing RDBMS that using this technique is MongoDB, which already supporting data storage technology that is no longer need for structured tables and rigid-typed of data. The schema was made flexible to handle the changes of data. The MongoDB data collecting characteristics in the form of arrays is considered suitable for the implementation of boarding house searching where each of the boarding houses have their own scenario structures. MongoDB also supports several programming language, including PHP with Bootstrap material as interface. The results of the research showed that there are alot of difference in implementing a NoSQL database with the regular relational one. NoSQL databases considered alot more complicated in structure, data type, even the CRUD system. The results also showed that in order to view an array inside another array will need two processes.


Author(s):  
Longgang Xiang ◽  
Xiaotian Shao ◽  
Dehao Wang

Supporting large amounts of spatial data is a significant characteristic of modern databases. However, unlike some mature relational databases, such as Oracle and PostgreSQL, most of current burgeoning NoSQL databases are not well designed for storing geospatial data, which is becoming increasingly important in various fields. In this paper, we propose a novel method to provide R-tree index, as well as corresponding spatial range query and nearest neighbour query functions, for MongoDB, one of the most prevalent NoSQL databases. First, after in-depth analysis of MongoDB’s features, we devise an efficient tabular document structure which flattens R-tree index into MongoDB collections. Further, relevant mechanisms of R-tree operations are issued, and then we discuss in detail how to integrate R-tree into MongoDB. Finally, we present the experimental results which show that our proposed method out-performs the built-in spatial index of MongoDB. Our research will greatly facilitate big data management issues with MongoDB in a variety of geospatial information applications.


Author(s):  
Yi-Cheng Tu ◽  
Gang Ding

Database administration (tuning) is the process of adjusting database configurations in order to accomplish desirable performance goals. This job is performed by human operators called database administrators (DBAs) who are generally well-paid, and are becoming more and more expensive with the increasing complexity and scale of modern databases. There has been considerable effort dedicated to reducing such cost (which often dominates the total ownership cost of missioncritical databases) by making database tuning more automated and transparent to users (Chaudhuri et al, 2004; Chaudhuri and Weikum, 2006). Research in this area seeks ways to automate the hardware deployment, physical database design, parameter configuration, and resource management in such systems. The goal is to achieve acceptable performance on the whole system level without (or with limited) human intervention. According to Weikum et al. (2002), problems in this category can be stated as: workload × configuration (?) ? performance which means that, given the features of the incoming workload to the database, we are to find the right settings for all system knobs such that the performance goals are satisfied The following two are representatives of a series of such tuning problems in different databases: • Problem 1: Maintenance of multi-class servicelevel agreements (SLA) in relational databases. Database service providers usually offer various levels of performance guarantees to requests from different groups of customers. Fulfillment of such guarantees (SLAs) is accomplished by allocating different amounts of system resources to different queries. For example, query response time is negatively related to the amount of memory buffer assigned to that query. We need to dynamically allocate memory to individual queries such that the absolute or relative response times of queries from different users are satisfied. • Problem 2: Load shedding in stream databases. Stream databases are used for processing data generated continuously from sources such as a sensor network. In streaming databases, data processing delay, i.e., the time consumed to process a data point, is the most critical performance metric (Tatbul et al., 2003). The ability to remain within a desired level of delay is significantly hampered under situations of overloading (caused by bursty data arrivals and time-varying unit data processing cost). When overloaded, some data is discarded (i.e., load shedding) in order to keep pace with the incoming load. The system needs to continuously adjust the amount of data to be discarded such that 1) delay is maintained under a desirable level; 2) data is not discarded unnecessarily.


Author(s):  
Kornelije Rabuzin

In the past few years, many NoSQL databases have emerged, including graph databases. NoSQL databases have certain advantages and they can be used in certain domains as an alternative to relational databases. In order to use graph databases, one needs to be familiar with specific languages like Cypher Query Language (CQL) or Gremlin. However, some statements in CQL can be considered too complex for end users as it is shown later on. Because of that, the main idea of this chapter is to explore two other languages for graph databases. One of them is new and it is used to pose queries visually. Since CQL does not support recursion, views, etc., the other language is used to show how to use recursion and views on a graph database.


Sign in / Sign up

Export Citation Format

Share Document