A Framework for Executing Complex Querying for Relational and NoSQL Databases (CQNS)

Author(s):  
Eman A. Khashan ◽  
Ali I. El Desouky ◽  
Sally M. Elghamrawy

The growth of data on the web poses major challenges. Handling the amount of stored data and querying multiple data sources have become necessary features of large data systems. Many platforms are used to handle the NoSQL database model, such as Spark, H2O, and Hadoop HDFS/MapReduce, which are suitable for controlling and managing large volumes of data. Application developers face difficult tasks when interacting with mixed data models through different APIs and query languages. This paper proposes a complex SQL and NoSQL query framework (CQNS) that acts as an interpreter, sending complex queries received from any data store to the corresponding execution engine. The proposed framework supports application queries and database transformation at the same time, which in turn speeds up processing. Moreover, CQNS handles many NoSQL databases, such as MongoDB and Cassandra. This paper provides a Spark framework that can handle SQL and NoSQL databases. The work also examines the importance of MongoDB block sharding and composition. The Cassandra database deals with two types of partitioning: vertex and edge. Datasets from four benchmark scenarios are used to evaluate the proposed CQNS for querying various NoSQL databases in terms of optimization performance and query execution time. The results show that, among the compared systems, CQNS achieves the best latency and throughput in less time.

The chapter explains how NoSQL databases work. Since different NoSQL databases are classified into four categories (key-value, column-family, document, and graph stores), three main features of NoSQL databases are chosen, and their practical implementation is explained using examples of one or two typical NoSQL databases from each NoSQL database category. The three chosen features are: distributed storage architecture that comprises the distributed, cluster-oriented, and horizontally scalable features; consistency model that refers to the CAP and BASE features; query execution that refers to the schemaless feature. These features are chosen because, through them, it is possible to describe most of the new and innovative approaches that NoSQL databases bring to the database world.
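As a rough illustration of the four categories named above, the sketch below shapes one and the same user record for each data model, using plain Python structures as stand-ins. The record, the field names, and the helper `city_of` are all hypothetical; no real database API is involved.

```python
# Illustrative only: one user record shaped for each of the four
# NoSQL categories, with plain Python structures as stand-ins.

# Key-value store: an opaque value looked up by a single key.
key_value = {"user:42": '{"name": "Ada", "city": "Cairo"}'}

# Column-family store: row key -> column family -> column -> value.
column_family = {
    "42": {"profile": {"name": "Ada"}, "location": {"city": "Cairo"}}
}

# Document store: a nested, schemaless document; fields may differ
# from one document to the next.
document = {"_id": 42, "name": "Ada", "address": {"city": "Cairo"}}

# Graph store: nodes plus labelled edges between them.
graph = {
    "nodes": {42: {"name": "Ada"}, 7: {"name": "Cairo"}},
    "edges": [(42, "LIVES_IN", 7)],
}


def city_of(user_id):
    """Follow the LIVES_IN edge to find a user's city in the toy graph."""
    for src, label, dst in graph["edges"]:
        if src == user_id and label == "LIVES_IN":
            return graph["nodes"][dst]["name"]
    return None
```

The same question ("which city does user 42 live in?") is a key lookup in the first three shapes and an edge traversal in the last, which is one way to see why query execution differs across the categories.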


Author(s):  
Mainak Adhikari ◽  
Sukhendu Kar

NoSQL databases provide a mechanism for storing and accessing data across multiple storage clusters. They are finding significant and growing industry use in meeting the huge data storage requirements of big data, real-time applications, and cloud computing. NoSQL databases have many advantages over conventional RDBMS features. NoSQL systems are also referred to as “Not only SQL” to emphasize that they may in fact allow structured languages like SQL and, additionally, semi-structured as well as unstructured languages. A variety of NoSQL databases with different features for dealing with exponentially growing data-intensive applications are available as open source and proprietary options, mostly promoted and used by social networking sites. This chapter discusses some features and challenges of NoSQL databases, and some of the popular NoSQL databases with their features in the light of the CAP theorem.


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255562
Author(s):  
Eman Khashan ◽  
Ali Eldesouky ◽  
Sally Elghamrawy

The growing popularity of big data analysis and cloud computing has created new big data management standards. Programmers may have to interact with a number of heterogeneous data stores, both SQL and NoSQL, depending on the information they are responsible for. Interacting with heterogeneous data models via numerous APIs and query languages imposes challenging tasks on multi-data processing developers. Indeed, complex queries concerning homogenous data structures cannot currently be performed in a declarative manner when found in single data storage applications and therefore require additional development effort. Many models have been presented to address complex queries via multistore applications. Some of these models implement a complex, unified, and fast model, while others are not efficient enough to solve this type of complex database query. This paper provides an automated, fast, and easy unified architecture to solve simple and complex SQL and NoSQL queries over heterogeneous data stores (CQNS). The proposed framework can be used in cloud environments or for any big data application to automatically help developers manage basic and complicated database queries. CQNS consists of three layers: a matching selector layer, a processing layer, and a query execution layer. The matching selector layer is the heart of this architecture: five of the user queries are examined to see whether they match another five queries stored in a single engine in the architecture library. This is achieved through a proposed algorithm that directs the query to the right SQL or NoSQL database engine. Furthermore, CQNS deals with many NoSQL databases, such as MongoDB, Cassandra, Riak, CouchDB, and Neo4j. This paper presents a Spark framework that can handle both SQL and NoSQL databases. Four scenarios’ benchmark datasets are used to evaluate the proposed CQNS for querying different NoSQL databases in terms of optimization process performance and query execution time. The results show that CQNS achieves the best latency and throughput in less time among the compared systems.
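The idea of a selector that directs each query to the right engine can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the abstract does not give the matching rules, so the regular expressions in `ENGINE_PATTERNS`, the engine labels, and the function `select_engine` are all assumptions made for the example.

```python
import re

# Hypothetical patterns keyed by engine label. The actual matching
# rules used by CQNS are not described in the abstract.
ENGINE_PATTERNS = [
    ("sql", re.compile(r"^\s*(select|insert|update|delete)\b", re.IGNORECASE)),
    ("mongodb", re.compile(r"^\s*db\.\w+\.(find|aggregate|insert\w*)\(", re.IGNORECASE)),
    ("neo4j", re.compile(r"^\s*(match|create|merge)\s*\(", re.IGNORECASE)),
]


def select_engine(query):
    """Return the label of the first engine whose pattern matches the query."""
    for engine, pattern in ENGINE_PATTERNS:
        if pattern.search(query):
            return engine
    return "unknown"


if __name__ == "__main__":
    for q in ("SELECT * FROM users", "db.users.find({})", "MATCH (n) RETURN n"):
        print(q, "->", select_engine(q))
```

Purely syntactic matching like this cannot tell SQL apart from look-alike languages such as Cassandra's CQL, so a practical selector would presumably also need to know which store holds the data being queried.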


Author(s):  
Sangeeta Gupta

The massive amount of data collected in various fields is challenging to analyze using the available storage technologies. Relational databases are a traditional approach to data storage, more suitable for structured data formats and constrained by ACID properties. Because modern-world data in the form of Word documents, PDF files, audio, and video is unstructured, and tables and schema definitions are not a major concern for it, relational databases such as MySQL may not be suitable to serve such big data. An alternative approach is to use the emerging NoSQL databases. This paper presents a comparative analysis of NoSQL stores such as HBase, MongoDB, SimpleDB, and Bigtable with a relational database like MySQL, and specifies their limitations when applied to real-world problems. It also proposes a solution to overcome these limitations using an integrated data store, which proves beneficial over the mentioned NoSQL and MySQL stores in terms of efficiently implementing simple and complex queries and yielding better performance.


Kilat ◽  
2019 ◽  
Vol 8 (1) ◽  
Author(s):  
Widya Nita Suliyanti

With the increasing need to store large amounts of unstructured and semi-structured data, databases that used to rely mostly on SQL technology have begun using NoSQL. The purpose of this paper is to conduct a literature study of the characteristics, advantages, and disadvantages of SQL and NoSQL databases. This literature study shows that SQL and NoSQL databases differ in their characteristics (ACID for SQL vs. BASE and CAP for NoSQL); data model (relational for SQL and key-value for NoSQL); data structure (structured for SQL and non- or semi-structured for NoSQL); processing (subqueries, joins, grouping/aggregation, and complex queries are faster only for SQL); and the number of servers used (a single large server for SQL and multiple levels for NoSQL). A literature review of further SQL and NoSQL applications is needed in the future.


2018 ◽  
Vol 2 (2) ◽  
pp. 51
Author(s):  
M. Sandeep Kumar ◽  
Prabhu .J

A huge amount of data is manipulated through web applications such as Facebook, Twitter, and other social sites. Most of this data is unstructured. Relational databases are not well suited to storing, processing, and analyzing data at this scale, which opens the way to NoSQL databases for handling big data. In this paper, we present the performance of store and query operations in a NoSQL database, estimating the performance of both read and write operations using simple and complex queries. The results show that Cassandra outperforms the relational database. Most organizations use only HBase and Cassandra for cost reasons. The paper also compares various NoSQL databases and discusses issues that arise when using them.


2020 ◽  
Vol 5 (2) ◽  
pp. 13-32
Author(s):  
Hye-Kyung Yang ◽  
Hwan-Seung Yong

Abstract
Purpose: We propose InParTen2, a multi-aspect parallel factor analysis three-dimensional tensor decomposition algorithm based on the Apache Spark framework. The proposed method reduces re-decomposition cost and can handle large tensors.
Design/methodology/approach: Considering that tensor addition increases the size of a given tensor along all axes, the proposed method decomposes incoming tensors using existing decomposition results without generating sub-tensors. Additionally, InParTen2 avoids the calculation of Khatri–Rao products and minimizes shuffling by using the Apache Spark platform.
Findings: The performance of InParTen2 is evaluated by comparing its execution time and accuracy with those of existing distributed tensor decomposition methods on various datasets. The results confirm that InParTen2 can process large tensors and reduce the re-calculation cost of tensor decomposition. Consequently, the proposed method is faster than existing tensor decomposition algorithms and can significantly reduce re-decomposition cost.
Research limitations: There are several Hadoop-based distributed tensor decomposition algorithms as well as MATLAB-based decomposition methods. However, the former require longer iteration time, and therefore their execution time cannot be compared with that of Spark-based algorithms, whereas the latter run on a single machine, thus limiting their ability to handle large data.
Practical implications: The proposed algorithm can reduce re-decomposition cost when tensors are added to a given tensor by decomposing them based on existing decomposition results without re-decomposing the entire tensor.
Originality/value: The proposed method can handle large tensors and is fast within the limited-memory framework of Apache Spark. Moreover, InParTen2 can handle static as well as incremental tensor decomposition.
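For readers unfamiliar with the Khatri–Rao product that the method avoids computing, the sketch below shows the standard definition (the column-wise Kronecker product of two matrices with the same number of columns) in plain Python. It illustrates only the operation itself, not InParTen2; the function name `khatri_rao` and the list-of-rows matrix representation are choices made for this example.

```python
def khatri_rao(a, b):
    """Column-wise Kronecker product of two matrices given as lists of rows.

    For a of shape (I, R) and b of shape (J, R), the result has shape
    (I * J, R): row i * J + j holds a[i][r] * b[j][r] for each column r.
    """
    if not a or not b or len(a[0]) != len(b[0]):
        raise ValueError("both matrices need the same, non-zero number of columns")
    rank = len(a[0])
    return [
        [a_row[r] * b_row[r] for r in range(rank)]
        for a_row in a
        for b_row in b
    ]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    for row in khatri_rao(A, B):
        print(row)
```

Because the result has I × J rows, materializing it for large factor matrices is expensive in memory, which is the general reason decomposition algorithms try to avoid forming it explicitly.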


Author(s):  
Omoruyi Osemwegie ◽  
Kennedy Okokpujie ◽  
Nsikan Nkordeh ◽  
Charles Ndujiuba ◽  
Samuel John ◽  
...  

Increasing requirements for scalability and elasticity of data storage for web applications have made Not Structured Query Language (NoSQL) databases increasingly valuable to web developers. One such NoSQL database solution is Redis. A budding alternative to Redis is SSDB, which is also a key-value store but is disk-based. The aim of this research work is to benchmark both databases (Redis and SSDB) using the Yahoo Cloud Serving Benchmark (YCSB). YCSB is a platform that has been used to compare and benchmark similar NoSQL database systems. Both databases were given variable workloads to identify the throughput of all given operations. The results obtained show that SSDB gives better throughput than Redis for the majority of operations.
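Throughput in this kind of benchmark is the number of completed operations divided by elapsed time. The sketch below measures it for a mixed read/update workload against an in-memory `dict` that stands in for a key-value store. It does not use YCSB, Redis, or SSDB; the functions `run_workload` and `throughput` and their parameters are hypothetical names for this illustration.

```python
import random
import time


def run_workload(store, keys, n_ops, read_fraction=0.5, seed=0):
    """Run a mixed read/update workload and return (counts, elapsed_seconds).

    `store` is any dict-like object; a plain dict stands in for a real
    key-value database here.
    """
    rng = random.Random(seed)
    counts = {"read": 0, "update": 0}
    start = time.perf_counter()
    for _ in range(n_ops):
        key = rng.choice(keys)
        if rng.random() < read_fraction:
            store.get(key)  # read
            counts["read"] += 1
        else:
            store[key] = rng.random()  # update
            counts["update"] += 1
    elapsed = time.perf_counter() - start
    return counts, elapsed


def throughput(n_ops, elapsed_seconds):
    """Operations per second; infinite if the timer recorded no elapsed time."""
    return n_ops / elapsed_seconds if elapsed_seconds > 0 else float("inf")


if __name__ == "__main__":
    counts, elapsed = run_workload({}, ["a", "b", "c"], 100_000, read_fraction=0.95)
    print(counts, f"{throughput(100_000, elapsed):.0f} ops/sec")
```

Varying `read_fraction` mimics the idea of running different workload mixes, so that per-operation throughput can be compared between two stores under the same conditions.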


Author(s):  
Ganesh Chandra Deka

NoSQL databases are designed to meet the huge data storage requirements of cloud computing and big data processing. NoSQL databases have many advanced features in addition to the conventional RDBMS features. Hence, “NoSQL” databases are popularly known as “Not only SQL” databases. A variety of NoSQL databases with different features for dealing with exponentially growing data-intensive applications are available as open source and proprietary options. This chapter discusses some of the popular NoSQL databases and their features in the light of the CAP theorem.


Author(s):  
Zongmin Ma ◽  
Li Yan

The Resource Description Framework (RDF) is a model for representing information resources on the web. With the widespread acceptance of RDF as the de facto standard recommended by the W3C (World Wide Web Consortium) for the representation and exchange of information on the web, a huge amount of RDF data is proliferating and becoming available. RDF data management is therefore of increasing importance and has attracted attention in the database community as well as the Semantic Web community. Much work has been devoted to proposing different solutions for storing large-scale RDF data efficiently. To manage massive RDF data, NoSQL (not only SQL) databases have been used as scalable RDF data stores. This chapter focuses on using various NoSQL databases to store massive RDF data. An up-to-date overview of the current state of the art in RDF data storage in NoSQL databases is provided. The chapter also aims to offer suggestions for future research.

