When Relational-Based Applications Go to NoSQL Databases: A Survey

Several data-centric applications today produce and manipulate a large volume of data, the so-called Big Data. Traditional databases, in particular relational databases, are not suitable for Big Data management. As a consequence, some approaches that allow the definition and manipulation of large relational data sets stored in NoSQL databases through an SQL interface have been proposed, focusing on scalability and availability. This paper presents a comparative analysis of these approaches based on an architectural classification that organizes them according to their system architectures. Our motivation is that wrapping is a relevant strategy for relational-based applications that intend to move relational data to NoSQL databases (usually maintained in the cloud). We also claim that this research area has some open issues, given that most approaches deal with only a subset of SQL operations or support only specific target NoSQL databases. Our intention with this survey is, therefore, to contribute to the state of the art in this research area and also provide a basis for choosing or even designing a relational-to-NoSQL data wrapping solution.

Which Way to Go for the Future

Bridging Relational and NoSQL Databases - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-3385-6.ch008 ◽

2018 ◽

pp. 311-328

Keyword(s):

Big Data ◽

Relational Database ◽

Data Model ◽

Driving Force ◽

Relational Databases ◽

High Availability ◽

NoSQL Databases ◽

The Future ◽

Data Requirements ◽

Open Issues

The chapter presents how relational databases respond to typical NoSQL features and, vice versa, how NoSQL databases respond to typical relational features. Open issues related to the integration of relational and NoSQL databases, as well as next-generation database features, are discussed. The big relational database vendors have continuously worked to incorporate NoSQL features into their databases, while NoSQL vendors are trying to make their products more like relational databases. The convergence of these two groups of databases has been a driving force in the evolution of the database market, in establishing a new level of focus on resolving big data requirements, and in enabling users to fully use the potential of their data, wherever it is stored, in relational or NoSQL databases. In turn, the database of choice in the future will likely be one that provides the best of both worlds: a flexible data model, high availability, and enterprise reliability.

Formalizing the Mapping of UML Conceptual Schemas to Column-Oriented Databases

International Journal of Data Warehousing and Mining ◽

10.4018/ijdwm.2018070103 ◽

2018 ◽

Vol 14 (3) ◽

pp. 44-68 ◽

Cited By ~ 1

Author(s):

Fatma Abdelhedi ◽

Amal Ait Brahim ◽

Gilles Zurfluh

Keyword(s):

Big Data ◽

Data Warehouse ◽

Relational Databases ◽

Traditional Approach ◽

Physical Models ◽

Decision Making Process ◽

NoSQL Databases ◽

Care Field ◽

Sufficient Degree

Nowadays, most organizations need to improve their decision-making process using Big Data. To achieve this, they have to store Big Data, analyze it, and transform the results into useful and valuable information. Doing so means dealing with new challenges in designing and creating a data warehouse. Traditionally, creating a data warehouse followed a well-governed process based on relational databases. The influence of Big Data challenged this traditional approach, primarily due to the changing nature of data. As a result, using NoSQL databases has become a necessity to handle Big Data challenges. In this article, the authors show how to create a data warehouse on NoSQL systems. They propose the Object2NoSQL process, which generates column-oriented physical models starting from a UML conceptual model. To ensure efficient automatic transformation, they propose a logical model that exhibits a sufficient degree of independence to enable its mapping to one or more column-oriented platforms. The authors evaluate their approach experimentally using a case study in the health care field.

Big Data Analytics

Handbook of Research on Cloud Computing and Big Data Applications in IoT - Advances in Computer and Electrical Engineering ◽

10.4018/978-1-5225-8407-0.ch004 ◽

2019 ◽

pp. 67-81

Author(s):

Nitigya Sambyal ◽

Poonam Saini ◽

Rupali Syal

Keyword(s):

Big Data ◽

Open Problem ◽

Data Analytics ◽

Big Data Analytics ◽

Data Capture ◽

Data Sets ◽

Full Potential ◽

Processing Application ◽

NoSQL Databases ◽

The World

The world is increasingly driven by huge amounts of data. Big data refers to data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Healthcare analytics is a prominent area of big data analytics. It has led to a significant reduction in the morbidity and mortality associated with disease. To harness the full potential of big data, various tools such as Apache Sentry, BigQuery, NoSQL databases, Hadoop, and JethroData are available for processing it. However, with such enormous amounts of information comes the complexity of data management. Other big data challenges occur during data capture, storage, analysis, search, transfer, information privacy, visualization, querying, and update. The chapter focuses on understanding the meaning and concept of big data, analytics of big data, its role in healthcare, various application areas, and the trends and tools used to process big data, along with open problems.

Security Analysis of MongoDB

10.31219/osf.io/c3w7y ◽

2020 ◽

Author(s):

Sahib Singh

Keyword(s):

Cloud Computing ◽

Big Data ◽

Relational Databases ◽

Security Analysis ◽

NoSQL Databases ◽

Computing Platforms

NoSQL databases are a form of non-relational database whose primary purpose is to store and retrieve data. Due to recent advancements in cloud computing platforms and the emergence of Big Data, NoSQL databases are becoming more popular than ever. In this paper, we analyze the fundamental security features and vulnerabilities of MongoDB and how it performs compared to relational databases on these fronts.

Epilogue

Topology: A Very Short Introduction ◽

10.1093/actrade/9780198832683.003.0007 ◽

2019 ◽

pp. 128-130

Author(s):

Richard Earl

Keyword(s):

Big Data ◽

Data Analysis ◽

Research Area ◽

General Topology ◽

Topological Data Analysis ◽

Data Sets ◽

Current Interest ◽

And Topology ◽

Active Research ◽

Active Research Area

Topology remains a large, active research area in mathematics. Unsurprisingly, its character has changed over the last century: there is considerably less current interest in general topology, but whole new areas have emerged, such as topological data analysis to help analyze big data sets. The Epilogue concludes that the interfaces of topology with other areas have remained rich and numerous, and it can be hard to tell where topology stops and geometry or algebra or analysis or physics begins. Often that richness comes from studying structures that have interconnected flavours of algebra, geometry, and topology, but sometimes a result, seemingly of an entirely algebraic nature say, can be proved by purely topological means.

Locally Consistent Bayesian Network Scores for Multi-Relational Data

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/375 ◽

2017 ◽

Cited By ~ 1

Author(s):

Oliver Schulte ◽

Sajjad Gholami

Keyword(s):

Model Selection ◽

Bayesian Network ◽

Relational Databases ◽

Structure Learning ◽

Empirical Evaluation ◽

Relational Data ◽

Data Sets ◽

Gain Function ◽

Single Model ◽

Log Linear

An important task for relational learning is Bayesian network (BN) structure learning. A fundamental component of structure learning is a model selection score that measures how well a model fits a dataset. We describe a new method that upgrades, for multi-relational databases, a log-linear BN score designed for single-table i.i.d. data. Chickering and Meek showed that for i.i.d. data, standard BN scores are locally consistent, meaning that their maxima converge to an optimal model that represents the data generating distribution and contains no redundant edges. Our main theorem establishes that if a model selection score is locally consistent for i.i.d. data, then our upgraded gain function is locally consistent for relational data as well. To our knowledge, this is the first consistency result for relational structure learning. A novel aspect of our approach is employing a gain function that compares two models: a current vs. an alternative BN structure. In contrast, previous approaches employed a score that is a function of a single model only. Empirical evaluation on six benchmark relational databases shows that our gain function is also practically useful: on realistically sized data sets, it selects informative BN structures with a better data fit than those selected by baseline single-model scores.

Comparative study of NoSQL databases for big data storage

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.6.10072 ◽

2018 ◽

Vol 7 (2.6) ◽

pp. 83

Author(s):

Gourav Bathla ◽

Rinkle Rani ◽

Himanshu Aggarwal

Keyword(s):

Big Data ◽

Data Storage ◽

Relational Databases ◽

Large Scale ◽

Technology Selection ◽

Business Organizations ◽

NoSQL Databases ◽

Real Solutions ◽

Adequate Knowledge ◽

Long Time

Big data is a large-scale collection of structured, semi-structured, and unstructured data. It is generated by social networks, business organizations, and the interactions and views of socially connected users. It is used for important decision making in business and research organizations. Storage that can process this scale of data efficiently and extract important information with a short response time is a necessity in the current competitive environment. Relational databases, which have ruled storage technology for a long time, seem unsuitable for mixed types of data. Data cannot be represented just in the form of rows and columns in tables. NoSQL (Not only SQL) is complementary to SQL technology and can provide various storage formats that are compatible with high-velocity, large-volume, and high-variety data. NoSQL databases fall into four categories: column-oriented, key-value, graph-based, and document-oriented databases. There are approximately 120 real solutions in these categories; the most commonly used ones are elaborated in the Introduction section. Several research works have analyzed these NoSQL technology solutions, but they do not state the situations in which a particular data storage technique should be chosen. In this study and analysis, we try to give the reader an answer on technology selection based on specific requirements. In previous research, comparisons among NoSQL data storage techniques have been described using real examples such as MongoDB and Neo4j. Our observation is that if users have adequate knowledge of the NoSQL categories and how they compare, it is easy for them to choose the most suitable category and then select a real solution from that category.

Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies

BioMed Research International ◽

10.1155/2015/904541 ◽

2015 ◽

Vol 2015 ◽

pp. 1-15 ◽

Cited By ~ 15

Author(s):

Alexandre G. de Brevern ◽

Jean-Philippe Meyniel ◽

Cécile Fairhead ◽

Cécile Neuvéglise ◽

Alain Malpertuy

Keyword(s):

Big Data ◽

Human Genome ◽

Business Intelligence ◽

Relational Databases ◽

Complete Sequence ◽

NoSQL Databases ◽

Technology Innovations ◽

Data Formats ◽

IT Innovation ◽

High Processing

Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data Information Technology innovations from web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from Business Intelligence that allow one to regain interactivity whatever the volume of data is. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries.

A NOVEL TECHNIQUE IN NoSQL DATA EXTRACTION

International Journal of Research -GRANTHAALAYAH ◽

10.29121/granthaalayah.v1.i1.2014.3086 ◽

2014 ◽

Vol 1 (1) ◽

pp. 51-58

Author(s):

Renu Chaudhary ◽

Gagangeet Singh

Keyword(s):

Data Storage ◽

Relational Databases ◽

Data Extraction ◽

Query Languages ◽

Data Sets ◽

Volume Data ◽

NoSQL Databases ◽

Advantages And Disadvantages ◽

Data Store ◽

Concurrent User

NoSQL databases (commonly interpreted by developers as "not only SQL databases" and not "no SQL") are an emerging alternative to the most widely used relational databases. As the name suggests, NoSQL does not completely replace SQL but complements it in such a way that they can co-exist. In this paper, we discuss the NoSQL data model, types of NoSQL data stores, characteristics and features of each data store, query languages used in NoSQL, advantages and disadvantages of NoSQL over RDBMS, and the future prospects of NoSQL. Motivation/Background: NoSQL systems exhibit the ability to store and index arbitrarily big data sets while enabling a large number of concurrent user requests. Method: Many people think NoSQL is a derogatory term created to poke at SQL. In reality, the term means Not Only SQL. The idea is that both technologies can coexist and each has its place. Results: Large-scale data processing (parallel processing over distributed systems); embedded IR (basic machine-to-machine information look-up and retrieval); exploratory analytics on semi-structured data (expert level); large-volume data storage (unstructured, semi-structured, small-packet structured). Conclusions: This study aims to provide an independent understanding of the strengths and weaknesses of various NoSQL database approaches to supporting applications that process huge volumes of data, as well as a global overview of these non-relational NoSQL databases.

An adaptive spark-based framework for querying large-scale NoSQL and relational databases

PLoS ONE ◽

10.1371/journal.pone.0255562 ◽

2021 ◽

Vol 16 (8) ◽

pp. e0255562

Author(s):

Eman Khashan ◽

Ali Eldesouky ◽

Sally Elghamrawy

Keyword(s):

Big Data ◽

Data Storage ◽

Relational Databases ◽

Large Scale ◽

Query Languages ◽

Heterogeneous Data ◽

Query Execution ◽

Database Queries ◽

NoSQL Databases ◽

Complex Queries

The growing popularity of big data analysis and cloud computing has created new big data management standards. Sometimes, programmers may interact with a number of heterogeneous data stores, both SQL and NoSQL, depending on the information they are responsible for. Interacting with heterogeneous data models via numerous APIs and query languages imposes challenging tasks on multi-data processing developers. Indeed, complex queries concerning homogeneous data structures cannot currently be performed in a declarative manner when found in single data storage applications and therefore require additional development efforts. Many models have been presented to address complex queries via multistore applications. Some of these models implemented a complex, unified, and fast model, while others are not efficient enough to solve this type of complex database query. This paper provides an automated, fast, and easy unified architecture to solve simple and complex SQL and NoSQL queries over heterogeneous data stores (CQNS). The proposed framework can be used in cloud environments or for any big data application to automatically help developers manage basic and complicated database queries. CQNS consists of three layers: a matching selector layer, a processing layer, and a query execution layer. The matching selector layer is the heart of this architecture: five of the user queries are checked for matches against another five queries stored in a single engine in the architecture library. This is achieved through a proposed algorithm that directs the query to the right SQL or NoSQL database engine. Furthermore, CQNS deals with many NoSQL databases, such as MongoDB, Cassandra, Riak, CouchDB, and Neo4j. This paper presents a Spark framework that can handle both SQL and NoSQL databases. Benchmark datasets from four scenarios are used to evaluate the proposed CQNS for querying different NoSQL databases in terms of optimization process performance and query execution time. The results show that CQNS achieves the best latency and throughput, in less time, among the compared systems.
