XML2HBase: Storing and querying large collections of XML documents using a NoSQL database system

Building OLAP Cubes From Columnar NoSQL Data Warehouses

2019 ◽

pp. 129-157

Author(s):

Khaled Dehdouh

Keyword(s):

Big Data ◽

Database System ◽

Massive Data ◽

Data Warehouses ◽

Online Analysis ◽

Storage Model ◽

Data Cubes ◽

Nosql Database ◽

Oriented Approach

In the context of big data warehouses, a column-oriented NoSQL database system is considered a storage model well suited to data warehouses and online analysis. NoSQL models scale easily, and columnar storage is suitable for storing and managing massive data, especially for decision-support queries. However, column-oriented NoSQL DBMSs do not offer online analytical processing (OLAP) operators. To build OLAP cubes corresponding to the analysis contexts, the most common approach is to integrate other software such as HIVE or Kylin, which has a CUBE operator for building data cubes. The cube is then built following the row-oriented approach, which does not fully obtain the benefits of a column-oriented approach. The main contribution of this chapter is to define a cube operator called MC-CUBE (MapReduce Columnar CUBE), which builds columnar NoSQL cubes following the columnar approach while taking into account the non-relational and distributed aspects of how data warehouses are stored.
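As a rough illustration of what a CUBE operator computes, the sketch below aggregates a measure over every subset of the dimensions while reading data column by column. It is not MC-CUBE and involves no MapReduce or HBase; the function name `cube`, the in-memory dictionary layout, and the sample `sales` data are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def cube(columns, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (hypothetical helper).

    `columns` maps a column name to a list of values, one per row,
    which mimics a columnar layout in memory.
    """
    values = columns[measure]
    result = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            # Only the columns named in this grouping are read.
            dim_cols = [columns[d] for d in dims]
            totals = defaultdict(int)
            for i, value in enumerate(values):
                key = tuple(col[i] for col in dim_cols)
                totals[key] += value
            result[dims] = dict(totals)
    return result


# Illustrative data, not taken from the chapter.
sales = {
    "city": ["Lyon", "Lyon", "Paris"],
    "year": [2019, 2020, 2019],
    "amount": [10, 20, 5],
}
cube_result = cube(sales, ["city", "year"], "amount")
print(cube_result[("city",)])
```

With two dimensions the result holds four groupings: the grand total, one per single dimension, and the full `(city, year)` grouping.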

Big Data

Pattern and Data Analysis in Healthcare Settings - Advances in Medical Technologies and Clinical Practice ◽

10.4018/978-1-5225-0536-5.ch009 ◽

2017 ◽

pp. 158-179

Author(s):

Vinod Kumar ◽

Ramjeevan Singh Thakur

Keyword(s):

Big Data ◽

Database System ◽

Smart Phones ◽

Data Handling ◽

Data Generation ◽

Business Organizations ◽

Nosql Databases ◽

Data Formats ◽

Nosql Database ◽

Tools And Techniques

Data generation is increasing exponentially with every passing day, and its volume, variety, and velocity make it challenging to analyze, interpret, and visualize the available data to gain greater insight. Billions of networked sensors are embedded in devices such as smartphones, automobiles, social media sites, laptops, PCs, and industrial machines that operate on, generate, and communicate data. As a result, data obtained from these various sources exists in structured, semi-structured, and unstructured forms. Traditional database systems are not suited to handling these data formats, so new tools and techniques have been developed to work with them; NoSQL is one of them. Many NoSQL databases are currently available, each designed to solve a specific type of data-handling problem, and most were developed with particular attention to the problems of business organizations and enterprises. The chapter focuses on various aspects of NoSQL as a tool for handling big data.

Big Data Warehouse

International Journal of Decision Support System Technology ◽

10.4018/ijdsst.2020010101 ◽

2020 ◽

Vol 12 (1) ◽

pp. 1-24

Author(s):

Khaled Dehdouh ◽

Omar Boussaid ◽

Fadila Bentayeb

Keyword(s):

Big Data ◽

Data Warehouse ◽

Database System ◽

Massive Data ◽

Data Warehouses ◽

Online Analysis ◽

Storage Model ◽

Nosql Database ◽

Big Data Warehouse ◽

Oriented Approach

In the Big Data warehouse context, a column-oriented NoSQL database system is considered a storage model well suited to data warehouses and online analysis. NoSQL models scale easily, and columnar storage is suitable for storing and managing massive data, especially for decision-support queries. However, column-oriented NoSQL DBMSs do not offer online analytical processing (OLAP) operators. To build OLAP cubes corresponding to the analysis contexts, the most common approach is to integrate other software such as HIVE or Kylin, which has a CUBE operator for building data cubes. The cube is then built following the row-oriented approach, which does not fully obtain the benefits of a column-oriented approach. The focus of this article is to define a cube operator called MC-CUBE (MapReduce Columnar CUBE), which builds columnar NoSQL cubes following the columnar approach while taking into account the non-relational and distributed aspects of how data warehouses are stored.

A general technique for querying XML documents using a relational database system

ACM SIGMOD Record ◽

10.1145/603867.603871 ◽

2001 ◽

Vol 30 (3) ◽

pp. 20-26 ◽

Cited By ~ 72

Author(s):

Jayavel Shanmugasundaram ◽

Eugene Shekita ◽

Jerry Kiernan ◽

Rajasekar Krishnamurthy ◽

Efstratios Viglas ◽

...

Keyword(s):

Relational Database ◽

Database System ◽

General Technique ◽

Xml Documents ◽

Relational Database System

An Approach for Schema Extraction of NoSQL Columnar Databases: the HBase Case Study

Journal of Information and Data Management ◽

10.5753/jidm.2021.1966 ◽

2021 ◽

Vol 12 (5) ◽

Author(s):

Angelo Augusto Frozza ◽

Eduardo Dias Defreyn ◽

Ronaldo Dos Santos Mello

Keyword(s):

A Priori ◽

Database System ◽

Data Representation ◽

Data Types ◽

Data Interoperability ◽

Nosql Databases ◽

Prototype Tool ◽

Nosql Database ◽

Integration Data

Although NoSQL databases do not require a schema a priori, being aware of the database schema is essential for activities like data integration, data validation, or data interoperability. This paper presents a process for the extraction of columnar NoSQL database schemas. We adopt JSON as a canonical format for data representation, and we validate the proposed process through a prototype tool that is able to extract schemas from the HBase columnar NoSQL database system. HBase was chosen as a case study because it is one of the most popular columnar NoSQL solutions. When compared to related work, we innovate by proposing a simple solution for the inference of column data types for columnar NoSQL databases that store only byte arrays as column values, and a resulting schema that follows the JSON Schema format.
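To make the idea of inferring column types from byte arrays concrete, here is a minimal sketch that produces a JSON Schema style description. It does not reproduce the paper's inference rules or its tool: it assumes cell values are UTF-8 text, and the names `infer_type`, `infer_schema`, and the `family:qualifier` sample rows are hypothetical.

```python
import json


def infer_type(raw: bytes) -> str:
    """Guess a JSON Schema type for one cell value (assumed UTF-8 text)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Opaque bytes: a real tool would need a dedicated rule here.
        return "string"
    if text in ("true", "false"):
        return "boolean"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "number"
    except ValueError:
        return "string"


def infer_schema(rows):
    """Merge per-cell guesses into one schema-like dictionary."""
    seen = {}
    for row in rows:
        for column, raw in row.items():
            seen.setdefault(column, set()).add(infer_type(raw))
    properties = {}
    for column, types in sorted(seen.items()):
        # Every integer is also a valid number, so widen the pair.
        if types == {"integer", "number"}:
            types = {"number"}
        kind = next(iter(types)) if len(types) == 1 else sorted(types)
        properties[column] = {"type": kind}
    return {"type": "object", "properties": properties}


# Illustrative rows keyed by "family:qualifier"; not from the paper.
rows = [
    {"info:name": b"Ada", "info:age": b"36"},
    {"info:name": b"Bob", "info:age": b"41.5", "info:active": b"true"},
]
schema = infer_schema(rows)
print(json.dumps(schema, indent=2))
```

Columns whose values disagree across rows keep a sorted list of candidate types, which JSON Schema allows for the `type` keyword.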

Genomic data persistency on a NoSQL database system

2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2014.6999304 ◽

2014 ◽

Cited By ~ 2

Author(s):

Rodrigo Aniceto ◽

Rene Xavier ◽

Maristela Holanda ◽

Maria Emilia Walter ◽

Sergio Lifschitz

Keyword(s):

Genomic Data ◽

Database System ◽

Nosql Database

MHB-Tree: A Distributed Spatial Index Method for Document Based NoSQL Database System

Lecture Notes in Electrical Engineering - Ubiquitous Information Technologies and Applications ◽

10.1007/978-94-007-5857-5_53 ◽

2012 ◽

pp. 489-497 ◽

Cited By ~ 3

Author(s):

Yan Li ◽

GyoungBae Kim ◽

LongRi Wen ◽

HaeYoung Bae

Keyword(s):

Database System ◽

Spatial Index ◽

Index Method ◽

Nosql Database

A Process for Inference of Columnar NoSQL Database Schemas

10.5753/sbbd.2020.13637 ◽

2020 ◽

Author(s):

Angelo Augusto Frozza ◽

Eduardo Dias Defreyn ◽

Ronaldo Dos Santos Mello

Keyword(s):

A Priori ◽

Database System ◽

Data Validation ◽

Data Types ◽

Data Interoperability ◽

Nosql Databases ◽

Prototype Tool ◽

Nosql Database ◽

Integration Data

Although NoSQL databases do not require a schema a priori, being aware of the database schema is essential for activities like data integration, data validation, or data interoperability. This paper presents a process for the inference of columnar NoSQL database schemas. We validate the proposed process through a prototype tool that is able to extract schemas from the HBase columnar NoSQL database system. HBase was chosen as a case study because it is one of the most popular columnar NoSQL solutions. When compared to related work, we innovate by proposing a simple solution for the inference of column data types for columnar NoSQL databases that store only byte arrays as column values, as well as a generated schema that follows the JSON Schema format.

Using Logic for Querying XML Data

Web-Powered Databases ◽

10.4018/978-1-59140-035-6.ch001 ◽

2003 ◽

pp. 1-35 ◽

Cited By ~ 6

Author(s):

Nick Bassiliades ◽

Ioannis Vlahavas ◽

Dimitros Sampson

Keyword(s):

Query Language ◽

Database System ◽

Order Logic ◽

First Order Logic ◽

Deductive Database ◽

Xml Data ◽

First Order ◽

Xml Documents ◽

Order Of Elements ◽

Object Oriented Database

In this chapter, we propose the use of first-order logic, in the form of deductive database rules, as a query language for XML data, and we present X-Device, an extension of the deductive object-oriented database system Device, for storing and querying XML data. XML documents are stored in the OODB by automatically mapping the DTD to an object schema. XML elements are treated either as classes or as attributes based on their complexity, without losing the relative order of elements in the original document. Furthermore, this chapter describes the extension of the system’s deductive rule query language with second-order variables and with general path and ordering expressions, for querying the stored, tree-structured XML data and constructing XML documents as a result. The extensions were implemented by translating all the extended features into the basic, first-order deductive rule language of Device using metadata about stored XML objects.

CASE STUDY OF TRADITIONAL RDBMS AND NOSQL DATABASE SYSTEM

International Journal of Research -GRANTHAALAYAH ◽

10.29121/granthaalayah.v7.i7.2019.777 ◽

2019 ◽

Vol 7 (7) ◽

pp. 351-359

Author(s):

Yashraj Sharma ◽

Yashasvi Sharma

Keyword(s):

Relational Database ◽

System Architecture ◽

Relational Databases ◽

Database System ◽

Database Management System ◽

Nosql Databases ◽

Advantages And Disadvantages ◽

Relational Database System ◽

Big Data Applications ◽

Nosql Database

In terms of reliability, relational models are useful, but not for systems that involve huge amounts of data; in such cases, non-relational models are much more useful. NoSQL databases are used to store large volumes of data. They are scalable and wide-ranging because they are non-relational and distributed. Relational databases could not manage the very large volumes of data involved in Big Data applications, hence the concept of the NoSQL database was introduced. NoSQL has many advantages, which include not only its own features but also some features of relational database management systems. A major benefit of NoSQL databases is that they are open source, which helps them adopt many features for newly developed applications. This paper focuses on understanding the concepts of non-relational database system architecture alongside relational database system architecture and on identifying the advantages and disadvantages of both.

