Big Data Tools—Hadoop Ecosystem, Spark and NoSQL Databases

Author(s):  
C.S.R. Prabhu ◽  
Aneesh Sreevallabh Chivukula ◽  
Aditya Mogadala ◽  
Rohit Ghosh ◽  
L.M. Jenila Livingston
Author(s):  
Mohammed El Malki ◽  
Arlind Kopliku ◽  
Essaid Sabir ◽  
Olivier Teste
Keyword(s):  
Big Data ◽  

2018 ◽  
Vol 14 (3) ◽  
pp. 44-68 ◽  
Author(s):  
Fatma Abdelhedi ◽  
Amal Ait Brahim ◽  
Gilles Zurfluh

Nowadays, most organizations need to improve their decision-making process using Big Data. To achieve this, they have to store Big Data, perform an analysis, and transform the results into useful and valuable information. To perform this, it's necessary to deal with new challenges in designing and creating data warehouse. Traditionally, creating a data warehouse followed well-governed process based on relational databases. The influence of Big Data challenged this traditional approach primarily due to the changing nature of data. As a result, using NoSQL databases has become a necessity to handle Big Data challenges. In this article, the authors show how to create a data warehouse on NoSQL systems. They propose the Object2NoSQL process that generates column-oriented physical models starting from a UML conceptual model. To ensure efficient automatic transformation, they propose a logical model that exhibits a sufficient degree of independence so as to enable its mapping to one or more column-oriented platforms. The authors provide experiments of their approach using a case study in the health care field.


Author(s):  
Ganesh Chandra Deka

NoSQL databases are designed to meet the huge data storage requirements of cloud computing and big data processing. NoSQL databases have lots of advanced features in addition to the conventional RDBMS features. Hence, the “NoSQL” databases are popularly known as “Not only SQL” databases. A variety of NoSQL databases having different features to deal with exponentially growing data-intensive applications are available with open source and proprietary option. This chapter discusses some of the popular NoSQL databases and their features on the light of CAP theorem.


Author(s):  
Nitigya Sambyal ◽  
Poonam Saini ◽  
Rupali Syal

The world is increasingly driven by huge amounts of data. Big data refers to data sets that are so large or complex that traditional data processing application software are inadequate to deal with them. Healthcare analytics is a prominent area of big data analytics. It has led to significant reduction in morbidity and mortality associated with a disease. In order to harness full potential of big data, various tools like Apache Sentry, BigQuery, NoSQL databases, Hadoop, JethroData, etc. are available for its processing. However, with such enormous amounts of information comes the complexity of data management, other big data challenges occur during data capture, storage, analysis, search, transfer, information privacy, visualization, querying, and update. The chapter focuses on understanding the meaning and concept of big data, analytics of big data, its role in healthcare, various application areas, trends and tools used to process big data along with open problem challenges.


Computing ◽  
2019 ◽  
Vol 102 (6) ◽  
pp. 1521-1545 ◽  
Author(s):  
Lanxiang Chen ◽  
Nan Zhang ◽  
Hung-Min Sun ◽  
Chin-Chen Chang ◽  
Shui Yu ◽  
...  

Information ◽  
2019 ◽  
Vol 10 (7) ◽  
pp. 241
Author(s):  
Geomar A. Schreiner ◽  
Denio Duarte ◽  
Ronaldo dos S. Melo

Several data-centric applications today produce and manipulate a large volume of data, the so-called Big Data. Traditional databases, in particular, relational databases, are not suitable for Big Data management. As a consequence, some approaches that allow the definition and manipulation of large relational data sets stored in NoSQL databases through an SQL interface have been proposed, focusing on scalability and availability. This paper presents a comparative analysis of these approaches based on an architectural classification that organizes them according to their system architectures. Our motivation is that wrapping is a relevant strategy for relational-based applications that intend to move relational data to NoSQL databases (usually maintained in the cloud). We also claim that this research area has some open issues, given that most approaches deal with only a subset of SQL operations or give support to specific target NoSQL databases. Our intention with this survey is, therefore, to contribute to the state-of-art in this research area and also provide a basis for choosing or even designing a relational-to-NoSQL data wrapping solution.


2015 ◽  
Vol 2 (1) ◽  
Author(s):  
Sara Landset ◽  
Taghi M. Khoshgoftaar ◽  
Aaron N. Richter ◽  
Tawfiq Hasanin

Sign in / Sign up

Export Citation Format

Share Document