Programming and Pre-Processing Systems for Big Data Storage and Visualization

Author(s):  
Hidayat Ur Rahman ◽  
Rehan Ullah Khan ◽  
Amjad Ali

This chapter provides a detailed overview of the major concepts used in Big Data. When processing huge volumes of data, the first step is pre-processing, which is required to handle anomalies such as missing values by applying various transformations. The chapter surveys preprocessing tools used for Big Data, such as R, Yahoo! Pipes, Mechanical Turk, and Elasticsearch. Besides preprocessing tools, it gives a detailed overview of the storage tools, programming tools, data visualization tools, log processing tools, and caching tools used for Big Data analytics. In other words, this chapter is the core of the book, providing an overview of the major technologies discussed later.
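The missing-value transformation mentioned above can be sketched minimally with pandas; the column names and values here are hypothetical, not drawn from the chapter:

```python
import pandas as pd
import numpy as np

# Hypothetical sensor readings containing missing values
df = pd.DataFrame({
    "sensor_id": [1, 1, 2, 2, 3],
    "reading":   [10.0, np.nan, 8.0, 9.0, np.nan],
})

# Impute missing readings with the column mean, a common
# pre-processing transformation before downstream analytics
df["reading"] = df["reading"].fillna(df["reading"].mean())

print(df["reading"].tolist())  # [10.0, 9.0, 8.0, 9.0, 9.0]
```

Mean imputation is only one of many possible transformations; which one is appropriate depends on why the values are missing.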

Author(s):  
Gerard G. Dumancas ◽  
Ghalib A. Bello ◽  
Jeff Hughes ◽  
Renita Murimi ◽  
Lakshmi Chockalingam Kasi Viswanath ◽  
...  

Modern instruments can generate and store enormous volumes of data, and the challenges involved in processing, analyzing, and visualizing these data are well recognized. The field of Chemometrics (a subspecialty of Analytical Chemistry) grew out of efforts to develop a toolbox of statistical and computational applications for data processing and analysis. This chapter discusses key concepts of Big Data Analytics within the context of Analytical Chemistry, with particular emphasis on preprocessing techniques, statistical and Machine Learning methodology for data mining and analysis, tools for big data visualization, and state-of-the-art applications for data storage. Various statistical techniques used for the analysis of Big Data in Chemometrics are introduced, along with an overview of computational tools for Big Data Analytics in Analytical Chemistry. The chapter concludes with a discussion of the latest platforms and programming tools for Big Data storage, such as Hadoop, Apache Hive, Spark, and Google Bigtable.
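One preprocessing step that is standard in chemometrics is autoscaling (column-wise mean-centering and unit-variance scaling) of a spectra matrix; the sketch below uses a tiny hypothetical matrix, not data from the chapter:

```python
import numpy as np

# Hypothetical spectra matrix: rows = samples, columns = wavelengths
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Autoscaling: subtract each column's mean and divide by its
# standard deviation, so variables measured on very different
# scales contribute comparably to downstream models (e.g. PCA)
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Every column now has mean 0 and standard deviation 1
```

Without this step, the second wavelength (values in the hundreds) would dominate any distance- or variance-based analysis.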


Author(s):  
Janet Chan

The Internet, telecommunications, ubiquitous sensing devices, and advances in data storage and analytic capacities have heralded the age of Big Data, in which the volume, velocity, and variety of data not only promise new opportunities for harvesting information, but also threaten to overload existing resources for making sense of that information. The use of Big Data technology for criminal justice and crime control is a relatively new development. Big Data technology has overlapped with criminology in two main areas: (a) Big Data is used as a type of data in criminological research, and (b) Big Data analytics is employed as a predictive tool to guide criminal justice decisions and strategies. Much of the debate about Big Data in criminology concerns legitimacy, including privacy, accountability, transparency, and fairness. Big Data is often made accessible through data visualization. Big Data visualization is a performance that simultaneously masks the power of commercial and governmental surveillance and renders information political. The production of visuality operates in an economy of attention. In crime control enterprises, future uncertainties can be masked by affective triggers that create an atmosphere of risk and suspicion. There have also been efforts to mobilize data to expose harms and injustices and garner support for resistance. While Big Data and visuality can perform affective modulation in the race for attention, the impact of data visualization is not always predictable. By removing the visibility of real people or events and by aestheticizing representations of tragedies, data visualization may achieve further distancing and deadening of conscience in situations where graphic photographic images might at least garner initial emotional impact.


2019 ◽  
Vol 19 (3) ◽  
pp. 16-24 ◽  
Author(s):  
Ivan P. Popchev ◽  
Daniela A. Orozova

Abstract The issues related to the analysis and management of Big Data, together with aspects of data security, stability, and quality, represent a new research and engineering challenge. In the present paper, techniques for Big Data storage, search, analysis, and management in the virtual e-Learning space, and the problems they face, are considered. A numerical example of an explorative analysis of data about students from Burgas Free University is presented, using the data mining tools of Orange. The analysis serves as a basis for a system for identifying students at risk.
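An explorative at-risk analysis of the kind described can be sketched with a simple rule-based flag; the student records, thresholds, and column names below are hypothetical and do not reproduce the paper's Burgas Free University data or its Orange workflow:

```python
import pandas as pd

# Hypothetical student records (not the paper's dataset)
students = pd.DataFrame({
    "student":   ["A", "B", "C", "D"],
    "avg_grade": [5.5, 3.1, 4.8, 2.9],
    "absences":  [2, 14, 5, 20],
})

# A simple rule-of-thumb flag: low average grade or many absences;
# a real system would learn such thresholds from historical outcomes
students["at_risk"] = (students["avg_grade"] < 3.5) | (students["absences"] > 10)

print(students.loc[students["at_risk"], "student"].tolist())  # ['B', 'D']
```

In practice such a flag would be the starting point for a closer look at each case, not an automatic decision.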


Author(s):  
Abhilasha Rangra ◽  
Vivek Kumar Sehgal

In recent years, cloud computing and big data analysis have emerged as two major challenges. Cloud computing enables computing resources to be maintained as information technology services with high effectiveness and efficiency. Big data is currently treated as one of the problems that experts are trying to solve: how big data analytics can be managed with cloud computing technology and handled by recent systems, and, most significantly, how big data can be kept secure in the cloud computing environment. In this paper, the authors improve the performance of big data storage on cloud infrastructure as an integration with mobile digital healthcare. The proposed framework first refines data sensitivity using a deep learning approach, and then computes or stores the data on a cloud-based server in an optimized manner. The experimental analysis shows a significant improvement in terms of cost, time, and accuracy.


Author(s):  
Arvind Panwar ◽  
Vishal Bhatnagar

Data is the biggest asset after people for businesses, and it is a new driver of the world economy. The volume of data that enterprises gather every day is growing rapidly. This rapid growth of data in terms of volume, variety, and velocity is known as Big Data. Big Data is a challenge for enterprises, and the biggest challenge is how to store it. In the past, and in some organizations currently, data warehouses have been used to store Big Data. Enterprise data warehouses work on the concept of schema-on-write, but Big Data analytics demands data storage that works on the schema-on-read concept. To fulfill this market demand, researchers are working on a new data repository system for Big Data storage known as a data lake, defined as a landing area for raw data from many sources. There is some confusion, and there are questions that must be answered, about data lakes. The objective of this article is to reduce that confusion and address some of those questions with the help of an architecture.
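The schema-on-write versus schema-on-read distinction can be illustrated with a minimal sketch: raw records land in the lake untouched, and a schema is imposed only at read time. The record fields and parsing logic here are hypothetical, not taken from the article:

```python
import json

# Raw, heterogeneous records as they might land in a data lake;
# inconsistent types are tolerated at ingest (no schema-on-write)
raw_lines = [
    '{"id": 1, "temp": 21.5}',
    '{"id": 2, "temp": "22.1", "unit": "C"}',
]

# Schema-on-read: structure and types are enforced only when
# a consumer actually reads the data
def read_with_schema(lines):
    for line in lines:
        rec = json.loads(line)
        yield {"id": int(rec["id"]), "temp": float(rec["temp"])}

records = list(read_with_schema(raw_lines))
print(records)  # [{'id': 1, 'temp': 21.5}, {'id': 2, 'temp': 22.1}]
```

A warehouse would have rejected or coerced the second record at load time; the lake defers that decision to each consumer, which is both its flexibility and its governance risk.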


2015 ◽  
Vol 12 (6) ◽  
pp. 106-115 ◽  
Author(s):  
Hongbing Cheng ◽  
Chunming Rong ◽  
Kai Hwang ◽  
Weihong Wang ◽  
Yanyan Li
