An Approach in Big Data Analytics to Improve the Velocity of Unstructured Data Using MapReduce

Big data analytics is an approach for extracting data from high-volume data warehouse systems. It describes a method for compressing high volumes of data into clusters using MapReduce and HDFS. However, extracting and storing data in Hadoop clusters takes considerable time. The proposed system addresses the time delay in the shuffle phase of MapReduce caused by scheduling and sequencing. To improve processing speed, the proposed work uses the Compressed Elastic Search Index (CESI) and the MapReduce-Based Next Generation Sequencing Approach (MRBNGSA). This approach speeds up data retrieval from HDFS clusters because of the way the data is stored: only the metadata is stored in HDFS, which takes less memory at runtime than storing the full volume of data. The approach reduces the CPU utilization and memory allocation of the resource manager in the Hadoop framework and improves data processing speed, so that the time delay is reduced with minimum latency.
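To make the role of the shuffle phase concrete, the following is a minimal single-process sketch of a generic MapReduce job in plain Python. It is not CESI or MRBNGSA, which the abstract does not specify in detail; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`) and the word-count task are assumptions chosen only to show where the shuffle sits between map and reduce.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (token, 1) pair for every whitespace-separated token."""
    for record in records:
        for token in record.lower().split():
            yield token, 1


def shuffle_phase(pairs):
    """Group all values by key.

    In a real cluster this step moves data between nodes; the abstract
    names it as the phase where the delay arises.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(records):
    """Run map, shuffle, and reduce in sequence (hypothetical helper)."""
    return reduce_phase(shuffle_phase(map_phase(records)))
```

Calling `run_job(["big data", "Big clusters"])` counts each token across both records. In an actual Hadoop deployment the grouping step involves sorting and network transfer, which is why it tends to dominate job latency.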

International Journal of System Dynamics Applications ◽

10.4018/ijsda.20211001oa22 ◽

2021 ◽

Vol 10 (4) ◽

pp. 0-0

Keyword(s):

Big Data ◽

Time Delay ◽

Data Processing ◽

Data Analytics ◽

Big Data Analytics ◽

Data Retrieval ◽

High Volume ◽

Minimum Latency ◽

Hadoop Clusters ◽

Search Index

Analysis of Big Data Analytics in Healthcare Sector: Applications and Tools

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9459 ◽

2020 ◽

Vol 17 (12) ◽

pp. 5605-5612

Author(s):

A. Kaliappan ◽

D. Chitra

Keyword(s):

Big Data ◽

Data Analytics ◽

Health Sector ◽

Big Data Analytics ◽

High Volume ◽

Healthcare Sector ◽

Genome Database ◽

Tremendous Amount ◽

The Impact ◽

Different Sources

In today’s world, an immense amount of information in structured, semi-structured, and unstructured form is generated by different sources all over the world. Big data is the term coined to describe these enormous amounts of data. One of the major challenges in the health sector is handling the high volume and variety of data generated from diverse sources and utilizing it for human wellbeing. Big data analytics is one technique designed to operate on such massive amounts of information. The impact of big data in the healthcare field and the use of Hadoop system tools for managing big data are discussed in this paper. The paper also investigates the role of big data analytics and its theoretical and conceptual architecture, which includes gathering diverse information such as electronic health records, genome databases, and clinical decision support systems, as well as text representation in the healthcare industry.

Modeling Big Data Analytics with a Real-Time Executable Specification Language

Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence - Advances in Data Mining and Database Management ◽

10.4018/978-1-4666-8505-5.ch014 ◽

2015 ◽

pp. 289-312

Author(s):

Amir A. Khwaja

Keyword(s):

Big Data ◽

Data Processing ◽

Real Time ◽

Data Analytics ◽

Big Data Analytics ◽

Exception Handling ◽

Specification Language ◽

Big Data Processing ◽

Business Decisions ◽

Concurrent Processes

The big data explosion has already happened, and the situation is only going to intensify with such a high number of data sources and high-end technology prevalent everywhere, generating data at a frantic pace. One of the most important aspects of big data is being able to capture, process, and analyze data as it is generated, in real time, to allow real-time business decisions. Alternative approaches must be investigated, especially those based on highly parallel and real-time computations for big data processing. The chapter presents the RealSpec real-time specification language, which may be used for modeling big data analytics because it has the language features needed for real-time big data processing, such as concurrent processes, multi-threading, resource modeling, timing constraints, and exception handling. The chapter provides an overview of RealSpec and applies the language to a detailed big data event recognition case study to demonstrate its applicability to big data framework and analytics modeling.

Big Data Analytics in Cloud Computing

Advances in Computer and Electrical Engineering - Novel Practices and Trends in Grid and Cloud Computing ◽

10.4018/978-1-5225-9023-1.ch018 ◽

2019 ◽

pp. 325-341

Author(s):

Rajganesh Nagarajan ◽

Ramkumar Thirunavukarasu

Keyword(s):

Cloud Computing ◽

Big Data ◽

Data Processing ◽

Data Visualization ◽

Data Analytics ◽

Big Data Analytics ◽

Future Research ◽

Big Data Processing ◽

Investment Cost

In this chapter, the authors consider the different categories of data that are processed by big data analytics tools. The challenges of big data processing are identified, and a solution based on cloud computing is highlighted. Because cloud computing is widely advocated for its pay-per-use model, data processing tools can be deployed effectively within the cloud, reducing the investment cost. In addition, the chapter discusses big data platforms, tools, and applications, along with data visualization concepts. Finally, the applications of data analytics are discussed for future research.

Modeling Big Data Analytics with a Real-Time Executable Specification Language

Big Data ◽

10.4018/978-1-4666-9840-6.ch021 ◽

2016 ◽

pp. 418-440

Author(s):

Amir A. Khwaja

Keyword(s):

Big Data ◽

Data Processing ◽

Real Time ◽

Data Analytics ◽

Big Data Analytics ◽

Exception Handling ◽

Specification Language ◽

Big Data Processing ◽

Business Decisions ◽

Concurrent Processes

The big data explosion has already happened, and the situation is only going to intensify with such a high number of data sources and high-end technology prevalent everywhere, generating data at a frantic pace. One of the most important aspects of big data is being able to capture, process, and analyze data as it is generated, in real time, to allow real-time business decisions. Alternative approaches must be investigated, especially those based on highly parallel and real-time computations for big data processing. The chapter presents the RealSpec real-time specification language, which may be used for modeling big data analytics because it has the language features needed for real-time big data processing, such as concurrent processes, multi-threading, resource modeling, timing constraints, and exception handling. The chapter provides an overview of RealSpec and applies the language to a detailed big data event recognition case study to demonstrate its applicability to big data framework and analytics modeling.

Data Processing Model to Perform Big Data Analytics in Hybrid Infrastructures

IEEE Access ◽

10.1109/access.2020.3023344 ◽

2020 ◽

Vol 8 ◽

pp. 170281-170294

Author(s):

Julio C. S. Dos Anjos ◽

Kassiano J. Matteussi ◽

Paulo R. R. De Souza ◽

Gabriel J. A. Grabher ◽

Guilherme A. Borges ◽

...

Keyword(s):

Big Data ◽

Data Processing ◽

Data Analytics ◽

Big Data Analytics

A Survey on Big Data Analytics Using HADOOP

Asian Journal of Computer Science and Technology ◽

10.51983/ajcst-2019.8.s3.2091 ◽

2019 ◽

Vol 8 (S3) ◽

pp. 35-40

Author(s):

S. Mamatha ◽

T. Sudha

Keyword(s):

Big Data ◽

Social Networking Sites ◽

Data Analytics ◽

Business Processes ◽

Big Data Analytics ◽

Large Data ◽

Structured Data ◽

Map Reduce ◽

Data Set ◽

Digital World

In this digital world, as organizations evolve rapidly around data-centric assets, the volume of data and the size of databases have been growing exponentially. Data is generated from different sources such as business processes, transactions, social networking sites, and web servers, and exists in structured as well as unstructured form. The term "big data" is used for large data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process within a tolerable elapsed time. Big data sizes range from a few dozen terabytes to many petabytes in a single data set. Difficulties include capture, storage, search, sharing, analytics, and visualization. Big data is available in structured, unstructured, and semi-structured formats, and relational databases fail to store this multi-structured data. Apache Hadoop is an efficient, robust, reliable, and scalable framework for storing, processing, transforming, and extracting big data. The Hadoop framework is open-source, free software available from the Apache Software Foundation. In this paper we present Hadoop, HDFS, MapReduce, and a c-means big data algorithm to minimize the effort of big data analysis using MapReduce code. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools and related fields.

Big Data Analytics in Online Structural Health Monitoring

International Journal of Prognostics and Health Management ◽

10.36001/ijphm.2016.v7i4.2462 ◽

2020 ◽

Vol 7 (4) ◽

Author(s):

Guowei Cai ◽

Sankaran Mahadevan

Keyword(s):

Big Data ◽

Structural Health Monitoring ◽

Health Monitoring ◽

Data Analytics ◽

Structural Damage ◽

Big Data Analytics ◽

High Volume ◽

Heterogeneous Data ◽

Sensor Technology ◽

Structural Health

This manuscript explores the application of big data analytics in online structural health monitoring. As smart sensor technology progresses and low-cost online monitoring becomes increasingly feasible, large quantities of highly heterogeneous data can be acquired during monitoring, exceeding the capacity of traditional data analytics techniques. This paper investigates big data techniques to handle the high-volume data obtained in structural health monitoring. In particular, we investigate the analysis of infrared thermal images for structural damage diagnosis. We explore the MapReduce technique to parallelize the data analytics and efficiently handle the high volume, high velocity, and high variety of information. In our study, MapReduce is implemented with the Spark platform, and image processing functions such as the uniform filter and the Sobel filter are wrapped in the mappers. The methodology is illustrated with concrete slabs, using actual experimental data with induced damage.
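The idea of wrapping an image filter in a mapper can be sketched in plain Python as follows. This is an illustrative assumption, not the authors' implementation: the built-in `map` stands in for a distributed Spark map step, images are small nested lists rather than infrared thermal frames, and the names `sobel_magnitude` and `map_images` are hypothetical.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return the gradient magnitude for the interior pixels of a 2D list."""
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(1, rows - 1):
        row_out = []
        for c in range(1, cols - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * pixel
                    gy += SOBEL_Y[dr][dc] * pixel
            row_out.append(math.hypot(gx, gy))
        result.append(row_out)
    return result


def map_images(images):
    """Apply the filter to each image independently (the 'map' step).

    The built-in map is a local stand-in for a distributed mapper.
    """
    return list(map(sobel_magnitude, images))
```

Because each image is filtered independently of the others, the map step parallelizes naturally, which is the property a distributed platform exploits when the same function is applied across many images.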

Urban Planning and Smart City Decision Management Empowered by Real-Time Data Processing Using Big Data Analytics

Sensors ◽

10.3390/s18092994 ◽

2018 ◽

Vol 18 (9) ◽

pp. 2994 ◽

Cited By ~ 28

Author(s):

Bhagya Silva ◽

Murad Khan ◽

Changsu Jung ◽

Jihun Seo ◽

Diyan Muhammad ◽

...

Keyword(s):

Big Data ◽

Data Processing ◽

Real Time ◽

Smart City ◽

Data Analytics ◽

Smart Cities ◽

Big Data Analytics ◽

Time Data ◽

Real Time Data

The Internet of Things (IoT), inspired by the tremendous growth of connected heterogeneous devices, has pioneered the notion of the smart city. Various components, e.g., smart transportation, smart community, smart healthcare, and smart grid, which are integrated within the smart city architecture, aim to enrich the quality of life (QoL) of urban citizens. However, real-time processing requirements and exponential data growth hinder smart city realization. Therefore, herein we propose a Big Data analytics (BDA)-embedded experimental architecture for smart cities. Two major aspects are served by the BDA-embedded smart city. Firstly, it facilitates exploitation of urban Big Data (UBD) in planning, designing, and maintaining smart cities. Secondly, it employs BDA to manage and process voluminous UBD to enhance the quality of urban services. Three tiers of the proposed architecture are responsible for data aggregation, real-time data management, and service provisioning. Moreover, offline and online data processing tasks are further expedited by integrating data normalizing and data filtering techniques into the proposed work. By analyzing authenticated datasets, we obtained the threshold values required for urban planning and city operation management. Performance metrics for online and offline data processing on the proposed dual-node Hadoop cluster are obtained using the aforementioned authentic datasets. Throughput and processing time analysis performed with regard to existing works guarantee the performance superiority of the proposed work. Hence, we can claim the applicability and reliability of implementing the proposed BDA-embedded smart city architecture in the real world.
