Analysis of Periodontitis using Map Reduce in Big Data Analytics

Author(s):
K.G. Rani Roopha Devi, R. Mahendra Chozhan, M. Karthika

2019, Vol 8 (S3), pp. 35-40
Author(s):
S. Mamatha, T. Sudha

In this digital world, as organizations evolve rapidly around data-centric assets, data has exploded and database sizes have grown exponentially. Data is generated from different sources such as business processes, transactions, social networking sites, and web servers, and exists in both structured and unstructured form. The term "Big data" is used for large data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process within a tolerable elapsed time. Big data varies in size, ranging from a few dozen terabytes to many petabytes in a single data set. Difficulties include capture, storage, search, sharing, analytics, and visualization. Big data is available in structured, unstructured, and semi-structured formats, and relational databases fail to store such multi-structured data. Apache Hadoop is an efficient, robust, reliable, and scalable framework to store, process, transform, and extract big data. The Hadoop framework is open-source, free software available from the Apache Software Foundation. In this paper we present Hadoop, HDFS, MapReduce, and the c-means big data algorithm to minimize the effort of big data analysis using MapReduce code. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools and related fields.
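The pairing of c-means clustering with MapReduce described above can be made concrete with a short sketch. Below is a minimal, illustrative Python version of one fuzzy c-means iteration expressed in map/reduce style; plain functions stand in for a Hadoop job, and all names (map_point, reduce_centroid) are ours, not from the paper.

```python
# A minimal sketch of one fuzzy c-means iteration in MapReduce style,
# assuming 1-D points for brevity. Plain Python stands in for Hadoop.
from collections import defaultdict

M = 2.0  # fuzzifier m > 1; larger values give softer memberships

def map_point(point, centroids):
    # Mapper: for one data point, emit (cluster_id, (u^m * x, u^m)) pairs.
    # Membership u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)).
    dists = [max(abs(point - c), 1e-12) for c in centroids]
    for j, dj in enumerate(dists):
        u = 1.0 / sum((dj / dk) ** (2.0 / (M - 1.0)) for dk in dists)
        yield j, (u ** M * point, u ** M)

def reduce_centroid(values):
    # Reducer: new centroid = sum(u^m * x) / sum(u^m) for its cluster key.
    num = sum(v[0] for v in values)
    den = sum(v[1] for v in values)
    return num / den

def cmeans_step(points, centroids):
    # Shuffle phase: group mapper output by cluster id, then reduce.
    groups = defaultdict(list)
    for p in points:
        for key, val in map_point(p, centroids):
            groups[key].append(val)
    return [reduce_centroid(groups[j]) for j in sorted(groups)]

points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
centroids = [0.0, 10.0]
for _ in range(10):
    centroids = cmeans_step(points, centroids)
print(centroids)  # two centroids, converging near 1.0 and 8.1
```

Each mapper touches only its own points and each reducer only its own cluster key, which is what lets the same update scale out across a Hadoop cluster.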


2021, Vol 10 (4), pp. 0-0

Big Data Analytics is an innovative approach for extracting data from huge data warehouse systems. It shows how a high volume of data can be compressed into clusters with MapReduce and HDFS. However, data processing takes considerable time to extract data and store it in Hadoop clusters. The proposed system addresses the time delay in the shuffle phase of MapReduce caused by scheduling and sequencing. To improve processing speed, this work uses the Compressed Elastic Search Index (CESI) and a MapReduce-Based Next Generation Sequencing Approach (MRBNGSA). This approach increases the speed of data retrieval from HDFS clusters because of the way data is stored: only the metadata is kept in HDFS, which consumes far less memory at runtime than the full data volume. The approach reduces the CPU utilization and memory allocation of the resource manager in the Hadoop framework and improves data processing speed, so that the shuffle-phase delay is reduced to minimal latency.
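As a rough illustration of the metadata-only storage idea behind CESI, the sketch below keeps a small compressed index (record id mapped to byte offset and length) and fetches records from bulk storage on demand; the JSON Lines layout, file names, and index format here are our assumptions, not the paper's design.

```python
# Illustrative sketch: a compressed metadata index over a bulk JSON Lines
# file, so lookups seek directly to a record instead of scanning everything.
import json, zlib

def build_index(data_path):
    # One pass over the bulk file; the resulting index is tiny compared to
    # the data itself, so it is cheap to hold in memory or in HDFS.
    index = {}
    offset = 0
    with open(data_path, "rb") as f:
        for line in f:  # each line is one JSON record with an "id" field
            rec = json.loads(line)
            index[rec["id"]] = (offset, len(line))
            offset += len(line)
    return zlib.compress(json.dumps(index).encode())  # compressed metadata

def fetch(data_path, compressed_index, rec_id):
    # Decompress the metadata, then read only the bytes of the one record.
    index = json.loads(zlib.decompress(compressed_index))
    offset, length = index[str(rec_id)]  # JSON object keys are strings
    with open(data_path, "rb") as f:
        f.seek(offset)
        return json.loads(f.read(length))
```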


Data analytics (DA) is the job of reviewing datasets in order to draw conclusions about the information they contain, increasingly using specialized systems and software. With the emergence of Big Data, such analytics became a necessity. The problems we consider arise in a fraud detection application, where we adopt an application-independent format (XML/JSON) for a clustering process based on an unlabeled (unsupervised) classification algorithm; we focus on the resulting clusters to enhance the oversampling process and exploit the merits of parallel computing to speed up our system. We aim to use MapReduce functionality in our application and deploy it on Amazon AWS. Datasets gathered for studies often comprise millions of records and can carry hard-to-detect concealed pitfalls. In this paper we work on two datasets: the first is a medical dataset and the second is a customer dataset. Big Data Analytics is the suggested solution in this day and age, with growing demands for analyzing huge information sets and performing the required processing on complicated data structures. The main problems faced at the moment are how to store and analyze the large amount of data generated from heterogeneous sources such as social media, and how to make that data quickly accessible on a modest budget. To resolve these problems the MapReduce framework is useful: by offering an integrated approach to machine learning, it speeds up processing. We explore the LEOS algorithm, SVM, MapReduce, and the JOSE algorithm, together with their requirements, benefits, disadvantages, difficulties, and corresponding solutions.
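To make the cluster-driven oversampling step concrete, here is a minimal, hedged sketch: the minority (fraud) class is assumed to be pre-clustered, and synthetic samples are generated by interpolating between points within a cluster, SMOTE-style. The LEOS and JOSE specifics belong to the cited paper; this only illustrates the general idea.

```python
# Cluster-based oversampling sketch for an imbalanced fraud dataset:
# interpolate between random pairs *within* one minority cluster, so the
# synthetic samples stay inside that cluster's region of feature space.
import random

def oversample_cluster(cluster, n_new):
    synthetic = []
    for _ in range(n_new):
        a, b = random.sample(cluster, 2)       # pick two distinct points
        t = random.random()                    # random spot on segment a->b
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic

# Two pre-computed minority clusters (e.g., from k-means over fraud records;
# the values are toy data). Each cluster can be oversampled by its own map
# task, which is where the parallel speed-up comes from.
clusters = [
    [[1.0, 2.0], [1.1, 2.2], [0.9, 1.8]],
    [[5.0, 5.0], [5.2, 4.8], [4.9, 5.1]],
]
augmented = [s for c in clusters for s in oversample_cluster(c, n_new=3)]
print(len(augmented))  # 6 synthetic minority samples
```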


2016, Vol 3 (4), pp. 32-45
Author(s):
Arushi Jain, Vishal Bhatnagar

Data has been snowballing over the years, and the growing volume that must be stored and tamed to yield meticulous results has given rise to the concept now reckoned as big data analytics. With the 2016 Summer Olympics in Rio de Janeiro, Brazil around the corner, we, the authors, have implemented a mathematical model, realized as an efficient MapReduce program, to predict the number of medals each country might bag at the games. Factors such as the historical performance of the country in terms of medals won, the performance of its athletes, the financial scenario in the country, and the fitness, nutrition, and familiarity of athletes with the playing conditions are combined to come up with a reliable estimate.
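A hedged sketch of the kind of MapReduce aggregation such a model relies on: the mapper emits (country, (year, medals)) pairs from historical records and the reducer computes a recency-weighted average as the estimate. The toy records and weighting scheme below are illustrative only, not the authors' data or model.

```python
# Toy MapReduce-style medal estimate: weight recent games more heavily.
from collections import defaultdict

records = [  # (country, games_year, medals_won) -- illustrative data
    ("USA", 2008, 112), ("USA", 2012, 104),
    ("IND", 2008, 3),   ("IND", 2012, 6),
]

def mapper(record):
    # Mapper: key historical results by country.
    country, year, medals = record
    yield country, (year, medals)

def reducer(country, values):
    # Reducer: recency-weighted average; weight = year - 2000 (assumed).
    wsum = sum((y - 2000) * m for y, m in values)
    wtot = sum(y - 2000 for y, _ in values)
    return country, round(wsum / wtot)

groups = defaultdict(list)   # the shuffle phase, simulated in-process
for rec in records:
    for k, v in mapper(rec):
        groups[k].append(v)

for country in groups:
    print(reducer(country, groups[country]))  # ('USA', 107), ('IND', 5)
```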

