A Study on MapReduce: Challenges and Trends

Author(s):  
Sachin Arun Thanekar ◽  
K. Subrahmanyam ◽  
A. B. Bagwan

Big data now surrounds us. The term ‘Big Data’ denotes huge volume, high velocity, variety, and veracity (that is, uncertainty of data), which together have given rise to new difficulties and challenges. The data generated may be structured, semi-structured, or unstructured. Existing databases and systems have great difficulty processing, analyzing, storing, and managing such data. The Big Data challenges are protection, curation, capture, analysis, searching, visualization, storage, transfer, and sharing. MapReduce is a framework for writing applications that reliably process huge amounts of data in parallel on large clusters of commodity hardware. Researchers have put considerable effort into making it simple, easy, effective, and efficient. This survey paper emphasizes the working of MapReduce, its challenges, opportunities, and recent trends, so that researchers can consider further improvements.
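The map, shuffle, and reduce steps that the abstract alludes to can be illustrated with a minimal single-process sketch. This is not the paper's code and not the Hadoop API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would run the map and reduce calls in parallel across cluster nodes rather than in one loop.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases sequentially over a list of text records."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each `map_phase` call sees only one record and each `reduce_phase` call sees only one key, the calls are independent of one another, which is what lets a framework distribute them over commodity machines.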

Author(s):  
Vinay Kumar ◽  
Arpana Chaturvedi

With the advent of social networking sites (SNS), large volumes of data are generated daily. Most of these data are multimedia and unstructured, and they are growing exponentially. This exponential growth in the variety, volume, and complexity of structured and unstructured data leads to the concept of big data. Managing big data and harnessing its benefits is a real challenge. As access to big data repositories increases across applications, security and access control become another aspect to consider when managing big data. We discuss the application areas of big data, the opportunities it provides, and the challenges faced in managing such huge amounts of data for various applications. Issues related to security against the different threat perceptions of big data are also discussed.


Author(s):  
Mohd Vasim Ahamad ◽  
Misbahul Haque ◽  
Mohd Imran

In the present digital era, more data are generated and collected than ever before. However, this huge amount of data is of no use until it is converted into useful information. Such data, coming from many sources in various formats and with greater complexity, is called big data. To convert big data into meaningful information, the authors use different analytical approaches. Information extracted by applying big data analytics methods can be used in business decision making, fraud detection, healthcare services, the education sector, machine learning, extreme personalization, and so on. This chapter presents the basics of big data and big data analytics. Big data analysts face many challenges in storing, managing, and analyzing big data, and the chapter details the challenges in each of these dimensions. Recent trends in big data analytics and future directions for big data researchers are also described.


Big Data ◽  
2016 ◽  
pp. 1495-1518
Author(s):  
Mohammad Alaa Hussain Al-Hamami

Big Data is comprised systems, to remain competitive by techniques emerging due to Big Data. Big Data includes structured, semi-structured, and unstructured data. Structured data are data formatted for use in a database management system. Semi-structured and unstructured data include all types of unformatted data, including multimedia and social media content. Among practitioners and applied researchers, the reaction to data available through blogs, Twitter, Facebook, or other social media can be described as a “data rush” promising new insights about consumers' choices and behavior and many other issues. In the past, Big Data was used only by very large organizations, governments, and large enterprises able to create their own infrastructure for hosting and mining large amounts of data. This chapter shows the requirements for protecting Big Data environments with the same rigorous security strategies applied to traditional database systems.


Author(s):  
Trupti Vishwambhar Kenekar ◽  
Ajay R. Dani

Because Big Data is a collection of structured, unstructured, and semi-structured data gathered from various sources, it is important both to mine it and to protect the privacy of individual data. Differential privacy is one of the best measures, as it provides a strong privacy guarantee. The chapter proposes differentially private frequent itemset mining using MapReduce, which requires less time to privately mine large datasets. The chapter discusses the problem of preserving data privacy, the challenges of preserving data privacy in a big data environment, and data privacy techniques and their applications to unstructured data. An analysis of experimental results on structured and unstructured data sets is also presented.
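As a rough illustration of the general idea only, and not the chapter's algorithm, the sketch below adds Laplace noise to per-item support counts that are computed in a map/reduce style. It covers single items rather than full itemsets. All names (`map_counts`, `reduce_counts`, `private_item_counts`) are hypothetical. It assumes a fixed, publicly known item universe, and it assumes that each transaction is truncated to at most `max_items` items, so that adding or removing one transaction changes the count vector by at most `max_items` in L1 norm.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def map_counts(partition, universe, max_items):
    """Map: count item occurrences in one partition of transactions.

    Each transaction is deduplicated, restricted to the public universe, and
    truncated to max_items items to bound its influence on the counts.
    """
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction) & universe)[:max_items]
        counts.update(items)
    return counts


def reduce_counts(partials):
    """Reduce: sum the partial counts produced by all map tasks."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def private_item_counts(partitions, universe, epsilon, max_items, seed=None):
    """Return a noisy support count for every item in the public universe."""
    rng = random.Random(seed)
    universe = set(universe)
    true_counts = reduce_counts(map_counts(p, universe, max_items) for p in partitions)
    scale = max_items / epsilon  # Laplace scale = L1 sensitivity / epsilon
    return {item: true_counts[item] + laplace_noise(scale, rng) for item in sorted(universe)}
```

A smaller `epsilon` gives stronger privacy but noisier counts. Items whose noisy count exceeds a chosen support threshold could then be reported as frequent.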


2019 ◽  
Vol 8 (S3) ◽  
pp. 35-40
Author(s):  
S. Mamatha ◽  
T. Sudha

In this digital world, as organizations evolve rapidly around data-centric assets, the volume of data and the size of databases have been growing exponentially. Data is generated from different sources such as business processes, transactions, social networking sites, and web servers, and exists in structured as well as unstructured form. The term "Big data" is used for large data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Big data sets range in size from a few dozen terabytes to many petabytes in a single data set. Difficulties include capture, storage, search, sharing, analytics, and visualization. Big data is available in structured, unstructured, and semi-structured formats. Relational databases fail to store this multi-structured data. Apache Hadoop is an efficient, robust, reliable, and scalable framework to store, process, transform, and extract big data. The Hadoop framework is open-source and free software, available from the Apache Software Foundation. In this paper we present Hadoop, HDFS, MapReduce, and a c-means big data algorithm to minimize the effort of big data analysis using MapReduce code. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools and related fields.


2014 ◽  
Vol 16 (6) ◽  
pp. 37-40 ◽  
Author(s):  
Shital Suryawanshi ◽  
Prof. V.S Wadne

Author(s):  
Reema Abdulraziq ◽  
Muneer Bani Yassein ◽  
Shadi Aljawarneh

Big data refers to the huge amounts of data used in commercial, industrial, and economic environments. There are three types of big data: structured, unstructured, and semi-structured. In discussions of big data, the three major aspects considered its main dimensions are the volume, velocity, and variety of the data. This data is collected, analysed, and checked for use by end users. Cloud computing and the Internet of Things (IoT) enable this huge amount of collected data to be stored and connected to the Internet. These technologies reduce time and cost, and they can accommodate the data regardless of its size. This chapter focuses on how big data, with the emergence of cloud computing and the Internet of Things (IoT), can be used via several applications and technologies.


Author(s):  
Jaimin N. Undavia ◽  
Atul Patel ◽  
Sheenal Patel

The availability of huge amounts of data has opened up a new area of work and the challenge of analyzing these data. Such analysis has become essential for every organization and may yield useful information about its future prospects. Traditional database systems are neither adequate nor capable of storing, managing, and analyzing such huge amounts of data, so a new term, “Big Data”, was introduced. The term refers to huge amounts of data used for analytical purposes and for future prediction or forecasting. Big Data may consist of a combination of structured, semi-structured, and unstructured data, and managing such data is a major challenge at present. Such heterogeneous data must be maintained in a very secure and specific way. In this chapter, we try to identify these challenges and issues and to resolve them with specific tools.

