Need of Hadoop and Map Reduce for Processing and Managing Big Data

Web Services ◽  
2019 ◽  
pp. 1588-1600
Author(s):  
Manjunath Thimmasandra Narayanapppa ◽  
A. Channabasamma ◽  
Ravindra S. Hegadi

The amount of data around us is increasing every second, and as a result the databases used in today's enterprises are growing at an exponential rate. At the same time, the need to process and analyze this bulky data for business decision making has also increased. Several business and scientific applications generate terabytes of data that have to be processed efficiently on a daily basis. Data is collected and stored at unprecedented rates. Moreover, the challenge is not only to store and manage the huge amount of data, but also to analyze it and extract meaningful value from it. This has contributed to the big data problem faced by industry: conventional software tools and database systems cannot manage and process big data sets within reasonable time limits. The main focus of the chapter is on unstructured data analysis.


Author(s):  
Stephen H. Kiasler ◽  
William H. Money ◽  
Stephen J. Cohen

The world of data has been evolving due to the expansion of operations and the complexity of the data processed by systems. Big Data no longer consists only of numbers and characters but now includes unstructured data types collected by a variety of devices. Recent work has postulated that the Big Data evolutionary process is making a conceptual leap to incorporate intelligence. This challenges system engineers with new issues as they envision and create service systems to process and incorporate these new data sets and structures. This article proposes that Big Data has not yet made a complete evolutionary leap, but rather that a new class of data, at a higher level of abstraction, is needed to integrate this “intelligence” concept. This article examines previous definitions of Smart Data, offers a new conceptualization for smart objects (SO), examines the smart data concept, and identifies issues and challenges of understanding smart objects as a new data managed software paradigm. It concludes that smart objects incorporate new features and have different properties from passive and inert Big Data.


2021 ◽  
pp. 67-74
Author(s):  
Liudmyla Zubyk ◽  
Yaroslav Zubyk

Big data is one of the modern tools that have greatly impacted world industry. It also plays an important role in determining the ways in which businesses and organizations formulate their strategies and policies. However, very limited academic research has been conducted into forecasting based on big data due to the difficulties in capturing, collecting, handling, and modeling unstructured data, which is normally characterized by its confidentiality. We define big data in the context of an ecosystem for future forecasting in business decision-making. It can be difficult for a single organization to possess all of the necessary capabilities to derive strategic business value from its findings. That is why different organizations will build and operate their own analytics ecosystems or tap into existing ones. An analytics ecosystem comprising a symbiosis of data, applications, platforms, talent, partnerships, and third-party service providers lets organizations be more agile and adapt to changing demands. Organizations participating in analytics ecosystems can examine, learn from, and influence not only their own business processes, but also those of their partners. Architectures of popular platforms for forecasting based on big data are presented in this issue.


Author(s):  
Jyotsna Talreja Wassan

Big data is revolutionizing the world in the internet age. A wide variety of areas, such as online businesses, electronic health management, social networking, demographics, geographic information systems, and online education, are gaining insight from big data principles. Big data comprises heterogeneous datasets that are too large to be handled by traditional relational database systems. An important reason for the explosion of interest in big data is that it has become cheap to store large volumes of data and there has been a major rise in computation capacity. This chapter gives an overview of big data ecosystems comprising various big data platforms useful in today's competitive world.


Author(s):  
Nitigya Sambyal ◽  
Poonam Saini ◽  
Rupali Syal

The world is increasingly driven by huge amounts of data. Big data refers to data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Healthcare analytics is a prominent area of big data analytics. It has led to a significant reduction in the morbidity and mortality associated with disease. In order to harness the full potential of big data, various tools like Apache Sentry, BigQuery, NoSQL databases, Hadoop, JethroData, etc. are available for its processing. However, with such enormous amounts of information comes the complexity of data management; other big data challenges occur during data capture, storage, analysis, search, transfer, information privacy, visualization, querying, and update. The chapter focuses on understanding the meaning and concept of big data, analytics of big data, its role in healthcare, various application areas, and the trends and tools used to process big data, along with open challenges.


Big Data ◽  
2016 ◽  
pp. 1495-1518
Author(s):  
Mohammad Alaa Hussain Al-Hamami

Big Data is comprised systems, to remain competitive by techniques emerging due to Big Data. Big Data includes structured, semi-structured, and unstructured data. Structured data are data formatted for use in a database management system. Semi-structured and unstructured data include all types of unformatted data, including multimedia and social media content. Among practitioners and applied researchers, the reaction to data available through blogs, Twitter, Facebook, or other social media can be described as a “data rush” promising new insights about consumers' choices and behavior and many other issues. In the past, Big Data was used only by very large organizations, governments, and large enterprises that had the ability to create their own infrastructure for hosting and mining large amounts of data. This chapter will show the requirements for Big Data environments to be protected using the same rigorous security strategies applied to traditional database systems.


2019 ◽  
Vol 11 ◽  
pp. 184797901989077 ◽  
Author(s):  
Kiran Adnan ◽  
Rehan Akbar

During the recent era of big data, a huge volume of unstructured data is being produced in various forms such as audio, video, images, text, and animation. Effective use of this unstructured big data is a laborious and tedious task. Information extraction (IE) systems help to extract useful information from this large variety of unstructured data. Several techniques and methods have been presented for IE from unstructured data. However, numerous studies conducted on IE from a variety of unstructured data are limited to single data types such as text, image, audio, or video. This article reviews the existing IE techniques along with their subtasks, limitations, and challenges for the variety of unstructured data, highlighting the impact of unstructured big data on IE techniques. To the best of our knowledge, no comprehensive study has been conducted to investigate the limitations of existing IE techniques for the variety of unstructured big data. The objective of the structured review presented in this article is twofold. First, it presents an overview of IE techniques for a variety of unstructured data such as text, image, audio, and video in one place. Second, it investigates the limitations of these existing IE techniques due to the heterogeneity, dimensionality, and volume of unstructured big data. The review finds that advanced techniques for IE, particularly for multifaceted unstructured big data sets, are urgently required by organizations to manage big data and derive strategic information. Further, potential solutions are also presented to improve unstructured big data IE systems for future research. These solutions will help to increase the efficiency and effectiveness of the data analytics process in terms of context-aware analytics systems, data-driven decision-making, and knowledge management.


Author(s):  
Jaimin N. Undavia ◽  
Atul Patel ◽  
Sheenal Patel

The availability of huge amounts of data has opened up a new area and the challenge of analyzing these data. Analysis of these data has become essential for every organization, and such analyses may yield useful information about its future prospects. Traditional database systems are neither adequate nor capable of storing, managing, and analyzing such huge amounts of data, so a new term has been introduced: “Big Data”. This term refers to huge amounts of data that are used for analytical purposes and for future prediction or forecasting. Big Data may consist of a combination of structured, semi-structured, and unstructured data, and managing such data is a big challenge at present. Such heterogeneous data must be maintained in a very secure and specific way. In this chapter, we have tried to identify such challenges and issues and to resolve them with specific tools.


Author(s):  
Siddesh G. M. ◽  
Srinidhi Hiriyannaiah ◽  
K. G. Srinivasa

The Internet has driven the computing world from a few gigabytes of information to terabytes and petabytes, a huge volume of information. These volumes of information come from a variety of sources that span from structured to unstructured data formats. The information needs to be updated quickly and be available on demand on cheaper infrastructure. Data that spans the three Vs, namely Volume, Variety, and Velocity, is called Big Data. The challenge is to store and process this Big Data, run analytics on it, make critical decisions based on the results of processing, and obtain the best outcomes. In this chapter, the authors discuss the capabilities of Big Data, its uses, and the processing of Big Data using Hadoop technologies and tools from the Apache Software Foundation.


Author(s):  
Arpit Kumar Sharma ◽  
Arvind Dhaka ◽  
Amita Nandal ◽  
Kumar Swastik ◽  
Sunita Kumari

The meaning of the term “big data” can be inferred from its name itself (i.e., the collection of large structured or unstructured data sets). In addition to their huge quantity, these data sets are so complex that they cannot be analyzed using conventional data handling software and hardware tools. If processed judiciously, big data can prove to be a huge advantage for the industries using it. Due to its usefulness, studies are being conducted to create methods to handle big data. Knowledge extraction from big data is very important; other than this, there is no purpose in accumulating such volumes of data. Cloud computing is a powerful tool which provides a platform for the storage and computation of massive amounts of data.

