Big Data Analytics Using Apache Hive to Analyze Health Data

2022 ◽  
pp. 979-992
Author(s):  
Pavani Konagala

A large volume of data is stored electronically, and its total volume is very difficult to measure. This data comes from various sources, such as stock exchanges, which may generate terabytes of data every day; Facebook, which may take about one petabyte of storage; and internet archives, which may store up to two petabytes of data. Managing data at this scale with relational database management systems is very difficult, and with such massive data, reading from and writing to the drive takes more time. The storage and analysis of this data has therefore become a major problem. Big data offers a solution to these problems by specifying methods to store and analyze large data sets. This chapter presents a brief study of big data techniques for analyzing these types of data. It covers Hadoop characteristics, Hadoop architecture, the advantages of big data, and the big data ecosystem. Further, the chapter includes a comprehensive study of Apache Hive applied to health-related data and deaths data of the U.S. government.


2017 ◽  
pp. 83-99
Author(s):  
Sivamathi Chokkalingam ◽  
Vijayarani S.

The term Big Data refers to large-scale information management and analysis technologies that exceed the capability of traditional data processing technologies. Big Data is differentiated from traditional technologies in three ways: the volume, velocity, and variety of data. Big data analytics is the process of analyzing large data sets that contain a variety of data types to uncover hidden patterns, unknown correlations, market trends, customer preferences, and other useful business information. Since Big Data is a newly emerging field, there is a need to develop new technologies and algorithms for handling big data. The main objective of this paper is to provide knowledge about various research challenges of Big Data analytics. The paper gives a brief overview of various types of Big Data analytics. For each type of analytics, it describes the process steps and tools and gives a banking application. Some research challenges of big data analytics, and possible solutions to them, are also discussed.


Author(s):  
Mamata Rath

Big data analytics is a refined approach to combining large data sets that include a collection of data elements in order to expose hidden patterns, undetected associations, business logic, client preferences, and other helpful business information. Big data analytics involves challenging techniques to mine and extract relevant data, including accessing a database, effectively mining the data, and querying and inspecting data, all aimed at enhancing the technical execution of various task segments. The capacity to synthesize large amounts of data can enable an organization to manage substantial data that can influence the business. The primary goal of big data analytics is therefore to help business organizations gain an enhanced comprehension of data and, subsequently, make proficient and informed decisions.


Author(s):  
Abhishek Bajpai ◽  
Dr. Sanjiv Sharma

As the volume of data produced in our society increases day by day, the exploration of big data in healthcare is growing at an unprecedented rate. Nowadays, big data is a very popular buzzword in many areas. This paper attempts to establish that even the healthcare industry is stepping into the big data pool to take advantage of its advanced tools and technologies. It reviews research carried out in the healthcare realm using big data approaches and methodologies. Big data methodologies can be used for healthcare data analytics (data characterized by the 4 V's) to support better decisions that increase business profit and customer affection, to acquire a better understanding of market behaviours and trends, and to provide e-health services using Digital Imaging and Communications in Medicine (DICOM). Big data techniques such as MapReduce and machine learning can be applied to develop systems for the early diagnosis of disease, i.e., analysis of chronic diseases such as heart disease, diabetes, and stroke. The analysis of the data is performed using the big data analytics framework Hadoop, which is used to process large data sets. Further, the paper presents various big data tools, challenges, opportunities, and hurdles, followed by the conclusion.


2017 ◽  
Vol 7 (1) ◽  
pp. 183-195
Author(s):  
Sasikala V

Big data analytics is the process of examining large data sets to uncover hidden patterns, unknown correlations, market trends, customer preferences, and other useful business information. The analytical findings can lead to more effective marketing, new revenue opportunities, better customer service, improved operational efficiency, competitive advantages over rival organizations, and other business benefits.


2021 ◽  
Author(s):  
Pranjal Kumar ◽  
Siddhartha Chauhan

Big data analysis and Artificial Intelligence have recently received significant attention for creating more opportunities in the health sector to aggregate or collect large-scale data. Today, our genomes and microbiomes can be sequenced, and all information exchanged between physicians and patients in Electronic Health Records (EHR) can, at least theoretically, be collected and traced. Social media and mobile devices also provide a great deal of health-related data regarding activity, diets, social contacts, and so on. However, it is increasingly difficult to use this information to answer health questions, in particular because the data comes from various domains, lives in different infrastructures, and is of very variable quality. The massive collection and aggregation of personal data comes with a number of ethical, policy, methodological, and technological challenges. It should be acknowledged that large-scale clinical evidence is still needed to confirm the promise of Big Data and Artificial Intelligence (AI) in health care. This paper explores the complexities of big data and artificial intelligence in healthcare as well as their benefits and prospects.


Author(s):  
J.Phani Prasad ◽  
T. Venkatesham

The concept of retailing has fundamentally changed business, and customers today can access a variety of products through retail outlets in many forms. To compete in the global market, meet business needs, and achieve better growth, retail companies are using data-driven marketing strategies. Data has thus become a major source for retail companies to understand the timely needs of customers and to predict their buying behaviour. Retail companies are finding different ways to extract meaningful information from large data sets that come from different sources and in various formats. Big data is one of the present-day technologies helping the retail industry, and companies are trying to understand how big data and analytics can empower them to make the right decisions. This study discusses how big data analytics impacts the retail sector.


Author(s):  
Anitha S. Pillai ◽  
Bindu Menon

Advancement in technology has paved the way for the growth of big data. We are able to exploit this data to a great extent because the costs of collecting, storing, and analyzing a large volume of data have plummeted. There is an exponential increase in the amount of health-related data being generated by smart devices, and proper mining of this data is essential for knowledge discovery and therapeutic product development. The expanding field of big data analytics is playing a vital role in healthcare practices and research. A large number of people are affected by Alzheimer's Disease (AD), and as a result, it becomes very challenging for family members to care for these individuals. The objective of this chapter is to highlight how deep learning can be used for the early diagnosis of AD and to present the outcomes of research studies by both neurologists and computer scientists. The chapter gives an introduction to big data, deep learning, AD, biomarkers, and brain images, and concludes by suggesting blood biomarkers as an ideal solution for early detection of AD.


2018 ◽  
Vol 7 (3.29) ◽  
pp. 12
Author(s):  
L Chandra Sekhar Reddy ◽  
Dr D. Murali

We live today in a digital world in which a tremendous amount of data is generated by each digital service we use. This vast amount of generated data is called Big Data. According to Wikipedia, Big Data is a term for data sets so large or complex that traditional data processing application software is inadequate to handle them [5]. Its challenges include capturing, storing, analysing, searching, sharing, transferring, viewing, querying, and updating data, as well as maintaining the confidentiality of information. Google's streaming service, YouTube, is one of the best examples of a service that produces a massive amount of data in a brief period. Data extraction from a significant amount of data is done using Hadoop and MapReduce to measure performance. Hadoop is a system that offers reliable storage and analysis: storage is provided by HDFS (Hadoop Distributed File System) and analysis by MapReduce. MapReduce is a programming model and a corresponding implementation for processing large data sets. This article presents the analysis of Big Data on YouTube using the Hadoop and MapReduce techniques.
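The MapReduce programming model mentioned in this abstract can be illustrated with a minimal, single-process sketch in plain Python. It is not Hadoop code and not the authors' implementation: the record layout (`video_id,category`), the sample records, and all function names are hypothetical, chosen only to show the map, shuffle, and reduce phases.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: turn one 'video_id,category' record into a (category, 1) pair."""
    _video_id, category = record.split(",")
    yield category, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical sample input, not taken from the paper's data set.
SAMPLE_RECORDS = [
    "v1,Music",
    "v2,Comedy",
    "v3,Music",
    "v4,News",
    "v5,Music",
]

if __name__ == "__main__":
    print(run_job(SAMPLE_RECORDS))
```

In a real Hadoop job the map and reduce functions would run in parallel across many nodes, with the framework performing the shuffle between them; here everything runs in one process for clarity.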


2016 ◽  
Vol 6 (3) ◽  
pp. 53-64 ◽  
Author(s):  
Louise Leenen ◽  
Thomas Meyer

Governments, military forces, and other organisations responsible for cybersecurity deal with vast amounts of data that have to be understood in order to support intelligent decision making. Due to the vast amounts of information pertinent to cybersecurity, automation is required for processing and decision making, specifically to provide advance warning of possible threats. The ability to detect patterns in vast data sets and to understand the significance of the detected patterns is essential in the cyber defence domain. Big data technologies supported by semantic technologies can improve cybersecurity, and thus cyber defence, by providing support for the processing and understanding of the huge amounts of information in the cyber environment. The term big data analytics refers to advanced analytic techniques such as machine learning, predictive analysis, and other intelligent processing techniques applied to large data sets that contain different data types. The purpose is to detect patterns, correlations, trends, and other useful information. Semantic technologies are a knowledge representation paradigm in which the meaning of data is encoded separately from the data itself. The use of semantic technologies such as logic-based systems to support decision making is becoming increasingly popular. However, most automated systems are currently based on syntactic rules, which are generally not sophisticated enough to deal with the complexity of the decisions required. The incorporation of semantic information allows for increased understanding and sophistication in cyber defence systems. This paper argues that both big data analytics and semantic technologies are necessary to provide countermeasures against cyber threats. An overview of the use of semantic technologies and big data technologies in cyber defence is provided, and important areas for future research in the combined domains are discussed.

