A Comparative Study on Big Data Analytics Frameworks, Data Resources and Challenges

2019 ◽  
Vol 13 (7) ◽  
pp. 1 ◽  
Author(s):  
Flasteen Abuqabita ◽  
Razan Al-Omoush ◽  
Jaber Alwidian

Recently, huge amounts of data have been generated all over the world; these data are very large, arrive extremely fast, and vary in type. To extract value from this data and make sense of it, many frameworks and tools need to be developed for analyzing it. To date, many tools and frameworks have been developed to capture, store, analyze, and visualize it. In this study we categorize the existing frameworks used for processing big data into three groups, namely batch processing, stream analytics, and interactive analytics; we discuss each of them in detail and compare them.
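The difference between the batch and stream categories can be illustrated with a minimal Python sketch. It assumes a toy list of numeric readings (hypothetical data, not taken from the study) and is not tied to any particular framework: the batch function sees the whole dataset at once, while the streaming function updates a running mean one record at a time.

```python
from typing import Iterable, Iterator, List


def batch_mean(records: List[float]) -> float:
    """Batch style: the full dataset is available before computing."""
    return sum(records) / len(records)


def streaming_mean(records: Iterable[float]) -> Iterator[float]:
    """Stream style: emit an updated mean after every incoming record."""
    count = 0
    mean = 0.0
    for value in records:
        count += 1
        # Incremental update, so earlier records need not be stored.
        mean += (value - mean) / count
        yield mean


readings = [4.0, 8.0, 6.0, 2.0]  # hypothetical sample data
print(batch_mean(readings))            # one result once all data has arrived
print(list(streaming_mean(readings)))  # one intermediate result per record
```

The last value produced by the streaming version equals the batch result; the trade-off is that the stream yields usable intermediate answers before the data is complete.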

Author(s):  
Nirmit Singhal ◽  
Amita Goel ◽  
Nidhi Sengar ◽  
Vasudha Bahl

The world generated 52 times the amount of data in 2010 and 76 times the number of information sources in 2022. The ability to use this data creates enormous opportunities, and in order to make these opportunities a reality, people must use data to solve problems. This need is especially pressing in the midst of a global pandemic, when people all over the world seek reliable, trustworthy information about COVID-19 (Coronavirus). Tableau plays a key role in this scenario because it is an extremely powerful tool for quickly visualizing large amounts of data. It has a simple drag-and-drop interface, works with a wide variety of data sources, and makes attractive infographics quick and simple to create. COVID-19 (Coronavirus) analytics with Tableau allows users to create dashboards that support their analysis. Tableau is a tool for big data analytics that presents its output visually, making it more understandable and presentable. Data blending, real-time reporting, and data collaboration are among its features. Ultimately, this paper provides a clear picture of the growing COVID-19 (Coronavirus) data and the tools that can assist more effectively, accurately, and efficiently. Keywords: Data Visualization, Tableau, Data Analysis, Covid-19 analysis, Covid-19 data


Author(s):  
Pethuru Raj

The implications of the digitization process among a bevy of trends are definitely many and memorable. One is the abnormal growth in data generation, gathering, and storage due to a steady increase in the number of data sources, structures, scopes, sizes, and speeds. In this chapter, the author describes some of the impactful developments brewing in the IT space: how the tremendous amount of data produced and processed all over the world impacts the IT and business domains; how next-generation IT infrastructures are accordingly being refactored, remedied, and readied for the impending big data-induced challenges; how the big data analytics discipline is likely to move towards fulfilling the digital universe's requirement of extracting and extrapolating actionable insights for the knowledge-parched; and finally, the establishment and sustenance of the dreamt smarter planet.


Author(s):  
Mohd Imran ◽  
Mohd Vasim Ahamad ◽  
Misbahul Haque ◽  
Mohd Shoaib

The term big data analytics refers to the mining and analysis of voluminous amounts of data using various tools and platforms. Popular tools include Apache Hadoop, Apache Spark, HBase, Storm, GridGain, HPCC, Cassandra, Pig, Hive, and NoSQL. Which tool is used depends on the parameters considered in the big data analysis. A comparative analysis of such analytical tools is therefore needed in order to choose the best and simplest way of achieving optimal throughput and efficient mining. This chapter contributes a comparative study of big data analytics tools based on different aspects, such as their functionality, pros, and cons, which can be used to determine the best and most efficient among them. With this comparative study, people can use such tools more efficiently.


Author(s):  
Mohd Vasim Ahamad ◽  
Misbahul Haque ◽  
Mohd Imran

In the present digital era, more data are generated and collected than ever before. But this huge amount of data is of no use until it is converted into useful information. Such data, coming from many sources in various formats and with greater complexity, is called big data. To convert big data into meaningful information, the authors use different analytical approaches. Information extracted by applying big data analytics methods can be used in business decision making, fraud detection, healthcare services, the education sector, machine learning, extreme personalization, etc. This chapter presents the basics of big data and big data analytics. Big data analysts face many challenges in storing, managing, and analyzing big data, and this chapter details the challenges in each of these dimensions. Furthermore, recent trends in big data analytics and future directions for big data researchers are also described.


Author(s):  
Nitigya Sambyal ◽  
Poonam Saini ◽  
Rupali Syal

The world is increasingly driven by huge amounts of data. Big data refers to data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Healthcare analytics is a prominent area of big data analytics. It has led to a significant reduction in the morbidity and mortality associated with disease. To harness the full potential of big data, various tools such as Apache Sentry, BigQuery, NoSQL databases, Hadoop, and JethroData are available for processing it. However, with such enormous amounts of information comes the complexity of data management; other big data challenges arise during data capture, storage, analysis, search, transfer, information privacy, visualization, querying, and update. The chapter focuses on understanding the meaning and concept of big data, the analytics of big data, its role in healthcare, various application areas, and the trends and tools used to process big data, along with open challenges.


Author(s):  
P. Venkateswara Rao ◽  
A. Ramamohan Reddy ◽  
V. Sucharita

In aquaculture, digital advancements constantly produce huge amounts of data, so aquaculture data has entered the big data world. The need for data management and analytics models increases as development progresses. Consequently, all the data cannot be stored on a single machine, and a solution is needed that stores and analyzes huge amounts of data, that is, a big data solution. In this chapter, a framework based on Hive and Hadoop is developed that addresses shrimp disease using historical data. The shrimp data is acquired from different sources, such as aquaculture websites and various laboratory reports. After collection, noise is removed from the data. Once normalized, the data is uploaded to HDFS and stored in a file format that Hive supports. Finally, the classified data is stored in a designated location. Based on the features extracted from the aquaculture data, HiveQL can be used to analyze shrimp disease symptoms.
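The preprocessing steps named in this abstract (noise removal, normalization, writing a Hive-readable file) can be sketched in plain Python. This is only an illustration under stated assumptions: the records, the field names `pond`, `ph` and `temp_c`, the rule that a row with an empty field counts as noise, and the choice of min-max normalization are all hypothetical and are not taken from the chapter.

```python
import csv
import io
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical records; field names are illustrative only.
RAW: List[Row] = [
    {"pond": "A", "ph": "7.8", "temp_c": "29.0"},
    {"pond": "B", "ph": "", "temp_c": "31.0"},  # empty field, treated as noise
    {"pond": "C", "ph": "8.2", "temp_c": "33.0"},
]


def remove_noise(rows: List[Row]) -> List[Row]:
    """Drop rows that contain any empty field (assumed noise rule)."""
    return [r for r in rows if all(v.strip() for v in r.values())]


def min_max_normalize(rows: List[Row], field: str) -> List[Row]:
    """Rescale one numeric field to the range [0, 1]."""
    values = [float(r[field]) for r in rows]
    low, span = min(values), max(values) - min(values)
    out = []
    for r in rows:
        scaled = 0.0 if span == 0 else (float(r[field]) - low) / span
        out.append({**r, field: f"{scaled:.3f}"})
    return out


def to_delimited_text(rows: List[Row], fields: List[str]) -> str:
    """Serialize rows as tab-delimited text, one record per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for r in rows:
        writer.writerow([r[f] for f in fields])
    return buf.getvalue()


clean = remove_noise(RAW)
for name in ("ph", "temp_c"):
    clean = min_max_normalize(clean, name)
hive_text = to_delimited_text(clean, ["pond", "ph", "temp_c"])
print(hive_text)
```

A tab-delimited text file of this kind is one format a Hive table can read once it is placed on HDFS; the upload itself and the HiveQL queries are outside the scope of this sketch.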


2022 ◽  
pp. 622-631
Author(s):  
Mohd Imran ◽  
Mohd Vasim Ahamad ◽  
Misbahul Haque ◽  
Mohd Shoaib

The term big data analytics refers to the mining and analysis of voluminous amounts of data using various tools and platforms. Popular tools include Apache Hadoop, Apache Spark, HBase, Storm, GridGain, HPCC, Cassandra, Pig, Hive, and NoSQL. Which tool is used depends on the parameters considered in the big data analysis. A comparative analysis of such analytical tools is therefore needed in order to choose the best and simplest way of achieving optimal throughput and efficient mining. This chapter contributes a comparative study of big data analytics tools based on different aspects, such as their functionality, pros, and cons, which can be used to determine the best and most efficient among them. With this comparative study, people can use such tools more efficiently.


Information ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 60 ◽  
Author(s):  
Lorenzo Carnevale ◽  
Antonio Celesti ◽  
Maria Fazio ◽  
Massimo Villari

Nowadays, we are observing a growing interest in Big Data applications in different healthcare sectors. One of these is cardiology: electrocardiograms produce a huge amount of data about heart health status that needs to be stored and analysed in order to detect possible issues. In this paper, we focus on the arrhythmia detection problem. Specifically, our objective is to address the problem of distributed processing of big data generated by electrocardiogram (ECG) signals in order to carry out pre-processing analysis. To this end, an algorithm for the identification of heartbeats and arrhythmias is proposed. The algorithm is designed for distributed processing over the Cloud, since big data could represent the bottleneck for cardiology applications. In particular, we implemented the Menard algorithm in Apache Spark to process big data coming from ECG signals and identify arrhythmias. Experiments conducted using a dataset provided by the Physionet.org European ST-T Database show an improvement in terms of response times. As highlighted by our outcomes, our solution provides a scalable and reliable system, which may address the challenges raised by big data in healthcare.
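Heartbeat identification of the kind described here is often approached with first-derivative threshold detectors. The Python sketch below shows one such detector on a synthetic signal. It is not the authors' Spark implementation: the five-point slope filter, the 0.7 threshold fraction, the refractory window and the test signal are all assumptions chosen for illustration.

```python
from typing import List


def slope(signal: List[float]) -> List[float]:
    """Five-point first-derivative estimate; edge samples stay at zero."""
    n = len(signal)
    y = [0.0] * n
    for i in range(2, n - 2):
        y[i] = (-2 * signal[i - 2] - signal[i - 1]
                + signal[i + 1] + 2 * signal[i + 2])
    return y


def detect_beats(signal: List[float], fraction: float = 0.7,
                 refractory: int = 5) -> List[int]:
    """Return sample indices where the slope first exceeds the threshold.

    fraction and refractory are illustrative defaults, not tuned values.
    """
    y = slope(signal)
    if not y:
        return []
    threshold = fraction * max(y)
    if threshold <= 0:
        return []  # flat or purely falling signal: nothing to detect
    beats: List[int] = []
    last = -refractory
    for i, value in enumerate(y):
        # The refractory window suppresses repeated hits on the same beat.
        if value > threshold and i - last >= refractory:
            beats.append(i)
            last = i
    return beats


# Synthetic signal: flat baseline with two unit spikes (hypothetical data).
ecg = [0.0] * 30
ecg[10] = 1.0
ecg[20] = 1.0
beats = detect_beats(ecg)
rr_intervals = [b - a for a, b in zip(beats, beats[1:])]
print(beats, rr_intervals)
```

The detector marks the steep rise just before each spike rather than the peak itself, and the spacing between detections gives the beat-to-beat interval in samples. Irregular intervals are one simple indicator a later arrhythmia stage could examine.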

