A Study of Big Data Processing for Sentiments Analysis

2022 ◽  
pp. 1162-1191
Author(s):  
Dinesh Chander ◽  
Hari Singh ◽  
Abhinav Kirti Gupta

Data processing has become an important field in today's big-data-dominated world. Data is being generated at a tremendous pace from many different sources. The nature of data has shifted from batch data to streaming data, and data processing methodologies have changed as a consequence. Traditional SQL is no longer capable of dealing with this big data. This chapter describes the nature of data and the various tools, techniques, and technologies used to handle big data. It also describes the need to shift big data to the cloud and the challenges of big data processing in the cloud, the migration from data processing to data analytics, the tools used in data analytics, and the issues and challenges in data processing and analytics. Finally, the chapter touches on an important application area of streaming data, sentiment analysis, and explores it through some test case demonstrations and results.


Author(s):  
Rajganesh Nagarajan ◽  
Ramkumar Thirunavukarasu

In this chapter, the authors consider the different categories of data that are processed by big data analytics tools. The challenges of big data processing are identified, and a solution based on cloud computing is highlighted. Because cloud computing is widely advocated for its pay-per-use model, data processing tools can be deployed effectively in the cloud, reducing investment cost. In addition, the chapter discusses big data platforms, tools, and applications, along with the concept of data visualization. Finally, the applications of data analytics are discussed for future research.


Big Data ◽  
2016 ◽  
pp. 418-440
Author(s):  
Amir A. Khwaja

The big data explosion has already happened, and the situation will only intensify as a growing number of data sources and high-end technologies generate data at a frantic pace. One of the most important aspects of big data is the ability to capture, process, and analyze data as it is produced, in real time, to support real-time business decisions. Alternative approaches to big data processing, especially those built on highly parallel and real-time computation, must be investigated. The chapter presents RealSpec, a real-time specification language that may be used for modeling big data analytics because it provides the language features needed for real-time big data processing, such as concurrent processes, multi-threading, resource modeling, timing constraints, and exception handling. The chapter provides an overview of RealSpec and applies the language to a detailed big data event recognition case study to demonstrate its applicability to big data framework and analytics modeling.


2020 ◽  
Vol 10 (14) ◽  
pp. 4901
Author(s):  
Waleed Albattah ◽  
Rehan Ullah Khan ◽  
Khalil Khan

Processing big data requires serious computing resources. Because of this challenge, big data processing is an issue not only for algorithms but also for computing resources. This article analyzes a large amount of data from different points of view. One perspective is the processing of reduced collections of big data with fewer computing resources. To that end, the study analyzed 40 GB of data to test various strategies for reducing the amount of data processed. The goal is to reduce the data without compromising detection and model learning in machine learning. Several alternatives were analyzed, and it was found that in many cases and types of settings, data can be reduced to some extent without compromising detection efficiency. Tests of 200 attributes showed that, with a performance loss of only 4%, more than 80% of the data could be ignored. The results of the study thus provide useful insights into large-scale data analytics.


Big Data ◽  
2016 ◽  
pp. 1-29 ◽  
Author(s):  
Yushi Shen ◽  
Yale Li ◽  
Ling Wu ◽  
Shaofeng Liu ◽  
Qian Wen

This chapter provides an overview of big data and its environment and opportunities. It starts with a definition of big data and describes the unique characteristics, structure, and value of big data, and the business drivers for big data analytics. It defines the role of the data scientist and describes the new ecosystem for big data processing and analysis.


Author(s):  
Anwar H. Katrawi ◽  
Rosni Abdullah ◽  
Mohammed Anbar ◽  
Ibrahim AlShourbaji ◽  
Ammar Kamal Abasi

The proliferation of information technology produces a huge amount of data, called big data, that cannot be processed by traditional database systems. These various types of data come from different sources. Stragglers are a major bottleneck in big data processing, and hence the early detection and accurate identification of stragglers can have an important impact on processing performance. This work assesses five straggler identification methods: the Hadoop native scheduler, the LATE scheduler, Mantri, MonTool, and Dolly. The performance of these techniques was evaluated on three benchmarks: Sort, Grep, and WordCount. The results show that the LATE scheduler performs best and would be an efficient choice for obtaining better straggler identification results.
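
To make the idea of straggler identification concrete, the following is a minimal Python sketch of the heuristic commonly associated with the LATE (Longest Approximate Time to End) scheduler: estimate each running task's progress rate, restrict attention to the slowest tasks, and pick the one with the longest estimated time left. It is not the implementation evaluated in this work. The `Task` fields, the function names, and the `slow_fraction` parameter are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str     # hypothetical identifier
    progress: float  # fraction of work completed, in [0, 1]
    elapsed: float   # seconds the task has been running


def progress_rate(task: Task) -> float:
    """Average progress per second since the task started."""
    return task.progress / task.elapsed


def estimated_time_left(task: Task) -> float:
    """Remaining work divided by the observed progress rate."""
    rate = progress_rate(task)
    if rate <= 0.0:
        return float("inf")  # no observed progress yet
    return (1.0 - task.progress) / rate


def pick_straggler(tasks, slow_fraction=0.25):
    """Return the slow task with the longest estimated time left, or None.

    slow_fraction is an assumed tuning knob: only tasks whose progress
    rate falls at or below this quantile of all running tasks qualify.
    """
    running = [t for t in tasks if t.progress < 1.0 and t.elapsed > 0.0]
    if not running:
        return None
    rates = sorted(progress_rate(t) for t in running)
    cutoff = rates[int(slow_fraction * (len(rates) - 1))]
    slow = [t for t in running if progress_rate(t) <= cutoff]
    return max(slow, key=estimated_time_left)


tasks = [
    Task("a", progress=0.9, elapsed=100.0),
    Task("b", progress=0.2, elapsed=100.0),
    Task("c", progress=0.5, elapsed=50.0),
    Task("d", progress=0.6, elapsed=100.0),
]
straggler = pick_straggler(tasks)
print(straggler.task_id if straggler else "no straggler")
```

Ranking by estimated time left rather than by raw progress favours the task whose speculative re-execution is most likely to shorten the overall job, which is the intuition behind this family of schedulers.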


2021 ◽  
Vol 2052 (1) ◽  
pp. 012020
Author(s):  
A V Kolnogorov

We consider the two-alternative processing of big data in the framework of the two-armed bandit problem. We assume that there are two processing methods with different, fixed, but a priori unknown efficiencies, which may differ for various reasons, including those caused by legislation. Results of data processing are interpreted as random incomes. During the control process, one has to determine the most efficient method and ensure that it is predominantly used. The difficulty of the problem stems from the fact that its solution essentially depends on the distributions of one-step incomes corresponding to the results of data processing. However, in the case of big data we show that there are universal processing strategies for a wide class of distributions of one-step incomes. To this end, we consider the Gaussian two-armed bandit, which naturally arises when batch data processing is analyzed. The minimax risk and minimax strategy are sought as the Bayesian ones corresponding to the worst-case prior distribution. We present a recursive integro-difference equation for computing the Bayesian risk and Bayesian strategy with respect to the worst-case prior distribution, and a second-order partial differential equation into which the integro-difference equation turns in the limiting case as the control horizon goes to infinity. We also show that, in the case of big data, processing data one by one is not more efficient than optimal batch data processing for some types of distributions of one-step incomes, e.g. for Bernoulli and Poisson distributions. Numerical experiments are presented and show that the proposed universal strategies provide high performance in two-alternative big data processing.
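
As a rough illustration of batch processing in a Gaussian two-armed bandit, the Python sketch below chooses one of two processing methods per batch and tracks the resulting regret. It substitutes a generic upper-confidence-bound rule for the minimax strategy derived in the paper, which requires solving the equations described above. The function name, the parameters `means`, `horizon`, `batch_size`, `sigma`, and `seed`, and the specific confidence bonus are all assumptions made for this example.

```python
import math
import random


def batch_two_armed_bandit(means, horizon, batch_size, sigma=1.0, seed=0):
    """Process `horizon` items in batches, choosing one method per batch.

    Uses a simple upper-confidence-bound rule as a stand-in strategy;
    this is NOT the minimax strategy from the paper.
    """
    rng = random.Random(seed)
    pulls = [0, 0]       # items processed by each method
    totals = [0.0, 0.0]  # cumulative income of each method
    processed = 0
    while processed < horizon:
        n = min(batch_size, horizon - processed)
        if pulls[0] == 0:
            arm = 0  # first batch: try method 0
        elif pulls[1] == 0:
            arm = 1  # second batch: try method 1
        else:
            # Sample mean plus a confidence bonus that shrinks with use.
            ucb = [
                totals[a] / pulls[a]
                + sigma * math.sqrt(2.0 * math.log(processed) / pulls[a])
                for a in (0, 1)
            ]
            arm = 0 if ucb[0] >= ucb[1] else 1
        # The whole batch is processed by the chosen method.
        totals[arm] += sum(rng.gauss(means[arm], sigma) for _ in range(n))
        pulls[arm] += n
        processed += n
    # Expected income lost relative to always using the better method.
    regret = max(means) * horizon - sum(means[a] * pulls[a] for a in (0, 1))
    return pulls, regret


pulls, regret = batch_two_armed_bandit([1.0, 0.0], horizon=5000, batch_size=100)
print(pulls, regret)
```

Because the decision is made once per batch, the strategy needs only the batch totals rather than per-item outcomes, which is the setting in which the Gaussian approximation described in the abstract arises.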


Author(s):  
Kommu Narendra ◽  
G. Aghila

Many sectors and fields are being computerized to make work paperless, more transparent, and more efficient. Banking is one such sector that has undergone enormous changes. Transferring any amount from any part of the world to any other is now possible around the clock. The dependency on technology for providing these services necessitates security, and the additional risks involved in the cross-border nature of bank transactions pose new challenges for banking regulators and supervisors. Much research is under way in the areas of big data processing for banks, data analytics, and security for cross-border payments, with the aim of mitigating these risks. Blockchain is one such advancement for addressing the challenges in financial services. This chapter provides a brief overview of blockchain usage, addressing the traditional issues and challenges of cross-border transactions.

