Universal strategies for the two-alternative big data processing

2021, Vol 2052 (1), pp. 012020
Author(s): A V Kolnogorov

Abstract: We consider two-alternative processing of big data in the framework of the two-armed bandit problem. We assume that there are two processing methods with different, fixed, but a priori unknown efficiencies, which may differ for various reasons, including those caused by legislation. The results of data processing are interpreted as random incomes. During the control process, one has to determine the more efficient method and ensure that it is predominantly used. The difficulty of the problem is that its solution depends essentially on the distributions of the one-step incomes corresponding to the results of data processing. However, in the case of big data, we show that there are universal processing strategies for a wide class of distributions of one-step incomes. To this end, we consider the Gaussian two-armed bandit, which arises naturally when batch data processing is analyzed. The minimax risk and minimax strategy are sought as the Bayesian ones corresponding to the worst-case prior distribution. We present a recursive integro-difference equation for computing the Bayesian risk and Bayesian strategy with respect to the worst-case prior distribution, and a second-order partial differential equation into which the integro-difference equation turns in the limiting case as the control horizon goes to infinity. We also show that, in the case of big data, processing data one by one is no more efficient than optimal batch data processing for some types of distributions of one-step incomes, e.g. Bernoulli and Poisson distributions. Numerical experiments show that the proposed universal strategies provide high performance in two-alternative big data processing.
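The batch-processing setting described in the abstract can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it draws Bernoulli one-step incomes, processes the data in batches, and picks the method for each batch with a standard UCB rule. The UCB rule is a stand-in chosen for brevity; it is not the minimax strategy of the paper, and the function name `batch_ucb_bandit` and all parameter names are hypothetical.

```python
import math
import random


def batch_ucb_bandit(p1, p2, horizon, batch_size, seed=0):
    """Process `horizon` items in batches, using one of two methods per batch.

    p1, p2 are the (unknown to the strategy) success probabilities of the
    two methods. Returns (total_income, counts), where counts[i] is the
    number of items processed by method i.

    NOTE: UCB is a stand-in decision rule, not the paper's minimax strategy.
    """
    rng = random.Random(seed)
    probs = (p1, p2)
    sums = [0.0, 0.0]   # cumulative income per method
    counts = [0, 0]     # number of items processed per method
    processed = 0

    while processed < horizon:
        m = min(batch_size, horizon - processed)  # last batch may be shorter

        if counts[0] == 0:
            arm = 0                               # try each method once first
        elif counts[1] == 0:
            arm = 1
        else:
            total = counts[0] + counts[1]
            # Sample mean plus an exploration bonus that shrinks with usage.
            ucb = [
                sums[i] / counts[i] + math.sqrt(2.0 * math.log(total) / counts[i])
                for i in range(2)
            ]
            arm = 0 if ucb[0] >= ucb[1] else 1

        # Income of the whole batch: a sum of m Bernoulli one-step incomes,
        # which is approximately Gaussian when m is large.
        income = sum(1 for _ in range(m) if rng.random() < probs[arm])
        sums[arm] += income
        counts[arm] += m
        processed += m

    return sums[0] + sums[1], counts


if __name__ == "__main__":
    p1, p2 = 0.5, 0.6
    income, counts = batch_ucb_bandit(p1, p2, horizon=10_000, batch_size=100)
    # Expected loss relative to always using the better method.
    regret = abs(p1 - p2) * counts[0 if p1 < p2 else 1]
    print(f"income={income:.0f}, counts={counts}, regret={regret:.1f}")
```

Calling the function with different values of `batch_size` (for example 1 versus 100) gives a rough feel for how little is lost by deciding once per batch rather than once per item, which is the kind of comparison the abstract refers to.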

2022, pp. 1162-1191
Author(s): Dinesh Chander, Hari Singh, Abhinav Kirti Gupta

Data processing has become an important field in today's big data-dominated world. Data is being generated at a tremendous pace from different sources. The nature of data has shifted from batch data to streaming data, and data processing methodologies have changed accordingly. Traditional SQL is no longer capable of dealing with this big data. This chapter describes the nature of data and the various tools, techniques, and technologies used to handle big data. It also describes the need to shift big data to the cloud and the challenges of big data processing in the cloud, the migration from data processing to data analytics, the tools used in data analytics, and the issues and challenges in data processing and analytics. Finally, the chapter touches on an important application area of streaming data, sentiment analysis, and explores it through some test case demonstrations and results.


2019, Vol 12 (1), pp. 42
Author(s): Andrey I. Vlasov, Konstantin A. Muraviev, Alexandra A. Prudius, Demid A. Uzenkov
