New Directions in Streaming Algorithms
Large volumes of available data have led to the emergence of new computational models for data analysis. One such model is captured by the notion of streaming algorithms: given a sequence of N items, the goal is to compute the value of a given function of the input items using a small number of passes and an amount of space sublinear in N. Streaming algorithms have applications in many areas, such as networking and large-scale machine learning. Despite a huge amount of work in this area over the last two decades, multiple aspects of streaming algorithms remain poorly understood, such as (a) streaming algorithms for combinatorial optimization problems and (b) incorporating modern machine learning techniques in the design of streaming algorithms. In the first part of this thesis, we describe (essentially) optimal streaming algorithms for set cover and maximum coverage, two classic problems in combinatorial optimization. In the second part, we show how to augment classic streaming algorithms for the frequency estimation and low-rank approximation problems with machine learning oracles in order to improve their space-accuracy tradeoffs. The new algorithms combine the benefits of machine learning with the formal guarantees available through algorithm design theory.