Abstract
Leverage scores, loosely speaking, reflect the importance of the rows and columns of a matrix. Ideally, given the leverage scores of a rank-$r$ matrix $M\in \mathbb{R}^{n\times n}$, that matrix can be reliably completed from just $O(rn\log^{2}n)$ samples if the samples are chosen randomly from a non-uniform distribution induced by the leverage scores. In practice, however, the leverage scores are often unknown a priori. In that case one falls back on uniform matrix completion, which uses uniform random sampling and whose sample complexity increases to $O(\eta(M)\cdot rn\log^{2}n)$, where $\eta(M)$ is the largest leverage score of $M$. In this paper, we propose a two-phase algorithm called MC2 for matrix completion: in the first phase, the leverage scores are estimated from uniform random samples; in the second phase, the matrix is resampled non-uniformly according to the estimated leverage scores and then completed. For well-conditioned matrices, the total sample complexity of MC2 is no worse than that of uniform matrix completion, and for certain classes of well-conditioned matrices (namely, reasonably coherent matrices whose leverage scores exhibit mild decay) MC2 requires substantially fewer samples. Numerical simulations suggest that the algorithm outperforms uniform matrix completion on a broad class of matrices and, in particular, is much less sensitive to the condition number than our theory currently requires.
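To make the quantities above concrete, the following is a minimal NumPy sketch of leverage scores and of a sampling distribution induced by them. It is an illustration under stated assumptions, not the MC2 algorithm itself: it computes exact leverage scores from a fully known matrix, whereas MC2 estimates them from uniform samples in its first phase. The normalization $(n/r)\,\|U^{\top}e_i\|_2^2$ (so that the scores average to 1), the rule "probability proportional to $\mu_i+\nu_j$", and the names `leverage_scores`, `sampling_probabilities`, and `budget` are all assumptions made for this example.

```python
import numpy as np

def leverage_scores(M, r):
    """Row and column leverage scores of the best rank-r approximation of M.

    Assumed convention: scores are scaled by (dimension / r), so each set
    of scores averages to 1 and the largest score is at least 1.
    """
    n1, n2 = M.shape
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    U_r, V_r = U[:, :r], Vt[:r, :].T          # top-r singular vectors
    mu = (n1 / r) * np.sum(U_r**2, axis=1)    # row leverage scores
    nu = (n2 / r) * np.sum(V_r**2, axis=1)    # column leverage scores
    return mu, nu

def sampling_probabilities(mu, nu, budget):
    """Entrywise probabilities proportional to mu_i + nu_j (assumed rule).

    Scaled so the expected number of samples is at most `budget`,
    then clipped at 1.
    """
    weights = mu[:, None] + nu[None, :]
    return np.minimum(1.0, budget * weights / weights.sum())

# Hypothetical test matrix: a random rank-3 matrix of size 50 x 50.
rng = np.random.default_rng(0)
n, r = 50, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))

mu, nu = leverage_scores(M, r)
eta = max(mu.max(), nu.max())                 # largest leverage score
P = sampling_probabilities(mu, nu, budget=600)
mask = rng.random((n, n)) < P                 # one non-uniform sample pattern
```

Under this convention `eta` plays the role of $\eta(M)$: it is close to 1 when all rows and columns matter about equally, and it grows as the matrix becomes more coherent, which is the regime where leverage-based sampling is expected to help most.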