Part of Advances in Neural Information Processing Systems 29 (NIPS 2016)
Maria-Florina F. Balcan, Hongyang Zhang
We study the problem of recovering an incomplete $m\times n$ matrix of rank $r$ with columns arriving online over time. This is known as the problem of life-long matrix completion, and is widely applied to recommendation system, computer vision, system identification, etc. The challenge is to design provable algorithms tolerant to a large amount of noises, with small sample complexity. In this work, we give algorithms achieving strong guarantee under two realistic noise models. In bounded deterministic noise, an adversary can add any bounded yet unstructured noise to each column. For this problem, we present an algorithm that returns a matrix of a small error, with sample complexity almost as small as the best prior results in the noiseless case. For sparse random noise, where the corrupted columns are sparse and drawn randomly, we give an algorithm that exactly recovers an $\mu_0$-incoherent matrix by probability at least $1-\delta$ with sample complexity as small as $O(\mu_0rn\log(r/\delta))$. This result advances the state-of-the-art work and matches the lower bound in a worst case. We also study the scenario where the hidden matrix lies on a mixture of subspaces and show that the sample complexity can be even smaller. Our proposed algorithms perform well experimentally in both synthetic and real-world datasets.