On the Sample Complexity of Robust PCA

Part of Advances in Neural Information Processing Systems 25 (NIPS 2012)

Bibtex Metadata Paper


Matthew Coudron, Gilad Lerman


We estimate the sample complexity of a recent robust estimator for a generalized version of the inverse covariance matrix. This estimator is used in a convex algorithm for robust subspace recovery (i.e., robust PCA). Our model assumes a sub-Gaussian underlying distribution and an i.i.d.~sample from it. Our main result shows with high probability that the norm of the difference between the generalized inverse covariance of the underlying distribution and its estimator from an i.i.d.~sample of size $N$ is of order $O(N^{-0.5+\eps})$ for arbitrarily small $\eps>0$ (affecting the probabilistic estimate); this rate of convergence is close to one of direct covariance and inverse covariance estimation, i.e., $O(N^{-0.5})$. Our precise probabilistic estimate implies for some natural settings that the sample complexity of the generalized inverse covariance estimation when using the Frobenius norm is $O(D^{2+\delta})$ for arbitrarily small $\delta>0$ (whereas the sample complexity of direct covariance estimation with Frobenius norm is $O(D^{2})$). These results provide similar rates of convergence and sample complexity for the corresponding robust subspace recovery algorithm, which are close to those of PCA. To the best of our knowledge, this is the only work analyzing the sample complexity of any robust PCA algorithm.