Part of Advances in Neural Information Processing Systems 32 (NeurIPS 2019)
Arnak Dalalyan, Philip Thompson
We study the problem of estimating a p-dimensional s-sparse vector in a linear model with Gaussian design. In the case where the labels are contaminated by at most o adversarial outliers, we prove that the ℓ₁-penalized Huber's M-estimator based on n samples attains the optimal rate of convergence (s/n)^{1/2} + (o/n), up to a logarithmic factor. For more general design matrices, our results highlight the importance of two properties: the transfer principle and the incoherence property. These properties with suitable constants are shown to yield the optimal rates of robust estimation with adversarial contamination.
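
To make the estimator concrete, the following is a minimal sketch of an ℓ₁-penalized Huber M-estimator, computed here by proximal gradient descent (ISTA). The solver choice, the function names (`soft_threshold`, `huber_objective`, `l1_huber`), the tuning values `lam` and `delta`, and the synthetic data are all assumptions made for illustration; the abstract does not specify an algorithm or tuning constants.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (entrywise shrinkage toward zero).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def huber_objective(X, y, beta, lam, delta):
    # (1/n) * sum of Huber losses of the residuals + lam * ||beta||_1.
    r = np.abs(y - X @ beta)
    loss = np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))
    return loss.mean() + lam * np.abs(beta).sum()

def l1_huber(X, y, lam, delta, n_iter=500):
    # Proximal gradient descent on the l1-penalized Huber objective.
    n, p = X.shape
    # The Huber gradient is Lipschitz with constant ||X||_2^2 / n,
    # so the step size 1/L below gives a descent method.
    step = n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        residual = y - X @ beta
        # Derivative of the Huber loss is the residual clipped to [-delta, delta].
        grad = -X.T @ np.clip(residual, -delta, delta) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Illustrative synthetic data (assumed values, not from the paper):
# Gaussian design, s-sparse signal, o grossly corrupted labels.
rng = np.random.default_rng(0)
n, p, s, o = 200, 50, 3, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 1.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)
y[:o] += 20.0  # crude stand-in for adversarial contamination of the labels

beta_hat = l1_huber(X, y, lam=0.1, delta=1.0)
```

Clipping the residuals bounds the influence that any single corrupted label can have on the gradient, while the soft-thresholding step promotes sparsity in the estimate.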