Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability

Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

AuthorFeedback Bibtex MetaReview Paper Review Supplemental


Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau


In this paper, we revisit the problem of distribution-independently learning halfspaces under Massart noise with rate $\eta$. Recent work resolved a long-standing problem in this model of efficiently learning to error $\eta + \epsilon$ for any $\epsilon > 0$, by giving an improper learner that partitions space into $\text{poly}(d,1/\epsilon)$ regions. Here we give a much simpler algorithm and settle a number of outstanding open questions: (1) We give the first \emph{proper} learner for Massart halfspaces that achieves $\eta + \epsilon$. (2) Based on (1), we develop a blackbox knowledge distillation procedure to convert an arbitrarily complex classifier to an equally good proper classifier. (3) By leveraging a simple but overlooked connection to \emph{evolvability}, we show any SQ algorithm requires super-polynomially many queries to achieve $\mathsf{OPT} + \epsilon$. We then zoom out to study generalized linear models and give an efficient algorithm for learning under a challenging new corruption model generalizing Massart noise. Finally we study our algorithm for learning halfspaces under Massart noise empirically and find that it exhibits some appealing fairness properties as a byproduct of its strong provable robustness guarantees.