Learning Halfspaces with the Zero-One Loss: Time-Accuracy Tradeoffs

Part of Advances in Neural Information Processing Systems 25 (NIPS 2012)

Bibtex Metadata Paper Supplemental


Aharon Birnbaum, Shai Shwartz


Given $\alpha,\epsilon$, we study the time complexity required to improperly learn a halfspace with misclassification error rate of at most $(1+\alpha)\,L^*_\gamma + \epsilon$, where $L^*_\gamma$ is the optimal $\gamma$-margin error rate. For $\alpha = 1/\gamma$, polynomial time and sample complexity is achievable using the hinge-loss. For $\alpha = 0$, \cite{ShalevShSr11} showed that $\poly(1/\gamma)$ time is impossible, while learning is possible in time $\exp(\tilde{O}(1/\gamma))$. An immediate question, which this paper tackles, is what is achievable if $\alpha \in (0,1/\gamma)$. We derive positive results interpolating between the polynomial time for $\alpha = 1/\gamma$ and the exponential time for $\alpha=0$. In particular, we show that there are cases in which $\alpha = o(1/\gamma)$ but the problem is still solvable in polynomial time. Our results naturally extend to the adversarial online learning model and to the PAC learning with malicious noise model.