Part of Advances in Neural Information Processing Systems 26 (NIPS 2013)
Francesco Orabona
We present a new online learning algorithm that extends the exponentiated gradient to infinite-dimensional spaces. Our analysis shows that the algorithm is implicitly able to estimate the $L_2$ norm of the unknown competitor, $U$, achieving a regret bound of the order of $O(U \log(UT+1)\sqrt{T})$, instead of the standard $O((U^2+1)\sqrt{T})$ that is achievable without knowing $U$. For this analysis, we introduce novel tools for algorithms with time-varying regularizers, through the use of local smoothness. Through a lower bound, we also show that the algorithm is optimal up to a $\sqrt{\log T}$ term for linear and Lipschitz losses.
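To make the difference between the two rates concrete, the sketch below evaluates both expressions with all constants dropped. It is only an illustration of how each rate scales with $U$ and $T$; it is not the algorithm from the paper. The function names `log_rate`, `quadratic_rate`, and `rate_ratio` are hypothetical and introduced here for illustration.

```python
import math


def log_rate(U: float, T: int) -> float:
    """Order of the bound stated in the abstract, U * log(U*T + 1) * sqrt(T).

    Constants hidden by the O(.) notation are dropped (an assumption of this sketch).
    """
    return U * math.log(U * T + 1) * math.sqrt(T)


def quadratic_rate(U: float, T: int) -> float:
    """Order of the standard bound, (U^2 + 1) * sqrt(T), constants dropped."""
    return (U ** 2 + 1) * math.sqrt(T)


def rate_ratio(U: float, T: int) -> float:
    """Ratio log_rate / quadratic_rate; values below 1 favour the log rate."""
    return log_rate(U, T) / quadratic_rate(U, T)


if __name__ == "__main__":
    T = 10_000
    for U in (0.0, 1.0, 10.0, 100.0, 1000.0):
        print(f"U={U:>7}: log_rate={log_rate(U, T):.3e}, "
              f"quadratic_rate={quadratic_rate(U, T):.3e}, "
              f"ratio={rate_ratio(U, T):.4f}")
```

With constants ignored, the first expression grows almost linearly in $U$ while the second grows quadratically, so the gap widens as the competitor norm increases. For moderate $U$ at a fixed $T$, the logarithmic factor can make the first expression the larger of the two, so this comparison says nothing definitive about which bound is tighter in a given finite setting.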