NIPS 2018
Sun Dec 2nd through Sat the 8th, 2018 at Palais des Congrès de Montréal
Paper ID: 691 Adaptive Online Learning in Dynamic Environments

### Reviewer 1

The authors study online convex optimization in a dynamic environment. The objective of the paper is to bound the general dynamic regret, i.e. the regret against any sequence of comparators. The paper proves the first lower bound on the general dynamic regret in terms of the path length $P_T$ and the time horizon $T$. The authors further develop two algorithms that match this lower bound up to an $O(\sqrt{\log\log(T)/P_T})$ factor. The first algorithm, called 'Ader', uses $O(\log T)$ function and gradient evaluations per step. The second, 'Improved Ader', reduces this to a single function and gradient evaluation per step. The proposed algorithms rely on a key principle: if $P_T$ were known, online gradient descent (Zinkevich, 2003) with a suitably chosen step size would attain the optimal general dynamic regret. Since $P_T$ is unknown, the algorithms run several online gradient descent instances, each with its own step size, and combine them with the Hedge algorithm to obtain the upper bound. Finally, the authors extend their results to a setting where an approximate dynamical model for the comparator sequence is available; similar upper and lower bounds hold in that setting.

Pros:
* This work narrows the gap of achievable regret for online convex optimization in dynamic environments to an $O(\sqrt{\log\log T})$ factor in terms of the path length $P_T$, improving the previous gap by a factor of $\sqrt{P_T}$.
* The writing is fluent and easy to follow. The ideas are well presented in the main body, and the proofs in the supplementary material are elegant.

Cons:
* It is not stated clearly that the proposed algorithms assume knowledge of the domain diameter $D$ and of the gradient upper bound $G$.
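The key construction summarized above (since the optimal OGD step size depends on the unknown $P_T$, run a geometric grid of step sizes as experts and combine them with Hedge) can be sketched as follows. This is only an illustrative sketch: the grid size `N`, the step-size range, the Hedge learning rate `beta`, and the projection onto a Euclidean ball are simplifying assumptions, not the paper's exact tuning.

```python
import numpy as np

def hedge_over_ogd(grad, loss, T, D=1.0, G=1.0, dim=2):
    """Sketch of the expert construction: Hedge over OGD instances
    with geometrically spaced step sizes (illustrative tuning only)."""
    # The optimal step size depends on the unknown path length P_T, so
    # cover a range of candidates with a geometric grid of O(log T) values.
    eta_min = D / (G * np.sqrt(T))            # step size tuned for P_T = 0
    N = int(np.ceil(np.log2(T)) / 2) + 1      # number of experts (assumed grid size)
    etas = eta_min * 2.0 ** np.arange(N)
    xs = np.zeros((N, dim))                   # one OGD iterate per expert
    w = np.ones(N) / N                        # Hedge weights over experts
    beta = np.sqrt(8.0 * np.log(N) / T)       # standard Hedge learning rate
    played = []
    for t in range(T):
        x = w @ xs                            # play the weighted combination
        played.append(x.copy())
        # Hedge update: reweight experts by their observed losses.
        expert_losses = np.array([loss(xi, t) for xi in xs])
        w = w * np.exp(-beta * expert_losses)
        w /= w.sum()
        # OGD update for each expert with its own step size.
        for i in range(N):
            xs[i] = xs[i] - etas[i] * grad(xs[i], t)
            nrm = np.linalg.norm(xs[i])
            if nrm > D:                       # project back onto the ball of radius D
                xs[i] *= D / nrm
    return played
```

For instance, on the fixed quadratic loss $f_t(x) = \lVert x - c \rVert^2$ the weighted iterate drifts toward $c$: experts whose step size is too large oscillate and are downweighted by Hedge, while well-tuned experts converge.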