NIPS 2018
Sun Dec 2nd through Sat the 8th, 2018 at Palais des Congrès de Montréal
Paper ID: 962 Distributed Stochastic Optimization via Adaptive SGD

### Reviewer 1

The authors propose a distributed stochastic algorithm for expected risk minimization that achieves the optimal convergence rate $N^{-1/2}$, with $O(N/m)$ time (here $m$ is the number of local machines), $O(1)$ space, and $K = O(1)$ communication rounds. The algorithm seems interesting. However, since I did not manage to read the proofs, I am willing to let other qualified reviewers make the final decision.

Minor comments: There are many "taking expectation" steps in the paper. Perhaps it is better to clarify which random variable the expectation is taken with respect to? For example, in Proposition 1, should $\mathbb{E}[g_t]$ be $\mathbb{E}[g_t \mid w_t]$? In the proof of Lemma 1, $v_k$ is a random variable, so when applying the concentration inequalities, the event will also depend on this random variable.
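To make the conditioning point concrete, here is a minimal sketch of the distinction, assuming the standard SGD setting where $g_t$ is a stochastic gradient of the risk $F$ queried at the current iterate $w_t$ (these symbols are assumptions and may not match the paper's exact notation):

```latex
% Sketch of the conditioning issue, assuming the standard SGD setting:
% g_t is a stochastic gradient of the risk F queried at the iterate w_t
% (symbols assumed, not necessarily the paper's notation).
%
% Since w_t itself depends on the randomness of earlier iterations,
% the unbiasedness statement should condition on w_t:
\[
  \mathbb{E}\!\left[\, g_t \mid w_t \,\right] = \nabla F(w_t),
\]
% whereas the unconditional expectation also averages over w_t:
\[
  \mathbb{E}\!\left[\, g_t \,\right]
    = \mathbb{E}\!\left[\, \mathbb{E}\!\left[\, g_t \mid w_t \,\right] \,\right]
    = \mathbb{E}\!\left[\, \nabla F(w_t) \,\right],
\]
% which is in general not the gradient at any fixed point.
```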