Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Dear Author(s),

Together with the reviewers, we discussed many aspects of your paper and agreed that its main contribution is a novel, efficient, distributed second-order method for ERM problems. Compared to existing algorithms (some of which have been published at ICML and NeurIPS), the algorithm has fewer hyper-parameters (and is less sensitive to them), can be applied to a wider range of problems, outperforms these methods (in some cases by a significant margin) in the experiments shown in the paper, and has a better communication-computation balance.

In terms of the analysis: the technique used is novel. The algorithm is designed such that the search directions fall into three different categories. The authors accordingly consider three cases and combine them to obtain the overall convergence guarantees. The first two cases are well thought out and motivated. The third case is a safeguard that is seldom used in practice (note that this case never occurs in the strongly convex setting).

On the negative side, the worst-case complexity results of this method are (I believe) worse than those of first-order methods. But that is to be expected, and I strongly believe that such an argument should not be used to downgrade and reject the method (remember that LBFGS has worse convergence guarantees than GD, but is far superior in practice).

Another interesting contribution is the use of a line search. Line searches are not usually used by the ML community, where many researchers argue that they are too expensive, and we hope that your paper can start a discussion.

I would encourage you to highlight the contributions mentioned above a bit more in your final (camera-ready) paper. The reviewers also pointed out some minor issues; please make sure to address them in your final camera-ready submission.