Reviews: The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares

The paper considers SGD for least squares regression and establishes results for the last iterate (as is often used in practice) rather than for an average over many iterates (as is often analyzed in theory). The tools are not new, so the contribution is somewhat incremental in that sense, but the paper is well written and addresses a core problem, so it is of interest.
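
For readers less familiar with the distinction, the following is a minimal sketch of SGD on least squares that returns both the last iterate and the running average of the iterates. It is not the authors' code: the function names, the halving factor, and the equal-length phases are assumptions made for illustration only, and the paper's exact schedule and constants may differ.

```python
import random


def step_decay_lr(step, total_steps, lr0, num_phases):
    """Piecewise-constant learning rate, halved at the start of each new phase.

    Assumption: equal-length phases and a decay factor of 2 (illustrative only).
    """
    phase_len = max(1, total_steps // num_phases)
    phase = min(step // phase_len, num_phases - 1)
    return lr0 / (2 ** phase)


def sgd_least_squares(data, dim, total_steps, lr0, num_phases, seed=0):
    """Run SGD on the squared loss 0.5 * (w.x - y)^2 over (x, y) pairs.

    Returns (last_iterate, averaged_iterate).
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    w_avg = [0.0] * dim
    for step in range(total_steps):
        # Sample one example uniformly at random.
        x, y = data[rng.randrange(len(data))]
        lr = step_decay_lr(step, total_steps, lr0, num_phases)
        # Stochastic gradient of the squared loss is residual * x.
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * residual * xi for wi, xi in zip(w, x)]
        # Running mean of all iterates seen so far.
        w_avg = [a + (wi - a) / (step + 1) for a, wi in zip(w_avg, w)]
    return w, w_avg
```

The paper's results concern the first returned value (the last iterate); the second is the averaged iterate that much of the theory analyzes.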

Paper ID:	8546
Title:	The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares