NeurIPS 2020

AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients


Meta Review

All reviewers agree that the proposed method makes a simple and effective change to the popular Adam algorithm, supported by strong empirical results and relatively standard convergence guarantees. Owing to its simplicity and effectiveness, and to the clear and convincing writing, the method has the potential to become a new standard method in deep learning. Therefore I recommend acceptance. The reviewers raised concerns about somewhat unsubstantiated claims and oversold statements, but I believe these are relatively minor compared with the contribution. I urge the authors to carefully address these concerns in the revision.
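
For context, the "simple change" the reviewers refer to is, per the paper, replacing Adam's second-moment estimate (an EMA of the squared gradient) with an EMA of the squared deviation of the gradient from its running mean. A minimal sketch of this update follows; the function and variable names are illustrative, not the authors' reference implementation:

```python
# Minimal sketch of one AdaBelief step, assuming the update rule stated
# in the paper. The only change relative to Adam is the second-moment
# term: Adam tracks an EMA of grad**2, whereas AdaBelief tracks an EMA
# of (grad - m)**2, the deviation of the gradient from its running mean.
import numpy as np

def adabelief_step(theta, grad, m, s, t, lr=1e-3,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaBelief update at step t >= 1; returns (theta, m, s)."""
    m = beta1 * m + (1 - beta1) * grad                  # first moment, as in Adam
    s = beta2 * s + (1 - beta2) * (grad - m)**2 + eps   # "belief" second moment
    m_hat = m / (1 - beta1**t)                          # bias correction, as in Adam
    s_hat = s / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(s_hat) + eps)
    return theta, m, s
```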