Reviewers generally agree that the paper's main novel contributions are in Section 3.2, which shows that training neural networks with regularization is equivalent to kernel ridge regression, with both random Gaussian and leverage score initializations. This section should be moved to the front and given more emphasis. Section 3.1 on leverage scores is rather straightforward and should not be considered a main contribution. Weakness: R5 is concerned that the bounds in Theorems 3.3, 3.7, and 3.9 might not yield reasonable convergence rates.