The authors propose improving robustness to corruptions by adapting batch normalization statistics on the test set, and show that this significantly improves performance on multiple benchmarks. The reviewers initially raised a few concerns, particularly around generalization to other datasets and connections to related work. The authors did a great job of responding to the reviewers' questions, including running some additional experiments. During the follow-up discussion, the reviewers agreed that the revised experiments satisfactorily address some of the major concerns, and some of them increased their scores as well. Overall, I think this is a good paper (particularly with the extensive additional results) and I recommend acceptance. I encourage the authors to add the new results (new datasets + ablations) and to revise the related work section (points raised in the rebuttal + differences from other parallel work that R1 mentioned) in the camera-ready version.