The reviewers generally agree that this paper offers a novel viewpoint on avoiding catastrophic forgetting. The theoretical and experimental results are well received. R3 would have preferred a deeper discussion of the differences from OWM. However, the authors explained in the rebuttal that, unlike OWM, their learning rule modifies both sides of the gradient update. This characteristic, together with the intricacies involved in considering a sequential application, makes the overall contribution significant enough.