NeurIPS 2020

A mathematical model for automatic differentiation in machine learning

Meta Review

The paper discusses the widely recognized phenomenon that automatic differentiation (AD) of mathematical functions represented as code may produce inconsistent results. It makes concrete contributions beyond the existing literature on, e.g., Clarke gradients, by defining a restricted but large class of functions on which the correctness of the proposed AD can be confirmed. R2 raises valuable questions about the practical implications of the results, but (a) the rebuttal gracefully concedes these points and makes a convincing promise to "avoid saying that the problem can be met in practice", and (b) I believe that the points made in the paper are worth addressing even if their implications in practice turn out to be less alarming.
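For readers unfamiliar with the phenomenon, the following minimal sketch illustrates one way such an inconsistency can arise. It is not taken from the paper or the reviews: the `Dual` class, the helper names, and the convention relu'(0) = 0 are assumptions made purely for illustration. The program computes the identity function as relu(x) - relu(-x), whose true derivative is 1 everywhere, yet a forward-mode AD under the assumed convention returns 0 at x = 0.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """Hypothetical dual number: a value and its derivative (tangent)."""
    val: float
    dot: float

    def __sub__(self, other: "Dual") -> "Dual":
        return Dual(self.val - other.val, self.dot - other.dot)

    def __neg__(self) -> "Dual":
        return Dual(-self.val, -self.dot)


def relu(x: Dual) -> Dual:
    # Assumed convention for this sketch: relu'(0) = 0.
    if x.val > 0:
        return Dual(x.val, x.dot)
    return Dual(0.0, 0.0)


def identity_via_relu(x: Dual) -> Dual:
    # Mathematically equal to x for every real x.
    return relu(x) - relu(-x)


def ad_derivative(f, x0: float) -> float:
    # Forward-mode AD: seed the tangent with 1 and read it back.
    return f(Dual(x0, 1.0)).dot


if __name__ == "__main__":
    for x0 in (-2.0, 0.0, 1.0):
        print(x0, ad_derivative(identity_via_relu, x0))
```

Away from the origin the sketch returns the correct derivative 1; only at x = 0 does the output depend on the convention chosen for relu'(0). This matches the point made in (b) that such cases may be rare in practice while still deserving a rigorous treatment.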