Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The paper presents a new approach to rematerialization that trades off memory against computation using a tree decomposition of the computation graph. In contrast to previous approaches in ML / AD, which addressed this optimization only in the limited context of gradient calculation, the current paper addresses it as a general scheduling problem on a computational graph. The complexity of the algorithm and the overhead of recomputation are bounded in terms of the treewidth of the graph. Given the novelty of the approach, I would like to accept this paper despite its weaknesses, especially the lack of consideration of actual computation cost. A more detailed analysis and a better way to control the trade-off between (actual) computation and memory would make the paper much stronger.
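To make the memory/computation trade-off concrete, here is a minimal toy sketch. It is not the paper's tree-decomposition algorithm: it assumes a simple chain of `n` operations with unit-size outputs, keeps every `k`-th output as a checkpoint, and recomputes the rest during a reverse sweep. The function name `chain_cost` and the unit-cost model are assumptions made for illustration only.

```python
def chain_cost(n, k):
    """Toy cost model for a chain of n unit-size activations.

    Every k-th activation is kept as a checkpoint; the others are
    recomputed segment by segment during a reverse sweep.
    Returns (peak_stored_activations, recomputed_activations).
    Assumption: every operation costs 1 unit and every tensor has size 1.
    """
    checkpoints = list(range(0, n, k))
    peak = 0
    recomputed = 0
    # Walk segments from last to first. `held` is the number of
    # checkpoints still in memory when the segment is processed.
    for held, first in reversed(list(enumerate(checkpoints, start=1))):
        seg_len = min(first + k, n) - first
        recomputed += seg_len - 1                # rebuild non-checkpoint outputs
        peak = max(peak, held + seg_len - 1)     # checkpoints + rebuilt segment
    return peak, recomputed


for k in (1, 4, 16):
    print(k, chain_cost(16, k))
# 1 (16, 0)
# 4 (7, 12)
# 16 (16, 15)
```

With `k = 4` the peak drops from 16 stored outputs to 7 at the price of 12 recomputations. Note that this sketch counts every recomputation as one unit, so it ignores the actual per-operation cost, which is the gap raised above.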