Review for NeurIPS paper: A Local Temporal Difference Code for Distributional Reinforcement Learning

NeurIPS 2020

A Local Temporal Difference Code for Distributional Reinforcement Learning

Meta Review

The reviewers appreciated the interesting and novel contribution made here. However, Reviewers 2 and 4 expressed serious concerns about the legibility of the paper. To quote the discussion, "Someone not familiar with either distributional RL or neuroscience will be lost when reading this paper." The question is therefore whether the issue can be resolved during this conference cycle. I believe it can, but it will require significant editing. I also think it is critical to support interdisciplinary work; however, the burden of clarity remains on the authors.

Beyond what the reviewers have said, some recommendations:

- The same argument can undoubtedly be made while moving some parts to the appendix, especially the additional discussion points.
- Introduce readers to both neuroscience and distributional RL in a necessarily longer background section (the RL part is woefully short as it stands).

Finally, judging by its density, this conference paper may be an advertisement for a larger journal paper.