Title: Distributional Reward Decomposition for Reinforcement Learning

The reviewers enjoyed the paper, although they expressed some concerns regarding its novelty (it combines a number of existing ideas). Still, the combination does result in a clear performance increase on a small set of Atari 2600 games. In the discussion, the reviewers appreciated the additional experiments provided in the rebuttal and reiterated the need for the final version of this paper to incorporate these and to be cleaned up. I also want to encourage the authors to report the performance of their algorithm on a larger number of Atari 2600 games. In particular, how were these 6 games selected? Was there an unconscious bias in this selection?