NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Reviewers agreed that the paper addresses an important problem in current deep RL research and appreciated the effort the authors put into the rebuttal. New experiments using the 0/1 reward formulation and a comparison to fixed, hand-tuned hyperparameters addressed two of the main concerns raised by reviewers. In the end, all three reviewers recommended accepting the paper.