NeurIPS 2020

Self-Paced Deep Reinforcement Learning

Meta Review

This paper presents a method for curriculum generation in reinforcement learning that dynamically shapes the sampling distribution to improve performance on a target task distribution. The intuition and exposition of the method are clear, and the evaluation covers a variety of environments and RL algorithms, with positive results. I encourage the authors to incorporate the reviewers' feedback in their final draft.