NeurIPS 2020

Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

Meta Review

Reviewers are in favor of acceptance after the discussion, and I agree. The key novelty in this work is applying the Multiple Choice Learning (MCL) framework to model-based reinforcement learning. Doing so allows the model to learn multimodal distributions over future states, and the authors provide strong empirical results. Neither dynamics learning nor MCL is novel; however, their combination is novel and demonstrated to be effective. The reviewers have left a number of useful suggestions about how to further strengthen the paper in terms of writing and experimentation, and I encourage the authors to make use of this feedback.