Non-Crossing Quantile Regression for Distributional Reinforcement Learning

Meta Review

The strong rebuttal, with additional results on NC-IQN, swayed multiple initially hesitant reviewers to argue for acceptance, and I concur. The one unresolved concern is that the baseline results should be reproduced more accurately. I assume this is a matter of codebase or implementation details and does not detract from a fair head-to-head comparison. However, the final version should discuss any such discrepancies in more depth.