Review for NeurIPS paper: Gradient Regularized V-Learning for Dynamic Treatment Regimes

NeurIPS 2020

Gradient Regularized V-Learning for Dynamic Treatment Regimes

Meta Review

The reviewers found the gradient regularized V-learning algorithm proposed in this paper to be novel and to address an important problem in applying machine learning to clinical settings. They pointed out a number of advantages, including the theoretical properties of the approach, the ability to incorporate neural networks into the V-learning framework, and the performance of the algorithm on the data in the experiments. There were a few concerns about the paper, most of which had to do with the experimental evaluation, and others with the relationship between the proposed method and some existing work. In particular, the reviewers thought that other baselines would be more appropriate to compare against and that there wasn't a "real experiment". Additionally, some reviewers found the paper to be very complicated and unclear in some places. The authors provided a detailed response that addressed the main criticisms of the reviewers. In the discussion phase, most reviewers indicated that the authors had addressed their concerns in a reasonable manner. In one case, a reviewer lowered their score a bit and pointed out the need for principled ablation studies, as in the paper by Shi et al. Despite this decreased score, the reviewers still came to the consensus that the paper should be accepted. This is a dense paper, and the reviewers offered some helpful feedback to improve its clarity. Additionally, there are many experiments that could be included in an expanded version of the paper. As written, this is a good NeurIPS contribution.