NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 1931 Deep Generalized Method of Moments for Instrumental Variable Analysis

### Reviewer 1

Originality: The theoretical contributions are completely novel. Although the approach resembles AGMM, the problem formulation is based on a different objective function that allows optimal reweighting of the moment conditions, rather than using them unweighted.

Quality: The algorithmic, theoretical and empirical contributions are sound. Though the evaluation is not extensive, it is convincing.

Clarity: The clarity and organization of the paper can be improved:

1) It might be helpful to use different notations for the unweighted norm and the weighted norm. In particular, Lemma 1 would read better without having to refer back to the definition of the weighted norm.
2) Line 102, about equivalence to non-causal linear regression, requires justification or a reference.
3) Line 79 is not meaningful. There ought to be some relationship between $m$ and the complexity of $\theta$.
4) Theorem 2 says that $\tilde{\theta}_n$ has a limit. Can this be any limit, or should it be $\theta_0$? If it need not be $\theta_0$, this is quite counterintuitive and requires further explanation.
5) The AGMM paper also suggests a way of learning the moment functions via deep networks. What precisely distinguishes DeepGMM from that approach needs to be emphasized.
6) Line 190 says that $\tilde{\theta}$ does not enter the gradient of $\theta$. Wouldn't that mean that the optimum of $\theta$ does not depend on $\tilde{\theta}$? Perhaps you just mean that $\tilde{\theta}$ should be treated as a constant when taking the gradient w.r.t. $\theta$. Its value can still be a part of the gradient.
7) Line 224. Please elaborate on "When Z is high-dimensional, we use the moment conditions given by each of its components".
8) I could not understand the data for the high-dimensional case. It seems that $X$ is sometimes an image and sometimes a number. Furthermore, defining $g_0$ as abs would mean that one is taking the absolute value of an image. Is that meaningful?

Significance: The paper is an important contribution to the field of causality research and is likely to be used, considering the performance of the algorithm.

---

Post-rebuttal comments: The authors responded adequately to most of my concerns, but they did not clarify comment 8 in my review. Furthermore, I agree with the issues pointed out by the other reviewers on the experimental section. I have lowered my score to reflect that.
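
The distinction raised in comment 6 can be illustrated with a small numerical sketch. It uses a toy weighted moment objective $\psi(\theta)^2 / w(\tilde{\theta})$, which is an assumption chosen for illustration and not the objective from the paper. The data values and the function names (`moment`, `weight`, `grad_frozen`, `grad_tied`) are likewise hypothetical. The sketch compares the gradient w.r.t. $\theta$ with $\tilde{\theta}$ held constant against the gradient obtained by differentiating through $\tilde{\theta} = \theta$ as well.

```python
# Toy scalar data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]   # treatment
zs = [0.5, 1.0, 1.5, 2.5]   # instrument
ys = [2.1, 3.9, 6.2, 7.8]   # outcome


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def moment(theta):
    # psi(theta) = mean(z * (y - theta * x))
    return mean(z * (y - theta * x) for x, y, z in zip(xs, ys, zs))


def weight(theta_tilde):
    # w(theta_tilde) = mean((z * (y - theta_tilde * x))^2)
    return mean((z * (y - theta_tilde * x)) ** 2 for x, y, z in zip(xs, ys, zs))


def objective(theta, theta_tilde):
    # Toy weighted objective (an assumption, not the paper's objective).
    return moment(theta) ** 2 / weight(theta_tilde)


def grad_frozen(theta, theta_tilde, eps=1e-6):
    # Central difference in theta only: theta_tilde is treated as a constant.
    upper = objective(theta + eps, theta_tilde)
    lower = objective(theta - eps, theta_tilde)
    return (upper - lower) / (2 * eps)


def grad_tied(theta, eps=1e-6):
    # Central difference through both arguments, i.e. theta_tilde = theta
    # is also differentiated.
    upper = objective(theta + eps, theta + eps)
    lower = objective(theta - eps, theta - eps)
    return (upper - lower) / (2 * eps)


g_a = grad_frozen(1.0, 1.0)   # theta_tilde frozen at 1.0
g_b = grad_frozen(1.0, 1.5)   # theta_tilde frozen at 1.5
g_tied = grad_tied(1.0)       # theta_tilde differentiated as well

print("frozen, theta_tilde=1.0:", g_a)
print("frozen, theta_tilde=1.5:", g_b)
print("tied (differentiating through theta_tilde):", g_tied)
```

In this toy setting, `g_a` and `g_b` differ, so the value of $\tilde{\theta}$ still affects the gradient even though it is held constant. `g_a` and `g_tied` also differ, which is the effect of not differentiating through $\tilde{\theta}$.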