Review for NeurIPS paper: Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach

NeurIPS 2020

Review 1

Summary and Contributions: The paper proposes an adversarial approach, framed as a minimax two-player game, for optimising the parameters of a generalised structural equation model (SEM) formulated as a saddle-point problem. The generalised SEM is defined in terms of a conditional expectation operator mapping from a Hilbert space of structural functions of interest to a Hilbert space of known or estimated functions of the outcome. Both spaces are subsequently taken to be spaces of neural networks, and a stochastic primal-dual algorithm is given for finding a solution to the saddle-point problem. Furthermore, the work proves global convergence of the algorithm. This main result is obtained, under specific conditions on the data and the weight initialisation, using a regret analysis in the infinite-width limit, in which the neural networks behave like linear learners.
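
To make the stochastic primal-dual idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy instrumental-variable problem and replaces both neural networks with one-parameter linear models, f(x) = theta * x and u(z) = w * z. The data-generating process, the function names, the step size, and the regularisation weight `alpha` are all illustrative assumptions. The primal player takes a gradient descent step on theta and the dual player takes a gradient ascent step on w, each using one minibatch.

```python
import random


def simulate_iv_data(n, rng):
    """Toy confounded data (an assumption for illustration only).

    z is an instrument, e is an unobserved confounder that affects both
    x and y, and the true structural slope is 2.0.
    """
    data = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        x = z + e
        y = 2.0 * x + e + 0.1 * rng.gauss(0.0, 1.0)
        data.append((x, y, z))
    return data


def sgda_linear_iv(data, alpha=0.01, lr=0.05, steps=4000, batch=32, seed=1):
    """Stochastic gradient descent-ascent on the regularised saddle-point loss

        L(theta, w) = E[(theta*x - y) * w*z] - 0.5 * E[(w*z)^2] + 0.5 * alpha * theta^2

    with linear players standing in for the neural networks.
    Returns the average of theta over the second half of the iterations.
    """
    rng = random.Random(seed)
    theta, w = 0.0, 0.0
    theta_avg, n_avg = 0.0, 0
    for t in range(steps):
        g_theta, g_w = 0.0, 0.0
        for x, y, z in rng.choices(data, k=batch):
            u = w * z
            g_theta += x * u                       # d/dtheta of (theta*x - y) * u
            g_w += ((theta * x - y) - u) * z       # d/dw of (theta*x - y)*w*z - 0.5*(w*z)^2
        g_theta = g_theta / batch + alpha * theta  # add the Tikhonov term
        g_w /= batch
        theta -= lr * g_theta                      # primal player: descent
        w += lr * g_w                              # dual player: ascent
        if t >= steps // 2:                        # average the late iterates
            n_avg += 1
            theta_avg += (theta - theta_avg) / n_avg
    return theta_avg


data = simulate_iv_data(5000, random.Random(0))
theta_hat = sgda_linear_iv(data)
ols_slope = sum(x * y for x, y, _ in data) / sum(x * x for x, _, _ in data)
print(f"saddle-point estimate: {theta_hat:.3f}, naive least squares: {ols_slope:.3f}")
```

On this toy problem the saddle-point estimate should land near the structural slope of 2, while naive least squares is biased upwards by the confounder. The linear players are only a stand-in, chosen because the paper's analysis (as summarised above) treats wide networks as approximately linear learners.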

Strengths: *A tractable estimation algorithm for generalised SEMs. *Derived results that give precise statements about the rate of the estimation error. *A proof of global convergence of the algorithm.

Weaknesses: *No empirical validation or performance analysis is given for the proposed algorithm, although I understand that the paper is perhaps intended solely as a theoretical contribution. *The construction of the proposed algorithm seems to largely overlap with prior work (Muandet et al., 2019). However, there remain significant differences between the two papers, and the convergence proof is unique to this work. -- K. Muandet, A. Mehrjou, S. K. Lee, and A. Raj. Dual IV: A single stage instrumental variable regression. arXiv preprint arXiv:1910.12358, 2019.

Correctness: To the best of my knowledge, the claims in the paper seem correct; however, I did not check all the proofs provided in the appendix. Below are minor suggestions: ### minor ### *Line 48 (.. works in incorporating ..): remove "in" *Line 194/195 (b_1, ..., b_r): Should the subscript not index from 1 to m?

Clarity: The paper is well written and the presentation of the work is also well structured, following a sensible progression.

Relation to Prior Work: The paper includes a related work section which is adequate, with additional discussion on specific connections given in the appendix. In particular, a specific section is dedicated to the work on Dual IV (Muandet et al., 2019), where that work is compared and contrasted to the presented work.

Reproducibility: Yes

Additional Feedback: I would like to compliment the authors on including so many detailed examples of SEMs, which I thought was a nice addition to the paper. As mentioned above, I understand that the work is primarily a theoretical contribution; however, for applied practitioners of SEMs, the paper might have more appeal if the algorithm were demonstrated and compared to other approaches on a real-world practical problem, e.g. the demand estimation problem considered in the Dual IV paper as well as in other prior work. ### Post-rebuttal update ### I thank the authors for their response and appreciate their willingness to include an experimental section in their work (which will only strengthen the presentation) as well as to highlight the comparison to Dual IV. I stand by my original evaluation, seeing this as a good submission.

Review 2

Summary and Contributions: This paper proposes a new estimation procedure for structural equation models based on a min-max game formulation in which both players are represented by neural networks. The authors show that the algorithm they derive converges and is consistent in the sense that the estimate obtained is close to the solution of a regularized version of the original structural equation problem.

Strengths: This study introduces a complete framework for generalized structural equation models in which the original problem is first regularized prior to being reformulated (through a saddle-point reformulation) using neural networks to approximate the functions to be optimized. The theoretical guarantees provided in the study further show that the solution provided is of "good quality" (as usual when the number of samples is sufficiently large).
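
For readers unfamiliar with the regularize-then-reformulate step mentioned above, the following is a sketch of the standard derivation, written in assumed notation that may differ from the paper's: $A$ is the conditional expectation operator, $b$ the known or estimated right-hand side, $\alpha > 0$ the regularization weight, and $u$ the dual (adversary) function.

```latex
% Generalized SEM as an operator equation (assumed notation):
%   A f = b
% Tikhonov-regularized least-squares problem:
\min_{f} \; \tfrac{1}{2}\,\lVert A f - b \rVert^{2} + \tfrac{\alpha}{2}\,\lVert f \rVert^{2}
% The squared norm is its own Fenchel conjugate up to scaling:
%   \tfrac{1}{2}\lVert v \rVert^{2} = \max_{u} \; \langle u, v \rangle - \tfrac{1}{2}\lVert u \rVert^{2}
% Substituting v = A f - b gives the saddle-point form:
\min_{f} \max_{u} \; \langle A f - b,\, u \rangle
  - \tfrac{1}{2}\,\lVert u \rVert^{2}
  + \tfrac{\alpha}{2}\,\lVert f \rVert^{2}
```

The inner maximum is attained at $u = A f - b$, so the two problems share the same minimizer in $f$. The practical appeal of the second form is that the inner product can be estimated from samples without computing $A f$ explicitly, which is what allows both $f$ and $u$ to be parameterized by neural networks.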

Weaknesses: The main weakness, in my opinion, lies in the lack of empirical validation of the proposal. I would have liked to see how the algorithm behaves in practice on different datasets for which the number of samples is limited.

Correctness: The claims are correct as far as I can tell (the paper is purely theoretical; no experiments have been conducted).

Clarity: The paper is definitely well written.

Relation to Prior Work: As far as I can tell, the authors have clearly positioned their work with respect to previous contributions. The novelty is clear.

Reproducibility: Yes

Additional Feedback: Two minor remarks: 1. The notation E_{init} should be explained. 2. Theorem 5.2, Eq. 15, replace f with f^{\alpha}.

Review 3

Summary and Contributions: The paper proposes a framework to solve structural equation models formalized as a class of inverse problems where both the data and the operator can be stochastic. In particular, the authors propose and analyze an approach based on the solution of a saddle-point problem over classes of neural networks. The statistical properties of the proposed method are analyzed.

Strengths: The study is thorough and complete. The mathematical analysis seems sound. The problem is relevant.

Weaknesses: I miss a clearer explanation of why the Tikhonov approach is problematic and why the saddle-point approach with neural networks is needed. What are the precise assumptions on the inverse problem under study? Some basic experiments are missing.

Correctness: The analysis appears to be sound and correct.

Clarity: It's fairly well written, but I got lost at some key points.

Relation to Prior Work: Quite good, but I feel more references to the huge literature on inverse problems should be added, placing the contribution in that context.

Reproducibility: Yes

Additional Feedback: I found the response satisfying. I encourage the authors to improve the presentation.