Review for NeurIPS paper: Entropic Causal Inference: Identifiability and Finite Sample Results

NeurIPS 2020

Entropic Causal Inference: Identifiability and Finite Sample Results

Meta Review

In this paper, the authors continue a previous line of work (initiated in the cited reference [11]) and prove some interesting results on the finite-sample identifiability of causal pairs under an "entropic identifiability" assumption. There was substantial discussion, and the reviewers were split on this paper, raising several crucial issues that the authors need to address carefully. Please pay particular attention to the comments by R2 and R7. Below I outline the major issues that must be addressed in the camera-ready version:

- Add a critical discussion and comparison with existing work on identifiability in causal models. Given the prior work [11] and the lack of substantial discussion or comparison with existing methods for inferring causal pairs, the significance of this work is left unclear. There is already a substantial literature on identifiability in causal models, and in causal pairs in particular, and a proper discussion situating this work within that literature is missing.
- Add detailed comparisons with existing methods. The authors mention that such comparisons are available by manual inspection of existing papers; however, they must be collected in one place and critically discussed in the paper. (As proposed by the authors in their response, this should not require any additional experiments.)
- Add a critical discussion of Theorem 1. The authors only prove a high-probability result under strong assumptions, which is strictly weaker than even very weak notions of identifiability such as generic identifiability. This is an unusual approach to identifiability and warrants additional discussion. See the comments from R3.
- Clarify technical jargon. In several places, technical jargon is used without clarification (e.g. "high probability", "O(1)", etc.). For example, in Conjecture 1, O(1) is used even though this is not an asymptotic statement. Similarly, "with high probability" is used without clarifying what the authors mean by this (again, Conjecture 1 is not an asymptotic statement). These constants and probabilities, as well as their dependencies (e.g. what is going to infinity?), should be made explicit.

Ultimately, the substantive ideas seem to have appeared previously, and the main contribution here is a detailed technical analysis that rests on very strong assumptions. Additional discussion is needed to properly situate this work and identify its limitations.