Paper ID: | 6781 |
---|---|

Title: | Efficient Identification in Linear Structural Causal Models with Instrumental Cutsets |

The paper seems to address an important computational problem in identifying particular generalizations of the IVs, proposing an algorithm that can be implemented in polynomial time. The importance of this algorithm depends on how useful these generalizations are. The authors mention that they can be applied to arbitrary structural causal models (as opposed to linear SCMs that are commonly assumed in the literature). As these references are relatively new and somewhat specialized, it would be useful for the authors to include a more detailed introduction to these methods, highlighting the main ideas. For example, there is a big gap from the second to the third paragraph in Section 2; also in Definition 2.2. Another concern is lack of numerical studies to illustrate the importance of the proposed method. Maybe a run time comparison would better illustrate the benefits of the proposed method in realistic settings.

As noted above, I find this work to be quite interesting and also quite densely informative. I suspect that, should this paper be accepted, the authors are planning to further extend this work into a longer form. In that case, I would definitely suggest expanding on the examples and the introduction, since this was a challenging read for someone who is not intimately familiar with all of the previous work. In the future, I would also appreciate the implementation of their suggested algorithm and some simulation comparisons of their work and other existing work. Of course, the worst case scenario is clear from their theoretical results, but I would still be interested to see how these algorithms compare in a simulation. Minor comments: - lines 31 and 32, I believe the indexes should be reversed \lambda_ij = 0, that is, whenever x_j is not a direct cause of x_i - the references sometimes use initials and sometimes full first names. - line 365 corollary -> Corollary

UPDATE: Thank you for the thoughtful response, those changes should improve the things that were unclear to me. There is a rich recent literature on identification criteria for linear structural causal models, but most of the recently proposed criteria largely ignore the question of efficient computability. This paper answers important questions in this area by given efficient algorithms for some criteria, while showing others to be NP-complete. The paper is original and generally clear and of high quality. Minor comments: l100: double "a" l104: the equation you refer to is in the supplement, which should be mentioned here. There should also be a general mention in the main paper informing readers about the supplement. Figure 2a: some labels are right between two nodes, making it hard to see which they belong to. Definition 3.1: "for each s_i ... are in the set T_j" is extremely hard to parse. I suggest: "any path from s_i \in S_f to t \in T \setminus T_f has an intermediate node in S_F \cup T" l149: You didn't mention yet what "this problem" is; I believe it is finding a maximal match-blocking? (Where "maximal" is inclusionwise?) Theorem 3.1: Also hard to read, in particular the precedence of the "... and ... or ..." at the end is ambiguous. Maybe the "there exists ... and ..." could be rewritten to "for all ... such that ..., we have ..."? Algorithm 1: After the while loop, T will contain only nodes that participate in the flow, while S may not (e.g. for the graph s_1 -> t <- s_2). So the line after the while loop should modify S, not T. (BTW, these match-blocks seem related to the Edmond-Gallai decomposition, though I believe that is defined for undirected graphs. You might want to check if you can find literature for a more efficient algorithm, or theoretical results. For example, I would guess that all maximal match-blocks have the same T, and so as a corollary, all maximal match-blocks are maximum.) Supplement l445: period missing