__ Summary and Contributions__: In this paper, the authors extend the theoretical results in some previous works. The authors consider soft-intervention, unknown intervention targets, broken causal sufficiency. Specifically, the authors define \Psi-Markov equivalence for the (graph, intervention targets) pairs and provide the graphical criterion to test whether two pairs are equivalent. The authors give a sound and complete algorithm to discover the PAG given a set of distribution under different interventions.

__ Strengths__: This paper provides solid theoretical results. To distinguish the (graph, intervention targets) pair, the authors define the corresponding equivalence class and provide the graphical characterization. The proposed algorithm is sound and complete. The paper is well written and mathematically sound.

__ Weaknesses__: Compared to Kocaoglu et al. 2019, in which the authors provide a new perspective and an exciting solution for causal discovery without causal sufficiency by interventional data and observational data, more improvements could be expected in this paper.

__ Correctness__: The claims appear to be correct. But I didn't check the proofs in detail.

__ Clarity__: Yes.

__ Relation to Prior Work__: Yes.

__ Reproducibility__: Yes

__ Additional Feedback__: C-faithfulness seems stricter than faithfulness assumption, which is shown prone to be broken when the variable number increases. Hence I am a bit concerning the possibility that the assumption is broken.
I am sorry that I am not quite familiar with such region. Hence I am very likely to change my score if other reviewers or authors give persuasive opinions.
***********after the discussion**********
Thanks for the rebuttal. After the discussion with other reviewers, I adjust my score accordingly.

__ Summary and Contributions__: Update: after the discussions, I agree with the other reviewers that the lack of comparison with [37] seems a bit concerning. Therefore, I decide to lower my score to weak accept. Still, I believe it's an interesting paper. I encourage the authors to provide a solid comparison with [37] and submit to the other venues if this did not go through.
In many real world applications, people are usually faced with interventional data with unknown interventional targets. This calls for new methods that can successfully learn causal graphical graphical models with unknown-target interventional data. Learning causal graphical models with known-target interventional data is a well studied topic in this field, while there are still a lot of open questions for the task of causal discovery with unknown targets. In this paper, the authors provide a new graphical characterization of the equivalence classes of causal inference with unknown-target interventional data. A sound and complete algorithm is proposed to estimate the equivalent class.

__ Strengths__: 1. Learning causal graphical models with observational and interventional data is an important topic in this field. While preliminary research mainly assume that all the interventional targets are known a priori, causal inference with unknown target interventional data is not well studied. This paper provides to my knowledge the first result to characterization the degree of identifiability from unknown-target interventional data, which facilities the researchers new techniques to develop methods to utilize such data for causal discovery;
2. The sound and complete algorithm looks inspiring. I especially like its comparisons with JCI in Section D.2, which shows clearly the improvement over the JCI framework. The JCI framework is well-known for its lack of completeness result. The completeness result from this paper has successfully filled this gap, together with an example showing clearly what has been improved over the JCI framework.
3. I did not check the proof in detail. But the proof seems correct to me and is technically sound.
Overall I think this is an interesting paper and is relevant to the audience of the NeurIPS community.

__ Weaknesses__: 1. In line 60, it should be “intervention on the variable X”.

__ Correctness__: The proof seems to be correct and technically sound

__ Clarity__: Yes

__ Relation to Prior Work__: Very clear

__ Reproducibility__: Yes

__ Additional Feedback__:

__ Summary and Contributions__: This paper establishes a novel theoretical framework for reasoning about a causal graph and a set of unknown interventions with respect to observed interventional (and optionally an observational) distributions.
The summary of the contributions is as follows:
1. Systematic development of the notion of Psi-Markov property and Psi-Equivalence Class.
2. The methodological tools needed to solve the learning problem (such as the augmented graph).
3. The learning algorithm, which builds on FCI.
4. Theoretical statements analyzing the correctness and exhaustiveness of the learning algorithm with respect to the Psi-Equivalence class.

__ Strengths__: The main strength of this paper is mainly theoretical and methodological. The development of the framework (as outlined above) is meticulous, thorough, and the contributions seem sound and significant for the field of causal discovery. In particular it is appreciated that the authors very thoroughly discussed their method in comparison to existing work in the Appendix, through theoretical statements and clear examples.

__ Weaknesses__: The main weakness of this work is the lack of empirical evaluation, which is the reason this paper is a borderline accept. While the experiment in the appendix is appreciated and reassuring, it would be very useful if the authors could including more extensive experiments, including on real data, with multiple baselines.

__ Correctness__: To the best of my knowledge, the method and the theoretical statements seem correct.

__ Clarity__: The paper is clear and well written given the abundance of information and theoretical statements that it needs to present with the limited space in the main text.

__ Relation to Prior Work__: The prior work is very extensively discussed and related to in the appendix.

__ Reproducibility__: Yes

__ Additional Feedback__: Two minor points:
1. Theorem 4 in the appendix is not referenced anywhere in the text, and is almost the same as Definition 2 in the main text.
2. References in the appendix seem incomplete. All the related work should also be cited in the appendix, since it is so extensively discussed.
******** After author feedback and discussion ***************
After carefully considering the rebuttal and other reviewer's comments, and after discussing the paper, I think that although this submission provides detailed comparison to some related work (some of which was not discussed detail in the submission), this comparison should extend to additional comparisons.
In particular, in [Zhang et al.] (Section 4.1) it is discussed that invariances are a special case of independent causal mechanisms, and thus it seems to me that it could be potentially restrictive. In addition, the notion of unknown interventional targets is related to not knowing which independent causal mechanisms undergo changes as treated in the above-cited work. Therefore, it would be very beneficial to this already good paper if the authors could possibly include discussion, analysis and if time permits, even empirical experiments comparing to the algorithm based on independent causal mechanisms in the camera-ready version of the manuscript.
References:
Zhang, K., Huang, B., Zhang, J., Glymour, C., & Schölkopf, B. (2017, August). Causal discovery from nonstationary/heterogeneous data: Skeleton estimation and orientation determination. In IJCAI: Proceedings of the Conference (Vol. 2017, p. 1347). NIH Public Access.

__ Summary and Contributions__: This paper introduces a new framework for learning causal graphs based on a combination of observational and interventional data. The authors specifically consider the case of having access to soft interventions where the interventional targets may not be known. Importantly, their framework allows for latent variables, which is not the case for many existing causal inference algorithms. The authors introduce a new property and a corresponding equivalence class (\Psi-markov equivalence class) for characterizing whether two causal graphs and their intervention targets belong to this equivalence class. Subsequently, they introduce an algorithm, \Psi-FCI for learning this equivalence class. They show that the algorithm is sound and complete.

__ Strengths__: The paper addresses, to my knowledge, an important gap in prior work by developing an algorithm that allows for latent variables in causal structure discovery from observational and interventional data. The authors provide an extensive theoretical comparison of their algorithm with previous works through examples in the appendix.
The presented equivalence class and algorithm is theoretically well developed. The authors provide important theoretical claims about their algorithm and include examples for further clarification.

__ Weaknesses__: My major point of criticism is the lack of empirical results for the proposed algorithm. Although authors provide some evaluation in the Appendix Fig 6, this part needs to be more expanded. It would be great if authors could show their algorithm’s performance in simulation and on real data as well as how it compares to other algorithms such as those mentioned in prior work. Since the main advantage of this work is the ability to handle latent variables, it would be particularly great to showcase the algorithm in this setting.
In addition, given data, it’s not immediately clear how Algorithm 1 would be applied since the input to the algorithm is a tuple of distributions. I think that writing the algorithm in terms of data would be more helpful and useful for the practitioner. The authors should mention specific tests (e.g. F-tests) that would be used for testing invariances.
Minor points:
How does \Psi-FCI differ from ref [25]. I did not see this discussed in the main paper or the appendix.
How does c-faithfilness compare to normal faithfulness?
Could the authors add a point on applicability/lack of of their work to hard interventions?

__ Correctness__: The claims appear to be correct, although I did not thoroughly check.

__ Clarity__: The paper is overall well-written.

__ Relation to Prior Work__: The authors make a through theoretical comparison to prior work in the appendix. However, the paper could benefit from empirical comparisons on simulation and real data with prior work.

__ Reproducibility__: Yes

__ Additional Feedback__: