Review for NeurIPS paper: Towards practical differentially private causal graph discovery

NeurIPS 2020

Towards practical differentially private causal graph discovery

Review 1

Summary and Contributions: This paper proposes a differentially private causal graph discovery algorithm. The basic idea is that by adding the Laplace noise in the conditional independent test, one can achieve the differential with a certain privacy budget. The contribution of this paper is to improve the PC-EM algorithm to make it more practical with regard to speed and performance.

Strengths: This work improves the PC-EM algorithm to make it more practical with regard to speed and performance.

Weaknesses: However, there are some concerns: 1. Does the $ mean subsample in Algorithm 1 (line 2)? Is it necessary to sub-sample the data? The choose of 2. Is this method can only be applied in discrete data? If not why only use the discrete type of data in experiments. 3. It is also interesting to see how the performance of Priv-PC in a larger graph compare to the standard PC algorithm. 4. I'm still unclear about how serious the privacy problem is in PC. The example given in the second paragraph seems not enough to indicate the disease in a patient, because there could be several ways in PC to add a causal relationship. For example, consider the ground truth v-structure A->B<-C, and the current learned causal structure is $A-B-C; A-C$, and it is possible that if the newly added patent contains information "A is independent of C", and we can delete the edge A-C, then a v-structure will be detected. As a result, A->B<-C is added. However, this causal relationship does not reveal the patient's privacy since it only contains the information about "A is independent of C". ------- It is glad to see that the author adds some new experiments, but the authors did not respond to my concern about the privacy issues in the PC algorithm which might need further explanation. I will consider this paper as borderline and keep my score unchanged.

Correctness: Yes, they are correct.

Clarity: This paper is well written and well orgranize.

Relation to Prior Work: Yes, it has disccussed the related work but not very in detail about the difference bettween PC-EM and the proposed Priv-PC.

Reproducibility: Yes

Additional Feedback:

Review 2

Summary and Contributions: This paper studies the differentially private PC algorithm for constructing private preserving causal graph from data. It develops an improve version of the sparse vector technique (SVT) to achieve high accuracy and short running time. The general idea is that, when conducting independence tests in the PC algorithm, it first performs SVT on a small sampled subset with tweaked threshold for reducing type I errors, and then run independence tests again on the complete dataset for pairs that pass the filter. Experiments on four datasets show that the proposed method outperforms the state-of-the-art in terms both accuracy and running speed.

Strengths: It is an interesting idea to first filter out pairs that are below the threshold using a small sample. It is also a common and useful strategy to lower the threshold in order to reduce type I errors due to sampling. The paper provides theoretical analysis to the differential privacy, error bound, and sensitivity. The improvement in terms of the running speed is dramatic.

Weaknesses: The paper does not explain why SVT suffers from low accuracy. As a result, although the paper tries its best to explain the proposed method (which improves SVT) gives less running time and better utility, it is still not very clear. Some statements are not very accurate. For example, “It is unclear whether the differential privacy still holds given this compromise.” The greedy search compromise the optimal solution, which may compromise utility. The privacy guarantee still holds as proven. EM-PC achieves strict (epsilon,0)-DP. Priv-PC achieves (epsilon,delta)-DP as shown in Algorithm 2. Their results in the experiments are not comparable. It is unclear how much of the improvement of Priv-PC over EM-PC is due to the relaxation of privacy (a non-zero broken probability delta). Smaller privacy budget (<=1) is typically more meaningful in real world application, but the paper shows that EM-PC achieves better utility under small privacy budget. The paper uses vague language in technical sections (it is OK in the abstract or introduction), which is not preferred in a decent paper. For example, in Section 3.1, the PC algorithm uses independence tests that are “too many to obtain an acceptable privacy guarantee” EM-PC runs “a large number of independence tests” “only a few independence tests in causal discovery yield values above the threshold” The exact meanings of “too many, a large number of, a few” are vague.

Correctness: Some claims are not very accurate as pointed out above. The theoretical analysis is plausible.

Clarity: Yes.

Relation to Prior Work: Yes.

Reproducibility: Yes

Additional Feedback: ================================= EDIT AFTER AUTHOR RESPONSE ================================= I think the responses generally address my concerns, although I still think it would be better if they also show the difference in performances between epsilon-DP and epsilon,delta-DP for both methods, if possible. I would like to raise my score to 6.

Review 3

Summary and Contributions: The paper proposes a differentially private causal graph discovery algorithm, Priv-PC. Compared with another DP causal graph discovery algorithm based on exponential algorithm (called as EM-PC), the proposed Priv-PC algorithm adopts a paradigm called sieve-and-examine, which essentially applies the sub-sampled sparse vector technique and the Laplace mechanism in the two sub-processes.

Strengths: + How to construct DP causal graphs is important but under-explored. The only previous work, EM-PC, has limitations as pointed out by this paper: slow computation due to many independence tests and the intense calculation of the score function used in EM, and low utility due to the change of intrinsic workflow of PC. So this paper has its merits by presenting another practical (and better) algorithm. + I tend to agree with the authors that the proposed sieve-and-examine paradigm, which alternately executes sub-sampled sparse vector technique and output perturbation, can address low accuracy problem in the direct application of sparse vector technique. I think this is another main contribution of this work. + The work is relatively complete and its supplemental file includes the results of sensitivity analysis for several conditional independence tests, e.g., conditional Kenall's \tau and conditional Spearman's \pho, although the sensitivity analysis is not really challenging.

Weaknesses: - Somehow, I felt the methodological contribution is not high although the sieve-and-examine paradigm seems feasible in this context and the error bounds of type I error and type II error are not trivial. - Comparisons with the existing work (EM-PC) could be more fairly conducted. Recall the slow computation is because 1) involving too many independence tests in identifying the independence node set using EM; and 2) the intensive computation of the utility function used in EM. For 1), it would be better to calculate and report in the experiment the number of independence tests incurred in both EM-PC and Priv-PC. Note that EM-PC already reduces the number of conditional independence tests as it tries to identify conditional independence node set. In other words, "EM-PC makes the decision of edge elimination at each round of determining the conditional independence set for each node instead of making the decision by applying conditional independence test for each edge". So without reporting the numbers from experiments, the claim may not be right. Second, the current Priv-PC intentionally chooses Kendall's \tau for the independence test because the sensitivity values of other test metrics are high. On the contrary, EM-PC can support any test metrics because of the adopted score function (although whose computation is intense). Moreover, the proposed Priv-PC has much worse performance than EM-PC when privacy threshold \epsilon is low. The large \epsilon values basically cannot provide meaningful privacy protection.

Correctness: The theoretical results and the claims of the proposed method seem correct. The empirical methodology is also correct. The claims related to comparisons with the previous EM-PC need to be examined. See the comments in the weakness section.

Clarity: I think the paper is well written and can be easily followed. A minor comment. Somehow, I do not like the authors use the term of Priv-PC for their proposed method and call the previous work as EM-PC. Note that the original paper [30] named their method as PrivPC. I understand it is fine to rename the previous method as EM-PC for clearly comparison, but it would be better to name the proposed method as SE-PC (based on the sieve-and-examine paradigm).

Relation to Prior Work: See the second comment in the weakness section. [30] also presented PrivPC* to deal with numerical attributes when the data satisfy multi-Gaussian distribution.

Reproducibility: Yes

Additional Feedback: The number of edges in four chosen datasets is kind of small. In other words, the evaluation is only focused on very sparse causal graphs. The authors may check [30] to run simulation studies with different numbers of nodes, edges, and sparse ratio values. Regarding the rebuttal, I very appreciate the authors' efforts reporting the comparison between EM-PC and Priv-PC in terms of the number of independence tests, as shown in Table d in the rebuttal. The comparison results help understand the performance comparison in terms of execution time. However, the Priv-PC can only accommodate two independence test metrics, which I think is a big limitation. The comment on the exponential complexity of the brutal-force search of the accurate utility score in Priv-PC (as an advantage over EM-PC) is a bit misleading as Priv-PC does not have to use the optimal utility score. As discussed in my review, how to construct DP causal graphs is important and under-explored and the proposed sieve-and-examine paradigm is feasible to address low accuracy in the direct application of sparse vector technique. However, the methodological contribution seems below the bar of NeurIPS and overall I think the paper is a borderline one.

Review 4

Summary and Contributions: A differentially private causal graph discovery algorithm is proposed in this paper. A testing condition is derived to filter out variable pairs that are unlikely to be independent and hence fewer independence tests is needed. This would result in a reduction of computational cost. The whole algorithm incorporating this process in the PC method is presented. Experimental results are shown to compare the running time and accuracy of the method.

Strengths: There are very few works about the differential privacy issue of causal discovery. To the best of my knowledge, this paper could be one of the prior work in this area, and it is one of the areas that needs exploration. The paper presents an approach that is able to run faster than the only related existing approach, the EM-PC method. The work is not just empirical but also supported by theoretical proofs.

Weaknesses: The work focuses on the reduction of independent tests. Some of the traditional and non-differentially private algorithms related to PC algorithms are also addressing this problem of reducing the number of independent tests. While it is important to have a fast algorithm, but the speed of the algorithm should not be the main objective. But rather, the way that we can preserve the correctness of the causal algorithm without disclosing the information of the data should be focused on. The example used in the paper contains 100K samples and it is hard to imagine if any private information can be inferred. Datasets with 100K samples are pretty large for networks with few nodes and edges. Is the performance affected if smaller sample size is used? The authors in the feedback mentioned standard PC algorithm is used as the baseline for F1 score, but the paper says that F1 is measured based on the ground truth. It is known that PC algorithm may not have 100% accuracy especially with small sample size. So the authors need to clarify how F1 is measured in the experiments.

Correctness: Table 2 compares the running time of the algorithms. The running time should depend on a number of factors such as the parameters used. However, this table does not indicate the settings used. It is better to include the sensitives of the parameters too.

Clarity: Some discussions in the paper are relatively brief and hence readers may find it hard to understand. It is better to include more details about the algorithms and also the impact of the threshold, epsilon and subset size on the performance of the approach.

Relation to Prior Work: The authors are recommended to include a brief overview of the EM-PC in the paper so that it is easier for readers to have a comprehensive comparison between the works.

Reproducibility: Yes

Additional Feedback: