Paper ID: | 5380 |
---|---|

Title: | Elliptical Perturbations for Differential Privacy |

The paper is very well written and easy to follow. The question the authors are trying to solve is interesting: are there mechanisms that can achieve (eps, 0) differential privacy in general linear spaces. While the answer provided by the authors is still not conclusive they show that a very general class of mechanisms (which achieve (eps, 0) differential privacy in finite dimensions) is in fact only (eps, delta) DP in infinite dimensions. The ideas and proofs in the paper are very insightful. I have two comments that I believe need to be addressed before approving the paper for publication: 1. I believe Lemma 1 assumes the existence of regular conditional probability measure. Otherwise, the decomposition of P^1 as a an integral wouldn't hold. However the existence of a regular conditional measure is not guaranteed specially in such a high dimensional setting. Can the authors show that the conditional measure exists for the proposed process? 2. In the proof of Theorem 2. The authors say that because v (mu_1 - mu_2) is in the Cameron-Martin space for all v then it follows that the conditional distributions must be equivalent. I assume this is the case because v mu_1 and v_mu2 are gaussian measures and according to the authors two Gaussian measures are either equivalent or orthogonal. However, the authors fail to provide the conditions for them to be equivalent. Therefore, the proof doesn't quite follow.

The paper proposes to use elliptical distributions for adding perturbations to achieve differential privacy (DP) in infinite dimensional and multivariate settings. This is a significant and original contribution; an understanding of how to achieve DP in these settings has been lacking. The paper is has important theoretical contributions, but is entirely theoretical. There is no motivation of a real use case provided. In what practical statistical or machine learning problems does the need for this arise? While the paper is fairly readable, it is really directed to a reader well-versed in measure theory. Many parts of the paper could improve with explanations or intuitive descriptions approachable by a more general reader. Understanding why current methods do not achieve DP is important, and the authors do address this. However this is not presented in an easily understandable manner. More clarity is needed in the section on the impossibility result. The jump from Theoreom 4 to Theorem 5 is not clear -one is an impossibility result, the other gives a result on achieving DP. Further, the paper abruptly ends in the middle of a section.

The paper studies random perturbation mechanisms for private query release in abstract topological vector spaces. The notion of elliptical perturbations, defined by the mean, a positive definite kernel and a scaling function, is studied with several results in both finite and infinite-dimensional spaces. The basic rationale behind elliptical perturbations is that: for infinite-dimensional spaces, add isotropic noise will usually result in a point outside the space. So elliptical perturbations are introduced. This is not the first time such idea being explored: Gaussian process noises has been utilized in Hilbert spaces. However, it is still worthwhile to characterize the possibilities and impossibilities for infinite-dimensional spaces in a precise way, which is the main contribution of the paper. On the other hand, the author did not show how the abstract results can benefit any particular examples. In differential privacy research, it is important to compare the utility of the mechanism, instead of only analyzing the privacy guarantees. The author did not show why the mechanisms constructed from the abstract conditions could lead to possibly more accurate private answers. Minor comments: The remark following Theorem 4 at page 8 is based on incorrect understanding of differential privacy. The statement "One is achieving a random level of privacy which average out to the desired level. In fact, the actual level of DP can be determined once the output is released" makes no sense. DP is a property of a randomized algorithm which takes into account all the randomization. An algorithm either satisfies DP or not.