NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, at the Vancouver Convention Center
Paper ID: 1704 Locally Private Gaussian Estimation

### Reviewer 1

The paper studies the problem of Gaussian estimation in the local model of differential privacy. In particular, it achieves accuracy bounds that are tight under certain conditions on the mean of the underlying distribution. The authors also reduce the amount of interaction from linear in the number of users to a constant number of rounds. These are nice results at first glance; however, I have some reservations about accepting the paper in its present state. I list them in more detail below and hope that the authors will find my comments constructive. Since this is primarily a theoretical paper, I have tried to hold it to the standard I would expect of a theoretical paper.

Theorem 1.2 sounds fishy. There is no dependence on \sigma_{min} or \sigma_{max}. My intuition is that there should be a \log dependence on the ratio of the two. I tried looking for it in the proof, but could not find where the authors have erred. Can the authors provide an intuition for why this is the case? To me, if the accuracy has no dependence on the bounds on the variance, the result is almost the same as Theorem 1.1, barring a factor of \sqrt{\log n}. What would happen when \sigma_{min} is exponentially small in n and \sigma_{max} is polynomial? This can surely occur.

Figure 1 mentions that R is an upper bound on both the mean and the variance, which the authors say is exponential in n. If that is the case, then their result in the unknown-variance, non-adaptive case would cancel the factor of n under the square root. Maybe I am missing something, but does that not make the result trivial? Perhaps there is an accuracy-interaction trade-off.

The lower bound only holds when the mean is of the order of the accuracy loss. So I think there is a gap between the lower and upper bounds, and it is misleading to say that the bounds are tight. An argument along these lines would help the presentation of the paper.

Related work. I do not understand the result in comparison to that in the central model.
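Returning briefly to the \sigma-range concern above, to make it concrete (the numbers here are my own illustration, not from the paper), take

```latex
\[
\sigma_{\min} = 2^{-n}, \qquad \sigma_{\max} = n^{c}
\;\Longrightarrow\;
\log\frac{\sigma_{\max}}{\sigma_{\min}} = n\log 2 + c\log n = \Theta(n).
\]
```

So if an additive $\log(\sigma_{\max}/\sigma_{\min})$ term were hiding anywhere in the bound, in this regime it would dwarf the $\sqrt{\log n}$ factor separating Theorems 1.1 and 1.2.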
The authors claim that there is roughly a $\sqrt{n}$ loss in accuracy in moving to the local model of privacy; however, for large enough $n$, the authors mention that Karwa and Vadhan give a bound of $O(\sigma \sqrt{\frac{\log(1/\beta)}{n}})$, and their own result gives the same bound. I am a little perplexed by the bound attributed to the earlier result: there is no dependence on $\epsilon$! I suspect the authors have cited the earlier papers incorrectly, and I would suggest they double-check the results in those papers.

The strong data processing inequality has been used previously in the context of the local model of privacy (DJW FOCS 2013 and the DR COLT 2019 paper). In fact, the main claim in DJW13 uses a strong data processing inequality to get a better bound than DRV10. The present paper seems to claim to be the first to use it. Further, the STOC 2016 paper on statistical estimation problems by Braverman et al. uses such inequalities, and if I understand Duchi and Rogers correctly, they extend the techniques of Braverman et al. to the privacy setting. I wonder how different the current paper's technique is from theirs. If it is not, I feel the paper oversells its results.

Now coming to the writing of the paper. I felt the proofs are written in a somewhat convoluted manner and are hard to follow. It is not clear what the authors want to say in lines 124-126. Unless I am mistaken, the math says one thing and the authors write another: "U_1 is split into L subgroups indexed by \mathcal L". But \mathcal L is just a set of contiguous indices of size L. I believe they mean that U_1 is split into disjoint subgroups, each of size k, with indices in the sets \mathcal L_1, \cdots, \mathcal L_k. That seems more reasonable, and I have assumed it for the rest of the paper.

Disclaimer: since the case of unknown variance should follow more or less by making \log \frac{\sigma_{max}}{\sigma_{min}} bins, I skipped Section 4 of the extended abstract.
I am more curious as to why they do not incur a dependence on \sigma_{min} and \sigma_{max}.

On the supplementary material:
- Algorithm 2, Line 2: should have \in \{0, 1, 2, 3\}.
- Details in the proof of Claim 1 are missing. The authors merely state that the proof follows by induction, without stating the base case or how the inductive step goes. Since this is primarily a theoretical paper, these details should be spelled out clearly. This is unsettling and makes me very unsure about accepting the paper.
- Claim 2 assumes \hat H_1^j(M_1) < 0.52 k + \psi. How, then, does line 42 follow? Claim 1 has a completely different assumption.
- Lemma 1.4 is the main part, and its proof is very similar to that in the STOC 2016 paper by Braverman et al., including the notation. Still, I did not understand where the price of privacy comes into play. I might be missing something, but the authors should state it clearly. The same can be said about Lemma 2.4.
- Lines 81 and 86: define \erf and \erf^{-1}.
- Line 143: perhaps strike out "then."
- Equation (3): why is the expectation over y_i? A line of argument should be given.

A personal suggestion for improving the clarity of the paper (the authors need not take my advice): I would suggest the authors focus only on the known-variance case in the extended abstract, explaining the intuition and why the previous results fail in more detail. The unknown-variance case can simply be relegated to the appendices. This would make the paper more readable and would likely invite more follow-up work. Also, in the proofs it would help to give the intuition behind the argument rather than only the math. If the authors prefer to just put the math (which I personally do not mind), they should include every detail.

I did not get time to verify the rest of the proofs. I will try to read them before the notification date in the hope that it helps the authors make their paper more presentable.
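On the \erf point above: for reference, the standard definition I would expect the authors to state is

```latex
\[
\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^{2}}\,dt,
\]
```

with $\operatorname{erf}^{-1}$ denoting its inverse on $(-1, 1)$; if the paper uses a different normalization, that should be said explicitly.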
Post-rebuttal: I read the rebuttal provided by the authors. They gave some idea of how to complete their proof, which is good. On the other hand, they did not answer my questions regarding the novelty of the techniques. Reading the other reviewers' comments and feedback, I believe the current write-up needs significant improvement, and the authors did not make any promises on that front. I am raising my score to 6 on the expectation that the authors will consider the reviewers' suggestions in the next version.