NIPS 2018
Sun Dec 2nd through Sat the 8th, 2018 at Palais des Congrès de Montréal
Paper ID:1212
Title:Local Differential Privacy for Evolving Data

Reviewer 1

This is a decent work. The main idea of this paper is to propose a differentially private compute mechanism when the data is streaming. The crux of the proposal is in Thm 1.1. Given a population of n users partitioned into L subgroups, the system proceeds in rounds and in each round each user sample a bit independently from their subgroup specific distribution. The private data of each user consist of vectors sampled over time and the goal is to track the population mean over time. The estimate needs to be private and differentially secure. To circumvent some of the issues with local privacy, the authors relax the problem in which the rounds are batched into T epochs, each consisting of l rounds and aim to estimate pt in epoch t. The main result of this paper is an algorithm that takes data generated according to this model, guarantees a fixed level of privacy that grows with the number of distributional changes and not number of epochs. Strengths: 1. One of the key contributions of this paper is a novel algorithm. So it gets a high mark in originality 2. The work, if successful, may have a serious impact on how privacy is dealt with in the streaming world in the industry. Companies like Apple can take this kind of model and enable private data sharing for billions of users Weakness: 1. I found the paper hard to follow. Unfamiliar with local differential privacy, I found it hard to comprehend. The definition is in Section 2. I would urge the authors to present it in Section 1 2. The accuracy estimates provided in the paper are probabilistic. Without proper experiments it is impossible to judge the tradeoff between privacy and accuracy. This paper does not provide any expt results 3. Since this is an iterative system, how scalable is the method? This is very important to understand this, since the authors guarantee diff privacy after each epoch. There is a cost to pay for this in terms of the "delay" 4. From the simple problem of average of bits, how can we do go more complex data at each user? 5. No conclusion is provided Updated after Author response: I am still not happy that the authors did not do any expts. While theoretical results only provide a bound, the usefulness can only be found by thorough evaluation. I would also urge the authors to add a conclusion section since the takeaways become more informative after reading the whole paper.

Reviewer 2

This is a theoretical paper about differential privacy. The paper's contribution is a new technique for local differential privacy for periodically recomputing statistics. The privacy guarantees of previously known methods degraded quickly as a function of the number of times the statistics are recomputed. The main contribution of the paper is a model where the error required for a fixed level of privacy grows not with the number of recomputations, but only with the number of significant changes in the statistics. I am not familiar with the state of the art in differential privacy, but the problem being solved seems interesting as a differential privacy problem, and the solution, as far as I can tell, seems nontrivial. However, I am worried that this paper would be more suitable for a conference which has more of a focus on the theory of differential privacy than for NIPS. The submission seems out of scope for NIPS; beyond the general connection between differential privacy as a whole and learning as whole, this particular paper does not seem to have any strong connection to learning or the other subject areas in the NIPS CFP. - In light of the authors' feedback as well as the consensus among other reviewers that this paper would be of interest to the NIPS community, I updated my overall score to acceptance. I still have concerns as to how accessible the paper is for the NIPS audience; hopefully the authors will revisit the paper's exposition as they promised in their feedback.

Reviewer 3

This paper mainly considered streaming mean estimation problem under local differential privacy model. Though there has been lots of work about locally private mean estimation problem, utility guarantees of naïve implementation for previous methods in the streaming model will degrade with time period t at least in order O(\sqrt{t}), because of composition theorem. The main contribution of this paper was proposing an interactive private algorithm that achieved utility only in O(\ln T) order under reasonable settings, and its idea kind of resembled sparse vector technique in DP model. Further, authors also showed a direct application of their technique for streaming heavy hitter problem. Overall, this paper is well written, and I personally quite like its idea. Given fixed privacy budget, utility guarantees of common private algorithms usually degrade with time t in polynomial order, which is a well-known problem in privacy literature. This paper makes some sound assumptions about the underlying estimation problem, and obtains an O(\ln T) order performance guarantee, which advances existing techniques and results. However, I feel kind of confused about some parts in the paper, and I put them in the below “Cons” part. Pros: 1. Design a novel locally private algorithm that can achieve O(ln T) order performance in the streaming estimation problem under reasonable assumptions; 2. Apply above technique to common heavy hitter problem and obtain corresponding guarantees. Cons: 1. In the proof of lemma 7.6 (line 454), how do we obtain this inequality through the fact “user i sets VoteYes_i^t’ as true”? What if b^* equals -1, and 2^{log T – b^*} divides t? In this case, VoteYes_i^t’ is also set as yes, but we don’t have above inequality (i.e. line 454); 2. Assumption 4.2 seems to be a little restricted. What if this assumption doesn’t hold? For example, the size of smallest subgroup is in order O(n^{1/4}); 3. In line 261, authors assume d >> n for heavy hitter problem. I know it is a common parameter regime, but I wonder whether d >> n is a necessary condition like assumption 4.2. What if d is not so large as specified in line 297? Other minor comments: 1. In line 159, \mu^r should be \mu^r = 1/n * (|S_1|*\mu_1^r+ … + |S_m|* \mu_m^r); 2. In line 198, in the definition of \tao_b, constant 10 should be 12 according to the appendix; 3. In line 5 of the algorithm 5 (and other similar places), the notation of noise level b is misleading with the subscript of \tao_b; 4. In line 19-21 of the algorithm 5, it seems it doesn’t play any role in the algorithm, and could be deleted; 5. Line 386 is reduplicative with previous sentence; 6. In the bottom inequality of page 10, it should be a^t in the bracket, not (a^t, \tilde{p}^t); 7. In the second equation in line 392, authors omitted a^t in the conditional probability for \tilde{p}^t. Update: Thanks for authors' detailed responses. For my first concern, I just misunderstood the decide condition. For the 4th comment in "minor comments" part, there is a typo and algorithm 5 should be algorithm 1. Overall, it is a good paper.

Reviewer 4

This paper considers the problem of repeatedly estimating statistics on users’ data while providing users with the strong privacy notion of local differential privacy (LDP). The local model of differential privacy differs from the more common central model in that there is no trusted party that can aggregate the data before adding noise, so noise must be added by each individual before aggregation. This is the model used by Apple and Google in practice to ensure that private data never reaches their servers in the clear. Existing LDP techniques provide only "one shot" tools, with proven privacy guarantees only for a one time use of the algorithm. In contrast, industrial applications of LDP repeatedly apply such algorithms in order to obtain statistics on a daily basis. This paper presents a new framework for supporting such repeated estimations of users’ statistics under LDP. RESULTS: Consider a setting in which there are n users, and suppose that in every one of R days, each user i gets as input a bit x_i. Our goal is to privately estimate the average of the bits on every day. Existing techniques allow for estimating these averages under LDP with error that grows with sqrt(R). In contrast, under several distributional assumptions, the results in this paper reduce the error to sqrt(k), where k is the number of times in which the average changes "significantly". The authors show that their techniques extend beyond bit averages to the well-studied problem of estimating heavy-hitters in the data. CONSTRUCTION: Suppose that in every day, all of the users' bits are sampled iid from some distribution. Furthermore, assume that this distribution changes at most k<