NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Reviewer 1
This paper is methodological (and experimental) in nature, providing a suite of approaches to differentially private Bayesian linear regression. The key significance is to revisit DP linear regression in the Bayesian setting, where it is natural to consider 1) how privacy-preserving noise affects posterior estimates; and 2) how to leverage Bayesian inference, by directly modelling the noise process, to improve utility (broadly construed, including in terms of calibration). The paper does a good job of exploring how such modelling and inference could be performed based on sufficient statistic perturbation. The paper is clearly written, which adds to its potential practical impact. The main technical ideas are largely inspired by prior work, such as that of Bernstein and Sheldon (2018) on exponential families, although it is important to note that the extension does require careful technical work. There are some improvements listed below that could have led to a more comprehensive treatment.

Some minor comments: At line 20, and elsewhere in the related work, the discussion of posterior sampling being deficient for forming private posteriors seems at odds with the fundamental approach taken here, MCMC, which also samples from the posterior. I view the key distinction as instead being the modelling of the DP noise process, akin to the vision set out by Williams and McSherry [34] and applied here with care. In Def. 2 (line 68), the max might not exist; use the supremum instead (and Delta could be an upper bound). It would be appropriate to cite McSherry & Mironov's KDD'09 paper, as it is early work on perturbing sufficient statistics for DP. Regarding the feature normalisation at line 304: I don't think that's DP, is it?

UPDATE POST REBUTTAL I thank the authors for their thoughtful rebuttal. Their response has cleared up a number of questions around VI, feature selection (not ideal, but a fair point regarding convention), and more. I like the work and believe it will have practical impact.
I'd second another reviewer's suggestion (which I believe I also raised in my initial review) that it would be nice to see some point-estimate baseline comparisons. Although the authors could instead lift their rebuttal comments directly into the paper, I view the point as more about being comprehensive, which would not take much space or time to achieve. It may help the paper have further reach, and it reflects interesting questions that I'm sure other readers will have too.
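The remark on Def. 2 above can be written out as follows. This is a sketch only: it assumes the l1 norm used with the Laplace mechanism and a generic neighbouring relation $X \sim X'$, and the symbols $f$, $\Delta_f$ and $\Delta$ are illustrative and may not match the paper's notation.

```latex
% Sensitivity defined with a supremum, which always exists
% (possibly as +\infty), unlike a maximum.
\Delta_f \;=\; \sup_{X \sim X'} \bigl\| f(X) - f(X') \bigr\|_1 .

% Any upper bound \Delta \ge \Delta_f may be used in its place:
% adding i.i.d. Laplace noise of scale \Delta / \epsilon to each
% coordinate of f(X) still satisfies \epsilon-differential privacy.
\tilde{f}(X) \;=\; f(X) + Z, \qquad
Z_j \overset{\text{iid}}{\sim} \mathrm{Laplace}\!\left(0, \tfrac{\Delta}{\epsilon}\right),
\qquad \Delta \ge \Delta_f .
```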
Reviewer 2
The work is a solid contribution towards refining privacy-preserving Bayesian linear regression, in which the Bayesian interpretation is handled correctly even after injecting privatization noise. Several methods of implementing these models are explored, and the theory behind these changes is supported by empirical tests. Originality: the MCMC-based method appears to be mainly a direct use of a mostly off-the-shelf idea, but the derivations of the Gibbs updates for the sufficient-statistics-based model are novel. Related work and appropriate comparisons are cited. Quality: The work is a solid, complete contribution, with propositions backed up by empirical tests. Clarity: the work is clearly written and structured. Significance: the work greatly improves over the naive baseline, and the sufficient statistics method is also shown to be nicer than the MCMC method while achieving comparable results.
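For readers unfamiliar with the term, sufficient statistic perturbation for linear regression can be sketched as below. This is a generic illustration, not the authors' method: the function name `noisy_sufficient_stats`, the clipping of features and targets to [-1, 1], and the conservative sensitivity bound are all assumptions made for the example.

```python
import numpy as np

def noisy_sufficient_stats(X, y, epsilon, rng):
    """Release X^T X, X^T y and y^T y with Laplace noise (hypothetical helper)."""
    n, d = X.shape
    # Assumption: fixed, data-independent bounds; every value is clipped to [-1, 1].
    X = np.clip(X, -1.0, 1.0)
    y = np.clip(y, -1.0, 1.0)
    iu = np.triu_indices(d)  # unique entries of the symmetric matrix X^T X
    stats = np.concatenate([(X.T @ X)[iu], X.T @ y, [y @ y]])
    # Replacing one record changes each entry by at most 2, so the l1
    # sensitivity is at most 2 * (number of released entries).
    sensitivity = 2.0 * stats.size
    noisy = stats + rng.laplace(scale=sensitivity / epsilon, size=stats.size)
    # Unpack the noisy vector back into a symmetric matrix, a vector and a scalar.
    k = iu[0].size
    XtX = np.zeros((d, d))
    XtX[iu] = noisy[:k]
    XtX = XtX + XtX.T - np.diag(np.diag(XtX))
    return XtX, noisy[k:k + d], noisy[-1]
```

A naive baseline would plug these noisy statistics into the usual conjugate updates as if they were exact; the noisy matrix need not even be positive semidefinite. The clipping bounds here are fixed in advance; normalising with statistics computed from the private data would itself need to be accounted for in the privacy budget.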
Reviewer 3
Originality: The main novelty vs Bernstein and Sheldon (2018) [5] is handling the fact that regression models condition on the data, which is private, when accounting for the noise of the mechanism. This is a valuable advance, though somewhat incremental. Quality: The approach of accounting for mechanism noise in posterior uncertainty is extremely elegant (though the basic idea follows from [5]). The proposed MCMC approaches are sensible, and the hierarchical normal prior formulation with conditionally conjugate updates is very clever. Overall, I like the proposed ideas. The experiments are the main weakness of the current manuscript. The real and synthetic data both have only 2 dimensions, and the real dataset has only 46 data points. While this data regime does have some real-world significance, it is not exactly modern. I would like to have seen some larger-scale results. The one posterior sample (OPS) method should also be used as a baseline (although I expect that the proposed methods would beat it, especially in this regime, due to its poor data efficiency). There are probably several strong point-estimate private linear regression models in the literature which should ideally be compared against as well. Clarity: The paper is well written and easy to read. The only issue is that there is no conclusion section to wrap up; instead, the paper simply stops (presumably due to running out of space). Significance: The paper addresses an important problem (differentially private Bayesian linear regression, accounting for mechanism noise) and proposes elegant solutions. Its significance would be increased if the experiments could demonstrate the methods' efficacy in more realistic, higher-dimensional problems.