{"title": "Average-Case Averages: Private Algorithms for Smooth Sensitivity and Mean Estimation", "book": "Advances in Neural Information Processing Systems", "page_first": 181, "page_last": 191, "abstract": "The simplest and most widely applied method for guaranteeing differential privacy is to add instance-independent noise to a statistic of interest that is scaled to its global sensitivity. However, global sensitivity is a worst-case notion that is often too conservative for realized dataset instances. We provide methods for scaling noise in an instance-dependent way and demonstrate that they provide greater accuracy under average-case distributional assumptions. Specifically, we consider the basic problem of privately estimating the mean of a real distribution from i.i.d. samples. The standard empirical mean estimator can have arbitrarily-high global sensitivity. We propose the trimmed mean estimator, which interpolates between the mean and the median, as a way of attaining much lower sensitivity on average while losing very little in terms of statistical accuracy. To privately estimate the trimmed mean, we revisit the smooth sensitivity framework of Nissim, Raskhodnikova, and Smith (STOC 2007), which provides a framework for using instance-dependent sensitivity. We propose three new additive noise distributions which provide concentrated differential privacy when scaled to smooth sensitivity. 
We provide theoretical and experimental evidence showing that our noise distributions compare favorably to others in the literature, in particular, when applied to the mean estimation problem.", "full_text": "Average-Case Averages: Private Algorithms for\n\nSmooth Sensitivity and Mean Estimation\n\nMark Bun\n\nBoston University\n\nmbun@bu.edu\n\nThomas Steinke\n\nIBM Research \u2013 Almaden\n\nsmooth@thomas-steinke.net\n\nAbstract\n\nThe simplest and most widely applied method for guaranteeing differential privacy\nis to add instance-independent noise to a statistic of interest that is scaled to its\nglobal sensitivity. However, global sensitivity is a worst-case notion that is often too\nconservative for realized dataset instances. We provide methods for scaling noise\nin an instance-dependent way and demonstrate that they provide greater accuracy\nunder average-case distributional assumptions. Speci\ufb01cally, we consider the basic\nproblem of privately estimating the mean of a real distribution from i.i.d. samples.\nThe standard empirical mean estimator can have arbitrarily-high global sensitivity.\nWe propose the trimmed mean estimator, which interpolates between the mean and\nthe median, as a way of attaining much lower sensitivity on average while losing\nvery little in terms of statistical accuracy. To privately estimate the trimmed mean,\nwe revisit the smooth sensitivity framework of Nissim, Raskhodnikova, and Smith\n(STOC 2007), which provides a framework for using instance-dependent sensitivity.\nWe propose three new additive noise distributions which provide concentrated\ndifferential privacy when scaled to smooth sensitivity. 
We provide theoretical and experimental evidence showing that our noise distributions compare favorably to others in the literature, in particular, when applied to the mean estimation problem.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

1 Introduction

Consider a sensitive dataset x ∈ X^n consisting of the records of n individuals and a real-valued function f : X^n → R. Our goal is to estimate f(x) while protecting the privacy of the individuals whose data is being used. Differential privacy [14] gives a formal standard of individual privacy for this problem (and many others), requiring that, for all pairs of datasets x, y ∈ X^n differing in one record (called neighbouring datasets and denoted d(x, y) ≤ 1), the distribution of outputs should be similar for both inputs x and y.

The most basic technique in differential privacy is to release an answer f(x) + ν, where ν is instance-independent additive noise (e.g., Laplace or Gaussian) with standard deviation proportional to the global sensitivity GS_f of the function f. Here,

GS_f = max_{y,z ∈ X^n : d(y,z) ≤ 1} |f(y) − f(z)|

measures the maximum amount that f can change across all pairs of datasets differing on one entry, including those which have nothing to do with x.

Calibrating noise to global sensitivity is optimal in the worst case, but may be overly pessimistic in the average case. This occurs in many statistical settings where x consists of i.i.d. samples from a reasonably structured distribution and the goal is to estimate a summary statistic of that distribution. The main example we consider in this work is that of estimating the mean of a distribution given i.i.d. samples from it. The standard estimator for the distribution mean is the sample mean. However, for distributions with unbounded support, e.g., Gaussians, the global sensitivity of the sample mean is infinite.
Thus we consider a different estimator and a different measure of its sensitivity.

A more fine-grained notion of sensitivity is the local sensitivity of f at the dataset x, which measures the variability of f in the neighbourhood of x. That is,

LS_f(x) = max_{y ∈ X^n : d(x,y) ≤ 1} |f(y) − f(x)|.

However, naïvely calibrating noise to LS_f(x) is not sufficient to guarantee differential privacy. The reason is that the local sensitivity may itself be highly variable between neighbouring datasets, and hence the magnitude of the noise observed in a statistical release may itself leak information about x.

The work of Nissim, Raskhodnikova, and Smith [27] addressed this issue by identifying smooth sensitivity, an intermediate notion between local and global sensitivity, with respect to which one can calibrate additive noise while guaranteeing differential privacy. Smooth sensitivity is a pointwise upper bound on local sensitivity which is itself "smooth" in that its multiplicative variation on neighboring datasets is small. More precisely, for a smoothing parameter t > 0, the t-smoothed sensitivity of a function f at a dataset x is defined as

S^t_f(x) = max_{y ∈ X^n} e^{−t·d(x,y)} · LS_f(y),

where d(x, y) denotes the number of entries in which x and y disagree. Noise distributions which simultaneously do not change much under additive shifts and multiplicative dilations at scale t can be used with smooth sensitivity to give differential privacy.

In this work, we extend the smooth sensitivity framework by identifying three new distributions from which additive noise scaled to smooth sensitivity provides concentrated differential privacy.
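To make the three sensitivity notions concrete, the following sketch (ours, not from the paper) computes them by brute force for the median over a small discrete domain. The tiny domain and the exhaustive search are illustrative only; real mechanisms use closed-form or efficient algorithms for S^t, since the brute-force search is exponential in n.

```python
import itertools
import math
import statistics

def local_sensitivity(f, x, domain):
    """LS_f(x): max |f(x) - f(y)| over datasets y differing from x in one entry."""
    ls = 0.0
    for i in range(len(x)):
        for v in domain:
            y = x[:i] + (v,) + x[i + 1:]
            ls = max(ls, abs(f(x) - f(y)))
    return ls

def global_sensitivity(f, domain, n):
    """GS_f: worst-case local sensitivity over all datasets in domain^n."""
    return max(local_sensitivity(f, x, domain)
               for x in itertools.product(domain, repeat=n))

def smooth_sensitivity(f, x, domain, t):
    """S^t_f(x) = max_y exp(-t * d(x, y)) * LS_f(y), brute force over domain^n."""
    best = 0.0
    for y in itertools.product(domain, repeat=len(x)):
        d = sum(a != b for a, b in zip(x, y))
        best = max(best, math.exp(-t * d) * local_sensitivity(f, y, domain))
    return best

if __name__ == "__main__":
    median = lambda x: float(statistics.median(x))
    domain = (0, 1, 2, 3, 4)
    x = (0, 2, 4)
    print(global_sensitivity(median, domain, len(x)))  # 4.0
    print(local_sensitivity(median, x, domain))        # 2.0
    print(smooth_sensitivity(median, x, domain, 0.5))  # 4*exp(-0.5), between the two
```

Note that LS_f(x) ≤ S^t_f(x) ≤ GS_f always holds, since y = x is a candidate in the maximization and every LS_f(y) is at most GS_f.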
We apply these techniques to the problem of mean estimation, for which we propose the trimmed mean as an estimator that has both high accuracy and low smooth sensitivity.

1.1 Background

Before describing our results in more detail, we recall the definition of differential privacy.

Definition 1 (Differential Privacy (DP) [14, 12]). A randomized algorithm M : X^n → Y is (ε, δ)-differentially private ((ε, δ)-DP) if, for all neighboring datasets x, y ∈ X^n and all (measurable) sets S ⊆ Y,

P[M(x) ∈ S] ≤ e^ε · P[M(y) ∈ S] + δ.

We refer to (ε, 0)-differential privacy as pure differential privacy (or pointwise differential privacy) and (ε, δ)-differential privacy with δ > 0 as approximate differential privacy.

Given an estimator of interest f : X^n → R and a private dataset x ∈ X^n, the randomized algorithm given by M(x) = f(x) + GS_f · Z is (ε, 0)-differentially private for Z sampled from a Laplace distribution scaled to have mean 0 and variance 2/ε² (i.e., density e^{−ε|z|} · ε/2). We use the smooth sensitivity in place of the global sensitivity. That is, we analyse randomized algorithms of the form

M(x) = f(x) + S^t_f(x) · Z

for Z sampled from an admissible noise distribution.

The original work of Nissim, Raskhodnikova, and Smith [27] proposed three admissible noise distributions. The first such distribution, the Cauchy distribution with density ∝ 1/(1 + z²) (and its generalizations of the form 1/(1 + |z|^γ) for a constant γ > 1), can be used with smooth sensitivity to guarantee pure (ε, 0)-differential privacy.
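As a reference point, the basic global-sensitivity Laplace mechanism recalled above is a one-liner; this sketch (ours) uses NumPy, and the bounded-mean example with global sensitivity 1/n for data in [0, 1] is an illustrative assumption rather than anything specific to this paper.

```python
import numpy as np

def laplace_mechanism(value, global_sens, epsilon, rng=None):
    """Release value + Laplace(0, GS/epsilon) noise: (epsilon, 0)-DP."""
    rng = np.random.default_rng() if rng is None else rng
    return value + rng.laplace(loc=0.0, scale=global_sens / epsilon)

# Example: the mean of n values known to lie in [0, 1] has GS = 1/n.
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, size=1000)
release = laplace_mechanism(data.mean(), global_sens=1.0 / len(data),
                            epsilon=1.0, rng=rng)
```

The noise has standard deviation √2 · GS/ε, matching the variance 2/ε² stated above when GS = 1. The point of the rest of the paper is to replace `global_sens` here with an instance-dependent quantity.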
These distributions have polynomially-decaying tails and finitely many moments, which means they may be appropriately concentrated around zero to guarantee accuracy for a single statistic, but can easily result in inaccurate answers when used to evaluate many statistics on the same dataset. Unfortunately, inverse polynomial decay is in fact essential to obtain pure differential privacy. The exponentially-decaying Laplace and Gaussian distributions, which appear much more frequently in the differential privacy literature, were shown to yield only approximate (ε, δ)-differential privacy with δ > 0.

A recent line of work [15, 5, 26, 4] has developed variants of differential privacy which permit tighter analyses of privacy loss over multiple releases of statistics as compared to both pure and approximate differential privacy. In particular, the notion of concentrated differential privacy (CDP) [15, 5] has a simple and tight composition theorem for analyzing how privacy degrades over many releases while accommodating most of the key algorithms in the differential privacy literature, including addition of Gaussian noise calibrated to global sensitivity.

Definition 2 (Concentrated Differential Privacy (CDP) [15, 5]). A randomized algorithm M : X^n → Y is (1/2)ε²-concentrated differentially private ((1/2)ε²-CDP) if, for all neighboring datasets x, y ∈ X^n and all α ∈ (1, ∞), D_α(M(x)‖M(y)) ≤ (1/2)ε²α, where D_α(P‖Q) = (1/(α−1)) · log E_{X←P}[(P(X)/Q(X))^{α−1}] denotes the Rényi divergence of order α.

Another variant, Rényi Differential Privacy (RDP) [26], is closely related to CDP. Specifically, (1/2)ε²-CDP is equivalent to the conjunction of an infinite family of RDP guarantees, namely (α, (1/2)ε²·α)-RDP for every α ∈ (1, ∞).
In particular, any CDP algorithm (such as those we present) is also an RDP algorithm.

It is natural to ask whether concentrated differential privacy admits distributions that can be scaled to smooth sensitivity while offering better privacy-accuracy tradeoffs than Cauchy, Laplace, or Gaussian.

1.2 Our Contributions: Smooth Sensitivity and CDP

As concentrated differential privacy is a relaxation of pure differential privacy, Cauchy noise and its generalizations automatically guarantee CDP. However, admissible distributions for CDP could have much lighter tails. (The full version [6] contains a lower bound showing that quasi-polynomial tails are necessary, whereas pure DP requires polynomially-decaying tails.) It is not obvious, however, which distributions actually have these properties. In this work, we identify three such distributions with quasi-polynomial tails and show that they provide CDP when scaled to smooth sensitivity. Detailed statements and proofs appear in the full version [6].

Laplace Log-Normal: The first such distribution we identify, and term the "Laplace log-normal" LLN(σ), is the distribution of the random variable Z = X · e^{σY}, where X is a standard Laplace, Y is a standard Gaussian, and σ > 0 is a shape parameter. This distribution has mean zero, variance 2e^{2σ²}, and satisfies the quasi-polynomial tail bound P[|Z| > z] ≤ e^{−log²(z)/3σ²} for large z. The following result shows that scaling Laplace log-normal noise to smooth sensitivity gives CDP.

Proposition 3. Let f : X^n → R and let Z ← LLN(σ) for some σ > 0. Then, for all s, t > 0, the algorithm M(x) = f(x) + (1/s) · S^t_f(x) · Z guarantees (1/2)ε²-CDP for ε = t/σ + e^{3σ²/2} · s.

Intuitively, additive scaling of Z = X · e^{σY} is handled by X (i.e., D_α(Z‖Z + s) ≤ αs²e^{3σ²}/2), while multiplicative dilations are handled by Y after taking logarithms (i.e., D_α(Z‖e^t·Z) ≤ D_α(σY‖t + σY) = αt²/2σ²). Group privacy handles additive-multiplicative combinations.

Uniform Log-Normal: The second distribution, ULN(σ), is the distribution of Z = U · e^{σY}, where U is uniformly distributed over [−1, 1], Y is a standard Gaussian, and σ > 0 is a shape parameter. It has mean zero and variance (1/3)e^{2σ²}, and also has the tail bound P[|Z| > z] ≤ e^{−log²(z)/2σ²} for all z ≥ 1.

Proposition 4. Let f : X^n → R and let Z ← ULN(σ) with σ ≥ √2. Then, for all s, t > 0, the algorithm M(x) = f(x) + (1/s) · S^t_f(x) · Z guarantees (1/2)ε²-CDP for ε = t/σ + e^{3σ²/2} · √(2/πσ²) · s.

Arsinh-Normal: Our final new distribution is the "arsinh-normal", which is the distribution of Z = (1/σ) sinh(σY), where Y is a standard Gaussian and sinh(y) = (e^y − e^{−y})/2 denotes the hyperbolic sine function. This distribution has mean zero, variance (e^{2σ²} − 1)/2σ², and quasi-polynomial tails. We show that it gives CDP, albeit with a worse dependence on the smoothing parameter t.

Proposition 5. Let f : X^n → R and let Z = sinh(Y) where Y is a standard Gaussian.
Then, for all s, t ∈ (0, 1), the algorithm M(x) = f(x) + (1/s) · S^t_f(x) · Z guarantees (1/2)ε²-CDP for ε = 2√t + 1.2s.

We conjecture that the privacy analysis of the arsinh-normal distribution can be improved to match (or better) the guarantees of our other two distributions.

1.3 Our Contributions: Private Mean Estimation

We study the following basic statistical estimation problem. Let D be a distribution over R with mean µ and let x = (x_1, . . . , x_n) consist of i.i.d. samples from D. Our goal is to design a differentially private algorithm M for estimating µ from the sample x.

The algorithmic framework we propose is as follows. We begin with a crude estimate of the range of the distribution mean, assuming µ ∈ [a, b].¹ Then we compute a trimmed and truncated mean of x:

f(x) = [ (x_{(m+1)} + x_{(m+2)} + ··· + x_{(n−m)}) / (n − 2m) ]_{[a,b]},

where x_{(1)} ≤ x_{(2)} ≤ ··· ≤ x_{(n)} denotes the sample in sorted order (a.k.a. the order statistics) and [y]_{[a,b]} = min{max{y, a}, b} denotes truncation to the interval [a, b]. In other words, f(x) first discards the largest m samples and the smallest m samples from x, then computes the mean of the remaining n − 2m samples, and finally projects this onto the interval [a, b]. Then we release M(x) = f(x) + S^t_f(x) · Z for Z sampled from an admissible distribution for t-smoothed sensitivity. This framework requires picking a noise distribution Z, a trimming parameter m ∈ Z with n > 2m ≥ 0, and a smoothing parameter t > 0.

Our mean estimation framework is versatile and may be applied to many distributions. We prove the following three illustrative results for this framework.
Theorem 6 gives a strong accuracy guarantee under a correspondingly strong distributional assumption, whereas Theorem 8 gives a weaker accuracy guarantee under minimal distributional assumptions. Theorem 7 relaxes the light-tail assumption of Theorem 6 and shows the effect of this on the accuracy guarantee.

Theorem 6 (Mean Estimation for Symmetric, Subgaussian Distributions). Let ε, σ > 0, a < b, and n ∈ Z with n ≥ O(log(n(b − a)/σ)/ε). There exists a (1/2)ε²-CDP algorithm M : R^n → R such that the following holds. Let D be a σ-subgaussian distribution that is symmetric about its mean µ ∈ [a, b]. Then

E_{X←D^n}[(M(X) − µ)²] ≤ σ²/n + (σ²/n²) · O( log((b − a)/σ)/ε + log(n)/ε² ).

The first term σ²/n is exactly the non-private error required for this problem and the second term is the cost of privacy, which is a lower-order term for large n.

Theorem 7 (Mean Estimation for Symmetric, Moment-Bounded Distributions). Let ε > 0, σ₀ > 0, a < b, and n ∈ Z with n ≥ O(log(n(b − a)/σ₀)/ε). There exists a (1/2)ε²-CDP algorithm M : R^n → R such that the following holds. Let D be a distribution that is symmetric about its mean µ ∈ [a, b] and has variance σ². Suppose σ > σ₀ and E_{X←D}[(X − µ)^{2k}] ≤ c_k^k · σ^{2k} for some k ∈ N and c_k ≥ 1. Then

E_{X←D^n}[(M(X) − µ)²] ≤ σ²/n + (σ²/n²) · O( log(n(b − a)/σ₀)/ε + c_k · n^{1/k}/ε² ).

Note that a σ-subgaussian distribution satisfies the hypotheses of Theorem 7 with c_k = O(k).
Thus setting k = Θ(log n) in Theorem 7 recovers the bound of Theorem 6.

Theorem 8 (Mean Estimation for General Distributions). Let ε > 0, σ₀ > 0, and a < b. Let n ≥ O(log(n(b − a)/σ₀)/ε). Then there exists a (1/2)ε²-CDP algorithm M : R^n → R such that the following holds. Let D be a distribution with mean µ ∈ [a, b] and variance σ² ≥ σ₀². Then

E_{X←D^n}[(M(X) − µ)²] ≤ (σ²/n) · O( log(n(b − a)/σ₀)/ε + 1/ε² ).

We stress that Theorems 6, 7, and 8 use the same algorithm; the only difference is in the setting of parameters and analysis. This illustrates the versatility of our algorithmic framework and that the distribution of the inputs directly translates to the accuracy of the estimate produced. We also emphasize that, while the accuracy guarantees above depend on distributional assumptions on the dataset, the privacy guarantee requires no distributional assumptions, and holds without even assuming the data consists of i.i.d. draws.

In Section 4, we present an experimental evaluation of our approach when applied to Gaussian data.

¹Assuming an a priori bound on the mean is necessary to guarantee CDP. However, such a bound can, under reasonable assumptions, be discovered by an (ε, δ)-DP algorithm [24].

1.4 Related Work

Prior works [4, 29] showed that, when scaled to smooth sensitivity, Gaussian noise provides the relaxed notion of truncated CDP [4] or Rényi DP [26], but does not suffice to give CDP itself. Other than this and the original distributions mentioned in Section 1.1, no other distributions have (to the best of our knowledge) been shown to provide differential privacy when scaled to smooth sensitivity.

We remark that smooth sensitivity is not the only way to exploit instance-dependent sensitivity.
The\nmost notable other example is the \u201cpropose-test-release\u201d framework of Dwork and Lei [13]. Roughly,\ntheir approach is as follows. First an upper bound on the local sensitivity is proposed. The validity of\nthis bound is then tested in a differentially private manner. If the test passes, then this bound can be\nused in place of the global sensitivity to release the desired quantity. If the test fails, the algorithm\nterminates without producing an estimate. This approach inherently requires relaxing to approximate\ndifferential privacy to account for the small probability that the test passes erroneously.\nDwork and Lei [13] apply their method to the trimmed mean to obtain asymptotic results (rather than\n\ufb01nite sample bounds like ours). They obtain a bound on the local sensitivity of the trimmed mean\nusing the interquantile range of the data, which is itself estimated by a propose-test-release algorithm.\nThen they add noise proportional to this bound. This requires relaxing to approximate differential\nprivacy as well as assuming that the data distribution has suf\ufb01cient density at the truncation points.\nThe mean and median (the extreme cases of the trimmed mean) have both been studied extensively in\nthe differential privacy literature. We limit our discussion and study to the central model of differential\nprivacy, though there has also been much work on mean estimation in the local model [21, 19, 11].\nNissim, Raskhodnikova, and Smith [27] analyze the smooth sensitivity of the median, but they do not\napply it to mean estimation or give any average-case bounds for the smooth sensitivity of the median.\nSmith [32] gives a general method for private point estimation, with asymptotic ef\ufb01ciency guarantees\nfor general \u201casymptotically normal\u201d estimators. 
This method is ultimately based on global sensitivity\nand, in large part due to its generality, does not provide good \ufb01nite sample complexity guarantees.\nKarwa and Vadhan [24] consider con\ufb01dence interval estimates for Gaussians. Although they work\nin a different setting, their guarantees are similar to Theorem 6 (up to logarithmic factors). They\npropose a two-step algorithm: First a crude bound on the data is computed. Then the data is truncated\nusing this bound and the mean is estimated via global sensitivity. We note that their algorithm does\nnot readily extend to heavy-tailed distributions (like ours does, cf. Theorem 8). This is because a\nheavier-tailed distribution requires more conservative truncation to avoid distortion and hence their\nalgorithm must add more noise due to the higher global sensitivity. Kamath, Li, Singhal, and Ullman\n[22] extend the work of Karwa and Vadhan to learning multivariate Gaussians.\nFeldman and Steinke [16] use a median-of-means algorithm to privately estimate the mean, yielding\na guarantee similar to Theorem 8.2 Speci\ufb01cally, their algorithm partitions the dataset into evenly-\nsized subdatasets and computes the mean of each subdataset. Then a private approximate median\nis computed treating each subdataset mean as a single data point. This algorithm is simple and is\napplicable to any low-variance distribution, but (unlike our algorithm) its accuracy does not improve\nwhen the input data distribution is well-behaved.\nThe results of Karwa and Vadhan [24] and Feldman and Steinke [16] are most similar to ours.\nOur results (speci\ufb01cally, Theorems 6 and 8) roughly match theirs. However, our results apply to\nintermediate settings (e.g., Theorem 7 for, say, Student\u2019s T data), where the prior works do not\nachieve good accuracy. 
Our results can be viewed as providing a unified approach that simultaneously matches the results of Karwa and Vadhan [24] for Gaussians and Feldman and Steinke [16] for general distributions (and everything in between). The key advantage of our trimmed mean approach is that the trimming automatically adjusts to the data distribution, whereas the prior approaches lack this versatility and rely on relatively brittle distributional assumptions.

²The results of Feldman and Steinke [16] are stated for the (formally incomparable) problem of ensuring generalization for adaptive data analysis, but differentially private algorithms follow implicitly from their work.

Figure 1: Variance of the trimmed mean for various distributions as the trimming fraction is varied. The plot depicts n = 1001, averaged over 10⁶ runs.

Figure 2: Excess variance of the private trimmed mean with smooth sensitivity. Data is [N(0,1)^{201}]_{[−50, 10^{50}]}. Average of 10⁶ runs.

Shortly after this work, Avella-Medina and Brunel [2] used the smooth sensitivity framework (and the propose-test-release framework) for median estimation. Of course, the mean and median are closely related. However, there is a subtle but important difference: Whereas the standard deviation provides the appropriate scale for the accuracy of an estimate of the mean, the reciprocal of the probability density around the median provides the appropriate scale for an estimate of the median [31]. Indeed, the standard deviations of the empirical mean and the empirical median scale with these quantities respectively. Accordingly, while our results state accuracy bounds in terms of the variance of the unknown distribution, their results state accuracy bounds in terms of the probability density in the neighbourhood of the median. Neither type of bound dominates the other, as it is easy to find distributions more favourable to each analysis.
However, while their analysis and bounds are very different from ours, their algorithm is not; their algorithm is a special case of our algorithm. Thus we view their work as providing further independent validation of the utility of our approach.

Further Applications: Mean estimation is an extremely fundamental task that arises as a subroutine of more complex tasks. For example, private optimization and machine learning often rely on estimating gradients [3, 1]. This is a (multivariate) mean estimation task and our methods may yield improvements here. Mean estimation also naturally arises in hypothesis testing [33, 18, 7, 9, 10, 8].

The smooth sensitivity framework has also been applied to other problems. Examples include learning decision forests [17], principal component analysis [20], analysis of outliers [28], and analysis of graphical data [23, 25, 34, 30]. Our new distributions can immediately be applied to these problems.

After estimating the mean (or location parameter) of a distribution, the next question is to estimate its scale (e.g. variance). For this, our methods can be applied to robust location estimators [31].

2 Trimmed Mean

For the problem of mean estimation, we use the trimmed mean as our estimator.

Definition 9 (Trimmed Mean). For n, m ∈ Z with n > 2m ≥ 0, define trim_m : R^n → R by

trim_m(x) = (x_{(m+1)} + x_{(m+2)} + ··· + x_{(n−m)}) / (n − 2m),

where x_{(1)} ≤ x_{(2)} ≤ ··· ≤ x_{(n)} denote the order statistics of x.

Intuitively, the trimmed mean interpolates between the mean (m = 0) and the median (m = (n−1)/2).

Error of the Trimmed Mean: Before we consider privatising the trimmed mean, we look at the error introduced by the trimming itself. We focus on mean squared error relative to the mean.
That is, E_{X←D^n}[(trim_m(X) − µ)²], where µ = E_{X←D}[X] is the mean of the distribution D.

We remark that mean squared error may not be the most relevant error metric for many applications. For example, the length of a confidence interval may be more relevant [24]. Similarly, the mean may not be the most relevant parameter to estimate. We pick this error metric as it is simple, widely applicable, and does not require picking additional parameters (such as the confidence level).

The error of the trimmed mean depends on both the trimming fraction and also the data distribution. Figure 1 illustrates this. For Gaussian data, the optimal estimate is the empirical mean, corresponding to trimming m = 0 elements. This has mean squared error 1/n for n samples. As the trimming fraction is increased, the error does too. At the extreme, the median of Gaussian data has asymptotic variance π/2n ≈ 1.57/n. However, if the data has slightly heavier tails than Gaussian data, such as Laplacian data, then trimming actually reduces error! The Laplacian mean has variance 2/n, while the median has asymptotic variance 1/n. In between these two cases is a mixture of two Gaussians with the same mean and differing variances. Here a small amount of trimming reduces the error, but a large amount of trimming increases it again, and there is an optimal trimming fraction in between.

For our main theorems we use the following analytic bound.

Proposition 10. Let n, m ∈ Z satisfy n > 2m ≥ 0. Let X_1, ··· , X_n be i.i.d. samples from a distribution D on R with mean µ and variance σ².
Then

E[(trim_m(X) − µ)²] ≤ (n(1 + √(8m))/(n − 2m)²) · σ² = O((1 + √m) · σ²/n).

Furthermore, if D is a symmetric distribution, then trim_m(X) is also symmetric and

E[(trim_m(X) − µ)²] ≤ (n/(n − 2m)²) · σ² = (1 + O(m/n)) · σ²/n.

Sensitivity of Trimmed Mean: The trimmed mean also has low smooth sensitivity.

Proposition 11. Let a, b, t ∈ R with a < b and t ≥ 0 and n, m ∈ Z with n > 2m ≥ 0 and x ∈ [a, b]^n. Denote x_1, x_2, ··· , x_n in sorted order as x_{(1)} ≤ x_{(2)} ≤ ··· ≤ x_{(n)}. The t-smooth sensitivity of the trimmed mean restricted to inputs in [a, b] – that is, trim_m : [a, b]^n → [a, b] – is

S^t_{trim_m}(x) = (1/(n − 2m)) · max_{k=0,...,n} e^{−kt} · max_{ℓ=0,...,k+1} ( x_{(n−m+1+k−ℓ)} − x_{(m+1−ℓ)} ),

where we define x_{(i)} = a for i ≤ 0 and x_{(i)} = b for i > n.

This is a direct extension of the analysis of the smooth sensitivity of the median by Nissim, Raskhodnikova, and Smith [27]. There is an O(n log n)-time algorithm for computing the smooth sensitivity.

3 Average-Case Mean Estimation via Smooth Sensitivity of Trimmed Mean

Having compiled the relevant tools, we turn to the problem of mean estimation in the average-case distributional setting. We have an unknown distribution D on R and our goal is to estimate the mean µ, given n independent samples X_1, ··· , X_n from D. Our non-private comparison point is the (un-trimmed) empirical mean X̄ = (1/n) Σ_{i=1}^n X_i. This is unbiased – that is, E_{X←D^n}[X̄] = µ – and has variance E_{X←D^n}[(X̄ − µ)²] = σ²/n, where σ² = E_{X←D}[(X − µ)²]. We make the (necessary) assumption that some loose bound µ ∈ [a, b] is known.
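The trimmed mean of Definition 9 and the smooth-sensitivity formula of Proposition 11 translate directly into code. The following sketch (ours) evaluates the formula naively in O(n²) time, padding the sorted, truncated sample with a and b as the proposition prescribes; as noted above, an O(n log n)-time algorithm exists, which we do not attempt here.

```python
import math

def trimmed_mean(x, m):
    """trim_m(x): drop the m smallest and m largest points, average the rest."""
    n = len(x)
    assert n > 2 * m >= 0
    xs = sorted(x)
    return sum(xs[m:n - m]) / (n - 2 * m)

def smooth_sens_trimmed_mean(x, m, t, a, b):
    """t-smoothed sensitivity of trim_m on [a, b]^n (Proposition 11), O(n^2)."""
    n = len(x)
    xs = sorted(min(max(v, a), b) for v in x)  # truncate to [a, b], then sort

    def stat(i):
        # 1-based order statistic, padded with a below and b above as in Prop. 11.
        if i <= 0:
            return a
        if i > n:
            return b
        return xs[i - 1]

    best = 0.0
    for k in range(n + 1):
        width = max(stat(n - m + 1 + k - l) - stat(m + 1 - l)
                    for l in range(k + 2))
        best = max(best, math.exp(-k * t) * width)
    return best / (n - 2 * m)
```

With m = (n−1)/2 this recovers the smooth sensitivity of the median; for example, for x = (0, 2, 4) on [0, 4] with t = 0.5 the maximum is attained at distance k = 1 and equals 4e^{−1/2}.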
Our results will only pay logarithmically in b − a, so this bound need not be tight.

In our situation the inputs may be unbounded. This means the trimmed mean has infinite global sensitivity and infinite smooth sensitivity. Thus we apply truncation to control the sensitivity.

Definition 12 (Truncation). For a, b, x ∈ R with a < b, define [x]_{[a,b]} = max{min{x, b}, a}. For x ∈ R^n and a < b, define [x]_{[a,b]} = ([x_1]_{[a,b]}, [x_2]_{[a,b]}, ··· , [x_n]_{[a,b]}).

Truncation of Inputs: By truncating inputs before applying the trimmed mean, we obtain the following error bound. This holds for symmetric and subgaussian distributions.

Figure 3: Excess variance of the private trimmed mean with smooth sensitivity. Data is [N(0,1)^{1001}]_{[−50, 10^{50}]}. Average of 10⁶ runs.

Figure 4: Excess variance of the private trimmed mean with smooth sensitivity. Data is [N(0,1)^{5001}]_{[−50, 10^{50}]}. Average of 10⁶ runs.

Proposition 13. Let D be a symmetric O(σ)-subgaussian distribution on R with mean µ and variance σ². Let a + O(σ log n) < µ < b − O(σ log n). Let n, m ∈ Z satisfy n > 3m ≥ 0. Then

E_{X←D^n}[( trim_m([X]_{[a,b]}) − µ )²] = (σ²/n) · (1 + O(m/n)).

We remark that if D is not subgaussian, but rather subexponential, then a similar bound can be proved.

Next we turn to analyzing the smooth sensitivity of the trimmed mean with truncated inputs.

Lemma 14. Let D be a σ-subgaussian distribution on R. Let a < 0 < b. Then

E_{X←D^n}[( S^t_{trim_m([·]_{[a,b]})}(X) )²] ≤ (8σ² log n + e^{−2mt}(b − a)²) / (n − 2m)².

Proof.
By Proposition 11,
$$S^t_{\mathrm{trim}_m([\cdot]_{[a,b]})}(x) = \max_{k=0}^{n} e^{-kt} \max_{\ell=0}^{k+1} \frac{x_{(n-m+1+k-\ell)} - x_{(m+1-\ell)}}{n-2m} \le \frac{1}{n-2m} \max\left\{x_{(n)} - x_{(1)},\ e^{-mt} \cdot (b-a)\right\},$$
where the inequality follows from the fact that $x_{(n-m+1+k-\ell)} - x_{(m+1-\ell)} \le x_{(n)} - x_{(1)}$ when $k < m$ and $x_{(n-m+1+k-\ell)} - x_{(m+1-\ell)} \le b - a$ when $k \ge m$. Thus
$$\mathop{\mathbb{E}}_{X \leftarrow D^n}\left[\left(S^t_{\mathrm{trim}_m([\cdot]_{[a,b]})}(X)\right)^2\right] \le \frac{1}{(n-2m)^2} \mathop{\mathbb{E}}_{X \leftarrow D^n}\left[(X_{(n)} - X_{(1)})^2 + e^{-2mt}(b-a)^2\right] \le \frac{8\sigma^2 \log(2n) + e^{-2mt}(b-a)^2}{(n-2m)^2},$$
where the final inequality follows from properties of subgaussians [16, Lem. 4.5] and the fact that $(x-y)^2 \le 4\max\{x^2, y^2\}$ for all $x, y \in \mathbb{R}$.

Combining Proposition 13 and Lemma 14 with the distributions from Section 1.2 yields Theorem 6.

Truncation of Outputs: Rather than truncating the inputs to the trimmed mean, we can truncate the output. This is useful for heavier-tailed distributions and is also simpler to analyze: if $Y$ is a random variable and $\mu \in [a, b]$, then $\mathbb{E}\left[([Y]_{[a,b]} - \mu)^2\right] \le \mathbb{E}\left[(Y - \mu)^2\right]$. Truncation of outputs also controls smooth sensitivity. An analysis analogous to that above yields Theorem 8.

4 Experimental Results

Figures 2, 3, & 4 show an experimental evaluation of our methods – specifically the combination of the trimmed mean with various smooth-sensitivity noise distributions applied to Gaussian data. We briefly explain the experimental setup below. Please see the full version [6] for details.

Data & Error: Our data is sampled from a standard univariate Gaussian distribution.
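To make the estimator concrete, the following sketch implements truncation (Definition 12), the trimmed mean, and the smooth sensitivity of the trimmed mean via the order-statistic formula from Proposition 11 quoted in the proof above. Two caveats: the convention $x_{(i)} = a$ for $i < 1$ and $x_{(i)} = b$ for $i > n$ for out-of-range order statistics is an assumption of this sketch (natural here, since inputs are truncated to $[a, b]$), and the brute-force double loop favors clarity over efficiency.

```python
import numpy as np

def truncate(x, a, b):
    """[x]_{[a,b]} from Definition 12: clamp each coordinate into [a, b]."""
    return np.clip(x, a, b)

def trimmed_mean(x, m):
    """trim_m(x): discard the m smallest and m largest values, average the rest."""
    s = np.sort(x)
    n = len(s)
    assert n > 2 * m
    return s[m:n - m].mean()

def smooth_sensitivity(x, m, t, a, b):
    """Smooth sensitivity of trim_m on truncated data, via the order-statistic
    formula of Proposition 11:
        max_{k=0..n} e^{-kt} max_{l=0..k+1}
            (x_{(n-m+1+k-l)} - x_{(m+1-l)}) / (n - 2m),
    with the convention x_{(i)} = a for i < 1 and x_{(i)} = b for i > n
    (an assumption of this sketch, justified since inputs lie in [a, b])."""
    s = np.sort(truncate(np.asarray(x, dtype=float), a, b))
    n = len(s)
    best = 0.0
    for k in range(n + 1):
        for l in range(k + 2):
            hi_i = n - m + 1 + k - l  # 1-indexed upper order statistic
            lo_i = m + 1 - l          # 1-indexed lower order statistic
            hi = b if hi_i > n else s[hi_i - 1]
            lo = a if lo_i < 1 else s[lo_i - 1]
            best = max(best, np.exp(-k * t) * (hi - lo) / (n - 2 * m))
    return best
```

For example, `trimmed_mean([1, 2, 3, 100], 1)` returns 2.5: the outlier 100 is discarded along with the minimum, illustrating how trimming controls sensitivity to extreme points.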
The truncation interval is set conservatively to $[a, b] = [-50, 1050]$ and the data is truncated before applying the trimmed mean. We measure the variance or mean squared error of the various algorithms. I.e.,
$$\sigma^2 = \mathop{\mathbb{E}}_{X \leftarrow \mathcal{N}(\mu,1)^n}\left[\left(\mathrm{trim}_m\left([X]_{[a,b]}\right) + S^t_{\mathrm{trim}_m([\cdot]_{[a,b]})}(X) \cdot Z - \mu\right)^2\right],$$
where $Z$ is appropriate noise. For scaling, we multiply $\sigma^2$ by $n$ and subtract 1; this scaled quantity is the excess variance shown in the figures.
Algorithms: We compare our three noise distributions against three other distributions from prior work. Three further comparison points are provided. We explain each line briefly; see the full version [6] for more detail. LLN: The Laplace Log-Normal distribution. ULN: The Uniform Log-Normal distribution. arsinhN: The Arsinh-Normal distribution. Student's T: The Student's T distribution with 3 degrees of freedom. This is a simplification of a prior algorithm [27]. (The Cauchy distribution has infinite variance and is thus not included.) Lap: Laplace noise. This was suggested in the original work [27]. N: Gaussian noise. This was analyzed in prior work [4] and provides the relaxed notion of truncated CDP. trim non-priv: We plot the line where no noise is added for privacy; the only source of error is the trimmed mean itself. global sens: We plot the error that would be attained by truncating the data to $[a, b]$ and then using this to bound the global sensitivity and add Gaussian noise. lower bound: We plot the lower bound on variance.
Privacy & Parameters: The algorithms we compare satisfy different variants of differential privacy. As such, it is not possible to give a completely fair comparison. Our new distributions satisfy concentrated differential privacy, whereas the Student's T distribution satisfies pure differential privacy.
Laplace and Gaussian noise satisfy approximate differential privacy or truncated concentrated differential privacy. To provide the fairest possible comparison, we pick an ε value (namely, ε = 1 or ε = 0.2) and then compare (ε, 0)-differential privacy with relaxations thereof. Namely, we compare (ε, 0)-differential privacy with ½ε²-CDP, (½ε², 10)-tCDP, and (ε, 10⁻⁶)-differential privacy. Each of these is implied by (ε, 0)-differential privacy and the implication is fairly tight, so each intuitively provides a roughly similar level of privacy. Aside from the privacy parameters (ε, etc.) and the dataset size (n), we show a range of trimming levels m on the horizontal axis. We numerically optimize the smoothing parameter t. We set the distribution shape parameters to appropriate near-optimal values.
Overall Performance: The experimental results demonstrate that for relatively moderate parameter settings (n = 201 and ε = 1, depicted in Figure 2) it is possible to privately estimate the mean with variance that is only a factor of two higher than non-privately. For n = 1001, it is possible to drive this excess variance down to 10%. Indeed, in these settings, the additional error introduced by trimming is more significant than that introduced by the privacy-preserving noise!
We remark that the data for these experiments is perfectly Gaussian. If the data deviates from this ideal, the robustness of the trimmed mean may actually be beneficial for accuracy (and not just privacy). Figure 1 shows that for some natural distributions the trimming does reduce variance.
Comparison of Algorithms: The results show that different algorithms perform better in different parameter regimes. However, generally, the Laplace Log-Normal distribution has the lowest variance, closely followed by the Student's T distribution.
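The excess-variance metric described under Data & Error can be reproduced approximately by a short Monte Carlo simulation. The sketch below is illustrative only: it uses Student's T noise with 3 degrees of freedom scaled as S/ε, where S is the smooth-sensitivity upper bound max{x₍ₙ₎ − x₍₁₎, e^(−mt)(b − a)}/(n − 2m) from the proof of Lemma 14, together with a hand-picked smoothing parameter t. These are simplifying assumptions, not the exact per-distribution calibration and numerically optimized t used for the figures.

```python
import numpy as np

def excess_variance(n, m, eps, t, runs=300, a=-50.0, b=1050.0, mu=0.0, seed=0):
    """Monte Carlo estimate of n * E[(trim_m([X]_{[a,b]}) + S * Z - mu)^2] - 1
    for X ~ N(mu, 1)^n and Student's T noise Z (3 degrees of freedom).

    The noise scale S/eps uses the smooth-sensitivity upper bound
    max{x_(n) - x_(1), e^{-mt}(b - a)} / (n - 2m) from the proof of
    Lemma 14. This is a simplifying assumption of this sketch, not the
    paper's exact calibration of noise scale to privacy parameters."""
    rng = np.random.default_rng(seed)
    errs = np.empty(runs)
    for i in range(runs):
        # Sample, truncate to [a, b], and sort (for order statistics).
        x = np.sort(np.clip(rng.normal(mu, 1.0, size=n), a, b))
        trim = x[m:n - m].mean()  # trimmed mean
        sens = max(x[-1] - x[0], np.exp(-m * t) * (b - a)) / (n - 2 * m)
        z = rng.standard_t(df=3)  # heavy-tailed noise, finite variance
        errs[i] = trim + (sens / eps) * z - mu
    # Scale by n and subtract 1, as described under "Data & Error".
    return n * np.mean(errs ** 2) - 1.0
```

For instance, `excess_variance(201, 30, 1.0, 0.5)` produces a rough analogue of a single point on the n = 201, ε = 1 curves; because of the simplified calibration above, the values will not match the paper's figures.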
The Arsinh-Normal distribution performs adequately, but the Uniform Log-Normal distribution performs poorly. The Laplace and Gaussian distributions from prior work perform substantially worse than our distributions in some parameter settings.
Note that the different algorithms satisfy slightly different privacy guarantees and also have very different tail behaviours. Since the variance of many of the algorithms is broadly similar, the choice of which algorithm is truly best will depend on these factors. If the stronger pure differential privacy guarantee is preferable, the Student's T distribution is likely best. However, this distribution has no third moment and consequently has heavy tails. This makes it a poor choice if, for example, the goal is a confidence interval, rather than a point estimate of the mean. The lightest tails are provided by the Gaussian, but this only satisfies the weaker truncated CDP or approximate differential privacy definitions. Laplace Log-Normal is in between – it satisfies the strong concentrated differential privacy definition, has quasipolynomial tails, and all of its moments are finite.

Acknowledgement

Part of this work was completed while the authors were at the Simons Institute for the Theory of Computing at the University of California, Berkeley.

References

[1] Martín Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016.

[2] Marco Avella-Medina and Victor-Emmanuel Brunel. Differentially private sub-gaussian location estimators. arXiv preprint arXiv:1906.11923, 2019.

[3] Raef Bassily, Adam Smith, and Abhradeep Thakurta. Private empirical risk minimization: Efficient algorithms and tight error bounds. In 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pages 464–473.
IEEE, 2014.

[4] Mark Bun, Cynthia Dwork, Guy N. Rothblum, and Thomas Steinke. Composable and versatile privacy via truncated CDP. In STOC, 2018.

[5] Mark Bun and Thomas Steinke. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In TCC, 2016. https://arxiv.org/abs/1605.02065.

[6] Mark Bun and Thomas Steinke. Average-case averages: Private algorithms for smooth sensitivity and mean estimation. arXiv preprint arXiv:1906.02830, 2019.

[7] Zachary Campbell, Andrew Bray, Anna Ritz, and Adam Groce. Differentially private ANOVA testing. In 2018 1st International Conference on Data Intelligence and Security (ICDIS), pages 281–285. IEEE, 2018.

[8] Clément L Canonne, Gautam Kamath, Audra McMillan, Adam Smith, and Jonathan Ullman. The structure of optimal private tests for simple hypotheses. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 310–321. ACM, 2019.

[9] Clément L Canonne, Gautam Kamath, Audra McMillan, Jonathan Ullman, and Lydia Zakynthinou. Private identity testing for high-dimensional distributions. arXiv preprint arXiv:1905.11947, 2019.

[10] Simon Couch, Zeki Kazan, Kaiyan Shi, Andrew Bray, and Adam Groce. Differentially private nonparametric hypothesis testing. arXiv preprint arXiv:1903.09364, 2019.

[11] John Duchi and Ryan Rogers. Lower bounds for locally private estimation via communication complexity. arXiv preprint arXiv:1902.00582, 2019.

[12] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, 2006.

[13] Cynthia Dwork and Jing Lei. Differential privacy and robust statistics. In STOC, 2009.

[14] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In TCC, 2006. http://repository.cmu.edu/jpc/vol7/iss3/2.

[15] Cynthia Dwork and Guy Rothblum. Concentrated differential privacy. CoRR, abs/1603.01887, 2016. https://arxiv.org/abs/1603.01887.

[16] Vitaly Feldman and Thomas Steinke. Calibrating noise to variance in adaptive data analysis. In COLT, 2018.

[17] Sam Fletcher and Md Zahidul Islam. Differentially private random decision forests using smooth sensitivity. Expert Systems with Applications, 78:16–31, 2017.

[18] Marco Gaboardi, Hyun-Woo Lim, Ryan M Rogers, and Salil P Vadhan. Differentially private chi-squared hypothesis testing: Goodness of fit and independence testing. In ICML'16 Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48. JMLR, 2016.

[19] Marco Gaboardi, Ryan Rogers, and Or Sheffet. Locally private mean estimation: Z-test and tight confidence intervals. arXiv preprint arXiv:1810.08054, 2018.

[20] Alon Gonen and Ran Gilad-Bachrach. Smooth sensitivity based approach for differentially private principal component analysis. arXiv preprint arXiv:1710.10556, 2017.

[21] Matthew Joseph, Janardhan Kulkarni, Jieming Mao, and Zhiwei Steven Wu. Locally private Gaussian estimation. arXiv preprint arXiv:1811.08382, 2018.

[22] Gautam Kamath, Jerry Li, Vikrant Singhal, and Jonathan Ullman. Privately learning high-dimensional distributions. In COLT, 2019.

[23] Vishesh Karwa, Sofya Raskhodnikova, Adam Smith, and Grigory Yaroslavtsev. Private analysis of graph structure. Proceedings of the VLDB Endowment, 4(11):1146–1157, 2011.

[24] Vishesh Karwa and Salil Vadhan. Finite sample differentially private confidence intervals. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.

[25] Shiva Prasad Kasiviswanathan, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. Analyzing graphs with node differential privacy. In Theory of Cryptography Conference, pages 457–476. Springer, 2013.

[26] Ilya Mironov. Rényi differential privacy. In 30th IEEE Computer Security Foundations Symposium, CSF 2017, Santa Barbara, CA, USA, August 21-25, 2017, pages 263–275, 2017.

[27] Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. Smooth sensitivity and sampling in private data analysis. In Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, pages 75–84. ACM, 2007.

[28] Rina Okada, Kazuto Fukuchi, and Jun Sakuma. Differentially private analysis of outliers. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 458–473. Springer, 2015.

[29] Nicolas Papernot, Shuang Song, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, and Ulfar Erlingsson. Scalable private learning with PATE. In International Conference on Learning Representations, 2018.

[30] Adam Sealfon and Jonathan Ullman. Efficiently estimating Erdős–Rényi graphs with node differential privacy. In NeurIPS, 2019.

[31] Robert Serfling. Asymptotic relative efficiency in estimation. International Encyclopedia of Statistical Science, pages 68–72, 2011.

[32] Adam Smith. Efficient, differentially private point estimators. arXiv preprint arXiv:0809.4794, 2008.

[33] Yue Wang, Jaewoo Lee, and Daniel Kifer. Differentially private hypothesis testing, revisited. arXiv preprint arXiv:1511.03376, 2015.

[34] Yue Wang and Xintao Wu. Preserving differential privacy in degree-correlation based graph generation. Transactions on Data Privacy, 6(2):127, 2013.