{"title": "Differentially Private Change-Point Detection", "book": "Advances in Neural Information Processing Systems", "page_first": 10825, "page_last": 10834, "abstract": "The change-point detection problem seeks to identify distributional changes at an unknown change-point k* in a stream of data. This problem appears in many important practical settings involving personal data, including biosurveillance, fault detection, finance, signal detection, and security systems. The field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. We study the statistical problem of change-point detection through the lens of differential privacy. We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and then provide empirical validation of these results.", "full_text": "Differentially Private Change-Point Detection\n\nRachel Cummings\n\nGeorgia Institute of Technology\n\nrachelc@gatech.edu\n\nSara Krehbiel\n\nUniversity of Richmond\n\nkrehbiel@richmond.edu\n\nYajun Mei\n\nGeorgia Institute of Technology\n\nymei@gatech.edu\n\nRui Tuo\n\nTexas A&M University\n\nruituo@tamu.edu\n\nWanrong Zhang*\n\nGeorgia Institute of Technology\n\nwanrongz@gatech.edu\n\nAbstract\n\nThe change-point detection problem seeks to identify distributional changes at an unknown change-point k* in a stream of data. This problem appears in many important practical settings involving personal data, including biosurveillance, fault detection, finance, signal detection, and security systems. The field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. We study the statistical problem of change-point detection through the lens of differential privacy. 
We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and provide empirical validation of our results.\n\n1 Introduction\n\nThe change-point detection problem seeks to identify distributional changes at an unknown change-point k* in a stream of data. The estimated change-point should be consistent with the hypothesis that the data are initially drawn from pre-change distribution P0 but from post-change distribution P1 starting at the change-point. This problem appears in many important practical settings, including biosurveillance, fault detection, finance, signal detection, and security systems. For example, the CDC may wish to detect a disease outbreak based on real-time data about hospital visits, or smart home IoT devices may want to detect changes in activity within the home. In both of these applications, the data contain sensitive personal information.\nThe field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. Informally, an algorithm that is ε-differentially private ensures that any particular output of the algorithm is at most e^ε times more likely when a single data entry is changed. In the past decade, the theoretical computer science community has developed a wide variety of differentially private algorithms for many statistical tasks. The private algorithms most relevant to this work are based on the simple output perturbation principle that to produce an ε-differentially private estimate of some statistic on the database, we should add to the exact statistic noise proportional to Δ/ε, where Δ denotes the sensitivity of the statistic, or how much it can be influenced by a single data entry.\nWe study the statistical problem of change-point detection through the lens of differential privacy. 
We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and then provide empirical validation of these results.\n\n*Primary author. Authors are listed in alphabetical order.\n\n32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada.\n\n1.1 Related work\n\nThe change-point detection problem originally arose from industrial quality control, and has since been applied in a wide variety of other contexts including climatology [LR02], econometrics [BP03], and DNA analysis [ZS12]. The problem is studied both in the offline setting, in which the algorithm has access to the full dataset X = {x_1, …, x_n} up front, and in the online setting, in which data points arrive one at a time X = {x_1, …}. Change-point detection is a canonical problem in statistics that has been studied for nearly a century; selected results include [She31, Pag54, Shi63, Rob66, Lor71, Pol85, Pol87, Mou86, Lai95, Lai01, Kul01, Mei06, Mei08, Mei10, Cha17].\nOur approach is inspired by the commonly used Cumulative Sum (CUSUM) procedure [Pag54]. It follows the generalized log-likelihood ratio principle, calculating\n\nℓ(k) = Σ_{i=k}^{n} log(P1(x_i)/P0(x_i))\n\nfor each k ∈ [n] and declaring that a change occurs if and only if ℓ(k̂) ≥ T for MLE k̂ = argmax_k ℓ(k) and appropriate threshold T > 0. The existing change-point literature works primarily in the asymptotic setting when k*/n → r for some r ∈ (0, 1) as n → ∞ (see, e.g., [Hin70, Car88]). In contrast, we consider finite databases and provide the first accuracy guarantees for the MLE from a finite sample (n < ∞).\nIn offering the first algorithms for private change-point detection, we primarily use two powerful tools from the differential privacy literature. 
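As a concrete reference point before introducing the private tools, the non-private CUSUM estimator k̂ = argmax_k ℓ(k) described above can be sketched in a few lines of Python. This is our own illustration, not the paper's code; the helper names and the Bernoulli example are hypothetical.

```python
import math

def cusum_mle(xs, log_ratio):
    """Non-private CUSUM MLE: k_hat = argmax_{k in [n]} l(k), where
    l(k) = sum_{i=k}^{n} log P1(x_i)/P0(x_i) (1-indexed, as in the paper)."""
    n = len(xs)
    ell = {}
    suffix = 0.0
    for k in range(n, 0, -1):      # build the partial sums from the right
        suffix += log_ratio(xs[k - 1])
        ell[k] = suffix
    return max(ell, key=ell.get)

def bernoulli_log_ratio(p0, p1):
    """log P1(x)/P0(x) for Bernoulli hypotheses with means p0 and p1."""
    def f(x):
        return math.log(p1 / p0) if x == 1 else math.log((1 - p1) / (1 - p0))
    return f

# Toy stream: mostly 0s, then mostly 1s starting at k* = 6.
xs = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
print(cusum_mle(xs, bernoulli_log_ratio(0.2, 0.8)))  # -> 6
```

Building ℓ(k) as a running suffix sum keeps the whole sweep linear in n rather than quadratic.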
REPORTMAX [DR14] calculates noisy approximations of a stream of queries on the database and reports which query produced the largest noisy value. We instantiate this with partial log-likelihood queries to produce a private approximation of the change-point MLE in the offline setting. ABOVETHRESH [DNR+09] calculates noisy approximations of a stream of queries on the database iteratively and aborts as soon as a noisy approximation exceeds a specified threshold. We extend our offline results to the harder online setting, in which a bound on k* is not known a priori, by using ABOVETHRESH to identify a window of fixed size n in which a change is likely to have occurred so that we can call our offline algorithm at that point to estimate the true change-point.\n\n1.2 Our results\n\nWe use existing tools from differential privacy to solve the change-point detection problem in both offline and online settings, neither of which has been studied in the private setting before.\n\nPrivate offline change-point detection. We develop an offline private change-point detection algorithm OFFLINEPCPD (Algorithm 1) that is accurate under one of two assumptions about the distributions from which data are drawn. As is standard in the privacy literature, we give accuracy guarantees that bound the additive error of our estimate of the true change-point with high probability. Our accuracy theorem statements (Theorems 2 and 4) also provide guarantees for the non-private estimator for comparison. Since traditional statistics typically focuses on the asymptotic consistency and unbiasedness of the estimator, ours are the first finite-sample accuracy guarantees for the standard (non-private) MLE. As expected, MLE accuracy decreases with the sensitivity of the measured quantity but increases as the pre- and post-change distributions grow apart. Interestingly, it is constant with respect to the size of the database. 
In providing MLE bounds alongside accuracy guarantees for our private algorithms, we are able to quantify the cost of privacy as roughly DKL(P0||P1)/ε.\nWe are able to prove ε-differential privacy under the first distributional assumption, which is that the measured quantity has bounded sensitivity Δ(ℓ), by instantiating the general-purpose REPORTMAX algorithm from the privacy literature with our log-likelihood queries (Theorem 1). Importantly, and in contrast to our accuracy results, the distributional assumption need only apply to the hypothesized distributions from which data are drawn; privacy holds for arbitrary input databases. We offer a limited privacy guarantee for our second distributional assumption, ensuring that if an individual data point is drawn from one of the two hypothesized distributions, redrawing that data from either of the distributions will not be detected, regardless of the composition of the rest of the database (Theorem 3).\n\nPrivate online change-point detection. In ONLINEPCPD (Algorithm 2), we extend our offline results to the online setting by using the ABOVETHRESH framework to first identify a window in which the change is likely to have happened and then call the offline algorithm to identify a more precise approximation of when it occurred. Standard ε-differential privacy under our first distributional assumption follows from composition of the underlying privacy mechanisms (Theorem 5).² Accuracy of our online mechanism relies on appropriate selection of the threshold that identifies a window in which a change-point has likely occurred, at which point the error guarantees are inherited from the offline algorithm (Theorem 6).\n\nEmpirical validation. Finally, we run several Monte Carlo experiments to validate our theoretical results for both the online and offline settings. 
We consider data drawn from Bernoulli and Gaussian distributions, which satisfy our first and second distributional assumptions, respectively. Our offline experiments are summarized in Figure 1, which shows that change-point detection is easier when P0 and P1 are further apart and harder when the privacy requirement is stronger (ε is smaller). Additionally, these experiments enhance our theoretical results, finding that OFFLINEPCPD performs well even when we relax the assumptions required for our theoretical accuracy bounds by running our algorithm on imperfect hypotheses P0 and P1 that are closer together than the true distributions from which data are drawn. Figure 2 shows that ONLINEPCPD also performs well, consistent with our theoretical guarantees.\n\n2 Preliminaries\n\nOur work considers the statistical problem of change-point detection through the lens of differential privacy. Section 2.1 defines the change-point detection problem, and Section 2.2 describes the differentially private tools that will be brought to bear.\n\n2.1 Change-point background\n\nLet X = {x_1, …, x_n} be n real-valued data points. The change-point detection problem is parametrized by two distributions, P0 and P1. The data points in X are hypothesized to initially be sampled i.i.d. from P0, but at some unknown change time k* ∈ [n], an event may occur (e.g., epidemic disease outbreak) and change the underlying distribution to P1. The goal of a data analyst is to announce that a change has occurred as quickly as possible after k*. Since the x_i may be sensitive information—such as individuals' medical information or behaviors inside their home—the analyst will wish to announce the change-point time in a privacy-preserving manner.\nIn the standard non-private offline change-point literature, the analyst wants to test the null hypothesis H0 : k* = ∞, where x_1, …
, x_n ~iid P0, against the composite alternate hypothesis H1 : k* ∈ [n], where x_1, …, x_{k*−1} ~iid P0 and x_{k*}, …, x_n ~iid P1. The log-likelihood ratio of k* = ∞ against k* = k is given by\n\nℓ(k, X) = Σ_{i=k}^{n} log(P1(x_i)/P0(x_i)).  (1)\n\nThe maximum likelihood estimator (MLE) of the change time k* is given by\n\nk̂(X) = argmax_{k∈[n]} ℓ(k, X).  (2)\n\nWhen X is clear from context, we will simply write ℓ(k) and k̂.\nAn important quantity in our accuracy analysis will be the Kullback-Leibler distance between probability distributions P0 and P1, defined as DKL(P1||P0) = ∫_{−∞}^{∞} P1(x) log(P1(x)/P0(x)) dx = E_{x~P1}[log(P1(x)/P0(x))]. We always use log to refer to the natural logarithm, and when necessary, we interpret log(0/0) = 0.\nWe will measure the additive error of our estimations of the true change point as follows.\nDefinition 1 ((α, β)-accuracy). A change-point detection algorithm that produces a change-point estimator k̃(X) where a distribution change occurred at time k* is (α, β)-accurate if Pr[|k̃ − k*| < α] ≥ 1 − β, where the probability is taken over randomness of the algorithm and sampling of X.\n²We note that we can relax our distributional assumption and get a weaker privacy guarantee as in the offline setting if desired.\n\n2.2 Differential privacy background\n\nDifferential privacy bounds the maximum amount that a single data entry can affect analysis performed on the database. Two databases X, X' are neighboring if they differ in at most one entry.\nDefinition 2 (Differential Privacy [DMNS06]). An algorithm M : ℝ^n → ℝ is (ε, δ)-differentially private if for every pair of neighboring databases X, X' ∈ ℝ^n, and for every subset of possible outputs S ⊆ ℝ,\n\nPr[M(X) ∈ S] ≤ exp(ε) Pr[M(X') ∈ S] + δ.\n\nIf δ = 0, we say that M is ε-differentially private.\nOne common technique for achieving differential privacy is adding Laplace noise. The Laplace distribution with scale b is the distribution with probability density function Lap(x|b) = (1/(2b)) exp(−|x|/b). We will write Lap(b) to denote the Laplace distribution with scale b, or (with a slight abuse of notation) to denote a random variable sampled from Lap(b).\nThe sensitivity of a function or query f is defined as Δ(f) = max_{neighbors X,X'} |f(X) − f(X')|. The Laplace Mechanism of [DMNS06] takes in a function f, database X, and privacy parameter ε, and outputs f(X) + Lap(Δ(f)/ε). Our algorithms rely on two existing differentially private algorithms, REPORTMAX [DR14] and ABOVETHRESH [DNR+09], which are overviewed in Appendix A. Appendix B covers the concentration inequalities used in the proofs of our bounds.\n\n3 Offline private change-point detection\n\nIn this section, we investigate the differentially private change-point detection problem in the setting that n data points X = {x_1, …, x_n} are known to the algorithm in advance. Given two hypothesized distributions P0 and P1, our algorithm OFFLINEPCPD privately approximates the MLE k̂ of the change time k*. We provide accuracy bounds for both the MLE and the output of our algorithm under two different assumptions about the distributions from which the data are drawn, summarized in Table 1.\n\nTable 1: Summary of non-private and private offline accuracy guarantees under H1. 
The expressions Δ(ℓ), A_δ, C, and C_M are defined in (4), (5), (8), (9), resp.\n\nAssumption | MLE | OFFLINEPCPD\nA := Δ(ℓ) < ∞ | (2A²/C²) log(32/β³) | max{ (8A²/C²) log(64/β³), (4A/(Cε)) log(16/β) }\nA := A_δ < ∞ | (67/C_M²) log(64/β³) | max{ (262/C_M²) log(128/β³), (2A/(C_M ε)) log(16/β) }\n\nThe first assumption essentially requires that P1(x)/P0(x) cannot be arbitrarily large or arbitrarily small for any x. We note that this assumption is not satisfied by several important families of distributions, including Gaussians. The second assumption, motivated by the δ > 0 relaxation of differential privacy, instead requires that the x for which this log ratio exceeds some bound A_δ have probability mass at most δ.\nAlthough the accuracy of OFFLINEPCPD only holds under the change-point model's alternate hypothesis H1, it is ε-differentially private for any hypothesized distributions P0, P1 with finite Δ(ℓ) and privacy parameters ε > 0, δ = 0, regardless of the distributions from which X is drawn. We offer a similar but somewhat weaker privacy guarantee when Δ(ℓ) is infinite but A_δ is finite, which roughly states that a data point sampled from either P0 or P1 can be replaced with a fresh sample from either P0 or P1 without detection.\n\n3.1 Offline algorithm\n\nOur proposed offline algorithm OFFLINEPCPD applies the report noisy max algorithm [DR14] to the change-point problem by adding noise to the partial log-likelihood ratios ℓ(k) used to estimate the change-point MLE k̂. 
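In code, the report-noisy-argmax step at the heart of this approach might look as follows. This is a minimal sketch under our own naming, not the paper's implementation: `sensitivity` stands in for the noise-scale parameter A chosen by the algorithm, and Laplace sampling is done by inverse CDF so no external libraries are needed.

```python
import math
import random

def sample_laplace(scale):
    """Draw Lap(scale) via the inverse-CDF method (stdlib only)."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def offline_pcpd(xs, log_ratio, sensitivity, eps):
    """Sketch of the noisy-argmax core of OFFLINEPCPD: perturb each partial
    log-likelihood l(k) with Lap(A/eps) noise and report the argmax index."""
    n = len(xs)
    suffix = 0.0
    ell = {}
    for k in range(n, 0, -1):              # l(k) = sum_{i>=k} log P1(x_i)/P0(x_i)
        suffix += log_ratio(xs[k - 1])
        ell[k] = suffix
    noisy = {k: v + sample_laplace(sensitivity / eps) for k, v in ell.items()}
    return max(noisy, key=noisy.get)       # 1-indexed estimate k-tilde
```

With a very large ε the added noise vanishes and the output coincides with the non-private MLE; smaller ε trades accuracy for stronger privacy.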
The algorithm chooses Laplace noise parameter A/ε depending on input hypothesized distributions P0, P1 and privacy parameters ε, δ, and then outputs\n\nk̃ = argmax_{1≤k≤n} {ℓ(k) + Z_k}.  (3)\n\nOur algorithm can be easily modified to additionally output an approximation of ℓ(k̃) and incur 2ε privacy cost by composition.\n\nAlgorithm 1 Offline private change-point detector: OFFLINEPCPD(X, P0, P1, ε, δ, n)\nInput: database X, distributions P0, P1, privacy parameters ε, δ, database size n\nif δ = 0 then\n  Set A = max_x log(P1(x)/P0(x)) − min_{x'} log(P1(x')/P0(x'))  # set A = Δ(ℓ) as in (4)\nelse\n  Set A = min{t : max_{i=0,1} Pr_{x~Pi}[2|log(P1(x)/P0(x))| > t] ≤ δ}  # set A = A_δ as in (5)\nend if\nfor k = 1, …, n do\n  Compute ℓ(k) = Σ_{i=k}^{n} log(P1(x_i)/P0(x_i))\n  Sample Z_k ~ Lap(A/ε)\nend for\nOutput k̃ = argmax_{1≤k≤n} {ℓ(k) + Z_k}  # Report noisy argmax\n\nWhen δ = 0, the noise scale is calibrated to the sensitivity of the log-likelihood ratio,\n\nΔ(ℓ) = max_x log(P1(x)/P0(x)) − min_{x'} log(P1(x')/P0(x')).  (4)\n\nWhen Δ(ℓ) is unbounded, we instead allow δ > 0 and identify a value A_δ for which most values of x ~ P0, P1 have bounded log-likelihood ratio:\n\nA_δ = min{ t : max_{i=0,1} Pr_{x~Pi}[ 2|log(P1(x)/P0(x))| > t ] ≤ δ }.  (5)\n\nExample 1. For Gaussian hypotheses P0 = N(0, 1) and P1 = N(µ, 1) with µ > 0, we have A_δ = 2µ[Φ^{−1}(1 − δ/2) + µ/2], where Φ is the cumulative distribution function (CDF) of the standard normal distribution.\n\n3.2 Theoretical properties under the uniform bound assumption\n\nIn this subsection, we prove privacy and accuracy of OFFLINEPCPD when δ = 0 and P0, P1 are such that Δ(ℓ) is finite. Note that if Δ(ℓ) is infinite, then the algorithm will simply add noise with infinite scale and will still be differentially private.\nTheorem 1. For arbitrary data X, OFFLINEPCPD(X, P0, P1, ε, 0, n) is (ε, 0)-differentially private.\nThe proof follows by instantiation of REPORTMAX [DR14] with queries ℓ(k) for k ∈ [n], which have sensitivity A = Δ(ℓ). 
It is included in Appendix C for completeness.\nNext we provide accuracy guarantees for the standard (non-private) MLE k̂ and the output k̃ of our private algorithm OFFLINEPCPD when the data are drawn from P0, P1 with true change point k* ∈ (1, n). By providing both bounds, Theorem 2 quantifies the cost of requiring privacy in change-point detection.\nOur result for the standard (non-private) MLE is the first finite-sample accuracy guarantee for this estimator. Such non-asymptotic properties have not been previously studied in traditional statistics, which typically focuses on consistency and unbiasedness of the estimator, with less attention to the convergence rate. We show that the additive error of the MLE is constant with respect to the sample size, which means that the convergence rate is O_P(1). That is, it converges in probability to the true change-point k* in constant time.\nNote that accuracy depends on two measures, A and C, of the distance between distributions P0 and P1. Accuracy both of the MLE k̂ and of the OFFLINEPCPD output k̃ is best for distributions for which A = Δ(ℓ) is small relative to the KL-divergence, which is consistent with the intuition that larger changes are easier to detect but output sensitivity degrades the robustness of the estimator and requires more noise for privacy, harming accuracy.\nA technical challenge that arises in proving accuracy of the private estimator is that the x_i are not identically distributed when the true change-point k* ∈ (1, n], and so the partial log-likelihood ratios ℓ(k) are dependent across k. Hence we need to investigate the impact of adding i.i.d. noise draws to a sequence of ℓ(k) that may be neither independent nor identically distributed. Fortunately, the differences ℓ(k) − ℓ(k + 1) = log(P1(x_k)/P0(x_k)) are piecewise i.i.d. 
This property is key in our proof. Moreover, we show that we can divide the possible outputs of the algorithm into regions of doubling size with exponentially decreasing probability of being selected by the algorithm, resulting in accuracy bounds that are independent of the number of data points n.\nTheorem 2. For hypotheses P0, P1 such that Δ(ℓ) < ∞ and n data points X drawn from P0, P1 with true change time k* ∈ (1, n], the MLE k̂ is (α, β)-accurate for any β > 0 and\n\nα = (2A²/C²) log(32/β³).  (6)\n\nFor hypotheses and data drawn this way with privacy parameter ε > 0, OFFLINEPCPD(X, P0, P1, ε, 0, n) is (α, β)-accurate for any β > 0 and\n\nα = max{ (8A²/C²) log(64/β³), (4A/(Cε)) log(16/β) }.  (7)\n\nIn both expressions, A = Δ(ℓ) and C = min{DKL(P1||P0), DKL(P0||P1)}.\n\n3.3 Relaxing uniform bound assumptions\n\nIn this subsection, we prove accuracy and a limited notion of privacy for OFFLINEPCPD when δ > 0 and P0, P1 are such that A_δ is finite. Since we are no longer able to uniformly bound log(P1(x)/P0(x)), these accuracy results include worse constants than those in Section 3.2, but the relaxed assumption about P0, P1 makes the results applicable to a wider range of distributions, including Gaussian distributions (see Example 1). Note of course that for some pairs of very different distributions, such as distributions with non-overlapping supports, the assumption that A_δ < ∞ may still fail. A true change point k* can always be detected with perfect accuracy given x_{k*−1} and x_{k*}, so we should not expect to be able to offer any meaningful privacy guarantees for such distributions.\nBy similar rationale, relaxing the uniform bound assumption means that we may have a single data point x_j that dramatically increases ℓ(k) for k ≤ j, so we cannot add noise proportional to Δ(ℓ), and privacy no longer follows from that of REPORTMAX. 
Instead we offer a weaker notion of privacy in Theorem 3 below. As with the usual definition of differential privacy, we guarantee that the output of our algorithm is similarly distributed on neighboring databases, only our notion of neighboring databases depends on the hypothesized distributions. Specifically, a single entry in X drawn from either P0 or P1 may be replaced without detection by another entry drawn from either P0 or P1, even if the rest of the database is arbitrary. The proof is given in Appendix C.\nTheorem 3. For any ε, δ > 0, any hypotheses P0, P1 such that A_δ < ∞, any index j ∈ [n], any i, i' ∈ {0, 1}, and any x_1, …, x_{j−1}, x_{j+1}, …, x_n, let X_i = {x_1, …, x_n} denote the random variable with x_j ~ P_i and let X'_{i'} = {x_1, …, x_{j−1}, x'_j, x_{j+1}, …, x_n} denote the random variable with x'_j ~ P_{i'}. Then for any S ⊆ [n], we have\n\nPr[OFFLINEPCPD(X_i, P0, P1, ε, δ, n) ∈ S] ≤ exp(ε) · Pr[OFFLINEPCPD(X'_{i'}, P0, P1, ε, δ, n) ∈ S] + δ,\n\nwhere the probabilities are over the randomness of the algorithm and of X_i, X'_{i'}.\nAllowing Δ(ℓ) to be infinite precludes our use of Hoeffding's inequality as in Theorem 2. The main idea in the proof, however, can be salvaged by decomposing the change into a change from P0 to the average distribution (P0 + P1)/2 and then the average distribution to P1. Correspondingly, we will use C_M, an alternate distance measure between P0 and P1, defined below next to C from the previous section for comparison:\n\nC = min{DKL(P0||P1), DKL(P1||P0)}  (8)\nC_M = min{ DKL(P0 || (P0+P1)/2), DKL(P1 || (P0+P1)/2) } = min_{i=0,1} E_{x~Pi}[ log( 2Pi(x) / (P0(x) + P1(x)) ) ]  (9)\n\nBecause (2Pi)/(P0 + P1) ≤ 2, we have 0 ≤ DKL(Pi || (P0 + P1)/2) ≤ log 2, and thus the constant C_M in (9) is well-defined. The proof of the following theorem is given in Appendix C.\nTheorem 4. For δ > 0 and hypotheses P0, P1 such that A_δ < ∞ and n data points X drawn from P0, P1 with true change time k* ∈ (1, n), the MLE k̂ is (α, β)-accurate for any β > 0 and\n\nα = (67/C_M²) log(64/β³).  (10)\n\nFor hypotheses and data drawn this way with privacy parameter ε > 0, OFFLINEPCPD(X, P0, P1, ε, δ, n) is (α, β)-accurate for any β > 0 and\n\nα = max{ (262/C_M²) log(128/β³), (2A/(C_M ε)) log(16/β) }.  (11)\n\nIn both expressions, A = A_δ and C_M = min{ DKL(P0 || (P0+P1)/2), DKL(P1 || (P0+P1)/2) }.\n\n4 Online private change-point detection\n\nIn this section, we give a new differentially private algorithm for change-point detection in the online setting, ONLINEPCPD. In this setting, the algorithm initially receives n data points x_1, …, x_n and then continues to receive data points one at a time. As before, the goal is to privately identify an approximation of the time k* when the data change from distribution P0 to P1. Additionally, we want to identify this change shortly after it occurs.\nOur offline algorithm is not directly applicable because we do not know a priori how many points must arrive before a true change point occurs. To resolve this, ONLINEPCPD works like ABOVETHRESH, determining after each new data entry arrives whether it is likely that a change occurred in the most recent n entries. When ONLINEPCPD detects a sufficiently large (noisy) partial log-likelihood ratio ℓ(k) = Σ_{i=k}^{j} log(P1(x_i)/P0(x_i)), it calls OFFLINEPCPD to privately determine the most likely change point k̃ in the window {x_{j−n+1}, …, x_j}.\nPrivacy of ONLINEPCPD is immediate from composition of ABOVETHRESH and OFFLINEPCPD, each with privacy loss ε/2. As before, accuracy requires X to be drawn from P0, P1 with some true change point k*. 
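This AboveThresh-style loop can be sketched in Python as follows. The rendering and names are ours, not the paper's code; the `offline` argument stands in for the offline change-point subroutine, which here receives half the privacy budget, and `sample_laplace` is a hypothetical stdlib-only Laplace sampler.

```python
import math
import random

def sample_laplace(scale):
    """Draw Lap(scale) via the inverse-CDF method (stdlib only)."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def online_pcpd(stream, log_ratio, sensitivity, eps, n, T, offline):
    """Sketch of ONLINEPCPD: compare a noisy windowed log-likelihood maximum
    against a noisy threshold; on a hit, run the offline estimator on the
    current length-n window and shift its answer back to stream time."""
    t_hat = T + sample_laplace(4 * sensitivity / eps)
    window = []
    for j, x in enumerate(stream, start=1):
        window.append(x)
        if len(window) > n:
            window.pop(0)
        if j < n:
            continue
        suffix, best = 0.0, -math.inf          # l_j = max_{j-n+1<=k<=j} l(k)
        for xi in reversed(window):
            suffix += log_ratio(xi)
            best = max(best, suffix)
        if best + sample_laplace(8 * sensitivity / eps) > t_hat:
            return offline(window, log_ratio, sensitivity, eps / 2) + (j - n)
    return None   # no change detected in the stream seen so far
```

Here `offline` could be any window estimator; privacy of the full loop then follows from composing the two ε/2 pieces, mirroring Theorem 5.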
This algorithm also requires a suitable choice of T to guarantee that OFFLINEPCPD is called for a window of data that actually contains k*. Specifically, T should be large enough that the algorithm is unlikely to call OFFLINEPCPD when j < k* but small enough so that it is likely to call OFFLINEPCPD by time j = k* + n/2. When both of these conditions hold, we inherit the accuracy of OFFLINEPCPD, with an extra log n factor arising from the fact that the data are no longer distributed exactly as in the change-point model after conditioning on calling OFFLINEPCPD in a correct window.\nWith our final bounds, we note that n ≥ (A/C) log(k*/β) suffices for existence of a suitable threshold, and an analyst must have a reasonable approximation of k* in order to choose such a threshold. Otherwise, the accuracy bound itself has no dependence on the change-point k*.\n\nAlgorithm 2 Online private change-point detector: ONLINEPCPD(X, P0, P1, ε, n, T)\nInput: database X, distributions P0, P1, privacy parameter ε, starting size n, threshold T\nLet A = max_x log(P1(x)/P0(x)) − min_{x'} log(P1(x')/P0(x'))\nLet T̂ = T + Lap(4A/ε)\nfor each new data point x_j, j ≥ n do\n  Compute ℓ_j = max_{j−n+1≤k≤j} ℓ(k)\n  Sample Z_j ~ Lap(8A/ε)\n  if ℓ_j + Z_j > T̂ then\n    Output OFFLINEPCPD({x_{j−n+1}, …, x_j}, P0, P1, ε/2, 0, n) + (j − n)\n    Halt\n  else\n    Output ⊥\n  end if\nend for\n\nTheorem 5. For arbitrary data X, ONLINEPCPD(X, P0, P1, ε, n, T) is (ε, 0)-differentially private.\nThis privacy guarantee follows from simple composition of ABOVETHRESH and OFFLINEPCPD, each with privacy loss ε/2. The proof of the accuracy bound is given in Appendix D.\nTheorem 6. 
For hypotheses P0, P1 such that Δ(ℓ) < ∞, a stream of data points X with starting size n drawn from P0, P1 with true change time k* ≥ n/2, privacy parameter ε > 0, and threshold T ∈ [T_L, T_U] with\n\nT_L := 2A√(2 log(64k*/β)) + (16A/ε) log(8k*/β),\nT_U := nC/2 − C − 2A√(n log(8/β)) − (16A/ε) log(8k*/β),\n\nwe have that ONLINEPCPD(X, P0, P1, ε, n, T) is (α, β)-accurate for any β > 0 and\n\nα = max{ (16A²/C²) log(32n/β), (4A/(Cε)) log(8n/β) }.\n\nIn the above expressions, A = Δ(ℓ) and C = min{DKL(P0||P1), DKL(P1||P0)}.\n\n5 Numerical studies\n\nWe now report the results of Monte Carlo experiments designed to validate the theoretical results of the previous sections. We only consider our accuracy guarantees because the nature of differential privacy provides a strong worst-case guarantee for all hypothetical databases, and it is therefore impractical and redundant to test empirically. Our simulations consider both offline and online settings for two canonical problems: detecting a change in the mean of Bernoulli and Gaussian distributions.\nWe begin with the offline setting to verify performance of our OFFLINEPCPD algorithm. We use n = 200 observations where the true change occurs at time k* = 100. This process is repeated 10⁴ times. For both the Bernoulli and Gaussian models, we consider the following three different change scenarios, corresponding to the size of the change and parameter selection for OFFLINEPCPD. For each of these cases, we consider privacy parameter ε = 0.1, 0.5, 1, ∞, where ε = ∞ corresponds to the non-private problem, which serves as our baseline. The results are summarized in Figure 1, which plots the empirical probabilities β = Pr[|k̃ − k*| > α] as a function of α.\n\n(A) Large change. Bernoulli model: detecting a change from p0 = 0.2 to p1 = 0.8. 
Gaussian model: detecting a change from µ0 = 0 to µ1 = 1.\n\n(B) Small change. Bernoulli model: detecting a change from p0 = 0.2 to p1 = 0.4. Gaussian model: detecting a change from µ0 = 0 to µ1 = 0.5.\n\n(C) Misspecified change. Bernoulli model: the algorithm tests for a change from p0 = 0.2 to p1 = 0.4 when the true distributions have p0 = 0.2 and p1 = 0.8. Gaussian model: the algorithm tests for a change from µ0 = 0 to µ1 = 0.5 when the true distributions have µ0 = 0 and µ1 = 1.\n\nFigure 1 highlights three positive results for our algorithm when data are drawn from Bernoulli or Gaussian distributions: accuracy is best when the true change in the data is large (plots a and d) compared to small (plots b and e), accuracy deteriorates as ε decreases for stronger privacy, and the algorithm performs well even when the true change is larger than that hypothesized (plots c and f). This figure emphasizes that our algorithm performs well even for quite strong privacy guarantees (ε < 1). The misspecified change experiments bolster our theoretical results substantially, indicating that our hypotheses can be quite far from the distributions of the true data and our algorithms will still identify a change-point accurately. We also run Monte Carlo simulations of our online change-point detection algorithm ONLINEPCPD. 
These are displayed in Figure 2 and discussed in Appendix E.\n\n[Figure 1: six panels plotting β against α for ε = 0.1, 0.5, 1, and the non-private MLE (ε = ∞): (a) Bernoulli, large change; (b) Bernoulli, small change; (c) Bernoulli, misspecified change; (d) Gaussian, large change; (e) Gaussian, small change; (f) Gaussian, misspecified change.]\n\nFigure 1: Accuracy for large change, small change, and misspecified change Monte Carlo simulations with Bernoulli and Gaussian data. Each simulation involves 10⁴ runs of OFFLINEPCPD with varying ε on data generated by 200 i.i.d. samples from appropriate distributions with change point k* = 100.\n\nAcknowledgments\n\nR.C. and S.K. were supported in part by a Mozilla Research Grant. Y.M. and W.Z. were supported in part by NSF grant CMMI-1362876. R.T. was supported in part by NSF grant DMS-156443. 
R.T.'s contribution was completed while the author was visiting the Georgia Institute of Technology.