{"title": "Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers", "book": "Advances in Neural Information Processing Systems", "page_first": 4910, "page_last": 4921, "abstract": "Strong theoretical guarantees of robustness can be given for ensembles of classifiers generated by input randomization. Specifically, an $\\ell_2$ bounded adversary cannot alter the ensemble prediction generated by an additive isotropic Gaussian noise, where the radius for the adversary depends on both the variance of the distribution as well as the ensemble margin at the point of interest. We build on and considerably expand this work across broad classes of distributions. In particular, we offer adversarial robustness guarantees and associated algorithms for the discrete case where the adversary is $\\ell_0$ bounded. Moreover, we exemplify how the guarantees can be tightened with specific assumptions about the function class of the classifier such as a decision tree. We empirically illustrate these results with and without functional restrictions across image and molecule datasets.", "full_text": "Tight Certi\ufb01cates of Adversarial Robustness\n\nfor Randomly Smoothed Classi\ufb01ers\n\nGuang-He Lee1, Yang Yuan1,2, Shiyu Chang3, Tommi S. Jaakkola1\n\n1MIT Computer Science and Arti\ufb01cial Intelligence Lab\n\n2Institute for Interdisciplinary Information Sciences, Tsinghua University\n\n{guanghe, yangyuan, tommi}@csail.mit.edu, shiyu.chang@ibm.com\n\n3MIT-IBM Watson AI Lab\n\nAbstract\n\nStrong theoretical guarantees of robustness can be given for ensembles of classi\ufb01ers\ngenerated by input randomization. Speci\ufb01cally, an (cid:96)2 bounded adversary cannot\nalter the ensemble prediction generated by an additive isotropic Gaussian noise,\nwhere the radius for the adversary depends on both the variance of the distribution as\nwell as the ensemble margin at the point of interest. 
We build on and considerably expand this work across broad classes of distributions. In particular, we offer adversarial robustness guarantees and associated algorithms for the discrete case where the adversary is ℓ0 bounded. Moreover, we exemplify how the guarantees can be tightened with specific assumptions about the function class of the classifier, such as a decision tree. We empirically illustrate these results with and without functional restrictions across image and molecule datasets.1

1 Introduction

Many powerful classifiers lack robustness in the sense that a slight, potentially unnoticeable manipulation of the input features, e.g., by an adversary, can cause the classifier to change its prediction [15]. The effect is clearly undesirable in decision-critical applications. Indeed, a lot of recent work has gone into analyzing such failures together with providing certificates of robustness.

Robustness can be defined with respect to a variety of metrics that bound the magnitude or the type of adversarial manipulation. The most common approach to searching for violations is by finding an adversarial example within a small neighborhood of the example in question, e.g., using gradient-based algorithms [13, 15, 26]. The downside of such approaches is that failure to discover an adversarial example does not mean that another technique could not find one. For this reason, a recent line of work has instead focused on certificates of robustness, i.e., guarantees that ensure, for specific classes of methods, that no adversarial examples exist within a certified region. Unfortunately, obtaining exact guarantees can be computationally intractable [20, 25, 36], and guarantees that scale to realistic architectures have remained somewhat conservative [7, 27, 38, 39, 42].

Ensemble classifiers have recently been shown to yield strong guarantees of robustness [6].
The ensembles, in this case, are simply induced by randomly perturbing the input to a base classifier. The guarantees state that, given an additive isotropic Gaussian noise on the input example, an adversary cannot alter the prediction of the corresponding ensemble within an ℓ2 radius, where the radius depends on the noise variance as well as the ensemble margin at the given point [6].

In this work, we substantially extend robustness certificates for such noise-induced ensembles. We provide guarantees for alternative metrics and noise distributions (e.g., uniform), and develop a stratified likelihood ratio analysis that allows us to provide certificates of robustness over discrete spaces with respect to ℓ0 distance, which are tight and applicable to any measurable classifiers. We also introduce scalable algorithms for computing the certificates. The guarantees can be further tightened by introducing additional assumptions about the family of classifiers; we illustrate this in the context of ensembles derived from decision trees. Empirically, our ensemble classifiers yield state-of-the-art certified guarantees with respect to ℓ0 bounded adversaries across image and molecule datasets in comparison to previous methods adapted from continuous spaces.

1Project page: http://people.csail.mit.edu/guanghe/randomized_smoothing.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

2 Related Work

In a classification setting, the role of robustness certificates is to guarantee a constant classification within a local region; a certificate is always sufficient to claim robustness. When a certificate is both sufficient and necessary, it is called an exact certificate. For example, the exact ℓ2 certificate of a linear classifier is the ℓ2 distance from the given point to the classifier's decision boundary.
Below we focus the discussion on recent developments in robustness guarantees for deep networks.

Most of the exact methods are derived for piecewise linear networks, defined as network architectures with piecewise linear activation functions. This class of networks admits a mixed integer-linear representation [22], which allows the use of mixed integer-linear programming [4, 9, 14, 25, 36] or satisfiability modulo theories [3, 12, 20, 33] to find the exact adversary within an ℓq radius. However, exact methods are in general NP-complete, and thus do not scale to large problems [36].

A certificate that holds only as a sufficient condition is conservative but can be more scalable than exact methods. Such guarantees may be derived as a linear program [39, 40], a semidefinite program [30, 31], or a dual optimization problem [10, 11] through relaxation. Alternative approaches conduct layer-wise relaxations of feasible neuron values to derive the certificates [16, 27, 34, 38, 42]. Unfortunately, there is no empirical evidence of an effective certificate from the above methods in large scale problems. This does not entail that the certificates are not tight enough in practice; it might also be attributed to the fact that it is challenging to obtain a robust network in a large scale setting.

Recent works propose a new modeling scheme that ensembles a classifier via input randomization [2, 24], mostly with an additive isotropic Gaussian noise. Lecuyer et al. [21] first propose a certificate based on differential privacy, which is improved by Li et al. [23] using Rényi divergence. Cohen et al. [6] proceed with the analysis by proving the certificate that is tight with respect to all measurable classifiers based on the Neyman-Pearson lemma [28], which yields the state-of-the-art provably robust classifier.
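Concretely, when the smoothed classifier assigns the top class a probability (lower bound) p > 1/2 and the runner-up probability is bounded by 1 − p, Theorem 1 of [6] reduces to the closed form R = σΦ⁻¹(p); a minimal sketch using the standard normal inverse CDF (helper name ours):

```python
from statistics import NormalDist

def gaussian_l2_radius(p: float, sigma: float) -> float:
    """l2 certified radius sigma * Phi^{-1}(p) for a classifier smoothed with
    additive N(0, sigma^2 I) noise, given a top-class probability p > 0.5."""
    if p <= 0.5:
        return 0.0  # no certificate without a majority vote
    return sigma * NormalDist().inv_cdf(p)
```

For instance, gaussian_l2_radius(0.9, 0.25) is 0.25 × Φ⁻¹(0.9) ≈ 0.32.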
However, the tight certificate is tailored to an isotropic Gaussian distribution and the ℓ2 metric, while we generalize the result across broad classes of distributions and metrics. In addition, we show that such a tight guarantee can be tightened further with assumptions about the classifier.

Our method of certification also yields the first tight and actionable ℓ0 robustness certificates in discrete domains (cf. continuous domains, where an adversary is easy to find [15]). Robustness guarantees in discrete domains are combinatorial in nature and thus challenging to obtain. Indeed, even for simple binary vectors, verifying robustness requires checking an exponential number of predictions for any black-box model.2

3 Certification Methodology

Given an input x ∈ X, a randomization scheme φ assigns a probability mass/density Pr(φ(x) = z) to each randomized outcome z ∈ X. We can define a probabilistic classifier either by specifying the associated conditional distribution P(y|x) for a class y ∈ Y or by viewing it as a random function f(x) where the randomness in the output is independent for each x. We compose the randomization scheme φ with a classifier f to get a randomly smoothed classifier E_φ[P(y|φ(x))], where the probability of outputting a class y ∈ Y is denoted as Pr(f(φ(x)) = y) and abbreviated as p whenever f, φ, x, and y are clear from the context. Under this setting, we first develop our framework for tight robustness certificates in §3.1, exemplify the framework in §3.2-3.4, and illustrate how the guarantees can be refined with further assumptions in §3.5-3.6.
We defer all the proofs to Appendix A.

2We are aware of two concurrent works also yielding certificates in discrete domains [18, 19].

3.1 A Framework for Tight Certificates of Robustness

In this section, we develop our framework for deriving tight certificates of robustness for randomly smoothed classifiers, which will be instantiated in the following sections.

Point-wise Certificate. Given p, we first identify a tight lower bound on the probability score Pr(f(φ(x̄)) = y) for another (neighboring) point x̄ ∈ X. Here we denote the set of measurable classifiers with respect to φ as F. Without any additional assumptions on f, a lower bound can be found via the minimization problem:

ρ_{x,x̄}(p) ≜ min_{f̄ ∈ F : Pr(f̄(φ(x)) = y) = p} Pr(f̄(φ(x̄)) = y) ≤ Pr(f(φ(x̄)) = y).   (1)

Note that the bound is tight, since f itself satisfies the constraint.

Regional Certificate. We can extend the point-wise certificate ρ_{x,x̄}(p) to a regional certificate by examining the worst case x̄ over the neighboring region around x. Formally, given an ℓq metric ‖·‖_q, the neighborhood around x with radius r is defined as B_{r,q}(x) ≜ {x̄ ∈ X : ‖x − x̄‖_q ≤ r}. Assuming p = Pr(f(φ(x)) = y) > 0.5 for a y ∈ Y, a robustness certificate on the ℓq radius can be found by

R(x, p, q) ≜ sup r, s.t. min_{x̄ ∈ B_{r,q}(x)} ρ_{x,x̄}(p) > 0.5.   (2)

Essentially, the certificate R(x, p, q) entails the following robustness guarantee:

∀x̄ ∈ X : ‖x − x̄‖_q < R(x, p, q), we have Pr(f(φ(x̄)) = y) > 0.5.   (3)

When the maximum can be attained in Eq. (2) (which will be the case for the ℓ0 norm), the above < can be replaced with ≤.
Note that here we assume Pr(f(φ(x)) = y) > 0.5 and ignore the case that 0.5 ≥ Pr(f(φ(x̄)) = y) > max_{y′≠y} Pr(f(φ(x̄)) = y′). By definition, the certified radius R(x, p, q) is tight for binary classification, and provides a reasonable sufficient condition to guarantee robustness for |Y| > 2. The tight guarantee for |Y| > 2 would involve the maximum prediction probability over all the remaining classes (see Theorem 1 of [6]). However, when the prediction probability p = Pr(f(φ(x)) = y) is intractable to compute and relies on statistical estimation for each class y (e.g., when f is a deep network), the tight guarantee is statistically challenging to obtain. The actual algorithm used by Cohen et al. [6] is also a special case of Eq. (2).

3.2 A Warm-up Example: the Uniform Distribution

To illustrate the framework, we show a simple (but new) scenario where X = R^d and φ is an additive uniform noise with a parameter γ ∈ R>0:

φ(x)_i = x_i + ε_i,  ε_i i.i.d. ∼ Uniform([−γ, γ]), ∀i ∈ {1, . . . , d}.   (4)

Given two points x and x̄, as illustrated in Fig. 1, we can partition the space R^d into 4 disjoint regions: L_1 = B_{γ,∞}(x)\B_{γ,∞}(x̄), L_2 = B_{γ,∞}(x) ∩ B_{γ,∞}(x̄), L_3 = B_{γ,∞}(x̄)\B_{γ,∞}(x), and L_4 = R^d\(B_{γ,∞}(x̄) ∪ B_{γ,∞}(x)).
Accordingly, ∀f̄ ∈ F, we can rewrite Pr(f̄(φ(x)) = y) and Pr(f̄(φ(x̄)) = y) as follows:

Pr(f̄(φ(x)) = y) = Σ_{i=1}^4 ∫_{L_i} Pr(φ(x) = z) Pr(f̄(z) = y) dz = Σ_{i=1}^4 π_i ∫_{L_i} Pr(f̄(z) = y) dz,
Pr(f̄(φ(x̄)) = y) = Σ_{i=1}^4 ∫_{L_i} Pr(φ(x̄) = z) Pr(f̄(z) = y) dz = Σ_{i=1}^4 π̄_i ∫_{L_i} Pr(f̄(z) = y) dz,

Figure 1: Uniform distributions.

where π_{1:4} = ((2γ)^{−d}, (2γ)^{−d}, 0, 0) and π̄_{1:4} = (0, (2γ)^{−d}, (2γ)^{−d}, 0). With this representation, it is clear that, in order to solve Eq. (1), we only have to consider the integral behavior of f̄ within each region L_1, . . . , L_4. Concretely, we have:

ρ_{x,x̄}(p) = min_{f̄ ∈ F : Σ_{i=1}^4 π_i ∫_{L_i} Pr(f̄(z) = y) dz = p} Σ_{i=1}^4 π̄_i ∫_{L_i} Pr(f̄(z) = y) dz
= min_{g : {1,2,3,4} → [0,1], π_1|L_1|g(1) + π_2|L_2|g(2) = p} π̄_2|L_2|g(2) + π̄_3|L_3|g(3)
= min_{g : {1,2,3,4} → [0,1], π_1|L_1|g(1) + π_2|L_2|g(2) = p} π̄_2|L_2|g(2),

where the second equality filters out the components with π_i = 0 or π̄_i = 0, and the last equality holds because g(3) is unconstrained and minimizes the objective at g(3) = 0. Since π_2 = π̄_2,

ρ_{x,x̄}(p) = 0, if 0 ≤ p ≤ π_1|L_1| = Pr(φ(x) ∈ L_1),
ρ_{x,x̄}(p) = p − π_1|L_1|, if 1 ≥ p > π_1|L_1| = Pr(φ(x) ∈ L_1).

To obtain the regional certificate, the minimizers of min_{x̄ ∈ B_{r,q}(x)} ρ_{x,x̄}(p) are simply the points x̄ that maximize the volume of L_1 = B_{γ,∞}(x)\B_{γ,∞}(x̄).
Accordingly,

Proposition 1. If φ(·) is defined as in Eq. (4), we have R(x, p, q = 1) = 2pγ − γ and R(x, p, q = ∞) = 2γ − 2γ(1.5 − p)^{1/d}.

Discussion. Our goal here was to illustrate how certificates can be computed for the uniform distribution using our technique. However, the certified radius itself is inadequate in this case; for example, R(x, p, q = 1) ≤ γ, which arises from the bounded support of the uniform distribution. The derivation nevertheless provides some insight into how one can compute the point-wise certificate ρ_{x,x̄}(p). The key step is to partition the space into regions L_1, . . . , L_4, where the likelihoods Pr(φ(x) = z) and Pr(φ(x̄) = z) are both constant within each region L_i. This property allows us to substantially reduce the optimization problem in Eq. (1) to finding a single probability value g(i) ∈ [0, 1] for each region L_i.

3.3 A General Lemma for Point-wise Certificate

In this section, we generalize the idea in §3.2 to find the point-wise certificate ρ_{x,x̄}(p). For each point z ∈ X, we define the likelihood ratio η_{x,x̄}(z) ≜ Pr(φ(x) = z)/Pr(φ(x̄) = z).3 If we can partition X into n regions L_1, . . . , L_n with ∪_{i=1}^n L_i = X for some n ∈ Z>0, such that the likelihood ratio within each region L_i is a constant η_i ∈ [0, ∞], i.e., η_{x,x̄}(z) = η_i, ∀z ∈ L_i, then we can sort the regions such that η_1 ≥ η_2 ≥ · · · ≥ η_n. Note that X can still be uncountable (see the example in §3.2).

Informally, we can always "normalize" f̄ so that it predicts a constant probability value g(i) ∈ [0, 1] within each likelihood ratio region L_i; this preserves the integral over each L_i and thus over X, generalizing the scenario in §3.2.
Moreover, to minimize Pr(f̄(φ(x̄)) = y) under a fixed budget Pr(f̄(φ(x)) = y), as in Eq. (1), it is advantageous to set f̄(z) to y in regions with a high likelihood ratio. These arguments suggest a greedy algorithm for solving Eq. (1): iteratively assign f̄(z) = y, ∀z ∈ L_i for i ∈ (1, 2, . . .) until the budget constraint is met. Formally,

Lemma 2. ∀x, x̄ ∈ X and p ∈ [0, 1], let H∗ ≜ min{H ∈ {1, . . . , n} : Σ_{i=1}^H Pr(φ(x) ∈ L_i) ≥ p}. Then η_{H∗} > 0, any f∗ satisfying Eq. (5) is a minimizer of Eq. (1),

∀i ∈ {1, 2, . . . , n}, ∀z ∈ L_i:  Pr(f∗(z) = y) = 1 if i < H∗;  Pr(f∗(z) = y) = (p − Σ_{i=1}^{H∗−1} Pr(φ(x) ∈ L_i)) / Pr(φ(x) ∈ L_{H∗}) if i = H∗;  Pr(f∗(z) = y) = 0 if i > H∗,   (5)

and ρ_{x,x̄}(p) = Σ_{i=1}^{H∗−1} Pr(φ(x̄) ∈ L_i) + (p − Σ_{i=1}^{H∗−1} Pr(φ(x) ∈ L_i)) / η_{H∗}.

We remark that Eq. (1) and Lemma 2 can be interpreted as likelihood ratio testing [28], by casting Pr(φ(x) = z) and Pr(φ(x̄) = z) as the likelihoods of two hypotheses with significance level p. We refer the readers to [37] for a similar lemma derived in the language of hypothesis testing.

Remark 3. ρ_{x,x̄}(p) is an increasing continuous function of p; if η_1 < ∞, ρ_{x,x̄}(p) is a strictly increasing continuous function of p; if η_1 < ∞ and η_n > 0, ρ_{x,x̄} : [0, 1] → [0, 1] is a bijection.

Remark 3 will be used in §3.4 to derive an efficient algorithm to compute robustness certificates.

Discussion. Given L_i, Pr(φ(x) ∈ L_i), and Pr(φ(x̄) ∈ L_i), ∀i ∈ [n], Lemma 2 provides an O(n) method to compute ρ_{x,x̄}(p).
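The greedy construction in Lemma 2 can be sketched directly once the region probabilities are available (a minimal illustration; the function name is ours):

```python
def point_wise_certificate(regions, p):
    """rho_{x,xbar}(p) in the spirit of Lemma 2: fill regions with f(z) = y in
    decreasing likelihood-ratio order until the budget Pr(f(phi(x)) = y) = p
    is spent.

    regions: list of pairs (Pr(phi(x) in L_i), Pr(phi(xbar) in L_i))."""
    order = sorted(regions,
                   key=lambda t: t[0] / t[1] if t[1] > 0 else float("inf"),
                   reverse=True)
    budget, rho = p, 0.0
    for pr_x, pr_xbar in order:
        if budget <= 0:
            break
        if pr_x == 0:
            continue  # such a region never helps the minimizer; leave g(i) = 0
        g = min(1.0, budget / pr_x)  # constant prediction value on this region
        rho += g * pr_xbar
        budget -= g * pr_x
    return rho
```

For the uniform example of §3.2 with d = 1, γ = 1, and x̄ = x + 0.5, the region probabilities are (0.25, 0), (0.75, 0.75), (0, 0.25), (0, 0), and the sketch returns p − 0.25 for p > 0.25, matching the closed form derived there.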
For any actual randomization φ, the key is to find a partition L_1, . . . , L_n such that Pr(φ(x) ∈ L_i) and Pr(φ(x̄) ∈ L_i) are easy to compute. Having constant likelihoods in each L_i, i.e., Pr(φ(x) = z) = Pr(φ(x) = z′), ∀z, z′ ∈ L_i (cf. only having a constant likelihood ratio η_i), is one way to simplify Pr(φ(x) ∈ L_i) = |L_i| Pr(φ(x) = z), and similarly for Pr(φ(x̄) ∈ L_i).

3If Pr(φ(x̄) = z) = Pr(φ(x) = z) = 0, η_{x,x̄}(z) can be defined arbitrarily in [0, ∞] without affecting the solution in Lemma 2.

3.4 A Discrete Distribution for ℓ0 Robustness

We consider ℓ0 robustness guarantees in a discrete space X = {0, 1/K, 2/K, . . . , 1}^d for some K ∈ Z>0;4 we define the following discrete distribution with a parameter α ∈ (0, 1), independent and identically distributed for each dimension i ∈ {1, 2, . . . , d}:

Pr(φ(x)_i = x_i) = α;  Pr(φ(x)_i = z) = (1 − α)/K ≜ β ∈ (0, 1/K), if z ∈ {0, 1/K, 2/K, . . . , 1} and z ≠ x_i.   (6)

Here φ(·) can be regarded as a composition of a Bernoulli random variable and a uniform random variable. Due to the symmetry of the randomization with respect to all the configurations of x, x̄ ∈ X such that ‖x − x̄‖_0 = r (for some r ∈ Z≥0), we have the following lemma for the equivalence of ρ_{x,x̄}:

Lemma 4. If φ(·) is defined as in Eq. (6), given r ∈ Z≥0, define the canonical vectors x_C ≜ (0, 0, · · · , 0) and x̄_C ≜ (1, 1, · · · , 1, 0, 0, · · · , 0), where ‖x̄_C‖_0 = r. Let ρ_r ≜ ρ_{x_C, x̄_C}.
Then for all x, x̄ such that ‖x − x̄‖_0 = r, we have ρ_{x,x̄} = ρ_r.

Based on Lemma 4, to find R(x, p, q) for a given p, it suffices to find the maximum r such that ρ_r(p) > 0.5. Since the likelihood ratio η_{x,x̄}(z) is always positive and finite, the inverse ρ_r^{−1} exists (due to Remark 3), which allows us to pre-compute ρ_r^{−1}(0.5) and check p > ρ_r^{−1}(0.5) for each r ∈ Z≥0, instead of computing ρ_r(p) for each given p and r. Then R(x, p, q) is simply the maximum r such that p > ρ_r^{−1}(0.5). Below we discuss how to compute ρ_r^{−1}(0.5) in a scalable way. Our first step is to identify a set of likelihood ratio regions L_1, . . . , L_n such that Pr(φ(x) ∈ L_i) and Pr(φ(x̄) ∈ L_i), as used in Lemma 2, can be computed efficiently. Note that, due to Lemma 4, it suffices to consider x_C, x̄_C such that ‖x̄_C‖_0 = r throughout the derivation.

For an ℓ0 radius r ∈ Z≥0, ∀(u, v) ∈ {0, 1, . . . , d}², we construct the region

L(u, v; r) ≜ {z ∈ X : Pr(φ(x_C) = z) = α^{d−u} β^u, Pr(φ(x̄_C) = z) = α^{d−v} β^v},   (7)

which contains the points that can be obtained by "flipping" u coordinates from x_C or, equivalently, v coordinates from x̄_C. See Figure 2 for an illustration (drawn with u = 4, v = 5, d = 7, r = 4), where different colors represent different types of coordinates: orange means both x_C and x̄_C are flipped on this coordinate and they were initially the same; red means both are flipped and were initially different; green means only x_C is flipped; and blue means only x̄_C is flipped. By denoting the numbers of these coordinates as i, j∗, u − i − j∗, and v − i − j∗, respectively,
By denoting the numbers of these coordinates as i, j\u2217, u \u2212 i \u2212 j\u2217, v \u2212 i \u2212 j\u2217, respectively,\nwe have the following formula for computing the cardinality of each region |L(u, v; r)|.\nLemma 5. For any u, v \u2208 {0, 1, . . . , d}, u \u2264 v, r \u2208 Z\u22650 we have |L(u, v; r)| = |L(v, u; r)|, and\n\nmin(u,d\u2212r,(cid:98) u+v\u2212r\n\n(cid:99))(cid:88)\n\n2\n\ni=max{0,v\u2212r}\n\nwhere j\u2217 (cid:44) u + v \u2212 2i \u2212 r.\n\n|L(u, v; r)| =\n\n(K \u2212 1)j\u2217\n\nr!\n\n(u \u2212 i \u2212 j\u2217)!(v \u2212 i \u2212 j\u2217)!j\u2217!\n\nK i(d \u2212 r)!\n(d \u2212 r \u2212 i)!i!\n\n,\n\nTherefore, for a \ufb01xed r, the complexity of computing all the cardinalities |L(u, v; r)| is \u0398(d3).\nSince each region L(u, v; r) has a constant likelihood ratio \u03b1v\u2212u\u03b2u\u2212v and we have \u222ad\nL(u, v; r) = X , we can apply the regions to \ufb01nd the function \u03c1x, \u00afx = \u03c1r via Lemma 2. Under\nthis representation, the number of nonempty likelihood ratio regions n is bounded by (d + 1)2, the\nperturbation probability Pr(\u03c6(x) \u2208 L(u, v; r)) used in Lemma 2 is simply \u03b1d\u2212u\u03b2u|L(u, v; r)|, and\nsimilarly for the Pr(\u03c6( \u00afx) \u2208 L(u, v; r)). Based on Lemma 2 and Lemma 5, we may use a for-loop\nto compute the bijection \u03c1r(\u00b7) for the input p until \u03c1r(p) = 0.5, and return the corresponding p as\n\u03c1\u22121\nr (0.5). The procedure is illustrated in Algorithm 1.\n\nu=0 \u222ad\n\nv=0\n\n4More generally, the method applies to the (cid:96)0 / Hamming distance in a Hamming space (i.e., \ufb01xed length\n\nsequences of tokens from a discrete set, e.g., (\u266010,\u2660J,\u2660Q,\u2660K,\u2660A) \u2208 {\u2660A,\u2660K, ...,\u26632}5).\n\n5\n\n\fratio\n\nAlgorithm 1 Computing \u03c1\u22121\n1: sort {(ui, vi)}n\n\nr (0.5)\ni=1 by likelihood\n\n2: p, \u03c1r = 0, 0\n3: for i = 1, . . . 
, n do\np(cid:48) = \u03b1d\u2212ui\u03b2ui\n4:\n\u03c1(cid:48)\nr = \u03b1d\u2212vi \u03b2vi\n5:\nr \u00d7 |L(ui, vi; r)|\n6: \u2206\u03c1r = \u03c1(cid:48)\nif \u03c1r + \u2206\u03c1r < 0.5 then\n7:\n8:\n9:\n10:\n11:\n12:\nend if\n13:\n14: end for\n\nScalable implementation.\nIn practice, Algorithm 1 can\nbe challenging to implement; the probability values (e.g.,\n\u03b1d\u2212u\u03b2u) can be extremely small, which is infeasible to be\ncomputationally represented using \ufb02oating points. If we set\n\u03b1 to be a rational number, both \u03b1 and \u03b2 can be represented\nin fractions, and thus all the corresponding probability values\ncan be represented by two (large) integers; we also observe\nthat computing the (large) cardinality |L(u, v; r)| is feasible\nin modern large integer computation frameworks in practice\n(e.g., python), which motivates us to adapt the computation\nin Algorithm 1 to large integers.\nFor simplicity, we assume \u03b1 = \u03b1(cid:48)/100 with some \u03b1(cid:48) \u2208\nZ : 100 \u2265 \u03b1(cid:48) \u2265 0. If we de\ufb01ne \u02dc\u03b1 (cid:44) 100K\u03b1 \u2208 Z, \u02dc\u03b2 (cid:44)\n100K\u03b2 \u2208 Z, we may implement Algorithm 1 in terms of the\nnon-normalized, integer version \u02dc\u03b1, \u02dc\u03b2. Speci\ufb01cally, we replace\n\u03b1, \u03b2 and the constant 0.5 with \u02dc\u03b1, \u02dc\u03b2 and 50K \u00d7 (100K)d\u22121,\nrespectively. Then all the computations in Algorithm 1 can be\nr. Since the division is bounded by |L(ui, vi; r)|\ntrivially adapted except the division (0.5 \u2212 \u03c1r)/\u03c1(cid:48)\n(see the comparison between line 9 and line 11), we can implement the division by a binary search\nover {1, 2 . . . ,|L{mi, ni}|}, which will result in an upper bound with an error bounded by \u03c1(cid:48)\nr in\nthe original space, which is in turn bounded by \u03b1d assuming \u03b1 > \u03b2. 
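A sketch of this exact-arithmetic idea, with Python's built-in rationals (fractions.Fraction) playing the role of the large-integer computation; the function names are ours, and the sketch favors clarity over the exact bookkeeping above:

```python
from fractions import Fraction
from math import comb, factorial

def region_size(u, v, d, r, K):
    """|L(u, v; r)| from Lemma 5, computed exactly with Python integers."""
    u, v = min(u, v), max(u, v)  # |L(u, v; r)| = |L(v, u; r)|
    total = 0
    for i in range(max(0, v - r), min(u, d - r, (u + v - r) // 2) + 1):
        j = u + v - 2 * i - r  # j* in Lemma 5
        a, b = u - i - j, v - i - j
        if j < 0 or a < 0 or b < 0:
            continue
        total += ((K - 1) ** j * factorial(r)
                  // (factorial(a) * factorial(b) * factorial(j))
                  * K ** i * comb(d - r, i))
    return total

def rho_inv_half(d, r, K, alpha: Fraction):
    """rho_r^{-1}(1/2) via Algorithm 1, with exact rational arithmetic so the
    tiny probabilities alpha^{d-u} beta^u never underflow."""
    beta = (1 - alpha) / K
    regions = [(u, v) for u in range(d + 1) for v in range(d + 1)]
    # likelihood ratio is (alpha/beta)^(v-u): sort it in decreasing order
    regions.sort(key=lambda uv: uv[1] - uv[0], reverse=(alpha > beta))
    p, rho, half = Fraction(0), Fraction(0), Fraction(1, 2)
    for u, v in regions:
        size = region_size(u, v, d, r, K)
        if size == 0:
            continue
        p_unit = alpha ** (d - u) * beta ** u    # Pr(phi(x_C) = z) on the region
        rho_unit = alpha ** (d - v) * beta ** v  # Pr(phi(xbar_C) = z)
        if rho + rho_unit * size < half:
            rho += rho_unit * size
            p += p_unit * size
        else:
            return p + p_unit * (half - rho) / rho_unit
    return p
```

The certified ℓ0 radius at a point with prediction probability p is then the largest r with p > rho_inv_half(d, r, K, alpha); for instance, r = 0 gives exactly 1/2, as it must.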
Finally, to map the computed, unnormalized ρ̃_r^{−1}(0.5) back to the original space, we find an upper bound of ρ_r^{−1}(0.5) up to a precision of 10^{−c} for some c ∈ Z>0 (we set c = 20 in the experiments): we find the smallest upper bound ρ̃_r^{−1}(0.5) ≤ ρ̂ × (10K)^c (100K)^{d−c} over ρ̂ ∈ {1, 2, . . . , 10^c} via binary search, and report an upper bound of ρ_r^{−1}(0.5) as ρ̂ × 10^{−c}, with an error bounded by 10^{−c} + α^d in total. Note that an upper bound of ρ_r^{−1}(0.5) is still a valid certificate.

As a side note, simply computing the probabilities in the log-domain will lead to uncontrollable approximate results due to floating point arithmetic; using large integers to ensure a verifiable approximation error in Algorithm 1 is necessary to ensure a computationally accurate certificate.

3.5 Connection Between the Discrete Distribution and an Isotropic Gaussian Distribution

When the inputs are binary vectors X = {0, 1}^d, one may still apply the prior work [6] with an additive isotropic Gaussian noise φ to obtain an ℓ0 certificate, since there is a bijection between ℓ0 and ℓ2 distances in {0, 1}^d.
If one uses a denoising function ζ(·) that projects each randomized coordinate φ(x)_i ∈ R back to the space {0, 1} using the (likelihood ratio testing) rule

ζ(φ(x))_i = I{φ(x)_i > 0.5}, ∀i ∈ [d],

then the composition ζ ◦ φ is equivalent to our discrete randomization scheme with α = Φ(0.5; μ = 0, σ²), where Φ(·; μ, σ²) is the CDF of the Gaussian distribution with mean μ and variance σ².

If one applies a classifier upon the composition (or, equivalently, the discrete randomization scheme), then the certificate obtained via the discrete distribution is always tighter than the one via the Gaussian distribution. Concretely, denoting by F_ζ ⊂ F the set of measurable functions with respect to the Gaussian distribution that can be written as a composition f̄′ ◦ ζ for some f̄′, we have

min_{f̄ ∈ F_ζ : Pr(f̄(φ(x)) = y) = p} Pr(f̄(φ(x̄)) = y) ≥ min_{f̄ ∈ F : Pr(f̄(φ(x)) = y) = p} Pr(f̄(φ(x̄)) = y),

where the LHS corresponds to the certificate derived from the discrete distribution (i.e., applying ζ to an isotropic Gaussian), and the RHS corresponds to the certificate from the Gaussian distribution.

3.6 A Certificate with Additional Assumptions

In the previous analyses, we assume nothing but the measurability of the classifier. If we further make assumptions about the functional class of the classifier, we can obtain a tighter certificate than the ones outlined in §3.1. Assuming an extra denoising step in the classifier over an additive Gaussian noise, as illustrated in §3.5, is one example.
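For that example, the α induced by a Gaussian with standard deviation σ has a one-line closed form (helper name ours):

```python
from statistics import NormalDist

def alpha_from_sigma(sigma: float) -> float:
    """alpha = Phi(0.5; mu=0, sigma^2): the probability that a coordinate of
    x + N(0, sigma^2 I) is thresholded at 0.5 back to its original binary
    value, so the composition with the denoiser matches Eq. (6) with K = 1."""
    return NormalDist(mu=0.0, sigma=sigma).cdf(0.5)
```

For instance, σ = 0.5 induces α = Φ(1) ≈ 0.84.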
We assume that the inputs are binary vectors\nX = {0, 1}d, the outputs are binary Y = {0, 1}, and that the classi\ufb01er is a decision tree that each\ninput coordinate can be used at most once in the entire tree. Under the discrete randomization scheme,\nthe prediction probability can be computed via tree recursion, since a decision tree over the discrete\nrandomization scheme can be interpreted as assigning a probability of visiting the left child and the\nright child for each decision node. To elaborate, we denote idx[i], left[i], and right[i] as the split\nfeature index, the left child and the right child of the ith node. Without loss of generality, we assume\nthat each decision node i routes its input to the right branch if xidx[i] = 1. Then Pr(f (\u03c6(x)) = 1)\ncan be found by the recursion\n\nI{xidx[i]=1}\u03b2\n\nI{xidx[i]=0}pred[right[i]] + \u03b1\n\nI{xidx[i]=0}\u03b2\n\nI{xidx[i]=1}pred[left[i]],\n\npred[i] = \u03b1\n\n(8)\nwhere the boundary condition is the output of the leaf nodes. Effectively, we are recursively\naggregating the partial solutions found in the left subtree and the right subtree rooted at each node i,\nand pred[root] is the \ufb01nal prediction probability. Note that changing one input coordinate in xk is\nequivalent to changing the recursion in the corresponding unique node i(cid:48) (if exists) that uses feature k\nas the splitting index, which gives\nI{xidx[i(cid:48) ]=0}pred[left[i(cid:48)]].\npred[i(cid:48)] = \u03b1\nIn addition, changes in the left subtree do not affect the partial solution found in the right subtree,\nand vice versa. Hence, we may use dynamic programming to \ufb01nd the exact adversary under each (cid:96)0\nradius r by aggregating the worst case changes found in the left subtree and the right subtree rooted\nat each node i. 
See Appendix B.1 for details.

4 Learning and Prediction in Practice

Since we focus on the development of certificates, here we only briefly discuss how we train the classifiers and compute the prediction probability Pr(f(φ(x)) = y) in practice.

Deep networks: We follow the approach proposed in prior work [21]: training is conducted on samples drawn from the randomization scheme via a cross entropy loss. The prediction probability Pr(f(φ(x)) = y) is estimated by the lower bound of the Clopper-Pearson Bernoulli confidence interval [5] with 100K samples drawn from the distribution and a 99.9% confidence level. Since ρ_{x,x̄}(p) is an increasing function of p (Remark 3), a lower bound of p entails a valid certificate.

Decision trees: We train the decision tree greedily in a breadth-first ordering with a depth limit; for each split, we only search coordinates that have not been used before, to enforce the functional constraint in §3.6, and optimize a weighted Gini index, which weights each training example x by the probability that it is routed to the node by the discrete randomization. The details of the training algorithm are in Appendix B.2. The prediction probability is computed by Eq. (8).

5 Experiment

In this section, we validate the robustness certificates of the proposed discrete distribution (D) in the ℓ0 norm. We compare to the state-of-the-art additive isotropic Gaussian noise (N) [6], since an ℓ0 certificate with radius r in X = {0, 1/K, . . . , 1}^d can be obtained from an ℓ2 certificate with radius √r. Note that the ℓ0 certificate derived from the Gaussian distribution is still tight with respect to all the measurable classifiers (see Theorem 1 in [6]).
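That conversion can be sketched as follows, using the fact that the Gaussian certificate takes the form σΦ⁻¹(p) in the binary (p > 1/2) case, so the implied ℓ0 radius is the largest r with √r at most the ℓ2 radius (helper name ours):

```python
from math import floor
from statistics import NormalDist

def l0_radius_from_gaussian(p: float, sigma: float) -> int:
    """l0 certificate implied by a Gaussian l2 certificate: an adversary
    changing r coordinates of a binary vector moves it by sqrt(r) in l2, so
    the largest certified l0 radius is floor((sigma * Phi^{-1}(p))^2)."""
    if p <= 0.5:
        return 0
    return floor((sigma * NormalDist().inv_cdf(p)) ** 2)
```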
We consider the following evaluation measures:
• μ(R): the average certified ℓ0 radius R(x, p, q) (with respect to the labels) across the testing set.
• ACC@r: the certified accuracy within a radius r (the average of I{R(x, p, q) ≥ r} over the testing set).

5.1 Binarized MNIST

We use a 55,000/5,000/10,000 split of the MNIST dataset for training/validation/testing. For each data point x in the dataset, we binarize each coordinate by thresholding at 0.5. Experiments are conducted on randomly smoothed CNN models; the implementation details are in Appendix C.1.

Table 1: Randomly smoothed CNN models on the MNIST dataset. The first two rows refer to the same model with certificates computed via different methods (see details in §3.5).

φ   Certificate   μ(R)    ACC@r
                          r=1     r=2     r=3     r=4     r=5     r=6     r=7
D   D             3.456   0.921   0.774   0.539   0.524   0.357   0.202   0.097
D   N [6]         1.799   0.830   0.557   0.272   0.119   0.021   0.000   0.000
N   N [6]         2.378   0.884   0.701   0.464   0.252   0.078   0.000   0.000

Table 2: The guaranteed accuracy of randomly smoothed ResNet50 models on ImageNet.

φ and certificate   ACC@r
                    r=1     r=2     r=3     r=4     r=5     r=6     r=7
D                   0.538   0.394   0.338   0.274   0.234   0.190   0.176
N [6]               0.372   0.292   0.226   0.194   0.170   0.154   0.138

The results are shown in Table 1. For the same randomly smoothed CNN model (the 1st and 2nd rows in Table 1), our certificates are consistently better than the ones derived from the Gaussian distribution (see §3.5). The gap between the average certified radii is about 1.7 in ℓ0 distance, and the gap between the certified accuracies can be as large as 0.4.
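The two evaluation measures above can be computed directly from per-example certified radii; a minimal sketch (assuming, as in our evaluation, that the radius is taken as 0 whenever the prediction itself is wrong):

```python
def certified_measures(radii, max_r=7):
    """mu(R): average certified l0 radius across the testing set.
    ACC@r: fraction of test points with certified radius >= r."""
    n = len(radii)
    mu = sum(radii) / n
    acc = {r: sum(1 for R in radii if R >= r) / n for r in range(1, max_r + 1)}
    return mu, acc
```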
Compared to the models trained with Gaussian noise (the 3rd row in Table 1), our model is also consistently better in terms of both measures.
Since the above comparison between our certificates and the Gaussian-based certificates is relative, we conduct an exhaustive search over all possible adversaries within ℓ0 radii 1 and 2 to study the tightness against the exact certificate. The resulting certified accuracies at radii 1 and 2 are 0.954 and 0.926, respectively, which suggests that our certificate is reasonably tight at r = 1 (0.954 vs. 0.921), but still too pessimistic at r = 2 (0.926 vs. 0.774). This is expected, since the certificate must hold over all measurable functions for the discrete distribution. A tighter certificate requires additional assumptions on the classifier, such as the example in §3.6.

5.2 ImageNet

We conduct experiments on ImageNet [8], a large-scale image dataset with 1,000 labels. Following common practice, we consider the input space X = {0, 1/255, ..., 1}^{224×224×3} by scaling the images. We consider the same ResNet50 classifier [17] and learning procedure as Cohen et al. [6], with the only modification being the noise distribution. The details and visualizations can be found in Appendix C.2. For comparison, we report the best guaranteed accuracy of each method for each ℓ0 radius r in Table 2. Our model outperforms the competitor by a large margin at r = 1 (0.538 vs. 0.372), and consistently outperforms the baseline across different radii.

Analysis. We analyze our method on ImageNet in terms of 1) the number n of nonempty likelihood ratio regions L(u, v; r) in Algorithm 1, 2) the pre-computed ρ_r^{-1}(0.5), and 3) the certified accuracy at each α. The results are in Figure 3.
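The exhaustive search used in §5.1 to probe tightness can be sketched as follows; here `predict` stands for the smoothed classifier's hard prediction (a name we introduce for illustration), and the enumeration is only feasible for small radii:

```python
from itertools import combinations

def exact_l0_certificate(x, predict, r):
    """Check by brute force whether any perturbation of at most r binary
    coordinates changes predict(x); returns True iff the prediction is
    certifiably robust at l0 radius r. Exponential in r, so only
    feasible for small radii (r = 1, 2 in our experiments)."""
    d = len(x)
    y = predict(x)
    for k in range(1, r + 1):
        for idx in combinations(range(d), k):
            z = list(x)
            for i in idx:
                z[i] = 1 - z[i]  # flip the binary coordinate
            if predict(z) != y:
                return False
    return True
```

For example, a majority vote over three bits is robust to any single flip of [1, 1, 1], but not to two flips.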
For reproducibility, the detailed accuracy numbers of 3) are available in Table 3 in Appendix C.2, and the pre-computed ρ_r^{-1}(0.5) is available at our code repository. 1) The number n of nonempty likelihood ratio regions is much smaller than the bound (d + 1)^2 = (3 × 224 × 224)^2 for small radii. 2) The value ρ_r^{-1}(0.5) approaches 1 more rapidly for a higher α value than for a lower one. Note that ρ_r^{-1}(0.5) only reaches 1 when r = d, due to Remark 3. Computing ρ_r^{-1}(0.5) with large-integer arithmetic is time-consuming, taking about 4 days for each α and r, but this can be trivially parallelized across different α and r.5 For each radius r and randomization parameter α, note that the 4-day computation only has to be done once, and the pre-computed ρ_r^{-1}(0.5) can be applied to any ImageNet-scale images and models. 3) The certified accuracy behaves nonlinearly across different radii; relatively, a high α value exhibits high certified accuracy at small radii and low certified accuracy at large radii, and vice versa.

5 As a side note, computing ρ_r^{-1}(0.5) on MNIST takes less than 1 second for each α and r.

Figure 3: Analysis of the proposed method on the ImageNet dataset. (a) the number of nonempty L(u, v; r); (b) ρ_r^{-1}(0.5) for each α; (c) the certified accuracy for each α.

Figure 4: The guaranteed AUC on the Bace dataset across different ℓ0 radii r ((a) r = 1, (b) r = 2, (c) r = 3) and the ratio of testing data that the adversary can manipulate.

5.3 Chemical Property Prediction

The experiment is conducted on the Bace dataset [35], a binary classification dataset for biophysical property prediction on molecules.
We use Morgan fingerprints [32] to represent molecules, which are commonly used binary features [41] indicating the presence of various chemical substructures. The dimension of the features (fingerprints) is 1,024. Here we focus on an ablation study comparing the proposed randomly smoothed decision tree with a vanilla decision tree, where the adversary is found by the dynamic programming of §3.6 (thus the exact worst case) and by a greedy search, respectively. More details can be found in Appendix C.3.
Since chemical property prediction is typically evaluated via AUC [41], we define a robust version of AUC that takes into account the radius of the adversary as well as the ratio of testing data that can be manipulated. Note that to maximally decrease the AUC score via a positive (negative) example, the adversary only has to maximally decrease (increase) its prediction probability, regardless of the scores of the other examples. Hence, given an ℓ0 radius r and a ratio of testing data, we first compute the adversary for each testing point, and then find the combination of adversaries and clean data under the ratio constraint that leads to the worst AUC score. See details in Appendix C.4.
The results are in Figure 4. Empirically, the adversary of the vanilla decision tree at r = 1 always changes the prediction probability of a positive (negative) example to 0 (1). Hence, the plots of the decision tree model are constant across different ℓ0 radii. The randomly smoothed decision tree is consistently more robust than the vanilla decision tree model. We also compare the exact certificate of the prediction probability with the one derived from Lemma 2; the average difference across the training data is 0.358 and 0.402 when r equals 1 and 2, respectively.
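A brute-force sketch of this worst-case AUC (the exact selection procedure is in Appendix C.4; helper names are ours): each example either keeps its clean score or is replaced by its adversarial score, subject to the manipulation budget.

```python
from itertools import combinations

def auc(scores, labels):
    """Empirical AUC: fraction of (positive, negative) pairs ranked
    correctly, counting ties as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = 0.0
    for p in pos:
        for n in neg:
            total += 1.0 if p > n else (0.5 if p == n else 0.0)
    return total / (len(pos) * len(neg))

def worst_case_auc(scores, labels, adv_scores, budget):
    """Brute-force the worst AUC when the adversary may replace the
    scores of at most `budget` examples with their adversarial scores
    (adv_scores[i] <= scores[i] for a positive example, and
    adv_scores[i] >= scores[i] for a negative one)."""
    n = len(scores)
    worst = auc(scores, labels)
    for k in range(1, budget + 1):
        for subset in combinations(range(n), k):
            perturbed = list(scores)
            for i in subset:
                perturbed[i] = adv_scores[i]
            worst = min(worst, auc(perturbed, labels))
    return worst
```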
The phenomenon encourages the development of a classifier-aware guarantee that is tighter than the classifier-agnostic guarantee.

6 Conclusion

We present a stratified approach to certifying the robustness of randomly smoothed classifiers, where the robustness guarantees can be obtained at various resolutions and from various perspectives, ranging from a point-wise certificate to a regional certificate, and from general results to specific examples. The hierarchical investigation opens up many avenues for future extensions at different levels.

Acknowledgments

GH and TJ were in part supported by a grant from Siemens Corporation.

References

[1] G. W. Bemis and M. A. Murcko. The properties of known drugs. 1. Molecular frameworks. Journal of Medicinal Chemistry, 39(15):2887–2893, 1996.

[2] X. Cao and N. Z. Gong. Mitigating evasion attacks to deep neural networks via region-based classification. In Proceedings of the 33rd Annual Computer Security Applications Conference, pages 278–287. ACM, 2017.

[3] N. Carlini, G. Katz, C. Barrett, and D. L. Dill. Provably minimally-distorted adversarial examples. arXiv preprint arXiv:1709.10207, 2017.

[4] C.-H. Cheng, G. Nührenberg, and H. Ruess.
Maximum resilience of artificial neural networks. In International Symposium on Automated Technology for Verification and Analysis, pages 251–268. Springer, 2017.

[5] C. J. Clopper and E. S. Pearson. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26(4):404–413, 1934.

[6] J. M. Cohen, E. Rosenfeld, and J. Z. Kolter. Certified adversarial robustness via randomized smoothing. In the 36th International Conference on Machine Learning, 2019.

[7] F. Croce, M. Andriushchenko, and M. Hein. Provable robustness of ReLU networks via maximization of linear regions. In the 22nd International Conference on Artificial Intelligence and Statistics, 2018.

[8] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009.

[9] S. Dutta, S. Jha, S. Sankaranarayanan, and A. Tiwari. Output range analysis for deep feedforward neural networks. In NASA Formal Methods Symposium, pages 121–138. Springer, 2018.

[10] K. Dvijotham, S. Gowal, R. Stanforth, R. Arandjelovic, B. O'Donoghue, J. Uesato, and P. Kohli. Training verified learners with learned verifiers. arXiv preprint arXiv:1805.10265, 2018.

[11] K. Dvijotham, R. Stanforth, S. Gowal, T. Mann, and P. Kohli. A dual approach to scalable verification of deep networks. In the 34th Annual Conference on Uncertainty in Artificial Intelligence, 2018.

[12] R. Ehlers. Formal verification of piece-wise linear feed-forward neural networks. In International Symposium on Automated Technology for Verification and Analysis, pages 269–286. Springer, 2017.

[13] C. Finlay, A.-A. Pooladian, and A. M. Oberman. The logbarrier adversarial attack: making effective use of decision boundary information. arXiv preprint arXiv:1903.10396, 2019.

[14] M. Fischetti and J. Jo. Deep neural networks and mixed integer linear optimization. Constraints, 23:296–309, 2018.

[15] I. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015.

[16] S. Gowal, K. Dvijotham, R. Stanforth, R. Bunel, C. Qin, J. Uesato, T. Mann, and P. Kohli. On the effectiveness of interval bound propagation for training verifiably robust models. arXiv preprint arXiv:1810.12715, 2018.

[17] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.

[18] P.-S. Huang, R. Stanforth, J. Welbl, C. Dyer, D. Yogatama, S. Gowal, K. Dvijotham, and P. Kohli. Achieving verified robustness to symbol substitutions via interval bound propagation. arXiv preprint arXiv:1909.01492, 2019.

[19] R. Jia, A. Raghunathan, K. Göksel, and P. Liang. Certified robustness to adversarial word substitutions. arXiv preprint arXiv:1909.00986, 2019.

[20] G. Katz, C. Barrett, D. L. Dill, K. Julian, and M. J. Kochenderfer. Reluplex: An efficient SMT solver for verifying deep neural networks. In International Conference on Computer Aided Verification, pages 97–117. Springer, 2017.

[21] M. Lecuyer, V. Atlidakis, R. Geambasu, D. Hsu, and S. Jana. Certified robustness to adversarial examples with differential privacy. IEEE Symposium on Security and Privacy (SP), 2019.

[22] G.-H. Lee, D. Alvarez-Melis, and T. S. Jaakkola. Towards robust, locally linear deep networks. In International Conference on Learning Representations, 2019.

[23] B. Li, C. Chen, W. Wang, and L. Carin. Second-order adversarial attack and certifiable robustness. arXiv preprint arXiv:1809.03113, 2018.

[24] X. Liu, M. Cheng, H. Zhang, and C.-J. Hsieh. Towards robust neural networks via random self-ensemble. In Proceedings of the European Conference on Computer Vision (ECCV), pages 369–385, 2018.

[25] A. Lomuscio and L. Maganti. An approach to reachability analysis for feed-forward ReLU neural networks. arXiv preprint arXiv:1706.07351, 2017.

[26] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.

[27] M. Mirman, T. Gehr, and M. Vechev. Differentiable abstract interpretation for provably robust neural networks. In the 35th International Conference on Machine Learning, 2018.

[28] J. Neyman and E. S. Pearson. IX. On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 231(694-706):289–337, 1933.

[29] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer. Automatic differentiation in PyTorch. 2017.

[30] A. Raghunathan, J. Steinhardt, and P. Liang. Certified defenses against adversarial examples. In International Conference on Learning Representations, 2018.

[31] A. Raghunathan, J. Steinhardt, and P. S. Liang. Semidefinite relaxations for certifying robustness to adversarial examples. In Advances in Neural Information Processing Systems, pages 10877–10887, 2018.

[32] D. Rogers and M. Hahn. Extended-connectivity fingerprints. Journal of Chemical Information and Modeling, 50(5):742–754, 2010.

[33] K. Scheibler, L. Winterer, R. Wimmer, and B. Becker. Towards verification of artificial neural networks. In MBMV, pages 30–40, 2015.

[34] G. Singh, T. Gehr, M. Mirman, M. Püschel, and M. Vechev. Fast and effective robustness certification. In Advances in Neural Information Processing Systems, pages 10802–10813, 2018.

[35] G. Subramanian, B. Ramsundar, V. Pande, and R. A. Denny. Computational modeling of β-secretase 1 (BACE-1) inhibitors using ligand based approaches. Journal of Chemical Information and Modeling, 56(10):1936–1949, 2016.

[36] V. Tjeng, K. Xiao, and R. Tedrake. Evaluating robustness of neural networks with mixed integer programming. In International Conference on Learning Representations, 2017.

[37] K. Tocher. Extension of the Neyman-Pearson theory of tests to discontinuous variates. Biometrika, 37(1/2):130–144, 1950.

[38] T.-W. Weng, H. Zhang, H. Chen, Z. Song, C.-J. Hsieh, D. Boning, I. S. Dhillon, and L. Daniel. Towards fast computation of certified robustness for ReLU networks. In the 35th International Conference on Machine Learning, 2018.

[39] E. Wong and J. Z. Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In the 35th International Conference on Machine Learning, 2018.

[40] E. Wong, F. Schmidt, J. H. Metzen, and J. Z. Kolter. Scaling provable adversarial defenses. In Advances in Neural Information Processing Systems, pages 8400–8409, 2018.

[41] Z. Wu, B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, and V. Pande. MoleculeNet: a benchmark for molecular machine learning. Chemical Science, 9(2):513–530, 2018.

[42] H. Zhang, T.-W. Weng, P.-Y. Chen, C.-J. Hsieh, and L. Daniel. Efficient neural network robustness certification with general activation functions. In Advances in Neural Information Processing Systems, pages 4939–4948, 2018.
Watson Research Center"}, {"given_name": "Tommi", "family_name": "Jaakkola", "institution": "MIT"}]}