{"title": "Efficient Neural Network Robustness Certification with General Activation Functions", "book": "Advances in Neural Information Processing Systems", "page_first": 4939, "page_last": 4948, "abstract": "Finding minimum distortion of adversarial examples and thus certifying robustness in neural networks classifiers is known to be a challenging problem. Nevertheless, recently it has been shown to be possible to give a non-trivial certified lower bound of minimum distortion, and some recent progress has been made towards this direction by exploiting the piece-wise linear nature of ReLU activations. However, a generic robustness certification for \\textit{general} activation functions still remains largely unexplored. To address this issue, in this paper we introduce CROWN, a general framework to certify robustness of neural networks with general activation functions. The novelty in our algorithm consists of bounding a given activation function with linear and quadratic functions, hence allowing it to tackle general activation functions including but not limited to the four popular choices: ReLU, tanh, sigmoid and arctan. In addition, we facilitate the search for a tighter certified lower bound by \\textit{adaptively} selecting appropriate surrogates for each neuron activation. Experimental results show that CROWN on ReLU networks can notably improve the certified lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while having comparable computational efficiency. Furthermore, CROWN also demonstrates its effectiveness and flexibility on networks with general activation functions, including tanh, sigmoid and arctan.", "full_text": "Ef\ufb01cientNeuralNetworkRobustnessCerti\ufb01cationwithGeneralActivationFunctionsHuanZhang1,\u2020,\u2217Tsui-WeiWeng2,\u2020Pin-YuChen3Cho-JuiHsieh1LucaDaniel21UniversityofCalifornia,LosAngeles,LosAngelesCA900952MassachusettsInstituteofTechnology,Cambridge,MA021393MIT-IBMWatsonAILab,IBMResearch,YorktownHeights,NY10598huan@huan-zhang.com,twweng@mit.edupin-yu.chen@ibm.com,chohsieh@cs.ucla.edu,dluca@mit.eduAbstractFindingminimumdistortionofadversarialexamplesandthuscertifyingrobustnessinneuralnetworkclassi\ufb01ersforgivendatapointsisknowntobeachallengingproblem.Nevertheless,recentlyithasbeenshowntobepossibletogiveanon-trivialcerti\ufb01edlowerboundofminimumadversarialdistortion,andsomerecentprogresshasbeenmadetowardsthisdirectionbyexploitingthepiece-wiselinearnatureofReLUactivations.However,agenericrobustnesscerti\ufb01cationforgeneralactivationfunctionsstillremainslargelyunexplored.Toaddressthisissue,inthispaperweintroduceCROWN,ageneralframeworktocertifyrobustnessofneuralnetworkswithgeneralactivationfunctionsforgiveninputdatapoints.Thenoveltyinouralgorithmconsistsofboundingagivenactivationfunctionwithlinearandquadraticfunctions,henceallowingittotacklegeneralactivationfunctionsincludingbutnotlimitedtofourpopularchoices:ReLU,tanh,sigmoidandarctan.Inaddition,wefacilitatethesearchforatightercerti\ufb01edlowerboundbyadaptivelyselectingappropriatesurrogatesforeachneuronactivation.ExperimentalresultsshowthatCROWNonReLUnetworkscannotablyimprovethecerti\ufb01edlowerboundscomparedtothecurrentstate-of-the-artalgorithmFast-Lin,whilehavingcomparablecomputationalef\ufb01ciency.Furthermore,CROWNalsodemonstratesitseffectivenessand\ufb02exibilityonnetworkswithgeneralactivationfunctions,includingtanh,sigmoidandarctan.1IntroductionWhileneuralnetworks(NNs)haveachievedremarkableperformanceandaccomplishedunprece-dentedbreakthroughsinmanymachinelearningtasks,r
studies have highlighted their lack of robustness against adversarial perturbations [1, 2]. For example, in image learning tasks such as object classification [3, 4, 5, 6] or content captioning [7], visually indistinguishable adversarial examples can be easily crafted from natural images to alter a NN's prediction result. Beyond the white-box attack setting, where the target model is entirely transparent, visually imperceptible adversarial perturbations can also be generated in the black-box setting by only using the prediction results of the target model [8, 9, 10, 11]. In addition, real-life adversarial examples have been made possible through the lens of realizing physical perturbations [12, 13, 14]. As NNs are becoming a core technique deployed in a wide range of applications, including safety-critical tasks, certifying robustness of a NN against adversarial perturbations has become an important research topic in machine learning.

Given a NN (possibly with a deep and complicated network architecture), we are interested in certifying the (local) robustness of an arbitrary natural example x_0 by ensuring that all of its neighborhood has the same inference outcome (e.g., consistent top-1 prediction). In this paper, the neighborhood of x_0 is characterized by an ℓ_p ball centered at x_0, for any p ≥ 1. Geometrically speaking, the minimum distance from x_0 to a misclassified nearby example is the least adversary strength (a.k.a. minimum adversarial distortion) required to alter the target model's prediction, which is also the largest possible robustness certificate for x_0. Unfortunately, finding the minimum distortion of adversarial examples in NNs with Rectified Linear Unit (ReLU) activations, one of the most widely used activation functions, is known to be an NP-complete problem [15, 16]. This makes formal verification techniques such as Reluplex [15] computationally demanding even for small-sized NNs, and such methods suffer from scalability issues.

Although certifying the largest possible robustness is challenging for ReLU networks, the piece-wise linear nature of ReLUs can be exploited to efficiently compute a non-trivial certified lower bound of the minimum distortion [17, 18, 19, 20]. Beyond ReLU, one fundamental problem that remains largely unexplored is how to generalize the robustness certification technique to other popular activation functions that are not piece-wise linear, such as tanh and sigmoid, and how to motivate and certify the design of other activation functions towards improved robustness. In this paper, we tackle the preceding problem by proposing an efficient robustness certification framework for NNs with general activation functions. Our main contributions are summarized as follows:

• We propose a generic analysis framework CROWN for certifying NNs using linear or quadratic upper and lower bounds for general activation functions that are not necessarily piece-wise linear.
• Unlike previous work [20], CROWN allows flexible selections of upper/lower bounds for activation functions, enabling an adaptive scheme that helps to reduce approximation errors. Our experiments show that CROWN achieves up to 26% improvements in certified lower bounds compared to [20].
• Our algorithm is efficient and can scale to large NNs with various activation functions. For a NN with over 10,000 neurons, we can give a certified lower bound in about 1 minute on 1 CPU core.

2 Background and Related Work

For ReLU networks, finding the minimum adversarial distortion for a given input data point x_0 can be cast as a mixed integer linear programming (MILP) problem [21, 22, 23]. Reluplex [15, 24] uses satisfiability modulo theories (SMT) to encode ReLU activations into linear constraints. Similarly, Planet [25] uses satisfiability (SAT) solvers.
However, due to the NP-completeness of solving such a problem [15], these methods can only find the minimum distortion for very small networks; it can take Reluplex several hours to find the minimum distortion of one example for a ReLU network with 5 inputs, 5 outputs and 300 neurons [15].

On the other hand, a computationally feasible alternative is to provide a non-trivial and certified lower bound of the minimum distortion. Some analytical lower bounds based on operator norms of the weight matrices [3] or the Jacobian matrix of the NN [17] do not exploit the special property of ReLU and thus can be very loose [20]. The bounds in [26, 27] are based on the local Lipschitz constant. [26] assumes a continuously differentiable NN and hence excludes ReLU networks; a closed-form lower bound is also hard to derive for networks beyond 2 layers. [27] applies to ReLU networks and uses Extreme Value Theory to provide an estimated lower bound (the CLEVER score). Although the CLEVER score is capable of reflecting the level of robustness in different NNs and is scalable to large networks, it is not a certified lower bound. In contrast, [18] use the idea of a convex outer adversarial polytope in ReLU networks to compute a certified lower bound by relaxing the MILP certification problem to linear programming (LP). [20] exploit the ReLU property to bound the activation function (or the local Lipschitz constant) and provide efficient algorithms (Fast-Lin and Fast-Lip) for computing a certified lower bound, achieving state-of-the-art performance. A recent work [28] uses abstract transformations on zonotopes for proving robustness properties of ReLU networks. Nonetheless, some applications still demand non-ReLU activations, e.g., RNNs and LSTMs; thus a general framework that can efficiently compute non-trivial certified lower bounds for NNs with general activation functions is of great importance. We aim at filling this gap and propose CROWN, which can perform efficient robustness certification for NNs with general activation functions. Table 1 summarizes the differences between other approaches and CROWN. Note that the semidefinite programming approach proposed in [19] and a recent work [29] based on solving a Lagrangian dual can both handle general activation functions, but [19] is limited to NNs with one hidden layer and [29] trades off the quality of the robustness bound against scalability.

Table 1: Comparison of methods for providing adversarial robustness certification in NNs.

| Method                         | Non-trivial bound | Multi-layer | Scalability | Beyond ReLU      |
| Szegedy et al. [3]             | no                | yes         | yes         | yes              |
| Reluplex [15], Planet [25]     | yes               | yes         | no          | no               |
| Hein & Andriushchenko [26]     | yes               | no          | yes         | differentiable*  |
| Raghunathan et al. [19]        | yes               | no          | no          | yes              |
| Kolter and Wong [18]           | yes               | yes         | yes         | no               |
| Fast-Lin / Fast-Lip [20]       | yes               | yes         | yes         | no               |
| CROWN (ours)                   | yes               | yes         | yes         | yes (general)    |

* Continuously differentiable activation function required (softplus is demonstrated in [26]).

Some recent works (such as robust-optimization-based adversarial training [30] or region-based classification [31]) empirically exhibit strong robustness against several adversarial attacks, which is beyond the scope of provable robustness certification. In addition, Sinha et al. [16] provide distributional robustness certification based on the Wasserstein distance between data distributions, which is different from the local ℓ_p-ball robustness model considered in this paper.

3 CROWN: A general framework for certifying neural networks

Overview of our results. In this section, we present a general framework, CROWN, for efficiently computing a certified lower bound of minimum adversarial distortion for any input data point x_0, with general activation functions, in large NNs. We first provide principles in Section 3.1 to derive output bounds of NNs when the inputs are perturbed within an ℓ_p ball and each neuron has different (adaptive) linear approximation bounds on its activation function.
In Section 3.2, we demonstrate how to provide robustness certification for four widely-used activation functions (ReLU, tanh, sigmoid and arctan) using CROWN. In particular, we show that the state-of-the-art Fast-Lin algorithm is a special case under the CROWN framework and that the adaptive selection of approximation bounds allows CROWN to achieve a tighter (larger) certified lower bound (see Section 4). In Section 3.3, we further highlight the flexibility of CROWN to incorporate quadratic approximations of the activation functions in addition to the linear approximations described in Section 3.1.

3.1 General framework

Notations. For an m-layer neural network with an input vector x ∈ R^{n_0}, let the number of neurons in each layer be n_k, ∀k ∈ [m], where [i] denotes the set {1, 2, · · · , i}. Let the k-th layer weight matrix be W^{(k)} ∈ R^{n_k × n_{k−1}} and bias vector be b^{(k)} ∈ R^{n_k}, and let Φ_k : R^{n_0} → R^{n_k} be the operator mapping from the input to layer k. We have Φ_k(x) = σ(W^{(k)} Φ_{k−1}(x) + b^{(k)}), ∀k ∈ [m−1], where σ(·) is the coordinate-wise activation function. While our methodology is applicable to any activation function of interest, we emphasize four of the most widely-used activation functions, namely ReLU: σ(y) = max(y, 0), hyperbolic tangent: σ(y) = tanh(y), sigmoid: σ(y) = 1/(1 + e^{−y}) and arctan: σ(y) = arctan(y). Note that the input Φ_0(x) = x, and the vector output of the NN is f(x) = Φ_m(x) = W^{(m)} Φ_{m−1}(x) + b^{(m)}. The j-th output element is denoted as f_j(x) = [Φ_m(x)]_j.

Input perturbation and pre-activation bounds. Let x_0 ∈ R^{n_0} be a given data point, and let the perturbed input vector x be within an ε-bounded ℓ_p-ball centered at x_0, i.e., x ∈ B_p(x_0, ε), where B_p(x_0, ε) := {x | ‖x − x_0‖_p ≤ ε}. For the r-th neuron in the k-th layer, let its pre-activation input be y_r^{(k)}, where y_r^{(k)} = W_{r,:}^{(k)} Φ_{k−1}(x) + b_r^{(k)} and W_{r,:}^{(k)} denotes the r-th row of matrix W^{(k)}. When x_0 is perturbed within an ε-bounded ℓ_p-ball, let l_r^{(k)}, u_r^{(k)} ∈ R be the pre-activation lower and upper bounds of y_r^{(k)}, i.e., l_r^{(k)} ≤ y_r^{(k)} ≤ u_r^{(k)}.

Below, we first define the linear upper and lower bounds of activation functions in Definition 3.1, which are the key to deriving explicit output bounds for an m-layer neural network with general activation functions. The formal statement of the explicit output bounds is given in Theorem 3.2.

Definition 3.1 (Linear bounds on activation function). For the r-th neuron in the k-th layer with pre-activation bounds l_r^{(k)}, u_r^{(k)} and the activation function σ(y), define two linear functions h_{U,r}^{(k)}, h_{L,r}^{(k)} : R → R,

\[ h_{U,r}^{(k)}(y) = \alpha_{U,r}^{(k)} \big(y + \beta_{U,r}^{(k)}\big), \qquad h_{L,r}^{(k)}(y) = \alpha_{L,r}^{(k)} \big(y + \beta_{L,r}^{(k)}\big), \]

such that h_{L,r}^{(k)}(y) ≤ σ(y) ≤ h_{U,r}^{(k)}(y) for y ∈ [l_r^{(k)}, u_r^{(k)}], ∀k ∈ [m−1], r ∈ [n_k], with α_{U,r}^{(k)}, α_{L,r}^{(k)} ∈ R_+ and β_{U,r}^{(k)}, β_{L,r}^{(k)} ∈ R.

Note that the parameters α_{U,r}^{(k)}, α_{L,r}^{(k)}, β_{U,r}^{(k)}, β_{L,r}^{(k)} depend on l_r^{(k)} and u_r^{(k)}, i.e., for different l_r^{(k)} and u_r^{(k)} we may choose different parameters. Also, for ease of exposition, in this paper we restrict α_{U,r}^{(k)}, α_{L,r}^{(k)} ≥ 0; however, Theorem 3.2 can be easily generalized to the case of negative α_{U,r}^{(k)}, α_{L,r}^{(k)}.
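As a concrete instance of Definition 3.1, consider a single hypothetical neuron with σ(y) = max(y, 0) and pre-activation bounds l_r^{(k)} = −1, u_r^{(k)} = 2 (values chosen by us for illustration; the general rules appear in Section 3.2 and Tables 2–3). One valid choice is

\[ h_{U,r}^{(k)}(y) = \frac{u_r^{(k)}}{u_r^{(k)} - l_r^{(k)}}\big(y - l_r^{(k)}\big) = \tfrac{2}{3}(y+1), \qquad h_{L,r}^{(k)}(y) = y, \]

i.e., α_{U,r}^{(k)} = 2/3, β_{U,r}^{(k)} = 1, α_{L,r}^{(k)} = 1, β_{L,r}^{(k)} = 0. One can verify h_{L,r}^{(k)}(y) ≤ max(y, 0) ≤ h_{U,r}^{(k)}(y) on [−1, 2]: the upper line touches the ReLU at both end-points, and y ≤ max(y, 0) everywhere.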
Theorem 3.2 (Explicit output bounds of neural network f). Given an m-layer neural network function f : R^{n_0} → R^{n_m}, there exist two explicit functions f_j^L : R^{n_0} → R and f_j^U : R^{n_0} → R such that ∀j ∈ [n_m], ∀x ∈ B_p(x_0, ε), the inequality f_j^L(x) ≤ f_j(x) ≤ f_j^U(x) holds true, where

\[ f_j^U(x) = \Lambda_{j,:}^{(0)} x + \sum_{k=1}^{m} \Lambda_{j,:}^{(k)} \big(b^{(k)} + \Delta_{:,j}^{(k)}\big), \qquad f_j^L(x) = \Omega_{j,:}^{(0)} x + \sum_{k=1}^{m} \Omega_{j,:}^{(k)} \big(b^{(k)} + \Theta_{:,j}^{(k)}\big), \tag{1} \]

\[ \Lambda_{j,:}^{(k-1)} = \begin{cases} e_j^\top & \text{if } k = m+1; \\ \big(\Lambda_{j,:}^{(k)} W^{(k)}\big) \odot \lambda_{j,:}^{(k-1)} & \text{if } k \in [m], \end{cases} \qquad \Omega_{j,:}^{(k-1)} = \begin{cases} e_j^\top & \text{if } k = m+1; \\ \big(\Omega_{j,:}^{(k)} W^{(k)}\big) \odot \omega_{j,:}^{(k-1)} & \text{if } k \in [m], \end{cases} \]

and ∀i ∈ [n_k] we define four matrices λ^{(k)}, ω^{(k)}, Δ^{(k)}, Θ^{(k)} ∈ R^{n_m × n_k}:

\[ \lambda_{j,i}^{(k)} = \begin{cases} \alpha_{U,i}^{(k)} & \text{if } k \neq 0,\ \Lambda_{j,:}^{(k+1)} W_{:,i}^{(k+1)} \geq 0; \\ \alpha_{L,i}^{(k)} & \text{if } k \neq 0,\ \Lambda_{j,:}^{(k+1)} W_{:,i}^{(k+1)} < 0; \\ 1 & \text{if } k = 0, \end{cases} \qquad \omega_{j,i}^{(k)} = \begin{cases} \alpha_{L,i}^{(k)} & \text{if } k \neq 0,\ \Omega_{j,:}^{(k+1)} W_{:,i}^{(k+1)} \geq 0; \\ \alpha_{U,i}^{(k)} & \text{if } k \neq 0,\ \Omega_{j,:}^{(k+1)} W_{:,i}^{(k+1)} < 0; \\ 1 & \text{if } k = 0, \end{cases} \]

\[ \Delta_{i,j}^{(k)} = \begin{cases} \beta_{U,i}^{(k)} & \text{if } k \neq m,\ \Lambda_{j,:}^{(k+1)} W_{:,i}^{(k+1)} \geq 0; \\ \beta_{L,i}^{(k)} & \text{if } k \neq m,\ \Lambda_{j,:}^{(k+1)} W_{:,i}^{(k+1)} < 0; \\ 0 & \text{if } k = m, \end{cases} \qquad \Theta_{i,j}^{(k)} = \begin{cases} \beta_{L,i}^{(k)} & \text{if } k \neq m,\ \Omega_{j,:}^{(k+1)} W_{:,i}^{(k+1)} \geq 0; \\ \beta_{U,i}^{(k)} & \text{if } k \neq m,\ \Omega_{j,:}^{(k+1)} W_{:,i}^{(k+1)} < 0; \\ 0 & \text{if } k = m, \end{cases} \]

where ⊙ is the Hadamard product and e_j ∈ R^{n_m} is the standard unit vector at the j-th coordinate.

Theorem 3.2 illustrates how a NN function f_j(x) can be bounded by two linear functions f_j^U(x) and f_j^L(x) when the activation function of each neuron is bounded by the two linear functions h_{U,r}^{(k)} and h_{L,r}^{(k)} of Definition 3.1. The central idea is to unwrap the activation functions layer by layer, considering the signs of the associated (equivalent) weights of each neuron and applying the two linear bounds h_{U,r}^{(k)} and h_{L,r}^{(k)}. As we demonstrate in the proof, when we replace the activation functions with the corresponding linear upper and lower bounds at layer m−1, we can define equivalent weights and biases based on the parameters of h_{U,r}^{(m−1)} and h_{L,r}^{(m−1)} (e.g., Λ^{(k)}, Δ^{(k)}, Ω^{(k)}, Θ^{(k)} are related to the terms α_{U,r}^{(k)}, β_{U,r}^{(k)}, α_{L,r}^{(k)}, β_{L,r}^{(k)}, respectively) and then repeat the procedure to "back-propagate" to the input layer. This allows us to obtain f_j^U(x) and f_j^L(x) in (1). The formal proof of Theorem 3.2 is in Appendix A.

Note that for a neuron r in layer k, the slopes of its linear upper and lower bounds, α_{U,r}^{(k)} and α_{L,r}^{(k)}, can be different. This implies:
1. Fast-Lin [20] is a special case of our framework, as it requires the slopes α_{U,r}^{(k)}, α_{L,r}^{(k)} to be the same, and it only applies to ReLU networks (cf. Sec. 3.2). In Fast-Lin, Λ^{(0)} and Ω^{(0)} are identical.
2. Our CROWN framework allows adaptive selection of the linear approximation when computing certified lower bounds of minimum adversarial distortion, which is the main contributor to the improved certified lower bounds demonstrated in the experiments of Section 4.

Global bounds. More importantly, since the input x ∈ B_p(x_0, ε), we can take the maximum, i.e., max_{x∈B_p(x_0,ε)} f_j^U(x), and the minimum, i.e., min_{x∈B_p(x_0,ε)} f_j^L(x), as a pair of global upper and lower bounds of f_j(x). These in fact have closed-form solutions, because f_j^U(x) and f_j^L(x) are linear functions and x ∈ B_p(x_0, ε) is a convex norm constraint. This result is formally presented below; its proof is given in Appendix B.

Corollary 3.3 (Closed-form global bounds). Given a data point x_0 ∈ R^{n_0} and ℓ_p ball parameters p ≥ 1, ε > 0, for an m-layer neural network function f : R^{n_0} → R^{n_m} there exist two fixed values γ_j^L and γ_j^U such that ∀x ∈ B_p(x_0, ε) and ∀j ∈ [n_m], with 1/q = 1 − 1/p, the inequality γ_j^L ≤ f_j(x) ≤ γ_j^U holds true, where

\[ \gamma_j^U = \epsilon \big\|\Lambda_{j,:}^{(0)}\big\|_q + \Lambda_{j,:}^{(0)} x_0 + \sum_{k=1}^{m} \Lambda_{j,:}^{(k)} \big(b^{(k)} + \Delta_{:,j}^{(k)}\big), \qquad \gamma_j^L = -\epsilon \big\|\Omega_{j,:}^{(0)}\big\|_q + \Omega_{j,:}^{(0)} x_0 + \sum_{k=1}^{m} \Omega_{j,:}^{(k)} \big(b^{(k)} + \Theta_{:,j}^{(k)}\big). \tag{2} \]
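To make the mechanics of Theorem 3.2 and Corollary 3.3 concrete, here is a minimal numpy sketch for a one-hidden-layer ReLU network (m = 2), assuming the pre-activation bounds l, u of the hidden layer have already been computed; the function names are ours, not those of the released CROWN code.

```python
import numpy as np

def relu_linear_bounds(l, u):
    """Per-neuron CROWN-Ada linear bounds for ReLU (Sec. 3.2 / Tables 2-3):
    alpha_L * (y + beta_L) <= relu(y) <= alpha_U * (y + beta_U) on [l, u]."""
    alpha_U = np.zeros_like(l); beta_U = np.zeros_like(l)
    alpha_L = np.zeros_like(l); beta_L = np.zeros_like(l)  # beta_L stays 0 in all cases
    pos = l >= 0                          # S+: relu(y) = y
    alpha_U[pos] = 1.0
    alpha_L[pos] = 1.0
    unc = (l < 0) & (u > 0)               # S±: the uncertain neurons
    alpha_U[unc] = u[unc] / (u[unc] - l[unc])
    beta_U[unc] = -l[unc]                 # upper line through both end-points
    alpha_L[unc] = (u[unc] >= -l[unc]).astype(float)  # adaptive slope: 1 if u >= |l|, else 0
    return alpha_U, beta_U, alpha_L, beta_L            # S- neurons keep all zeros

def global_bounds(W1, b1, W2, b2, x0, eps, p, l, u):
    """Closed-form bounds of Corollary 3.3 for f(x) = W2 @ relu(W1 @ x + b1) + b2."""
    if p == 1:
        q = np.inf
    elif np.isinf(p):
        q = 1
    else:
        q = p / (p - 1.0)                 # dual norm exponent, 1/q = 1 - 1/p
    aU, bU, aL, bL = relu_linear_bounds(l, u)
    # Theorem 3.2: pick the upper or lower slope/intercept by the sign of the weight
    Lam1 = W2 * np.where(W2 >= 0, aU, aL)   # Lambda^(1), one row per output j
    Omg1 = W2 * np.where(W2 >= 0, aL, aU)   # Omega^(1)
    DU = np.where(W2 >= 0, bU, bL)          # Delta^(1) entries
    DL = np.where(W2 >= 0, bL, bU)          # Theta^(1) entries
    Lam0 = Lam1 @ W1                        # back-substitute to the input layer
    Omg0 = Omg1 @ W1
    gamma_U = (eps * np.linalg.norm(Lam0, ord=q, axis=1) + Lam0 @ x0
               + np.sum(Lam1 * (b1 + DU), axis=1) + b2)
    gamma_L = (-eps * np.linalg.norm(Omg0, ord=q, axis=1) + Omg0 @ x0
               + np.sum(Omg1 * (b1 + DL), axis=1) + b2)
    return gamma_L, gamma_U
```

For deeper networks, the same sign-based substitution is applied layer by layer, reusing the pre-activation bounds of the earlier layers before certifying the later ones.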
Table 2: Linear upper bound parameters of various activation functions, h_{U,r}^{(k)}(y) = α_{U,r}^{(k)}(y + β_{U,r}^{(k)}) (indices (k) and r omitted below for brevity):
- ReLU: r ∈ S_k^+: α_U = 1, β_U = 0. r ∈ S_k^−: α_U = 0, β_U = 0. r ∈ S_k^±: α_U = a with a ≥ u/(u−l) (e.g., a = u/(u−l)), β_U = −l.
- Sigmoid, tanh (denoted σ(y)): r ∈ S_k^+: α_U = σ′(d) for any d ∈ [l, u], β_U = σ(d)/α_U − d.* r ∈ S_k^−: α_U = (σ(u) − σ(l))/(u − l), β_U = σ(l)/α_U − l. r ∈ S_k^±: α_U = σ′(d), β_U = σ(l)/α_U − l, where d ≥ 0 solves (σ(d) − σ(l))/(d − l) − σ′(d) = 0.♦

* If α_{U,r}^{(k)} is close to 0, we suggest calculating the intercept directly, α_{U,r}^{(k)} · β_{U,r}^{(k)} = σ(d) − α_{U,r}^{(k)} d, to avoid numerical issues in implementation. The same applies to other similar cases.
♦ Alternatively, if the solution d ≥ u_r^{(k)}, then we can set α_{U,r}^{(k)} = (σ(u_r^{(k)}) − σ(l_r^{(k)}))/(u_r^{(k)} − l_r^{(k)}).

Table 3: Linear lower bound parameters of various activation functions, h_{L,r}^{(k)}(y) = α_{L,r}^{(k)}(y + β_{L,r}^{(k)}):
- ReLU: r ∈ S_k^+: α_L = 1, β_L = 0. r ∈ S_k^−: α_L = 0, β_L = 0. r ∈ S_k^±: α_L = a with 0 ≤ a ≤ 1 (e.g., a = u/(u−l), 0, or 1), β_L = 0.
- Sigmoid, tanh (denoted σ(y)): r ∈ S_k^+: α_L = (σ(u) − σ(l))/(u − l), β_L = σ(l)/α_L − l. r ∈ S_k^−: α_L = σ′(d) for any d ∈ [l, u], β_L = σ(d)/α_L − d. r ∈ S_k^±: α_L = σ′(d), β_L = σ(u)/α_L − u, where d ≤ 0 solves (σ(d) − σ(u))/(d − u) − σ′(d) = 0.†

† Alternatively, if the solution d ≤ l_r^{(k)}, then we can set α_{L,r}^{(k)} = (σ(u_r^{(k)}) − σ(l_r^{(k)}))/(u_r^{(k)} − l_r^{(k)}).

Certified lower bound of minimum distortion. Given an input example x_0 and an m-layer NN, let c be the predicted class of x_0 and t ≠ c be the targeted attack class. We aim to use the uniform bounds established in Corollary 3.3 to obtain the largest possible lower bounds ε̃_t and ε̃ for targeted and untargeted attacks, respectively, which can be formulated as follows:

\[ \tilde{\epsilon}_t = \max_{\epsilon} \ \epsilon \quad \text{s.t.} \quad \gamma_c^L(\epsilon) - \gamma_t^U(\epsilon) > 0, \qquad \tilde{\epsilon} = \min_{t \neq c} \tilde{\epsilon}_t. \]

We note that although there is a linear ε term in (2), the other terms Λ^{(k)}, Δ^{(k)} and Ω^{(k)}, Θ^{(k)} also implicitly depend on ε. This is because the parameters α_{U,i}^{(k)}, β_{U,i}^{(k)}, α_{L,i}^{(k)}, β_{L,i}^{(k)} depend on l_i^{(k)}, u_i^{(k)}, which may vary with ε; thus the values in Λ^{(k)}, Δ^{(k)}, Ω^{(k)}, Θ^{(k)} depend on ε. It is therefore difficult to obtain an explicit expression of γ_c^L(ε) − γ_t^U(ε) in terms of ε. Fortunately, we can still perform a binary search to obtain ε̃_t with Corollary 3.3. More precisely, we first initialize ε at some fixed positive value and apply Corollary 3.3 repeatedly to obtain l_r^{(k)} and u_r^{(k)} for k = 1 to m and r ∈ [n_k]. We then check whether the condition γ_c^L − γ_t^U > 0 is satisfied. If so, we increase ε; otherwise, we decrease ε; and we repeat the procedure until a given tolerance level is met. (The bound can be further improved by considering g(x) := f_c(x) − f_t(x) and replacing the last layer's weights by W_{c,:}^{(m)} − W_{t,:}^{(m)}; this is also used by [20].)

Time Complexity. With Corollary 3.3, we can compute analytic output bounds efficiently without resorting to any optimization solvers for general ℓ_p distortion, and the time complexity for an m-layer ReLU network is polynomial, in contrast to Reluplex or mixed-integer-optimization-based approaches [22, 23], where the SMT and MIO solvers are exponential-time. For an m-layer network with n neurons per layer and n outputs, the time complexity of CROWN is O(m²n³). Forming Λ^{(0)} and Ω^{(0)} for the m-th layer involves multiplications of layer weights at a cost similar to forward propagation, in O(mn³) time. Also, the bounds for all previous layers k ∈ [m−1] need to be computed beforehand, in O(kn³) time each; thus the total time complexity is O(m²n³).
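The search procedure above can be sketched as follows; compute_gammas is a hypothetical callback that re-runs the layer-by-layer bound computation of Corollary 3.3 for a given ε (e.g., via global_bounds above, extended to m layers) and returns the pair (γ_c^L, γ_t^U):

```python
def certified_eps(compute_gammas, eps0=0.05, tol=1e-5, max_iter=60):
    """Binary search for the largest certifiable eps (a sketch; `compute_gammas`
    is an assumed helper returning (gamma_L_c, gamma_U_t) for a given eps)."""
    lo, hi, eps = 0.0, None, eps0
    for _ in range(max_iter):
        gL_c, gU_t = compute_gammas(eps)
        if gL_c - gU_t > 0:          # certified at this eps: try a larger one
            lo = eps
            eps = eps * 2 if hi is None else (lo + hi) / 2
        else:                        # certification fails: shrink eps
            hi = eps
            eps = (lo + hi) / 2
        if hi is not None and hi - lo < tol:
            break
    return lo                        # certified lower bound on the minimum distortion
```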
3.2 Case studies: CROWN for ReLU, tanh, sigmoid and arctan activations

In Section 3.1 we showed that as long as one can identify two linear functions h_U(y), h_L(y) bounding a general activation function σ(y) for each neuron, we can use Corollary 3.3 with a binary search to obtain certified lower bounds of minimum distortion. In this section, we illustrate how to find the parameters α_{U,r}^{(k)}, α_{L,r}^{(k)}, β_{U,r}^{(k)}, β_{L,r}^{(k)} of h_U(y), h_L(y) for the four most widely used activation functions: ReLU, tanh, sigmoid and arctan. Other activations, including but not limited to leaky ReLU, ELU and softplus, can easily be incorporated into our CROWN framework following a similar procedure.

[Figure 1: σ(y) = tanh. Green lines are the upper bounds h_{U,r}^{(k)}; red lines are the lower bounds h_{L,r}^{(k)}. Panels: (a) r ∈ S_k^+, (b) r ∈ S_k^−, (c) r ∈ S_k^±.]

Segmenting activation functions. Based on the signs of l_r^{(k)} and u_r^{(k)}, we define a partition {S_k^+, S_k^±, S_k^−} of the set [n_k] such that every neuron in the k-th layer belongs to exactly one of the three sets: S_k^+ = {r ∈ [n_k] | 0 ≤ l_r^{(k)} ≤ u_r^{(k)}}, S_k^± = {r ∈ [n_k] | l_r^{(k)} < 0 < u_r^{(k)}}, and S_k^− = {r ∈ [n_k] | l_r^{(k)} ≤ u_r^{(k)} ≤ 0}. For neurons in each partitioned set, we define the corresponding upper bound h_{U,r}^{(k)} and lower bound h_{L,r}^{(k)} in terms of l_r^{(k)} and u_r^{(k)}. As we will see shortly, segmenting the activation functions based on l_r^{(k)} and u_r^{(k)} is useful for bounding a given activation function. We note that there are multiple ways of segmenting the activation functions and defining the partitioned sets (e.g., based on the values of l_r^{(k)}, u_r^{(k)} rather than their signs), and we can easily incorporate such variants into our framework to provide the corresponding explicit output bounds for the new partition sets. In this case study, we consider S_k^+, S_k^± and S_k^− for the four activations, as this partition reflects the curvature of the tanh, sigmoid and arctan functions and the activation states of ReLU.

Bounding tanh/sigmoid/arctan. For the tanh activation, σ(y) = (1 − e^{−2y})/(1 + e^{−2y}); for the sigmoid activation, σ(y) = 1/(1 + e^{−y}); for the arctan activation, σ(y) = arctan(y). All three functions are convex on one side (y < 0) and concave on the other side (y > 0), so the same rules can be used to find h_{U,r}^{(k)} and h_{L,r}^{(k)}. Below we call (l_r^{(k)}, σ(l_r^{(k)})) the left end-point and (u_r^{(k)}, σ(u_r^{(k)})) the right end-point. For r ∈ S_k^+, since σ(y) is concave, we can let h_{U,r}^{(k)} be any tangent line of σ(y) at a point d ∈ [l_r^{(k)}, u_r^{(k)}], and let h_{L,r}^{(k)} pass through the two end-points. Similarly, σ(y) is convex for r ∈ S_k^−, so we can let h_{L,r}^{(k)} be any tangent line of σ(y) at a point d ∈ [l_r^{(k)}, u_r^{(k)}] and let h_{U,r}^{(k)} pass through the two end-points. Lastly, for r ∈ S_k^±, we can let h_{U,r}^{(k)} be the tangent line that passes through the left end-point and (d, σ(d)) with d ≥ 0, and h_{L,r}^{(k)} be the tangent line that passes through the right end-point and (d, σ(d)) with d ≤ 0. The value of d for these transcendental functions can be found using a binary search. Plots of the upper and lower bounds for tanh and sigmoid are in Figure 1 and Figure 3 (in the Appendix); plots for arctan are similar and thus omitted.

Bounding ReLU. For the ReLU activation, σ(y) = max(0, y). If r ∈ S_k^+, we have σ(y) = y, so we can set h_{U,r}^{(k)} = h_{L,r}^{(k)} = y; if r ∈ S_k^−, we have σ(y) = 0, so we can set h_{U,r}^{(k)} = h_{L,r}^{(k)} = 0; if r ∈ S_k^±, we can set h_{U,r}^{(k)}(y) = (u_r^{(k)}/(u_r^{(k)} − l_r^{(k)}))(y − l_r^{(k)}) and h_{L,r}^{(k)}(y) = a y with 0 ≤ a ≤ 1. Setting a = u_r^{(k)}/(u_r^{(k)} − l_r^{(k)}) leads to the linear lower bound used in Fast-Lin [20]; thus, Fast-Lin is a special case under our framework. We propose to adaptively choose a: we set a = 1 when u_r^{(k)} ≥ |l_r^{(k)}| and a = 0 when u_r^{(k)} < |l_r^{(k)}|. In this way, the area between the lower bound h_{L,r}^{(k)}(y) = a y and σ(y) (which reflects the gap between the lower bound and the ReLU function) is always minimized. As shown in our experiments, the adaptive selection of h_{L,r}^{(k)} based on the values of u_r^{(k)} and l_r^{(k)} helps to achieve a tighter certified lower bound. Figure 4 (in the Appendix) illustrates the idea discussed here.
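For the S± upper bound of tanh/sigmoid/arctan, the tangent point d mentioned above can be located with a simple binary search. The sketch below is our own helper (not part of the paper's released code), written under the assumption that σ is convex for y < 0 and concave for y > 0 so the sign test has a single crossing; it includes the end-point fallback from the note under Table 2:

```python
import numpy as np

def upper_line_s_pm(sigma, dsigma, l, u, iters=50):
    """Upper bound h_U(y) = alpha*y + c for an S± neuron (l < 0 < u) of a
    sigmoid-shaped activation: the tangent line at d >= 0 passing through the
    left end-point (l, sigma(l))."""
    chord = (sigma(u) - sigma(l)) / (u - l)
    if chord < dsigma(u):
        # tangent point would lie beyond u: fall back to the line through
        # both end-points (the "alternatively" note under Table 2)
        alpha = chord
    else:
        lo, hi = 0.0, u
        for _ in range(iters):
            d = 0.5 * (lo + hi)
            # chord slope from (l, sigma(l)) to (d, sigma(d)) minus tangent slope:
            # negative before the tangent point, positive after it
            if (sigma(d) - sigma(l)) / (d - l) - dsigma(d) > 0:
                hi = d
            else:
                lo = d
        alpha = dsigma(0.5 * (lo + hi))
    # intercept computed directly as sigma(l) - alpha*l (i.e., alpha * beta_U)
    # to avoid dividing by a near-zero alpha (numerical note under Table 2)
    return alpha, sigma(l) - alpha * l

# usage for tanh:
# alpha, c = upper_line_s_pm(np.tanh, lambda y: 1 - np.tanh(y)**2, -2.0, 1.0)
```

The lower bound for S± is symmetric: search for d ≤ 0 whose tangent passes through the right end-point.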
Summary. We summarize the above analysis on choosing valid linear functions h_{U,r}^{(k)} and h_{L,r}^{(k)} in Tables 2 and 3. In general, as long as h_{U,r}^{(k)} and h_{L,r}^{(k)} are identified for the activation functions, we can use Corollary 3.3 to compute certified lower bounds for general activation functions. Note that there remain many other valid choices of h_{U,r}^{(k)} and h_{L,r}^{(k)} as upper/lower bounds of σ(y), but ideally we would like them to be close to σ(y) in order to achieve a tighter lower bound of minimum distortion.

3.3 Extension to quadratic bounds

In addition to the linear bounds on activation functions, the proposed CROWN framework can also incorporate quadratic bounds by adding a quadratic term to h_{U,r}^{(k)} and h_{L,r}^{(k)}:

\[ h_{U,r}^{(k)}(y) = \eta_{U,r}^{(k)} y^2 + \alpha_{U,r}^{(k)} \big(y + \beta_{U,r}^{(k)}\big), \qquad h_{L,r}^{(k)}(y) = \eta_{L,r}^{(k)} y^2 + \alpha_{L,r}^{(k)} \big(y + \beta_{L,r}^{(k)}\big), \]

where η_{U,r}^{(k)}, η_{L,r}^{(k)} ∈ R. Following the procedure of unwrapping the activation functions at layer m−1, we show in Appendix D that the output upper and lower bounds with quadratic approximations are:

\[ f_j^U(x) = \Phi_{m-2}(x)^\top Q_U^{(m-1)} \Phi_{m-2}(x) + 2 p_U^{(m-1)} \Phi_{m-2}(x) + s_U^{(m-1)}, \tag{3} \]
\[ f_j^L(x) = \Phi_{m-2}(x)^\top Q_L^{(m-1)} \Phi_{m-2}(x) + 2 p_L^{(m-1)} \Phi_{m-2}(x) + s_L^{(m-1)}, \tag{4} \]

where Q_U^{(m−1)} = W^{(m−1)⊤} D_U^{(m−1)} W^{(m−1)}, Q_L^{(m−1)} = W^{(m−1)⊤} D_L^{(m−1)} W^{(m−1)}, and p_U^{(m−1)}, p_L^{(m−1)}, s_U^{(m−1)}, s_L^{(m−1)} are defined in Appendix D due to the page limit. When m = 2, Φ_{m−2}(x) = x and we can directly optimize over x ∈ B_p(x_0, ε); otherwise, we can use the post-activation bounds of layer m−2 as the constraints. D_U^{(m−1)} in (3) is a diagonal matrix with the i-th entry being W_{j,i}^{(m)} η_{U,i}^{(m−1)} if W_{j,i}^{(m)} ≥ 0, or W_{j,i}^{(m)} η_{L,i}^{(m−1)} if W_{j,i}^{(m)} < 0. Thus, in general Q_U^{(m−1)} is indefinite, resulting in a non-convex optimization when finding the global bounds as in Corollary 3.3. Fortunately, by properly choosing the quadratic bounds, we can turn the problem max_{x∈B_p(x_0,ε)} f_j^U(x) into a convex quadratic programming problem; for example, we can let η_{U,i}^{(m−1)} = 0 for all W_{j,i}^{(m)} > 0 and let η_{L,i}^{(m−1)} > 0 to make D_U^{(m−1)} have only negative and zero diagonal entries for the maximization problem. This is equivalent to applying a linear upper bound and a quadratic lower bound to the activation function. Similarly, for D_L^{(m−1)}, we let η_{U,i}^{(m−1)} = 0 for all W_{j,i}^{(m)} < 0 and let η_{L,i}^{(m−1)} > 0 to make D_L^{(m−1)} have non-negative diagonal entries, and hence the problem min_{x∈B_p(x_0,ε)} f_j^L(x) is convex. We can solve this convex program with projected gradient descent (PGD) over x ∈ B_p(x_0, ε) with Armijo line search. Empirically, we find that PGD usually converges within a few iterations.
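For intuition, here is a minimal sketch of projected gradient ascent for the resulting concave quadratic objective over an ℓ2 ball. It uses a fixed step size instead of the Armijo line search used in the paper, and the function names are ours:

```python
import numpy as np

def pgd_max_quadratic(Q, p_vec, s, x0, eps, steps=50, lr=0.1):
    """Projected gradient ascent for max_{||x - x0||_2 <= eps} x^T Q x + 2 p^T x + s,
    assuming Q is negative semidefinite so the objective is concave (the case
    arranged in Sec. 3.3 by the choice of the eta parameters)."""
    x = x0.copy()
    for _ in range(steps):
        x = x + lr * (2 * Q @ x + 2 * p_vec)  # gradient ascent step
        delta = x - x0
        nrm = np.linalg.norm(delta)
        if nrm > eps:                          # project back onto the l2 ball
            x = x0 + delta * (eps / nrm)
    return x @ Q @ x + 2 * p_vec @ x + s      # upper bound value f_j^U at the maximizer
```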
4 Experiments

Methods. For ReLU networks, CROWN-Ada is CROWN with adaptive linear bounds (Sec. 3.2) and CROWN-Quad is CROWN with quadratic bounds (Sec. 3.3). Fast-Lin and Fast-Lip are the state-of-the-art fast certified lower bounds proposed in [20]. Reluplex can solve for the exact minimum adversarial distortion but is only computationally feasible for very small networks. LP-Full is based on the LP formulation in [18], where we solve LPs for each neuron exactly to achieve the best possible bound. For networks with other activation functions, CROWN-general is our proposed method.

Model and Dataset. We evaluate CROWN and the other baselines on multi-layer perceptron (MLP) models trained on the MNIST and CIFAR-10 datasets. We denote a feed-forward network with m layers and n neurons per layer as m × [n]. For models with ReLU activation, we use the pretrained models provided by [20] and also evaluate the same set of 100 random test images and random attack targets as in [20] (according to their released code) to make our results comparable. For training NN models with other activation functions, we search for the best learning rate and weight decay parameters to achieve a similar level of accuracy to the ReLU models.

Implementation and Setup. We implement our algorithm in Python (numpy with numba). Most computations in our method are matrix operations that can be automatically parallelized by the BLAS library; however, we set the number of BLAS threads to 1 for a fair comparison to other methods. Experiments were conducted on an Intel Skylake server CPU running at 2.0 GHz on Google Cloud. Our code is available at https://github.com/CROWN-Robustness/Crown

[Figure 2: Certified lower bounds and minimum distortion comparisons for ℓ2 and ℓ∞ distortions on (a) MNIST 2×[20], ℓ2; (b) MNIST 2×[20], ℓ∞; (c) MNIST 3×[20], ℓ2; (d) MNIST 3×[20], ℓ∞. The left y-axis is distortion and the right y-axis (black line) is computation time (seconds, logarithmic scale). At the top of the figures are the avg. CLEVER score and the upper bound found by the C&W attack [6]. From left to right in (a)-(d): CROWN-Ada, (CROWN-Quad), Fast-Lin, Fast-Lip, LP-Full and (Reluplex).]

Table 4: Comparison of certified lower bounds on large ReLU networks. Bounds are the average over 100 images (skipping misclassified images) with random attack targets. Percentage improvements are calculated against Fast-Lin, as Fast-Lip is worse than Fast-Lin.

| Network           | ℓp norm | Fast-Lin | Fast-Lip | CROWN-Ada | Improvement | Time (s): Fast-Lin | Fast-Lip | CROWN-Ada |
| MNIST 4×[1024]    | ℓ1      | 1.57649  | 0.72800  | 1.88217   | +19%        | 1.80               | 2.04     | 3.54      |
|                   | ℓ2      | 0.18891  | 0.06487  | 0.22811   | +21%        | 1.78               | 1.96     | 3.79      |
|                   | ℓ∞      | 0.00823  | 0.00264  | 0.00997   | +21%        | 1.53               | 2.17     | 3.57      |
| CIFAR-10 7×[1024] | ℓ1      | 0.86468  | 0.09239  | 1.09067   | +26%        | 13.21              | 19.76    | 22.43     |
|                   | ℓ2      | 0.05937  | 0.00407  | 0.07496   | +26%        | 12.57              | 18.71    | 21.82    |
|                   | ℓ∞      | 0.00134  | 0.00008  | 0.00169   | +26%        | 8.98               | 20.34    | 16.66     |

Table 5: Comparison of certified lower bounds by CROWN-Ada on ReLU networks and CROWN-general on networks with tanh, sigmoid and arctan activations. CIFAR models with sigmoid activations achieve much worse accuracy than the other networks and are thus excluded.

| Network           | ℓp norm | ReLU    | tanh    | sigmoid | arctan  | Time (s): ReLU | tanh  | sigmoid | arctan |
| MNIST 3×[1024]    | ℓ1      | 3.00231 | 2.48407 | 2.94239 | 2.33246 | 1.25           | 1.61  | 1.68    | 1.70   |
|                   | ℓ2      | 0.50841 | 0.27287 | 0.44471 | 0.30345 | 1.26           | 1.76  | 1.61    | 1.75   |
|                   | ℓ∞      | 0.02576 | 0.01182 | 0.02122 | 0.01363 | 1.37           | 1.78  | 1.76    | 1.77   |
| CIFAR-10 6×[2048] | ℓ1      | 0.91201 | 0.44059 | -       | 0.46198 | 71.62          | 89.77 | -       | 83.80  |
|                   | ℓ2      | 0.05245 | 0.02538 | -       | 0.02515 | 71.51          | 84.22 | -       | 83.12  |
|                   | ℓ∞      | 0.00114 | 0.00055 | -       | 0.00055 | 49.28          | 59.72 | -       | 58.04  |

Results on Small Networks. Figure 2 shows the certified lower bounds for ℓ2 and ℓ∞ distortions found by different algorithms on small networks, where Reluplex is feasible and we can observe the gap between the different certified lower bounds and the true minimum adversarial distortion. Reluplex and LP-Full are orders of magnitude slower than the other methods (note the logarithmic scale on the right y-axis), and CROWN-Quad (for 2 layers) and CROWN-Ada achieve the largest lower bounds. Improvements of CROWN-Ada over Fast-Lin are more significant in larger NNs, as we show below.

Results on Large ReLU Networks. Table 4 reports the lower bounds found by different algorithms for all common ℓp norms. CROWN-Ada significantly outperforms Fast-Lin and Fast-Lip, while its computation time increases by less than 2X over Fast-Lin and is comparable with Fast-Lip. See the Appendix for results on more networks.

Results on Different Activations. Table 5 compares the certified lower bounds computed by CROWN-general for the four activation functions and different ℓp norms on large networks. CROWN-general is able to certify non-trivial lower bounds for all four activation functions efficiently. Compared to CROWN-Ada on ReLU networks, certifying general activations that are not piece-wise linear only incurs about 20% additional computational overhead.

5 Conclusion

We have presented a general framework, CROWN, to efficiently compute a certified lower
bound of the minimum distortion in neural networks for any given data point x_0. CROWN features adaptive bounds for improved robustness certification and applies to general activation functions. Moreover, experiments show that (1) CROWN outperforms state-of-the-art baselines on ReLU networks and (2) CROWN can efficiently certify non-trivial lower bounds for large networks with over 10K neurons and with different activation functions.

Acknowledgement

This work was supported in part by NSF IIS-1719097, an Intel faculty award, the Google Cloud Credits for Research Program, and GPUs donated by NVIDIA. Tsui-Wei Weng and Luca Daniel are partially supported by the MIT-IBM Watson AI Lab and the MIT-Skoltech program.

References

[1] A. Fawzi, S.-M. Moosavi-Dezfooli, and P. Frossard, "The robustness of deep networks: A geometrical perspective," IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 50-62, 2017.
[2] B. Biggio and F. Roli, "Wild patterns: Ten years after the rise of adversarial machine learning," arXiv preprint arXiv:1712.03141, 2017.
[3] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," arXiv preprint arXiv:1312.6199, 2013.
[4] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," ICLR, 2015.
[5] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, "DeepFool: a simple and accurate method to fool deep neural networks," in IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2574-2582.
[6] N. Carlini and D. Wagner, "Towards evaluating the robustness of neural networks," in IEEE Symposium on Security and Privacy (SP), 2017, pp. 39-57.
[7] H. Chen, H. Zhang, P.-Y. Chen, J. Yi, and C.-J. Hsieh, "Show-and-fool: Crafting adversarial examples for neural image captioning," arXiv preprint arXiv:1712.02051, 2017.
[8] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, "Practical black-box attacks against machine learning," in ACM Asia Conference on Computer and Communications Security, 2017, pp. 506-519.
[9] Y. Liu, X. Chen, C. Liu, and D. Song, "Delving into transferable adversarial examples and black-box attacks," ICLR, 2017.
[10] P.-Y. Chen, H. Zhang, Y. Sharma, J. Yi, and C.-J. Hsieh, "ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models," in ACM Workshop on Artificial Intelligence and Security, 2017, pp. 15-26.
[11] W. Brendel, J. Rauber, and M. Bethge, "Decision-based adversarial attacks: Reliable attacks against black-box machine learning models," ICLR, 2018.
[12] A. Kurakin, I. Goodfellow, and S. Bengio, "Adversarial examples in the physical world," arXiv preprint arXiv:1607.02533, 2016.
[13] I. Evtimov, K. Eykholt, E. Fernandes, T. Kohno, B. Li, A. Prakash, A. Rahmati, and D. Song, "Robust physical-world attacks on machine learning models," arXiv preprint arXiv:1707.08945, 2017.
[14] A. Athalye and I. Sutskever, "Synthesizing robust adversarial examples," arXiv preprint arXiv:1707.07397, 2017.
[15] G. Katz, C. Barrett, D. L. Dill, K. Julian, and M. J. Kochenderfer, "Reluplex: An efficient SMT solver for verifying deep neural networks," in International Conference on Computer Aided Verification. Springer, 2017, pp. 97-117.
[16] A. Sinha, H. Namkoong, and J. Duchi, "Certifiable distributional robustness with principled adversarial training," ICLR, 2018.
[17] J. Peck, J. Roels, B. Goossens, and Y. Saeys, "Lower bounds on the robustness to adversarial perturbations," in NIPS, 2017.
[18] J. Z. Kolter and E. Wong, "Provable defenses against adversarial examples via the convex outer adversarial polytope," ICML, 2018.
[19] A. Raghunathan, J. Steinhardt, and P. Liang, "Certified defenses against adversarial examples," ICLR, 2018.
[20] T.-W. Weng, H. Zhang, H. Chen, Z. Song, C.-J. Hsieh, D. Boning, I. S. Dhillon, and L. Daniel, "Towards fast computation of certified robustness for ReLU networks," ICML, 2018.
[21] A. Lomuscio and L. Maganti, "An approach to reachability analysis for feed-forward ReLU neural networks," arXiv preprint arXiv:1706.07351, 2017.
[22] C.-H. Cheng, G. Nührenberg, and H. Ruess, "Maximum resilience of artificial neural networks," in International Symposium on Automated Technology for Verification and Analysis. Springer, 2017, pp. 251-268.
[23] M. Fischetti and J. Jo, "Deep neural networks as 0-1 mixed integer linear programs: A feasibility study," arXiv preprint arXiv:1712.06174, 2017.
[24] N. Carlini, G. Katz, C. Barrett, and D. L. Dill, "Provably minimally-distorted adversarial examples," arXiv preprint arXiv:1709.10207, 2017.
[25] R. Ehlers, "Formal verification of piece-wise linear feed-forward neural networks," in International Symposium on Automated Technology for Verification and Analysis. Springer, 2017, pp. 269-286.
[26] M. Hein and M. Andriushchenko, "Formal guarantees on the robustness of a classifier against adversarial manipulation," in NIPS, 2017.
[27] T.-W. Weng, H. Zhang, P.-Y. Chen, J. Yi, D. Su, Y. Gao, C.-J. Hsieh, and L. Daniel, "Evaluating the robustness of neural networks: An extreme value theory approach," ICLR, 2018.
[28] T. Gehr, M. Mirman, D. Drachsler-Cohen, P. Tsankov, S. Chaudhuri, and M. Vechev, "AI2: Safety and robustness certification of neural networks with abstract interpretation," in IEEE Symposium on Security and Privacy (SP), 2018, pp. 948-963.
[29] K. Dvijotham, R. Stanforth, S. Gowal, T. Mann, and P. Kohli, "A dual approach to scalable verification of deep networks," UAI, 2018.
[30] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks," ICLR, 2018.
[31] X. Cao and N. Z. Gong, "Mitigating evasion attacks to deep neural networks via region-based classification," in ACM Annual Computer Security Applications Conference, 2017, pp. 278-287.
[32] P.-Y. Chen, Y. Sharma, H. Zhang, J. Yi, and C.-J. Hsieh, "EAD: Elastic-net attacks to deep neural networks via adversarial examples," AAAI, 2018.
", "award": [], "sourceid": 2388, "authors": [{"given_name": "Huan", "family_name": "Zhang", "institution": "UCLA"}, {"given_name": "Tsui-Wei", "family_name": "Weng", "institution": "MIT"}, {"given_name": "Pin-Yu", "family_name": "Chen", "institution": "IBM Research AI"}, {"given_name": "Cho-Jui", "family_name": "Hsieh", "institution": "UCLA, Google Research"}, {"given_name": "Luca", "family_name": "Daniel", "institution": "MIT"}]}