{"title": "Robustness of conditional GANs to noisy labels", "book": "Advances in Neural Information Processing Systems", "page_first": 10271, "page_last": 10282, "abstract": "We study the problem of learning conditional generators from noisy labeled samples, where the labels are corrupted by random noise. A standard training of conditional GANs will not only produce samples with wrong labels, but also generate poor quality samples. We consider two scenarios, depending on whether the noise model is known or not. When the distribution of the noise is known, we introduce a novel architecture which we call Robust Conditional GAN (RCGAN). The main idea is to corrupt the label of the generated sample before feeding to the adversarial discriminator, forcing the generator to produce samples with clean labels. This approach of passing through a matching noisy channel is justified by accompanying multiplicative approximation bounds between the loss of the RCGAN and the distance between the clean real distribution and the generator distribution. This shows that the proposed approach is robust, when used with a carefully chosen discriminator architecture, known as projection discriminator. When the distribution of the noise is not known, we provide an extension of our architecture, which we call RCGAN-U, that learns the noise model simultaneously while training the generator. 
We show experimentally on MNIST and CIFAR-10 datasets that both the approaches consistently improve upon baseline approaches, and RCGAN-U closely matches the performance of RCGAN.", "full_text": "Robustness of conditional GANs to noisy labels

Kiran Koshy Thekumparampil\u2020, Ashish Khetan\u2020, Zinan Lin\u2021, Sewoong Oh\u2020
\u2020University of Illinois at Urbana-Champaign, \u2021Carnegie Mellon University

Author emails are thekump2@illinois.edu, ashish.khetan09@gmail.com, zinanl@andrew.cmu.edu, and swoh@illinois.edu. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number OCI-1053575. Specifically, it used the Bridges system, which is supported by NSF award number ACI-1445606, at the Pittsburgh Supercomputing Center (PSC).
32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montr\u00e9al, Canada.

Abstract

We study the problem of learning conditional generators from noisy labeled samples, where the labels are corrupted by random noise. A standard training of conditional GANs will not only produce samples with wrong labels, but also generate poor quality samples. We consider two scenarios, depending on whether the noise model is known or not. When the distribution of the noise is known, we introduce a novel architecture which we call Robust Conditional GAN (RCGAN). The main idea is to corrupt the label of the generated sample before feeding to the adversarial discriminator, forcing the generator to produce samples with clean labels. This approach of passing through a matching noisy channel is justified by accompanying multiplicative approximation bounds between the loss of the RCGAN and the distance between the clean real distribution and the generator distribution. This shows that the proposed approach is robust, when used with a carefully chosen discriminator architecture, known as projection discriminator. When the distribution of the noise is not known, we provide an extension of our architecture, which we call RCGAN-U, that learns the noise model simultaneously while training the generator. We show experimentally on MNIST and CIFAR-10 datasets that both the approaches consistently improve upon baseline approaches, and RCGAN-U closely matches the performance of RCGAN.

1 Introduction

Conditional generative adversarial networks (GAN) have been widely successful in several applications including improving image quality, semi-supervised learning, reinforcement learning, category transformation, style transfer, image de-noising, compression, in-painting, and super-resolution [30, 13, 49, 36, 26, 58]. The goal of training a conditional GAN is to generate samples from distributions satisfying certain conditioning on some correlated features. Concretely, given samples from the joint distribution of a data point $x$ and a label $y$, we want to learn to generate samples from the true conditional distribution of the real data $P_{X|Y}$. A canonical conditional GAN studied in the literature is the case of a discrete label $y$ [30, 36, 35, 32]. Significant progress has been made in this setting, which is typically evaluated on the quality of the conditional samples. These evaluations include measuring inception scores and intra Fr\u00e9chet inception distances, visual inspection on downstream tasks such as category morphing and super resolution [32], and faithfulness of the samples as measured by how accurately we can infer the class that generated the sample [36].

We study the problem of training conditional GANs with noisy discrete labels. By noisy labels, we refer to a setting where the label $y$ for each example in the training set is randomly corrupted. Such noise can result from an adversary deliberately corrupting the data [7] or from human errors in crowdsourced label collection [12, 18]. This can be modeled as a random process, where a clean data point $x \\in \\mathcal{X}$ and its label $y \\in [m]$ are drawn from a joint distribution $P_{X,Y}$ with $m$ classes. For each data point, the label is corrupted by passing through a noisy channel represented by a row-stochastic confusion matrix $C \\in \\mathbb{R}^{m \\times m}$ defined as $C_{ij} \\triangleq P(\\tilde{Y} = j \\mid Y = i)$. This defines a joint distribution for the data point $x$ and a noisy label $\\tilde{y}$: $\\tilde{P}_{X,\\tilde{Y}}$. If we train a standard conditional GAN on noisy samples, then it solves the following optimization:

$$\\min_{G \\in \\mathcal{G}} \\max_{D \\in \\mathcal{F}} V(G, D) = \\mathbb{E}_{(x,\\tilde{y}) \\sim \\tilde{P}_{X,\\tilde{Y}}}[\\phi(D(x, \\tilde{y}))] + \\mathbb{E}_{z \\sim N,\\, y \\sim \\tilde{P}_{\\tilde{Y}}}[\\phi(1 - D(G(z; y), y))], \\tag{1}$$

where $\\phi$ is a function of choice, $D$ and $G$ are the discriminator and the generator respectively, optimized over function classes $\\mathcal{F}$ and $\\mathcal{G}$ of our choice, and $N$ is the distribution of the latent random vector. For typical choices of $\\phi$, for example $\\log(\\cdot)$, and large enough function classes $\\mathcal{G}$ and $\\mathcal{F}$, the optimal conditional generator learns to generate samples from $\\tilde{P}_{X|\\tilde{Y}}$, the corrupted conditional distribution. In other words, it generates samples $X$ from classes other than what it is conditioned on. As the learned distribution exhibits such a bias, we call this naive approach the Biased GAN. Under this setting, there is a fundamental question of interest: can we design a novel conditional GAN that can generate samples from the true conditional distribution $P_{X|Y}$, even when trained on noisy samples?

Several aspects of this problem make it challenging and interesting. First, the performance of such a robust GAN should depend on how noisy the channel $C$ is. If $C$ is rank-deficient, for instance, then there are multiple distributions that result in the same distribution after the corruption, and hence no reliable learning of the true distribution is possible. We would ideally want a theoretical guarantee that shows such a trade-off between $C$ and the robustness of GANs. Next, when the noise is from errors in crowdsourced labels, we might have some access to the confusion matrix $C$ from historical data. In other cases of adversarial corruption, we might not have any information about $C$. We want to provide robust solutions to both. Finally, an important practical challenge in this setting is to correct the noisy labels in the training data. We address all such variations in our approaches and make the following contributions.

Our contributions. We introduce two architectures to train conditional GANs with noisy samples. First, when we have the knowledge of the confusion matrix $C$, we propose RCGAN (Robust Conditional GAN) in Section 2. We first prove that minimizing the RCGAN loss provably recovers the clean distribution $P_{X|Y}$ (Theorem 2), under certain conditions on the class $\\mathcal{F}$ of discriminators we optimize over (Assumption 1). We show that such a condition on $\\mathcal{F}$ is also necessary, as without it, the training loss can be arbitrarily small while the generated distribution can be far from the real one (Theorem 4). The assumption leads to our particular choice of the discriminator in RCGAN, called the projection discriminator [32], that satisfies all the conditions (Remark 1). Finally, we provide a finite sample generalization bound showing that the loss minimized in training RCGAN does generalize, and results in the learned distribution being close to the clean conditional distribution $P_{X|Y}$ (Theorem 3). Experimental results on benchmark datasets confirm that RCGAN is robust against noisy samples, and improves significantly over the naive Biased GAN.

Secondly, when we do not have access to $C$, we propose RCGAN-U (RCGAN with Unknown noise distribution) in Section 4. We provide experimental results showing that performance gains similar to those of RCGAN can be achieved. Finally, we showcase the practical use of the conditional GANs learned this way, by using them to fix the noisy labels in the training data. Numerical experiments confirm that the RCGAN framework provides a more robust approach to correcting the noisy labels, compared to the state-of-the-art methods that rely only on discriminators.

Related work. Two popular training methods for generative models are variational auto-encoders [22] and adversarial training [14]. The adversarial training approach has made significant advances in several applications of practical interest. [37, 2, 5] propose new architectures that significantly improve the training on practical image datasets. [58, 16] propose new architectures to transfer the style of one image to the other domain. [26, 43] show how to enhance a given image with a learned generator, by enhancing the resolution or making it more realistic. [27, 50] show how to generate videos and [51, 1] demonstrate that 3-dimensional models can be generated from adversarial training. [23] proposes a new architecture encoding causal structures in conditional GANs. [42] introduces the state-of-the-art conditional independence tester. In a different direction, several recent approaches showcase how the manifold learned by the adversarial training can be used to solve inverse problems [9, 57, 53].

Conditional GANs have been proposed as a successful tool for various applications, including class conditional image generation [36], image to image translation [21], and image generation from text [38, 55]. Most of the conditional GANs incorporate the class information by naively concatenating it to the input or feature vector at some middle layer [30, 13, 38, 55]. AC-GAN [36] creates an auxiliary classifier to incorporate class information. Projection discriminator GAN [32] takes an inner product between the embedded class vector and the feature vector. A recent work [31], which proposes spectral normalization, shows that high quality image generation on the 1000-class ILSVRC2012 dataset [39] can be achieved using a projection conditional discriminator.

Robustness of (unconditional) GANs against adversarial or random noise has recently been studied in [10, 52]. [52] studies an adversarial attack that perturbs the discriminator output. The proposed architecture of RCGAN is inspired by a closely related work of AmbientGAN in [10]. AmbientGAN is a general framework addressing any corruption on the image itself (not necessarily just the labels). Given corrupted samples with a known corruption, AmbientGAN applies that corruption to the output of the generator before feeding it to the discriminator. Motivated by the success of AmbientGAN in de-noising, we propose RCGAN. An important distinction is that we make specific architectural choices guided by our theoretical analysis that give a significant gain in practice (Appendix J). Under the scenario of interest with noisy labels, we provide sharp analyses for both the population loss and the finite sample loss. Such sharp characterizations do not exist for the more general AmbientGAN scenarios. Further, our RCGAN-U does not require the knowledge of the confusion matrix, departing from the AmbientGAN approach.

Learning classifiers from noisy labels is a closely related problem. Recently [34, 20] proposed a theoretically motivated classifier which minimizes the modified loss in the presence of noisy labels and showed improvement over the robust classifiers [29, 45, 46]. [47] proposed adding noise to the classifier output to match the noise distribution.

Notation. For a vector, $\\|x\\|_p = (\\sum_i |x_i|^p)^{1/p}$ is the $\\ell_p$-norm. For a matrix, let $|||A|||_p = \\max_{\\|x\\|_p = 1} \\|Ax\\|_p$ denote the operator norm. Then $|||A|||_{\\infty} = \\max_i \\sum_j |A_{ij}|$, $|||A|||_1 = \\max_j \\sum_i |A_{ij}|$ and $|||A|||_2 = \\sigma_{\\max}(A)$, the maximum singular value. $\\mathbf{1}$ is the all ones vector and $I$ is the identity matrix. $[n] = \\{1, \\ldots, n\\}$. For a vector $x \\in \\mathbb{R}^n$, $x_i$ ($i \\in [n]$) is its $i$-th coordinate.

2 Our first architecture: RCGAN

Training a conditional GAN with noisy samples results in a biased generator. We propose the Robust Conditional GAN (RCGAN) architecture, which has the following pre-processing, discriminator update, and generator update steps. We assume in this section that the confusion matrix $C$ is known (and the marginal $P_Y$ can easily be inferred), and address the case of unknown $C$ in Section 4.

Figure 1: The output $x$ of the conditional generator $G$ is paired with a noisy label $\\tilde{y}$ corrupted by the channel $C$. The discriminator $D$ estimates whether a given labeled sample is coming from the real data $(x_{\\text{real}}, \\tilde{y}_{\\text{real}})$ or generated data $(x, \\tilde{y})$. The permutation regularizer $h$ is pre-trained on real data.

Pre-processing: We train a classifier $h^*$ to predict the noisy label $\\tilde{y}$ given $x$ under a loss $\\ell$, trained on $h^* \\in \\arg\\min_{h \\in \\mathcal{H}} \\mathbb{E}_{(x,\\tilde{y}) \\sim \\tilde{P}_{X,\\tilde{Y}}}[\\ell(h(x), \\tilde{y})]$, where $\\mathcal{H}$ is a parametric family of classifiers (typically neural networks) and $\\tilde{P}_{X,\\tilde{Y}}$ is the joint distribution of real $x$ and the corresponding real noisy $\\tilde{y}$.

D-step: We train on the following adversarial loss. In the second term below, $y$ is generated according to $P_Y$ and the corresponding noisy labels are generated by corrupting $y$ according to the conditional distribution $C_y$, which is the $y$-th row of the confusion matrix (assumed to be known):

$$\\max_{D \\in \\mathcal{F}} \\mathbb{E}_{(x,\\tilde{y}) \\sim \\tilde{P}_{X,\\tilde{Y}}}[\\phi(D(x, \\tilde{y}))] + \\mathbb{E}_{z \\sim N,\\, y \\sim P_Y,\\, \\tilde{y}|y \\sim C_y}[\\phi(1 - D(G(z; y), \\tilde{y}))],$$

where $P_Y$ is the true marginal distribution of the labels, $N$ is the distribution of the latent random vector, and $\\mathcal{F}$ is a family of discriminators.

G-step: We train on the following loss with some $\\lambda > 0$:

$$\\min_{G \\in \\mathcal{G}} \\mathbb{E}_{z \\sim N,\\, y \\sim P_Y,\\, \\tilde{y}|y \\sim C_y}\\big[\\phi(1 - D(G(z; y), \\tilde{y})) + \\lambda\\, \\ell(h^*(G(z; y)), y)\\big], \\tag{2}$$

where $\\mathcal{G}$ is a family of generators. The idea of using auxiliary classifiers has been used to improve the quality of the image and the stability of the training, for example in auxiliary classifier GAN (AC-GAN) [36], and to improve the quality of clustering in the latent space [33]. We propose an auxiliary classifier $h$ that mitigates a permutation error, which we empirically identified in a naive implementation of our idea with no regularizers.

Permutation regularizer (controlled by $\\lambda$). Permutation error occurs if, when
asked to produce samples from a target class, the trained generator produces samples dominantly from a single class but different from the target class. We propose a regularizer $h^*$, which predicts the noisy label $\\tilde{y}$. As long as the confusion matrix is diagonally dominant, which is a necessary condition for identifiability, this regularizer encourages the correct permutation of the labels. More regularizers could potentially provide additional robustness and we discuss one such regularizer (similar to the InfoGAN loss [11]) in Appendix K.

Theoretical motivation for RCGAN. When $\\lambda = 0$, we get the standard conditional GAN update steps, albeit one which tries to minimize the discriminator loss between the noisy real distribution $\\tilde{P}$ and the distribution $\\tilde{Q}$ of the generator when the label is passed through the same noisy channel parameterized by $C$. The main idea of RCGAN is to minimize a certain divergence between noisy real data and noisy generated data. For example, the choice of bounded functions $\\mathcal{F} = \\{D : \\mathcal{X} \\times [m] \\to [0, 1]\\}$ and identity map $\\phi(a) = a$ leads to a total variation minimization; the loss minimized in the G-step is the total variation $d_{TV}(\\tilde{P}, \\tilde{Q}) \\triangleq \\sup_{S \\subseteq \\mathcal{X} \\times [m]} \\{\\tilde{P}(S) - \\tilde{Q}(S)\\}$ between the two distributions with corrupted labels, up to some scaling and some shift. If we choose $\\mathcal{F} = \\{D : \\mathcal{X} \\times [m] \\to [0, 1]\\}$ and $\\phi(a) = \\log(a)$, then we are minimizing the Jensen-Shannon divergence $d_{JS}(\\tilde{P}, \\tilde{Q}) \\triangleq (1/2)\\, d_{KL}(\\tilde{P} \\,\\|\\, (\\tilde{P} + \\tilde{Q})/2) + (1/2)\\, d_{KL}(\\tilde{Q} \\,\\|\\, (\\tilde{P} + \\tilde{Q})/2)$, where $d_{KL}(\\cdot \\,\\|\\, \\cdot)$ denotes the Kullback-Leibler divergence. The following theorem provides approximation guarantees for some common divergence measures over a noisy channel, justifying our proposed practical approach. We refer to Appendix B for a proof.

Theorem 1. Let $P_{X,Y}$ and $Q_{X,Y}$ be two distributions on $\\mathcal{X} \\times [m]$. Let $\\tilde{P}_{X,\\tilde{Y}}$, $\\tilde{Q}_{X,\\tilde{Y}}$ be the corresponding distributions when samples from $P$, $Q$ are passed through the noisy channel given by the confusion matrix $C \\in \\mathbb{R}^{m \\times m}$ (as defined in Section 1). If $C$ is full-rank, we get

$$d_{TV}(\\tilde{P}, \\tilde{Q}) \\le d_{TV}(P, Q) \\le |||C^{-1}|||_{\\infty}\\, d_{TV}(\\tilde{P}, \\tilde{Q}), \\text{ and} \\tag{3}$$

$$d_{JS}(\\tilde{P} \\,\\|\\, \\tilde{Q}) \\le d_{JS}(P \\,\\|\\, Q) \\le |||C^{-1}|||_{\\infty} \\sqrt{8\\, d_{JS}(\\tilde{P} \\,\\|\\, \\tilde{Q})}. \\tag{4}$$

To interpret this theorem, let
$Q$ denote the distribution of the generator. The theorem implies that when the noisy generator distribution $\\tilde{Q}$ becomes close to the noisy real distribution $\\tilde{P}$ in total variation or in Jensen-Shannon divergence, then the generator distribution $Q$ must be close to the distribution of real data $P$ in the same metric. This justifies the use of the proposed architecture RCGAN. In practice, we minimize the sample divergence of the two distributions, instead of the population divergence as analyzed in the above theorem. However, these standard divergences are known to not generalize in training GANs [3]. To this end, we provide in Section 3 analyses on neural network distances, which are known to generalize, and provide finite sample bounds.

3 Theoretical Analysis of RCGAN

It was shown in [3] that standard GAN losses of Jensen-Shannon divergence and Wasserstein distance both fail to generalize with a finite number of samples. On the other hand, more recent advances in analyzing GANs in [56, 6, 4] show promising generalization bounds by either assuming Lipschitz conditions on the generator model or by restricting the analysis to certain classes of distributions. Under those assumptions, where JS divergence generalizes, Theorem 1 justifies the use of the proposed RCGAN. However, those require the distribution to be Gaussian, a mixture of Gaussians, or the output of a neural network generator, for example in [4].

In this section, we provide analyses of RCGAN on a distance that generalizes without any assumptions on the distribution of the real data, as proven in [3]: the neural network distance. Formally, consider a class of real-valued functions $\\mathcal{F}$ and a function $\\phi : [0, 1] \\to \\mathbb{R}$ which is either convex or concave. The neural network distance is defined as

$$d_{\\mathcal{F},\\phi}(P, Q) \\triangleq \\sup_{D \\in \\mathcal{F}} \\mathbb{E}_{(x,y) \\sim P}[\\phi(D(x, y))] + \\mathbb{E}_{(x,y) \\sim Q}[\\phi(1 - D(x, y))] - \\mu_{\\phi}, \\tag{5}$$

where $P$ is the distribution of the real data, $Q$ is that of the generated data, and $\\mu_{\\phi}$ is the constant correction term to ensure that $d_{\\mathcal{F},\\phi}(P, P) = 0$. We further assume that $\\mathcal{F}$ includes three constant functions $D(x, y) = 0$, $D(x, y) = 1/2$, and $D(x, y) = 1$, in order to ensure that $d_{\\mathcal{F},\\phi}(P, Q) \\ge 0$ and $d_{\\mathcal{F},\\phi}(P, P) = 0$, as shown in Lemma 1 in the Appendix.

The proposed RCGAN with $\\lambda = 0$ approximately minimizes the neural network distance $d_{\\mathcal{F},\\phi}(\\tilde{P}, \\tilde{Q})$ between the two corrupted distributions. In practice, $\\mathcal{F}$ is a parametric family of functions from a specific neural network architecture that the designer has chosen. In theory, we aim to identify how the choice of class $\\mathcal{F}$ provides the desired approximation bounds similar to those in Theorem 1, but for neural network distances. This analysis leads to the choice of the projection discriminator [32] to be used in RCGAN (Remark 1). On the other hand, we show in Theorem 4 that an inappropriate choice of the discriminator architecture can cause non-approximation. Further, we provide the sample complexity of the approximation bounds in Theorem 3.

We refer to the un-regularized version with $\\lambda = 0$ as simply RCGAN. In this section, we focus on a class of loss functions called Integral Probability Metrics (IPM) where $\\phi(x) = x$ [44]. This is a popular choice of loss in GANs in practice [48, 2, 8] and in analyses [4]. We write the induced neural network distance as $d_{\\mathcal{F}}(P, Q)$, dropping the $\\phi$ in the notation.

3.1 Approximation bounds for neural network distances

We define an operation $\\circ$ over a matrix $T \\in \\mathbb{R}^{m \\times m}$ and a class $\\mathcal{F}$ of functions on $\\mathcal{X} \\times [m] \\to \\mathbb{R}$ as

$$T \\circ \\mathcal{F} \\triangleq \\Big\\{ g(x, y) = \\sum_{\\tilde{y} \\in [m]} T_{y\\tilde{y}}\\, f(x, \\tilde{y}) \\;\\Big|\\; f \\in \\mathcal{F} \\Big\\}. \\tag{6}$$

This makes it convenient to represent the neural network distance corrupted by noise with a confusion matrix $C \\in \\mathbb{R}^{m \\times m}$, where $C_{y\\tilde{y}}$ is the probability that a label $y$ is corrupted as $\\tilde{y}$. Formally, it follows from (5) and (6) that $d_{\\mathcal{F}}(\\tilde{P}, \\tilde{Q}) = d_{C \\circ \\mathcal{F}}(P, Q)$. We refer to Appendix F for a proof. For $d_{\\mathcal{F}}(\\tilde{P}, \\tilde{Q})$ to be a good approximation of $d_{\\mathcal{F}}(P, Q)$, we show that the following condition is sufficient.

Assumption 1. We assume that the class of discriminator functions $\\mathcal{F}$ can be decomposed into three parts $\\mathcal{F} = \\{f_1 + f_2 + c \\mid f_1 \\in \\mathcal{F}_1, f_2 \\in \\mathcal{F}_2\\}$ such that $c \\in \\mathbb{R}$ is any constant and
\u2022 $\\mathcal{F}_1$ satisfies the inclusion condition:
$$T \\circ \\mathcal{F}_1 \\subseteq \\mathcal{F}_1, \\tag{7}$$
for all $|||T|||_{\\infty} \\triangleq \\max_i \\sum_j |T_{ij}| = 1$; and
\u2022 $\\mathcal{F}_2$ satisfies the label invariance condition: there exists a class $\\mathcal{F}'_2$ of functions over only $x$, such that
$$\\mathcal{F}_2 = \\big\\{ \\alpha\\, g(x, y) \\mid g(x, y) = f(x), \\text{ for any } f(x) \\in \\mathcal{F}'_2, \\text{ and } \\alpha \\in [0, 1] \\big\\}. \\tag{8}$$

We discuss the necessity and practical implications of this assumption in Section 3.2, and give examples satisfying these assumptions in Remark 1 and Appendix C. Notice that a trivial class with a single constant zero function satisfies both the inclusion and label invariance conditions. For example, we can choose $c = 0$ and also choose to set either $\\mathcal{F}_1 = \\{f(x, y) = 0\\}$ or $\\mathcal{F}_2 = \\{f(x, y) = 0\\}$, in which case $\\mathcal{F}$ only needs to satisfy either one of the conditions in Assumption 1. The flexibility that we gain by allowing the set addition $\\mathcal{F}_1 + \\mathcal{F}_2$ is critical in applying these conditions to practical discriminators, especially in proving Remark 1. Note that in the inclusion condition in Eq. 7, we require the condition to hold for the whole max-norm bounded set: $\\{T : \\max_i \\sum_j |T_{ij}| = 1\\}$. The reason a weaker condition over all row-stochastic matrices, $\\{T : \\sum_j T_{ij} = 1\\}$, does not suffice is that in order to prove the upper bound in Eq. 9, we need to apply the invariance condition to $|||C^{-1}|||_{\\infty}^{-1} C^{-1} \\circ \\mathcal{F}$. This matrix $|||C^{-1}|||_{\\infty}^{-1} C^{-1}$ is not row-stochastic, but still max-norm bounded.

We first show that Assumption 1 is sufficient for approximability of the neural network distance from corrupted samples. For two distributions $P_{X,Y}$ and $Q_{X,Y}$ on $\\mathcal{X} \\times [m]$, let $\\tilde{P}_{X,\\tilde{Y}}$ and $\\tilde{Q}_{X,\\tilde{Y}}$ be the corresponding corrupted distributions respectively, where the label $Y$ is passed through the noisy channel defined by the confusion matrix $C \\in \\mathbb{R}^{m \\times m}$, i.e. $\\tilde{P}(x, \\tilde{y}) = \\sum_y P(x, y)\\, C_{y,\\tilde{y}}$.

Theorem 2. If a class of functions $\\mathcal{F}$ satisfies Assumption 1, then

$$d_{\\mathcal{F}}(\\tilde{P}, \\tilde{Q}) \\le d_{\\mathcal{F}}(P, Q) \\le |||C^{-1}|||_{\\infty}\\, d_{\\mathcal{F}}(\\tilde{P}, \\tilde{Q}), \\tag{9}$$

where we follow the convention that $|||C^{-1}|||_{\\infty} = \\infty$ if $C$ is not full rank.

We refer to Appendix F for a proof. This gives a sharp characterization of how two distances are related: the one we can minimize in training RCGAN (i.e. $d_{\\mathcal{F}}(\\tilde{P}, \\tilde{Q})$) and the true measure of closeness (i.e. $d_{\\mathcal{F}}(P, Q)$). Although the latter cannot be directly evaluated or minimized, RCGAN is approximately minimizing the true neural network distance $d_{\\mathcal{F}}(P, Q)$ as desired. The lower bound proves a special case of the data-processing inequality. Two random variables from $P$ and $Q$ get closer in neural network distance when passed through a stochastic transformation. The upper bound puts a limit on how much closer $\\tilde{P}$ and $\\tilde{Q}$ can get, depending on the noise level. This fundamental trade-off is captured by $|||C^{-1}|||_{\\infty}$. Under the noiseless case where $C$ is the identity matrix, we have $|||C^{-1}|||_{\\infty} = 1$ and we recover a trivial fact that the two distances are equal. On the other extreme, if $C$ is rank deficient, we use the convention that $|||C^{-1}|||_{\\infty} = \\infty$ and the two distances can be arbitrarily different. The approximation factor of $|||C^{-1}|||_{\\infty}$ captures how much the space $\\mathcal{F}$ can shrink by the noise $C$. This coincides with Theorem 1, where a similar trade-off was identified for the TV distance. In Remark 3 in Appendix D, we show that these bounds cannot be tightened for general $P$, $Q$, and $\\mathcal{F}$.

Theorem 2 shows that (i) RCGAN can learn the true conditional distribution, justifying its use; and (ii) the performance of RCGAN is determined by how noisy the samples are via $|||C^{-1}|||_{\\infty}$. There are still two loose ends. First, does a practical implementation of the RCGAN architecture satisfy the inclusion and/or label invariance assumptions? Secondly, in practice we cannot minimize $d_{\\mathcal{F}}(\\tilde{P}, \\tilde{Q})$ as we only have a finite number of samples. How much do we lose in this finite sample regime? We give precise answers to each question in the following two sections.

3.2 Inclusion and label invariance assumptions

For RCGAN, we propose a popular state-of-the-art discriminator for conditional GANs known as the projection discriminator [32], parametrized by $V \\in \\mathbb{R}^{m \\times d_V}$, $v \\in \\mathbb{R}^{d_v}$, and $\\theta \\in \\mathbb{R}^{d_\\theta}$:

$$D_{V,v,\\theta}(x, y) = \\mathrm{vec}(y)^T V \\psi(x; \\theta) + v^T \\psi'(x; \\theta), \\tag{10}$$

where $\\psi(x; \\theta) \\in \\mathbb{R}^{d_V}$ and $\\psi'(x; \\theta) \\in \\mathbb{R}^{d_v}$ are vector valued parametric functions for some integers $d_V$, $d_v$, and $\\mathrm{vec}(y)^T = [\\mathbb{I}_{y=1}, \\ldots, \\mathbb{I}_{y=m}]$. The first term satisfies the inclusion condition, as any operation with $T$ can be absorbed into $V$. The second term is label invariant as it does not depend on $y$. This is made precise in the following remark, whose proof is provided in Appendix G. Together with this remark, the approximability result in Theorem 2 justifies the use of projection discriminators in RCGAN, which we use in all our experiments.

Remark 1. The class of projection discriminators $\\{D_{V,v,\\theta}(x, y)\\}_{V \\in \\mathcal{V}_1, v \\in \\mathcal{V}_2, \\theta \\in \\Theta}$ defined in Eq. 10 satisfies Assumption 1 for any $\\psi$, $\\psi'$, and $\\Theta$, if $\\mathcal{V}_1 = \\{V \\in \\mathbb{R}^{m \\times d_V} \\mid \\max_i |V_{ij}| \\le 1 \\text{ for all } j \\in [d_V]\\}$, and $\\mathcal{V}_2 = \\{v \\in \\mathbb{R}^{d_v} \\mid \\|v\\| \\le 1\\}$.

Other choices of $\\mathcal{V}_1$ and $\\mathcal{V}_2$ are also possible. For example, $\\mathcal{V}'_1 = \\{V \\in \\mathbb{R}^{m \\times d_V} \\mid \\sum_j \\max_i |V_{ij}| \\le 1\\}$ or $\\mathcal{V}''_1 = \\{V \\in \\mathbb{R}^{m \\times d_V} \\mid |||V|||_{\\infty} = \\max_i \\sum_j |V_{ij}| \\le 1\\}$ are also sufficient. We find the proposed choice of $\\mathcal{V}_1$ easy to implement, as a column-wise $L_{\\infty}$-norm normalization via projected gradient descent. We describe implementation details in Appendix L. In Appendix E, we show that Assumption 1 is also necessary.

3.3 Finite sample analysis

In practice, we do not have access to the probability distributions $\\tilde{P}$ and $\\tilde{Q}$. Instead, we observe a set of samples of a finite size $n$ from each of them. In training a GAN, we minimize the empirical neural network distance, $d_{\\mathcal{F}}(\\tilde{P}_n, \\tilde{Q}_n)$, where $\\tilde{P}_n$ and $\\tilde{Q}_n$ denote the empirical distributions of $n$ samples. Inspired by the recent generalization results in [3], we show that this empirical distance minimization leads to small $d_{\\mathcal{F}}(P, Q)$ up to an additive error that vanishes with an increasing sample size. As shown in [3], Lipschitz and bounded function classes are critical in achieving sample efficiency for GANs. We follow the same approach over a similar function class. Let

$$\\mathcal{F}_{p,L} = \\{D_u(x, y) \\in [0, 1] \\mid D_u(x, y) \\text{ is } L\\text{-Lipschitz in } u \\text{ and } u \\in \\mathcal{U} \\subseteq \\mathbb{R}^p\\}, \\tag{11}$$

be a class of bounded functions with parameter $u \\in \\mathbb{R}^p$. We say that $\\mathcal{F}$ is $L$-Lipschitz in $u$ if

$$|D_{u_1}(x, y) - D_{u_2}(x, y)| \\le L \\|u_1 - u_2\\|, \\quad \\forall u_1, u_2 \\in \\mathcal{U},\\ x \\in \\mathcal{X},\\ y \\in [m]. \\tag{12}$$

Theorem 3. For any class $\\mathcal{F}_{p,L}$ of bounded Lipschitz functions $D_u(x, y)$ satisfying Assumption 1, there exists a universal constant $c > 0$ such that

$$d_{\\mathcal{F}_{p,L}}(\\tilde{P}_n, \\tilde{Q}_n) - \\epsilon \\le d_{\\mathcal{F}_{p,L}}(P, Q) \\le |||C^{-1}|||_{\\infty} \\big( d_{\\mathcal{F}_{p,L}}(\\tilde{P}_n, \\tilde{Q}_n) + \\epsilon \\big), \\tag{13}$$

with probability at least $1 - e^{-p}$ for any $\\epsilon > 0$ and $n$ large enough, $n \\ge (c p / \\epsilon^2) \\log(p L / \\epsilon)$.

We refer to Appendix I for a proof. This justifies the proposed RCGAN which minimizes $d_{\\mathcal{F}}(\\tilde{P}_n, \\tilde{Q}_n)$, as it leads to the generator $Q$ being close to the real distribution $P$ in neural network distance, $d_{\\mathcal{F}}(P, Q)$. These bounds inherit the approximability of the population version from Theorem 2.

4 Our second architecture: RCGAN-U

In many real world scenarios the confusion matrix $C$ is unknown. We propose the RCGAN-Unknown (RCGAN-U) algorithm, which jointly estimates the real distribution $P$ and the noise model $C$. The pre-processing and D-steps of RCGAN-U are the same as those of RCGAN, assuming the current guess $M$ of the confusion matrix. As the G-step in (2) is not differentiable in $C$, we use the following reparameterized estimator of the loss, motivated by a similar technique in training classifiers from noisy labels:

$$\\min_{G \\in \\mathcal{G},\\, M \\in \\mathcal{C}} \\mathbb{E}_{z \\sim N,\\, y \\sim P_Y}\\big[\\phi_M(G(z; y), y, D) + \\lambda\\, \\ell(h^*(G(z; y)), y)\\big]$$

where $\\mathcal{C}$ is the set of all transition matrices and $\\phi_M(x, y, D) = \\sum_{\\tilde{y} \\in [m]} M_{y\\tilde{y}}\\, \\phi(1 - D(x, \\tilde{y}))$.

5 Experiments

Implementation details are explained in Appendix L. We consider one-coin based models, which are parameterized by their label accuracy probability $\\alpha$. In this model a sample with true label $y$ is flipped uniformly at random to a label $\\tilde{y}$ in $[m] \\setminus \\{y\\}$ with probability $1 - \\alpha$. The entries of its confusion matrix $C$ will then be $C_{ii} = \\alpha$ and $C_{ij} = (1 - \\alpha)/(m - 1)$ for $i \\neq j$, where $m$ is the number of classes. We call this model the uniform flipping model. Code to reproduce our experiments is available at https://github.com/POLane16/Robust-Conditional-GAN.

Baselines. The first is the biased GAN, which is a conditional GAN applied directly on the noisy data. The loss is hence biased, and the true conditional distribution is not the optimal solution of this biased loss. The next natural baseline is using a de-biased classifier as the discriminator, motivated by the approach of [34] on learning classifiers from noisy labels. The main insight is to modify the loss function according to $C$, such that in expectation the loss matches that of the clean data. We refer to this approach as the unbiased GAN. Concretely, when training the discriminator, we propose the following (modified) de-biased loss:

$$\\max_{D \\in \\mathcal{F}} \\mathbb{E}_{(x,\\tilde{y}) \\sim \\tilde{P}_{X,\\tilde{Y}}}\\Big[\\sum_{y \\in [m]} (C^{-1})_{\\tilde{y} y}\\, \\phi(D(x, y))\\Big] + \\mathbb{E}_{z \\sim N,\\, y \\sim P_Y}\\big[\\phi(1 - D(G(z; y), y))\\big]. \\tag{14}$$

This is unbiased, as the first term is equivalent to $\\mathbb{E}_{(x,y) \\sim P_{X,Y}}[\\phi(D(x, y))]$, which is the standard GAN loss with clean samples. However, such de-biasing is sensitive to the condition number of $C$, and can become numerically unstable for noisy channels as $C^{-1}$ has large entries [20]. For both datasets, we use linear classifiers for the permutation regularizer of the RCGAN-U architecture.

[Figure 2 plots. Left panel: generator label accuracy versus noise in the real data ($1 - \\alpha$) for RCGAN+y, RCGAN, RCGAN-U, Un-biased, and Biased. Right panel: label recovery accuracy versus noise in the real data ($1 - \\alpha$) for RCGAN+y, RCGAN, RCGAN-U, Un-biased, Biased, and the Un-biased classifier.]

Figure 2: Noisy MNIST dataset: Our RCGAN models consistently improve upon all competing baseline approaches in generator label accuracy (left). The trend continues in label recovery accuracy (right), where our proposed RCGAN-classifiers improve upon the unbiased classifier [34], which is one of the state-of-the-art approaches tailored for label recovery.

5.1 MNIST

We train five architectures on the MNIST dataset corrupted by the uniform flipping noise: RCGAN+y, RCGAN, RCGAN-U, unbiased GAN, and biased GAN. The RCGAN+y architecture is the same as RCGAN, but the input to the first layer of its discriminator is concatenated with a one-hot representation of the label. We discuss our techniques to overcome the challenges involved in training RCGAN+y in Appendix L.

Conditional generators can be used to generate samples $x$ from a particular class $y$, among the classes they learned. We can then use a pre-trained classifier $f$ to compare $y$ to the true class of the sample, $f(x)$ (as perceived by the classifier $f$). We compare the generator label accuracy, defined as $\\mathbb{E}_{y \\sim P_Y,\\, z \\sim N}[\\mathbb{I}\\{y = f(G(z, y))\\}]$, in Figure 2, left panel. We generated 10k labels chosen uniformly at random and corresponding conditional samples from the generators, and calculated the generator label accuracy using a CNN classifier pre-trained on the clean MNIST data to an accuracy of 99.2%. The proposed RCGAN significantly improves upon the competing baselines, and achieves almost perfect label accuracy until a high noise of $\\alpha = 0.3$. RCGAN+y further improves upon RCGAN and gains very high accuracy even at $\\alpha = 0.125$. The high accuracy of RCGAN-U suggests that robust training is possible without prior knowledge of the confusion matrix $C$. As expected, biased GAN has an accuracy of approximately $1 - \\alpha$.

An immediate application of robust GANs is recovering the true labels of the noisy training data, which is an important and challenging problem in crowdsourcing. We propose a new meta-algorithm, which we call cGAN-label-recovery, which uses any conditional generator $G(z, y)$ trained on the noisy samples to estimate the true label $\\hat{y}$ of a sample $x$ using the following optimization.

$$\\hat{y} \\in \\arg\\min_{y \\in [m]} \\Big\\{ \\min_{z_y} \\|G(z_y, y) - x\\|_2^2 \\Big\\}. \\tag{15}$$

In the right panel of Figure 2 we compare the label recovery accuracy of the meta-algorithm using the five conditional GANs, on 500 randomly chosen noisy training samples. This is also compared to a state-of-the-art method [34] for label recovery, which proposed minimizing an unbiased loss function given the noisy labels and the confusion matrix. This unbiased classifier was shown to outperform the robust classifiers [29, 45, 46] and can be used to predict the true label of the training examples. In Figure 5 of Appendix M, we show example images from all the generators.

5.2 CIFAR-10

In Figure 3, we show the inception score [40] and the label accuracy of the conditional generator for the four approaches: our proposed RCGAN and RCGAN-U, against the baselines Unbiased (Section 5) and Biased (Section 1) GANs trained using CIFAR-10 images [24], while varying the label accuracy of the real data under the uniform flipping model. In RCGAN-U, even with the regularizer, the learned confusion matrix was a permuted version of the true $C$, possibly because a linear classifier might be too simple to classify CIFAR images. To combat this, we initialized the confusion matrix $M$ to be diagonally dominant (Appendix L).

[Figure 3 plots. Left panel: Inception score versus noise in the real data ($1 - \\alpha$) for RCGAN-U, RCGAN, Un-biased, and Biased. Right panel: Generator label accuracy versus noise in the real data ($1 - \\alpha$) for the same four models.]

Figure 3: Noisy CIFAR-10 dataset: Our RCGAN (red) and RCGAN-U (blue) consistently improve upon Unbiased (magenta) and Biased (black) GANs trained on noisy CIFAR-10 in inception scores (left) and in generator label accuracy (right).

In the left panel of Figure 3, our RCGAN and RCGAN-U consistently achieve higher inception scores than the other two approaches. The Unbiased GAN is highly unstable and hence produces garbage images for large noise (Fig. 6), possibly due to numerical instability of $|||C^{-1}|||_{\\infty}$, as noted in [20]. This confirms that robust GANs not only produce images from the correct class, but also produce better quality images. In the right panel of Figure 3, we report the generator label accuracy (Section 5.1) on 1k samples generated by each GAN. We classify the generator images using a ResNet-110 model (footnote 1) trained to an accuracy of 92.3% on the noiseless CIFAR-10 dataset. Biased GAN has significantly lower label accuracy whereas the Unbiased GAN has a low inception score. In Figure 6 in Appendix M, we show example images from the three generators for the different flipping probabilities. We believe that the gain in using the proposed robust GANs will be larger when we train to higher accuracy with larger networks and extensive hyperparameter tuning, with the latest innovations in GAN architectures, for example [54, 28, 17, 19, 41].

6 Conclusion

Standard conditional GANs can be sensitive to noise in the labels of the training data. We propose two new architectures to make them robust, one requiring the knowledge of the distribution of the noise and another which does not, and demonstrate the robustness on benchmark datasets of CIFAR-10 and MNIST. We further showcase how the learned generator can be used to recover the corrupted labels in the training data, which can potentially be used in practical applications. The proposed architecture combines the noise adding idea of AmbientGAN [10], the projection discriminator of [32], and regularizers similar to those in InfoGAN [11]. Inspired by AmbientGAN [10], the main idea is to pair the generator output image with a label that is passed through a noisy channel, before feeding to the discriminator. We justify this idea of noise adding by identifying a certain class of discriminators that have good generalization properties. In particular, we prove that the projection discriminator, introduced in [32], has a good generalization property. We showcase that the proposed architecture, when trained with a regularizer, has superior robustness on benchmark datasets.

Acknowledgement

This work is supported by NSF awards CNS-1527754, CCF-1553452, CCF-1705007, RI-1815535 and Google Faculty Research Award. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number OCI-1053575. Specifically, it used the Bridges system, which is supported by
NSF award number ACI-1445606, at the Pittsburgh Supercomputing Center (PSC). This work is partially supported by the generous research credits on AWS cloud computing resources from Amazon.
Footnote 1: https://github.com/wenxinxu/resnet-in-tensorflow
References
[1] Panos Achlioptas, Olga Diamanti, Ioannis Mitliagkas, and Leonidas Guibas. Representation learning and adversarial generation of 3D point clouds. arXiv preprint arXiv:1707.02392, 2017.
[2] Martin Arjovsky, Soumith Chintala, and L\u00e9on Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
[3] Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, and Yi Zhang. Generalization and equilibrium in generative adversarial nets (GANs). arXiv preprint arXiv:1703.00573, 2017.
[4] Yu Bai, Tengyu Ma, and Andrej Risteski. Approximability of discriminators implies diversity in GANs. arXiv preprint arXiv:1806.10586, 2018.
[5] David Berthelot, Tom Schumm, and Luke Metz. BEGAN: Boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1703.10717, 2017.
[6] G Biau, B Cadre, M Sangnier, and U Tanielian. Some theoretical properties of GANs. arXiv preprint arXiv:1803.07819, 2018.
[7] Battista Biggio, Blaine Nelson, and Pavel Laskov. Support vector machines under adversarial label noise. In Asian Conference on Machine Learning, pages 97\u2013112, 2011.
[8] Miko\u0142aj Bi\u0144kowski, Dougal J Sutherland, Michael Arbel, and Arthur Gretton. Demystifying MMD GANs. arXiv preprint arXiv:1801.01401, 2018.
[9] Ashish Bora, Ajil Jalal, Eric Price, and Alexandros G Dimakis. Compressed sensing using generative models. arXiv preprint arXiv:1703.03208, 2017.
[10] Ashish Bora, Eric Price, and Alexandros G Dimakis. AmbientGAN: Generative models from lossy measurements. In International Conference on Learning Representations (ICLR), 2018.
[11] Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2172\u20132180, 2016.
[12] Alexander Philip Dawid and Allan M Skene. Maximum likelihood estimation of observer error-rates using the EM algorithm. Applied Statistics, pages 20\u201328, 1979.
[13] Emily L Denton, Soumith Chintala, Rob Fergus, et al. Deep generative image models using a Laplacian pyramid of adversarial networks. In Advances in Neural Information Processing Systems, pages 1486\u20131494, 2015.
[14] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672\u20132680, 2014.
[15] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville. Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems, pages 5769\u20135779, 2017.
[16] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. Image-to-image translation with conditional adversarial networks. arXiv preprint, 2017.
[17] Alexia Jolicoeur-Martineau. The relativistic discriminator: a key element missing from standard GAN. arXiv preprint arXiv:1807.00734, 2018.
[18] David R Karger, Sewoong Oh, and Devavrat Shah. Iterative learning for reliable crowdsourcing systems. In Advances in Neural Information Processing Systems, pages 1953\u20131961, 2011.
[19] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
[20] Ashish Khetan, Zachary C Lipton, and Anima Anandkumar. Learning from noisy singly-labeled data. arXiv preprint arXiv:1712.04577, 2017.
[21] Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jungkwon Lee, and Jiwon Kim. Learning to discover cross-domain relations with generative adversarial networks. arXiv preprint arXiv:1703.05192, 2017.
[22] Diederik P Kingma and Max Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
[23] Murat Kocaoglu, Christopher Snyder, Alexandros G Dimakis, and Sriram Vishwanath. CausalGAN: Learning causal implicit generative models with adversarial training. arXiv preprint arXiv:1709.02023, 2017.
[24] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
[25] Yann LeCun. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/, 1998.
[26] Christian Ledig, Lucas Theis, Ferenc Husz\u00e1r, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint, 2016.
[27] Xiaodan Liang, Lisa Lee, Wei Dai, and Eric P Xing. Dual motion GAN for future-flow embedded video prediction. arXiv preprint, 2017.
[28] Zinan Lin, Ashish Khetan, Giulia Fanti, and Sewoong Oh. PacGAN: The power of two samples in generative adversarial networks. arXiv preprint arXiv:1712.04086, 2017.
[29] Bing Liu, Yang Dai, Xiaoli Li, Wee Sun Lee, and Philip S Yu. Building text classifiers using positive and unlabeled examples. In Data Mining, 2003. ICDM 2003. Third IEEE International Conference on, pages 179\u2013186. IEEE, 2003.
[30] Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
[31] Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957, 2018.
[32] Takeru Miyato and Masanori Koyama. cGANs with projection discriminator. arXiv preprint arXiv:1802.05637, 2018.
[33] Sudipto Mukherjee, Himanshu Asnani, Eugene Lin, and Sreeram Kannan. ClusterGAN: Latent space clustering in generative adversarial networks. arXiv preprint arXiv:1809.03627, 2018.
[34] Nagarajan Natarajan, Inderjit S Dhillon, Pradeep K Ravikumar, and Ambuj Tewari. Learning with noisy labels. In Advances in Neural Information Processing Systems, pages 1196\u20131204, 2013.
[35] Anh Nguyen, Jason Yosinski, Yoshua Bengio, Alexey Dosovitskiy, and Jeff Clune. Plug & play generative networks: Conditional iterative generation of images in latent space. arXiv preprint arXiv:1612.00005, 2016.
[36] Augustus Odena, Christopher Olah, and Jonathon Shlens. Conditional image synthesis with auxiliary classifier GANs. arXiv preprint arXiv:1610.09585, 2016.
[37] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
[38] Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. Generative adversarial text to image synthesis. arXiv preprint arXiv:1605.05396, 2016.
[39] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211\u2013252, 2015.
[40] Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training GANs. In Advances in Neural Information Processing Systems, pages 2234\u20132242, 2016.
[41] Maziar Sanjabi, Jimmy Ba, Meisam Razaviyayn, and Jason D Lee. Solving approximate Wasserstein GANs to stationarity. arXiv preprint arXiv:1802.08249, 2018.
[42] Rajat Sen, Karthikeyan Shanmugam, Himanshu Asnani, Arman Rahimzamani, and Sreeram Kannan. Mimic and classify: A meta-algorithm for conditional independence testing. arXiv preprint arXiv:1806.09708, 2018.
[43] Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Josh Susskind, Wenda Wang, and Russ Webb. Learning from simulated and unsupervised images through adversarial training. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 3, page 6, 2017.
[44] Bharath K Sriperumbudur, Kenji Fukumizu, Arthur Gretton, Bernhard Sch\u00f6lkopf, and Gert RG Lanckriet. On integral probability metrics, \u03c6-divergences and binary classification. arXiv preprint arXiv:0901.2698, 2009.
[45] Guillaume Stempfel and Liva Ralaivola. Learning kernel perceptrons on noisy data using random projections. In International Conference on Algorithmic Learning Theory, pages 328\u2013342. Springer, 2007.
[46] Guillaume Stempfel and Liva Ralaivola. Learning SVMs from sloppily labeled data. In International Conference on Artificial Neural Networks, pages 884\u2013893. Springer, 2009.
[47] Sainbayar Sukhbaatar, Joan Bruna, Manohar Paluri, Lubomir Bourdev, and Rob Fergus. Training convolutional networks with noisy labels. arXiv preprint arXiv:1406.2080, 2014.
[48] Dougal J Sutherland, Hsiao-Yu Tung, Heiko Strathmann, Soumyajit De, Aaditya Ramdas, Alex Smola, and Arthur Gretton. Generative models and model criticism via optimized maximum mean discrepancy. arXiv preprint arXiv:1611.04488, 2016.
[49] Aaron van den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu. Pixel recurrent neural networks. arXiv preprint arXiv:1601.06759, 2016.
[50] Carl Vondrick, Hamed Pirsiavash, and Antonio Torralba. Generating videos with scene dynamics. In Advances in Neural Information Processing Systems, pages 613\u2013621, 2016.
[51] Jiajun Wu, Chengkai Zhang, Tianfan Xue, Bill Freeman, and Josh Tenenbaum. Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In Advances in Neural Information Processing Systems, pages 82\u201390, 2016.
[52] Zhi Xu, Chengtao Li, and Stefanie Jegelka. Robust GANs against dishonest adversaries. arXiv preprint arXiv:1802.09700, 2018.
[53] Raymond Yeh, Chen Chen, Teck Yian Lim, Mark Hasegawa-Johnson, and Minh N Do. Semantic image inpainting with perceptual and contextual losses. arXiv preprint arXiv:1607.07539, 2016.
[54] Han Zhang, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. Self-attention generative adversarial networks. arXiv preprint arXiv:1805.08318, 2018.
[55] Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaolei Huang, Xiaogang Wang, and Dimitris Metaxas. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks. In IEEE Int. Conf. Comput. Vision (ICCV), pages 5907\u20135915, 2017.
[56] Pengchuan Zhang, Qiang Liu, Dengyong Zhou, Tao Xu, and Xiaodong He. On the discrimination-generalization tradeoff in GANs. arXiv preprint arXiv:1711.02771, 2017.
[57] Jun-Yan Zhu, Philipp Kr\u00e4henb\u00fchl, Eli Shechtman, and Alexei A Efros. Generative visual manipulation on the natural image manifold. In European Conference on Computer Vision, pages 597\u2013613. Springer, 2016.
[58] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv preprint arXiv:1703.10593, 2017.", "award": [], "sourceid": 6588, "authors": [{"given_name": "Kiran", "family_name": "Thekumparampil", "institution": "Univ. of Illinois at Urbana-Champaign"}, {"given_name": "Ashish", "family_name": "Khetan", "institution": "Amazon AI Labs"}, {"given_name": "Zinan", "family_name": "Lin", "institution": "Carnegie Mellon University"}, {"given_name": "Sewoong", "family_name": "Oh", "institution": "University of Washington"}]}