{"title": "A Robust Non-Clairvoyant Dynamic Mechanism for Contextual Auctions", "book": "Advances in Neural Information Processing Systems", "page_first": 8657, "page_last": 8667, "abstract": "Dynamic mechanisms offer powerful techniques to improve on both revenue and efficiency by linking sequential auctions using state information, but these techniques rely on exact distributional information of the buyers\u2019 valuations (present and future), which limits their use in learning settings. In this paper, we consider the problem of contextual auctions where the seller gradually learns a model of the buyer's valuation as a function of the context (e.g., item features) and seeks a pricing policy that optimizes revenue. Building on the concept of a bank account mechanism---a special class of dynamic mechanisms that is known to be revenue-optimal---we develop a non-clairvoyant dynamic mechanism that is robust to both estimation errors in the buyer's value distribution and strategic behavior on the part of the buyer. We then tailor its structure to achieve a policy with provably low regret against a constant approximation of the optimal dynamic mechanism in contextual auctions. Our result substantially improves on previous results that only provide revenue guarantees against static benchmarks.", "full_text": "A Robust Non-Clairvoyant Dynamic Mechanism for\n\nContextual Auctions\n\nYuan Deng\n\nericdy@cs.duke.edu\n\nDuke University\n\nDurham, NC\n\nS\u00e9bastien Lahaie\nGoogle Research\nNew York City, NY\n\nslahaie@google.com\n\nVahab Mirrokni\nGoogle Research\nNew York City, NY\n\nmirrokni@google.com\n\nAbstract\n\nDynamic mechanisms offer powerful techniques to improve on both revenue and\nef\ufb01ciency by linking sequential auctions using state information, but these tech-\nniques rely on exact distributional information of the buyers\u2019 valuations (present\nand future), which limits their use in learning settings. 
In this paper, we consider\nthe problem of contextual auctions where the seller gradually learns a model of\nthe buyer\u2019s valuation as a function of the context (e.g., item features) and seeks a\npricing policy that optimizes revenue. Building on the concept of a bank account\nmechanism\u2014a special class of dynamic mechanisms that is known to be revenue-\noptimal\u2014we develop a non-clairvoyant dynamic mechanism that is robust to both\nestimation errors in the buyer\u2019s value distribution and strategic behavior on the\npart of the buyer. We then tailor its structure to achieve a policy with provably\nlow regret against a constant approximation of the optimal dynamic mechanism in\ncontextual auctions. Our result substantially improves on previous results that only\nprovide revenue guarantees against static benchmarks.\n\n1\n\nIntroduction\n\nAs a fundamental problem in mechanism design, pricing in repeated auctions has been extensively\nstudied in recent years. This is partly motivated by the popularity of selling online ads via auctions,\nan industry totalling annual revenue of hundreds of billions of dollars. Repeated auctions open up\nthe possibility of linking auctions across time using state information in order to enhance revenue\nor welfare, but this introduces several challenges. To guarantee optimal outcomes, the process must\ntake into account the bidders\u2019 incentives to possibly manipulate each individual auction as well as the\nauction state across time. In practice, the seller must also rely on approximate models of the buyers\u2019\npreferences to effectively set auction parameters like reserve prices. 
These aspects of the problem\nhave so far been explored in two separate strands of the literature on repeated auctions, where items\narrive online and the allocation and payment decisions must be made as soon as an item arrives.\nOne strand, known as dynamic mechanism design, considers an environment in which the seller has\nexact distributional information over the buyers\u2019 values for the items, for the current stage and all\nfuture stages, and designs revenue-maximizing dynamic mechanisms that adapt the auction state\nbased on the buyer\u2019s historical bids [Thomas and Worrall, 1990, Bergemann and V\u00e4lim\u00e4ki, 2010,\nAshlagi et al., 2016, Mirrokni et al., 2016a,b]. However, this clairvoyant framework relies on the\nseller having an accurate forecast of the buyer\u2019s valuation distributions in future auctions. To address\nthis concern, Mirrokni et al. [2018] propose non-clairvoyant dynamic mechanisms, which do not rely\non any information about the future (but do rely on an accurate forecast of the present). They show\nthat a non-clairvoyant dynamic mechanism can achieve a constant approximation to the revenue of\nthe optimal clairvoyant mechanism. The other strand of literature, known as robust price learning,\nfocuses on a setting where the buyer\u2019s value distributions across stages are parameterized by some\ncommon private factors that are unknown to the seller, and designs robust policies to learn from the\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\n\fbuyer\u2019s bids and set prices with good revenue performance [Amin et al., 2013, 2014, Medina and\nMohri, 2014, Golrezaei et al., 2018]. 
Although these results also take into account strategic buyer\nbehavior, they only provide guarantees against the revenue-optimal static benchmark, which does not\ntake advantage of auction state across time and whose revenue can be arbitrarily smaller than the\noptimal dynamic benchmark [Papadimitriou et al., 2016].\nIn this work, we consider a scenario in which the designer can only make use of an estimate of the\nbuyer\u2019s value distribution in the present auction stage, which connects dynamic mechanism design\nwith the problem of learning. Designing dynamic auctions in this setting is challenging for several\nreasons. When the seller\u2019s estimate of the distribution is not perfectly aligned with the buyer\u2019s true\ndistribution, it is impossible for the seller to offer a dynamic mechanism that is exactly incentive-\ncompatible and also makes use of the prior on values. Furthermore, unlike static mechanisms in\nwhich the auction for each item is independent of the buyer\u2019s past reports, in a dynamic mechanism a\nbuyer\u2019s misreport can potentially affect auctions for all future items. We overcome these obstacles and\nprovide a robust non-clairvoyant dynamic mechanism such that the extent of the buyers\u2019 misreports\nand the revenue loss can be related to and bounded by the estimation error. We then apply our\nrobust dynamic mechanism to the concrete problem of contextual auctions, where a buyer\u2019s valuation\ndepends on the context that describes the item, but the relationship between the buyer\u2019s valuation\nand the context is unknown to the seller and must be estimated across auctions. The seller\u2019s task is\nto design a policy which adapts the auction mechanism based on the buyer\u2019s historical bids, with\nthe objective of maximizing revenue. 
Previous results give no-regret policies against the optimal\nstatic mechanism [Amin et al., 2014, Golrezaei et al., 2018], but as mentioned it is known that the\nrevenue gap between optimal static and dynamic mechanisms can be arbitrarily large [Papadimitriou\net al., 2016]. We tailor the structure of our robust non-clairvoyant dynamic mechanism to a learning\nenvironment, leading to a no-regret policy against the strong benchmark of a constant approximation\nof the optimal clairvoyant dynamic mechanism.\n\nRelated Work We brie\ufb02y discuss research in dynamic mechanism design that is closely related\nto the present work. For a comprehensive review of the literature readers are encouraged to refer\nto [Bergemann and Said, 2011]. Papadimitriou et al. [2016] provide an example demonstrating\nthat the revenue gap between optimal static and the dynamic mechanisms can be arbitrarily large,\nwhich is a key motivation for the use of dynamic mechanisms in our setting. Moreover, they show\nthat it is NP-Hard to design the optimal deterministic auctions even in an environment with a\nsingle buyer and two items only. Ashlagi et al. [2016] and Mirrokni et al. [2016b] simultaneously\nand independently provide a fully polynomial-time approximation scheme to compute the optimal\nrandomized mechanism. Our work builds upon the framework of bank account mechanisms developed\nby Mirrokni et al. [2016a,b, 2018]. Based on the bank account mechanism, Mirrokni et al. [2018]\ndesign a non-clairvoyant mechanism achieving 1/3 of the revenue of a clairvoyant mechanism.\nHowever, their mechanism relies on exact distributional information, which makes it unsuitable in\na learning environment where value distributions are estimated. Our robust dynamic mechanism\naddresses this limitation.\nOur work is closely related to dynamic pricing with learning; see [den Boer, 2015] for a recent\nsurvey. 
There has been a growing body of literature on learning in dynamic pricing in contextual\nauctions with non-strategic buyers [Cohen et al., 2016, Lobel et al., 2018, Leme and Schneider, 2018,\nMao et al., 2018]. In their models, the buyers have homogeneous valuations and are non-strategic,\nand thus, the problem can be reduced to a single-item setting where the buyer acts myopically\nwithout considering the impact on the future auction from their current bids. However, Edelman and\nOstrovsky [2007] provide empirical evidence that the buyers participating in the online advertising\nmarkets do act strategically. The study of robust price learning with strategic buyers was initiated by\nAmin et al. [2013] and Medina and Mohri [2014]. They design no-regret policies in a non-contextual\nenvironment where the buyer\u2019s valuation is fixed and the seller repeatedly interacts with a single buyer\nthrough posted price auctions, where the buyer is less patient than the seller. The regret guarantee\nis later improved to \u0398(log log T ) by Drutsa [2017, 2018]. Amin et al. [2013] show that no learning\nalgorithm can achieve sublinear revenue loss if the buyer is as patient as the seller.\nFor learning in contextual auctions, Amin et al. [2014] develop a no-regret policy in a setting without\nmarket noise. Golrezaei et al. [2018] enrich the model by incorporating market noise and design a\nno-regret policy for cases where the market noise is known exactly or adversarially selected from\na set of distributions. Liu et al. [2018] apply techniques from differential privacy to learn optimal\nreserve prices against non-myopic bidders. 
All these results are no-regret against the optimal static\nmechanism as a benchmark, whereas our policy is no-regret against a constant-factor approximation\nof the optimal dynamic mechanism which has all distributional information available in advance.\n\n2 Preliminaries\n\nIn a dynamic auction a seller (he) sells a stream of T items that arrive online, based on bids placed\nby strategic buyers. An item must be sold when it arrives. For the sake of simplicity we will focus\non the case of a single buyer (she) throughout this paper.1 At the beginning of stage t a new item\narrives and the buyer\u2019s valuation vt \u2208 [0, at] for the item is drawn independently from a distribution\nFt with density ft. The distributions are not necessarily identical across stages. We assume that ft is\ncontinuous and upper bounded by cf /at where cf is a constant. The domain bounds at are known to\nthe seller and may vary across stages to reflect the fact that item valuations may have different scales.2\nAs a special case of this framework, in a contextual auction the item at stage t is represented by an\nobservable feature vector $\\zeta_t \\in \\mathbb{R}^d$ with $\\|\\zeta_t\\|_2 \\le 1$. In line with the literature, we assume that the\nfeature vectors are drawn independently from a fixed distribution D with positive-definite covariance\nmatrix [Golrezaei et al., 2018]. The buyer\u2019s preferences are encoded by a fixed vector $\\sigma \\in \\mathbb{R}^d$ and\nthe buyer\u2019s valuation at stage t takes the form $v_t = a_t(\\langle \\sigma, \\zeta_t \\rangle + \\varepsilon_t)$, where \u03b5t is a noise term with\ncumulative distribution Mt. The distribution Mt and the feature vector \u03b6t are observed by the seller\nbut the buyer\u2019s preference vector \u03c3 remains private. We make the following technical assumption on\nthe sequence of at:\n\nAssumption 1. 
For all $t$, $\\sum_{t' \\le t} a_{t'} \\le c_a \\cdot t$ where $c_a$ is a constant.\n\nAssumption 1 limits the portion of welfare and revenue that can arise in the first t stages, for any t.\nIts purpose is to rule out situations where a large fraction of revenue comes from the initial stages,\nunder which a large revenue loss may be inevitable since it is impossible for the seller to obtain a\ngood estimate of \u03c3 from just the first few stages.\nOnce the buyer learns her valuation vt at stage t, she then submits a bid bt \u2208 [0, at] to the seller who\nthen decides whether to allocate the item (perhaps stochastically) and what payment to charge. We\nwrite $V^t$ to denote the set of all possible sequences (b1, . . . , bt) of buyer bids for the first t stages,\nand similarly we write $(\\Delta V)^t$ to denote the set of all possible independent distributions over the\nsequence of first t bids. The seller\u2019s distributional beliefs over the buyer\u2019s values across stages are\ndenoted as $\\hat{F}_{(1,T)} = (\\hat{F}_1, \\hat{F}_2, \\ldots, \\hat{F}_T)$. Throughout the paper we will use the notation $\\hat{F}_{(t',t'')}$ to\nrepresent $(\\hat{F}_{t'}, \\ldots, \\hat{F}_{t''})$, and similarly for $F_{(t',t'')}$, $v_{(t',t'')}$, and $b_{(t',t'')}$. A dynamic mechanism is\nrepresented by sequences (x1, . . . , xT ) and (p1, . . . , pT ) where xt and pt denote the allocation rule\nand the payment rule at stage t, respectively. We refer to $\\langle x_t, p_t \\rangle$ as the stage mechanism at stage t.\nNon-Clairvoyant Dynamic Mechanism. In a non-clairvoyant environment, the seller obtains an\nestimated distribution \u02c6Ft only at stage t and not before, so the mechanism at stage t can only depend\non \u02c6F(1,t). The allocation function xt maps the history of bids b(1,t) and distribution \u02c6F(1,t) to an\nallocation probability, $x_t : V^t \\times (\\Delta V)^t \\to [0, 1]$. 
The payment function pt maps the history of bids\nb(1,t) and the distribution \u02c6F(1,t) to a real-valued payment, $p_t : V^t \\times (\\Delta V)^t \\to \\mathbb{R}$. In line with the\nliterature, we assume the buyer has a quasi-linear utility such that the buyer\u2019s utility from bidding bt\nat stage t is $u_t(v_t; b_{(1,t)}; \\hat{F}_{(1,t)}) = v_t \\cdot x_t(b_{(1,t)}; \\hat{F}_{(1,t)}) - p_t(b_{(1,t)}; \\hat{F}_{(1,t)})$. In the contextual auction\nsetting the seller maintains a model $\\hat{\\sigma}_t$ for the buyer\u2019s preference vector estimated from prior bidding\nbehavior, and combines it with at, \u03b6t, and noise model Mt, which can only be observed at the beginning\nof stage t and not before, to compute \u02c6Ft.\nUtility-Maximizing Buyer. We assume that the buyer knows the true distributions F(1,T ) in advance\nso that she can reason about how the mechanism will evolve over time and compute a bidding\nstrategy that maximizes her utility. Specifically, we consider a buyer who aims to maximize her\ntime-discounted utility $\\sum_{t'=t}^{T} \\gamma^{t'-t} \\cdot E[u_{t'}]$ at stage t where $\\gamma \\in [0, 1)$ is the discounting factor and the\nexpectation is taken with respect to F(1,T ). We note that it is impossible to obtain a no-regret policy\nwhen the buyer is as patient as the seller (the case of \u03b3 = 1) [Amin et al., 2013].\n\n1Our results can be extended to multi-buyer settings by using the techniques from Cai et al. [2012] and Mirrokni et al. [2018].\n2For instance, in a dynamic auction for display advertising, the value of a video ad may be orders of\nmagnitude larger than the value of a text ad.\n\nIncentive Constraints. In a dynamic environment, the buyer\u2019s best response at stage t depends on\nher strategy in the future stages. 
When the seller has perfect distributional information, the classic\nnotion of dynamic incentive-compatibility (DIC) requires that the buyer is incentivized to report\ntruthfully assuming that she plays optimally in the future [Mirrokni et al., 2018].3 When the seller\nonly has approximate distributional information this is no longer possible to achieve, so we introduce\nthe notion of \u03b7(1,T )-approximate DIC, which requires that the buyer\u2019s bid deviate from the truth by\nat most \u03b7t at stage t, assuming the buyer plays optimally in the future (note that optimally now no\nlonger means truthfully). Formally, at each stage t, there exists $\\hat{b}_t \\in [v_t - \\eta_t, v_t + \\eta_t]$ such that\n\n$$\\hat{b}_t \\in \\arg\\max_{b_t} \\; u_t(v_t; b_{(1,t)}; \\hat{F}_{(1,t)}) + \\gamma \\cdot U_t(b_{(1,t)}; F_{(1,T)}; \\hat{F}_{(1,T)}) \\qquad (\\eta_{(1,T)}\\text{-DIC})$$\n\nfor all $v_t$, $b_{(1,t-1)}$, $F_{(t+1,T)}$, and $\\hat{F}_{(t+1,T)}$, where $U_t(b_{(1,t)}; F_{(1,T)}; \\hat{F}_{(1,T)})$ is the continuation\nutility that the buyer obtains in the future: $U_T(b_{(1,T)}; F_{(1,T)}; \\hat{F}_{(1,T)}) = 0$, and for $t < T$, $U_t(b_{(1,t)}; F_{(1,T)}; \\hat{F}_{(1,T)})$ is defined as\n\n$$E_{v_{t+1} \\sim F_{t+1}} \\Big[ \\max_{b_{t+1}} u_{t+1}(v_{t+1}; b_{(1,t+1)}; \\hat{F}_{(1,t+1)}) + \\gamma \\cdot U_{t+1}(b_{(1,t+1)}; F_{(1,T)}; \\hat{F}_{(1,T)}) \\Big].$$\n\nParticipation Constraints. We assume that the buyer weighs realized past utilities equally. Therefore,\nex-post individual rationality requires that for all \u02c6F(1,T ) and for all v(1,T ),\n\n$$\\sum_{t=1}^{T} u_t(v_t; v_{(1,t)}; \\hat{F}_{(1,t)}) \\ge 0. \\qquad (\\text{ex-post IR})$$\n\nFor convenience, we will use the phrase \u201cfor F(1,T )\u201d to indicate the environment where the buyer\u2019s\ntrue distribution is F(1,T ). 
For example, when we say that a mechanism is \u03b7(1,T )-DIC for F(1,T ) we\nmean that it is \u03b7(1,T )-DIC when the buyer\u2019s true distribution is F(1,T ).\nNo-Regret Policy. Our task is to design a policy \u03c0 that includes both a learning policy for \u03c3\nand an associated dynamic mechanism policy to extract revenue. At the beginning of stage t, the\nlearning policy estimates \u02c6Ft using information a(1,t), \u03b6(1,t), M(1,t), and b(1,t\u22121), while the dynamic\nmechanism policy computes the stage mechanism $\\langle x_t, p_t \\rangle$ at stage t using \u02c6F(1,t) and b(1,t\u22121). Let\nRev(\u03c0; F(1,T )) and Rev(B; F(1,T )) be the revenue of implementing policy \u03c0 and mechanism B for\nF(1,T ), respectively. Moreover, let B\u2217(F(1,T )) denote the revenue-optimal clairvoyant dynamic\nmechanism that knows F(1,T ) in advance. The regret of policy \u03c0 against a c-approximation of the\ndynamic benchmark is defined as $\\mathrm{Regret}_{\\pi}(F_{(1,T)}) = c \\cdot \\mathrm{Rev}(B^*(F_{(1,T)}); F_{(1,T)}) - \\mathrm{Rev}(\\pi; F_{(1,T)})$.\nOur objective is to design a policy with sublinear regret.4\n\n3 Robust Non-clairvoyant Mechanism\n\nThe literature on dynamic mechanism design relies on the strong assumption that the seller has\nperfect distributional information at each stage, \u02c6F(1,T ) = F(1,T ) [Ashlagi et al., 2016, Mirrokni\net al., 2016b,a, 2018]. However, in a learning setting like that of contextual auctions, the seller\ncan only obtain a sequence of estimated distributions by estimating \u03c3. In this section, we design a\nnon-clairvoyant mechanism that is robust to misspecifications in the value distribution in the sense\nthat the buyer is incentivized to place a bid within known bounds from its value, which ultimately\nallows us to relate the mechanism revenue under the estimated and true value distributions. The\nmisspecifications handled by the mechanism are captured by the following assumption.\nAssumption 2. 
There exists a coupling between a random draw $v_t$ from $F_t$ and a random draw $\\hat{v}_t$\nfrom $\\hat{F}_t$ such that $v_t = \\hat{v}_t + a_t \\cdot \\epsilon_t$ with $\\epsilon_t \\in [-\\Delta, \\Delta]$.\n\n3Interested readers can refer to [Mirrokni et al., 2018] for discussions on the choice of DIC notions.\n4Note that sublinear revenue loss is only meaningful if the available revenue to extract is itself at least linear,\nwhich is the case when $\\sum_{t=1}^{T} a_t = \\Omega(T)$ since the revenue obtained by the optimal dynamic mechanism is\n$\\Omega(\\sum_t a_t)$ in our setting. In fact, a static mechanism can already achieve $\\Omega(\\sum_t a_t)$ revenue by offering a posted\nprice $p \\cdot a_t$ with $p = 1/(2c_f)$ at stage t which induces revenue at least $p \\cdot a_t (1 - p \\cdot c_f) = a_t/(4c_f)$ from stage t.\n\n3.1 The Mechanism\n\nBuilding on the $\\frac{1}{3}$-approximation non-clairvoyant mechanism from Mirrokni et al. [2018], we design\nour robust non-clairvoyant mechanism by mixing their mechanism with a random posted-price\nauction. The mechanism is an instance of a bank account mechanism where the state information is\ncaptured by a single scalar balt.\nMechanism 1. The robust non-clairvoyant mechanism B( \u02c6F(1,T ), \u03bb) consists of a mixture of four\nmechanisms: the give-for-free mechanism, the posted-price auction with extra fee, Myerson\u2019s\nauction, and the random posted-price auction. The stage mechanism at stage t is parameterized by a\nnon-negative balance balt. When the buyer submits a bid bt:\nGive-for-free Mechanism. Allocate the item no matter what the buyer\u2019s bid is and increase the\nbalance by the buyer\u2019s bid: $x^G_t = 1$, $p^G_t = 0$, and $\\mathrm{bal}^G_{t+1} = \\mathrm{bal}_t + b_t$.\nPosted-price Auction with Extra Fee. Let $\\mathrm{fee}_t(\\mathrm{bal}_t; \\hat{F}_t) = \\min(3\\,\\mathrm{bal}_t, E_{v_t \\sim \\hat{F}_t}[v_t])$ and let $r_t(\\mathrm{bal}_t)$ be\nthe posted price such that $E_{v_t \\sim \\hat{F}_t}[(v_t - r_t(\\mathrm{bal}_t))^+] = \\mathrm{fee}_t(\\mathrm{bal}_t; \\hat{F}_t)$. The mechanism charges the\nbuyer $\\mathrm{fee}_t(\\mathrm{bal}_t; \\hat{F}_t)$ before the buyer learns her valuation and then runs a posted-price auction with\nprice $r_t(\\mathrm{bal}_t)$: $x^P_t = 1\\{b_t \\ge r_t(\\mathrm{bal}_t)\\}$ and $p^P_t = \\mathrm{fee}_t(\\mathrm{bal}_t; \\hat{F}_t) + r_t(\\mathrm{bal}_t) \\cdot 1\\{b_t \\ge r_t(\\mathrm{bal}_t)\\}$, and\ndecrease the balance by $\\mathrm{fee}_t(\\mathrm{bal}_t; \\hat{F}_t)$: $\\mathrm{bal}^P_{t+1} = \\mathrm{bal}_t - \\mathrm{fee}_t(\\mathrm{bal}_t; \\hat{F}_t)$.\nMyerson\u2019s Auction. Let $r^*_t(\\hat{F}_t)$ be Myerson\u2019s optimal reserve price, i.e., $r^*_t(\\hat{F}_t) = \\arg\\max_r r \\cdot (1 - \\hat{F}_t(r))$, and run a posted-price auction with price $r^*_t(\\hat{F}_t)$ without changing the balance: $x^M_t = 1\\{b_t \\ge r^*_t(\\hat{F}_t)\\}$, $p^M_t = r^*_t(\\hat{F}_t) \\cdot 1\\{b_t \\ge r^*_t(\\hat{F}_t)\\}$, and $\\mathrm{bal}^M_{t+1} = \\mathrm{bal}_t$.\nRandom Posted-price Auction. Let $\\hat{r}_t$ be a random reserve price drawn from $[0, a_t]$ uniformly and\nrun a posted-price auction with price $\\hat{r}_t$ without changing the balance: $x^R_t = 1\\{b_t \\ge \\hat{r}_t\\}$, $p^R_t = \\hat{r}_t \\cdot 1\\{b_t \\ge \\hat{r}_t\\}$, and $\\mathrm{bal}^R_{t+1} = \\mathrm{bal}_t$.\nThe robust non-clairvoyant mechanism at stage t is: $x_t = \\lambda \\cdot x^R_t + \\frac{1-\\lambda}{3}[x^G_t + x^P_t + x^M_t]$, $p_t = \\lambda \\cdot p^R_t + \\frac{1-\\lambda}{3}[p^G_t + p^P_t + p^M_t]$, and $\\mathrm{bal}_{t+1} = \\lambda \\cdot \\mathrm{bal}^R_{t+1} + \\frac{1-\\lambda}{3}[\\mathrm{bal}^G_{t+1} + \\mathrm{bal}^P_{t+1} + \\mathrm{bal}^M_{t+1}]$.\n\nThe following central result gives a guarantee on the revenue performance of our robust non-\nclairvoyant mechanism against a utility-maximizing buyer subject to an estimation error \u2206.\n\nTheorem 3.1. $\\mathrm{Rev}(B(\\hat{F}_{(1,T)}, \\lambda), F_{(1,T)}) \\ge \\frac{1}{3} \\mathrm{Rev}(B^*(F_{(1,T)}), F_{(1,T)}) - O\\big(\\lambda T + \\sqrt{\\frac{\\Delta}{\\lambda}}\\, T\\big)$.\n\nAt the optimal choice of $\\lambda = \\Delta^{1/3}$ the revenue loss is $O(\\Delta^{1/3} T)$. The remainder of this section is\ndevoted to proving Theorem 3.1.\n\n3.2 Analysis\n\nWe start by describing the incentive properties that B( \u02c6F(1,T ), \u03bb) satisfies for \u02c6F(1,T ). First notice that\nall four base mechanisms are variants of posted-price auctions, and therefore, all of them are stage-IC:\n\n$$\\forall b_t, \\; v_t \\cdot x_t(\\mathrm{bal}, v_t) - p_t(\\mathrm{bal}, v_t) \\ge v_t \\cdot x_t(\\mathrm{bal}, b_t) - p_t(\\mathrm{bal}, b_t). \\qquad (\\text{stage-IC})$$\n\nIn particular, all mechanisms except the posted-price auction with extra fee are stage-IR:\n\n$$\\forall v_t, \\; v_t \\cdot x_t(v_t) - p_t(v_t) \\ge 0. \\qquad (\\text{stage-IR})$$\n\nWe emphasize that the posted-price auction with extra fee is different from a classic posted-price\nauction: the posted-price auction with extra fee will charge the buyer an extra payment $\\mathrm{fee}_t(\\mathrm{bal}_t; \\hat{F}_t)$\nno matter what the buyer\u2019s bid is, and therefore, it is not stage-IR. Moreover, each stage mechanism\nis balance-independent (BI) with respect to the estimated distribution \u02c6Ft: there exists a constant ct,\n\n$$E_{v_t \\sim \\hat{F}_t}[v_t \\cdot x_t(\\mathrm{bal}, v_t) - p_t(\\mathrm{bal}, v_t)] = c_t. \\qquad (\\text{BI})$$\n\nIn particular, the give-for-free mechanism, Myerson\u2019s auction, and the random posted-price\nauction are static and independent of the balance; as for the posted-price auction with extra fee, it\nensures that the buyer\u2019s expected utility is always 0 for all balt \u2265 0 under \u02c6Ft.\n\nThe combination of stage-IC and BI implies that the mechanism is DIC: since the mechanism\npromises the buyer that all future stage mechanisms are BI, the buyer can infer that her action at the\ncurrent stage does not impact her expected utility in the future. Moreover, notice that the non-negative\nbalance bal always lower-bounds the buyer\u2019s cumulative utility, and therefore, B( \u02c6F(1,T ), \u03bb) is ex-post\nIR under the estimated distributions \u02c6F(1,T ).\nProposition 3.1. 
B( \u02c6F(1,T ), \u03bb) is stage-IC, BI, DIC, and ex-post IR for \u02c6F(1,T ).\n\nWe next turn to the mechanism\u2019s properties under the true distributions F(1,T ).\n\n3.2.1 Mismatch between \u02c6F(1,T ) and F(1,T )\n\nWe first bound the revenue loss due to the mismatch between \u02c6F(1,T ) and F(1,T ). Observe that one can\ninterpret the estimation error under Assumption 2 as the buyer\u2019s misreport: when the buyer reports\ntruthfully under F(1,T ) this is equivalent to the case in which the buyer misreports by a magnitude at\nmost at \u00b7 \u2206 under \u02c6F(1,T ). We develop a program for computing the revenue of our mechanism even\nwhen the buyer misreports. For a non-clairvoyant mechanism B( \u02c6F(1,T ), \u03bb), we consider a program\n$\\psi_t(\\mathrm{bal}, \\hat{F}_{(1,T)}; F_{(1,T)})$ to keep track of the revenue of implementing B( \u02c6F(1,T ), \u03bb) when the buyer\u2019s\ntrue distribution is F(1,T ). We define $\\psi_T(\\mathrm{bal}) = 0$ and for $t < T$,\n\n$$\\psi_{t-1}(\\mathrm{bal}, \\hat{F}_{(1,T)}; F_{(1,T)}) = E_{v_t \\sim F_t}\\Big[ \\tfrac{1}{3} \\mathrm{fee}_t(\\mathrm{bal}; \\hat{F}_t) + \\tfrac{1}{3} r^*_t(\\hat{F}_t) \\cdot 1\\{v'_t \\ge r^*_t(\\hat{F}_t)\\} + \\psi_t\\big(\\mathrm{bal} + \\tfrac{1}{3} v'_t - \\tfrac{1}{3} \\mathrm{fee}_t(\\mathrm{bal}; \\hat{F}_t), \\hat{F}_{(1,T)}; F_{(1,T)}\\big) \\Big] \\qquad (1)$$\n\nwhere $v'_t$ is the buyer\u2019s reported bid that maximizes her continuation utility when her true value is vt.\nRecall that, conditioned on the stage mechanism not being the random posted-price auction, with\n$\\frac{1}{3}$ probability we run the posted-price auction with extra fee and extract $\\mathrm{fee}_t(\\mathrm{bal}; \\hat{F}_t)$ as revenue.\nHere, we omit the revenue $r_t(\\mathrm{bal}_t)$ obtained from the posted-price auction with extra fee. In addition,\nwith another $\\frac{1}{3}$ probability, we run Myerson\u2019s auction and extract $r^*_t(\\hat{F}_t)$ revenue if $v'_t \\ge r^*_t(\\hat{F}_t)$.\nMoreover, the balance is increased by $v'_t$ with probability $\\frac{1}{3}$ from the give-for-free mechanism and\ndecreased by $\\frac{1}{3} \\mathrm{fee}_t(\\mathrm{bal})$ with probability $\\frac{1}{3}$ from the posted-price auction.\n\nProposition 3.2. $\\mathrm{Rev}(B(\\hat{F}_{(1,T)}, \\lambda); F_{(1,T)}) \\ge (1 - \\lambda) \\cdot \\psi_0(0, \\hat{F}_{(1,T)}; F_{(1,T)})$.\n\nAccording to the revenue analysis in [Mirrokni et al., 2018], we can still obtain a $\\frac{1}{3}$-approximation of\nthe optimal revenue even when the revenue $r_t(\\mathrm{bal}_t)$ is omitted.\nLemma 3.1. [Mirrokni et al., 2018] $\\psi_0(0, F_{(1,T)}; F_{(1,T)}) \\ge \\frac{1}{3} \\cdot \\mathrm{Rev}(B^*(F_{(1,T)}), F_{(1,T)})$.\nThe following lemma establishes a connection between the change of the balance and the change of\nthe revenue, when the seller\u2019s distributional information is perfect so that the buyer does not misreport.\nIn particular, it shows that as the balance increases by \u03b4, the change of the future revenue is between 0\nand \u03b4. Therefore, it demonstrates the smoothness of the revenue curve: if the buyer misreports at\nstage t to change the balance by \u03b4, then the revenue loss is at most \u03b4 for the future stages, assuming\nthe buyer reports truthfully in the future.\nLemma 3.2. For all 0 \u2264 t \u2264 T and \u03b4 \u2265 0,\n\n$$\\psi_t(\\mathrm{bal} + \\delta, F_{(1,T)}; F_{(1,T)}) - \\delta \\le \\psi_t(\\mathrm{bal}, F_{(1,T)}; F_{(1,T)}) \\le \\psi_t(\\mathrm{bal} + \\delta, F_{(1,T)}; F_{(1,T)}).$$\n\nApplying Lemma 3.2 with Assumption 2, we can bound the revenue loss due to the mismatch between\nF(1,T ) and \u02c6F(1,T ). More precisely, we will bound the difference between \u03c80(0, F(1,T ); F(1,T )) and\n\u03c80(0, \u02c6F(1,T ); \u02c6F(1,T )). 
Notice that B(F(1,T )) is dynamic incentive-compatible for F(1,T ), and thus,\nthe buyer will not misreport, i.e., $v'_t = v_t$ in (1); similarly for \u02c6F(1,T ).\nLemma 3.3. \u03c80(0, \u02c6F(1,T ); \u02c6F(1,T )) \u2265 \u03c80(0, F(1,T ); F(1,T )) \u2212 O(\u2206T ).\n\n3.2.2 The Buyer\u2019s Misreport\n\nNote that in a single-buyer environment, the properties stage-IC and ex-post IR do not depend on\nthe underlying distributions, and therefore, B( \u02c6F(1,T ), \u03bb) is also stage-IC and ex-post IR for F(1,T ).\nHowever, B( \u02c6F(1,T ), \u03bb) is no longer BI for F(1,T ), which is the key property to ensure DIC. To\ncircumvent this difficulty, we generalize the definition of BI to approximate balance-independence.\nDefinition 3.1. A dynamic mechanism is \u03b2(1,T )-BI for F(1,T ) if \u2200t, there exists a constant ct:\n\n$$\\forall\\, \\mathrm{bal} \\ge 0, \\; E_{v_t \\sim F_t}[v_t \\cdot x_t(\\mathrm{bal}, v_t) - p_t(\\mathrm{bal}, v_t)] \\in \\Big[c_t - \\frac{\\beta_t}{2}, \\; c_t + \\frac{\\beta_t}{2}\\Big] \\qquad (\\beta_{(1,T)}\\text{-BI})$$\n\nSince with the same stage mechanism, the difference between the expected utility under \u02c6Ft and Ft is\nat most \u2206at, B( \u02c6F(1,T ), \u03bb) is \u03b2(1,T )-BI with \u03b2t = 2\u2206at.\nProposition 3.3. B( \u02c6F(1,T ), \u03bb) is stage-IC, \u03b2(1,T )-BI with \u03b2t = 2\u2206at, and ex-post IR for F(1,T ).\nFor a dynamic mechanism satisfying \u03b2(1,T )-BI for F(1,T ), the range of the buyer\u2019s expected utility\nunder truthful reporting is \u03b2t in the t-th stage. Therefore, no matter how the buyer misreports in the\nfirst (t \u2212 1) stages, her expected utility in the t-th stage can only fluctuate at most \u03b2t if she reports\ntruthfully at stage t. Combining this with the fact that the stage mechanisms are stage-IC, we have\nLemma 3.4. 
For a dynamic mechanism that is stage-IC and \u03b2(1,T )-BI for F(1,T ), for any b(1,t\u22121) and\nvt, the difference between the continuation utility of reporting any bt \u2208 [0, at] and the continuation\nutility of reporting vt truthfully is bounded by $\\sum_{t'=t+1}^{T} \\gamma^{t'-t} \\cdot \\beta_{t'}$.\n\nLemma 3.4 states that the gain of the continuation utility by misreporting is bounded and the bound\nis independent of the magnitude of the misreport. The key observation behind Lemma 3.4 is that at\nstage t, the buyer obtains the maximum utility when she reports truthfully since the stage mechanism\nis stage-IC. Therefore, by the property of \u03b2(1,T )-BI, the difference of utility between misreporting in\nan optimal way and reporting truthfully is at most \u03b2t at stage t.\nAs a result, once the mechanism imposes a risk for misreporting, we are able to bound the magnitude\nof the buyer\u2019s misreport. This is the purpose of mixing in the random posted-price mechanism at\neach stage t: it can be shown that a misreport with magnitude $m_t$ will cause the buyer a utility loss of\n$\\lambda \\cdot \\frac{m_t^2}{2a_t}$. Since the buyer is a utility-maximizer with discounting factor \u03b3, we can bound the magnitude\nof misreport for each stage:\nLemma 3.5. B( \u02c6F(1,T ), \u03bb) is \u03b7(1,T )-DIC with $\\eta_t = \\sqrt{\\frac{2a_t}{\\lambda} \\cdot \\sum_{t'=t+1}^{T} \\gamma^{t'-t} \\beta_{t'}}$.\n\nApplying Lemma 3.2, we can show that B( \u02c6F(1,T ), \u03bb) is robust against the buyer\u2019s misreport. We\nabuse notation and use \u03c80(0, \u02c6F(1,T ); F(1,T )) to track the revenue conditioned on the magnitude\nof the buyer\u2019s misreport at stage t being bounded by \u03b7t.\nLemma 3.6. 
\u03c80(0, \u02c6F(1,T ); F(1,T )) \u2265 \u03c80(0, \u02c6F(1,T ); \u02c6F(1,T )) \u2212 O(\u221a(\u2206/\u03bb) \u00b7 T).\n\nFinally, combining Proposition 3.2, Lemma 3.3 and Lemma 3.6 completes the proof of Theorem 3.1.\n\n4 No-Regret Policy in Contextual Auctions\n\n4.1 Learning Policy\n\nOur learning policy is adapted from the contextual robust pricing policy proposed in [Golrezaei et al.,\n2018]. It partitions the entire time horizon into K = \u2308log T\u2309 phases, where T is the\ntime horizon, such that the partition is specified by (\u21131 = 1, \u21132, \u00b7\u00b7\u00b7 , \u2113K, \u2113_{K+1} = T + 1), in which\n\u2113k = 2^{k\u22121}. The k-th phase spans between the \u2113k-th stage and the (\u2113_{k+1} \u2212 1)-th stage, and therefore,\nthe length of phase k is exactly \u2113k for every k < K. Note that the partition can be implemented even when T is not\nknown in advance. We use Ek = {\u2113k, \u00b7\u00b7\u00b7 , \u2113_{k+1} \u2212 1} to refer to the stages in the k-th phase.\nAt the beginning of the k-th phase, we update the estimate of the buyer\u2019s preference vector \u03c3, denoted by \u02c6\u03c3k, using\nthe buyer\u2019s bids from the (k \u2212 1)-th phase. To estimate \u02c6\u03c3k, we sample wt uniformly\nfrom [0, 1] for t \u2208 \u02c6Ek\u22121, where \u02c6Ek\u22121 = {t \u2208 Ek\u22121 | \u2113k \u2212 t > c log \u2113k} for some constant c. In\nother words, we will only use the information from the stages that are at least c log \u2113k ahead of the\nend of phase (k \u2212 1). 
\u02c6\u03c3k is set to be arg min_{\u2016\u03c3\u2016\u22641} Lk\u22121(\u03c3), where\n\nLk\u22121(\u03c3) = \u2212 \u2211_{t\u2208 \u02c6Ek\u22121} [ 1{bt \u2265 at \u00b7 wt} log(1 \u2212 Mt(wt \u2212 \u27e8\u03c3, \u03b6t\u27e9)) + 1{bt < at \u00b7 wt} log(Mt(wt \u2212 \u27e8\u03c3, \u03b6t\u27e9)) ].\n\nNote that when the buyer reports truthfully, Lk\u22121(\u03c3) is exactly the negative of the log-likelihood\ncorresponding to \u03c3. We do not change our estimation throughout the k-th phase and the next update\nhappens at the beginning of the (k + 1)-th phase. As a result, based on the estimate \u02c6\u03c3k, we compute the\nestimated distribution in phase k as \u02c6Ft(vt) = Mt(vt/at \u2212 \u27e8\u02c6\u03c3k, \u03b6t\u27e9) for all t \u2208 Ek.\nWe say a lie is a misreport from the buyer that results in 1{bt \u2265 at \u00b7 wt} \u2260 1{vt \u2265 at \u00b7 wt}. Let\n\nLk\u22121 = {t \u2208 \u02c6Ek\u22121 | 1{bt \u2265 at \u00b7 wt} \u2260 1{vt \u2265 at \u00b7 wt}}\n\nbe the set of stages in which the buyer lies. For a dynamic mechanism that is \u03b7(1,T )-DIC, we have\nvt \u2212 \u03b7t \u2264 bt \u2264 vt + \u03b7t. Hence, if |at \u00b7 wt \u2212 vt| > \u03b7t, any misreport from the buyer does not result in\na lie. Moreover, the buyer has an additional motivation to misreport to change the seller\u2019s estimation\nfor the future phases. However, for t \u2208 \u02c6Ek\u22121, such a gain is relatively small since the buyer discounts\nthe future.\nLet B( \u02c6F(1,T ), \u03bb(1,K)) be a mechanism generalized from B( \u02c6F(1,T ), \u03bb) such that for t \u2208 Ek,\nB( \u02c6F(1,T ), \u03bb(1,K)) offers the random posted-price auction with probability \u03bbk instead of \u03bb.\nLemma 4.1. 
In B( \u02c6F(1,T ), \u03bb(1,K)), the additional misreport at stage t \u2208 \u02c6Ek is O(1/\u221a(\u03bbk \u00b7 \u2113k^2)). Moreover,\n|Lk| = O(log \u2113k + \u2211_{t\u2208 \u02c6Ek} \u03b7t/at) with probability 1 \u2212 1/\u2113k.\nGiven this upper bound on |Lk\u22121|, the following lemma bounds the estimation error of \u02c6\u03c3k.\nLemma 4.2 (Proposition 7.1 [Golrezaei et al., 2018]). With probability 1 \u2212 1/\u2113k, the estimation error\nfor phase k is \u2206k \u2261 \u2016\u02c6\u03c3k \u2212 \u03c3\u2016 = O(\u221a(log(\u2113_{k\u22121} \u00b7 d)/\u2113_{k\u22121}) + d \u00b7 |Lk\u22121|/\u2113_{k\u22121}).\n\n4.2 Dynamic Mechanism Policy\n\nWe develop a hybrid non-clairvoyant mechanism to reduce the number of lies by reducing the\nmagnitude of misreports. To do so, observe that the buyer has no incentive to misreport in order\nto affect future stage mechanisms when the latter are static. However, as previously mentioned,\noffering a purely static mechanism may forego a large amount of revenue [Papadimitriou et al., 2016].\nMotivated by this insight, our hybrid mechanism contains both dynamic stages dependent on the\nhistory and static stages independent of the history. We adapt B( \u02c6F(1,T ), \u03bb(1,K)) to obtain a hybrid\nnon-clairvoyant mechanism Bhybrid( \u02c6F(1,T ), \u03bb(1,K), \u03c9, \u03c4 ), which is parameterized by \u03c9 \u2208 (0, 1) and\na function \u03c4 : Z+ \u2192 R+ that maps the phase number to a real number. The stage mechanism at stage\nt is parameterized by at, two balances balt and sbalt, and an additional parameter swt. We provide a\nhigh-level description of our mechanism while a detailed description is deferred to the full version.\nLet E^\u03c9_k = {t \u2208 Ek | at < \u2113k^\u03c9}. 
Intuitively, the hybrid non-clairvoyant mechanism runs different\nstage mechanisms conditioned on whether t \u2208 E^\u03c9_k or not: the stage mechanism is dynamic for\nt \u2209 E^\u03c9_k and the stage mechanism is static for t \u2208 E^\u03c9_k with high probability.\nMore precisely, for t \u2209 E^\u03c9_k, the stage mechanisms are exactly the same as B( \u02c6F(1,T ), \u03bb(1,K)) and in\nparticular, the posted-price auction with extra fee only uses the balance from balt. For t \u2208 E^\u03c9_k, the\ngive-for-free mechanism and the Myerson\u2019s auction remain the same. We use swt to keep track of the\nsummation of expected valuations, i.e., swt = (1/3) \u2211_{t\u2032\u2208E^\u03c9_k, t\u2032<t} E_{vt\u2032\u223c \u02c6Ft\u2032}[vt\u2032]. If swt < \u03c4 (k), we turn\nthe posted-price auction with extra fee into a give-for-free mechanism, but we increase the balance\nsbal instead of bal; otherwise, we run the posted-price auction with extra fee, except that it only uses\nthe balance from sbal and it will in addition deposit the buyer\u2019s utility to sbal.\n\nFor t \u2208 E^\u03c9_k and swt < \u03c4 (k), the stage mechanism is static since it in fact runs a give-for-free\nmechanism with probability 2(1\u2212\u03bbk)/3 and a Myerson\u2019s auction with probability (1\u2212\u03bbk)/3, both of which\nare independent of the history. For t \u2208 E^\u03c9_k and swt \u2265 \u03c4 (k), by choosing \u03c4 properly, we show that\nwith high probability, even if the buyer plays strategically, 3sbalt \u2265 E_{vt\u223c \u02c6Ft}[vt], which implies that\nmin(3sbalt, E_{vt\u223c \u02c6Ft}[vt]) = E_{vt\u223c \u02c6Ft}[vt] so that the posted price would be 0. Therefore, with high\nprobability, the hybrid posted-price auction with extra fee is a give-for-free mechanism with fee\nE_{vt\u223c \u02c6Ft}[vt], which is static and independent of balt and sbalt. 
To formally prove these statements,\nwe exploit the fact that the dynamics of sbalt form a martingale for stages t with swt \u2265 \u03c4 (k).\nLemma 4.3. With \u03c4 (k) = \u2126(\u2113k^{(1+\u03c9)/2}) and with probability at least 1 \u2212 1/\u2113k for all k, we have\n\nRev(Bhybrid( \u02c6F(1,T ), \u03bb(1,K), \u03c9, \u03c4 ), F(1,T )) \u2265 (1/3) \u00b7 Rev(B\u2217(F(1,T )), F(1,T )) \u2212 \u2211_k (\u221a(\u2206k/\u03bbk) \u00b7 \u2113k + (\u03c4 (k) + \u03bbk \u00b7 \u2113k)),\n\nlog \u2113k + \u2211_{t\u2208 \u02c6Ek} \u03b7t/at \u2264 \u02dcO(\u2113k^{1\u2212\u03c9}).\n\nLemma 4.3 states that there exists a function \u03c4 such that the revenue loss is at most \u2211_k (\u03c4 (k) + \u03bbk \u00b7 \u2113k)\nand the number of lies is \u02dcO(\u2113k^{1\u2212\u03c9}). In particular, as \u03c9 increases, the revenue loss increases while the\nnumber of lies decreases, and therefore, our hybrid non-clairvoyant mechanism achieves a trade-off\nbetween the revenue loss and the number of lies.\n\n4.3 The Final Policy\n\nLearning Policy: At the start of phase k, estimate \u02c6\u03c3k = arg min_{\u2016\u03c3\u2016\u22641} Lk\u22121(\u03c3).\n\nDynamic Mechanism Policy: Bhybrid( \u02c6F(1,T ), \u03bb(1,K), 1/2, \u03c4 ): at phase k\n\n\u2022 \u03bbk = \u2113k^{\u22121/6} and \u03c4 (k) = c\u2217 \u2113k^{5/6};\n\u2022 Compute the distributional information \u02c6Ft for t \u2208 Ek according to the estimation \u02c6\u03c3k;\n\nFigure 1: Robust Non-clairvoyant Dynamic Contextual Auction Policy\n\nWe are now ready to combine our learning policy and dynamic mechanism policy to obtain our\nno-regret policy for contextual auctions in a non-clairvoyant environment (Figure 1). For our hybrid\nnon-clairvoyant mechanism, we will set \u03c9 = 1/2, \u03bbk = \u2113k^{\u22121/6}, and \u03c4 (k) = c\u2217 \u2113k^{5/6} with a large enough\nconstant c\u2217. 
In particular, the estimation error for \u02c6\u03c3k is \u2206k = O(\u2113k^{\u22121/2}) under our policy.\nTheorem 4.1. The T -stage regret of the robust non-clairvoyant dynamic contextual auction policy is\n\u02dcO(T^{5/6}) against a 1/3-approximation of the optimal clairvoyant dynamic mechanism.\n\n5 Conclusion\n\nIn this paper, we present a framework for designing non-clairvoyant dynamic mechanisms that\nare robust to both the estimation errors on the buyer\u2019s distributional information and the buyer\u2019s\nstrategic behavior. We then tailor our framework to the setting of contextual auctions to develop a\nnon-clairvoyant mechanism that achieves no regret against a 1/3-approximation of the revenue-optimal\nclairvoyant dynamic mechanism. A natural direction for future work is to improve the regret guarantee\nor to provide a matching lower bound. Moreover, it is interesting to understand how to apply our\nframework to dynamic auction environments other than contextual auctions. Finally, it would also be\ninteresting to investigate what can be achieved when the seller has limited prediction power of the\nfuture, a regime between non-clairvoyant and clairvoyant environments.\n\nReferences\n\nKareem Amin, Afshin Rostamizadeh, and Umar Syed. Learning prices for repeated auctions with\nstrategic buyers. In Advances in Neural Information Processing Systems, pages 1169\u20131177, 2013.\n\nKareem Amin, Afshin Rostamizadeh, and Umar Syed. Repeated contextual auctions with strategic\nbuyers. In Advances in Neural Information Processing Systems, pages 622\u2013630, 2014.\n\nItai Ashlagi, Constantinos Daskalakis, and Nima Haghpanah. Sequential mechanisms with ex-post\nparticipation guarantees. In Proceedings of the 2016 ACM Conference on Economics and\nComputation, pages 213\u2013214. ACM, 2016.\n\nDirk Bergemann and Maher Said. Dynamic auctions. 
In Wiley Encyclopedia of Operations Research\nand Management Science, 2011.\n\nDirk Bergemann and Juuso V\u00e4lim\u00e4ki. The dynamic pivot mechanism. Econometrica, 78(2):771\u2013789,\n2010.\n\nYang Cai, Constantinos Daskalakis, and S. Matthew Weinberg. An algorithmic characterization of\nmulti-dimensional mechanisms. In Proceedings of the 44th Annual ACM Symposium on Theory of\nComputing, pages 459\u2013478, 2012.\n\nMaxime C. Cohen, Ilan Lobel, and Renato Paes Leme. Feature-based dynamic pricing. In Proceedings\nof the 2016 ACM Conference on Economics and Computation, EC \u201916, pages 817\u2013817, New York,\nNY, USA, 2016. ACM. ISBN 978-1-4503-3936-0.\n\nArnoud V. den Boer. Dynamic pricing and learning: Historical origins, current research, and new\ndirections. Surveys in Operations Research and Management Science, 20(1):1\u201318, 2015.\n\nAlexey Drutsa. Horizon-independent optimal pricing in repeated auctions with truthful and strategic\nbuyers. In Proceedings of the 26th International Conference on World Wide Web, pages 33\u201342.\nInternational World Wide Web Conferences Steering Committee, 2017.\n\nAlexey Drutsa. Weakly consistent optimal pricing algorithms in repeated posted-price auctions with\nstrategic buyer. In International Conference on Machine Learning, pages 1318\u20131327, 2018.\n\nBenjamin Edelman and Michael Ostrovsky. Strategic bidder behavior in sponsored search auctions.\nDecision Support Systems, 43(1):192\u2013198, 2007.\n\nNegin Golrezaei, Adel Javanmard, and Vahab Mirrokni. Dynamic incentive-aware learning: Robust\npricing in contextual auctions. Available at SSRN: https://ssrn.com/abstract=3144034 or\nhttp://dx.doi.org/10.2139/ssrn.3144034, 2018.\n\nChristos Koufogiannakis and Neal E. Young. A nearly linear-time PTAS for explicit fractional\npacking and covering linear programs. Algorithmica, 70(4):648\u2013674, 2014.\n\nRenato Paes Leme and Jon Schneider. Contextual search via intrinsic volumes. 
2018.\n\nJinyan Liu, Zhiyi Huang, and Xiangning Wang. Learning optimal reserve price against non-myopic\nbidders. In Advances in Neural Information Processing Systems, pages 2038\u20132048, 2018.\n\nIlan Lobel, Renato Paes Leme, and Adrian Vladu. Multidimensional binary search for contextual\ndecision-making. Operations Research, 66(5):1346\u20131361, 2018.\n\nJieming Mao, Renato Leme, and Jon Schneider. Contextual pricing for Lipschitz buyers. In Advances\nin Neural Information Processing Systems, pages 5643\u20135651, 2018.\n\nAndres M. Medina and Mehryar Mohri. Learning theory and algorithms for revenue optimization\nin second price auctions with reserve. In Proceedings of the 31st International Conference on\nMachine Learning, pages 262\u2013270, 2014.\n\nVahab Mirrokni, Renato Paes Leme, Pingzhong Tang, and Song Zuo. Dynamic auctions with bank\naccounts. In Proceedings of the 25th International Joint Conference on Artificial Intelligence,\npages 387\u2013393, 2016a.\n\nVahab Mirrokni, Renato Paes Leme, Pingzhong Tang, and Song Zuo. Optimal dynamic mechanisms\nwith ex-post IR via bank accounts. arXiv preprint arXiv:1605.08840, 2016b.\n\nVahab Mirrokni, Renato Paes Leme, Pingzhong Tang, and Song Zuo. Non-clairvoyant dynamic\nmechanism design. In Proceedings of the 19th ACM Conference on Economics and Computation,\npages 169\u2013169, 2018.\n\nRoger B. Myerson. Optimal auction design. Mathematics of Operations Research, 6(1):58\u201373, 1981.\n\nChristos Papadimitriou, George Pierrakos, Christos-Alexandros Psomas, and Aviad Rubinstein. On\nthe complexity of dynamic mechanism design. In Proceedings of the 27th annual ACM-SIAM\nSymposium on Discrete Algorithms, pages 1458\u20131475, 2016.\n\nJonathan Thomas and Tim Worrall. Income fluctuation and asymmetric information: An example of\na repeated principal-agent problem. 
Journal of Economic Theory, 51(2):367\u2013390, 1990.\n", "award": [], "sourceid": 4659, "authors": [{"given_name": "Yuan", "family_name": "Deng", "institution": "Duke University"}, {"given_name": "S\u00e9bastien", "family_name": "Lahaie", "institution": "Google Research"}, {"given_name": "Vahab", "family_name": "Mirrokni", "institution": "Google Research NYC"}]}