{"title": "Multiplicative Weights Update with Constant Step-Size in Congestion Games:  Convergence, Limit Cycles and Chaos", "book": "Advances in Neural Information Processing Systems", "page_first": 5872, "page_last": 5882, "abstract": "The Multiplicative Weights Update (MWU) method is a ubiquitous meta-algorithm that works as follows: A distribution is maintained on a certain set, and at each step the probability assigned to action $\\gamma$ is multiplied by $(1 -\\epsilon C(\\gamma))>0$ where $C(\\gamma)$ is the ``cost\" of action $\\gamma$ and then rescaled to ensure that the new values form a distribution.  We analyze MWU in congestion games where agents use \\textit{arbitrary admissible constants} as learning rates $\\epsilon$ and prove convergence to \\textit{exact Nash equilibria}. Interestingly, this convergence result does not carry over to the nearly homologous MWU variant where at each step the probability assigned to action $\\gamma$ is multiplied by $(1 -\\epsilon)^{C(\\gamma)}$ even for the simplest case of two-agent, two-strategy load balancing games, where such dynamics can provably lead to limit cycles or even chaotic behavior.", "full_text": "Multiplicative Weights Update with Constant Step-Size in Congestion Games: Convergence, Limit Cycles and Chaos\n\nGerasimos Palaiopanos\u2217\n\nSUTD\n\nSingapore\n\ngerasimosath@yahoo.com\n\nIoannis Panageas\u2020\n\nMIT\n\nCambridge, MA 02139\n\nioannis@csail.mit.edu\n\nGeorgios Piliouras\u2021\n\nSUTD\n\nSingapore\n\ngeorgios@sutd.edu.sg\n\nAbstract\n\nThe Multiplicative Weights Update (MWU) method is a ubiquitous meta-algorithm\nthat works as follows: A distribution is maintained on a certain set, and at each\nstep the probability assigned to action \u03b3 is multiplied by (1 \u2212 \u03b5C(\u03b3)) > 0 where\nC(\u03b3) is the \u201ccost\u201d of action \u03b3 and then rescaled to ensure that the new values form\na distribution. 
We analyze MWU in congestion games where agents use arbitrary\nadmissible constants as learning rates \u03b5 and prove convergence to exact Nash\nequilibria. Interestingly, this convergence result does not carry over to the nearly\nhomologous MWU variant where at each step the probability assigned to action \u03b3\nis multiplied by (1 \u2212 \u03b5)^{C(\u03b3)} even for the simplest case of two-agent, two-strategy\nload balancing games, where such dynamics can provably lead to limit cycles or\neven chaotic behavior.\n\n\u2217Gerasimos Palaiopanos would like to acknowledge an SUTD Presidential fellowship.\n\u2020Ioannis Panageas would like to acknowledge an MIT-SUTD postdoctoral fellowship. Part of this work was\ncompleted while Ioannis Panageas was a PhD student at Georgia Institute of Technology and a visiting scientist\nat the Simons Institute for the Theory of Computing.\n\u2021Georgios Piliouras would like to acknowledge SUTD grant SRG ESD 2015 097, MOE AcRF Tier 2 Grant\n2016-T2-1-170 and an NRF Fellowship. Part of this work was completed while Georgios Piliouras was a visiting\nscientist at the Simons Institute for the Theory of Computing.\n\n31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.\n\n1 Introduction\n\nThe Multiplicative Weights Update (MWU) is a ubiquitous meta-algorithm with numerous applications\nin different fields [2]. It is particularly useful in game theory due to its regret-minimizing\nproperties [24, 11]. It is typically introduced in two nearly identical variants, the one in which at\neach step the probability assigned to action \u03b3 is multiplied by (1 \u2212 \u03b5C(\u03b3)) and the one in which\nit is multiplied by (1 \u2212 \u03b5)^{C(\u03b3)} where C(\u03b3) is the cost of action \u03b3. We will refer to the first as the\nlinear variant, MWU\u2113, and the second as the exponential, MWUe (also known as Hedge). In the\nliterature there is little distinction between these two variants as both carry the same advantageous\nregret-minimizing property. It is also well known that in order to achieve sublinear regret, the learning\nrate \u03b5 must be decreasing as time progresses. This constraint raises a natural question: Are there\ninteresting classes of games where MWU behaves well without the need to fine-tune its learning rate?\nA natural setting to test the learning behavior of MWU with constant learning rates \u03b5 is the well-studied\nclass of congestion games. Unfortunately, even for the simplest instances of congestion\ngames MWUe fails to converge to equilibria. For example, even in the simplest case of two balls two\nbins games,\u2074 MWUe with \u03b5 = 1 \u2212 e^{\u221210} is shown to converge to a limit cycle of period 2 for infinitely\nmany initial conditions (Theorem 4.1). If the cost functions of the two edges are not identical then we\ncreate instances of two player load balancing games such that MWUe has periodic orbits of length k\nfor all k > 0, as well as uncountably many initial conditions which never settle on any periodic orbit\nbut instead exhibit an irregular behavior known as Li-Yorke chaos (Theorem 4.2, see Corollary 4.3).\nThe source of these problems is exactly the large, fixed learning rate \u03b5, e.g., \u03b5 \u2248 1 for costs in [0, 1].\nIntuitively, the key aspect of the problem can be captured by (simultaneous) best response dynamics.\nIf both agents start from the same edge and best-respond simultaneously they will land on the second\nedge which now has a load of two. In the next step they will both jump back to the first edge and this\nmotion will be continued perpetually. Naturally, MWUe dynamics are considerably more intricate as\nthey evolve over mixed strategies and allow for more complicated non-equilibrium behavior but the\nkey insight is correct. 
Each agent has the right goal, to decrease his own cost and hence the potential of\nthe game; however, as they pursue this goal too aggressively they cancel each other\u2019s gains and lead\nto unpredictable non-converging behavior.\nIn a sense, the cautionary tales above agree with our intuition. Large, constant learning rates \u03b5 nullify\nthe known performance guarantees of MWU. We should expect erratic behavior in such cases. The\ntypical way to circumvent these problems is through careful monitoring and possibly successive\nhalving of the \u03b5 parameter, a standard technique in the MWU literature. In this paper, we explore an\nalternative, cleaner, and surprisingly elegant solution to this problem. We show that applying MWU\u2113,\nthe linear variant of MWU, suffices to guarantee convergence in all congestion games.\nOur key contributions. Our key result is the proof of convergence of MWU\u2113 in congestion games.\nThe main technical contribution is a proof that the potential of the mixed state is always strictly\ndecreasing along any nontrivial trajectory (Theorem 3.1). This result holds for all congestion games,\nirrespective of the number of agents or the size and topology of the strategy sets. Moreover, each agent i\nmay be applying a different learning rate \u03b5i which will be constant along the dynamics (\u03b5i does not\ndepend on the number of iterations T of the dynamics and therefore is bounded away from zero as\nT \u2192 \u221e; this is not the case for most of the results in the literature). The only restriction on the set\nof allowable learning rates \u03b5i is that for each agent the multiplicative factor (1 \u2212 \u03b5iCi(s)) should\nbe positive for all strategy outcomes s.\u2075 Arguing convergence to equilibria for all initial conditions\n(Theorem 3.4) and further, convergence to Nash equilibria for all interior initial conditions (Theorem\n3.8) follows. 
Proving that the potential always decreases (Theorem 3.1) hinges upon discovering a\nnovel interpretation of MWU dynamics. Specifically, we show that the class of dynamical systems\nderived by applying MWU\u2113 in congestion games is a special case of a convergent class of dynamical\nsystems introduced by Baum and Eagon [5] (see Theorem 2.4). The most well known member of this\nclass is the classic Baum-Welch algorithm, the standard instantiation of the Expectation-Maximization\n(EM) algorithm for hidden Markov models (HMM). Effectively, the proof of convergence of both\nthese systems boils down to a proof of membership in the same class of Baum-Eagon systems (see\nSection 2.3 for more details on these connections).\nIn the second part we provide simple congestion games where MWUe provably fails to converge. The\nfirst main technical contribution of this section is proving convergence to a limit cycle, specifically a\nperiodic orbit of length two, for the simplest case of two balls two bins games for infinitely many initial\nconditions (Theorem 4.1). Moreover, after normalizing costs to lie in [0, 1], i.e. c(x) = x/2, we prove\nthat almost all symmetric non-equilibrium initial conditions converge to a unique limit cycle when\nboth agents use learning rate \u03b5 = 1 \u2212 e^{\u221210}. In contrast, since 1 \u2212 \u03b5 \u00b7 C(s) \u2265 1 \u2212 (1 \u2212 e^{\u221210}) \u00b7 1 = e^{\u221210} > 0,\nMWU\u2113 successfully converges to equilibrium. In other words, for the same learning rates, MWUe\nexhibits chaotic behavior whereas MWU\u2113 converges to Nash equilibrium. Establishing chaotic\nbehavior for the case of edges with different cost functions is rather straightforward in comparison\n(Theorem 4.2). The key step is to exploit symmetries in the system to reduce it to a single dimensional\none and then establish the existence of a periodic orbit of length three. 
The existence of periodic\norbits of any length as well as chaotic orbits then follows from the Li-Yorke theorem (Theorem 2.3) [30] (see\nSection 2.2 for background on chaos and dynamical systems). Finally, for any learning rate 1 > \u03b5 > 0,\nwe construct n-player games so that MWUe has chaotic behavior for uncountably many starting\npoints.\n\n\u2074 n balls n bins games are symmetric load balancing games with n agents and n edges/elements, each with a\ncost function of c(x) = x. We normalize costs to c(x) = x/n so that they lie in [0, 1].\n\n\u2075 This is an absolutely minimal restriction so that the denominator of MWU\u2113 cannot become equal to zero.\n\nRelated work and Extensions/Implications of our results.\n\nConnections to learning in games and price of anarchy: Several recent papers, e.g., [40, 22] focus\non proving welfare guarantees of no-regret dynamics in games exploiting connections to (robust)\nprice of anarchy literature [37] by establishing fast convergence of the time average behavior to\n(approximate) coarse correlated equilibria. Although these approaches are rather powerful they are\nnot always applicable. For example, it is well known that when we consider the makespan (i.e. the\nload of the most congested machine) instead of the social/total cost there can be an exponential gap\nbetween the performance of coarse correlated equilibria and Nash equilibria. For example the price\nof anarchy for the makespan objective for n balls n bins games is O(log(n)/ log log(n)) whereas for\nthe worst no regret algorithm it can be \u2126(\u221an) [9]. Moreover, even if we focus on the social cost, the\nprice of anarchy guarantees do not carry over if we perform an affine transformation to the cost functions\n(e.g. if there exist users of different tiers/types that the system designer wants to account for in a\ndifferential manner). 
In contrast, our convergence results are robust to any affine cost transformation.\nIn fact, our results apply for all weighted potential games [32] (Remark 3.5).\nConnections to distributed computation and adversarial agent scheduling: A rather realistic\nconcern about results on learning in games has to do with their sensitivity to the ordering of the moves\nof the agent dynamics. For example, better-response dynamics in congestion games are guaranteed to\nconverge only if in every round, exactly one agent deviates to a better strategy. A series of recent\npapers has established strong non-termination (cycling) results for large classes of bounded recall\ndynamics with a wide variety of interesting and timely applications: game theory, circuit design,\nsocial networks, routing and congestion control [26, 19, 34, 25]. In the case of games, these results\ntranslate to corollaries such as: \u201cIf there are two or more pure Nash equilibria in a game with unique\nbest responses, then all bounded-recall self-independent dynamics\u2076 for which those equilibria are\nfixed points can fail to converge in asynchronous environments.\u201d Even the simplest 2 balls 2 bins\ngame satisfies these properties (two pure Nash and unique best responses) which shows the strength\nof this impossibility result. In contrast, our convergence result holds for any adversarial scheduling\nwith the minimal fairness assumption that given any mixed state at least one agent who is not best\nresponding eventually will be given the possibility to update their behavior, answering open questions\nin [26, 25]. 
In fact, our convergence result is in a sense the strongest possible: no matter how many\nagents get to update their behavior (as long as one of them does), the potential of the game will\nstrictly decrease (Corollary 3.6).\nConnections to complexity theory: Whereas the complexity of computing mixed Nash equilibria\nin general games (PPAD-complete [17]) as well as the complexity of finding pure Nash equilibria\nin congestion games (PLS-complete [20]) have both been completely characterized and are thus\nunlikely to admit an efficient time algorithm, the complexity of computing mixed Nash equilibria\nin congestion games has withstood so far an exhaustive characterization. Naturally, it lies in the\nintersection of both PPAD and PLS, known as CLS [18]. Such an equilibrium can be found both via\nan end-of-line type of argument as well as a local search type of argument, but it is still not known\nif it is CLS-complete. Given the active interest in producing CLS-complete problems [16, 21] our\nconstructive/convergence proof may help shed light on this open question.\nChaos for arbitrarily small learning rates \u03b5: Although our example of chaotic behavior uses a very\nhigh learning rate \u03b5 = 1 \u2212 e^{\u221210}, it should be noted that for any learning rate \u03b5 (e.g. \u03b5 = e^{\u221210}), as\nwell as for any number of agents n, we can create congestion games with n agents where MWUe\nexhibits chaotic behavior (Corollary 4.3).\nCongestion/potential games: Congestion games are amongst the most well known and thoroughly\nstudied class of games. Proposed in [36] and isomorphic to potential games [32], they have been\nsuccessfully employed in myriad modeling problems. 
Despite the numerous positive convergence\nresults for concurrent dynamics in congestion games, e.g., [33, 23, 7, 1, 6, 28, 10, 13, 12, 31], we\nknow of no prior work establishing such a deterministic convergence result of the day-to-day agent\nbehavior to exact Nash equilibria for general atomic congestion games. MWU has also been studied\nin congestion games. In [29] randomized variants of the exponential version of the MWU are shown\nto converge w.h.p. to pure Nash equilibria as long as the learning rate \u03b5 is small enough. In contrast\nour positive results for linear MWU\u2113 hold deterministically and for all learning rates. Recently, [14]\nshowed that if the Hedge algorithm is run with a suitably decreasing learning factor \u03b5, the sequence\nof play converges to a Nash equilibrium with probability 1 (in the bandit case). The result and the\ntechniques are orthogonal to ours, since we assume fixed learning rates.\n\n\u2076 A dynamic is called self-independent if the agent\u2019s response does not depend on his actions.\n\nNon-convergent dynamics: Outside the class of congestion games, there exist several negative\nresults in the literature concerning the non-convergence of MWU and variants thereof. In particular,\nin [15] it was shown that the multiplicative updates algorithm fails to find the unique Nash equilibrium\nof the 3 \u00d7 3 Shapley game. Similar non-convergent results have been proven for perturbed zero-sum\ngames [4], as well as for the continuous time version of MWU, the replicator dynamics [27, 35]. The\npossibility of applying Li-Yorke type arguments for MWU in congestion games with two agents\nwas inspired by a remark in [3] for the case of continuum of agents. Our paper is the first to our\nknowledge where non-convergent MWU behavior in congestion games is formally proven capturing\nboth limit cycles and chaos and we do so in the minimal case of two balls two bins games.\n\n2 Preliminaries\n\nNotation. 
We use boldface letters, e.g., x, to denote column vectors (points). For a function\nf : R^m \u2192 R^m, by f^n we denote the composition of f with itself n times, namely $f^n = \\underbrace{f \\circ f \\circ \\cdots \\circ f}_{n \\text{ times}}$.\n\n2.1 Congestion Games\nA congestion game [36] is defined by the tuple (N ; E; (Si)i\u2208N ; (ce)e\u2208E) where N is the set of\nagents, N = |N|, E is a set of resources (also known as edges or bins or facilities) and each player i\nhas a set Si of subsets of E (Si \u2286 2^E) and |Si| \u2265 1. Each strategy si \u2208 Si is a set of edges and ce is\na positive cost (latency) function associated with facility e. We use small Greek characters like \u03b3, \u03b4\nto denote different strategies/paths. For a strategy profile s = (s1, s2, . . . , sN ), the cost of player i\nis given by $c_i(s) = \\sum_{e \\in s_i} c_e(\\ell_e(s))$, where \u2113e(s) is the number of players using e in s (the load of\nedge e). The potential function is defined to be $\\Phi(s) = \\sum_{e \\in E} \\sum_{j=1}^{\\ell_e(s)} c_e(j)$.\nFor each i \u2208 N and \u03b3 \u2208 Si, pi\u03b3 denotes the probability player i chooses strategy \u03b3. We denote\nby $\\Delta(S_i) = \\{p \\geq 0 : \\sum_{\\gamma} p_{i\\gamma} = 1\\}$ the set of mixed (randomized) strategies of player i and\n\u2206 = \u00d7i\u2206(Si) the set of mixed strategies of all players. We use $c_{i\\gamma} = \\mathbb{E}_{s_{-i} \\sim p_{-i}} c_i(\\gamma, s_{-i})$ to denote\nthe expected cost of player i given that he chooses strategy \u03b3 and $\\hat{c}_i = \\sum_{\\delta \\in S_i} p_{i\\delta} c_{i\\delta}$ to denote his\nexpected cost.\n\n2.2 Dynamical Systems and Chaos\nLet x(t+1) = f (x(t)) be a discrete time dynamical system with update rule f : R^m \u2192 R^m. The\npoint z is called a fixed point of f if f (z) = z. A sequence (f^t(x(0)))t\u2208N is called a trajectory or\norbit of the dynamics with x(0) as starting point. 
A common technique to show that a dynamical\nsystem converges to a fixed point is to construct a function P : R^m \u2192 R such that P (f (x)) > P (x)\nunless x is a fixed point. We call P a Lyapunov or potential function.\nDefinition 2.1. C = {z1, . . . , zk} is called a periodic orbit of length k if z_{i+1} = f (z_i) for 1 \u2264 i \u2264\nk \u2212 1 and f (zk) = z1. Each point z1, . . . , zk is called a periodic point of period k. If the dynamics\nconverges to some periodic orbit, we also use the term limit cycle.\n\nSome dynamical systems converge and their behavior can be fully understood and some others\nhave strange, chaotic behavior. There are many different definitions for what chaotic behavior and\nchaos means. In this paper we follow the definition of chaos by Li and Yorke. Let us first give\nthe definition of a scrambled set. Given a dynamical system with update rule f, a pair x and y is\ncalled \u201cscrambled\u201d if lim inf_{n\u2192\u221e} |f^n(x) \u2212 f^n(y)| = 0 (the trajectories get arbitrarily close) and\nalso lim sup_{n\u2192\u221e} |f^n(x) \u2212 f^n(y)| > 0 (the trajectories move apart). A set S is called \u201cscrambled\u201d\nif \u2200x, y \u2208 S, the pair is \u201cscrambled\u201d.\nDefinition 2.2 (Li and Yorke). A discrete time dynamical system with update rule f, f : X \u2192 X\ncontinuous on a compact set X \u2282 R is called chaotic if (a) for each k \u2208 Z+, there exists a periodic\npoint p \u2208 X of period k and (b) there is an uncountably infinite set S \u2286 X that is \u201cscrambled\u201d.\nLi and Yorke proved the following theorem [30] (there is another theorem of similar flavor due to\nSharkovskii [38]):\n\nTheorem 2.3 (Period three implies chaos). Let J be an interval and let F : J \u2192 J be continuous.\nAssume there is a point a \u2208 J for which the points b = F (a), c = F^2(a) and d = F^3(a), satisfy\n\nd \u2264 a < b < c (or d \u2265 a > b > c).\n\nThen\n\n1. For every k = 1, 2, . . . 
there is a periodic point in J having period k.\n2. There is an uncountable set S \u2282 J (containing no periodic points), which satisfies the\nfollowing conditions:\n\u2022 For every p, q \u2208 S with p \u2260 q, lim sup_{n\u2192\u221e} |F^n(p) \u2212 F^n(q)| > 0 and lim inf_{n\u2192\u221e} |F^n(p) \u2212 F^n(q)| = 0.\n\u2022 For every point p \u2208 S and periodic point q \u2208 J, lim sup_{n\u2192\u221e} |F^n(p) \u2212 F^n(q)| > 0.\n\nNotice that if there is a periodic point with period 3, then the hypothesis of the theorem will be\nsatisfied.\n\n2.3 Baum-Eagon Inequality, Baum-Welch and EM\n\nWe start this subsection by stating the Baum-Eagon inequality. This inequality will be used to show\nthat MWU\u2113 converges to fixed points and more specifically Nash equilibria for congestion games.\nTheorem 2.4 (Baum-Eagon inequality [5]). Let P (x) = P ({xij}) be a polynomial with nonnegative\ncoefficients homogeneous of degree d in its variables {xij}. Let x = {xij} be any point of the\ndomain D : xij \u2265 0, $\\sum_{j=1}^{q_i} x_{ij} = 1$, i = 1, 2, ..., p, j = 1, 2, ..., qi. For x = {xij} \u2208 D let\n\u2111(x) = \u2111{xij} denote the point of D whose i, j coordinate is\n\n$$\\Im(x)_{ij} = \\left( x_{ij} \\left. \\frac{\\partial P}{\\partial x_{ij}} \\right|_{(x)} \\right) \\Big/ \\sum_{j'=1}^{q_i} x_{ij'} \\left. \\frac{\\partial P}{\\partial x_{ij'}} \\right|_{(x)}$$\n\nThen P (\u2111(x)) > P (x) unless \u2111(x) = x.\n\nThe Baum-Welch algorithm is a classic technique used to find the unknown parameters of a hidden\nMarkov model (HMM). An HMM describes the joint probability of a collection of \u201chidden\u201d and\nobserved discrete random variables. It relies on the assumption that the i-th hidden variable given the\n(i \u2212 1)-th hidden variable is independent of previous hidden variables, and the current observation\nvariables depend only on the current hidden state. The Baum-Welch algorithm uses the well known\nEM algorithm to find the maximum likelihood estimate of the parameters of a hidden Markov model\ngiven a set of observed feature vectors. A more detailed exposition of these ideas can be found in\n[8]. The probability of making a specific time series of observations of length T can be shown to\nbe a homogeneous polynomial P of degree T with nonnegative (integer) coefficients of the model\nparameters. The Baum-Welch algorithm is homologous to the iterative process derived by applying the\nBaum-Eagon theorem to polynomial P [5, 41].\nIn a nutshell, both Baum-Welch and MWU\u2113 in congestion games are special cases of the Baum-Eagon\niterative process (for different polynomials P ).\n\n2.4 Multiplicative Weights Update\n\nIn this section, we describe the MWU dynamics (both the linear MWU\u2113, and the exponential\nMWUe variants) applied in congestion games. The update rule (function) \u03be : \u2206 \u2192 \u2206 (where\np(t + 1) = \u03be(p(t))) for the linear variant MWU\u2113 is as follows:\n\n$$p_{i\\gamma}(t+1) = (\\xi(p(t)))_{i\\gamma} = p_{i\\gamma}(t) \\, \\frac{1 - \\epsilon_i c_{i\\gamma}(t)}{1 - \\epsilon_i \\hat{c}_i(t)}, \\quad \\forall i \\in \\mathcal{N}, \\forall \\gamma \\in S_i, \\qquad (1)$$\n\nwhere \u03b5i is a constant (can depend on player i but not on p) so that both numerator and denominator\nof the fraction in (1) are positive (and thus the fraction is well defined). 
Under the assumption that\n$1/\\epsilon_i > 1/\\beta \\stackrel{\\text{def}}{=} \\sup_{i, p \\in \\Delta, \\gamma \\in S_i} \\{c_{i\\gamma}\\}$, it follows that 1/\u03b5i > ci\u03b3 for all i, \u03b3 and hence 1/\u03b5i > \u0109i.\nThe update rule (function) \u03b7 : \u2206 \u2192 \u2206 (where p(t + 1) = \u03b7(p(t))) for the exponential variant\nMWUe is as follows:\n\n$$p_{i\\gamma}(t+1) = (\\eta(p(t)))_{i\\gamma} = p_{i\\gamma}(t) \\, \\frac{(1 - \\epsilon_i)^{c_{i\\gamma}(t)}}{\\sum_{\\gamma' \\in S_i} p_{i\\gamma'}(t) (1 - \\epsilon_i)^{c_{i\\gamma'}(t)}}, \\quad \\forall i \\in \\mathcal{N}, \\forall \\gamma \\in S_i, \\qquad (2)$$\n\nwhere \u03b5i < 1 is a constant (can depend on player i but not on p). Note that \u03b5i can be small when the\nnumber of agents N is large enough.\nRemark 2.5. Observe that \u2206 is invariant under the discrete dynamics (1), (2) defined above. If\npi\u03b3 = 0 then pi\u03b3 remains zero, and if it is positive, it remains positive (both numerator and\ndenominator are positive) and it is also true that $\\sum_{\\gamma \\in S_i} p_{i\\gamma} = 1$ for all agents i. A point p\u2217 is called\na fixed point if it stays invariant under the update rule of the dynamics, namely \u03be(p\u2217) = p\u2217 or\n\u03b7(p\u2217) = p\u2217. A point p\u2217 is a fixed point of (1), (2) if for all i, \u03b3 with p\u2217_{i\u03b3} > 0 we have that ci\u03b3 = \u0109i.\nTo see why, observe that if p\u2217_{i\u03b3}, p\u2217_{i\u03b3'} > 0, then ci\u03b3 = ci\u03b3' and thus ci\u03b3 = \u0109i. We conclude that the sets\nof fixed points of both dynamics (1), (2) coincide and are supersets of the set of Nash equilibria of the\ncorresponding congestion game.\n\n3 Convergence of MWU\u2113 to Nash Equilibria\n\nWe first prove that MWU\u2113 (1) converges to fixed points.\u2077 Technically, we establish that function\n$\\Psi \\stackrel{\\text{def}}{=} \\mathbb{E}_{s \\sim p}[\\Phi(s)]$ is strictly decreasing along any nontrivial (i.e. nonequilibrium) trajectory, where\n\u03a6 is the potential function of the congestion game as defined in Section 2. Formally we show the\nfollowing theorem:\nTheorem 3.1 (\u03a8 is decreasing). Function \u03a8 is decreasing w.r.t. time, i.e., \u03a8(p(t + 1)) \u2264 \u03a8(p(t))\nwhere equality \u03a8(p(t + 1)) = \u03a8(p(t)) holds only at fixed points.\n\nWe define the function\n\n$$Q(p) \\stackrel{\\text{def}}{=} \\underbrace{\\sum_{i \\in \\mathcal{N}} \\left( (1/\\epsilon_i - 1/\\beta) \\cdot \\sum_{\\gamma \\in S_i} p_{i\\gamma} \\right) + 1/\\beta \\cdot \\prod_{i \\in \\mathcal{N}} \\left( \\sum_{\\gamma \\in S_i} p_{i\\gamma} \\right)}_{\\text{constant term}} - \\Psi(p), \\qquad (3)$$\n\nand show that Q(p) is strictly increasing w.r.t. time, unless p is a fixed point. Observe that\n$\\sum_{\\gamma \\in S_i} p_{i\\gamma} = 1$ since p lies in \u2206, but we include these terms in Q for technical reasons that will be\nmade clear later in the section. By showing that Q is increasing with time, Theorem 3.1 trivially\nfollows since Q = const \u2212 \u03a8 where $\\text{const} = \\sum_{i \\in \\mathcal{N}} 1/\\epsilon_i - 1/\\beta \\cdot (N - 1)$. To show that Q(p) is\nstrictly increasing w.r.t. time, unless p is a fixed point, we use a generalization of an inequality by\nBaum and Eagon [5] on function Q.\nCorollary 3.2 (Generalization of Baum-Eagon). Theorem 2.4 holds even if P is non-homogeneous.\n\nWe want to apply Corollary 3.2 on Q. To do so, it suffices to show that Q(p) is a polynomial with\nnonnegative coefficients.\nLemma 3.3. Q(p) is a polynomial with respect to pi\u03b3 and has nonnegative coefficients.\n\nUsing Lemma 3.3 and Corollary 3.2 we show the following:\nTheorem 3.4. Let Q be the function defined in (3). Let also p(t) \u2208 \u2206 be the point MWU\u2113 (1)\noutputs at time t with update rule \u03be. It holds that $Q(p(t+1)) \\stackrel{\\text{def}}{=} Q(\\xi(p(t))) > Q(p(t))$ unless\n\u03be(p(t)) = p(t) (fixed point). Namely Q is strictly increasing with respect to the number of iterations\nt unless MWU\u2113 is at a fixed point.\n\n\u2077 All missing proofs can be found in the full version of this paper http://arxiv.org/abs/1703.01138.\n\nRemark 3.5 (Weighted potential games). A congestion game is a potential game because if a player\ndeviates, the difference he experiences in his cost is exactly captured by the deviation of the global\n(same for all players) function $\\Phi = \\sum_{e \\in E} \\sum_{j=1}^{\\ell_e(s)} c_e(j)$. In a weighted potential game, it holds that\n$c_i(s_i, s_{-i}) - c_i(s'_i, s_{-i}) = w_i(\\Phi(s_i, s_{-i}) - \\Phi(s'_i, s_{-i}))$, where wi is some constant not necessarily\n1 (as in the potential games case) and vector s\u2212i captures the strategies of all players but i. It\nis not hard to see that Lemma 3.3 and thus Theorems 3.4 and 3.1 hold in this particular class of\ngames (which is a generalization of congestion games), and so do the rest of the theorems of the\nsection. Effectively, in terms of the weighted potential games analysis, it is possible to reduce it to\nthe standard potential games analysis as follows: Consider the system with learning rates \u03b5i and\ncost functions wici so that the game with cost functions ci is a potential game. The only necessary\ncondition that we ask of this system is that \u03b5iwici(s) < 1 for all i (as in the standard case) so that\nthe numerators/denominators are positive.\n\nBy reduction, we can show that for every round T , even if a subset (that depends on the round T )\nof the players update their strategy according to MWU\u2113 and the rest remain fixed, the potential still\ndecreases.\nCorollary 3.6 (Any subset). 
Assume that at time t we partition the players in two sets S_t, S'_t so that\nwe allow only players in S_t to apply MWU\u2113 dynamics, whereas the players in S'_t remain fixed. It\nholds that the expected potential function of the game at time t decreases.\n\nAs stated earlier in the section, if Q(p(t)) is strictly increasing with respect to time t unless p(t) is\na fixed point, it follows that the expected potential function \u03a8(p(t)) = const \u2212 Q(p(t)) is strictly\ndecreasing unless p(t) is a fixed point and Theorem 3.1 is proved. Moreover, we can derive the fact\nthat our dynamics converges to fixed points as a corollary of Theorem 3.1.\nTheorem 3.7 (Convergence to fixed points). MWU\u2113 dynamics (1) converges to fixed points.\n\nWe conclude the section by strengthening the convergence result (i.e., Theorem 3.7). We show that if\nthe initial distribution p is in the interior of \u2206 then we have convergence to Nash equilibria.\nTheorem 3.8 (Convergence to Nash equilibria). Assume that the fixed points of (1) are isolated. Let\np(0) be a point in the interior of \u2206. It follows that lim_{t\u2192\u221e} p(t) = p\u2217 is a Nash equilibrium.\n\nProof. We showed in Theorem 3.7 that MWU\u2113 dynamics (1) converges, hence lim_{t\u2192\u221e} p(t) exists\n(under the assumption that the fixed points are isolated) and is equal to a fixed point of the dynamics\np\u2217. Also it is clear from the dynamics that \u2206 is invariant, i.e., $\\sum_{\\delta \\in S_j} p_{j\\delta}(t) = 1$, pj\u03b4(t) > 0 for all\nj and t \u2265 0 since p(0) is in the interior of \u2206.\nAssume that p\u2217 is not a Nash equilibrium, then there exists a player i and a strategy \u03b3 \u2208 Si so that\nci\u03b3(p\u2217) < \u0109i(p\u2217) (on mixed strategies p\u2217) and p\u2217_{i\u03b3} = 0. Fix a \u03b6 > 0 and let U\u03b6 = {p : ci\u03b3(p) <\n\u0109i(p) \u2212 \u03b6}. By continuity we have that U\u03b6 is open. It is also true that p\u2217 \u2208 U\u03b6 for \u03b6 small enough.\nSince p(t) converges to p\u2217 as t \u2192 \u221e, there exists a time t0 so that for all t' \u2265 t0 we have that\np(t') \u2208 U\u03b6. However, from MWU\u2113 dynamics (1) we get that if p(t') \u2208 U\u03b6 then 1 \u2212 \u03b5ici\u03b3(t') >\n1 \u2212 \u03b5i\u0109i(t') and hence $p_{i\\gamma}(t'+1) = p_{i\\gamma}(t') \\frac{1 - \\epsilon_i c_{i\\gamma}(t')}{1 - \\epsilon_i \\hat{c}_i(t')} \\geq p_{i\\gamma}(t') > 0$, i.e., pi\u03b3(t') is positive and\nincreasing with t' \u2265 t0. We reached a contradiction since pi\u03b3(t) \u2192 p\u2217_{i\u03b3} = 0, thus p\u2217 is a Nash\nequilibrium.\n\n4 Non-Convergence of MWUe: Limit Cycle and Chaos\n\nWe consider a symmetric two agent congestion game with two edges e1, e2. Both agents have the\nsame two available strategies \u03b31 = {e1} and \u03b32 = {e2}. We denote x, y the probability that the first\nand the second agent respectively choose strategy \u03b31.\nFor the first example, we assume that ce1(l) = 1/2 \u00b7 l and ce2(l) = 1/2 \u00b7 l. Computing the expected\ncosts we get that c1\u03b31 = (1 + y)/2, c1\u03b32 = (2 \u2212 y)/2, c2\u03b31 = (1 + x)/2, c2\u03b32 = (2 \u2212 x)/2. MWUe then becomes\n$x_{t+1} = \\frac{x_t (1-\\epsilon_1)^{(y_t+1)/2}}{x_t (1-\\epsilon_1)^{(y_t+1)/2} + (1-x_t)(1-\\epsilon_1)^{(2-y_t)/2}}$ (first player) and\n$y_{t+1} = \\frac{y_t (1-\\epsilon_2)^{(x_t+1)/2}}{y_t (1-\\epsilon_2)^{(x_t+1)/2} + (1-y_t)(1-\\epsilon_2)^{(2-x_t)/2}}$ (second player).\nWe assume that \u03b51 = \u03b52 and also that x0 = y0 (players start with the same mixed\nstrategy). Due to symmetry, it follows that xt = yt for all t \u2208 N, thus it suffices to keep track only of\none variable (we have reduced the number of variables of the update rule of the dynamics to one) and\nthe dynamics becomes $x_{t+1} = \\frac{x_t (1-\\epsilon)^{(x_t+1)/2}}{x_t (1-\\epsilon)^{(x_t+1)/2} + (1-x_t)(1-\\epsilon)^{(2-x_t)/2}}$. Finally, we choose \u03b5 = 1 \u2212 e^{\u221210}\nand we get\n\n$$x_{t+1} = H(x_t) = \\frac{x_t e^{-5(x_t+1)}}{x_t e^{-5(x_t+1)} + (1-x_t) e^{-5(2-x_t)}},$$\n\ni.e., we denote $H(x) = \\frac{x e^{-5(x+1)}}{x e^{-5(x+1)} + (1-x) e^{-5(2-x)}}$.\n\nFor the second example, we assume that ce1(l) = 1/4 \u00b7 l and ce2(l) = 1.4/4 \u00b7 l. Computing\nthe expected costs we get that c1\u03b31 = (1 + y)/4, c1\u03b32 = 1.4(2 \u2212 y)/4, c2\u03b31 = (1 + x)/4, c2\u03b32 = 1.4(2 \u2212 x)/4.\nMWUe then becomes $x_{t+1} = \\frac{x_t (1-\\epsilon_1)^{(y_t+1)/4}}{x_t (1-\\epsilon_1)^{(y_t+1)/4} + (1-x_t)(1-\\epsilon_1)^{1.4(2-y_t)/4}}$ (first player) and\n$y_{t+1} = \\frac{y_t (1-\\epsilon_2)^{(x_t+1)/4}}{y_t (1-\\epsilon_2)^{(x_t+1)/4} + (1-y_t)(1-\\epsilon_2)^{1.4(2-x_t)/4}}$ (second player). We assume that \u03b51 = \u03b52 and also that x0 = y0\n(players start with the same mixed strategy). Similarly, due to symmetry, it follows that xt = yt\nfor all t \u2208 N, thus it suffices to keep track only of one variable and the dynamics becomes\n$x_{t+1} = \\frac{x_t (1-\\epsilon)^{(x_t+1)/4}}{x_t (1-\\epsilon)^{(x_t+1)/4} + (1-x_t)(1-\\epsilon)^{1.4(2-x_t)/4}}$. Finally, we choose \u03b5 = 1 \u2212 e^{\u221240} and we get\n\n$$x_{t+1} = G(x_t) = \\frac{x_t e^{-10(x_t+1)}}{x_t e^{-10(x_t+1)} + (1-x_t) e^{-14(2-x_t)}},$$\n\ni.e., we denote $G(x) = \\frac{x e^{-10(x+1)}}{x e^{-10(x+1)} + (1-x) e^{-14(2-x)}}$.\n\n(a) Exponential MWUe: Plot of function G (blue) and its iterated versions G^2 (red), G^3 (yellow). Function y(x) = x is also included.\n(b) Linear MWU\u2113: Plot of function G\u2113 (blue) and its iterated versions G\u2113^2 (red) and G\u2113^3 (yellow). Function y(x) = x is also included.\n(c) Exponential MWUe: Plot of function G^10. Function y(x) = x is also included.\n(d) Linear MWU\u2113: Plot of function G\u2113^10. Function y(x) = x is also included.\n\nFigure 1: We compare and contrast MWUe (left) and MWU\u2113 (right) in the same two agent two\nstrategy/edges congestion game with ce1(l) = 1/4 \u00b7 l and ce2(l) = 1.4/4 \u00b7 l and same learning rate\n\u03b5 = 1 \u2212 e^{\u221240}. MWUe exhibits sensitivity to initial conditions whereas MWU\u2113 equilibrates. Function\ny(x) = x is also included in the graphs to help identify fixed points and periodic points.\n\nWe show the following three statements, the proofs of which can be found in the full version.\nTheorem 4.1. For all but a measure zero set S of x \u2208 (0, 1) we get that lim_{t\u2192\u221e} H^{2t}(x) = \u03c11 or \u03c12.\nMoreover, H(\u03c11) = \u03c12 and H(\u03c12) = \u03c11, i.e., {\u03c11, \u03c12} is a periodic orbit. Thus, all but a measure\nzero set S of initial conditions converge to the limit cycle {\u03c11, \u03c12}. Finally, the initial points in S\nconverge to the equilibrium 1/2.\nTheorem 4.2. There exist two player two strategy symmetric congestion games such that MWUe has\nperiodic orbits of length n for any natural number n > 0 as well as an uncountably infinite set\nof \u201cscrambled\u201d initial conditions (Li-Yorke chaos).\n\nUsing Theorem 4.2, we conclude with the following corollary.\nCorollary 4.3. For any 1 > \u03b5 > 0 and n, there exists an n-player congestion game G(\u03b5) (depending\non \u03b5) so that MWUe dynamics exhibits Li-Yorke chaos for uncountably many starting points.\n\n5 Conclusion and Future Work\n\nWe have analyzed MWU\u2113 in congestion games where agents use arbitrary admissible constants as\nlearning rates \u03b5 and showed convergence to exact Nash equilibria. 
We have also shown that this result is not true for the nearly homologous exponential variant MWU$_e$ even for the simplest case of two-agent, two-strategy load balancing games. There we prove that such dynamics can lead to limit cycles or even chaotic behavior.\nFor a small enough learning rate $\\epsilon$ the behavior of MWU$_e$ approaches that of its smooth variant, replicator dynamics, and hence convergence is once again guaranteed [29]. This means that as we increase the learning rate $\\epsilon$ from near zero values we start off with a convergent system and we end up with a chaotic one. Numerical experiments establish that between the convergent region and the chaotic region there exists a range of values for $\\epsilon$ for which the system exhibits periodic behavior. Period doubling is known as a standard route to 1-dimensional chaos (e.g., the logistic map) and is characterized by unexpected regularities such as the Feigenbaum constant [39]. Elucidating these connections is an interesting open problem. More generally, what other types of regularities can be established in these non-equilibrium systems?\nAnother interesting question has to do with developing a better understanding of the set of conditions that result in non-converging trajectories. So far, it has been critical for our non-convergent examples that the system starts from a symmetric initial condition. Whether such irregular MWU$_e$ trajectories can be constructed for generic initial conditions, possibly in larger congestion games, is not known. Nevertheless, the non-convergence results, despite their non-generic nature, are rather useful since they imply that we cannot hope to leverage the power of Baum-Eagon techniques for MWU$_e$. In conclusion, establishing generic (non)convergence results (e.g., for most initial conditions, most congestion games) for MWU$_e$ with constant step size is an interesting future direction.\n\nReferences\n[1] H. Ackermann, P. Berenbrink, S. Fischer, and M. Hoefer. 
Concurrent imitation dynamics in congestion games. In PODC, pages 63\u201372, New York, USA, 2009. ACM.\n\n[2] S. Arora, E. Hazan, and S. Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8(1):121\u2013164, 2012.\n\n[3] I. Avramopoulos. Evolutionary stability implies asymptotic stability under multiplicative weights. CoRR, abs/1601.07267, 2016.\n\n[4] M.-F. Balcan, F. Constantin, and R. Mehta. The weighted majority algorithm does not converge in nearly zero-sum games. In ICML Workshop on Markets, Mechanisms and Multi-Agent Models, 2012.\n\n[5] L. E. Baum and J. A. Eagon. An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model of ecology. Bulletin of the American Mathematical Society, 73(3):360\u2013363, 1967.\n\n[6] P. Berenbrink, M. Hoefer, and T. Sauerwald. Distributed selfish load balancing on networks. In ACM Transactions on Algorithms (TALG), 2014.\n\n[7] P. Berenbrink, T. Friedetzky, L. A. Goldberg, P. W. Goldberg, Z. Hu, and R. Martin. Distributed selfish load balancing. SIAM J. Comput., 37(4):1163\u20131181, November 2007.\n\n[8] J. A. Bilmes et al. A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. International Computer Science Institute, 4(510):126, 1998.\n\n[9] A. Blum, M. Hajiaghayi, K. Ligett, and A. Roth. Regret minimization and the price of total anarchy. In Proceedings of the 40th annual ACM symposium on Theory of computing, STOC, pages 373\u2013382, 2008.\n\n[10] I. Caragiannis, A. Fanelli, N. Gravin, and A. Skopalik. Efficient computation of approximate pure Nash equilibria in congestion games. In FOCS, 2011.\n\n[11] N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.\n\n[12] P. Chen and C. Lu. Generalized mirror descents in congestion games. 
Artificial Intelligence, 241:217\u2013243, 2016.\n\n[13] S. Chien and A. Sinclair. Convergence to approximate Nash equilibria in congestion games. In Games and Economic Behavior, pages 315\u2013327, 2011.\n\n[14] J. Cohen, A. Heliou, and P. Mertikopoulos. Learning with bandit feedback in potential games. In Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017.\n\n[15] C. Daskalakis, R. Frongillo, C. Papadimitriou, G. Pierrakos, and G. Valiant. On learning algorithms for Nash equilibria. Symposium on Algorithmic Game Theory (SAGT), pages 114\u2013125, 2010.\n\n[16] C. Daskalakis, C. Tzamos, and M. Zampetakis. A Converse to Banach\u2019s Fixed Point Theorem and its CLS Completeness. ArXiv e-prints, February 2017.\n\n[17] C. Daskalakis, P. W. Goldberg, and C. H. Papadimitriou. The complexity of computing a Nash equilibrium. pages 71\u201378. ACM Press, 2006.\n\n[18] C. Daskalakis and C. Papadimitriou. Continuous local search. In Proceedings of the Twenty-second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA \u201911, pages 790\u2013804, Philadelphia, PA, USA, 2011. Society for Industrial and Applied Mathematics.\n\n[19] R. Engelberg, A. Fabrikant, M. Schapira, and D. Wajc. Best-response dynamics out of sync: Complexity and characterization. In Proceedings of the Fourteenth ACM Conference on Electronic Commerce, EC \u201913, pages 379\u2013396, New York, NY, USA, 2013. ACM.\n\n[20] A. Fabrikant, C. Papadimitriou, and K. Talwar. The complexity of pure Nash equilibria. In ACM Symposium on Theory of Computing (STOC), pages 604\u2013612. ACM, 2004.\n\n[21] J. Fearnley, S. Gordon, R. Mehta, and R. Savani. CLS: New Problems and Completeness. ArXiv e-prints, February 2017.\n\n[22] D. J. Foster, T. Lykouris, K. Sridharan, and E. Tardos. Learning in games: Robustness of fast convergence. In Advances in Neural Information Processing Systems, pages 4727\u20134735, 2016.\n\n[23] D. 
Fotakis, A. C. Kaporis, and P. G. Spirakis. Atomic congestion games: Fast, myopic and concurrent. In Burkhard Monien and Ulf-Peter Schroeder, editors, Algorithmic Game Theory, volume 4997 of Lecture Notes in Computer Science, pages 121\u2013132. Springer Berlin Heidelberg, 2008.\n\n[24] D. Fudenberg and D. K. Levine. The Theory of Learning in Games. MIT Press Books. The MIT Press, 1998.\n\n[25] A. D. Jaggard, N. Lutz, M. Schapira, and R. N. Wright. Dynamics at the boundary of game theory and distributed computing. ACM Transactions on Economics and Computation (TEAC), 2017.\n\n[26] A. D. Jaggard, M. Schapira, and R. N. Wright. Distributed computing with adaptive heuristics. In ICS, 2011.\n\n[27] R. Kleinberg, K. Ligett, G. Piliouras, and \u00c9. Tardos. Beyond the Nash equilibrium barrier. In Symposium on Innovations in Computer Science (ICS), 2011.\n\n[28] R. Kleinberg, G. Piliouras, and \u00c9. Tardos. Load balancing without regret in the bulletin board model. Distributed Computing, 24(1):21\u201329, 2011.\n\n[29] R. Kleinberg, G. Piliouras, and \u00c9. Tardos. Multiplicative updates outperform generic no-regret learning in congestion games. In ACM Symposium on Theory of Computing (STOC), 2009.\n\n[30] T. Li and J. A. Yorke. Period three implies chaos. The American Mathematical Monthly, 82(10):985\u2013992, 1975.\n\n[31] P. Mertikopoulos and A. L. Moustakas. The emergence of rational behavior in the presence of stochastic perturbations. The Annals of Applied Probability, 20(4):1359\u20131388, 2010.\n\n[32] D. Monderer and L. S. Shapley. Potential games. Games and Economic Behavior, pages 124\u2013143, 1996.\n\n[33] D. Monderer and L. S. Shapley. Fictitious play property for games with identical interests. Journal of Economic Theory, 68(1):258\u2013265, 1996.\n\n[34] N. Nisan, M. Schapira, and A. Zohar. Asynchronous best-reply dynamics. In International Workshop on Internet and Network Economics, pages 531\u2013538. 
Springer, 2008.\n\n[35] G. Piliouras and J. S. Shamma. Optimization despite chaos: Convex relaxations to complex limit sets via Poincar\u00e9 recurrence. In SODA, 2014.\n\n[36] R. W. Rosenthal. A class of games possessing pure-strategy Nash equilibria. International Journal of Game Theory, 2(1):65\u201367, 1973.\n\n[37] T. Roughgarden. Intrinsic robustness of the price of anarchy. In Proc. of STOC, pages 513\u2013522, 2009.\n\n[38] A. N. Sharkovskii. Co-existence of cycles of a continuous mapping of the line into itself. Ukrainian Math. J., 16:61\u201371, 1964.\n\n[39] S. Strogatz. Nonlinear Dynamics and Chaos. Perseus Publishing, 2000.\n\n[40] V. Syrgkanis, A. Agarwal, H. Luo, and R. E. Schapire. Fast convergence of regularized learning in games. In Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS\u201915, pages 2989\u20132997, Cambridge, MA, USA, 2015. MIT Press.\n\n[41] L. R. Welch. Hidden Markov models and the Baum-Welch algorithm. IEEE Information Theory Society Newsletter, 53(4):10\u201313, 2003.\n", "award": [], "sourceid": 3000, "authors": [{"given_name": "Gerasimos", "family_name": "Palaiopanos", "institution": "SUTD"}, {"given_name": "Ioannis", "family_name": "Panageas", "institution": "MIT"}, {"given_name": "Georgios", "family_name": "Piliouras", "institution": "Singapore University of Technology and Design"}]}