{"title": "Nash Propagation for Loopy Graphical Games", "book": "Advances in Neural Information Processing Systems", "page_first": 817, "page_last": 824, "abstract": null, "full_text": "Nash Propagation for Loopy Graphical Games\n\nLuis E. Ortiz\nMichael Kearns\n\nDepartment of Computer and Information Science\n\nUniversity of Pennsylvania\n\n{leortiz,mkearns}@cis.upenn.edu\n\nAbstract\n\nWe introduce NashProp, an iterative and local message-passing algorithm for computing Nash equilibria in multi-player games represented by arbitrary undirected graphs. We provide a formal analysis and experimental evidence demonstrating that NashProp performs well on large graphical games with many loops, often converging in just a dozen iterations on graphs with hundreds of nodes.\nNashProp generalizes the tree algorithm of (Kearns et al. 2001), and can be viewed as similar in spirit to belief propagation in probabilistic inference, and thus complements the recent work of (Vickrey and Koller 2002), who explored a junction tree approach. Thus, as for probabilistic inference, we have at least two promising general-purpose approaches to equilibria computation in graphs.\n\n1 Introduction\nThere has been considerable recent interest in representational and algorithmic issues arising in multi-player game theory. One example is the recent work on graphical games (Kearns et al. 2001) (abbreviated KLS in the sequel). Here a multi-player game is represented by an undirected graph. 
The interpretation is that while the global equilibria of the game depend on the actions of all players, individual payoffs for a player are determined solely by his own action and the actions of his immediate neighbors in the graph. Like graphical models in probabilistic inference, graphical games may provide an exponentially more succinct representation than the standard \u201ctabular\u201d or normal form of the game. Also as for probabilistic inference, the problem of computing equilibria on arbitrary graphs is intractable in general, and so it is of interest to identify both natural special topologies permitting fast Nash computations, and good heuristics for general graphs.\n\nKLS gave a dynamic programming algorithm for computing Nash equilibria in graphical games in which the underlying graph is a tree, and drew analogies to the polytree algorithm for probabilistic inference (Pearl 1988). A natural question following from this work is whether there are generalizations of the basic tree algorithm analogous to those for probabilistic inference. In probabilistic inference, there are two main approaches to generalizing the polytree algorithm. Roughly speaking, the first approach is to take an arbitrary graph and \u201cturn it into a tree\u201d via triangulation, and subsequently run the tree-based algorithm on the resulting junction tree (Lauritzen and Spiegelhalter 1988). This approach has the merit of being guaranteed to perform inference correctly, but the drawback of requiring the computation to be done on the junction tree. On highly loopy graphs, junction tree computations may require exponential time. The other broad approach is to simply run (an appropriate generalization of) the polytree algorithm on the original loopy graph. 
This method garnered considerable interest when it was discovered that it sometimes performed quite well empirically, and was closely connected to the problem of decoding in Turbo Codes. Belief propagation has the merit of each iteration being quite efficient, but the drawback of having no guarantee of convergence in general (though recent theoretical work has established convergence for certain special cases (Weiss 2000)).\n\nIn recent work, (Vickrey and Koller 2002) proposed a number of heuristics for equilibria computation in graphical games, including a constraint satisfaction generalization of KLS that essentially provides a junction tree approach for arbitrary graphical games. They also gave promising experimental results for this heuristic on certain loopy graphs that result in manageable junction trees.\n\nIn this work, we introduce the NashProp algorithm, a different KLS generalization which provides an approach analogous to loopy belief propagation for graphical games. Like belief propagation, NashProp is a local message-passing algorithm that operates directly on the original graph of the game, requiring no triangulation or moralization operations (footnote 1). NashProp is a two-phase algorithm. In the first phase, nodes exchange messages in the form of two-dimensional tables. The table player U sends to neighboring player V in the graph indicates the values U \u201cbelieves\u201d he can play given a setting of V and the information he has received in tables from his other neighbors, a kind of conditional Nash equilibrium. In the second phase of NashProp, the players attempt to incrementally construct an equilibrium obeying constraints imposed by the tables computed in the first phase.\n\nInterestingly, we can provide rather strong theory for the first phase, proving that the tables must always converge, and result in a reduced search space that can never eliminate an equilibrium. 
When run using a discretization scheme introduced by KLS, the first phase of NashProp will actually converge in time polynomial in the size of the game representation.\n\nWe also report on a number of controlled experiments with NashProp on loopy graphs, including some that would be difficult via the junction tree approach due to the graph topology. The results appear to be quite encouraging, thus growing the body of heuristics available for computing equilibria in compactly represented games.\n2 Preliminaries\nThe normal or tabular form of an n-player, two-action game (footnote 2) is defined by a set of n matrices M_i (1 \u2264 i \u2264 n), each with n indices. The entry M_i(x_1, ..., x_n) = M_i(x) specifies the payoff to player i when the joint action of the n players is x \u2208 {0,1}^n. Thus, each M_i has 2^n entries. The actions 0 and 1 are the pure strategies of each player, while a mixed strategy for player i is given by the probability p_i \u2208 [0,1] that the player will play 0. For any joint mixed strategy, given by a product distribution p = (p_1, ..., p_n), we define the expected payoff to player i as M_i(p) = E_{x ~ p}[M_i(x)], where x ~ p indicates that each x_j is 0 with probability p_j and 1 with probability 1 - p_j. We use p[i : p'_i] to denote the vector which is the same as p except in the i-th component, where the value has been changed to p'_i. A (Nash) equilibrium for the game is a mixed strategy p such that for any player i, and for any p'_i \u2208 [0,1], M_i(p) \u2265 M_i(p[i : p'_i]). (We say that p_i is a best response to the rest of p.) In other words, no player can improve their expected payoff by deviating unilaterally from a Nash equilibrium. The classic theorem of (Nash 1951) states that for any game, there exists a Nash equilibrium in the space of joint mixed strategies. 
We will also use a straightforward definition for approximate Nash equilibria. An \u03b5-Nash equilibrium is a mixed strategy p such that for any player i, and for any value p'_i \u2208 [0,1], M_i(p) + \u03b5 \u2265 M_i(p[i : p'_i]). 
(We say that p_i is an \u03b5-best response to the rest of p.) Thus, no player can improve their expected payoff by more than \u03b5 by deviating unilaterally from an approximate Nash equilibrium.\n\nFootnote 1: Unlike for inference, moralization may be required for games even on undirected graphs.\nFootnote 2: For simplicity, we describe our results for two actions, but they generalize to multi-action games.\n\nThe following definitions are due to KLS. An n-player graphical game is a pair (G, M), where G is an undirected graph on n vertices and M is a set of n matrices M_i called the local game matrices. Each player is represented by a vertex in G, and the interpretation is that each player\u2019s payoff is determined solely by the actions in their local neighborhood in G. Thus the matrix M_i has an index for each of the k neighbors of i, and an index for i itself, and for x \u2208 {0,1}^{k+1}, M_i(x) denotes the payoff to i when he and his k neighbors play x. The expected payoff under a mixed strategy p \u2208 [0,1]^{k+1} is defined analogously. 
Note that in the two-action case, M_i has 2^{k+1} entries, which may be considerably smaller than 2^n.\nNote that any game can be trivially represented as a graphical game by choosing G to be the complete graph, and letting the local game matrices be the original tabular form matrices. However, any time in which the local neighborhoods in G can be bounded by k \u226a n, the graphical representation is exponentially smaller than the normal form. We are interested in heuristics that can exploit this succinctness computationally.\n3 NashProp: Table-Passing Phase\nThe table-passing phase of NashProp proceeds in a series of rounds. In each round, every node will send a different binary-valued table to each of its neighbors in the graph. Thus, if vertices U and V are neighbors, the table sent from U to V in round r shall be denoted T^r_{V\u2190U}(v,u). Since the vertices are always clear from the lower-case table indices, we shall drop the subscript and simply write T^r(v,u). This table is indexed by the continuum of possible mixed strategies v, u \u2208 [0,1] for players V and U, respectively. Intuitively, the binary value T^r(v,u) indicates player U\u2019s (possibly incorrect) \u201cbelief\u201d that there exists a (global) Nash equilibrium in which U = u and V = v.\n\nAs these tables are indexed by continuous values, it is not clear how they can be finitely represented. However, as in KLS, we shall shortly introduce a finite discretization of these tables whose resolution is dependent only on local neighborhood size, yet is sufficient to compute global (approximate) equilibria. 
For the sake of generality we shall work with the exact tables in the ensuing formal analysis, which will immediately apply to the approximation algorithm as well.\n\nFor every edge (U,V), the table-passing phase initialization is T^0(v,u) = 1 for all v, u \u2208 [0,1]. Let us denote the neighbors of U other than V (if any) by W = (W_1, ..., W_{k-1}). For each v, u \u2208 [0,1], the table entry T^{r+1}(v,u) is assigned the value 1 if and only if there exists a vector of mixed strategies w = (w_1, ..., w_{k-1}) \u2208 [0,1]^{k-1} for W such that\n\n1. T^r(u, w_i) = 1 for all 1 \u2264 i \u2264 k-1; and\n2. U = u is a best response to V = v, W = w.\n\nWe shall call such a w a witness to T^{r+1}(v,u) = 1. If U has no neighbors other than V, we define Condition 1 above to hold vacuously. If either condition is violated, we set T^{r+1}(v,u) = 0.\nLemma 1 For all edges (U,V) and all r > 0, the table sent from U to V can only contract or remain the same: {(v,u) : T^r(v,u) = 1} \u2286 {(v,u) : T^{r-1}(v,u) = 1}.\nProof: By induction on r. The base case r = 1 holds trivially due to the table initialization to contain all 1 entries. 
For the induction, assume for contradiction that for some r > 1, there exists a pair of neighboring players (U,V) and a strategy pair (v,u) \u2208 [0,1]^2 such that T^{r-1}(v,u) = 0 yet T^r(v,u) = 1. Since T^r(v,u) = 1, the definition of the table-passing phase implies that there exists a witness w for the neighbors W of U other than V meeting Conditions 1 and 2 above. By induction, the fact that T^{r-1}(u, w_i) = 1 for all i in Condition 1 implies that T^{r-2}(u, w_i) = 1 for all i. Since T^{r-1}(v,u) = 0, it must be that U = u is not a best response to V = v, W = w. 
But then w cannot be a witness to T^r(v,u) = 1, a contradiction.\nSince all tables begin filled with 1 entries, and Lemma 1 states entries can only change from 1 to 0, the table-passing phase must converge:\n\nTheorem 2 For all (v,u) \u2208 [0,1]^2, the limit T*(v,u) = lim_{r \u2192 \u221e} T^r(v,u) exists.\nIt is also immediately obvious that the limit tables {T*(v,u)} must all simultaneously balance each other, in the sense of obeying Conditions 1 and 2. That is, we must have that for all edges (U,V) and all (v,u), T*(v,u) = 1 implies the existence of a witness w for W such that T*(u, w_i) = 1 for all i, and U = u is a best response to V = v, W = w. If this were not true the tables would be altered by a single round of the table-passing phase.\n\nWe next establish that the table-passing phase will never eliminate any global Nash equilibria. 
Let p \u2208 [0,1]^n be any mixed strategy for the entire population of players, and let us use p[U] to denote the mixed strategy assigned to player U by p.\nLemma 3 Let p \u2208 [0,1]^n be a Nash equilibrium. Then for all rounds r \u2265 0 of the table-passing phase, and every edge (U,V), T^r(p[V], p[U]) = 1.\nProof: By induction on r. The base case r = 0 holds trivially by the table initialization. By induction, for every neighbor W_i of U other than V, T^{r-1}(p[U], p[W_i]) = 1, satisfying Condition 1 for T^r(p[V], p[U]) = 1. Condition 2 is immediately satisfied since p is a Nash equilibrium.\nWe can now establish a strong sense in which the set of balanced limit tables {T*(v,u)} characterizes the Nash equilibria of the global game. We say that p is consistent with the {T*(v,u)} if for every vertex U and every neighbor V of U, with W denoting the other neighbors of U, we have T*(p[V], p[U]) = 1, and p[W] is a witness to this value. In other words, every edge assignment made in p is \u201callowed\u201d by the {T*(v,u)}, and furthermore the neighborhood assignments made by p are witnesses.\nTheorem 4 Let p \u2208 [0,1]^n be any global mixed strategy. Then p is consistent with the balanced limit tables {T*(v,u)} if and only if it is a Nash equilibrium.\nProof: The forward direction is easy. If p is consistent with the {T*(v,u)}, then by definition, for all U, U = p[U] is a best response to the local neighborhood V = p[V], W = p[W]. 
Hence, p is a Nash equilibrium.\n\nFor the other direction, if p is a Nash equilibrium, then for all U, U = p[U] is certainly a best response to the strategy of its neighbors V = p[V], W = p[W]. So for consistency with the {T*(v,u)}, it remains to show that for every player U and its neighbors V and W, T*(p[V], p[U]) = 1 and T*(p[U], p[W_i]) = 1 for all i. This has already been established in Lemma 3.\n\nTheorem 4 is important because it establishes that the table-passing phase provides us with an alternative, and hopefully vastly reduced, search space for Nash equilibria. Rather than search for equilibria in the space of all mixed strategies, Theorem 4 asserts that we can limit our search to the space of p that are consistent with the balanced limit tables {T*(v,u)}, with no fear of missing equilibria. The demand for consistency with the limit tables is a locally stronger demand than merely asking for a player to be playing a best response to its neighborhood. 
Heuristics for searching this constrained space are the topic of Section 5.\n\nBut first let us ask in what ways the search space defined by the {T*(v,u)} might constitute a significant reduction. The most obvious case is that in which many of the tables contain a large fraction of 0 entries, since every such entry eliminates all mixed strategies in which the corresponding pair of vertices plays the corresponding pair of values. 
As we shall see in the discussion of experimental results, such behavior seems to occur in many, but certainly not all, interesting cases. We shall also see that even when such reduction does not occur, the underlying graphical structure of the game may still yield significant computational benefits in the search for a consistent mixed strategy.\n4 Approximate Tables\nThus far we have assumed that the binary-valued tables T^r(v,u) have continuous indices v and u, and thus it is not clear how they can be finitely represented (footnote 3). Here we briefly address this issue by asserting that it can be handled using the discretization scheme of KLS. More precisely, in that work it was established that if we restrict all table indices to only assume discrete values that are multiples of \u03c4, and we relax Condition 2 in the definition of the table-passing phase to ask that U = u be only an \u03b5-best response to V = v, W = w, then a choice of \u03c4 that depends only on \u03b5 and k suffices to preserve \u03b5-Nash equilibria in the tables. Here k is the maximum degree of any node in the graph. The total number of entries in each table will be (1/\u03c4)^2 and thus exponential in k, but the payoff matrices for the players are already exponential in k, so our tables remain polynomial in the size of the graphical game representation. The crucial point established in KLS is that the required resolution is independent of the total number of players. It is easily verified that none of the key results establishing this fact (specifically, Lemmas 2, 3 and 4 of KLS) depend on the underlying graph being a tree, but hold for all graphical games.\n\nPrecise analogues of all the results of the preceding section can thus be established for the discretized instantiation of the table-passing phase (details omitted). 
In particular, the table-passing phase will now converge to finite balanced limit tables, and consistency with these tables characterizes \u03b5-Nash equilibria. Furthermore, since every round prior to convergence must change at least one entry in one table, and there are at most n k tables with (1/\u03c4)^2 entries each, the table-passing phase must thus converge in at most n k / \u03c4^2 rounds, which is again polynomial in the size of the game representation. Each round of the table-passing phase takes at most on the order of n k / \u03c4^{k+1} computational steps in the worst case (though possibly considerably less), giving a total running time to the table-passing phase that scales polynomially with the size of the game.\n\nWe note that the discretization of each player\u2019s space of mixed strategies allows one to formulate the problem of computing an approximate NE in a graphical game as a CSP (Vickrey and Koller 2002), and there is a precise connection between NashProp and constraint propagation algorithms for (generalized) arc consistency in constraint networks (footnote 4).\n5 NashProp: Assignment-Passing Phase\nWe have already suggested that the tables {T*(v,u)} represent a solution space that may be considerably smaller than the set of all mixed strategies. We now describe heuristics for searching this space for a Nash equilibrium. For this it will be convenient to define, for each vertex U, its projection set P*(u), which is indexed by the possible values u \u2208 [0,1] (or by their allowed values in the aforementioned discretization scheme). The purpose of P*(u) is simply to consolidate the information sent to U by all of its neighbors. 
Thus, if V = (V_1, ..., V_k) are all the neighbors of U, we define P*(u) to be 1 if and only if there exists v = (v_1, ..., v_k) (again called a witness to P*(u) = 1) such that T*(u, v_i) = 1 for all i, and U = u is a best response to V = v; otherwise we define P*(u) to be 0.\n\nIf p is any global mixed strategy, it is easily verified that p is consistent with the {T*(v,u)} if and only if P*(p[U]) = 1 for all nodes U, with the assignment of the neighbors of U in p as a witness.\n\nFootnote 3: We note that the KLS proof that the exact tables must admit a rectilinear representation holds generally, but we cannot bound their complexity here.\nFootnote 4: We are grateful to Michael Littman for helping us establish this connection.\n\n
The first step of the assignment-passing phase of NashProp is thus the computation of the P*(u) at each vertex U, which is again a local computation in the graph. Neighboring nodes U and V also exchange their projections P*(u) and P*(v).\n\nLet us begin by noting that the search space for a Nash equilibrium is immediately reduced to the cross-product of the projection sets by Theorem 4, so if the table-passing phase has resulted in many 0 values in the projections, even an exhaustive search across this (discretized) cross-product space may sometimes quickly yield a solution. However, we would obviously prefer a solution that exploits the local topology of the solution space given by the graph. At a high level, such a local search algorithm is straightforward:\n\n1. Initialization: Choose any node U and any values u, v such that P*(u) = 1 with witness v. U assigns itself value u, and assigns each of its neighbors V_i the value v_i.\n2. Pick the next node V (in some fixed ordering) that has already been assigned some value v. If there is a partial assignment to the neighbors of V, attempt to extend it to a witness w to P*(v) = 1 such that P*(w_i) = 1 for all i, and assign any previously unassigned neighbors their values in this witness. If all the neighbors of V have been assigned, make sure V = v is a best response.\n\nThus, the first vertex chosen assigns both itself and all of its neighbors, but afterwards vertices assign only (some of) their neighbors, and receive their own values from a neighbor. It is easily verified that if this process succeeds in assigning all vertices, the resulting mixed strategy is consistent with the {T*(v,u)} and thus a Nash equilibrium (or approximate equilibrium in the discretized case). 
The difficulty, of course, is that the inductive step of the assignment-passing phase may fail due to cycles in the graph: we may reach a node V whose neighbor partial assignment cannot be extended, or whose assigned value v is not a best response to its complete neighborhood assignment. In this case, as with any structured local search phase, we have reached a failure point and must backtrack.\n\nThe overall NashProp algorithm thus consists of the (always converging) table-passing phase followed by the backtracking local assignment-passing phase. NashProp directly generalizes the algorithm of KLS, and as such, on certain special topologies such as trees may provably yield efficient computation of equilibria. Here we have shown that NashProp enjoys several natural and desirable properties even on arbitrary graphs. We now turn to some experimental investigation of NashProp on graphs containing cycles.\n6 Experimental Results\nWe have implemented the NashProp algorithm (with distinct table-passing and assignment-passing phases; see footnote 5) as described, and run a series of controlled experiments on loopy graphs of varying size and topology. As discussed in Section 4, there is a relationship suggested by the KLS analysis between the table resolution \u03c4 and the global approximation quality \u03b5, but in practice this relationship may be pessimistic (Vickrey and Koller 2002). 
Our implementation thus takes both \u03c4 and \u03b5 as inputs, and attempts to find an \u03b5-Nash equilibrium running NashProp on tables of resolution \u03c4.\n\nFootnote 5: We did not implement backtracking, but this caused an overall rate of failure of only 3% across all 3000 runs described here.\n\nWe first draw attention to Figure 1, in which we provide a visual display of the evolution of the tables computed by the NashProp table-passing phase for a small (3 by 3) grid game. Note that for this game, the table-passing phase constrains the search space tremendously, so much so that the projection sets entirely determine the unique equilibrium, and the assignment-passing phase is superfluous. This is of course ideal behavior.\n\n[Figure 1 panels: r = 1, r = 2, r = 3, r = 8]\n\nFigure 1: Visual display of the NashProp table-passing phase after rounds 1, 2, 3 and 8 (where convergence occurs). Each row shows first the projection set, then the four outbound tables, for each of the 9 players in a 3 by 3 grid. For the reward functions, each player has a distinct preference for one of his two actions. For 15 of the 16 possible settings of his 4 neighbors, this preference is the same, but for the remaining setting it is reversed. It is easily verified that every player\u2019s payoff depends on all of his neighbors. [Settings used for \u03c4 and \u03b5 not recoverable.]\n\nThe main results of our controlled experiments are summarized in Figure 2. One of our primary interests is how the number of rounds in each of the two phases, and therefore the overall running time, scales with the size and complexity of the graph. More detail is provided in the caption, but we created graphs varying in size from 5 to 100 nodes with a number of different topologies: single cycles; single cycles to which a varying number of chords were added, which generates considerably more cycles in the graph; grids; and \u201cring of rings\u201d (Vickrey and Koller 2002). We also experimented with local payoff matrices in which each entry was chosen randomly from [0,1], and with \u201cbiased\u201d rewards, in which for some fixed number c of the settings of its neighbors, each node has a strong preference for one of its actions, and in the remaining settings, a strong preference for the other. The c settings were chosen randomly subject to the constraint that no neighbor is marginalized (thus no simplification of the graph is possible). These classes of graphs seem to generate a nice variability in the relative speed of the table-passing and assignment-passing phases of NashProp, which is why we chose them.\n\nWe now make a number of remarks regarding the NashProp experiments. First, and most basically, these preliminary results indicate that the algorithm performs well across a range of loopy topologies, including some (such as grids and cycles with many chords) that might pose computational challenges for junction tree approaches as the number of players becomes large. 
Excluding the small fraction of trials in which the assignment-passing phase failed to find a solution, even on grid and loopy chord graphs with 100 nodes, we find convergence of both the table-passing and assignment-passing phases in fewer than a dozen rounds.\n\nWe next note that there is considerable variation across topologies (and little within) in the amount of work done by the table-passing phase, both in terms of the expected number of rounds to convergence, and the fraction of 0 entries that have been computed at completion. For example, for cycles the amount of work in both senses is at its highest, while for grids with random rewards it is lowest. For grids and chordal cycles, decreasing the fixed number of neighbor settings used to define the biased rewards (and thus increasing the bias of the payoff matrices) generally causes more to be accomplished by the table-passing phase. Intuitively, when rewards are entirely random and unbiased, nodes with large degrees will tend to rarely or never compute 0s in their outbound tables: they have too many neighbors whose combined setting can act as a witness for a 1 in an outbound table.\n\n[Figure 2 plots: Table-Passing Phase (left) and Assignment-Passing Phase (right); x-axis: number of players (0 to 100); y-axis: number of rounds; legend: cycle, grid, chordal(0.25,1,2,3), chordal(0.25,1,1,2), chordal(0.25,1,1,1), chordal(0.5,1,2,3), chordal(0.5,1,1,2), chordal(0.5,1,1,1), grid(3), grid(2), grid(1), ringofrings]\n\nFigure 2: Plots showing the number of rounds taken by the NashProp table-passing (left) and assignment-passing (right) phases in computing an equilibrium, for a variety of different graph topologies. The x-axis shows the total number of vertices in the graph. Topologies and rewards examined included cycles, grids and \u201cring of rings\u201d (Vickrey and Koller 2002) with random rewards (denoted cycle, grid and ringofrings in the legend); cycles with a given fraction of random chords added, and with biased rewards in which nodes of degree 2, degree 3 and degree 4 each have their own value of the fixed number of neighbor settings that defines the bias (see text), denoted chordal(\u00b7) with the chord fraction followed by those three values; and grids with biased rewards using a single such value, denoted grid(\u00b7). Each data point represents averages over 50 trials for the given topology and number of vertices. In the table-passing plot, each curve is also annotated with the average fraction of 1 values in the converged tables. Separate parameter settings were used for cycles, for ring of rings, and for all other classes [values illegible].\n\nHowever, as suggested by the theory, greater progress (and computation) in the table-passing phase pays dividends in the assignment-passing phase, since the search space may have been dramatically reduced. For example, for chordal and grid graphs with biased rewards, the ordering of plots by convergence time is essentially reversed from the table-passing to the assignment-passing phase. This suggests that, when it occurs, the additional convergence time in the table-passing phase is worth the investment. However, we again note that even for the least useful table-passing phase (for grids with random rewards), the assignment-passing phase (which thus exploits the graph structure alone) still manages to find an equilibrium rapidly.\n\nReferences\n\nM. Kearns, M. Littman, and S. Singh. Graphical models for game theory. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, pages 253\u2013260, 2001.\n\nS. Lauritzen and D. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. J. Royal Stat. Soc. B, 50(2):157\u2013224, 1988.\n\nJ. F. Nash. Non-cooperative games. Annals of Mathematics, 54:286\u2013295, 1951.\n\nJ. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, 1988.\n\nD. Vickrey and D. Koller. Multi-agent algorithms for solving graphical games. In Proceedings of the National Conference on Artificial Intelligence (AAAI), 2002. To appear.\n\nY. Weiss. Correctness of local probability propagation in graphical models with loops. Neural Computation, 12(1):1\u201341, 2000.\n", "award": [], "sourceid": 2263, "authors": [{"given_name": "Luis", "family_name": "Ortiz", "institution": null}, {"given_name": "Michael", "family_name": "Kearns", "institution": null}]}