{"title": "Uniqueness of Belief Propagation on Signed Graphs", "book": "Advances in Neural Information Processing Systems", "page_first": 1521, "page_last": 1529, "abstract": "While loopy Belief Propagation (LBP) has been utilized in a wide variety of applications with empirical success, it comes with few theoretical guarantees. In particular, if the interactions of random variables in a graphical model are strong, the behavior of the algorithm can be difficult to analyze due to underlying phase transitions. In this paper, we develop a novel approach to the uniqueness problem of the LBP fixed point; our new \u201cnecessary and sufficient\u201d condition is stated in terms of graphs and signs, where the sign denotes the type (attractive/repulsive) of the interaction (i.e., compatibility function) on the edge. In all previous works, uniqueness is guaranteed only in situations where the strength of the interactions is \u201csufficiently\u201d small in certain senses. In contrast, our condition covers arbitrarily strong interactions on the specified class of signed graphs. The result of this paper is based on a recent theoretical advance in the analysis of the LBP algorithm: its connection with the graph zeta function.", "full_text": "Uniqueness of Belief Propagation on Signed Graphs\n\nYusuke Watanabe*\n\nThe Institute of Statistical Mathematics\n\n10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan\n\nwatay@ism.ac.jp\n\nAbstract\n\nWhile loopy Belief Propagation (LBP) has been utilized in a wide variety of\napplications with empirical success, it comes with few theoretical guarantees.\nIn particular, if the interactions of random variables in a graphical model are strong,\nthe behavior of the algorithm can be difficult to analyze due to underlying phase\ntransitions.\n
In this paper, we develop a novel approach to the uniqueness problem\nof the LBP fixed point; our new \u201cnecessary and sufficient\u201d condition is stated in\nterms of graphs and signs, where the sign denotes the type (attractive/repulsive)\nof the interaction (i.e., compatibility function) on the edge. In all previous works,\nuniqueness is guaranteed only in situations where the strength of the interactions\nis \u201csufficiently\u201d small in certain senses. In contrast, our condition covers\narbitrarily strong interactions on the specified class of signed graphs. The result\nof this paper is based on a recent theoretical advance in the analysis of the LBP algorithm:\nits connection with the graph zeta function.\n\n1 Introduction\n\nThe belief propagation algorithm [1] was originally proposed as an efficient method for exact\ninference in graphical models associated with trees; the algorithm has been\nextended to general graphs with cycles, where it is called the Loopy Belief Propagation (LBP) algorithm. It has\nshown empirical success in a wide class of problems including computer vision, compressed sensing\nand error correcting codes [2, 3, 4]. In such applications, the existence of cycles and strong interactions\nbetween variables makes the behavior of the LBP algorithm difficult to analyze. In this paper we\npropose a novel approach to the uniqueness problem of the LBP fixed point.\nAlthough a considerable amount of research has been done in the past decade [5, 6], understanding\nof the LBP algorithm is not yet complete. An important step toward better understanding of the\nalgorithm has been the variational interpretation by the Bethe free energy function; the fixed points\nof LBP correspond to the stationary points of the Bethe free energy function [7]. This view provides\na number of algorithms that (provably) find a stationary point of the Bethe free energy function\n[8, 9, 10, 11].\n
For the uniqueness problem of the LBP fixed point, a number of conditions have been\nproposed [12, 13, 14, 15]. (Note that the convergence property implies uniqueness by definition.)\nIn all previous works, uniqueness is guaranteed only in situations where the strength of the\ninteractions is \u201csufficiently\u201d small in certain senses.\nIn this paper we propose a completely new approach to the uniqueness condition of the LBP algorithm;\nit should be emphasized that the strength of interactions on the specified class of signed graphs can\nbe arbitrarily large in this condition. (The signs denote the attractive/repulsive types of the compatibility\nfunctions on the edges.) Generally speaking, the behavior of the algorithm is complex if the\ninteractions are strong. In such regions, phase transition phenomena can occur in the underlying\ncomputation tree [15], making theoretical analyses difficult. To overcome such difficulties,\nwe utilize the connection between the Bethe free energy and the graph zeta function established in\n[16]; the determinant of the Hessian of the Bethe free energy equals the reciprocal of the graph zeta\nfunction up to a positive factor. Combined with the index formula [16], the uniqueness problem is\nreduced to a positivity property of the graph zeta function.\n\n*Current affiliation: SONY, Intelligent Systems Research Laboratory. YusukeB.Watanabe@jp.sony.com\n\nThis paper is organized as follows. In section 2 we introduce the background of LBP. In section 3\nwe explain the condition for the uniqueness, which is the main result of this paper. In section 4 the\nproof of the main result is given by a graph theoretic approach.\n
In section 5 we remark on further\nresearch based on the new technique.\n\n2 Loopy Belief Propagation, Bethe free energy and graph zeta function\n\nIn this section, we provide basic facts on LBP and its connection with the Bethe free energy and the graph\nzeta function. Throughout this paper, $G = (V, E)$ is a connected undirected graph with $V$, the\nvertices, and $E$, the undirected edges. We consider the binary pairwise model, which is given by the\nfollowing factorization form with respect to $G$:\n\n$$p(x) = \\frac{1}{Z} \\prod_{ij \\in E} \\psi_{ij}(x_i, x_j) \\prod_{i \\in V} \\psi_i(x_i), \\tag{1}$$\n\nwhere $x = (x_i)_{i \\in V}$ is a list of binary (i.e., $x_i \\in \\{\\pm 1\\}$) variables, $Z$ is the normalization constant\nand $\\psi_{ij}, \\psi_i$ are positive functions called compatibility functions. Without loss of generality we\nassume that $\\psi_{ij}(x_i, x_j) = \\exp(J_{ij} x_i x_j)$ and $\\psi_i(x_i) = \\exp(h_i x_i)$. We refer to $J_{ij}$ as the interaction and\nto its absolute value as the \u201cstrength\u201d.\nIn various applications, we would like to compute the marginal distributions\n\n$$p_i(x_i) := \\sum_{x \\setminus \\{x_i\\}} p(x) \\quad \\text{and} \\quad p_{ij}(x_i, x_j) := \\sum_{x \\setminus \\{x_i, x_j\\}} p(x), \\tag{2}$$\n\nthough exact computations are often intractable due to the combinatorial complexity. If the graph\nis a tree, however, they are efficiently computed by the belief propagation algorithm [1]. Even if\nthe graph has cycles, the direct application of the algorithm (Loopy Belief Propagation; LBP) often\ngives a good approximation [6].\nLBP is a message passing algorithm. For each directed edge, a message vector $\\mu_{i \\to j}(x_j)$ is assigned\nand initialized arbitrarily. The update rule of messages is given by\n\n$$\\mu^{\\mathrm{new}}_{i \\to j}(x_j) \\propto \\sum_{x_i} \\psi_{ji}(x_j, x_i) \\psi_i(x_i) \\prod_{k \\in N_i \\setminus j} \\mu_{k \\to i}(x_i), \\tag{3}$$\n\nwhere $N_i$ is the neighborhood of $i \\in V$. The order of edges in the update is arbitrary; the set of\nfixed points does not depend on the order.\n
If the messages converge to some fixed point $\\{\\mu^{\\infty}_{i \\to j}(x_j)\\}$,\nthe approximations of $p_i(x_i)$ and $p_{ij}(x_i, x_j)$ are calculated as\n\n$$b_i(x_i) \\propto \\psi_i(x_i) \\prod_{k \\in N_i} \\mu^{\\infty}_{k \\to i}(x_i), \\tag{4}$$\n\n$$b_{ij}(x_i, x_j) \\propto \\psi_{ij}(x_i, x_j) \\psi_i(x_i) \\psi_j(x_j) \\prod_{k \\in N_i \\setminus j} \\mu^{\\infty}_{k \\to i}(x_i) \\prod_{k \\in N_j \\setminus i} \\mu^{\\infty}_{k \\to j}(x_j), \\tag{5}$$\n\nwith normalization $\\sum_{x_i} b_i(x_i) = 1$ and $\\sum_{x_i, x_j} b_{ij}(x_i, x_j) = 1$. From (3) and (5), the constraints\n$b_{ij}(x_i, x_j) > 0$ and $\\sum_{x_j} b_{ij}(x_i, x_j) = b_i(x_i)$ are automatically satisfied.\n\n2.1 The Bethe free energy\n\nThe LBP algorithm is interpreted as a variational problem of the Bethe free energy function [7]. In\nthis formulation, the domain of the function is given by\n\n$$L(G) = \\Big\\{ \\{q_i, q_{ij}\\} ;\\ q_{ij}(x_i, x_j) > 0,\\ \\sum_{x_i, x_j} q_{ij}(x_i, x_j) = 1,\\ \\sum_{x_j} q_{ij}(x_i, x_j) = q_i(x_i) \\Big\\} \\tag{6}$$\n\nand an element of this set is called pseudomarginals, i.e., a set of locally consistent probability distributions.\nThe closure of this set is called the local marginal polytope [6]. The objective function, called the\nBethe free energy, is defined on $L(G)$ by:\n\n$$F(q) := -\\sum_{ij \\in E} \\sum_{x_i x_j} q_{ij}(x_i, x_j) \\log \\psi_{ij}(x_i, x_j) - \\sum_{i \\in V} \\sum_{x_i} q_i(x_i) \\log \\psi_i(x_i) + \\sum_{ij \\in E} \\sum_{x_i x_j} q_{ij}(x_i, x_j) \\log q_{ij}(x_i, x_j) + \\sum_{i \\in V} (1 - d_i) \\sum_{x_i} q_i(x_i) \\log q_i(x_i), \\tag{7}$$\n\nwhere $d_i = |N_i|$. The outcome of this variational problem is the same as that of LBP. More precisely,\nthere is a one-to-one correspondence between the set of stationary points of the Bethe free energy\nand the set of fixed points of LBP. The correspondence is given by (4, 5).\n\n2.2 Zeta function and Ihara\u2019s formula\n\nIn this section, we explain the connection of LBP to the graph zeta function. We use the following\nterms for graphs [17, 16]. Let $\\vec{E}$ be the set of directed edges obtained by duplicating undirected\nedges.\n
For each directed edge $e \\in \\vec{E}$, $o(e) \\in V$ is the origin of $e$ and $t(e) \\in V$ is\nthe terminus of $e$. For $e \\in \\vec{E}$, the inverse edge is denoted by $\\bar{e}$, and the corresponding undirected\nedge by $[e] = [\\bar{e}] \\in E$. A closed geodesic in $G$ is a sequence $(e_1, \\ldots, e_k)$ of directed\nedges such that $t(e_i) = o(e_{i+1})$, $e_i \\neq \\bar{e}_{i+1}$ for $i \\in \\mathbb{Z}/k\\mathbb{Z}$. For a closed geodesic $c$, we may\nform the $m$-multiple, $c^m$, by repeating it $m$ times. A closed geodesic $c$ is prime if there are no\nclosed geodesic $d$ and natural number $m$ ($\\geq 2$) such that $c = d^m$. For example, the closed geodesic\n$c = (e_1, e_2, e_3, e_1, e_2, e_3)$ is not prime and $c = (e_1, e_2, e_3, e_4, e_1, e_2, e_3)$ is prime. Two closed\ngeodesics are said to be equivalent if one is obtained by a cyclic permutation of the other. For example,\nthe closed geodesics $(e_1, e_2, e_3)$, $(e_2, e_3, e_1)$ and $(e_3, e_1, e_2)$ are equivalent. An equivalence class\nof prime closed geodesics is called a prime cycle. Let $P$ be the set of prime cycles of $G$. For given\n(complex or real) weights $u = (u_e)_{e \\in \\vec{E}}$, Ihara\u2019s graph zeta function [18, 19] is given by\n\n$$\\zeta_G(u) := \\prod_{p \\in P} (1 - g(p))^{-1} = \\det(I - UM)^{-1}, \\qquad g(p) := u_{e_1} \\cdots u_{e_k} \\ \\text{ for } p = (e_1, \\ldots, e_k),$$\n\nwhere the second equality is the determinant representation [19] with matrices indexed by the directed\nedges. The definitions of $M$ and $U$ are\n\n$$M_{e,e'} := \\begin{cases} 1 & \\text{if } e \\neq \\bar{e}' \\text{ and } o(e) = t(e'), \\\\ 0 & \\text{otherwise,} \\end{cases} \\tag{8}$$\n\nand $U_{e,e'} := u_e \\delta_{e,e'}$, respectively.\nThe following theorem gives the connection between the Bethe free energy and the zeta function.\nMore precisely, the theorem asserts that the determinant of the Hessian of the Bethe free energy\nfunction is the reciprocal of the zeta function up to a positive factor.\nTheorem 1 ([16, 20]). The following equality holds at any point of $L(G)$:\n\n$$\\zeta_G(u)^{-1} = \\det(\\nabla^2 F) \\prod_{ij \\in E} \\prod_{x_i, x_j = \\pm 1} q_{ij}(x_i, x_j) \\prod_{i \\in V} \\prod_{x_i = \\pm 1} q_i(x_i)^{1 - d_i} \\, 2^{2|V| + 4|E|}, \\tag{9}$$\n\nwhere the derivatives are taken over an affine coordinate of $L(G)$: $m_i = \\mathrm{E}_{q_i}[x_i]$, $\\chi_{ij} = \\mathrm{E}_{q_{ij}}[x_i x_j]$,\nand\n\n$$u_{i \\to j} = \\frac{\\chi_{ij} - m_i m_j}{\\{(1 - m_i^2)(1 - m_j^2)\\}^{1/2}} = \\frac{\\mathrm{Cov}_{q_{ij}}[x_i, x_j]}{\\{\\mathrm{Var}_{q_i}[x_i] \\mathrm{Var}_{q_j}[x_j]\\}^{1/2}} =: \\beta_{ij}. \\tag{10}$$\n\nNote that, from (7), the Hessian $\\nabla^2 F$ does not depend on $J_{ij}$ and $h_i$. Since the weight (10) in\nTheorem 1 is symmetric with respect to the inversion of edges, the zeta function can be reduced\nto undirected edge weights. To avoid confusion, we introduce a notation: the zeta function of\nundirected edge weights $\\beta = (\\beta_{ij})_{ij \\in E}$ is denoted by $Z_G(\\beta)$. Note also that, since $\\beta_{ij}$ is the\ncorrelation coefficient of $q_{ij}$, we have $|\\beta_{ij}| < 1$. Equality does not occur because of the positivity\nassumption on the probabilities.\n\nFigure 1: w1-reduction\n\nFigure 2: Example of the complete w-reduction.\n\n3 Signed graphs with unique solution\n\nIn this section, we state the main result of this paper, Theorem 3. The result shows a new type of\napproach towards uniqueness conditions. The proof of the theorem is given in the next section.\n\n3.1 Existing conditions on uniqueness\n\nThere have been many works on the uniqueness and/or convergence of the LBP algorithm for discrete\ngraphical models [12, 13, 14, 15] and Gaussian graphical models [21]. As we are discussing\nbinary pairwise graphical models, we review some of the conditions for this model. The following\ncondition is given by Mooij and Kappen:\nTheorem 2 ([13]). Let $\\rho(X)$ denote the spectral radius (i.e., the maximum of the absolute values of\nthe eigenvalues) of a matrix $X$.\n
If $\\rho(J M) < 1$, then LBP converges to the unique fixed point,\nwhere $J$ is a diagonal matrix defined by $J_{e,e'} = \\tanh(|J_e|) \\delta_{e,e'}$.\nThis theorem gives the uniqueness property by bounding the strengths of the interactions, i.e.,\n$\\{|J_{ij}|\\}_{ij \\in E}$. Therefore, the condition does not depend on the signs of the interactions. The situation\nis the same for the other existing conditions [12, 13, 14, 15]. For example, Heskes\u2019s condition\n[12] is\n\n$$\\sum_{j \\in N_i} |J_{ij}| < 1. \\tag{11}$$\n\nThese conditions are unsatisfactory in the sense that they do not use the information of the signs,\n$\\{\\mathrm{sgn}\\, J_{ij}\\}_{ij \\in E}$. In fact, the behavior of the LBP algorithm can be dramatically different if the signs\nof the compatibility functions are changed. Note that each edge compatibility function $\\psi_{ij}$ tends to\nforce the variables $x_i, x_j$ to be equal if $J_{ij} > 0$ and not equal if $J_{ij} < 0$; the first case is called an attractive\ninteraction and the latter a repulsive one. In contrast to the above uniqueness conditions, we pursue another\napproach: we use the information of signs, $\\{\\mathrm{sgn}\\, J_{ij}\\}_{ij \\in E}$, rather than the strengths. In this paper,\nwe characterize the signed graphs that guarantee the uniqueness of the solution; this result is stated\nin Theorem 3.\n\n3.2 Statement of main theorem of this section\n\nWe introduce basic terms to state the main theorem. A signed graph, $(G, s)$, is a graph equipped\nwith a sign map, $s$, from the edges to $\\{\\pm 1\\}$. A compatibility function defines the sign function, $s$,\nby $s(ij) = \\mathrm{sgn}\\, J_{ij}$. The sign function of all plus (resp. minus) signs is denoted by $s_+$ (resp. $s_-$).\nThe deletion and subgraph of a signed graph are defined naturally by restricting the sign function.\nDefinition 1. A w-reduction of a signed graph $(G, s)$ is a signed graph that is obtained by one of\nthe following operations:\n(w1) Erasure of a vertex of degree two. (Let $j$ be a vertex of degree two and $ij, jk$ ($i \\neq k$) be\nthe connecting edges.\n
Delete them and make a new edge $ik$ with the sign $s(ij)s(jk)$. See\nFigure 1.)\n\n(w2) Deletion of a loop with minus sign. (An edge $ij$ is called a loop if $i = j$.)\n(w3) Contraction of a bridge. (An edge is a bridge if the deletion of the edge increases the number of\nconnected components. The sign on the bridge can be either $+1$ or $-1$.)\n\nFigure 3: B3\n\nFigure 4: P3\n\nFigure 5: D4.\n\nFigure 6: Example 4 in Subsection 3.3.\n\nNote that all the operations decrease the number of edges by one. A signed graph is w-reduced if\nno w-reduction is applicable. Any signed graph is reduced to the unique w-reduced signed graph\ncalled the complete w-reduction. An example of a complete w-reduction is given in Figure 2. From\nthe viewpoint of computational complexity, finding the complete w-reduction is easy. (See the\nsupplementary material for further discussions.)\n\nHere are important (signed) graphs. See Figures 3, 4 and 5. A bouquet graph, $B_n$, is a graph with a\nsingle node with $n$ loops. $P_n$ is a graph with two vertices and $n$ parallel edges. $K_n$ is the complete\ngraph of $n$ vertices. $C_n$ is the cycle of length $n$. $D_n$ is a signed graph obtained by duplicating each edge\nof $C_n$ with plus and minus signs.\nDefinition 2. Two signed graphs $(G, s)$ and $(G, s')$ are said to be gauge equivalent if there exists a\nmap $g : V \\to \\{\\pm 1\\}$ such that $s'(ij) = s(ij) g(i) g(j)$. The map $g$ is called a gauge transformation.\nTheorem 3. For a signed graph $(G, s)$ the following conditions are equivalent.\n\n1. The LBP algorithm on $G$ has a unique fixed point for any compatibility functions with sign $s$.\n\n2. The complete w-reduction of $(G, s)$ is one of the following:\n\n(i) $B_0$ (ii) $(B_1, +)$ (iii)\n$(P_3, +, -, -)$ and $(P_3, +, +, -)$.\n
(iv) $(K_4, s_-)$ and its gauge equivalent signed graphs.\n(v) $D_n$ and its w-reduced subgraphs ($n \\geq 2$).\nThe proof of this theorem is given in the next section.\n\n3.3 Examples and experiments\n\nIn this subsection we present concrete examples of signed graphs which do or do not satisfy the\ncondition of Theorem 3.\n(Ex.1) Trees and graphs with a single cycle: In these cases it is well known that LBP has a unique\nfixed point irrespective of the compatibility functions [1, 22]. This fact is easily derived from Theorem\n3 since the complete w-reduction of them is $B_0$ or $(B_1, +)$. (Ex.2) Complete graph $K_n$: $(K_n, s)$\nis w-reduced as we cannot apply any w-reduction. For $n = 4$, the condition on the signs is given in 2.(iv).\nIf $n \\geq 5$ it does not satisfy the condition for any sign. (Ex.3) $2 \\times 2$ grid graph: This graph does\nnot satisfy the condition for any sign because its complete w-reduction is different from the signed\ngraphs in item 2 of Theorem 3. (Ex.4) Consider the signed graph in Figure 6. Notice that the\nproducts of signs along the five cycles are all minus. Applying (w2) and (w3), we see that the\ncomplete w-reduction is $B_0$. Therefore the signed graph satisfies the condition.\nWe experimentally check the convergence behavior of the LBP algorithm on $D_4$, which satisfies the\ncondition of Theorem 3. Since the LBP fixed point is unique, it is the absolute minimum of the\nBethe free energy function. We set the compatibility functions $J_{ij} = \\pm J$, $h_i = h$ and initialized\nmessages randomly. We judged convergence if the average message update is less than $10^{-3}$ after 50\niterations. The result is shown in Figure 7. LBP is not convergent in the white region on the right and\nconvergent in the remaining gray region. Convergence is theoretically guaranteed for $\\tanh(|J|) < 1/3$\n($|J| \\lesssim 0.347$) by Theorem 2.\n
In the non-convergent region LBP appears to be unstable around the\nfixed point.\n\n4 Proofs: conditions in terms of graph zeta function\n\nThe aim of this section is to prove Theorem 3. For the proof, Lemma 2, which is purely a result\non the graph zeta function, is utilized.\n\nFigure 7: Convergence region of LBP.\n\nFigure 8: X1 and X2.\n\n4.1 Graph theoretic results\nWe denote by $G - \\epsilon$ the deletion of an undirected edge $\\epsilon$ from a graph $G$ and by $G/\\epsilon$ the contraction.\nA minor of a graph is obtained by repeated applications of deletion, contraction and removal\nof isolated vertices. The deletion and contraction operations have a natural meaning in the context of\nthe graph zeta function as follows:\nLemma 1.\n\n1. Let $ij$ be an edge, then $\\zeta^{-1}_{G - ij}(u) = \\zeta^{-1}_G(\\tilde{u})$, where $\\tilde{u}_e$ is equal to $u_e$ if $[e] \\neq ij$ and 0\notherwise.\n\n2. Let $ij$ be a non-loop edge, then $\\zeta^{-1}_{G/ij}(u) = \\zeta^{-1}_G(\\tilde{u})$, where $\\tilde{u}_e$ is equal to $u_e$ if $[e] \\neq ij$\nand 1 otherwise.\n\nProof. From the prime cycle representation of zeta functions, both of the assertions are trivial.\n\nNext, to prove Theorem 3, we formally define the notion of deletions, contractions and minors on\nsigned graphs [23]. For a signed graph the signed-deletion of an edge is just the deletion of the\nedge along with the sign on it. The signed-contraction of a non-loop edge $ij \\in E$ is defined up to\ngauge equivalence as follows. For any non-loop edge $ij$, there is a gauge equivalent signed graph\nthat has the sign $+$ on $ij$. The signed-contraction is obtained by contracting the edge. The resulting\nsigned graph is determined up to gauge equivalence. A signed minor of a signed graph is obtained\nby repeated applications of signed-deletion, signed-contraction, and removal of isolated vertices.\nLemma 2. For a signed graph, $(G, s)$, the following conditions are equivalent.\n\n1. $(G, s)$ is U-type.\n
That is, if $\\beta_{ij} \\in I_{s(ij)}$ for all $ij \\in E$ then $Z^{-1}_G(\\beta) > 0$, where $\\beta =\n(\\beta_{ij})_{ij \\in E}$, $I_+ = [0, 1)$ and $I_- = (-1, 0]$.\n\n2. $(G, s)$ is weakly U-type. That is, if $\\beta_{ij} \\in I_{s(ij)}$ for all $ij \\in E$ then $Z^{-1}_G(\\beta) \\geq 0$.\n\n3. $(B_2, s_+)$ is not contained as a signed minor.\n\n4. The complete w-reduction of $(G, s)$ is one of the following: (i) $B_0$ (ii) $(B_1, s_+)$ (iii)\n$(P_3, +, -, -)$ and $(P_3, +, +, -)$. (iv) $(K_4, s_-)$ and its gauge equivalent signed graphs.\n(v) $D_n$ and its w-reduced subgraphs ($n \\geq 2$).\n\nThe uniqueness condition in Theorem 3 is equivalent to all the conditions in this lemma. Here, we\nremark on properties of this condition (the proof is straightforward from the definition and Lemma 2):\n\n(1) $(G, s)$ is U-type iff its gauge equivalents are U-type.\n(2) If $(G, s)$ is U-type then its signed minors are U-type.\n\nWe prove the equivalence in a cyclic manner. Here we give a sketch of the proof. (Details are given in the\nsupplementary material.)\n\nProof of 1 $\\Rightarrow$ 2. Trivial.\nProof of 2 $\\Rightarrow$ 3. If $(G, s)$ is weakly U-type, then its signed minors are weakly U-type; this is obvious\nfrom Lemma 1. However, direct computation of the zeta function of $(B_2, s_+)$ shows that this signed graph is\nnot weakly U-type. In fact, with the directed edges ordered as $(e_1, \\bar{e}_1, e_2, \\bar{e}_2)$, where $[e_k] = \\epsilon_k$, the directed edge matrix with weights of $B_2$ is\n\n$$BM = \\begin{bmatrix} \\beta_{\\epsilon_1} & 0 & \\beta_{\\epsilon_1} & \\beta_{\\epsilon_1} \\\\ 0 & \\beta_{\\epsilon_1} & \\beta_{\\epsilon_1} & \\beta_{\\epsilon_1} \\\\ \\beta_{\\epsilon_2} & \\beta_{\\epsilon_2} & \\beta_{\\epsilon_2} & 0 \\\\ \\beta_{\\epsilon_2} & \\beta_{\\epsilon_2} & 0 & \\beta_{\\epsilon_2} \\end{bmatrix}$$\n\nand $\\det(I - BM) = (1 - \\beta_{\\epsilon_1})(1 - \\beta_{\\epsilon_2})(1 - \\beta_{\\epsilon_1} - \\beta_{\\epsilon_2} - 3\\beta_{\\epsilon_1}\\beta_{\\epsilon_2})$. This value can be negative in\nthe region $0 \\leq \\beta_{\\epsilon_1}, \\beta_{\\epsilon_2} < 1$.\nProof of 3 $\\Rightarrow$ 4.\n
Note that if $(G, s)$ does not contain $(B_2, s_+)$ as a signed minor then any w-reductions\nof $(G, s)$ also do not contain $(B_2, s_+)$ as a signed minor; we can check this property\nfor each type of w-reduction, (w1, w2, w3).\nTherefore, it is sufficient to show that if a w-reduced signed graph $(G, s)$ does not contain $(B_2, +, +)$\nas a signed minor then it is one of the five types. Notice that $G$ has no vertex of degree less than\nthree. First, if the nullity of $G$ is less than three, it is not hard to see that the signed graph is of type\n(i), (ii) or (iii). Secondly, we consider the case that the graph $G$ has nullity three. Note that all\nw-reduced signed graphs of nullity two have the signed minor $(B_1, +)$. Therefore, we can assume\nthat $G$ does not have a (plus) loop. Since $(G, s)$ is w-reduced, $G$ must be one of the following graphs:\n$K_4$, $P_4$, $X_1$ and $X_2$, where $X_1$ and $X_2$ are defined in Figure 8. It is easy to check that the possible ways\nof assigning signs on these graphs are of the types (iii-v). Finally, we consider the case where the\nnullity, $n$, is more than three. In this case, we can show that $(G, s)$ must be $D_n$ or its subgraph.\n(Details are found in the supplementary material.)\nProof of 4 $\\Rightarrow$ 1. First we claim the following statement: if\n\n$$\\zeta^{-1}_G(u) \\geq 0 \\quad \\forall u = (u_e) \\in \\prod_{e \\in \\vec{E}} \\{0, s([e])\\}, \\tag{12}$$\n\nthen $(G, s)$ is U-type. This claim can be proved using the property that $\\zeta^{-1}_G(u) = \\det(I - UM)$\nis linear in each variable $u_e$. (That is, if we fix $u$ except for one variable, say $u_{e_1}$, then $\\zeta^{-1}_G =\nC_1 + C_2 u_{e_1}$.) Take the product of the closed intervals from 0 to $s(e)$ ($e \\in \\vec{E}$) and make a hypercube.\nIf there is a non-positive point in the hypercube then there must be a non-positive point in a face; we\ncan repeat this argument until we arrive at a vertex.\nWe check the condition (12) for all the four classes. Notice that if $(G, s)$ satisfies (12) then its gauge\nequivalents, deletions and signed-contractions have the same property. So far, we have proven the\nassertion for w-reduced graphs; we extend the proof to arbitrary signed graphs. For any signed\ngraph, the complete w-reduction is obtained by first using reductions (w1, w2) and then reducing\nthe bridges (w3) because (w3) always makes the degree bigger and does not make a loop. Therefore,\nthe following two claims complete the proof.\nClaim 1. Let $(G', s')$ be a (w3)-reduction of a signed graph $(G, s)$, i.e., obtained by contraction of\na bridge $\\epsilon$. If $(G', s')$ has the property (12) then $(G, s)$ also has the property.\n\nProof of Claim 1. Let $b$ and $\\bar{b}$ be the directed edges corresponding to $\\epsilon$. Since any prime cycle passes\n$b$ and $\\bar{b}$ the same number of times,\n\n$$\\zeta^{-1}_G(u) = \\zeta^{-1}_{G - \\epsilon}(\\tilde{u}) + u_b u_{\\bar{b}} f(\\tilde{u}), \\tag{13}$$\n\nwhere $\\tilde{u}$ is the restriction of $u$ to $G - \\epsilon$ and $f$ is a function. Assume that $s(\\epsilon) = 1$.\n(The case $s(\\epsilon) = -1$ is completely analogous.) Since $(G', s')$ has the property (12), $(G, s)$ has the property\nfor $(u_b, u_{\\bar{b}}) = (1, 1)$. For the cases $(u_b, u_{\\bar{b}}) = (0, 0), (1, 0), (0, 1)$, we can deduce it from the property of\n$G - \\epsilon$. \u25a0\nClaim 2. Let $(G', s')$ be a (w1) or (w2)-reduction of a signed graph $(G, s)$. If $(G', s')$ is U-type\nthen $(G, s)$ is U-type.\n\nProof of Claim 2. The case of (w1) is trivial. We prove the case (w2). From the multivariate\nIhara\u2019s formula, the positivity of $Z^{-1}_{G'}(\\beta)$ on the set $\\prod_{ij \\in E} I_{s(ij)}$ implies the positive definiteness of\n$I + \\hat{D}' - \\hat{A}'$ on the set. Adding a minus loop corresponds to adding $2\\beta^2(1 - \\beta^2)^{-1} - 2\\beta(1 - \\beta^2)^{-1}\n= -2\\beta(1 + \\beta)^{-1}$ on the diagonal, where $-1 < \\beta \\leq 0$. Therefore the new matrix is also positive\ndefinite and $(G, s)$ is U-type. \u25a0\n\n4.2 Proof of Theorem 3\nProof of 2 $\\Rightarrow$ 1. The basic strategy is to use the following theorem.\nTheorem 4 (Index sum theorem [16]). As usual, consider the Bethe free energy function, $F$, defined\non $L(G)$. Assume that $\\det \\nabla^2 F(q) \\neq 0$ for all LBP fixed points $q$. Then the sum of indices at the\nLBP fixed points is equal to one:\n\n$$\\sum_{q : \\nabla F(q) = 0} \\mathrm{sgn}\\big(\\det \\nabla^2 F(q)\\big) = 1, \\quad \\text{where } \\mathrm{sgn}(x) := \\begin{cases} 1 & \\text{if } x > 0, \\\\ -1 & \\text{if } x < 0. \\end{cases}$$\n\n(We call each summand, which is $+1$ or $-1$, the index of $F$ at $q$.)\n\nAt each LBP fixed point, the beta values for a solution can be computed using (10). Since the signs\nof $\\beta_{ij}$ and $J_{ij}$ are equal [16], $\\beta = (\\beta_{ij}) \\in \\prod_{ij \\in E} I_{s(ij)}$ is satisfied. Therefore, from the assumption\nand Lemma 2, the index of the solution is positive. We conclude the uniqueness of the solution from\nthe above index sum theorem.\nProof of 1 $\\Rightarrow$ 2. We show the contraposition. From Lemma 2, $(G, s)$ is not weakly U-type; there is\n$\\beta = (\\beta_{ij}) \\in \\prod_{ij \\in E} I_{s(ij)}$ such that $\\zeta^{-1}_G(\\beta) < 0$. Take pseudomarginals $q = \\{q_{ij}\\}_{ij \\in E} \\cup \\{q_i\\}_{i \\in V}$\nthat have the correlation coefficients of $q_{ij}$ equal to $\\beta_{ij}$. (For example, set $\\chi_{ij} = \\beta_{ij}$, $m_i = 0$.) We\ncan choose $J_{ij}$ and $h_i$ such that\n\n$$\\prod_{ij \\in E} q_{ij}(x_i, x_j) \\prod_{i \\in V} q_i^{1 - d_i}(x_i) \\propto \\exp\\Big( \\sum_{ij \\in E} J_{ij} x_i x_j + \\sum_{i \\in V} h_i x_i \\Big). \\tag{14}$$\n\nThis construction implies that $q$ corresponds to an LBP fixed point with compatibility functions\n$\\{J_{ij}, h_i\\}$. This solution has index $-1$ by definition.\nIf this is the unique solution, it contradicts\nthe index sum formula.\n
Therefore, there must be other solutions.\n\n5 Concluding remarks\n\nIn this paper we have developed a new approach to the uniqueness problem of the LBP algorithm.\nAs a result, we have obtained a new class of LBPs that are guaranteed to have a unique solution.\nThe uniqueness problem is reduced to the properties of graph zeta functions, Lemma 2, using the\nindex formula. In contrast to the existing conditions, our uniqueness guarantee includes graphical\nmodels with strong interactions. Though our result is shown in the case of binary pairwise models,\nthe idea can be extended to factor graph models with many states. In fact, Theorem 1 has been\nextended to the general settings of the LBP algorithm on factor graphs [20].\nOne direction for future research is to combine the information of the signs and strengths of\nthe interactions to show the uniqueness. The uniqueness problem is reduced to the positivity of the\ngraph zeta function on a restricted set, rather than the hypercube of size one. If we can check the\npositivity of graph zeta functions theoretically or algorithmically, the result can be used for a better\nguarantee of the uniqueness.\n\nReferences\n[1] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference.\nMorgan Kaufmann Publishers, San Mateo, CA, 1988.\n\n[2] P.F. Felzenszwalb and D.P. Huttenlocher. Efficient belief propagation for early vision.\nInternational journal of computer vision, 70(1):41\u201354, 2006.\n\n[3] D. Baron, S. Sarvotham, and R.G. Baraniuk. Bayesian compressive sensing via belief propagation.\nSignal Processing, IEEE Transactions on, 58(1):269\u2013280, 2010.\n\n[4] R.J. McEliece, D.J.C. MacKay, and J.F. Cheng. Turbo decoding as an instance of Pearl\u2019s\n\u201cbelief propagation\u201d algorithm. IEEE J. Sel. Areas Commun., 16(2):140\u201352, 1998.\n\n[5] S. Ikeda, T. Tanaka, and S. Amari. Stochastic reasoning, free energy, and information geometry.\n
Neural Computation, 16(9):1779\u20131810, 2004.\n\n[6] M.J. Wainwright and M.I. Jordan. Graphical models, exponential families, and variational\ninference. Foundations and Trends in Machine Learning, 1(1-2):1\u2013305, 2008.\n\n[7] J.S. Yedidia, W.T. Freeman, and Y. Weiss. Generalized belief propagation. Adv. in Neural\nInformation Processing Systems, 13:689\u201395, 2001.\n\n[8] A.L. Yuille. CCCP algorithms to minimize the Bethe and Kikuchi free energies: Convergent\nalternatives to belief propagation. Neural computation, 14(7):1691\u20131722, 2002.\n\n[9] A.L. Yuille and A. Rangarajan. The concave-convex procedure. Neural Computation,\n15(4):915\u2013936, 2003.\n\n[10] Y.W. Teh, M. Welling, et al. The unified propagation and scaling algorithm. Advances in\nneural information processing systems, 2:953\u2013960, 2002.\n\n[11] T. Heskes. Convexity arguments for efficient minimization of the Bethe and Kikuchi free energies.\nJournal of Artificial Intelligence Research, 26(1):153\u2013190, 2006.\n\n[12] T. Heskes. On the uniqueness of loopy belief propagation fixed points. Neural Computation,\n16(11):2379\u20132413, 2004.\n\n[13] J. M. Mooij and H. J. Kappen. Sufficient Conditions for Convergence of the Sum-Product\nAlgorithm. IEEE Transactions on Information Theory, 53(12):4422\u20134437, 2007.\n\n[14] A.T. Ihler, J.W. Fisher, and A.S. Willsky. Loopy belief propagation: Convergence and effects\nof message errors. Journal of Machine Learning Research, 6(1):905\u2013936, 2006.\n\n[15] S. Tatikonda and M.I. Jordan. Loopy belief propagation and Gibbs measures. Uncertainty in\nAI, 18:493\u2013500, 2002.\n\n[16] Y. Watanabe and K. Fukumizu. Graph zeta function in the Bethe free energy and loopy belief\npropagation. Adv. in Neural Information Processing Systems, 22:2017\u20132025, 2009.\n\n[17] M. Kotani and T. Sunada. Zeta functions of finite graphs. J. Math. Sci. Univ. Tokyo, 7(1):7\u201325,\n2000.\n\n[18] K. Hashimoto.\n
Zeta functions of finite graphs and representations of p-adic groups. Automorphic\nforms and geometry of arithmetic varieties, 15:211\u2013280, 1989.\n\n[19] H.M. Stark and A.A. Terras. Zeta functions of finite graphs and coverings. Advances in\nMathematics, 121(1):124\u2013165, 1996.\n\n[20] Y. Watanabe and K. Fukumizu. Loopy belief propagation, Bethe free energy and graph zeta\nfunction. arXiv:1103.0605.\n\n[21] D.M. Malioutov, J.K. Johnson, and A.S. Willsky. Walk-sums and belief propagation in Gaussian\ngraphical models. The Journal of Machine Learning Research, 7:2064, 2006.\n\n[22] Y. Weiss. Correctness of Local Probability Propagation in Graphical Models with Loops.\nNeural Computation, 12(1):1\u201341, 2000.\n\n[23] Thomas Zaslavsky. Characterizations of signed graphs. Journal of Graph Theory, 5(4):401\u2013406,\n1981.\n", "award": [], "sourceid": 862, "authors": [{"given_name": "Yusuke", "family_name": "Watanabe", "institution": null}]}