{"title": "The Neurodynamics of Belief Propagation on Binary Markov Random Fields", "book": "Advances in Neural Information Processing Systems", "page_first": 1057, "page_last": 1064, "abstract": null, "full_text": "The Neurodynamics of Belief Propagation on Binary\n\nMarkov Random Fields\n\nThomas Ott\n\nInstitute of Neuroinformatics\n\nETH/UNIZH Zurich\n\nSwitzerland\n\nRuedi Stoop\n\nInstitute of Neuroinformatics\n\nETH/UNIZH Zurich\n\nSwitzerland\n\ntott@ini.phys.ethz.ch\n\nruedi@ini.phys.ethz.ch\n\nAbstract\n\nWe rigorously establish a close relationship between message passing algorithms\nand models of neurodynamics by showing that the equations of a continuous Hopfield\nnetwork can be derived from the equations of belief propagation on a binary\nMarkov random field. As Hopfield networks are equipped with a Lyapunov function,\nconvergence is guaranteed. As a consequence, in the limit of many weak connections\nper neuron, Hopfield networks exactly implement a continuous-time variant\nof belief propagation starting from message initialisations that avoid\nconvergence problems. Our results lead to a better understanding of\nthe role of message passing algorithms in real biological neural networks.\n\n1 Introduction\n\nReal brain structures employ inference algorithms as a basis of decision making. Belief Propagation\n(BeP) is a popular, widely applicable inference algorithm that seems particularly suited for a neural\nimplementation. The algorithm is based on message passing between distributed elements that\nresembles the signal transduction within a neural network. The analogy between BeP and neural networks\nis emphasised if BeP is formulated within the framework of Markov random fields (MRF).\nMRF are related to spin models [1] that are often used as abstract models of neural networks with\nsymmetric synaptic weights. 
If a neural implementation of BeP can be realised on the basis of MRF,\neach neuron corresponds to a message passing element (hidden node of an MRF) and the synaptic\nweights reflect their pairwise dependencies. The neural activity then would encode the messages\nthat are passed between connected nodes. Due to the highly recurrent nature of biological neural\nnetworks, MRF obtained in this correspondence to a neural network are naturally very “loopy”.\nConvergence of BeP on loopy structures is, however, a delicate matter [1]-[2].\nHere, we show that BeP on binary MRF can be reformulated as continuous Hopfield networks along\nthe lines of the sketched correspondence. More precisely, the equations of a continuous Hopfield\nnetwork are derived from the equations of BeP on a binary MRF, if there are many, but weak\nconnections per neuron. As a central result in this case, attractive fixed points of the Hopfield network\nprovide very good approximations of BeP fixed points of the corresponding MRF. In the Hopfield\ncase a Lyapunov function guarantees the convergence towards these fixed points. As a consequence,\nHopfield networks implement BeP with guaranteed convergence. The result of the inference is\ndirectly represented by the activity of the neurons in the steady state. To illustrate this mechanism,\nwe compare the magnetisations obtained in the original BeP framework to those from the Hopfield\nnetwork framework, for a symmetric ferromagnetic model.\nHopfield networks may also serve as a guideline for the implementation or the detection of BeP\nin more realistic, e.g., spiking, neural networks. 
By giving up the constraint of symmetric synaptic weights, we may generalise the original BeP inference algorithm towards capturing neurally\ninspired message passing.\n\n2 A Quick Review on Belief Propagation in Markov Random Fields\n\nMRF have been used to formulate inference problems, e.g. in Boltzmann machines (which actually are MRF [3]) or in the field of computer vision [4] and are related to Bayesian networks. In fact, both concepts are equivalent variants of graphical models [1]. Typically, from a given set of observations $\\{y\\}$, we want to infer some hidden quantities $\\{x\\}$ that, in our case, take on either of the two values $\\{-1, 1\\}$. For instance, the pixel values of a grey-scaled image may be represented by $\\{y\\}$, whereas a particular variable $x_i$ describes whether pixel $i$ belongs to an object ($x_i = 1$) or to the background ($x_i = -1$). The natural question that emerges in this context is: Given the observations $\\{y\\}$, what is the probability for $x_i = 1$? The relation between $\\{y\\}$ and $\\{x\\}$ is usually given by a joint probability, written in the factorised form\n\n$$P(\\{x\\},\\{y\\}) = \\frac{1}{Z} \\prod_{(ij)} \\psi_{ij}(x_i, x_j) \\prod_i \\phi_i(x_i, y_i), \\qquad (1)$$\n\nwhere the functions $\\{\\psi_{ij}\\}$ describe the pairwise dependencies of the hidden variables $\\{x\\}$ and the functions $\\{\\phi_i\\}$ give the evidences from $\\{y\\}$. $Z$ is the normalisation constant [1]. (1) can directly be reformulated as an Ising system with the energy\n\n$$H(s) = -\\sum_{(ij)} J_{ij}(s_i, s_j) - \\sum_i h_i(s_i), \\qquad (2)$$\n\nwhere the Boltzmann distribution provides the probability $P(s)$ of a spin configuration $s$,\n\n$$P(s) = \\frac{1}{Z} e^{-H(s)/T}. \\qquad (3)$$\n\nA comparison with (1) yields $s_i = x_i$, $J_{ij}(s_i, s_j)/T = \\ln \\psi_{ij}(x_i, x_j)$ and $h_i(s_i)/T = \\ln \\phi_i(x_i, y_i)$. In many cases, it is reasonable to assume that $J_{ij}(s_i, s_j) = J_{ij} s_i s_j$ and that $h_i(s_i) = h_i s_i$, where $J_{ij}$ and $h_i$ are real-valued constants, so that (2) transforms into the familiar Ising Hamiltonian [5]. For convenience, we set $T = 1$.\nThe inference task inherent to MRF amounts to extracting marginal probabilities\n\n$$P_i(x_i) = \\sum_{\\{x_k\\},\\, k \\neq i} P(\\{x\\}). \\qquad (4)$$\n\nAn exact evaluation of $P_i$ according to Eq. (4) is generally very time-consuming. BeP provides us with approximated marginals within a reasonable time. This approach is based on the idea that connected elements (where a connection is given by $J_{ij} \\neq 0$) interchange messages that contain a recommendation about what state the other elements should be in [1]. Given the set of messages $\\{m^t_{ij}(x_j)\\}$ at time $t$, the messages at time $t+1$ are determined by\n\n$$m^{t+1}_{ij}(x_j) = \\sum_{x_i} \\phi_i(x_i, y_i)\\, \\psi_{ij}(x_i, x_j) \\prod_{k \\in N(i) \\setminus j} m^t_{ki}(x_i). \\qquad (5)$$\n\nHere, $m_{ij}$ denotes the message sent from the hidden variable (or node) $i$ to node $j$. $N(i) \\setminus j$ denotes the set of all neighbouring nodes of $i$ without $j$. Usually, the messages are normalised at every time step, i.e., $m^t_{ij}(1) + m^t_{ij}(-1) = 1$. After (5) has converged, the marginals $P_i$ are approximated by the so-called beliefs $b_i$ that are calculated according to\n\n$$b_i(x_i) = \\alpha\\, \\phi_i(x_i, y_i) \\prod_{k \\in N(i)} m_{ki}(x_i), \\qquad (6)$$\n\nwhere $\\alpha$ is a normalisation constant. In particular in connection with Ising systems, one is primarily interested in the quantity $m_i = b_i(1) - b_i(-1)$, the so-called local magnetisation.\nFor a detailed introduction of BeP on MRF we refer to [1].\n\n3 BeP and the Neurodynamics of Hopfield Networks\n\nThe goal of this section is to establish the relationship between the update rule (5) and the dynamical equation of a continuous Hopfield network,\n\n$$\\frac{da_i(t)}{dt} = -a_i(t) + f\\Big(\\sum_k w_{ik} a_k(t)\\Big) + \\theta_i(t). \\qquad (7)$$\n\nHere $a_i(t)$ is some quantity describing the activity of neuron $i$ (e.g., the membrane potential) and $f(x)$ is the activation function, typically implemented in a sigmoid form, such as $f(x) = \\tanh(x)$. $w_{ij} = w_{ji}$ are the connection (synaptic) weights which need to be symmetric in the Hopfield model. The connectivity might be all-to-all or sparse. $\\theta_i(t)$ is an external signal or bias (see, e.g., [6] for a general introduction to Hopfield networks). According to the sketched picture, each neuron represents a node $x_i$, whereas the messages are encoded in the variables $a_i$ and $w_{ij}$. The exact nature of this encoding will be worked out below. The Hopfield architecture implements the point attractor paradigm, i.e., by means of the dynamics the network is driven into a fixed point. At the fixed point, the beliefs $b_i$ can be read out. In the MRF picture, this corresponds to (5) and (6). We will now realise the translation from MRF into Hopfield networks as follows:\n(1) Reduction of the number of messages per connection from $m_{ij}(1)$ and $m_{ij}(-1)$ to one reparameterised variable $\\nu_{ij}$.\n(2) Translation into a continuous system.\n(3) Translation of the obtained equations into the equations of a Hopfield network, where we find the encoding of the variables $\\nu_{ij}$ in terms of $a_i$ and $w_{ij}$.\nThis will establish the exact relationship between Hopfield and BeP.\n\n3.1 Reparametrisation of the messages\n\nIn the case of binary variables $x_i$, the messages $m_{ij}(x_j)$ can be reparameterised [2] according to\n\n$$\\tanh \\nu_{ij} = m_{ij}(x_j = 1) - m_{ij}(x_j = -1). \\qquad (8)$$\n\nBy this, the update rules (5) transform into update rules for the new “messages” $\\nu_{ij}$,\n\n$$\\nu^{t+1}_{ij} = F_{ij}(\\nu^t) = \\tanh^{-1}\\Big[\\tanh(J_{ij}) \\tanh\\Big(\\sum_{k \\in N(i) \\setminus j} \\nu^t_{ki} + h_i\\Big)\\Big]. \\qquad (9)$$\n\nFor each connection $i \\to j$ we obtain one single message $\\nu_{ij}$. The used reparametrisation translates the update rules into an additive form (“log domain”) which is a basic assumption of most models of neural networks. We can now directly calculate the local magnetisation according to $m_i = \\tanh\\big(\\sum_{k \\in N(i)} \\nu_{ki} + h_i\\big)$ [2]. The Jacobian of (9) in a point $\\nu$ is denoted by $DF(\\nu) = \\big(\\partial F_{ij} / \\partial \\nu_{kl}\\big)(\\nu)$.\n\n3.2 Translation into a time-continuous system\n\nEq. (9) can be translated into the equivalent time-continuous system\n\n$$\\frac{d\\nu_{ij}(t)}{dt} = G_{ij}(\\nu(t)) = -\\nu_{ij}(t) + \\tanh^{-1}\\Big[\\tanh(J_{ij}) \\tanh\\Big(\\sum_{k \\in N(i) \\setminus j} \\nu_{ki}(t) + h_i(t)\\Big)\\Big], \\qquad (10)$$\n\nwhere $h_i(t) = h_i$ is time-independent. The corresponding Jacobian in a point $\\nu$ is denoted by $DG(\\nu) = -I + DF(\\nu)$, where $I$ is the $n \\times n$ identity matrix ($n$ is the number of messages $\\nu_{ij}$). Obviously, (9) and (10) have the same fixed points $\\nu^{\\mathrm{fp}}$, which are given by\n\n$$\\nu_{ij} = \\tanh^{-1}\\Big[\\tanh(J_{ij}) \\tanh\\Big(\\sum_{k \\in N(i) \\setminus j} \\nu_{ki} + h_i\\Big)\\Big], \\qquad (11)$$\n\nwith identical stability properties in both frameworks: For stability of (9) it is required that the real part of the largest eigenvalue of the Jacobian $DF(\\nu^{\\mathrm{fp}})$ be smaller than 1, whereas for the stability of (10) the condition is that the real part of the largest eigenvalue of $DG(\\nu^{\\mathrm{fp}}) = -I + DF(\\nu^{\\mathrm{fp}})$ must be smaller than 0. It is obvious that both conditions are identically satisfied.\n\n3.3 Translation into a Hopfield network\n\nThe comparison between Eq. (7) and Eq. (10) does not lead to a direct identification of $a_i$ with $\\nu_{ij}$. Rather, under certain conditions, we can identify $\\nu_{ij}$ with $w_{ij} a_i$. That is, a message corresponds to the presynaptic neural activity weighted by the synaptic strength. Formally, we may define a variable $a^j_i$ by $\\nu_{ij} = w_{ij} a^j_i$ and rewrite Eq. (10) as\n\n$$\\frac{d}{dt}\\big(w_{ij} a^j_i\\big) = -w_{ij} a^j_i + \\tanh^{-1}\\Big[w_{ij} \\tanh\\Big(\\sum_{k \\in N(i)} w_{ki} a^i_k - w_{ji} a^i_j + h_i(t)\\Big)\\Big], \\qquad (12)$$\n\nwhere we set $w_{ij} = \\tanh(J_{ij})$, so that the synaptic weights $w_{ij}$ are automatically restricted to the interval $(-1, 1)$. In the following, we assume that the synaptic weights $w_{ij}$ are relatively small, i.e., $w_{ij} \\ll 1$. Hence $\\tanh^{-1}(x)$ can be approximated by $\\tanh^{-1}(x) \\approx x$. Moreover, if a neuron receives many inputs (number of connections $q_i \\gg 1$) then the single contribution $w_{ji} a^i_j$ can be neglected. Thus (12) simplifies to\n\n$$\\frac{d}{dt}\\big(w_{ij} a^j_i\\big) = -w_{ij} a^j_i + w_{ij} \\tanh\\Big(\\sum_{k \\in N(i)} w_{ki} a^i_k + h_i(t)\\Big). \\qquad (13)$$\n\nUpon a division by $w_{ij}$, we arrive at the equation\n\n$$\\frac{d a^j_i}{dt} = -a^j_i + \\tanh\\Big(\\sum_{k \\in N(i)} w_{ki} a^i_k + h_i(t)\\Big), \\qquad (14)$$\n\nwhich for a uniform initialisation $a^1_i(0) = a^2_i(0) = \\dots = a^{q_i}_i(0)$ for all $i$ preserves this uniformity through time, i.e., $a^1_i(t) = a^2_i(t) = \\dots = a^{q_i}_i(t)$. In other words, the subset defined by $a^1_i = a^2_i = \\dots = a^{q_i}_i$ is invariant under the dynamics of (14). For such an initialisation we can therefore replace for all $i$ all $a^j_i$ by a single variable $a_i$, which leads to the equation\n\n$$\\frac{d a_i}{dt} = -a_i + \\tanh\\Big(\\sum_{k \\in N(i)} w_{ki} a_k + h_i(t)\\Big). \\qquad (15)$$\n\nUsing $\\tanh(x + y) \\approx \\tanh(x) + \\tanh(y)$ if $y \\ll 1$, and with $y = h_i$ (so that $\\theta_i = \\tanh(h_i)$), we end up with the postulated equation (7). After the convergence to an attractor fixed point, the local magnetisation is simply the activity $a_i$. This is because the fixed point and the read-out equations collapse under the approximation $\\tanh\\big(\\sum_{k \\in N(i)} w_{ki} a_k + h_i\\big) \\approx \\tanh\\big(\\sum_{k \\in N(i)} w_{ki} a_k\\big) + \\theta_i$, i.e., $a_i(t \\to \\infty) = m_i$.\nIn summary, we can emulate the original BeP procedure by a continuous Hopfield network provided that (I) the single weights $w_{ij}$ and the external fields $h_i(t)$ are relatively weak, (II) that each neuron receives many inputs and (III) that the original messages have been initialised according to $\\nu_{ik_1}(0)/w_{ik_1} = \\nu_{ik_2}(0)/w_{ik_2} = \\dots$ for all neighbours $k_1, k_2, \\dots$ of each node $i$. From a biological point of view, the first two points seem reasonable. The effect of a single synapse is typically small compared to the totality of the numerous synaptic inputs of a cell [7]-[8]. In this sense, single weights are considered weak. In order to establish a firm biological correspondence, particular consideration will be required for the last point. In the next section, we show that Hopfield networks are guaranteed to converge and thus, the required initialisation can be considered a natural choice for BeP on MRF with the properties (I) and (II).\n\n3.4 Guarantee of convergence\n\nA basic Hopfield model of the form\n\n$$\\frac{dx_i(t)}{dt} = -x_i(t) + \\sum_j w_{ij} f(x_j(t)) + I_i, \\qquad (16)$$\n\nwith $f(x) = \\tanh(x)$, has the same attractor structure as the model (7) described above (see [6] and references therein). For the former model, an explicit Lyapunov function has been constructed [9] which assures that these networks and with them the networks considered by us are globally asymptotically stable [6].\nMoreover, the time-continuous model (7) can be translated back into a time-discrete model, yielding\n\n$$a_i(t+1) = \\tanh\\Big(\\sum_j w_{ij} a_j(t)\\Big) + \\theta_i(t). \\qquad (17)$$\n\nThis equation is the proper analogue of Eq. (9).\n\nFigure 1: The magnetisation $m$ as a function of (a) $T$ and (b) $w$ for the symmetric ferromagnetic model. The results for the original BeP (grey stars) and for the Hopfield network (black circles) are compared.\n\n4 Results for the Ferromagnetic Model\n\nIn this section, we evaluate the Hopfield-based inference solution $m_i = a_i(t \\to \\infty)$ for networks with a simple connectivity structure: We assume constant positive synaptic weights $w_{ij} = w$ (ferromagnetic couplings) and a constant number of connections per neuron $q$. We furthermore abstain from an external field and set $\\theta_i = 0$. To realise this symmetric model, we may either think of an infinitely extended network or of a network with some spatial periodicity, e.g., a network on a torus. According to the last section, $w$ is related to $J$ in a spin model via $w = \\tanh(J) = \\tanh(1/T)$, where, for convenience, we reintroduced a quasi-temperature $T$ as a scaling parameter. From Eq. (7), it is clear that $a^{\\mathrm{fp}} = (a^*, a^*, \\dots, a^*)$ is a fixed point of the system if $a^* = \\tanh(q w a^*)$. This equation always has a solution $a_0 = 0$. However, the stability of $a_0$ is restricted to $T > T^{\\mathrm{Hopf}}_{\\mathrm{crit}}$, where the bifurcation point is given by\n\n$$T^{\\mathrm{Hopf}}_{\\mathrm{crit}} = \\frac{1}{\\tanh^{-1}(1/q)}. \\qquad (18)$$\n\nThis follows from the critical condition $\\partial \\tanh(q w a)/\\partial a\\,\\big|_{a = a_0} = 1$. For $T < T^{\\mathrm{Hopf}}_{\\mathrm{crit}}$, two additional and stable fixed points $a_{\\pm}$ emerge which are symmetric with respect to the origin. After the convergence to a stable fixed point, $a_0$ for $T > T^{\\mathrm{Hopf}}_{\\mathrm{crit}}$ and $a_{\\pm}$ for $T < T^{\\mathrm{Hopf}}_{\\mathrm{crit}}$, the obtained magnetisation $m = \\tanh(q w a^*)$, equal to $a^*$, is shown as a function of $T$ in Fig. 1a (black circles), for $q = 20$. The critical point is found at a temperature $T^{\\mathrm{Hopf}}_{\\mathrm{crit}} = 1/\\tanh^{-1}(1/20) \\approx 19.98$.\nThe result is compared to the result obtained on the basis of the original BeP equations (5) (grey stars in Fig. 1a). We see that the critical point is slightly lower in the original BeP case. This can be understood from Eq. (9), for which the point given by the messages $\\nu_0 = (0, 0, 0, \\dots)$ loses stability at the critical temperature\n\n$$T^{\\mathrm{BeP}}_{\\mathrm{crit}} = \\frac{1}{\\tanh^{-1}\\big(1/(q-1)\\big)}. \\qquad (19)$$\n\nFor the value $q = 20$, this yields $T^{\\mathrm{BeP}}_{\\mathrm{crit}} \\approx 18.98$. $T^{\\mathrm{BeP}}_{\\mathrm{crit}}$ is in fact the critical temperature for Ising grids obtained in the Bethe-Peierls approximation (for $q = 4$, we get $T^{\\mathrm{BeP}}_{\\mathrm{crit}} = 2.88539$ [5]). In this way, we casually come across the deep relationship of BeP and Bethe-Peierls which has been established by the theorem stating that stable BeP fixed points are local minima of the Bethe free energy functional [1],[10].\nIn the limit of small weights, i.e. large $T$, the results for Hopfield nets and BeP must be identical. This, in fact, is certainly true for $T > T^{\\mathrm{Hopf}}_{\\mathrm{crit}}$, where $m = 0$ in both cases. For very large weights, i.e., small $T$, the results are also identical in the case of the ferromagnetic couplings studied here, as $m \\to 1$. It is only around the critical values that the two results seem to differ. A comparison of the results against the synaptic weight $w$, however, shows an almost perfect agreement for all $w$. The differences can be made arbitrarily small for larger $q$.\n\n5 Discussion and Outlook\n\nIn this report, we outlined the general structural affinity between belief propagation on binary\nMarkov random fields and continuous Hopfield networks. 
According to this analogy, synaptic\nweights correspond to the pairwise dependencies in the MRF and the neuronal signal transduction\ncorresponds to the message exchange. In the limit of many synaptic connections per neuron,\nbut comparatively small individual synaptic weights, the dynamics of the Hopfield network is an\nexact mirror of the BeP dynamics in its time-continuous form. To achieve the agreement, the choice\nof initial messages needs to be confined. From this we can conclude that Hopfield network attractors\nare also BeP attractors (whereas the opposite does not necessarily hold). Unlike BeP, Hopfield\nnetworks are guaranteed to converge to a fixed point. We may thus argue that Hopfield networks\nnaturally implement useful message initialisations that prevent the dynamics from being trapped in a limit cycle. As a further\nbenefit, the local magnetisations, as the result of the inference process, are directly reflected in the\nasymptotic neural activity. The binary basis of the implementation is not necessarily a drawback,\nbut could simply reflect the fact that many decisions have a yes-or-no character.\nOur work so far is preliminary in character. The Hopfield network model is still a crude simplification\nof biological neural networks and the relevance of our results for such real-world structures remains\nsomewhat open. However, the search for a possible neural implementation of BeP is appealing and\ndifferent concepts have already been outlined [11]. This approach shares our guiding idea that the\nneural activity should directly be interpreted as a message passing process. Whereas our approach\nis a mathematically rigorous intermediate step towards more realistic models, the approach chosen\nin [11] tries to directly implement BeP with spiking neurons. In accordance with the guiding idea,\nour future work will comprise three major steps. 
First, we take the step from Hopfield networks to\nnetworks with spiking elements. Here, the question is to what extent the concepts of message\npassing can be adapted or reinterpreted so that a BeP implementation is possible. Second, we will give\nup the artificial requirement of symmetric synaptic weights. To do this, we might have to modify\nthe original BeP concept, while we still may want to stick to the message passing idea. After\nall, there is no obvious reason why the brain should implement exactly the BeP algorithm. It rather\nseems plausible that the brain employs inference algorithms that might be conceptually close to BeP.\nThird, the context and the tasks for which such algorithms can actually be used must be elaborated.\nFurthermore, we need to explore how the underlying structure could actually be learnt by a neural\nsystem.\nMessage passing-based inference algorithms offer an attractive alternative to traditional notions of\ncomputation inspired by computer science, paving the way towards a more profound understanding\nof natural computation [12]. To judge its eligibility, there is ultimately one question: How can\nthe usefulness (or inappropriateness) of the message passing concept in connection with biological\nnetworks be verified or challenged experimentally?\n\nAcknowledgements\n\nThis research has been supported by a ZNZ grant (Neuroscience Center Zurich).\n\nReferences\n\n[1] Yedidia, J.S., Freeman, W.T., Weiss, Y. (2003) Understanding belief propagation and its generalizations.\nIn G. Lakemeyer and B. Nebel (eds.) Exploring Artificial Intelligence in the New\nMillennium, Morgan Kaufmann, San Francisco.\n[2] Mooij, J.M., Kappen, H.J. (2005) On the properties of the Bethe approximation and loopy belief\npropagation on binary networks. J. Stat. Mech., doi:10.1088/1742-5468/2005/11/P11012.\n[3] Welling, M., Teh, Y.W. (2003) Approximate inference in Boltzmann machines. 
Artificial Intelligence 143:19-50.\n[4] Geman, S., Geman, D. (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian\nrestoration of images. IEEE-PAMI 6(6):721-741.\n[5] Huang, K. (1987) Statistical mechanics. Second edition, John Wiley & Sons, New York, Chapter\n13.\n[6] Haykin, S. (1999) Neural networks - a comprehensive foundation. Second edition, Prentice-Hall,\nInc., Chapter 14.\n[7] Koch, C. (1999) Biophysics of computation. Oxford University Press, Inc., New York.\n[8] Douglas, R.J., Mahowald, M., Martin, K.A.C., Stratford, K.J. (1996) The role of synapses in\ncortical computation. Journal of Neurocytology 25:893-911.\n[9] Hopfield, J.J. (1984) Neurons with graded response have collective computational properties like\nthose of two-state neurons. PNAS 81:3088-3092.\n[10] Heskes, T. (2004) On the uniqueness of loopy belief propagation fixed points. Neural Comput.\n16:2379-2413.\n[11] Shon, A.P., Rao, R.P.N. (2005) Implementing belief propagation in neural circuits. Neurocomputing\n65-66:877-884.\n[12] Stoop, R., Stoop, N. (2004) Natural computation measured as a reduction of complexity. Chaos\n14(3):675-679.\n", "award": [], "sourceid": 3153, "authors": [{"given_name": "Thomas", "family_name": "Ott", "institution": null}, {"given_name": "Ruedi", "family_name": "Stoop", "institution": null}]}