{"title": "Permitted and Forbidden Sets in Symmetric Threshold-Linear Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 217, "page_last": 223, "abstract": null, "full_text": "Permitted and Forbidden Sets in Symmetric Threshold-Linear Networks

Richard H.R. Hahnloser and H. Sebastian Seung
Dept. of Brain & Cog. Sci., MIT, Cambridge, MA 02139 USA
rh@ai.mit.edu, seung@mit.edu

Abstract

Ascribing computational principles to neural feedback circuits is an important problem in theoretical neuroscience. We study symmetric threshold-linear networks and derive stability results that go beyond the insights that can be gained from Lyapunov theory or energy functions. By applying linear analysis to subnetworks composed of coactive neurons, we determine the stability of potential steady states. We find that stability depends on two types of eigenmodes. One type determines global stability and the other type determines whether or not multistability is possible. We can prove the equivalence of our stability criteria with criteria taken from quadratic programming. Also, we show that there are permitted sets of neurons that can be coactive at a steady state and forbidden sets that cannot. Permitted sets are clustered in the sense that subsets of permitted sets are permitted and supersets of forbidden sets are forbidden. By viewing permitted sets as memories stored in the synaptic connections, we can provide a formulation of long-term memory that is more general than the traditional perspective of fixed point attractor networks.

A Lyapunov function can be used to prove that a given set of differential equations is convergent. For example, if a neural network possesses a Lyapunov function, then for almost any initial condition, the outputs of the neurons converge to a stable steady state.
In the past, this stability property was used to construct attractor networks that associatively recall memorized patterns. Lyapunov theory applies mainly to symmetric networks in which neurons have monotonic activation functions [1, 2]. Here we show that the restriction of activation functions to threshold-linear ones is not a mere limitation, but can yield new insights into the computational behavior of recurrent networks (for completeness, see also [3]).

We present three main theorems about the neural responses to constant inputs. The first theorem provides necessary and sufficient conditions on the synaptic weight matrix for the existence of a globally asymptotically stable set of fixed points. These conditions can be expressed in terms of copositivity, a concept from quadratic programming and linear complementarity theory. Alternatively, they can be expressed in terms of certain eigenvalues and eigenvectors of submatrices of the synaptic weight matrix, making a connection to linear systems theory. The theorem guarantees that the network will produce a steady state response to any constant input. We regard this response as the computational output of the network, and its characterization is the topic of the second and third theorems.

In the second theorem, we introduce the idea of permitted and forbidden sets. Under certain conditions on the synaptic weight matrix, we show that there exist sets of neurons that are "forbidden" by the recurrent synaptic connections from being coactivated at a stable steady state, no matter what input is applied. Other sets are "permitted," in the sense that they can be coactivated for some input. The same conditions on the synaptic weight matrix also lead to conditional multistability, meaning that there exists an input for which there is more than one stable steady state.
In other words, forbidden sets and conditional multistability are inseparable concepts.

The existence of permitted and forbidden sets suggests a new way of thinking about memory in neural networks. When an input is applied, the network must select a set of active neurons, and this selection is constrained to be one of the permitted sets. Therefore the permitted sets can be regarded as memories stored in the synaptic connections.

Our third theorem states that there are constraints on the groups of permitted and forbidden sets that can be stored by a network. No matter which learning algorithm is used to store memories, active neurons cannot arbitrarily be divided into permitted and forbidden sets, because subsets of permitted sets have to be permitted and supersets of forbidden sets have to be forbidden.

1 Basic definitions

Our theory is applicable to the network dynamics

dx_i/dt + x_i = [b_i + Σ_j W_ij x_j]^+    (1)

where [u]^+ = max{u, 0} is a rectification nonlinearity and the synaptic weight matrix is symmetric, W_ij = W_ji. The dynamics can also be written in a more compact matrix-vector form as ẋ + x = [b + Wx]^+. The state of the network is x. An input to the network is an arbitrary vector b. An output of the network is a steady state x̄ in response to b. The existence of outputs and their relationship to the input are determined by the synaptic weight matrix W.

A vector v is said to be nonnegative, v ≥ 0, if all of its components are nonnegative. The nonnegative orthant {v : v ≥ 0} is the set of all nonnegative vectors. It can be shown that any trajectory starting in the nonnegative orthant remains in the nonnegative orthant. Therefore, for simplicity we will consider initial conditions that are confined to the nonnegative orthant x ≥ 0.

2 Global asymptotic stability

Definition 1 A steady state x̄ is stable if for all initial conditions sufficiently close to x̄, the state trajectory remains close to x̄ for all later times. A steady state is asymptotically stable if for all initial conditions sufficiently close to x̄, the state trajectory converges to x̄. A set of steady states is globally asymptotically stable if from almost all initial conditions, state trajectories converge to one of the steady states. Exceptions are of measure zero.

Definition 2 A principal submatrix A of a square matrix B is a square matrix that is constructed by deleting a certain set of rows and the corresponding columns of B.

The following theorem establishes necessary and sufficient conditions on W for global asymptotic stability.

Theorem 1 If W is symmetric, then the following conditions are equivalent:

1. All nonnegative eigenvectors of all principal submatrices of I - W have positive eigenvalues.

2. The matrix I - W is copositive. That is, x^T (I - W) x > 0 for all nonnegative x, except x = 0.

3. For all b, the network has a nonempty set of steady states that are globally asymptotically stable.

Proof sketch:

• (1) ⇒ (2). Let v* be the minimum of v^T (I - W) v over nonnegative v on the unit sphere. If (2) is false, the minimum value is less than or equal to zero. It follows from Lagrange multiplier methods that the nonzero elements of v* comprise a nonnegative eigenvector of the corresponding principal submatrix of W with eigenvalue greater than or equal to unity.

• (2) ⇒ (3). By the copositivity of I - W, the function L = (1/2) x^T (I - W) x - b^T x is lower bounded and radially unbounded. It is also nonincreasing under the network dynamics in the nonnegative orthant, and constant only at steady states. By the Lyapunov stability theorem, the stable steady states are globally asymptotically stable.
In the language of optimization theory, the network dynamics converge to a local minimum of L subject to the nonnegativity constraint x ≥ 0.

• (3) ⇒ (1). Suppose that (1) is false. Then there exists a nonnegative eigenvector of a principal submatrix of W with eigenvalue greater than or equal to unity. This can be used to construct an unbounded trajectory of the dynamics. ∎

The meaning of these stability conditions is best appreciated by comparing with the analogous conditions for the purely linear network obtained by dropping the rectification from (1). In a linear network, all eigenvalues of W would have to be smaller than unity to ensure asymptotic stability. Here only nonnegative eigenvectors are able to grow without bound, due to the rectification, so that only their eigenvalues must be less than unity. All principal submatrices of W must be considered, because different sets of feedback connections are active, depending on the set of neurons that are above threshold. In a linear network, I - W would have to be positive definite to ensure asymptotic stability, but because of the rectification, here this condition is replaced by the weaker condition of copositivity.

The conditions of Theorem 1 for global asymptotic stability depend only on W, but not on b. On the other hand, steady states do depend on b. The next lemma says that the mapping from input to output is surjective.

Lemma 1 For any nonnegative vector v ≥ 0 there exists an input b such that v is a steady state of equation 1 with input b.

Proof: Define c = v - ΣWΣv, where Σ = diag(σ_1, ..., σ_N) and σ_i = 1 if v_i > 0 and σ_i = 0 if v_i = 0. Choose b_i = c_i for v_i > 0 and b_i = -1 - (ΣWΣv)_i for v_i = 0. ∎

This lemma states that any nonnegative vector can be realized as a fixed point. Sometimes this fixed point is stable, such as in networks subject to Theorem 1 in which only a single neuron is active. Indeed, the principal submatrix of I - W corresponding to a single active neuron is a diagonal element, which according to condition (1) of Theorem 1 must be positive. Hence it is always possible to activate only a single neuron at an asymptotically stable fixed point. However, as will become clear from the following theorem, not all nonnegative vectors can be realized as asymptotically stable fixed points.

3 Forbidden and permitted sets

The following characterizations of stable steady states are based on the interlacing theorem [4]. This theorem says that if A is an (n-1) by (n-1) principal submatrix of an n by n symmetric matrix B, then the eigenvalues of A fall in between the eigenvalues of B. In particular, the largest eigenvalue of A is always smaller than the largest eigenvalue of B.

Definition 3 A set of neurons is permitted if the neurons can be coactivated at an asymptotically stable steady state for some input b. On the other hand, a set of neurons is forbidden if they cannot be coactivated at an asymptotically stable steady state, no matter what the input b.

Alternatively, we might have defined a permitted set as a set for which the corresponding square submatrix of I - W has only positive eigenvalues. And, similarly, a forbidden set could be defined as a set for which there is at least one non-positive eigenvalue. It follows from Theorem 1 that if the matrix I - W is copositive, then the eigenvectors corresponding to non-positive eigenvalues of forbidden sets have to have both positive and non-positive components.

Theorem 2 If the matrix I - W is copositive, then the following statements are equivalent:

1. The matrix I - W is not positive definite.

2. There exists a forbidden set.

3. The network is conditionally multistable. That is, there exists an input b such that there is more than one stable steady state.
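Before the proof sketch, the equivalence can be illustrated with a minimal numerical sketch. The two-neuron network below is a hypothetical example of our own choosing (it does not appear in the paper): mutual inhibition of strength 1.5 makes I - W copositive (all of its entries are positive, so x^T (I - W) x > 0 for nonnegative x ≠ 0) but not positive definite.

```python
import numpy as np

# Hypothetical two-neuron example (not from the paper): mutual inhibition
# of strength 1.5, so I - W is copositive but not positive definite.
W = np.array([[0.0, -1.5],
              [-1.5, 0.0]])
M = np.eye(2) - W

def permitted(subset):
    # A set is permitted iff the principal submatrix of I - W restricted
    # to that set has only positive eigenvalues.
    sub = M[np.ix_(subset, subset)]
    return bool(np.all(np.linalg.eigvalsh(sub) > 0))

# Statement 1: I - W is not positive definite (eigenvalues -0.5 and 2.5).
print(np.linalg.eigvalsh(M))
# Statement 2: hence some set is forbidden -- here the full set {0, 1},
# while each singleton is permitted.
print(permitted([0]), permitted([1]), permitted([0, 1]))
# Statement 3: conditional multistability. With input b = (1, 1), both
# x = (1, 0) and x = (0, 1) satisfy x = [b + W x]^+, and each activates
# only a single neuron, hence each is asymptotically stable.
b = np.array([1.0, 1.0])
for x in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    assert np.allclose(x, np.maximum(b + W @ x, 0.0))
```

With b = (1, 1) this network behaves as a two-neuron winner-take-all: whichever neuron starts ahead suppresses the other, so the selected steady state depends on the initial condition.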
Proof sketch:

• (1) ⇒ (2). If I - W is not positive definite, then there can be no asymptotically stable steady state in which all neurons are active, i.e., the set of all neurons is forbidden.

• (2) ⇒ (3). Denote the forbidden set with k active neurons by Σ. Without loss of generality, assume that the principal submatrix of I - W corresponding to Σ has k - 1 positive eigenvalues and only one non-positive eigenvalue (by virtue of the interlacing theorem and the fact that the diagonal elements of I - W must be positive, there is always a subset of Σ for which this is true). By choosing b_i > 0 for neurons i belonging to Σ and b_j ≪ 0 for neurons j not belonging to Σ, the quadratic Lyapunov function L defined in Theorem 1 forms a saddle in the nonnegative orthant defined by Σ. The saddle point is the point where L restricted to the hyperplane defined by the k - 1 positive eigenvalues reaches its minimum. But because neurons can be initialized to lower values of L on either side of the hyperplane, and because L is non-increasing along trajectories, there is no way trajectories can cross the hyperplane. In conclusion, we have constructed an input b for which the network is multistable.

• (3) ⇒ (1). Suppose that (1) is false. Then for all b the Lyapunov function L is convex and so has only a single local minimum in the convex domain x ≥ 0. This local minimum is also the global minimum. The dynamics must converge to this minimum. ∎

If I - W is positive definite, then a symmetric threshold-linear network has a unique steady state. This has been shown previously [5]. The next theorem is an expansion of this result, stating an equivalent condition using the concept of permitted sets.

Theorem 3 If W is symmetric, then the following conditions are equivalent:

1. The matrix I - W is positive definite.

2. All sets are permitted.

3. For all b there is a unique steady state, and it is stable.

Proof:

• (1) ⇒ (2). If I - W is positive definite, then it is copositive. Hence (1) in Theorem 2 is false, and so (2) in Theorem 2 is false, i.e., all sets are permitted.

• (2) ⇒ (1). Suppose (1) is false. Then the set of all neurons must be forbidden, so not all sets are permitted.

• (1) ⇔ (3). See [5]. ∎

The following theorem characterizes the forbidden and the permitted sets.

Theorem 4 Any subset of a permitted set is permitted. Any superset of a forbidden set is forbidden.

Proof: According to the interlacing theorem, if the smallest eigenvalue of a symmetric matrix is positive, then so are the smallest eigenvalues of all its principal submatrices. And if the smallest eigenvalue of a principal submatrix is negative, then so is the smallest eigenvalue of the original matrix. ∎

4 An example - the ring network

A symmetric threshold-linear network with local excitation and longer-range inhibition has been studied in the past as a model for how simple cells in primary visual cortex obtain their orientation tuning to visual stimulation [6, 7]. Inspired by these results, we have recently built an electronic circuit containing a ring network, using analog VLSI technology [3]. We have argued that the fixed tuning width of the neurons in the network arises because active sets consisting of more than a fixed number of contiguous neurons are forbidden. Here we give a more detailed account of this fact and provide a surprising result about the existence of some spurious permitted sets.

Let the synaptic matrix of a 10-neuron ring network be translationally invariant.
The connection between neurons i and j is given by W_ij = -β + α_0 δ_ij + α_1 (δ_{i,j+1} + δ_{i+1,j}) + α_2 (δ_{i,j+2} + δ_{i+2,j}), where β quantifies global inhibition, α_0 self-excitation, α_1 first-neighbor lateral excitation and α_2 second-neighbor lateral excitation. In Figure 1 we have numerically computed the permitted sets of this network, with the parameters taken from [3]: α_0 = 0, α_1 = 1.1, α_2 = 1, β = 0.55. The permitted sets were determined by diagonalizing the 2^10 square submatrices of I - W and by classifying the eigenvalues corresponding to nonnegative eigenvectors. Figure 1 shows the resulting parent permitted sets (those that have no permitted supersets). Consistent with the finding that such ring networks can explain contrast-invariant tuning of V1 cells and multiplicative response modulation of parietal cells, we found that there are no permitted sets that consist of more than 5 contiguous active neurons. However, as can be seen, there are many non-contiguous permitted sets that could in principle be activated by exciting neurons in white and strongly inhibiting neurons in black.

Because the activation of the spurious permitted sets requires highly specific input (inhibition of high spatial frequency), it can be argued that the presence of the spurious permitted sets is not relevant for the normal operation of the ring network, where inputs are typically tuned and excitatory (such as inputs from LGN to primary visual cortex).

Figure 1: Left: Output of a ring network of 10 neurons to uniform input (random initial condition). Right: The 9 parent permitted sets (x-axis: neuron number, y-axis: set number). White means that a neuron belongs to a set and black means that it does not. Left-right and translation symmetric parent permitted sets of the ones shown have been excluded. The first parent permitted set (first row from the bottom) corresponds to the output on the left.

5 Discussion

We have shown that pattern memorization in threshold-linear networks can be viewed in terms of permitted sets of neurons, i.e., sets of neurons that can be coactive at a steady state. According to this definition, the memories are stored by the synaptic weights, independently of the inputs. Hence, this concept of memory does not suffer from input-dependence, as would be the case for a definition of memory based on the fixed points of the dynamics.

Pattern retrieval is strongly constrained by the input. A typical input will not allow for the retrieval of arbitrary stored permitted sets. This comes from the fact that multistability is not just dependent on the existence of forbidden sets, but also on the input (Theorem 2). For example, in the ring network, positive input will always retrieve permitted sets consisting of a group of contiguous neurons, but not any of the spurious permitted sets (Figure 1). Generally, multistability in the ring network is only possible when more than a single neuron is excited.

Notice that threshold-linear networks can behave as traditional attractor networks when the inputs are represented as initial conditions of the dynamics. For example, by fixing b = 1 and initializing a copositive network with some input, the permitted sets unequivocally determine the stable fixed points. Thus, in this case, the notion of permitted sets is no different from fixed point attractors. However, the hierarchical grouping of permitted sets (Theorem 4) becomes irrelevant, since there can be only one attractive fixed point per hierarchical group defined by a parent permitted set.
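The permitted-set computation for the ring network of Section 4 can be reproduced with a short script. This is a sketch under stated assumptions: we take the ring to be circular (the Kronecker deltas in the weight formula wrap around, with indices modulo 10) and use the parameters quoted from [3], α_0 = 0, α_1 = 1.1, α_2 = 1, β = 0.55.

```python
import numpy as np
from itertools import combinations

# Translationally invariant ring of N = 10 neurons; assumption: circular
# boundary conditions, so connections depend only on the circular distance.
N, a0, a1, a2, beta = 10, 0.0, 1.1, 1.0, 0.55
W = np.empty((N, N))
for i in range(N):
    for j in range(N):
        d = min((i - j) % N, (j - i) % N)          # circular distance
        W[i, j] = -beta + {0: a0, 1: a1, 2: a2}.get(d, 0.0)
M = np.eye(N) - W

def permitted(subset):
    # Permitted iff the principal submatrix of I - W on the subset has
    # only positive eigenvalues.
    idx = list(subset)
    return bool(np.all(np.linalg.eigvalsh(M[np.ix_(idx, idx)]) > 0))

# Diagonalize all 2^10 - 1 nonempty principal submatrices, as in the paper.
perm = {frozenset(s) for k in range(1, N + 1)
        for s in combinations(range(N), k) if permitted(s)}

# Theorem 4: every subset of a permitted set is permitted.
assert all(len(s) == 1 or s - {i} in perm for s in perm for i in s)

# Longest permitted run of contiguous neurons (the paper reports 5):
print(max(k for k in range(1, N + 1) if permitted(range(k))))
```

By Theorem 4 the contiguous runs form a nested family, so the printed value is the tuning width discussed in the text; parent permitted sets are those elements of `perm` with no permitted superset.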
\n\nThe fact that no permitted set can have a forbidden subset represents a constraint \non the possible computations of symmetric networks. However, this constraint does \nnot have to be viewed as an undesired limitation. On the contrary, being aware of \nthis constraint may lead to a deeper understanding of learning algorithms and rep(cid:173)\nresentations for constraint satisfaction problems. We are reminded of the history of \nperceptrons, where the insight that they can only solve linearly separable classifica(cid:173)\ntion problems led to the invention of multilayer perceptrons and backpropagation. \nIn a similar way, grouping problems that do not obey the natural hierarchy inher(cid:173)\nent in symmetric networks, might necessitate the introduction of hidden neurons to \nrealize the right geometry. For the interested reader, see also [8] for a simple pro(cid:173)\ncedure of how to store a given family of possibly overlapping patterns as permitted \nsets. \n\nReferences \n[1] J. J. Hopfield. Neurons with graded response have collective properties like those \n\nof two-state neurons. Proc. Natl. Acad. Sci. USA, 81:3088- 3092, 1984. \n\n[2] M.A. Cohen and S. Grossberg. Absolute stability of global pattern formation and \nparallel memory storage by competitive neural networks. IEEE Transactions on \nSystems, Man and Cybernetics, 13:288- 307,1983. \n\n[3] Richard H.R. Hahnloser, Rahul Sarpeshkar, Misha Mahowald, Rodney J . Dou(cid:173)\nglas, and Sebastian Seung. Digital selection and ananlog amplification coexist \nin a silicon circuit inspired by cortex. Nature, 405:947- 51, 2000. \n\n[4] R.A. Horn and C.R. Johnson. Matrix analysis. Cambridge University Press, \n\n1985. \n\n[5] J. Feng and K.P. Hadeler. Qualitative behaviour of some simple networks. J. \n\nPhys. A:, 29:5019- 5033, 1996. \n\n[6] R. Ben-Yishai, R. Lev Bar-Or, and H. Sompolinsky. Theory of orientation tuning \n\nin visual cortex. Proc. Natl. Acad. Sci. USA, 92:3844- 3848, 1995. \n\n[7] R.J. 
Douglas, C. Koch, M.A. Mahowald, K.A.C. Martin, and H. Suarez. Recur(cid:173)\n\nrent excitation in neocortical circuits. Science, 269:981- 985, 1995. \n\n[8] Xie Xiaohui, Richard H.R. Hahnloser, and Sebastian Seung. Learning winner(cid:173)\ntake-all competition between groups of neurons in lateral inhibitory networks. \nIn Proceedings of NIPS2001 - Neural Information Processing Systems: Natural \nand Synthetic, 2001. \n\n\f", "award": [], "sourceid": 1793, "authors": [{"given_name": "Richard", "family_name": "Hahnloser", "institution": null}, {"given_name": "H. Sebastian", "family_name": "Seung", "institution": null}]}