{"title": "Learning Winner-take-all Competition Between Groups of Neurons in Lateral Inhibitory Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 350, "page_last": 356, "abstract": null, "full_text": "Learning winner-take-all competition between \ngroups of neurons in lateral inhibitory networks \n\nXiaohui Xie, Richard Hahnloser and H. Sebastian Seung \n\nE25-21O, MIT, Cambridge, MA 02139 \n{xhxielrhlseung}@mit.edu \n\nAbstract \n\nIt has long been known that lateral inhibition in neural networks can lead \nto a winner-take-all competition, so that only a single neuron is active at \na steady state. Here we show how to organize lateral inhibition so that \ngroups of neurons compete to be active. Given a collection of poten(cid:173)\ntially overlapping groups, the inhibitory connectivity is set by a formula \nthat can be interpreted as arising from a simple learning rule. Our analy(cid:173)\nsis demonstrates that such inhibition generally results in winner-take-all \ncompetition between the given groups, with the exception of some de(cid:173)\ngenerate cases. In a broader context, the network serves as a particular \nillustration of the general distinction between permitted and forbidden \nsets, which was introduced recently. From this viewpoint, the computa(cid:173)\ntional function of our network is to store and retrieve memories as per(cid:173)\nmitted sets of coactive neurons. \n\nIn traditional winner-take-all networks, lateral inhibition is used to enforce a localized, \nor \"grandmother cell\" representation in which only a single neuron is active [1, 2, 3, 4]. \nWhen used for unsupervised learning, winner-take-all networks discover representations \nsimilar to those learned by vector quantization [5]. Recently many research efforts have \nfocused on unsupervised learning algorithms for sparsely distributed representations [6, 7]. 
\nThese algorithms lead to networks in which groups of multiple neurons are coactivated to represent an object. Therefore, it is of great interest to find ways of using lateral inhibition to mediate winner-take-all competition between groups of neurons, as this could be useful for learning sparsely distributed representations. \n\nIn this paper, we show how winner-take-all competition between groups of neurons can be learned. Given a collection of potentially overlapping groups, the inhibitory connectivity is set by a simple formula that can be interpreted as arising from an online learning rule. To show that the resulting network functions as advertised, we perform a stability analysis. If the strength of inhibition is sufficiently great, and the group organization satisfies certain conditions, we show that the only sets of neurons that can be coactivated at a stable steady state are the given groups and their subsets. Because of the competition between groups, only one group can be activated at a time. In general, the identity of the winning group depends on the initial conditions of the network dynamics. If the groups are ordered by the aggregate input that each receives, the possible winners are those above a cutoff that is set by inequalities to be specified. \n\n1 Basic definitions \n\nLet m groups of neurons be given, where group membership is specified by the matrix \n\nξ_i^a = { 1 if the ith neuron is in the ath group; 0 otherwise }   (1) \n\nWe will assume that every neuron belongs to at least one group (this condition can be relaxed, but is kept for simplicity), and every group contains at least one neuron. A neuron is allowed to belong to more than one group, so that the groups are potentially overlapping. The inhibitory synaptic connectivity of the network is defined in terms of the group membership, \n\nJ_{ij} = Π_{a=1}^m (1 - ξ_i^a ξ_j^a) = { 0 if i and j both belong to a group; 1 otherwise }   (2) \n\nOne can imagine this pattern of connectivity arising by a simple learning mechanism. Suppose that all elements of J are initialized to be unity, and the groups are presented sequentially as binary vectors ξ^1, ..., ξ^m. The ath group is learned through the update \n\nJ_{ij} → J_{ij} (1 - ξ_i^a ξ_j^a)   (3) \n\nIn other words, if neurons i and j both belong to group a, then the connection between them is removed. After presentation of all m groups, this leads to Eq. (2). At the start of the learning process, the initial state of J corresponds to uniform inhibition, which is known to implement winner-take-all competition between individual neurons. It will be seen that, as inhibitory connections are removed during learning, the competition evolves to mediate competition between groups of neurons rather than individual neurons. \n\nThe dynamics of the network is given by \n\ndx_i/dt + x_i = [ b_i + α x_i - β Σ_j J_{ij} x_j ]^+   (4) \n\nwhere [z]^+ = max{z, 0} denotes rectification, α > 0 is the strength of self-excitation, and β > 0 is the strength of lateral inhibition. Equivalently, the dynamics can be written in matrix-vector form as ẋ + x = [b + Wx]^+, where W = αI - βJ includes both self-excitation and lateral inhibition. The state of the network is specified by the vector x, and the external input by the vector b. A vector v is said to be nonnegative, v ≥ 0, if all of its components are nonnegative. The nonnegative orthant is the set of all nonnegative vectors. It can be shown that any trajectory of Eq. (4) starting in the nonnegative orthant remains there. Therefore, for simplicity we will consider trajectories that are confined to the nonnegative orthant x ≥ 0. However, we will consider input vectors b whose components are of arbitrary sign. \n\n2 Global stability \n\nThe goal of this paper is to characterize the steady state response of the dynamics Eq. (4) to an input b that is constant in time. For this to be a sensible goal, we need some guarantee that the dynamics converges to a steady state, and does not diverge to infinity. This is provided by the following theorem. \n\nTheorem 1 Consider the network Eq. (4). The following statements are equivalent: \n\n1. For any input b, there is a nonempty set of steady states that is globally asymptotically stable, except for initial conditions in a set of measure zero. \n\n2. The strength α of self-excitation is less than one. \n\nProof sketch: \n\n• (2) ⇒ (1): If α < 1, the function (1/2)(1 - α) x^T x + (β/2) x^T J x - b^T x is bounded below and radially unbounded in the nonnegative orthant. Furthermore it is nonincreasing under the dynamics Eq. (4), and constant only at steady states. Therefore it is a Lyapunov function, and its local minima are globally asymptotically stable. \n\n• (1) ⇒ (2): Suppose that (2) is false. If α ≥ 1, it is possible to choose b and an initial condition for x so that only one neuron is active, and the activity of this neuron diverges, so that (1) is contradicted. ■ \n\n3 Relationship between groups and permitted sets \n\nIn this section we characterize the conditions under which the lateral inhibition of Eq. (4) enforces winner-take-all competition between the groups of neurons. That is, the only sets of neurons that can be coactivated at a stable steady state are the groups and their subsets. This is done by performing a linear stability analysis, which allows us to classify active sets using the following definition. \n\nDefinition 1 If a set of neurons can be coactivated by some input at an asymptotically stable steady state, it is called permitted. Otherwise, it is forbidden. \n\nElsewhere we have shown that whether a set is permitted or forbidden depends on the submatrix of synaptic connections between neurons in that set [1]. 
If the largest eigenvalue of the sub-matrix is less than unity, then the set is permitted. Otherwise, it is forbidden. We have also proved that any superset of a forbidden set is forbidden, while any subset of a permitted set is also permitted. \n\nOur goal in constructing the network (4) is to make the groups and their subsets the only permitted sets of the network. To determine whether this is the case, we must answer two questions. First, are all groups and their subsets permitted? Second, are all permitted sets contained in groups? The first question is answered by the following Lemma. \n\nLemma 1 All groups and their subsets are permitted. \n\nProof: If a set is contained in a group, then there is no lateral inhibition between the neurons in the set. Provided that α < 1, all eigenvalues of the sub-matrix are less than unity, and the set is permitted. ■ \n\nThe answer to the second question, whether all permitted sets are contained in groups, is not necessarily affirmative. For example, consider the network defined by the group membership matrix ξ = {(1,1,0), (0,1,1), (1,0,1)}. Since every pair of neurons belongs to some group, there is no lateral inhibition (J = 0), which means that there are no forbidden sets. As a result, (1,1,1) is a permitted set, but obviously it is not contained in any group. \n\nLet's define a spurious permitted set to be one that is not contained in any group. For example, (1,1,1) is a spurious permitted set in the above example. To eliminate all the spurious permitted sets in the network, certain conditions on the group membership matrix ξ have to be satisfied. \n\nDefinition 2 The membership ξ is degenerate if there exists a set of n ≥ 3 neurons that is not contained in any group, but all of its subsets with n - 1 neurons belong to some group. Otherwise, ξ is called nondegenerate. For example, ξ = {(1,1,0), (0,1,1), (1,0,1)} is degenerate. 
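The connectivity formula of Eq. (2), its online learning rule, and the eigenvalue test for permitted sets are easy to check numerically. The following Python sketch is our own illustration (the function names are not from the paper); it builds J from a membership matrix ξ and confirms that the degenerate example above has no lateral inhibition, so the full set is a spurious permitted set:

```python
import numpy as np

def inhibition_matrix(xi):
    """Learn J from group membership xi (m groups x n neurons): start from
    all-ones J and apply J_ij <- J_ij * (1 - xi_i^a xi_j^a) for each group a,
    which yields the product formula of Eq. (2)."""
    n = xi.shape[1]
    J = np.ones((n, n))
    for row in xi:
        J *= 1.0 - np.outer(row, row)  # remove inhibition within this group
    return J

def is_permitted(J, subset, alpha=0.4, beta=1.0):
    """Eigenvalue test: a set is permitted iff the largest eigenvalue of the
    corresponding sub-matrix of W = alpha*I - beta*J is less than one."""
    idx = np.asarray(subset)
    W_sub = alpha * np.eye(len(idx)) - beta * J[np.ix_(idx, idx)]
    return np.max(np.linalg.eigvals(W_sub).real) < 1.0

# Degenerate membership from the text: every pair shares a group, so J = 0
xi = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]], dtype=float)
J = inhibition_matrix(xi)
print(J)                           # all zeros: no lateral inhibition at all
print(is_permitted(J, [0, 1, 2]))  # True: (1,1,1) is a spurious permitted set
```

For the traditional WTA case (ξ the identity matrix) the same functions give J = 11^T - I, and any pair of neurons is then forbidden whenever β > 1 - α.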
\n\nUsing this definition, we can formulate the following theorem. \n\nTheorem 2 The neural dynamics Eq. (4) with α < 1 and β > 1 - α has a spurious permitted set if and only if ξ is degenerate. \n\nBefore we prove this theorem, we will need the following lemma. \n\nLemma 2 If β > 1 - α, any set containing two neurons not in the same group is forbidden under the neural dynamics Eq. (4). \n\nProof sketch: We will start by analyzing a very simple case, where there are two neurons belonging to two different groups. Let the group membership be {(1,0), (0,1)}. In this case, W = {(α, -β), (-β, α)}. This matrix has eigenvectors (1,1) and (1,-1) and eigenvalues α - β and α + β. Since α < 1 for global stability and β > 0 by definition, the (1,1) mode is always stable. But if β > 1 - α, the (1,-1) mode is unstable. This means that it is impossible for the two neurons to be coactivated at a stable steady state. Since any superset of a forbidden set is also forbidden, the general result of the lemma follows. ■ \n\nProof of Theorem 2 (sketch): \n\n• ⇐: If ξ is degenerate, there must exist a set of n ≥ 3 neurons that is not contained in any group, but all of its subsets with n - 1 neurons belong to some group. There is no lateral inhibition between these n neurons, since every pair of neurons belongs to some group. Thus the set containing all n neurons is permitted and spurious. \n\n• ⇒: If there exists a spurious permitted set P, we need to prove that ξ must be degenerate. We will prove this by contradiction and induction. Let's assume ξ is nondegenerate. P must contain at least 2 neurons, since any one-neuron subset is permitted and not spurious. By Lemma 2, these 2 neurons must be contained in some group, or else P is forbidden. Thus P must contain at least 3 neurons to be spurious, and any pair of neurons in P belongs to some group by Lemma 2. 
\nIf P contains at least n neurons and all of its subsets with n - 1 neurons belong to some group, then the set of these n neurons must belong to some group, since otherwise ξ would be degenerate. Thus P must contain at least n + 1 neurons to be spurious, and all of its subsets with n neurons belong to some group. By induction, this implies that P must contain all neurons in the network, in which case P is either forbidden or nonspurious. This contradicts the assumption that P is a spurious permitted set. ■ \n\nFrom Theorem 2, we can easily derive the following result. \n\nCorollary 1 If every group contains some neuron that does not belong to any other group, then there is no spurious permitted set. \n\n4 The potential winners \n\nWe have seen that if ξ is nondegenerate, the active set must be contained in a group, provided that lateral inhibition is strong (β > 1 - α). The group that contains the active set will be called the \"winner\" of the competition between groups. The identity of the winner depends on the input b, and also on the initial conditions of the dynamics. For a given input, we need to characterize which pattern could potentially be the winner. \n\nSuppose that the group inputs B^a = Σ_i [b_i]^+ ξ_i^a are distinct. Without loss of generality, we order the group inputs as B^1 > ... > B^m. Let's denote the largest input as b_max = max_i {b_i} and assume b_max > 0. \n\nTheorem 3 For nonoverlapping groups, the top c groups with the largest group input could end up the winner depending on the initial conditions of the dynamics, where c is determined by the inequalities B^c ≥ (1 - α) β^{-1} b_max > B^{c+1}. \n\nProof sketch: Suppose the ath group is the winner. 
For all neurons not in this group to be inactive, the self-consistency condition should read \n\nΣ_i [b_i]^+ ξ_i^a ≥ (1 - α) β^{-1} max_{j ∉ a} [b_j]^+   (5) \n\nFor a group containing the neuron with the largest input, this condition can always be satisfied. Moreover, this group is always in the top c groups. For groups not containing the neuron with the largest input, this condition can be satisfied if and only if they are in the top c groups. ■ \n\nThe winner-take-all competition described above holds only for the case of strong inhibition β > 1 - α. On the other hand, if β is small, the competition will be weak and may not result in group-winner-take-all. In particular, if β < (1 - α)/λ_max, where λ_max is the largest eigenvalue of -J, then the set of all neurons is permitted. Since every subset of a permitted set is permitted, that means there are no forbidden sets and the network is monostable. Hence, group-winner-take-all does not hold. If (1 - α)/λ_max < β < 1 - α, the network has forbidden sets, but the possibility of spurious permitted sets cannot be excluded. \n\n5 Examples \n\nTraditional winner-take-all network This is a special case of our network with N groups, each containing one of the N neurons. Therefore, the group membership matrix ξ is the identity matrix, and J = 11^T - I, where 1 denotes the vector of all ones. According to Corollary 1, only one neuron is permitted to be active at a stable steady state, provided that β > 1 - α. We refer to the active neuron as the \"winner\" of the competition mediated by the lateral inhibition. \n\nIf we assume that the inputs b_i have distinct values, they can be ordered as b_1 > b_2 > ... > b_N, without loss of generality. According to Theorem 3, any of the neurons 1 to k can be the winner, where k is defined by b_k ≥ (1 - α) β^{-1} b_1 > b_{k+1}. The winner depends on the initial condition of the network dynamics. 
In other words, any neuron whose input is greater than (1 - α)/β times the largest input can end up the winner. \n\nTopographic organization Let the N neurons be organized into a ring, and let every set of d contiguous neurons be a group. d will be called the width. For example, in a network with N = 4 neurons and group width d = 2, the membership matrix is ξ = {(1,1,0,0), (0,1,1,0), (0,0,1,1), (1,0,0,1)}. This ring network is similar to the one proposed by Ben-Yishai et al. in the modeling of orientation tuning in visual cortex [8]. \n\nUnlike the WTA network, where all groups are nonoverlapping, which implies that ξ is always nondegenerate, in the ring network neurons are shared among different groups, and ξ becomes degenerate when the width of the groups is large. To guarantee that all permitted sets are subsets of some group, we have the following corollary, which can be derived from Theorem 2. \n\nFigure 1: Permitted sets of the ring network. The ring network is comprised of 15 neurons with α = 0.4 and β = 1. In panels A and D, the 15 groups are represented by columns. Black refers to active neurons and white refers to inactive neurons. (A) 15 groups of width d = 5. (B) All permitted sets corresponding to the groups in A. (C) The 15 permitted sets in B that have no permitted supersets. They are the same as the groups in A. (D) 15 groups with width d = 6. (E) All permitted sets corresponding to the groups in D. (F) The 20 permitted sets in E that have no permitted supersets. Note that there are 5 spurious permitted sets. \n\nCorollary 2 In the ring network with N neurons, if the width d < N/3 + 1, then there is no spurious permitted set. \n\nFig. (1) shows the permitted sets of a ring network with 15 neurons. 
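Group-winner-take-all in the strong-inhibition regime can be reproduced with a short simulation of the dynamics Eq. (4). The sketch below is our own illustration, not the authors' code; it integrates a 15-neuron ring with width d = 5, α = 0.4 and β = 1 by forward Euler, and the active set at the end is one complete group, selected by the random initial condition:

```python
import numpy as np

N, d, alpha, beta = 15, 5, 0.4, 1.0  # ring size, group width, dynamics parameters

# Membership: group a consists of the d contiguous neurons starting at a (wrapping around)
xi = np.zeros((N, N))
for a in range(N):
    xi[a, [(a + k) % N for k in range(d)]] = 1.0

# Inhibitory connectivity of Eq. (2) and synaptic matrix W = alpha*I - beta*J
J = np.ones((N, N))
for row in xi:
    J *= 1.0 - np.outer(row, row)
W = alpha * np.eye(N) - beta * J

# Forward-Euler integration of dx/dt = -x + [b + W x]^+ with uniform input
rng = np.random.default_rng(0)
b = np.ones(N)
x = rng.random(N)                     # random initial condition picks the winner
for _ in range(20000):
    x += 0.01 * (-x + np.maximum(b + W @ x, 0.0))

active = set(np.flatnonzero(x > 1e-6))
in_some_group = any(active <= set(np.flatnonzero(g)) for g in xi)
print(sorted(active), in_some_group)  # d contiguous winners, contained in one group
```

At the steady state the winning neurons sit at x_i = b_i/(1 - α) = 5/3, the purely analog gain regime discussed in Section 6; lowering β below 1 - α destroys the competition.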
From Corollary 2, we know that if the group width is no larger than 5 neurons, there will not exist any spurious permitted set. In the left three panels of Fig. (1), the group width is 5 and all permitted sets are subsets of these groups. However, when the group width is 6 (right three panels), there exist 5 spurious permitted sets, as shown in panel F. \n\nAs we have mentioned earlier, the lateral inhibition strength β plays a critical role in determining the dynamics of the network. Fig. (2) shows four types of steady states of a ring network corresponding to different values of β. \n\n6 Discussion \n\nWe have shown that it is possible to organize lateral inhibition to mediate a winner-take-all competition between potentially overlapping groups of neurons. Our construction utilizes the distinction between permitted and forbidden sets of neurons. \n\nIf there is strong lateral inhibition between two neurons, then any set that contains them is forbidden (Lemma 2). Neurons that belong to the same group do not have any mutual inhibition, and so they form a permitted set. Because the synaptic connections between neurons in the same group are only composed of self-excitation, their outputs equal their rectified inputs, amplified by the gain factor of 1/(1 - α). Hence the neurons in the winning group operate in a purely analog regime. The coexistence of analog filtering with logical constraints on neural activation represents a form of hybrid analog-digital computation that may be especially appropriate for perceptual tasks. It might be possible to apply a similar method to the problem of data reconstruction using a constrained set of basis vectors. The constraints on the linear combination of basis vectors could for example implement sparsity or nonnegativity constraints. \n\nFigure 2: Lateral inhibition strength β determines the behavior of the network. The network is a ring network of 15 neurons with width d = 5 and where α = 0.4 and input b_i = 1 for all i. These panels show the steady state activities of the 15 neurons. (A) There are no forbidden sets. (B) The marginal state β = (1 - α)/λ_max = 0.874, in which the network forms a continuous attractor. (C) Forbidden sets exist, and so do spurious permitted sets. (D) Group-winner-take-all case, no spurious permitted sets. \n\nAs we have shown in Theorem 2, there are some degenerate cases of overlapping groups, to which our method does not apply. It is an interesting open question whether there exists a general way to translate arbitrary groups of coactive neurons into permitted sets without introducing spurious permitted sets. \n\nIn the past, a great deal of research has been inspired by the idea of storing memories as dynamical attractors in neural networks [9]. Our theory suggests an alternative viewpoint, which is to regard permitted sets as memories latent in the synaptic connections. From this viewpoint, the contribution of the present paper is a method of storing and retrieving memories as permitted sets in neural networks. \n\nReferences \n\n[1] R. Hahnloser, R. Sarpeshkar, M. Mahowald, R. Douglas, and H. S. Seung. Digital selection and analog amplification coexist in an electronic circuit inspired by neocortex. Nature, 405:947-951, 2000. \n\n[2] Shun-Ichi Amari and Michael A. Arbib. Competition and cooperation in neural nets. In J. Metzler, editor, Systems Neuroscience, pages 119-165. Academic Press, 1977. \n\n[3] J. Feng and K. P. Hadeler. Qualitative behaviour of some simple networks. J. Phys. A, 29:5019-5033, 1996. \n\n[4] Richard H. R. Hahnloser. About the piecewise analysis of networks of linear threshold neurons. 
Neural Networks, 11:691-697, 1998. \n\n[5] T. Kohonen. Self-Organization and Associative Memory. Springer-Verlag, Berlin, 3rd edition, 1989. \n\n[6] D. D. Lee and H. S. Seung. Learning the parts of objects by nonnegative matrix factorization. Nature, 401:788-791, 1999. \n\n[7] B. A. Olshausen and D. J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381:607-609, 1996. \n\n[8] R. Ben-Yishai, R. Lev Bar-Or, and H. Sompolinsky. Theory of orientation tuning in visual cortex. Proc. Natl. Acad. Sci. USA, 92:3844-3848, 1995. \n\n[9] J. J. Hopfield. Neurons with graded response have collective properties like those of two-state neurons. Proc. Natl. Acad. Sci. USA, 81:3088-3092, 1984. \n", "award": [], "sourceid": 1829, "authors": [{"given_name": "Xiaohui", "family_name": "Xie", "institution": null}, {"given_name": "Richard", "family_name": "Hahnloser", "institution": null}, {"given_name": "H. Sebastian", "family_name": "Seung", "institution": null}]}