{"title": "On the K-Winners-Take-All Network", "book": "Advances in Neural Information Processing Systems", "page_first": 634, "page_last": 642, "abstract": null, "full_text": "ON THE K-WINNERS-TAKE-ALL NETWORK\n\nE. Majani\n\nJet Propulsion Laboratory\n\nCalifornia Institute of Technology\n\nR. Erlanson, Y. Abu-Mostafa\n\nDepartment of Electrical Engineering\n\nCalifornia Institute of Technology\n\nABSTRACT\n\nWe present and rigorously analyze a generalization of the Winner-Take-All Network: the K-Winners-Take-All Network. This network identifies the K largest of a set of N real numbers. The network model used is the continuous Hopfield model.\n\nI - INTRODUCTION\n\nThe Winner-Take-All Network is a network which identifies the largest of N real numbers. Winner-Take-All Networks have been developed using various neural network models (Grossberg-73, Lippman-87, Feldman-82, Lazzaro-89). We present here a generalization of the Winner-Take-All Network: the K-Winners-Take-All (KWTA) Network. The KWTA Network identifies the K largest of N real numbers. The neural network model we use throughout the paper is the continuous Hopfield network model (Hopfield-84). If the states of the N nodes are initialized to the N real numbers, then, if the gain of the sigmoid is large enough, the network converges to the state with K positive real numbers in the positions of the nodes with the K largest initial states, and N - K negative real numbers everywhere else.\n\nConsider the following example: N = 4, K = 2. There are $6 = \\binom{4}{2}$ stable states: $(+,+,-,-)^T$, $(+,-,+,-)^T$, $(+,-,-,+)^T$, $(-,-,+,+)^T$, $(-,+,-,+)^T$, and $(-,+,+,-)^T$. If the initial state of the network is $(0.3, -0.4, 0.7, 0.1)^T$, then the network will converge to $(v_1, v_2, v_3, v_4)^T$ where $v_1 > 0$, $v_2 < 0$, $v_3 > 0$, $v_4 < 0$, i.e., $(+,-,+,-)^T$.\n\nIn Section II, we define the KWTA Network (connection weights, external inputs). 
In Section III, we analyze the equilibrium states and in Section IV, we identify all the stable equilibrium states of the KWTA Network. In Section V, we describe the dynamics of the KWTA Network. In Section VI, we give two important examples of the KWTA Network and comment on an alternate implementation of the KWTA Network.\n\nII - THE K-WINNERS-TAKE-ALL NETWORK\n\nThe continuous Hopfield network model (Hopfield-84) (also known as the Grossberg additive model (Grossberg-88)) is characterized by a system of first order differential equations which governs the evolution of the state of the network ($i = 1, \\ldots, N$):\n\n$$C \\frac{du_i}{dt} = -\\lambda u_i + \\sum_{j=1}^{N} T_{ij}\\, g(u_j) + t_i.$$\n\nThe sigmoid function $g(u)$ is defined by $g(u) = f(G \\cdot u)$, where $G > 0$ is the gain of the sigmoid, and $f(u)$ satisfies: 1. $\\forall u,\\ 0 < f'(u) \\le f'(0) = 1$, 2. $\\lim_{u \\to +\\infty} f(u) = 1$, 3. $\\lim_{u \\to -\\infty} f(u) = -1$.\n\nThe KWTA Network is characterized by mutually inhibitory interconnections $T_{ij} = -1$ for $i \\ne j$, a self-connection $T_{ii} = a$ ($|a| < 1$), and an external input (identical for every node) which depends on the number K of winners desired and the size of the network N: $t_i = 2K - N$.\n\nThe differential equations for the KWTA Network are therefore:\n\n$$\\text{for all } i, \\quad C \\frac{du_i}{dt} = -\\lambda u_i + (a+1)\\, g(u_i) - \\Big( \\sum_{j=1}^{N} g(u_j) - t \\Big), \\qquad (1)$$\n\nwhere $\\lambda = N - 1 + |a|$, $-1 < a < +1$, and $t = 2K - N$. Let us now study the equilibrium states of the dynamical system defined in (1). We already know from previous work (Hopfield-84) that the network is guaranteed to converge to a stable equilibrium state if the connection matrix (T) is symmetric (and it is here).\n\nIII - EQUILIBRIUM STATES OF THE NETWORK\n\nThe equilibrium states $u^*$ of the KWTA network are defined by\n\n$$\\text{for all } i, \\quad \\frac{du_i}{dt} = 0,$$\n\ni.e.,\n\n$$\\text{for all } i, \\quad g(u_i^*) = \\frac{\\lambda}{a+1}\\, u_i^* + \\frac{\\sum_j g(u_j^*) - (2K - N)}{a+1}. \\qquad (2)$$\n\nLet us now develop necessary conditions for a state $u^*$ to be an equilibrium state of the network.\n\nTheorem 1: For a given equilibrium state $u^*$, every component $u_i^*$ of $u^*$ can be one of at most three distinct values.\n\nProof of Theorem 1.\n\nIf we look at equation (2), we see that the last term of the right-hand side is independent of i; let us denote this term by $H(u^*)$. Therefore, the components $u_i^*$ of the equilibrium state $u^*$ must be solutions of the equation:\n\n$$g(u_i^*) = \\frac{\\lambda}{a+1}\\, u_i^* + H(u^*).$$\n\nSince the sigmoid function $g(u)$ is monotone increasing and $\\lambda/(a+1) > 0$, the sigmoid and the line $\\frac{\\lambda}{a+1}\\, u_i^* + H(u^*)$ intersect in at least one point and at most three (see Figure 1). Note that the constant $H(u^*)$ can be different for different equilibrium states $u^*$.\n\nThe following theorem shows that the sum of the node outputs is constrained to being close to 2K - N, as desired.\n\nTheorem 2: If $u^*$ is an equilibrium state of (1), then we have:\n\n$$(a+1) \\max_{u_i^* < 0} g(u_i^*) < \\sum_j g(u_j^*) - 2K + N < (a+1) \\min_{u_i^* > 0} g(u_i^*). \\qquad (3)$$\n\n[...] (if $u_i^* > 0$) nor too low (if $u_i^* < 0$). Therefore, we must have\n\n$$\\begin{cases} \\sum_j g(u_j^*) - (2K - N) < (a+1)\\, g(u_i^*), & \\text{for all } u_i^* > 0, \\\\\\\\ \\sum_j g(u_j^*) - (2K - N) > (a+1)\\, g(u_i^*), & \\text{for all } u_i^* < 0, \\end{cases}$$\n\nwhich yields (3).\n\nTheorem 1 states that the components of an equilibrium state can only be one of at most three distinct values. We will distinguish between two types of equilibrium states, for the purposes of our analysis: those which have one or more components $u_i^*$ such that $g'(u_i^*) \\ge \\frac{\\lambda}{a+1}$, which we categorize as type I, and those which do not (type II). We will show in the next section that for a gain G large enough, no equilibrium state of type I is stable. 
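As a numerical illustration of Theorem 1 (not part of the original analysis), the sketch below counts the intersections of the sigmoid and the line used in the proof. It assumes f = tanh, which satisfies the three conditions imposed on f in Section II. The function name count_intersections, the grid bounds, and the sample values of G, lambda, a and H are hypothetical choices made only for this example.

```python
import math

def count_intersections(gain, lam, a, h, lo=-2.0, hi=2.0, cells=4001):
    '''Count sign changes of g(u) - (lam/(a+1)*u + h) on a uniform grid.

    Assumes g(u) = tanh(gain*u). An odd number of cells keeps u = 0 off
    the grid, so a root at the origin is seen as a sign change.
    '''
    slope = lam / (a + 1.0)

    def diff(u):
        return math.tanh(gain * u) - (slope * u + h)

    count = 0
    prev = diff(lo)
    for k in range(1, cells + 1):
        cur = diff(lo + (hi - lo) * k / cells)
        if prev * cur < 0:
            count += 1
        prev = cur
    return count

# N = 4 and a = 0 give lambda = 3; the gain G = 50 is well above lambda/(a+1).
three = count_intersections(50.0, 3.0, 0.0, 0.0)  # line through the origin
one = count_intersections(50.0, 3.0, 0.0, 2.0)    # line shifted far upward
print(three, one)
```

With these values the line through the origin meets the sigmoid three times, while the shifted line meets it once, which matches the bounds stated in the proof.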
\n\nIV - ASYMPTOTIC STABILITY OF EQUILIBRIUM STATES\n\nWe will first derive a necessary condition for an equilibrium state of (1) to be asymptotically stable. Then we will find the stable equilibrium states of the KWTA Network.\n\nIV-1. A NECESSARY CONDITION FOR ASYMPTOTIC STABILITY\n\nAn important necessary condition for asymptotic stability is given in the following theorem.\n\nTheorem 3: Given any asymptotically stable equilibrium state $u^*$, at most one of the components $u_i^*$ of $u^*$ may satisfy:\n\n$$g'(u_i^*) \\ge \\frac{\\lambda}{a+1}.$$\n\nProof of Theorem 3.\n\nTheorem 3 is obtained by proving the following three lemmas.\n\nLemma 1: Given any asymptotically stable equilibrium state $u^*$, we always have for all i and j such that $i \\ne j$:\n\n$$\\lambda > a\\, \\frac{g'(u_i^*) + g'(u_j^*)}{2} + \\frac{\\sqrt{a^2 \\big(g'(u_i^*) - g'(u_j^*)\\big)^2 + 4\\, g'(u_i^*)\\, g'(u_j^*)}}{2}. \\qquad (4)$$\n\nProof of Lemma 1.\n\nSystem (1) can be linearized around any equilibrium state $u^*$:\n\n$$C \\frac{d(u - u^*)}{dt} \\approx L(u^*)(u - u^*), \\quad \\text{where } L(u^*) = T \\cdot \\mathrm{diag}\\big(g'(u_1^*), \\ldots, g'(u_N^*)\\big) - \\lambda I.$$\n\nA necessary and sufficient condition for the asymptotic stability of $u^*$ is for $L(u^*)$ to be negative definite. A necessary condition for $L(u^*)$ to be negative definite is for all $2 \\times 2$ matrices $L_{ij}(u^*)$ of the type\n\n$$L_{ij}(u^*) = \\begin{pmatrix} a\\, g'(u_i^*) - \\lambda & -g'(u_j^*) \\\\\\\\ -g'(u_i^*) & a\\, g'(u_j^*) - \\lambda \\end{pmatrix} \\quad (i \\ne j)$$\n\nto be negative definite. This results from an infinitesimal perturbation of components i and j only. Any matrix $L_{ij}(u^*)$ has two real eigenvalues. Since the largest one has to be negative, we obtain:\n\n$$\\frac{1}{2} \\Big( a\\, g'(u_i^*) - \\lambda + a\\, g'(u_j^*) - \\lambda + \\sqrt{a^2 \\big(g'(u_i^*) - g'(u_j^*)\\big)^2 + 4\\, g'(u_i^*)\\, g'(u_j^*)} \\Big) < 0.$$\n\nLemma 2: Equation (4) implies:\n\n$$\\min\\big(g'(u_i^*), g'(u_j^*)\\big) < \\frac{\\lambda}{a+1}. \\qquad (5)$$\n\nProof of Lemma 2. 
\n\nConsider the function h of three variables:\n\n$$h\\big(a, g'(u_i^*), g'(u_j^*)\\big) = a\\, \\frac{g'(u_i^*) + g'(u_j^*)}{2} + \\frac{\\sqrt{a^2 \\big(g'(u_i^*) - g'(u_j^*)\\big)^2 + 4\\, g'(u_i^*)\\, g'(u_j^*)}}{2}.$$\n\nIf we differentiate h with respect to its third variable $g'(u_j^*)$, we obtain:\n\n$$\\frac{\\partial h\\big(a, g'(u_i^*), g'(u_j^*)\\big)}{\\partial g'(u_j^*)} = \\frac{a}{2} + \\frac{a^2 g'(u_j^*) + (2 - a^2)\\, g'(u_i^*)}{2 \\sqrt{a^2 \\big(g'(u_i^*) - g'(u_j^*)\\big)^2 + 4\\, g'(u_i^*)\\, g'(u_j^*)}},$$\n\nwhich can be shown to be positive if and only if $a > -1$. But since $|a| < 1$, then if $g'(u_i^*) \\le g'(u_j^*)$ (without loss of generality), we have:\n\n$$h\\big(a, g'(u_i^*), g'(u_j^*)\\big) \\ge h\\big(a, g'(u_i^*), g'(u_i^*)\\big) = (a+1)\\, g'(u_i^*),$$\n\nwhich, with (4), yields:\n\n$$g'(u_i^*) < \\frac{\\lambda}{a+1},$$\n\nwhich yields Lemma 2.\n\nLemma 3: If for all $i \\ne j$,\n\n$$\\min\\big(g'(u_i^*), g'(u_j^*)\\big) < \\frac{\\lambda}{a+1},$$\n\nthen there can be at most one $u_i^*$ such that:\n\n$$g'(u_i^*) \\ge \\frac{\\lambda}{a+1}.$$\n\nProof of Lemma 3.\n\nLet us assume there exists a pair $(u_i^*, u_j^*)$ with $i \\ne j$ such that $g'(u_i^*) \\ge \\frac{\\lambda}{a+1}$ and $g'(u_j^*) \\ge \\frac{\\lambda}{a+1}$; then (5) would be violated.\n\nIV-2. STABLE EQUILIBRIUM STATES\n\nFrom Theorem 3, all stable equilibrium states of type I have exactly one component $\\gamma$ (at least one and at most one) such that $g'(\\gamma) \\ge \\frac{\\lambda}{a+1}$. Let $N_+$ be the number of components $\\alpha$ with $g'(\\alpha) < \\frac{\\lambda}{a+1}$ and $\\alpha > 0$, and let $N_-$ be the number of components $\\beta$ with $g'(\\beta) < \\frac{\\lambda}{a+1}$ and $\\beta < 0$ (note that $N_+ + N_- + 1 = N$). For a large enough gain G, $g(\\alpha)$ and $g(\\beta)$ can be made arbitrarily close to +1 and -1 respectively. Using Theorem 2, and assuming a large enough gain, we obtain: $-1 < N_+ - K < 0$. $N_+$ and K being integers, there is therefore no stable equilibrium state of type I.\n\nFor the equilibrium states of type II, we have for all i, $u_i^* = \\alpha\\ (> 0)$ or $\\beta\\ (< 0)$, where $g'(\\alpha) < \\frac{\\lambda}{a+1}$ and $g'(\\beta) < \\frac{\\lambda}{a+1}$. For a large enough gain, $g(\\alpha)$ and $g(\\beta)$ can be made arbitrarily close to +1 and -1 respectively. 
Using Theorem 2 and assuming a large enough gain, we obtain: $-(a+1) < 2(N_+ - K) < (a+1)$, which yields $N_+ = K$.\n\nLet us now summarize our results in the following theorem:\n\nTheorem 4: For a large enough gain, the only possible asymptotically stable equilibrium states $u^*$ of (1) must have K components equal to $\\alpha > 0$ and N - K components equal to $\\beta < 0$, with\n\n$$\\begin{cases} g(\\alpha) = \\dfrac{\\lambda}{a+1}\\, \\alpha + \\dfrac{K\\big(g(\\alpha) - g(\\beta) - 2\\big) + N\\big(1 + g(\\beta)\\big)}{a+1}, \\\\\\\\ g(\\beta) = \\dfrac{\\lambda}{a+1}\\, \\beta + \\dfrac{K\\big(g(\\alpha) - g(\\beta) - 2\\big) + N\\big(1 + g(\\beta)\\big)}{a+1}. \\end{cases} \\qquad (7)$$\n\nSince we are guaranteed to have at least one stable equilibrium state (Hopfield-84), and since any state whose components are a permutation of the components of a stable equilibrium state is clearly a stable equilibrium state, we have:\n\nTheorem 5: There exist at least $\\binom{N}{K}$ stable equilibrium states as defined in Theorem 4. They correspond to the $\\binom{N}{K}$ different states obtained by the N! permutations of one stable state with K positive components and N - K negative components.\n\nV - THE DYNAMICS OF THE KWTA NETWORK\n\nNow that we know the characteristics of the stable equilibrium states of the KWTA Network, we need to show that the KWTA Network will converge to the stable state which has $\\alpha > 0$ in the positions of the K largest initial components. This can be seen clearly by observing that for all $i \\ne j$:\n\n$$C \\frac{d(u_i - u_j)}{dt} = -\\lambda (u_i - u_j) + (a+1)\\big(g(u_i) - g(u_j)\\big).$$\n\nIf at some time T, $u_i(T) = u_j(T)$, then one can show that $\\forall t,\\ u_i(t) = u_j(t)$. Therefore, for all $i \\ne j$, $u_i(t) - u_j(t)$ always keeps the same sign. This leads to the following theorem.\n\nTheorem 6: (Preservation of order) For all nodes $i \\ne j$,\n\n$$u_i(0) > u_j(0) \\implies \\forall t,\\ u_i(t) > u_j(t).$$\n\nWe shall now summarize the results of the last two sections. 
\n\nTheorem 7: Given an initial state $u(0)$ and a gain G large enough, the KWTA Network will converge to a stable equilibrium state with K components equal to a positive real number ($\\alpha > 0$) in the positions of the K largest initial components, and N - K components equal to a negative real number ($\\beta < 0$) in all other N - K positions.\n\nThis can be derived directly from Theorems 4, 5 and 6: we know the form of all stable equilibrium states, the order of the initial node states is preserved through time, and there is guaranteed convergence to an equilibrium state.\n\nVI - DISCUSSION\n\nThe well-known Winner-Take-All Network is obtained by setting K to 1.\n\nThe N/2-Winners-Take-All Network, given a set of N real numbers, identifies which numbers are above or below the median. This task is slightly more complex computationally ($\\approx O(N \\log N)$) than that of the Winner-Take-All ($\\approx O(N)$). The number of stable states is much larger,\n\n$$\\binom{N}{N/2} \\approx 2^N \\sqrt{\\frac{2}{\\pi N}},$$\n\ni.e., asymptotically exponential in the size of the network.\n\nAlthough the number of connection weights is $N^2$, there exists an alternate implementation of the KWTA Network which has O(N) connections (see Figure 2). The sum of the outputs of all nodes and the external input is computed, then negated and fed back to all the nodes. In addition, a positive self-connection (a + 1) is needed at every node.\n\nThe analysis was done for a \"large enough\" gain G. In practice, the critical value of G is $\\frac{\\lambda}{a+1}$ for the N/2-Winners-Take-All Network, and slightly higher for $K \\ne N/2$. Also, the analysis was done for an arbitrary value of the self-connection weight a ($|a| < 1$). In general, if a is close to +1, this will lead to faster convergence and a smaller value of the critical gain than if a is close to -1. 
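As an illustration of Theorem 7 and of the shared-sum implementation discussed above, the following sketch integrates equation (1) for the N = 4, K = 2 example from the introduction. It is not the authors' code: it assumes f = tanh, a forward-Euler integrator, C = 1, a = 0, and a gain of 50, and the function name simulate_kwta and the step size are hypothetical. The sum of the outputs is computed once per step and shared by all nodes, so each step costs O(N).

```python
import math

def simulate_kwta(u0, k, a=0.0, gain=50.0, c=1.0, dt=1e-3, steps=20000):
    '''Forward-Euler integration of equation (1), assuming f = tanh.'''
    n = len(u0)
    lam = n - 1 + abs(a)   # lambda = N - 1 + |a|
    t = 2 * k - n          # external input t = 2K - N
    u = list(u0)
    for _ in range(steps):
        g = [math.tanh(gain * x) for x in u]
        total = sum(g)     # shared feedback term, computed once per step
        u = [x + (dt / c) * (-lam * x + (a + 1.0) * gx - (total - t))
             for x, gx in zip(u, g)]
    return u

# Initial state taken from the example in the introduction.
u_final = simulate_kwta([0.3, -0.4, 0.7, 0.1], k=2)
winners = sorted(i for i, x in enumerate(u_final) if x > 0)
print(winners)  # indices of the nodes that end up positive
```

Under these assumptions the two nodes with the largest initial states (indices 0 and 2) end up positive and the other two negative, in agreement with the sign pattern stated in the introduction.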
\n\nVII - CONCLUSION\n\nThe KWTA Network lets all nodes compete until the desired number of winners (K) is obtained. The competition is obtained by using mutual inhibition between all nodes, while the number of winners K is selected by setting all external inputs to 2K - N. This paper illustrates the capability of the continuous Hopfield Network to solve exactly an interesting decision problem, i.e., identifying the K largest of N real numbers.\n\nAcknowledgments\n\nThe authors would like to thank John Hopfield and Stephen DeWeerth from the California Institute of Technology and Marvin Perlman from the Jet Propulsion Laboratory for insightful discussions about material presented in this paper. Part of the research described in this paper was performed at the Jet Propulsion Laboratory under contract with NASA.\n\nReferences\n\nJ.A. Feldman, D.H. Ballard, \"Connectionist Models and their properties,\" Cognitive Science, Vol. 6, pp. 205-254, 1982\n\nS. Grossberg, \"Contour Enhancement, Short Term Memory, and Constancies in Reverberating Neural Networks,\" Studies in Applied Mathematics, Vol. LII (52), No. 3, pp. 213-257, September 1973\n\nS. Grossberg, \"Non-Linear Neural Networks: Principles, Mechanisms, and Architectures,\" Neural Networks, Vol. 1, pp. 17-61, 1988\n\nJ.J. Hopfield, \"Neurons with graded response have collective computational properties like those of two-state neurons,\" Proc. Natl. Acad. Sci. USA, Vol. 81, pp. 3088-3092, May 1984\n\nJ. Lazzaro, S. Ryckebusch, M.A. Mahowald, C.A. Mead, \"Winner-Take-All Networks of O(N) Complexity,\" in this volume, 1989\n\nR.P. Lippman, B. Gold, M.L. Malpass, \"A Comparison of Hamming and Hopfield Neural Nets for Pattern Classification,\" MIT Lincoln Lab. Tech. Rep. 
TR-769, 21 May 1987\n\nFigure 1: Intersection of sigmoid and line.\n\nFigure 2: An Implementation of the KWTA Network.", "award": [], "sourceid": 157, "authors": [{"given_name": "E.", "family_name": "Majani", "institution": null}, {"given_name": "Ruth", "family_name": "Erlanson", "institution": null}, {"given_name": "Yaser", "family_name": "Abu-Mostafa", "institution": null}]}