Part of Advances in Neural Information Processing Systems 1 (NIPS 1988)

*E. Majani, Ruth Erlanson, Yaser Abu-Mostafa*

We present and rigorously analyze a generalization of the Winner(cid:173) Take-All Network: the K-Winners-Take-All Network. This net(cid:173) work identifies the K largest of a set of N real numbers. The network model used is the continuous Hopfield model.

I - INTRODUCTION

The Winner-Take-All Network is a network which identifies the largest of N real numbers. Winner-Take-All Networks have been developed using various neural networks models (Grossberg-73, Lippman-87, Feldman-82, Lazzaro-89). We present here a generalization of the Winner-Take-All Network: the K-Winners-Take-All (KWTA) Network. The KWTA Network identifies the K largest of N real numbers. The neural network model we use throughout the paper is the continuous Hopfield network model (Hopfield-84). If the states of the N nodes are initialized to the N real numbers, then, if the gain of the sigmoid is large enough, the network converges to the state with K positive real numbers in the positions of the nodes with the K largest initial states, and N - K negative real numbers everywhere else. Consider the following example: N = 4, K = 2. There are 6 = (~) stable states:(++-*)T, (+*+*)T, (+--+)T, ( _* ++)T, (*+*+)T, and (*++*)T. If the initial state of the network is (0.3, -0.4, 0.7, O.l)T, then the network will converge to (Vi,V2,V3,v4)T where Vi> 0, V2 < 0, V3 > 0, V4 < 0 ((+ _ +_)T). In Section II, we define the KWTA Network (connection weights, external inputs). In Section III, we analyze the equilibrium states and in Section IV, we identify all the stable equilibrium states of the KWTA Network. In Section V, we describe the dynamics of the KWTA Network. In Section VI, we give two important examples of the KWTA Network and comment on an alternate implementation of the KWTA Network.

On the K-Winners-Take-All Network

635

II - THE K-WINNERS-TAKE-ALL NETWORK

The continuous Hopfield network model (Hopfield-84) (also known as the Grossberg additive model (Grossberg-88)), is characterized by a system of first order differen(cid:173) tial equations which governs the evolution of the state of the network (i = 1, .. . , N) :

The sigmoid function g(u) is defined by: g(u) = f(G· u), where G > 0 is the gain of the sigmoid, and f(u) is defined by: 1. "f/u, 0 < f'(u) < f'(O) = 1, 2. limu .... +oo f( u) = 1, 3. limu .... -oo f( u) = -l. The KWTA Network is characterized by mutually inhibitory interconnections Taj = -1 for i ¥= j, a self connection Tai = a, (Ial < 1) and'an external input (identical for every node) which depends on the number K of winners desired and the size of the network N : ti = 2K - N. The differential equations for the KWTA Network are therefore:

for all i, Cd~i = -Aui + (a + l)g(ui) - (tg(u j ) - t) ,

Do not remove: This comment is monitored to verify that the site is working properly