ON THE K-WINNERS-TAKE-ALL NETWORK 
E. Majani 
Jet Propulsion Laboratory 
California Institute of Technology 
R. Erlanson, Y. Abu-Mostafa 
Department of Electrical Engineering 
California Institute of Technology 
ABSTRACT 
We present and rigorously analyze a generalization of the Winner- 
Take-All Network: the K-Winners-Take-All Network. This net- 
work identifies the K largest of a set of N real numbers. The 
network model used is the continuous Hopfield model. 
I- INTRODUCTION 
The Winner-Take-All Network is a network which identifies the largest of N real 
numbers. Winner-Take-All Networks have been developed using various neural
network models (Grossberg-73, Lippman-87, Feldman-82, Lazzaro-89). We present
here a generalization of the Winner-Take-All Network: the K-Winners-Take-All 
(KWTA) Network. The KWTA Network identifies the K largest of N real numbers. 
The neural network model we use throughout the paper is the continuous Hopfield 
network model (Hopfield-84). If the states of the N nodes are initialized to the N 
real numbers, then, if the gain of the sigmoid is large enough, the network converges
to the state with K positive real numbers in the positions of the nodes with the K 
largest initial states, and N - K negative real numbers everywhere else. 
Consider the following example: $N = 4$, $K = 2$. There are $6 = \binom{4}{2}$ stable
states: $(+,+,-,-)^T$, $(+,-,+,-)^T$, $(+,-,-,+)^T$, $(-,-,+,+)^T$, $(-,+,-,+)^T$, and $(-,+,+,-)^T$.
If the initial state of the network is $(0.3, -0.4, 0.7, 0.1)^T$, then the network will
converge to $(v_1, v_2, v_3, v_4)^T$ where $v_1 > 0$, $v_2 < 0$, $v_3 > 0$, $v_4 < 0$, i.e., $(+,-,+,-)^T$.
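As an illustration of this example (not part of the original analysis), the following Python sketch lists the sign patterns with exactly K positive entries and computes the pattern that the claim above predicts for a given initial state. The function names `stable_patterns` and `predicted_pattern` are hypothetical and chosen only for this sketch.

```python
from itertools import combinations

def stable_patterns(n, k):
    """All sign patterns of length n with exactly k '+' entries."""
    patterns = []
    for winners in combinations(range(n), k):
        patterns.append("".join("+" if i in winners else "-" for i in range(n)))
    return patterns

def predicted_pattern(initial_state, k):
    """Pattern predicted by the text: '+' at the k largest initial values."""
    n = len(initial_state)
    order = sorted(range(n), key=lambda i: initial_state[i], reverse=True)
    winners = set(order[:k])
    return "".join("+" if i in winners else "-" for i in range(n))

print(stable_patterns(4, 2))
print(predicted_pattern([0.3, -0.4, 0.7, 0.1], 2))
```

The sketch only encodes the claimed input-output behaviour; it does not simulate the network dynamics.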
In Section II, we define the KWTA Network (connection weights, external inputs). 
In Section III, we analyze the equilibrium states and in Section IV, we identify all 
the stable equilibrium states of the KWTA Network. In Section V, we describe the 
dynamics of the KWTA Network. In Section VI, we give two important examples 
of the KWTA Network and comment on an alternate implementation of the KWTA 
Network. 
II- THE K-WINNERS-TAKE-ALL NETWORK 
The continuous Hopfield network model (Hopfield-84), also known as the Grossberg
additive model (Grossberg-88), is characterized by a system of first order differential
equations which governs the evolution of the state of the network ($i = 1, \ldots, N$):
$$C \frac{du_i}{dt} = -\lambda u_i + \sum_{j=1}^{N} T_{ij}\, g(u_j) + t_i.$$
The sigmoid function $g(u)$ is defined by $g(u) = f(G \cdot u)$, where $G > 0$ is the
gain of the sigmoid, and $f(u)$ satisfies: 1. $\forall u$, $0 < f'(u) \le f'(0) = 1$,
2. $\lim_{u \to +\infty} f(u) = 1$, 3. $\lim_{u \to -\infty} f(u) = -1$.
The KWTA Network is characterized by mutually inhibitory interconnections $T_{ij} = -1$
for $i \ne j$, a self-connection $T_{ii} = a$ ($|a| < 1$), and an external input (identical
for every node) which depends on the number $K$ of winners desired and the size of
the network $N$: $t_i = 2K - N$.
The differential equations for the KWTA Network are therefore:
$$\text{for all } i, \quad C \frac{du_i}{dt} = -\lambda u_i + (a+1)\, g(u_i) - \Big( \sum_{j=1}^{N} g(u_j) - t \Big), \qquad (1)$$
where $\lambda = N - 1 + |a|$, $-1 < a < +1$, and $t = 2K - N$. Let us now study the
equilibrium states of the dynamical system defined in (1). We already know from
previous work (Hopfield-84) that the network is guaranteed to converge to a stable
equilibrium state if the connection matrix ($T$) is symmetric (and it is here).
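The dynamics of the KWTA equations can be explored numerically. The following is a minimal sketch, not the authors' implementation: it assumes $f = \tanh$ (which satisfies the three conditions on $f$), $C = 1$, $a = 0.5$, a gain of 50, and a forward-Euler step of $10^{-3}$. All of these values, and the function name `simulate_kwta`, are assumptions made for illustration.

```python
import numpy as np

def simulate_kwta(u0, k, a=0.5, gain=50.0, c=1.0, dt=1e-3, t_end=5.0):
    """Forward-Euler integration of the KWTA equations with g(u) = tanh(gain * u)."""
    u = np.array(u0, dtype=float)
    n = u.size
    lam = n - 1 + abs(a)      # lambda = N - 1 + |a|
    t_ext = 2 * k - n         # external input t = 2K - N
    for _ in range(int(round(t_end / dt))):
        g = np.tanh(gain * u)
        # C du_i/dt = -lambda u_i + (a + 1) g(u_i) - (sum_j g(u_j) - t)
        du = (-lam * u + (a + 1.0) * g - (g.sum() - t_ext)) / c
        u = u + dt * du
    return u

u_final = simulate_kwta([0.3, -0.4, 0.7, 0.1], k=2)
print(np.sign(u_final))
```

With these assumed parameters, the final state has positive entries at the two largest initial components, in agreement with the example of Section I.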
III- EQUILIBRIUM STATES OF THE NETWORK 
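The equilibrium states $u^*$ of the KWTA network are defined by
$$\text{for all } i, \quad \frac{du_i}{dt} = 0,$$
i.e.,
$$\text{for all } i, \quad g(u_i^*) = \frac{\lambda}{a+1}\, u_i^* + \frac{\sum_{j} g(u_j^*) - t}{a+1}. \qquad (2)$$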
Let us now develop necessary conditions for a state u* to be an equilibrium state 
of the network. 
Theorem 1: For a given equilibrium state $u^*$, every component $u_i^*$ of $u^*$ can be
one of at most three distinct values.
Proof of Theorem 1. 
If we look at equation (2), we see that the last term of the right-hand side
is independent of $i$; let us denote this term by $H(u^*)$. Therefore, the components
$u_i^*$ of the equilibrium state $u^*$ must be solutions of the equation:
$$g(u_i^*) = \frac{\lambda}{a+1}\, u_i^* + H(u^*).$$
Since the sigmoid function $g(u)$ is monotone increasing and $\lambda/(a+1) > 0$,
the sigmoid and the line $\frac{\lambda}{a+1} u_i + H(u^*)$ intersect in at least one point and at most
three (see Figure 1). Note that the constant $H(u^*)$ can be different for different
equilibrium states $u^*$.
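The intersection count used in this proof can be checked numerically. The sketch below assumes $f = \tanh$, $N = 4$ and $a = 0.5$ (all illustrative choices), and counts sign changes of $g(u) - \frac{\lambda}{a+1} u - H$ on a fine grid; `count_intersections` is a hypothetical helper name.

```python
import numpy as np

def count_intersections(gain, h, a=0.5, n=4, u_max=3.0, samples=60000):
    """Count sign changes of g(u) - (lambda / (a + 1)) u - h on a fine grid."""
    lam = n - 1 + abs(a)
    slope = lam / (a + 1.0)
    # An even number of samples on a symmetric interval keeps u = 0 off the
    # grid, so a root at the origin (for h = 0) shows up as a sign change.
    u = np.linspace(-u_max, u_max, samples)
    diff = np.tanh(gain * u) - slope * u - h
    return int(np.count_nonzero(np.sign(diff[:-1]) != np.sign(diff[1:])))

print(count_intersections(gain=50.0, h=0.0))  # steep sigmoid, line through origin
print(count_intersections(gain=1.0, h=0.0))   # shallow sigmoid
print(count_intersections(gain=50.0, h=2.0))  # line shifted far from the sigmoid
```

A grid-based count is only a rough check: it can miss roots that are closer together than the grid spacing.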
The following theorem shows that the sum of the node outputs is constrained to 
being close to 2K - N, as desired. 
Theorem 2: If $u^*$ is an equilibrium state of (1), then we have:
$$(a+1) \max_{u_i^* < 0} g(u_i^*) \;<\; \sum_{j=1}^{N} g(u_j^*) - 2K + N \;<\; (a+1) \min_{u_i^* > 0} g(u_i^*). \qquad (3)$$
Proof of Theorem 2. 
Let us rewrite equation (2) in the following way:
$$\lambda u_i^* = (a+1)\, g(u_i^*) - \Big( \sum_j g(u_j^*) - (2K - N) \Big).$$
Since $u_i^*$ and $g(u_i^*)$ are of the same sign, the term $\sum_j g(u_j^*) - (2K - N)$ can neither
be too large (if $u_i^* > 0$) nor too low (if $u_i^* < 0$). Therefore, we must have
$$\sum_j g(u_j^*) - (2K - N) < (a+1)\, g(u_i^*), \quad \text{for all } u_i^* > 0,$$
$$\sum_j g(u_j^*) - (2K - N) > (a+1)\, g(u_i^*), \quad \text{for all } u_i^* < 0,$$
which yields (3).
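Inequality (3) can be checked on a concrete equilibrium. The sketch below assumes $f = \tanh$, $a = 0.5$ and a gain of 50, and finds an equilibrium with $K$ positive components by a simple fixed-point iteration started from the saturated guess $\pm (a+1)/\lambda$. The iteration scheme and the helper names are assumptions of this sketch, not taken from the paper.

```python
import numpy as np

def saturated_equilibrium(n, k, a=0.5, gain=50.0, iters=200):
    """Fixed-point iteration for an equilibrium with k positive components."""
    lam = n - 1 + abs(a)
    t_ext = 2 * k - n
    alpha, beta = (a + 1.0) / lam, -(a + 1.0) / lam   # saturated-sigmoid guess
    for _ in range(iters):
        g_a, g_b = np.tanh(gain * alpha), np.tanh(gain * beta)
        s = k * g_a + (n - k) * g_b - t_ext           # sum of outputs minus t
        alpha = ((a + 1.0) * g_a - s) / lam           # rewritten equilibrium condition
        beta = ((a + 1.0) * g_b - s) / lam
    return alpha, beta

def theorem2_holds(n, k, a=0.5, gain=50.0):
    """Check (a+1) g(beta) < sum_j g(u_j*) - 2K + N < (a+1) g(alpha)."""
    alpha, beta = saturated_equilibrium(n, k, a, gain)
    g_a, g_b = np.tanh(gain * alpha), np.tanh(gain * beta)
    s = k * g_a + (n - k) * g_b - (2 * k - n)
    return bool((a + 1.0) * g_b < s < (a + 1.0) * g_a)

print(theorem2_holds(5, 2), theorem2_holds(4, 2))
```

At high gain the outputs are nearly $\pm 1$, so the middle term of (3) is close to zero and both bounds hold with a wide margin.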
Theorem 1 states that the components of an equilibrium state can only be one of
at most three distinct values. We will distinguish between two types of equilibrium
states, for the purposes of our analysis: those which have one or more components
$u_i^*$ such that $g'(u_i^*) \ge \frac{\lambda}{a+1}$, which we categorize as type I, and those which do not
(type II). We will show in the next section that for a gain $G$ large enough, no
equilibrium state of type I is stable.
IV- ASYMPTOTIC STABILITY OF 
EQUILIBRIUM STATES 
We will first derive a necessary condition for an equilibrium state of (1) to be 
asymptotically stable. Then we will find the stable equilibrium states of the KWTA 
Network. 
IV-1. A NECESSARY CONDITION FOR ASYMPTOTIC 
STABILITY 
An important necessary condition for asymptotic stability is given in the following 
theorem. 
Theorem 3: Given any asymptotically stable equilibrium state $u^*$, at most one of
the components $u_i^*$ of $u^*$ may satisfy:
$$g'(u_i^*) \ge \frac{\lambda}{a+1}.$$
Proof of Theorem 3. 
Theorem 3 is obtained by proving the following three lemmas. 
Lemma 1: Given any asymptotically stable equilibrium state $u^*$, we always have
for all $i$ and $j$ such that $i \ne j$:
$$\lambda > a\, \frac{g'(u_i^*) + g'(u_j^*)}{2} + \frac{\sqrt{a^2 \big(g'(u_i^*) - g'(u_j^*)\big)^2 + 4\, g'(u_i^*)\, g'(u_j^*)}}{2}. \qquad (4)$$
Proof of Lemma 1. 
System (1) can be linearized around any equilibrium state $u^*$:
$$C \frac{d(u - u^*)}{dt} \approx L(u^*)(u - u^*), \quad \text{where } L(u^*) = T \cdot \mathrm{diag}\big(g'(u_1^*), \ldots, g'(u_N^*)\big) - \lambda I.$$
A necessary and sufficient condition for the asymptotic stability of $u^*$ is for $L(u^*)$
to be negative definite. A necessary condition for $L(u^*)$ to be negative definite is
for all $2 \times 2$ matrices $L_{ij}(u^*)$ of the type
$$L_{ij}(u^*) = \begin{pmatrix} a\, g'(u_i^*) - \lambda & -g'(u_j^*) \\ -g'(u_i^*) & a\, g'(u_j^*) - \lambda \end{pmatrix}, \quad (i \ne j)$$
to be negative definite. This results from an infinitesimal perturbation of components
$i$ and $j$ only. Any matrix $L_{ij}(u^*)$ has two real eigenvalues. Since the largest
one has to be negative, we obtain:
$$\frac{1}{2} \Big( a\, g'(u_i^*) - \lambda + a\, g'(u_j^*) - \lambda + \sqrt{a^2 \big(g'(u_i^*) - g'(u_j^*)\big)^2 + 4\, g'(u_i^*)\, g'(u_j^*)} \Big) < 0.$$
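The closed-form largest eigenvalue of $L_{ij}(u^*)$ can be compared against a direct numerical eigenvalue computation. In the sketch below, the arguments `gi` and `gj` stand for $g'(u_i^*)$ and $g'(u_j^*)$; the function names and the sample values are illustrative assumptions.

```python
import numpy as np

def largest_eigenvalue_closed_form(a, lam, gi, gj):
    """Largest eigenvalue of L_ij from the trace and determinant."""
    return 0.5 * (a * gi - lam + a * gj - lam
                  + np.sqrt(a**2 * (gi - gj)**2 + 4.0 * gi * gj))

def largest_eigenvalue_numeric(a, lam, gi, gj):
    """Same quantity computed directly from the 2 x 2 matrix."""
    l_ij = np.array([[a * gi - lam, -gj],
                     [-gi, a * gj - lam]])
    return float(np.max(np.linalg.eigvals(l_ij).real))

# Sample values: a = 0.5, lambda = 3.5, slopes 0.2 and 7.0.
print(largest_eigenvalue_closed_form(0.5, 3.5, 0.2, 7.0))
print(largest_eigenvalue_numeric(0.5, 3.5, 0.2, 7.0))
```

For these sample values the largest eigenvalue is positive, so an equilibrium with such slopes would violate the condition of Lemma 1.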
Lemma 2: Equation (4) implies:
$$\min\big(g'(u_i^*),\, g'(u_j^*)\big) < \frac{\lambda}{a+1}. \qquad (5)$$
Proof of Lemma 2. 
Consider the function $h$ of three variables:
$$h\big(a, g'(u_i^*), g'(u_j^*)\big) = a\, \frac{g'(u_i^*) + g'(u_j^*)}{2} + \frac{\sqrt{a^2 \big(g'(u_i^*) - g'(u_j^*)\big)^2 + 4\, g'(u_i^*)\, g'(u_j^*)}}{2}.$$
If we differentiate $h$ with respect to its third variable $g'(u_j^*)$, we obtain:
$$\frac{\partial h\big(a, g'(u_i^*), g'(u_j^*)\big)}{\partial g'(u_j^*)} = \frac{a}{2} + \frac{a^2 g'(u_j^*) + (2 - a^2)\, g'(u_i^*)}{2 \sqrt{a^2 \big(g'(u_i^*) - g'(u_j^*)\big)^2 + 4\, g'(u_i^*)\, g'(u_j^*)}},$$
which can be shown to be positive if and only if $a > -1$. But since $|a| < 1$, then if
$g'(u_i^*) \le g'(u_j^*)$ (without loss of generality), we have:
$$h\big(a, g'(u_i^*), g'(u_j^*)\big) \ge h\big(a, g'(u_i^*), g'(u_i^*)\big) = (a+1)\, g'(u_i^*),$$
which, with (4), yields:
$$g'(u_i^*) < \frac{\lambda}{a+1},$$
which yields Lemma 2.
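The monotonicity argument in this proof can be spot-checked numerically. The sketch below samples random values of $a$ in $(-1, 1)$ and pairs of positive slopes with the third argument no smaller than the second, and verifies that $h$ is bounded below by $(a+1)$ times the smaller slope. The sampling ranges, the tolerance and the function names are arbitrary choices for illustration.

```python
import random
from math import sqrt

def h(a, gi, gj):
    """The function h(a, g'(u_i*), g'(u_j*)) from the proof of Lemma 2."""
    return a * (gi + gj) / 2.0 + sqrt(a * a * (gi - gj) ** 2 + 4.0 * gi * gj) / 2.0

def lemma2_bound_holds(trials=10000, seed=0):
    """Check h(a, gi, gj) >= (a + 1) * gi on random samples with gj >= gi."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.uniform(-0.99, 0.99)
        gi = rng.uniform(1e-3, 10.0)
        gj = gi + rng.uniform(0.0, 10.0)   # gj >= gi, without loss of generality
        if h(a, gi, gj) < (a + 1.0) * gi - 1e-9:
            return False
    return True

print(lemma2_bound_holds())
```

A random test of this kind supports the inequality but is not a substitute for the derivative argument in the proof.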
Lemma 3: If for all $i \ne j$,
$$\min\big(g'(u_i^*),\, g'(u_j^*)\big) < \frac{\lambda}{a+1},$$
then there can be at most one $u_i^*$ such that:
$$g'(u_i^*) \ge \frac{\lambda}{a+1}.$$
Proof of Lemma 3. 
Let us assume there exists a pair $(u_i^*, u_j^*)$ with $i \ne j$ such that $g'(u_i^*) \ge \frac{\lambda}{a+1}$ and
$g'(u_j^*) \ge \frac{\lambda}{a+1}$; then (5) would be violated.
IV-2. STABLE EQUILIBRIUM STATES 
From Theorem 3, all stable equilibrium states of type I have exactly one component
$\gamma$ (at least one and at most one) such that $g'(\gamma) \ge \frac{\lambda}{a+1}$. Let $N_+$ be the number
of components $\alpha$ with $g'(\alpha) < \frac{\lambda}{a+1}$ and $\alpha > 0$, and let $N_-$ be the number of
components $\beta$ with $g'(\beta) < \frac{\lambda}{a+1}$ and $\beta < 0$ (note that $N_+ + N_- + 1 = N$). For
a large enough gain $G$, $g(\alpha)$ and $g(\beta)$ can be made arbitrarily close to $+1$ and
$-1$ respectively. Using Theorem 2, and assuming a large enough gain, we obtain:
$-1 < N_+ - K < 0$. $N_+$ and $K$ being integers, there is therefore no stable equilibrium
state of type I.
For the equilibrium states of type II, we have for all $i$, $u_i^* = \alpha$ ($> 0$) or $\beta$ ($< 0$) where
$g'(\alpha) < \frac{\lambda}{a+1}$ and $g'(\beta) < \frac{\lambda}{a+1}$. For a large enough gain, $g(\alpha)$ and $g(\beta)$ can be made
arbitrarily close to $+1$ and $-1$ respectively. Using Theorem 2 and assuming a large
enough gain, we obtain: $-(a+1) < 2(N_+ - K) < (a+1)$. Since $a + 1 < 2$ and $N_+ - K$
is an integer, this yields $N_+ = K$.
Let us now summarize our results in the following theorem: 
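Theorem 4: For a large enough gain, the only possible asymptotically stable
equilibrium states $u^*$ of (1) must have $K$ components equal to $\alpha > 0$ and $N - K$
components equal to $\beta < 0$, with $\alpha$ and $\beta$ satisfying the equilibrium condition (2):
$$g(\alpha) = \frac{\lambda}{a+1}\, \alpha + \frac{K g(\alpha) + (N-K)\, g(\beta) - (2K - N)}{a+1}, \qquad g(\beta) = \frac{\lambda}{a+1}\, \beta + \frac{K g(\alpha) + (N-K)\, g(\beta) - (2K - N)}{a+1}. \qquad (7)$$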
Since we are guaranteed to have at least one stable equilibrium state (Hopfield-84), 
and since any state whose components are a permutation of the components of a 
stable equilibrium state, is clearly a stable equilibrium state, then we have: 
Theorem 5: There exist at least $\binom{N}{K}$ stable equilibrium states as defined in Theorem 4.
They correspond to the $\binom{N}{K}$ different states obtained by the $N!$ permutations
of one stable state with $K$ positive components and $N - K$ negative components.
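The counting in Theorem 5 can be illustrated by brute force for small $N$: the $N!$ permutations of a state with $K$ copies of one value and $N - K$ copies of another collapse to $\binom{N}{K}$ distinct states. The numerical values 0.43 and -0.43 below are placeholders rather than computed equilibrium values, and `distinct_permuted_states` is a hypothetical helper name.

```python
from itertools import permutations
from math import comb, factorial

def distinct_permuted_states(alpha, beta, n, k):
    """Distinct states obtained by permuting k copies of alpha and n - k of beta."""
    base = (alpha,) * k + (beta,) * (n - k)
    return set(permutations(base))

states = distinct_permuted_states(0.43, -0.43, n=4, k=2)
# Number of permutations, number of distinct states, binomial coefficient.
print(factorial(4), len(states), comb(4, 2))
```

Enumerating all permutations is only practical for small $N$; it is used here purely to make the count concrete.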
V- THE DYNAMICS OF THE KWTA NETWORK 
Now that we know the characteristics of the stable equilibrium states of the KWTA
Network, we need to show that the KWTA Network will converge to the stable state
which has $\alpha > 0$ in the positions of the $K$ largest initial components. This can be
seen clearly by observing that for all $i \ne j$:
$$C \frac{d(u_i - u_j)}{dt} = -\lambda (u_i - u_j) + (a+1) \big( g(u_i) - g(u_j) \big).$$
If at some time $T$, $u_i(T) = u_j(T)$, then one can show that $\forall t$, $u_i(t) = u_j(t)$.
Therefore, for all $i \ne j$, $u_i(t) - u_j(t)$ always keeps the same sign. This leads to the
following theorem.
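Theorem 6: (Preservation of order) For all nodes $i \ne j$,
$$u_i(0) < u_j(0) \iff \big(\forall t,\ u_i(t) < u_j(t)\big), \qquad u_i(0) = u_j(0) \iff \big(\forall t,\ u_i(t) = u_j(t)\big).$$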
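Preservation of order can be observed in a simulation. The sketch below integrates the KWTA equations by forward Euler and checks that the ranking of the node states never changes. It assumes $f = \tanh$, $C = 1$, $a = 0.5$, a gain of 50 and a step of $10^{-3}$; these values and the function name are illustrative assumptions. The integration horizon is kept short because components that converge to the same value eventually become indistinguishable in floating point.

```python
import numpy as np

def order_is_preserved(u0, k, a=0.5, gain=50.0, dt=1e-3, t_end=3.0):
    """Integrate the KWTA equations and check that the ranking never changes."""
    u = np.array(u0, dtype=float)
    n = u.size
    lam = n - 1 + abs(a)
    t_ext = 2 * k - n
    initial_order = np.argsort(u)
    for _ in range(int(round(t_end / dt))):
        g = np.tanh(gain * u)
        u = u + dt * (-lam * u + (a + 1.0) * g - (g.sum() - t_ext))
        if not np.array_equal(np.argsort(u), initial_order):
            return False
    return True

print(order_is_preserved([0.3, -0.4, 0.7, 0.1], k=2))
```

For a step size with $\lambda \cdot dt < 1$, the Euler update of a difference $u_i - u_j$ is a sum of two terms with the sign of that difference, so the discrete scheme preserves order for the same reason as the continuous system.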
We shall now summarize the results of the last two sections. 
Theorem 7: Given an initial state $u(0)$ and a gain $G$ large enough, the KWTA
Network will converge to a stable equilibrium state with $K$ components equal to a
positive real number ($\alpha > 0$) in the positions of the $K$ largest initial components,
and $N - K$ components equal to a negative real number ($\beta < 0$) in all other $N - K$
positions.
This can be derived directly from Theorems 4, 5 and 6: we know the form of all 
stable equilibrium states, the order of the initial node states is preserved through 
time, and there is guaranteed convergence to an equilibrium state. 
VI- DISCUSSION 
The well-known Winner-Take-All Network is obtained by setting K to 1. 
The N/2-Winners-Take-All Network, given a set of $N$ real numbers, identifies which
numbers are above or below the median. This task is slightly more complex
computationally ($O(N \log N)$) than that of the Winner-Take-All ($O(N)$). The
number of stable states is much larger,
$$\binom{N}{N/2} \approx 2^N \sqrt{\frac{2}{\pi N}},$$
i.e., asymptotically exponential in the size of the network.
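The growth of the number of stable states can be made concrete. The sketch below compares the exact count $\binom{N}{N/2}$ given by Theorem 5 with the standard Stirling-type estimate $2^N \sqrt{2/(\pi N)}$ for the central binomial coefficient. The estimate is a well-known approximation supplied here for illustration, and the function names are hypothetical.

```python
from math import comb, pi, sqrt

def stable_state_count(n):
    """Number of stable states of the N/2-Winners-Take-All Network (N even)."""
    return comb(n, n // 2)

def stirling_estimate(n):
    """Stirling-type approximation of the central binomial coefficient."""
    return 2.0 ** n * sqrt(2.0 / (pi * n))

for n in (10, 20, 40, 80):
    print(n, stable_state_count(n), round(stirling_estimate(n)))
```

The relative error of the estimate shrinks as $N$ grows.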
Although the number of connection weights is $N^2$, there exists an alternate
implementation of the KWTA Network which has $O(N)$ connections (see Figure 2). The
sum of the outputs of all nodes and the external input is computed, then negated 
and fed back to all the nodes. In addition, a positive self-connection (a + 1) is 
needed at every node. 
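The equivalence between the fully connected form and the shared-sum form can be verified directly. In the sketch below, `feedback_dense` builds the $N \times N$ matrix $T$ explicitly, while `feedback_shared_sum` uses one global sum and a self-connection of weight $a + 1$. Both function names are hypothetical, and the sample outputs use $\tanh$ only to produce values in $(-1, 1)$.

```python
import numpy as np

def feedback_dense(g, a, k):
    """Net input using the full N x N connection matrix T (N^2 weights)."""
    n = g.size
    t_matrix = -np.ones((n, n)) + (a + 1.0) * np.eye(n)   # T_ij = -1, T_ii = a
    return t_matrix @ g + (2 * k - n)

def feedback_shared_sum(g, a, k):
    """Same net input from one global sum plus a self-connection (a + 1)."""
    n = g.size
    return (a + 1.0) * g - (g.sum() - (2 * k - n))

g = np.tanh(np.array([0.3, -0.4, 0.7, 0.1]))
print(np.allclose(feedback_dense(g, 0.5, 2), feedback_shared_sum(g, 0.5, 2)))
```

The shared-sum form needs one summation over all outputs plus one multiplication per node, which is the source of the $O(N)$ connection count.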
The analysis was done for a "large enough" gain $G$. In practice, the critical value of
$G$ is $\frac{\lambda}{a+1}$ for the N/2-Winners-Take-All Network, and slightly higher for $K \ne N/2$.
Also, the analysis was done for an arbitrary value of the self-connection weight $a$
($|a| < 1$). In general, if $a$ is close to $+1$, this will lead to faster convergence and a
smaller value of the critical gain than if $a$ is close to $-1$.
VII- CONCLUSION 
The KWTA Network lets all nodes compete until the desired number of winners 
(K) is obtained. The competition is obtained by using mutual inhibition between
all nodes, while the number of winners K is selected by setting all external inputs 
to 2K- N. This paper illustrates the capability of the continuous Hopfield Network 
to solve exactly an interesting decision problem, i.e., identifying the K largest of N 
real numbers. 
Acknowledgments 
The authors would like to thank John Hopfield and Stephen DeWeerth from the 
California Institute of Technology and Marvin Perlman from the Jet Propulsion
Laboratory for insightful discussions about material presented in this paper. Part of 
the research described in this paper was performed at the Jet Propulsion Laboratory 
under contract with NASA. 
References 
J.A. Feldman, D.H. Ballard, "Connectionist Models and their properties," Cognitive 
Science, Vol. 6, pp. 205-254, 1982 
S. Grossberg, "Contour Enhancement, Short Term Memory, and Constancies in 
Reverberating Neural Networks," Studies in Applied Mathematics, Vol. LII (52), 
No. 3, pp. 213-257, September 1973 
S. Grossberg, "Non-Linear Neural Networks: Principles, Mechanisms, and Archi- 
tectures," Neural Networks, Vol. 1, pp. 17-61, 1988 
J.J. Hopfield, "Neurons with graded response have collective computational prop- 
erties like those of two-state neurons," Proc. Natl. Acad. Sci. USA, Vol. 81, pp. 
3088-3092, May 1984 
J. Lazzaro, S. Ryckebusch, M.A. Mahowald, C.A. Mead, "Winner-Take-All Networks
of O(N) Complexity," in this volume, 1989 
R.P. Lippman, B. Gold, M.L. Malpass, "A Comparison of Hamming and Hopfield 
Neural Nets for Pattern Classification," MIT Lincoln Lab. Tech. Rep. TR-769, 21 
May 1987 
Figure 1: Intersection of sigmoid and line.
Figure 2: An Implementation of the KWTA Network.
