694 MacKay and Miller 
Analysis of Linsker's Simulations 
of Hebbian rules 
David J. C. MacKay 
Computation and Neural Systems 
Caltech 164-30 CNS 
Pasadena, CA 91125 
mackayaurel. cns. calech. edu 
Kenneth D. Miller 
Department of Physiology 
University of California 
San Francisco, CA 94143 - 0444 
kenphyb. ucs. edu 
ABSTRACT 
Linsker has reported the development of centre-surround receptive 
fields and oriented receptive fields in simulations of a Hebb-type 
equation in a linear network. The dynamics of the learning rule 
are analysed in terms of the eigenvectors of the covariance matrix 
of cell activities. Analytic and computational results for Linsker's 
covariance matrices, and some general theorems, lead to an expla- 
nation of the emergence of centre-surround and certain oriented 
structures. 
Linsker [Linsker, 1986, Linsker, 1988] has studied by simulation the evolution of 
weight vectors under a Hebb-type teacherless learning rule in a feed-forward linear 
network. The equation for the evolution of the weight vector w of a single neuron, 
derived by ensemble averaging the Hebbian rule over the statistics of the input 
patterns, is:¹
$$\frac{\partial w_i}{\partial t} = k_1 + \sum_j (Q_{ij} + k_2)\, w_j \quad \text{subject to} \quad -w_{\max} \le w_i \le w_{\max} \tag{1}$$
1 Our definition of equation 1 differs from Linsker's by the omission of a factor of 1/N before 
the sum term, where N is the number of synapses. 
where Q is the covariance matrix of activities of the inputs to the neuron. The 
covariance matrix depends on the covariance function, which describes the depen- 
dence of the covariance of two input cells' activities on their separation in the input 
field, and on the location of the synapses, which is determined by a synaptic density 
function. Linsker used a gaussian synaptic density function. 
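As a rough illustration of how such a covariance matrix can be assembled, the following sketch draws synapse positions from a gaussian density and evaluates a gaussian covariance function on their separations. The function names, the number of synapses and the values of A and C are hypothetical choices for illustration, not Linsker's settings.

```python
import numpy as np

def sample_synapses(n_syn, A, rng):
    """Draw 2-D synapse positions from a gaussian density exp(-r^2 / 2A)."""
    return rng.normal(loc=0.0, scale=np.sqrt(A), size=(n_syn, 2))

def covariance_matrix(positions, C):
    """Q_ij = exp(-|r_i - r_j|^2 / 2C), a gaussian function of separation."""
    diff = positions[:, None, :] - positions[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * C))

rng = np.random.default_rng(0)
A = 6.0   # assumed size of the synaptic density function
C = 4.0   # assumed size of the covariance function (C/A = 2/3)
positions = sample_synapses(200, A, rng)
Q = covariance_matrix(positions, C)
```

The resulting Q is symmetric with unit diagonal, and each entry depends only on the distance between the two synapses involved.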
Depending on the covariance function and the two parameters k1 and k2, different
weight structures emerge. Using a gaussian covariance function (his layer B → C),
Linsker reported the emergence of non-trivial weight structures, ranging from satu- 
rated structures through centre-surround structures to bi-lobed oriented structures. 
The analysis in this paper examines the properties of equation (1). We concentrate
on the gaussian covariances in Linsker's layer B → C, and give an explanation
of the structures reported by Linsker. Several of the results are more general, 
applying to any covariance matrix Q. Space constrains us to postpone general 
discussion, and criteria for the emergence of centre-surround weight structures, 
technical details, and discussion of other model networks, to future publications 
[MacKay, Miller, 1990]. 
1 ANALYSIS IN TERMS OF EIGENVECTORS
We write equation (1) as a first order differential equation for the weight vector w: 
$$\dot{\mathbf{w}} = (Q + k_2 J)\,\mathbf{w} + k_1 \mathbf{n} \quad \text{subject to} \quad -w_{\max} \le w_i \le w_{\max} \tag{2}$$
where J is the matrix J_ij = 1 ∀ i, j, and n is the DC vector n_i = 1 ∀ i. This equation
is linear, up to the hard limits on wi. These hard limits define a hypercube in weight 
space within which the dynamics are confined. We make the following assumption: 
Assumption 1 The principal features of the dynamics are established before the
hard limits are reached. When the hypercube is reached, it captures and preserves
the existing weight structure with little subsequent change.
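The dynamics of equation (2), including the hard limits, can be sketched with a simple Euler integration in which the weights are clipped to the hypercube after every step. This is only an illustrative sketch: the function name `simulate`, the step size, the number of steps, the value w_max = 0.5 and the one-dimensional toy covariance matrix are all assumptions made here, not details of Linsker's simulations.

```python
import numpy as np

def simulate(Q, k1, k2, w0, w_max=0.5, dt=1e-3, n_steps=5000):
    """Euler-integrate dw/dt = (Q + k2 J) w + k1 n, clipping w to [-w_max, w_max]."""
    w = np.array(w0, dtype=float)
    for _ in range(n_steps):
        # (J w)_i = sum_j w_j and n_i = 1, so J and n need not be built explicitly.
        dw = Q @ w + k2 * w.sum() + k1
        w = np.clip(w + dt * dw, -w_max, w_max)
    return w

# Toy example: a 1-D arbor of 21 synapses with a gaussian covariance function.
x = np.arange(-10, 11)
Q_toy = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * 4.0))
rng = np.random.default_rng(1)
w_final = simulate(Q_toy, k1=0.0, k2=0.0, w0=1e-3 * rng.standard_normal(x.size))
```

Starting from small random weights, the linear growth phase lasts until the first weights reach the limits, which is the stage Assumption 1 refers to.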
The matrix Q + k2J is symmetric, so it has a complete orthonormal set of eigenvectors
e^(a) with real eigenvalues λ_a.² The linear dynamics within the hypercube can be characterised
in terms of these eigenvectors, each of which represents an independently
evolving weight configuration. First, equation (2) has a fixed point at 
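$$\mathbf{w}^{\mathrm{FP}} = -k_1 (Q + k_2 J)^{-1} \mathbf{n} = -k_1 \sum_a \frac{\mathbf{e}^{(a)} \cdot \mathbf{n}}{\lambda_a}\, \mathbf{e}^{(a)} \tag{3}$$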
Second, relative to the fixed point, the component of w in the direction of an eigen- 
vector grows or decays exponentially at a rate proportional to the corresponding 
eigenvalue. Writing $\mathbf{w}(t) = \sum_a w_a(t)\, \mathbf{e}^{(a)}$, equation (2) yields
$$w_a(t) - w_a^{\mathrm{FP}} = \left(w_a(0) - w_a^{\mathrm{FP}}\right) e^{\lambda_a t} \tag{4}$$
² The indices a and b will be used to denote the eigenvector basis for w, while the indices i and
j will be used for the synaptic basis. 
Thus, the principal emergent features of the dynamics are determined by the fol- 
lowing three factors: 
1. The principal eigenvectors of Q + k2J, that is, the eigenvectors with largest 
positive eigenvalues. These are the fastest growing weight configurations. 
2. Eigenvectors of Q + k2J with negative eigenvalue. Each is associated with an
attracting constraint surface, the hyperplane defined by w_a = w_a^FP.
3. The location of the fixed point of equation (1). This is important for two
reasons: a) it determines the location of the constraint surfaces; b) the fixed point
gives a "head start" to the growth rate of eigenvectors e^(a) for which |w_a^FP| is large
compared to |w_a(0)|.
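The three factors above can be read off directly from an eigendecomposition. The sketch below evaluates the fixed point w^FP = -k1 (Q + k2J)^{-1} n and the exponential solution w_a(t) - w_a^FP = (w_a(0) - w_a^FP) exp(λ_a t) for a small toy system, ignoring the hard limits. The function name `linear_solution`, the one-dimensional toy covariance matrix and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

def linear_solution(Q, k1, k2, w0, t):
    """Closed-form w(t) inside the hypercube, from the eigenvectors of Q + k2 J."""
    N = Q.shape[0]
    n = np.ones(N)
    M = Q + k2 * np.ones((N, N))
    lam, E = np.linalg.eigh(M)             # columns of E are the eigenvectors e^(a)
    w_fp_a = -k1 * (E.T @ n) / lam         # fixed-point components in the eigenbasis
    w0_a = E.T @ w0                        # initial weights in the eigenbasis
    w_a = w_fp_a + (w0_a - w_fp_a) * np.exp(lam * t)
    return E @ w_a, E @ w_fp_a, lam

x = np.arange(-5, 6)
Q_toy = np.exp(-(x[:, None] - x[None, :]) ** 2 / 2.0)
k1, k2 = 0.1, -3.0
w0 = np.zeros(x.size)
w_t, w_fp, lam = linear_solution(Q_toy, k1, k2, w0, t=0.05)
```

With w0 = 0 every component starts a distance |w_a^FP| from the fixed point, so modes with a large fixed-point component and a positive eigenvalue receive the largest initial push, which is the "head start" of item 3.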
2 EIGENVECTORS OF Q 
We first examine the eigenvectors and eigenvalues of Q. The principal eigenvector 
of Q dominates the dynamics of equation (2) for k1 = 0, k2 = 0. The subsequent
eigenvectors of Q become important as k1 and k2 are varied.
2.1 PROPERTIES OF CIRCULARLY SYMMETRIC SYSTEMS 
If an operator commutes with the rotation operator, its eigenfunctions can be written
as eigenfunctions of the rotation operator. For Linsker's system, in the continuum
limit, the operator Q + k2J is unchanged under rotation of the system. So the
eigenfunctions of Q + k2J can be written as the product of a radial function and
one of the angular functions cos lθ, sin lθ, l = 0, 1, 2... To describe these eigenfunctions
we borrow from quantum mechanics the notation n = 1, 2, 3... and l = s, p,
d... to denote the total number of nodes in the function = 0, 1, 2... and
the number of angular nodes = 0, 1, 2... respectively. For example, "2s" denotes a
centre-surround function with one radial node and no angular nodes (see figure 1).
For monotonic and non-negative covariance functions, we conjecture that the eigenfunctions
of Q are ordered in eigenvalue by their numbers of nodes such that the
eigenfunction [nl] has larger eigenvalue than either [(n + 1)l] or [n(l + 1)]. This
conjecture is obeyed in all analytical and numerical results we have obtained. 
2.2 ANALYTIC CALCULATIONS FOR k2 = 0 
We have solved analytically for the first three eigenfunctions and eigenvalues of the 
covariance matrix for layer B → C of Linsker's network, in the continuum limit
(Table 1). 1s, the function with no changes of sign, is the principal eigenfunction
of Q; 2p, the bilobed oriented function, is the second eigenfunction; and 2s, the
centre-surround eigenfunction, is third.³
Figure 1(a) shows the first six eigenfunctions for layer B → C of [Linsker, 1986].
³ 2s is degenerate with 3d at k2 = 0.
Table 1: The first three eigenfunctions of the operator Q(r, r')
$Q(\mathbf{r}, \mathbf{r}') = e^{-(\mathbf{r}-\mathbf{r}')^2/2C}\, e^{-r'^2/2A}$, where C and A denote the characteristic sizes of
the covariance function and synaptic density function. r denotes two-dimensional
spatial position relative to the centre of the synaptic arbor, and r = |r|. The
eigenvalues λ are all normalised by the effective number of synapses N.

Name   Eigenfunction                        λ/N
1s     $e^{-r^2/2R}$                        $l\,C/A$
2p     $r \cos\theta\; e^{-r^2/2R}$         $l^2\,C/A$
2s     $(1 - r^2/r_0^2)\, e^{-r^2/2R}$      $l^3\,C/A$

$R = \frac{C}{2}\left(1 + \sqrt{1 + 4A/C}\right)$,  $l = \frac{R - C}{R}$ $(0 < l < 1)$,  $r_0^2 = \frac{2A}{\sqrt{1 + 4A/C}}$
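As a worked example, the quantities in Table 1 can be evaluated for the ratio C/A = 2/3. The sketch assumes the expressions R = (C/2)(1 + sqrt(1 + 4A/C)), l = (R - C)/R, r0^2 = 2A/sqrt(1 + 4A/C), and eigenvalues l C/A, l^2 C/A and l^3 C/A for 1s, 2p and 2s. The function name `table1_parameters` and the absolute values C = 2, A = 3 are arbitrary choices made here; only the ratio C/A affects l.

```python
import math

def table1_parameters(C, A):
    """Return R, l, r0^2 and the normalised eigenvalues of 1s, 2p and 2s."""
    root = math.sqrt(1.0 + 4.0 * A / C)
    R = 0.5 * C * (1.0 + root)
    l = (R - C) / R
    r0_sq = 2.0 * A / root
    eigenvalues = {"1s": l * C / A, "2p": l ** 2 * C / A, "2s": l ** 3 * C / A}
    return R, l, r0_sq, eigenvalues

R, l, r0_sq, ev = table1_parameters(C=2.0, A=3.0)   # C/A = 2/3
ratio_1s_2p = ev["1s"] / ev["2p"]   # equals 1/l
ratio_2s_2p = ev["2s"] / ev["2p"]   # equals l
```

For C/A = 2/3 this gives l of about 0.45, so in the continuum limit successive eigenvalues in the sequence 1s, 2p, 2s differ by a factor of roughly 2.2.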
Figure 1: Eigenfunctions of the operator Q + k2J.
Largest eigenvalue is in the top row. Eigenvalues (in arbitrary units): (a) k2 = 0:
1s, 2.26; 2p, 1.0; 2s & 3d (only one 3d is shown), 0.41. (b) k2 = -3: 2p, 1.0;
2s, 0.66; 1s, -17.8. The greyscale indicates the range from maximum negative to
maximum positive synaptic weight within each eigenfunction. Eigenfunctions of
the operator $(e^{-(\mathbf{r}-\mathbf{r}')^2/2C} + k_2)\, e^{-r'^2/2A}$ were computed for C/A = 2/3 (as used by
Linsker for most layer B → C simulations) on a circle of radius 12.5 grid intervals,
with √A = 6.15 grid intervals.
[Figure panels: (a) 1s, 2p, 2s, 3d; (b) 2p, 2s, 1s.]
3 THE EFFECTS OF THE PARAMETERS k1 AND k2
Varying k2 changes the eigenvectors and eigenvalues of the matrix Q + k2J. Varying 
k moves the fixed point of the dynamics with respect to the origin. We now analyse 
these two changes, and their effects on the dynamics. 
Definition: Let n̂ be the unit vector in the direction of the DC vector n. We
refer to (w · n̂) as the DC component of w. The DC component is proportional
to the sum of the synaptic strengths in a weight vector. For example, 2p and all 
the other eigenfunctions with angular nodes have zero DC component. Only the 
s-modes have a non-zero DC component. 
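The definition can be illustrated on a small grid. In this sketch the helper `dc_component`, the grid size and the two sample weight patterns (a bi-lobed pattern with one angular node, and a centre-surround pattern) are hypothetical examples, not eigenfunctions computed from Linsker's system.

```python
import numpy as np

def dc_component(w):
    """Projection of w onto the unit DC vector n_hat = n / sqrt(N)."""
    n_hat = np.ones(w.size) / np.sqrt(w.size)
    return float(w @ n_hat)

coords = np.arange(-6, 7)
X, Y = np.meshgrid(coords, coords)
r_sq = X ** 2 + Y ** 2
envelope = np.exp(-r_sq / 8.0)

mode_p = (X * envelope).ravel()                    # bi-lobed: odd in x, one angular node
mode_s = ((1.0 - r_sq / 6.0) * envelope).ravel()   # centre-surround: one radial node

dc_p = dc_component(mode_p)
dc_s = dc_component(mode_s)
```

The positive and negative lobes of the bi-lobed pattern cancel, so its DC component vanishes up to rounding, while the centre-surround pattern retains a non-zero sum.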
3.1 GENERAL THEOREM: THE EFFECT OF k2
We now characterise the effect of adding k2J to any covariance matrix Q.
Theorem 1 For any covariance matrix Q, the spectrum of eigenvectors and eigenvalues
of Q + k2J obeys the following:
1. Eigenvectors of Q with no DC component, and their eigenvalues, are unaffected
by k2.
2. The other eigenvectors, with non-zero DC component, vary with k2. Their eigenvalues
increase continuously and monotonically with k2 between asymptotic limits
such that the upper limit of one eigenvalue is the lower limit of the eigenvalue above.
3. There is at most one negative eigenvalue.
4. All but one of the eigenvalues remain finite. In the limits k2 → ±∞ there is a
DC eigenvector n̂ with eigenvalue → k2N, where N is the dimensionality of Q, i.e.
the number of synapses.
The properties stated in this theorem, whose proof is in [MacKay, Miller, 1990], are 
summarised pictorially by the spectral structure shown in figure 2. 
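The properties listed in the theorem can be checked numerically on a small example. The sketch below is not a proof and does not follow the argument in [MacKay, Miller, 1990]; it uses an assumed one-dimensional gaussian covariance matrix on a symmetric grid, so that the antisymmetric eigenvectors of Q have zero DC component. All names, tolerances and parameter values are illustrative.

```python
import numpy as np

x = np.arange(-10, 11)
N = x.size
Q = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * 4.0))
J = np.ones((N, N))

def spectrum(k2):
    """Eigenvalues of Q + k2 J in ascending order."""
    return np.linalg.eigvalsh(Q + k2 * J)

k2_values = np.linspace(-50.0, 50.0, 201)
spectra = np.array([spectrum(k2) for k2 in k2_values])

# Property 2: each sorted eigenvalue is non-decreasing as k2 increases.
monotonic = bool(np.all(np.diff(spectra, axis=0) >= -1e-9))

# Property 3: never more than one negative eigenvalue.
max_negative = int(np.max(np.sum(spectra < -1e-9, axis=1)))

# Property 1: eigenvalues of zero-DC eigenvectors of Q reappear unchanged at k2 = -3.
lam_Q, E_Q = np.linalg.eigh(Q)
zero_dc = np.abs(E_Q.sum(axis=0)) < 1e-9
spec_m3 = spectrum(-3.0)
unaffected = all(np.min(np.abs(spec_m3 - lam)) < 1e-6 for lam in lam_Q[zero_dc])

# Property 4: for large negative k2 the lowest eigenvalue approaches k2 * N.
dc_ratio = spectrum(-1e4)[0] / (-1e4 * N)
```

Sweeping k2 and plotting the rows of `spectra` against `k2_values` gives a picture of the same kind as the general spectrum described in figure 2.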
3.2 IMPLICATIONS FOR LINSKER'S SYSTEM 
For Linsker's circularly symmetric systems, all the eigenfunctions with angular
nodes have zero DC component and are thus independent of k2. The eigenvalues
that vary with k2 are those of the s-modes. The leading s-modes at k2 = 0 are
1s, 2s; as k2 is decreased to -∞, these modes transform continuously into 2s, 3s
respectively (figure 2).⁴ 1s becomes an eigenvector with negative eigenvalue, and it
approaches the DC vector n̂. This eigenvector enforces a constraint w · n̂ = w^FP · n̂,
and thus determines that the final average synaptic strength is equal to w^FP · n/N.
Linsker used k2 = -3 in [Linsker, 1986]. This value of k2 is sufficiently large that
the properties of the k2 → -∞ limit hold [MacKay, Miller, 1990], and in the following
we concentrate interchangeably on k2 = -3 and k2 → -∞. The computed
eigenfunctions for Linsker's system at layer B → C are shown in figure 1(b) for
⁴ The 2s eigenfunctions at k2 = 0 and k2 = -∞ both have one radial node, but are not identical
functions.
Figure 2: General spectrum of eigenvalues of Q + k2J as a function of k2. 
A: Eigenvectors with DC component. B: Eigenvectors with zero DC component. 
C: Adjacent DC eigenvalues share a common asymptote. D: There is only one 
negative eigenvalue. 
The annotations in brackets refer to the eigenvectors of Linsker's system. 
k2 = -3. The principal eigenfunction is 2p. The centre-surround eigenfunction 2s 
is the principal symmetric eigenfunction, but it still has smaller eigenvalue than 2p. 
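The change of principal eigenfunction between k2 = 0 and k2 = -3 can be explored with a small numerical sketch on a circular grid. Several choices here are assumptions rather than the authors' procedure: the grid is the set of integer points within radius 12.5, the value √A = 6.15 grid intervals is taken as the density width, and instead of the non-symmetric operator (G + k2) D the code diagonalises the similar symmetric matrix D^{1/2} (G + k2 J) D^{1/2}, which has the same eigenvalues. In this symmetrised basis the DC direction is the vector d of square-rooted densities.

```python
import numpy as np

radius = 12.5
A = 6.15 ** 2            # assumed size of the synaptic density function
C = (2.0 / 3.0) * A      # C/A = 2/3

g = np.arange(-12, 13)
X, Y = np.meshgrid(g, g)
inside = X ** 2 + Y ** 2 <= radius ** 2
pos = np.stack([X[inside], Y[inside]], axis=1).astype(float)

sq_dist = np.sum((pos[:, None, :] - pos[None, :, :]) ** 2, axis=-1)
G = np.exp(-sq_dist / (2.0 * C))                    # gaussian covariance function
d = np.exp(-np.sum(pos ** 2, axis=1) / (4.0 * A))   # square root of the density

def symmetrised_spectrum(k2):
    """Eigenpairs of D^(1/2) (G + k2 J) D^(1/2), largest eigenvalue first."""
    S = d[:, None] * (G + k2) * d[None, :]
    lam, U = np.linalg.eigh(S)
    return lam[::-1], U[:, ::-1]

lam0, U0 = symmetrised_spectrum(0.0)
lam3, U3 = symmetrised_spectrum(-3.0)

dc0 = np.abs(d @ U0[:, :3])   # DC overlaps of the three leading modes at k2 = 0
dc3 = np.abs(d @ U3[:, :2])   # DC overlaps of the two leading modes at k2 = -3
```

In this sketch the leading mode at k2 = 0 has a large DC overlap, while the two leading modes at k2 = -3 form a degenerate pair with zero DC overlap whose eigenvalue matches the second eigenvalue at k2 = 0, as expected for the bi-lobed 2p functions.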
3.3 EFFECT OF k1
Varying k1 changes the location of the fixed point of equation (2). From equation
(3), the fixed point is displaced from the origin only in the direction of eigenvectors
that have non-zero DC component, that is, only in the direction of the s-modes.
This has two important effects, as discussed in section 1: a) The s-modes are given
a head start in growth rate that increases as k1 is increased. In particular, the
principal s-mode, the centre-surround eigenvector 2s, may outgrow the principal
eigenvector 2p. b) The constraint surface is moved when k1 is changed. For large
negative k2, the constraint surface fixes the average synaptic strength in the final
weight vector. To leading order in 1/k2, Linsker showed that the constraint is:⁵
$$\sum_j w_j = k_1 / |k_2|$$
3.4 SUMMARY OF THE EFFECTS OF k1 AND k2
We can now anticipate the explanation for the emergence of centre-surround cells: 
For k1 = 0, k2 = 0, the dynamics are dominated by 1s. The centre-surround
⁵ To second order, this expression becomes $\sum_j w_j = k_1 / |k_2 + \bar{q}|$, where $\bar{q} = \langle Q_{ij} \rangle$, the average
covariance (averaged over i and j). The additional term largely resolves the discrepancy between
Linsker's g and k1/|k2| in [Linsker, 1986].
eigenfunction 2s is third in line behind 2p, the bi-lobed function. Making k2 large
and negative removes 1s from the lead. 2p becomes the principal eigenfunction
and dominates the dynamics for k1 ≈ 0, so that the circular symmetry is broken.
Finally, increasing k1/|k2| gives a head start to the centre-surround function
2s. Increasing k1/|k2| also increases the final average synaptic strength, so large
k1/|k2| also produces a large DC bias. The centre-surround regime therefore lies
sandwiched between a 2p-dominated regime and an all-excitatory regime. k1/|k2|
has to be large enough that 2s dominates over 2p, and small enough that the DC
bias does not obscure the centre-surround structure. We estimate this parameter
regime in [MacKay, Miller, 1990], and show that the boundary between the 2s- and 
2p-dominated regimes found by simulated annealing on the energy function may be 
different from the boundary found by simulating the time-development of equation 
(1), which depends on the initial conditions. 
4 CONCLUSIONS AND DISCUSSION 
k2=0, k=0: 
k2 = large positive 
and/or kl = large 
k2 = large negative, 
k _0 
k2 = large negative, 
k = intermediate 
For Linsker's B - C connections, we predict four main parameter regimes for vary- 
ing k and k2. 6 These regimes, shown in figure 3, are dominated by the following 
weight structures: 
The principal eigenvector of Q, ls. 
The fiat DC weight vector, which leads to the same satu- 
rated structures as ls. 
The principal eigenvector of Q q- k2J for k2 - -o, 2p. 
The principal circularly symmetric function which is given 
a head start, 2s. 
Higher layers of Linsker's network can be analysed in terms of the same four regimes; 
the principal eigenvectors are altered, so that different structures can emerge. The 
development of the interesting cells in Linsker's system depends on the use of negative
synapses and on the use of the terms k1 and k2 to enforce a constraint on the
final percentages of positive and negative synapses. Both of these may be biologically
problematic [Miller, 1990]. Linsker suggested that the emergence of centre-surround
structures may depend on the peaked synaptic density function that he
used [Linsker, 1986, page 7512]. However, with a flat density function, the eigenfunctions
are qualitatively unchanged, and centre-surround structures can emerge
by the same mechanism. 
⁶ Not counting the symmetric regimes (k1, k2) → (-k1, k2) in which all the weight structures
are inverted in sign.
Figure 3: Parameter regimes for Linsker's system. The DC bias is approximately
constant along the radial lines, so each of the regimes with large negative
k2 is wedge-shaped.
Acknowledgements
D.J.C.M. is supported by a Caltech Fellowship and a Studentship from SERC, UK.
K.D.M. thanks M.P. Stryker for encouragement and financial support while this
work was undertaken. K.D.M. was supported by an N.E.I. Fellowship and the International
Joint Research Project Bioscience Grant to M. P. Stryker (T. Tsumoto,
Coordinator) from the N.E.D.O., Japan.
This collaboration would have been impossible without the internet/NSFnet, long 
may their daemons flourish. 
References 
[Linsker, 1986] R. Linsker. From Basic Network Principles to Neural Architecture
(series), PNAS USA, 83, Oct.-Nov. 1986, pp. 7508-7512, 8390-8394,
8779-8783.
[Linsker, 1988] R. Linsker. Self-Organization in a Perceptual Network, Computer,
March 1988.
[Miller, 1990] K.D. Miller. "Correlation-based mechanisms of neural development,"
in Neuroscience and Connectionist Theory, M.A. Gluck and
D.E. Rumelhart, Eds. (Lawrence Erlbaum Associates, Hillsboro NJ)
(in press).
[MacKay, Miller, 1990] D.J.C. MacKay and K.D. Miller. "Analysis of Linsker's Simulations
of Hebbian rules" (submitted to Neural Computation); and
"Analysis of Linsker's application of Hebbian rules to linear networks"
(submitted to Network).
