Neural Representation of Multi-Dimensional 
Stimuli 
Christian W. Eurich, Stefan D. Wilke and Helmut Schwegler 
Institut ffir Theoretische Physik 
Universitit Bremen, Germany 
(eurich, swilke, schwegler) @physik. uni-bremen.de 
Abstract 
The encoding accuracy of a population of stochastically spiking neurons 
is studied for different distributions of their tuning widths. The situation 
of identical radially symmetric receptive fields for all neurons, which 
is usually considered in the literature, turns out to be disadvantageous 
from an information-theoretic point of view. Both a variability of tun- 
ing widths and a fragmentation of the neural population into specialized 
subpopulations improve the encoding accuracy. 
1 Introduction 
The topic of neuronal tuning properties and their functional significance has focused much 
attention in the last decades. However, neither empirical findings nor theoretical consider- 
ations have yielded a unified picture of optimal neural encoding strategies given a sensory 
or motor task. More specifically, the question as to whether narrow tuning or broad tuning 
is advantageous for the representation of a set of stimulus features is still being discussed. 
Empirically, both situations are encountered: small receptive fields whose diameter is less 
than one degree can, for example, be found in the human retina [7], and large receptive 
fields up to 180  in diameter occur in the visual system of tongue-projecting salamanders 
[10]. On the theoretical side, arguments have been put forward for small [8] as well as for 
large [5, 1, 9, 3, 13] receptive fields. 
In the last years, several approaches have been made to calculate the encoding accuracy 
of a neural population as a function of receptive field size [5, 1, 9, 3, 13]. It has turned 
out that for a firing rate coding, large receptive fields are advantageous provided that D > 
3 stimulus features are encoded [9, 13]. For binary neurons, large receptive fields are 
advantageous also for D = 2 [5, 3]. 
However, so far only radially symmetric tuning curves have been considered. For neural 
populations which lack this symmetry, the situation may be very different. Here we study 
the encoding accuracy of a population of stochastically spiking neurons. A Fisher infor- 
mation analysis performed on different distributions of tunings widths will indeed reveal a 
much more detailed picture of neural encoding strategies. 
116 C. W. Eurich, S. D. Willre and H. Schwegler 
2 Model 
Consider a D-dimensional stimulus space, X. A stimulus is characterized by a position 
x = (zi,..., ZD)  X, where the value of feature i, zi (i = 1,..., D), is measured 
relative to the total range of values in the i-th dimension such that it is dimensionless. 
Information about the stimulus is encoded by a population of N stochastically spiking 
neurons. They are assumed to have independent spike generation mechanisms such that the 
joint probability distribution for observing n = (nO),..., n(k),..., n(iV)) spikes within a 
time interval r, Ps (n; x), can be written in the form 
N 
?s(n;x) = II ??)(.(*);x), 
k=l 
where P?) (n ; x) is the single-neuron probability distribution of the number of observed 
spikes given the stimulus at position x. Note that (1) does not exclude a correlation of the 
neural firing rates, i.e., the neurons may have common input or even share the same tuning 
function. 
The firing rates depend on the stimulus via the local values of the tuning functions, such that 
P?) (n(k); x) can be written in the form P?)(n(); x) = S (n  , f()(x), r), where the 
tuning function of neuron k, f(k) (x), gives its mean firing rate in response to the stimulus 
at position x. We assume here a form of the tuning function that is not necessarily radially 
symmetric, 
where c  = (c?),... ,c?) is the center of the tuning curve of neuron k, a? ) is its 
F()2 (tO,2. ()2 
tuning width in the i-th dimension, - :- (zi for i - 1, D, and 
-ci ) IO'i ..., 
(t)2 := ?)2 + ... + )2. F > 0 denotes the maximal firing rate of the neurons, which 
requires that maxz>0 b(z) = 1. 
We assume that the tuning widths o'? ) a ) of each neuron k are drawn from a distri- 
bution P, (Crl,..., CrD). For a population of tuning functions with centers c(1),..., c( , a 
N 
density r/(x) is introduced according to r/(x) := Ek--1 (x - c ). 
The encoding accuracy can be quantified by the Fisher information matrix, J, which is 
defined as 
Jij(x):=E[(xilnP(n;x)) (lnP(n;x))], (3) 
where E[...] denotes the expectation value over the probability distribution P(n; x) [2]. 
The Fisher information yields a lower bound on the expected error of an unbiased estimator 
that retrieves the stimulus x from the noisy neural activity (Cram6r-Rao inequality) [2]. The 
minimal estimation error for the i-th feature xi, i,min, is given by e. 2  : (j-1)ii which 
,mln 
reduces to e.2 . = 1/Jii(x) if J is diagonal. 
,mln 
We shall now derive a general expression for the population Fisher information. In the 
next chapter, several cases and their consequences for neural encoding strategies will be 
discussed. 
For model neuron (k), the Fisher information (3) reduces to 
(4) 
Neural Representation of Multi-Dimensional Stimuli 117 
where the dependence on the tuning widths is indicated by the list of arguments. The 
function AO depends on the shape of the tuning function and is given in [13]. The in- 
dependence assumption (1) implies that the population Fisher information is the sum of 
the contributions of the individual neurons, -N j)(x; a? ) , , a (k)) We now define 
k=l _ ' '.' .D ' . . 
a population Fisher information which is averaged over the distribution of tuning widths 
P,,(a,... 
Introducing the density of tuning curves, r/(x), into (5) and assuming a constant distri- 
bution, r/(x)   _-- const., one obtains the result that the population Fisher information 
becomes independent of x and that the off-diagonal elements of J vanish [ 13]. The average 
population Fisher information then becomes 
(Jij) =DK(F,r,D) ( Htl at ) 
2 
60, (6) 
where K depends on the geometry of the tuning curves and is defined in [ 13]. 
3 Results 
In this section, we consider different distributions of tuning widths in (6) and discuss ad- 
vantageous and disadvantageous strategies for obtaining a high representational accuracy 
in the neural population. 
Radially symmetric tuning curves. For radially symmetric tuning curves of width , 
the tuning-width distribution reads 
D 
,0) = II - 
i=1 
see Fig. 1 a for a schematic visualization of the arrangement of the tuning widths for the 
case D = 2. The average population Fisher information (6) for i = j becomes 
(Jii) = riDK(F, r, D)D-2, (7) 
a result already obtained by Zhang and Sejnowski [ 13]. Equation (7) basically shows that 
the minimal estimation error increases with  for D - 1, that it does not depend on  for 
D - 2, and that it decreases as  increases for D _> 3. We shall discuss the relevance of 
this case below. 
Identical tuning curves without radial symmetry. Next we discuss tuning curves which 
are identical but not radially symmetric; the tuning-width distribution for this case is 
D 
= II - 
i=1 
where  denotes the fixed width in dimension i. For i = j, the average population Fisher 
information (6) reduces to [ 11, 4] 
-2 (8) 
o' i 
118 C. W. Eurich, S. D. Vqlke and H. Schwegler 
(c) 
(a) 
,/ 
b 1 
(b) 
(d) 
IJ 1 IJ 1 
Figure 1: Visualization of different distributions of 
tuning widths for D - 2. (a) Radially symmetric tun- 
ing curves. The dot indicates a fixed , while the diag- 
onal line symbolizes a variation in  discussed in [13]. 
(b) Identical tuning curves which are not radially sym- 
metric. (c) Tuning widths uniformly distributed within 
a small rectangle. (d) Two subpopulations each of 
which is narrowly tuned in one dimension and broadly 
tuned in the other direction. 
Equation (8) contains (7) as a special case. From (8) it becomes immediately clear that the 
expected minimal square encoding error for the i-th stimulus feature, 2 
i,min- 1/(Jii(x))a, 
depends on i, i.e., the population specializes in certain features. The error obtained in 
dimension i thereby depends on the tuning widths in all dimensions. 
Which encoding strategy is optimal for a population whose task it is to encode a single 
feature, say feature i, with high accuracy while not caring about the other dimensions? In 
order to answer this question, we re-write (8) in terms of receptive field overlap. 
For the tuning functions f (k) (x) encountered empirically, large values of the single-neuron 
Fisher information (4) are typically restricted to a region around the center of the tuning 
function, c . The fraction p(fl) of the Fisher information that falls into a region ED ' 
V/(k) 2 _< fl around c  is given by 
f dZx E/= JJ/)(x) 
p() := Er> 
f dVx =t J/)(x) 
x 
f d D+I Acp(2, F T) 
0 
f d t>+t A(2, F, r) 
0 
(9) 
where the index (k) was dropped because the tuning curves are assumed to have iden- 
tical shapes. Equation (9) allows the definition of an effective receptive field, RFr ), 
inside of which neuron k conveys a major fraction P0 of Fisher information, RF? := 
{xl V/- ()2 _< 0 }, where 0 is chosen such that P(fio): Po. The Fisher information a 
n,() 
neuron k carries is small unless x E *- eft  This has the consequence that a fixed stimulus 
x is actually encoded only by a subpopulation of neurons. The point x in stimulus space is 
covered by 
D 
2rt>/2()t> H J (10) 
Ncode :=. Dr(D/2) 
receptive fields. With the help of (10), the average population Fisher information (8) can 
be re-written as 
D2r( D /2) %ode 
(Jii)a = 27rD/2(o) D Kq(F,r,D) -2 (11) 
ai 
Equation (11) can be interpreted as follows: We assume that the population of neurons 
encodes stimulus dimension i accurately, while all other dimensions are of secondary im- 
portance. The average population Fisher information for dimension i, (Jii) , is determined 
by the tuning width in dimension i, i, and by the size of the active subpopulation, -/code. 
There is a tradeoff between these quantities. On the one hand, the encoding error can be 
decreased by decreasing i, which enhances the Fisher information carried by each single 
Neural Representation of Multi-Dimensional Stimuli 119 
neuron. Decreasing i, on the other hand, will also shrink the active subpopulation via 
(10). This impairs the encoding accuracy, because the stimulus position is evaluated from 
the activity of fewer neurons. If (11) is valid due to a sufficient receptive field overlap, 
Ncode can be increased by increasing the tuning widths, j, in all other dimensions j  i. 
This effect is illustrated in Fig. 2 for D = 2. 
X 2 
X2, s 
I  
Xl,s X 1 
X 2 
X2, s 
Xl,s X 1 
Figure 2: Encoding strategy for a stimulus characterized by parameters arl,s and ar2,s. Fea- 
ture ar is to be encoded accurately. Effective receptive field shapes are indicated for both 
populations. If neurons are narrowly tuned in ar2 (left), the active population (solid) is 
small (here: Ncode = 3). Broadly tuned receptive fields for are (right) yield a much larger 
population (here: Ncode = 27) thus increasing the encoding accuracy. 
It shall be noted that although a narrow tuning width i is advantageous, the limit i > 0 
yields a bad representation. For narrowly tuned cells, gaps appear between the receptive 
fields: The condition r/(x) _= const. breaks down, and (6) is no longer valid. A more 
detailed calculation shows that the encoding error diverges as i > 0 [4]. The fact that 
the encoding error decreases for both narrow tuning and broad tuning - due to (11) - proves 
the existence of an optimal tuning width. An example is given in Fig. 3a. 
0.8 
0.6 
"0.4 
v 
0.2 
0 0 0.5 1 1.5 2 
a [a] 
Figure 3: (a) Example for the encoding behavior with narrow tuning curves arranged on 
a regular lattice of dimension D = 1 (grid spacing A). Tuning curves are Gaussian, and 
neural firing is modeled as a Poisson process. Dots indicate the minimal square encoding 
error averaged over a uniform distribution of stimuli, 2 
(min), as a function of . The mini- 
mum is clearly visible. The dotted line shows the corresponding approximation according 
to (8). The inset shows Gaussian tuning curves of optimal width, -opt m 0.4&. (b) gD (A) 
as a function of A for different values of D. 
120 C. W. Eurich, S. D. Wilke and H. Schwegler 
Narrow distribution of tuning curves. In order to study the effects of encoding the 
stimulus with distributed tuning widths instead of identical tuning widths as in the previous 
cases, we now consider the distribution 
Pa(O'l,... ,O'D) --- 
(12) 
where O denotes the Heaviside step function. Equation (12) describes a uniform distri- 
bution in a D-dimensional cuboid of size bl,..., bD around (l,... D); cf. Fig. lc. A 
straightforward calculation shows that in this case, the average population Fisher informa- 
+ co . (13) 
tion (6) for i = j becomes 
= { 
-2 1 + 
A comparison with (8) yields the astonishing result that an increase in bi results in an 
increase in the i-th diagonal element of the average population Fisher information matrix 
and thus in an improvement in the encoding of the i-th stimulus feature, while the encoding 
in dimensions j  i is not affected. Correspondingly, the total encoding error can be 
decreased by increasing an arbitrary number of edge lengths of the cube. The encoding by 
a population with a variability in the tuning curve geometries as described is more precise 
than that by a uniform population. This is true for arbitrary D. Zhang and Sejnowski [13] 
consider the more artificial situation of a correlated variability of the tuning widths: tuning 
curves are always assumed to be radially symmetric. This is indicated by the diagonal 
line in Fig. 1 a. A distribution of tuning widths restricted to this subset yields an average 
population Fisher information ec (v-2) and does not improve the encoding for D = 2 or 
D=3. 
Fragmentation into D subpopulations. Finally, we study a family of distributions of 
tuning widths which also yields a lower minimal encoding error than the uniform popula- 
tion. Let the density of tuning curves be given by 
D 
i=1 
H 6(crj -), (14) 
where ,X > 0. For A = 1, the population is uniform as in (7). For A  1, the population 
is split up into D subpopulations; in subpopulation i, cr i is modified while crj =  for 
j  i. See Fig. ld for an example. The diagonal elements of the average population Fisher 
information are 
, (15) 
where the term in brackets will be 
this case because of the symmetry 
case (7) differ by gD (/) which will 
values of D. For ,X = 1, gD (/) 
abbreviated as gi9(,). (Y) does not depend on i in 
in the subpopulations. Equation (15) and the uniform 
now be discussed. Figure 3b shows gD (/) for different 
-- 1 and (7) is recovered as expected. gv() = 1 
also holds for/ = 1/(D - 1) < 1: narrowing one tuning width in each subpopulation 
will at first decrease the resolution provided D > 3; this is due to the fact that ]Vcode is 
decreased. For , < 1/(D - 1), however, gD(,) > 1, and the resolution exceeds (Jii) in 
(7) because each neuron in the i-th subpopulation carries a high Fisher information in the 
i-th dimension. D - 2 is a special case where no impairment of encoding occurs because 
the effect of a decrease of ]Vcode is less pronounced. Interestingly, an increase in , also 
yields an improvement in the encoding accuracy. This is a combined effect resulting from 
an increase in Node on the one hand and the existence of D subpopulations, D - 1 of 
Neural Representation of Multi-Dimensional Stimuli 121 
which maintain their tuning widths in each dimension on the other hand. The discussion 
of /D (,X) leads to the following encoding strategy. For small ,X, (Jii) increases rapidly, 
which suggests a fragmentation of the population into D subpopulations each of which 
encodes one feature with high accuracy, i.e., one tuning width in each subpopulation is 
small whereas the remaining tuning widths are broad. Like in the case discussed above, the 
theoretical limit of this method is a breakdown of the approximation of r/= const. and the 
validity of (6) due to insufficient receptive field overlap. 
4 Discussion and Outlook 
We have discussed the effects of a variation of the tuning widths on the encoding accuracy 
obtained by a population of stochastically spiking neurons. The question of an optimal 
tuning strategy has turned out to be more complicated than previously assumed. More 
specifically, the case which focused most attention in the literature - radially symmetric 
receptive fields [5, 1, 9, 3, 13] - yields a worse encoding accuracy than most other cases we 
have studied: uniform populations with tuning curves which are not radially symmetric; 
distributions of tuning curves around some symmetric or non-symmetric tuning curve; and 
the fragmentation of the population into D subpopulations each of which is specialized in 
one stimulus feature. 
In a next step, the theoretical results will be compared to empirical data on encoding prop- 
erties of neural populations. One aspect is the existence of sensory maps which consist 
of neural subpopulations with characteristic tuning properties for the features which are 
represented. For example, receptive fields of auditory neurons in the midbrain of the barn 
owl have elongated shapes [6]. A second aspect concerns the short-term dynamics of re- 
ceptive fields. Using single-unit recordings in anaesthetized cats, Wfrgftter et al. [ 12] 
observed changes in receptive field size taking place in 50-100ms. Our findings suggest 
that these dynamics alter the resolution obtained for the corresponding stimulus features. 
The observed effect may therefore realize a mechanism of an adaptable selective signal 
processing. 
References 
[9] 
[10] 
[11] 
[12] 
[13] 
[1] Baldi, P. & Heiligenberg, W. (1988) Biol. Cybern. 59:313-318. 
[2] Deco, G. & Obradovic, D. (1997) An Information-Theoretic Approach to Neural Computing. 
New York: Springer. 
[3] Eurich, C. W. & Schwegler, H. (1997) Biol. Cybern. 76: 357-363. 
[4] Eurich, C. W. & Wilke, S. D. (2000) Neural Comp. (in press). 
[5] Hinton, G. E., McClelland, J. L. & Rumelhart, D. E (1986) In Rumelhart, D. E. & McClelland, 
J. L. (eds.), Parallel Distributed Processing, Vol. 1, pp. 77-109. Cambridge MA: MIT Press. 
[6] Knudsen, E. I. & Konishi, M. (1978) Science 200:795-797. 
[7] Kuffier, S. W. (1953) J. Neurophysiol. 16:37-68. 
[8] Lettvin, J. Y., Maturana, H. R., McCulloch, W. S. & Pitts, W. H. (1959) Proc. Inst. Radio Eng. 
NY 47:1940-1951. 
Snippe, H. P. & Koenderink, J. J. (1992) Biol. Cybern. 66:543-551. 
Wiggers, W., Roth, G., Eurich, C. W. & Straub, A. (1995) J. Comp. Physiol. A 176:365-377. 
Wilke, S. D. & Eurich, C. W. (1999) In Verleysen, M. (ed.), ESANN 99, European Symposium 
on Artificial Neural Networks, pp. 435--440. Brussels: D-Facto. 
WiSrgiStter, F., Suder, K., Zhao, Y., Kerscher, N., Eysel, U. T. & Funke, K. (1998) Nature 
396:165-168. 
Zhang, K. & Sejnowski, T. J. (1999) Neural Comp. 11:75-84. 
