The Effect of Correlations on the Fisher 
Information of Population Codes 
Hyoungsoo Yoon 
hyoungOfiz.huji.ac.il 
Haim Sompolinsky 
haimfiz.huji.ac.il 
Racah Institute of Physics and Center for Neural Computation 
Hebrew University, Jerusalem 91904, Israel 
Abstract 
We study the effect of correlated noise on the accuracy of popu- 
lation coding using a model of a population of neurons that are 
broadly tuned to an angle in two-dimension. The fluctuations in 
he neuronat activity is modeled as a Gaussian noise with pairwise 
correlations which decays exponentially with the difference between 
the preferred orientations of the pair. By calculating the Fisher in- 
formation of the system, we show that in the biologically relevant 
regime of parameters positive correlations decrease the estimation 
capability of the network relative to the uncorrelated population. 
Moreover strong positive correlations result in information capac- 
ity which saturates to a finite value as the number of cells in the 
population grows. In contrast, negative correlations substantially 
increase the information capacity of the neuronal population. 
1 Introduction 
In nany neural systems, information regarding sensory inputs or (intended) motor 
outputs is found to be distributed throughout a localized pool of neurons. It is gener- 
ally believed that one of the main characteristics of the population coding scheme is 
its redundancy in representing information (Paradiso 1988; Snippe and Koenderink 
1992a; Seung and Sompolinsky 1993). Hence the intrinsic neuronal noise, which 
has detrimental impact on the information processing capability, is expected to be 
compensated by increasing the number of neurons in a pool. Although this expec- 
tation is universally true for an ensemble of neurons whose stochastic variabilities 
are statistically independent, a general theory of the efficiency of population coding 
when the neuronal noise is correlated within the population, has been lacking. The 
conventional wisdom has been that the correlated variability limits the information 
processing' capacity of neuronal ensembles (Zohary, Shadlen, and Newsome 1994). 
168 H. Yoon and H. Sompolinsky 
However, detailed studies of simple models of a correlated population that code for 
a single real-valued parameter led to apparently contradicting claims. Snippe and 
Koenderink (Snippe and Koenderink 1992b) conclude that depending on the details 
of the correlations, such as their spatial range, they may either increase or decrease 
the information capacity relative to the uncorrelated one. Recently, Abbott and 
Dayan (Abbott and Dayan 1998) claimed that in many cases correlated noise im- 
proves the accuracy of population code. Furthermore, even when the information 
is decreased it still grows linearly with the size of the population. If true, this con- 
clusion has an important implication on the utility of using a large population to 
improve the estimation accuracy. Since cross-correlations in neuronal activity are 
frequently observed in both primary sensory and motor areas (Fetz, Yoyama, and 
Snith 1991; Lee, Port, Kruse, and Georgopoulos 1998), understanding the effect of 
noise correlation in biologically relevant situations is of great importance. 
In this paper we present an analytical study of the effect of noise correlations on the 
population coding of a pool of cells that code for a single one-dimensional variable, 
an angle on a plane, e.#., an orientation of a visual stimulus, or the direction of 
an arm movement. By assuming that the noise follows the multivariate Gaussian 
distribution, we investigate analytically the effect of correlation on the Fisher in- 
formation. This model is similar to that considered in (Snippe and Koenderink 
1992b; Abbott and Dayan 1998). By analyzing its behavior in the biologically rel- 
evant regime of tuning width and correlation range, we derive general conclusions 
about the effect of the correlations on the information capacity of the population. 
2 Population Coding with Correlated Noise 
We consider a population of N neurons which respond to a stimulus characterized 
by an angle 0, where -r < 0 _< r. The activity of each neuron (indexed by i) is 
assumed to be Gaussian with a mean fi(O) which represents its tuning curve, and a 
uniform variance a. The noise is assumed to be pairwise-correlated throughout the 
population. Hence the activity profile of the whole population, R = {rl, r2,'  , rN }, 
given a stimulus 0, follows the following multivariate Gaussian distribution. 
P(R[O) = A/' exp 
(1 ) 
-- y(l' i -- fi(O))c?jl(rj -- f(O)) 
i,j 
(1) 
where A/is a normalization constant and Cij is the correlation matrix. 
Cij = aSij + bij (1 - 5ij). 
(2) 
It is assumed that the tuning curves of all the neurons are identical in form but 
peaked at different angles, that is fi(O) = f(O - Oi) where the preferred angles 
are distributed uniformly from -r to r with a lattice spacing, w, which is equal 
to 2r/N. We further assume that the noise correlation between a pair of neurons 
is only a function of their preferred angle difference, i.e., bij(O) = 
where ]101 - 021] is defined to be the relative angle between 01 and 02, and hence 
its maximum value is r. A decrease in the magnitude of neuronal correlations with 
the dissimilarity in the preferred stimulus is often observed in cortical areas. We 
model this by exponentially decaying correlations 
bij = b exp( IlOi - O2[[ ) (3) 
P 
where p specifies the angular correlation length. 
Fisher Information of Correlated Population Codes 169 
The amount of information that can be extracted from the above population will 
depend on tile decoding scheme. A convenient measure of the information capacity 
in the population is given by the Fisher inibrmation, which in our case is (ibr a 
given stimulus 0) 
J(0) = s,c59. (4) 
J 
i,j 
where 
9i(0) -= 00 (5) 
The utility of this measure follows from the well known Cramr-Rao bound for the 
variance of any unbiased estimators, i.e., ((0 - t) 2) _> l/J(0). For the rest of this 
pat)er, we will concentrate on the Fishel' information as a function of the noise 
correlation parameters, b and p, as well as the population size N. 
3 Results 
In the case of uncorrelated population (b = 
by (Seung and Sompolinsky 1993) 
:o= 
where 9, is the Fourier transform of 95, defined by 
_ 1 
gn -- N y e-in% gj' 
J 
The mode number n is an integer running from N- 
0i = -x(N + 1)IN + i, i = 1,..., N. Likewise, 
J=N Cn 
where C,, are the eigenvalues of the covariance matrix, 
_ 1 
i,j 
0), the Fisher information is given 
(6) 
(7) 
-I (for odd N) and 
2 to  
in the case of b  0, J is given by 
(s) 
1 - x cos() - (-)'x')  .os0)(1 - x) 
= (a- 2b) + 2b (9) 
1 - 2A cos(n) + h 2 
2 h = e -/p and N is assumed to be an odd integer. Note that the 
where w - , , 
covariance matrix Ci remains positive definite as long as 
1 1-A b 
< - < 1 (lO) 
where the lower bound holds for general N while the upper bound is valid 
large .hr. 
To evaluate the effect of correlations in a large population it is inportant to specify 
the appropriate scales of the system parameters. We consider here the biologically 
relevant case of broadly tuned neurons that have a smoothly varying tuning curve 
with a single peak. When the tuning cu,'ve is smoothly varying, t9,I 2 will be 
rapidly decaying function as n increases beyond a characteristic value which is 
170 H. Yoon and H. Sompolinsky 
proportional to the inverse of the tuning width, or. We further assume a broa(t 
tuning, namely that the tuning curve spans a substantial fraction of the angular 
extent. This is consistent with the observed typical values of half-width at half 
height in visual and motor areas, which range from 20 to 60 degrees. Likewise, it 
is reasonable to assume that the angular correlation length p spans a substantial 
fraction of the entire angular range. This broad tuning of correlations with respect 
to the difference in the preferred angles is commonly observed in cortex (Fetz, 
Yoyama, and Smith 1991; Lee, Port, Kruse, and Georgopoulos 1998). To capture 
these features we will consider the limit of large N while keeping the parameters p 
and a constant. Note that keeping a of order 1 implies that substantial contributions 
to Eq. (8) come only from n which remain of order 1 as N increases. On the 
other hand, given the enormous variability in the strength of the observed cross- 
correlations between pairs of neurons in cortex, we do not restrict the value of b at 
this point. 
Incorporating the above scaling we find that when N is large J is given by 
= + 
(  )(1- (-1)"e-/P) (11) 
Inspection of the denominator in the above equation clearly shows that for all 
positive values of b, J is smaller than Jo. On the other hand, when b is negative 
J is larger than Jo. To estimate the magnitude of these effects we consider below 
i i 
O.8 
 ]o 0.4 - 
0.2 
0.0    
0 1000 2000 3000 4000 
N 
Normalized Fisher information when p - 69(1) (p = 0.25r was used). 
We used a circular Gaussian 
three different regimes. 
1.0 
Figure 1: 
a = 1 and b = 0.1, 0.01, and 0.001 from the bottom. 
tuning curve, Eq. (13), with fmax = 10 and cr = 0.2r. 
Strong positive correlations: We first discuss the regime of strong positive 
correlations, by which we mean that 0 < b/a ,,- (3(1). In this case the second terIn 
in the denominator of Eq. (11) is of order N and Eq. (11) becomes 
j _ rp p-2 + n  
Ig'121 
This result implies that in this regime the Fisher information in the entire popu- 
lation does not scale linearly with the population size N but saturates to a size- 
independent finite limit. Thus, for these strong correlations, although the number 
of neurons in the population may be large, the number of independent degrees of 
freedom is small. 
We demonstrate the above phenomenon by a numerical evaluation of J for the 
following choice of tuning curve 
f(O) = fmax exp ((cos(O)- 1)/or ) (13) 
Fisher Information of Correlated Population Codes 171 
with rr = 0.2r. The results are shown in Fig. 1 and Fig. 2. The results of Fig. 1 
clearly show the substantial decrease in J as b increases. The reduction in J/Jo 
when b -- (9(1) indicates that J does not scale with N in this limit. Fig. 2 shows the 
saturation of J when N increases. For p = 0.1 and 1 ((c) and (d)), J saturates at 
about N = 100, which means that for these parameter values the network contains 
at most 100 independent degrees of freedom. When the correlation range becomes 
either smaller or bigger, the saturation becomes less prominent ((a) and (b)), which 
is further explained later in the text. 
4O 
3O 
J 2o 
() 
0 800 
10 
/// 
I I 
200 400 600 
N 
Figure 2: Saturation of Fisher information with the correlation coefficient kept fixed: 
a: 1 and b = 0.5. Both p - (9(1) ((c) p = 0.1 and (d) p = 1) and other extreme 
limits ((a) p = 0.01 and (b) p = 10) are shown. Tuning curve with f,ax = 1 and 
rr = 0.2r was used for all four curves. 
Weak positive correlations: This regime is defined formally by positive values 
of b which scale as b/a - (9(-). In this case, while J is still smaller than Jo the 
suppressive effects of the correlations are not as strong as in the first case. This is 
shown in Fig. 3 (bottom traces) for N = 1000. While J is less than Jo, it is still a 
substantial fraction of Jo, indicating J is of order N. 
2.5 
2.0 
J 
1.0 
0.5 
0 1 2 3 4 
p 
Figure 3: Normalized Fisher information when p -- (9(1) and b/a  0(-). N = 
1000, a - 1, fax = 10, and cr = 0.2r. Tile top curves represent negative b 
(b = -0.005 and -0.002 from the top) and tile bottom ones positive b (b = 0.01 
and 0.005 from the bottom). 
Weak negative correlations: So far we have considered the case of positive b. 
As stated above, Eq. (11) implies that when b < 0, J > Jo. The lower bound of b 
(Eq. (10)) means that when the correlations are negative and p is of order 1 tim 
amplitude of the correlations must be small. It scales as b/a = /N with  which 
is of order 1 and is larger than min --' -(r/p)/(1- exp(-w/p)). In this regime 
(J - Jo)/N retains a finite positive value even for large N. This enhancement can 
172 H. Yoon and H. Sompolinsky 
be made large if  comes close to min- This behavior is shown in Fig. 3 (upper 
traces). Note that, for both positive and negative weak correlations, the curves 
have peaks around a characteristic length scale p ,- rr, which is 0.2r in this figure. 
Extremely long and short range correlations: Calculation with strictly uni- 
form correlations, i.e., bij = b, shows that in this case the positive correlations 
enhance the Fisher information of the system, leading to claims that this might 
be a generic result (Abbott and Dayan 1998). Here we show that this behavior 
is special to cases where the correlations essentially do not vary in strength. \Ve 
consider the case p ,- (9(N). This means that the strength of the correlations is the 
same fbr all the neurons up to a correction of order 1/A r. In this limit Eq. (ll) is 
not valid, and the Fisher information is obtained from Eq. (8) and Eq. (9), 
(14) 
even odd 
where  = ovp/4. Note that even in this extreme regine, only for  > 1 is J 
guaranteed to be always larger than Jo. Below this value the sign of J - ,lo depends 
on the particular shape of the tuning curve and the value of b. In fact, a more 
detailed analysis (Yoon and Sompolinsky 1998) shows that as soon as p << O(v/), 
J- Jo < 0, as in the case of p -0 (9(1) discussed above. The crossover between these 
two opposite behaviors is shown in Fig. 4. For comparison the case with p -0 O(1) 
is also shown. 
.! 
I 
4.0 
3.0 
2.0 
1.0  
0.0  
0.0 0.2 
I I I 
I I I 
0,4 0.6 (I.8 .0 
b 
Figure 4: Normalized Fisher information when b/a  O(1). N = 1000 and a = 1. 
When p , (9(1), increasing b always decreases the Fisher information (bottom curve 
p = 0.257r). However, this trend is reversed when p - O(x/) and when p > 2-N 
,l - ,lo becomes always positive. From the top p = 400, 50, and 25. 
Another extreme regime is where the correlation length p scales as 1/N but the 
tuning width remains of order 1. This means that a given neuron is correlated with 
a small number of its immediate neighbors, which remains finite as N -+ oo. In this 
limit, the Fisher information becomes, again from Eq. (8) and Eq. (9), 
- 1) ., 
J = a(h - - 1) + 2b Ig,,l'. (15) 
In this case, the behavior of J is similar to the cases of weak correlations discussed 
above. The information remains of order N but the sign of J - Jo depends on the 
sign of b. Thus, when the amplitude of the positive correlation function is O(1), J 
increases linearly with N in the two opposite extremes of very large and very small 
p a.s shown in Fig. 2 ((a) and (b)). 
Fisher Information of Correlated Population Codes 
4 Discussion 
173 
In this paper we have studied the effect of correlated variability of neuronal activity 
on the maximum accuracy of the population coding. We have shown that the 
effect of correlation on the information capacity of the population crucially depends 
on the scale of correlation length. We argue that for the sensory and motor areas 
which are presumed to utilize population coding, the tuning of both the correlations 
and the mean response profile is broad and of the same order. This implies that 
each neuron is correlated with a finite fraction of the total number of neurons, 
N, and a given stimulus activates a finite fraction of N. We show that in this 
regime positive correlations always decrease the information. When they are strong 
enough in amplitude they reduce the number of independent degrees of freedom 
to a finite number even for large population. Only in the extreme case of ahnost 
uniform correlations the information capacity is enhanced. This is reasonable since 
to overcome the positive correlations one needs to subtract the responses of different 
neurons. But in general this will reduce their signal by a larger amount. When the 
correlations are uniform, the reduction of the correlated noise by subtraction is 
perfect and can be made in a manner that will little affect the signal component. 
Acknowledgments 
H.S. acknowledges helpful discussions with Larry Abbott and Sebastian Seung. This 
research is partially supported by the Fund for Basic Research of the Israeli Academy 
of Science and by a grant from the United States-Israel Binational Science Founda- 
tion (BSF), Jerusalem, Israel. 
References 
L. F. Abbott and P. Dayan (1998). The effect of correlated variability on the 
accuracy of a population code. Neural Comp., in press. 
E. Fetz, K. Yoyama, and W. Smith (1991). Synaptic interactions between cortical 
neurons. In A. Peters and E.G. Jones (Eds.), Cerebral Cortex, Volume 9. New 
York: Plenum Press. 
D. Lee, N. L. Port, W. Kruse, and A. P. Georgopoulos (1998). Variability and 
correlated noise in the discharge of neurons in motor and parietal areas of the 
primate cortex. J. Neurosci. 18, 1161-1170. 
M. A. Paradiso (1988). A theory for the use of visual orientation information 
which exploits the columnar structure of striate cortex. Biol. Cybern. 58, 35-49. 
H. S. Seung and H. Sompolinsky (1993). Simple models for reading neuronal 
population codes. Proc. Natl. Acad. $ci. USA 90, 10749-10753. 
H. P. Snippe and J. J. Koenderink (1992a). Discrimination thresholds for channel- 
coded systems. Biol. Cybern. 66, 543-551. 
H. P. Snippe and J. J. Koenderink (1992b). Information in channel-coded system: 
correlated receivers. Biol. Cybern. 67, 183-190. 
H. Yoon and H. Sompolinsky (1998). Population coding in neuronal systems with 
correlated noise, preprint. 
E. Zohary, M. N. Shadlen, and W. T. Newsome (1994). Correlated neuronal 
discharge rate and its implications for psychophysical performance. Nature 370, 
140-143. 
