The Role of MT Neuron Receptive Field 
Surrounds in Computing Object Shape from 
Velocity Fields 
G.T.Buracas & T.D.Albright 
Vision Center Laboratory, The Salk Institute, 
P.O.Box 85800, San Diego, California 92138-9216 
Abstract 
The goal of this work was to investigate the role of primate 
MT neurons in solving the structure from motion (SFM) 
problem. Three types of receptive field (RF) surrounds 
found in area MT neurons (K. Tanaka et a/.,1986; Allman et 
a/.,1985) correspond, as our analysis suggests, to the 0 th, I st 
and 2 nd order fuzzy space-differential operators. The large 
surround/center radius ratio (> 7) allows both 
differentiation of smooth velocity fields and discontinuity 
detection at boundaries of objects. The model is in 
agreement with recent psychophysical data on surface 
interpolation involvement in SFM. We suggest that area 
MT partially segregates information about object shape 
from information about spatial relations necessary for 
navigation and manipulation. 
1 INTRODUCTION 
Both neurophysiological investigations [8] and lesioned human patients' 
data show that the Middle Temporal (MT) cortical area is crucial to 
perceiving three-dimensional shape in moving stimuli. On the other hand, 
969 
970 Buracas and Albright 
a solid body of data (e.g. [1]) has been gathered about functional properties 
of neurons in the area MT. Hoever, the relation between our ability to 
perceive structure in stimuli, simulating 3-D objects, and neuronal 
properties has not been addressed up to date. Here we discuss a 
possibility, that area MT RF surrounds might be involved in shape-from- 
motion perception. We introduce a simplifying model of MT neurons and 
analyse the implications to SFM problem solving. 
2 REDEFINING THE SFM PROBLEM 
2.1 RELATIVE MOTION AS A CUE FOR RELATIVE DEPTH 
Since Helmholtz motion parallax is known to be a powerful cue providing 
information about both the structure of the surrounding environment and 
the direction of self-motion. On the other hand, moving objects also induce 
velocity fields allowing judgement about their shapes. We can capture both 
cases by assuming that an observer is tracking a point on a surface of 
interest. The velocity field of an object then is (fig. l): V = t z + w x (R - R o ) 
=-tz+WXZ, where w=[Wx,Wy,0] is an effective rotation vector of a surface 
z=[x,y,z(x,y)]; Ro=[0,0,z0 ] is a positional vector of the fixation point; t z is a 
translational component along Z axis. 
x 
z(x,y) 
w 
y 
Fig. 1: The coordinate system assumed in this paper. The origin is set at 
the fixation point. The observer is at Z 0 distance from a surface. 
The Role of MT Neuron Receptive Field Surrounds in Computing Object Shape 971 
The component velocities of a retinal velocity field under perspective 
projection can be calculated from: 
WyZ --Xtz -wxxy+Wy x2 wZ -Ytz +WyXy--wxY 2 
ZO +z (Zo +Z) 2 , V---- 
Z 0 + Z (Z 0 + Z) 2 
In natural viewing conditions the distance to the surface z 0 is usually much 
larger than variation in distance on the surface z : z0>>z. In such the 
second term in the above equations vanishes. In the case of translation 
tangential to the ground, to which we confine our analysis, w=[0,Wy,0] - 
[0,w,0], and the retinal velocity reduces to 
u = -wz/(zo+z) -- -wz/zo, v=O (1). 
The latter relation allows the assumption of orthographic projection, which 
approximates the retinal velocity field rather well within the central 20 deg 
of the visual field. 
2.2 SFM PERCEPTION INVOLVES SURFACE INTERPOLATION 
Human SFM perception is characterized by an interesting peculiarity -- 
surface interpolation [7]. This fact supports the hypothesis that an 
assumption of surface continuity is embedded in visual system. Thus, we 
can redefine the SFM problem as a problem of characterizing the 
interpolating surfaces. The principal normal curvatures are a local 
measure of surface invariant with respect to translation and rotation of the 
coordinate system. The orientation of the surface (normal vector) and its 
distance to the observer provide the information essential for navigation 
and object manipulation. The first and second order differentials of a 
surface function allow recovery of both surface curvature and orientation. 
3 MODEL OF AREA MT RECEPTIVE FIELD SURROUNDS 
3.1 THREE TYPES OF RECEPTIVE FIELD SURROUNDS 
The Middle Temporal (MT) area of monkeys is specialized for the 
systematic representation of direction and velocity of visual motion [1,2]. 
MT neurons are known to posess large, silent (RFS, the "nonclassical RF". 
Born and Tootell [4] have very recently reported that the RF surrounds of 
neurons in owl monkey MT can be divided into antagonistic and synergistic 
types (Fig.2a). 
972 Buracas and Albright 
25 
g2o 
 10 
0 
a) 
0 10 
Annulus diameter deg 
Fig. 2: Top left (a): an example of a 
synergistic RF surround, redrawn from 
[4] (no velocity tuning known). Bottom 
left (b): a typical V-shaped tuning curve 
for RF surround The horizontal axis 
represents the logarithmic scale of ratio 
between stimulus speeds in the RF 
center and surround, redrawn from [9]. 
Bottom (c,d): monotonically increasing 
and decreasing tuning curves for RF 
surrounds, redrawn from [9]. 
1 
0.8 
0.6 
o,4 
0.2 
0 
03 
1 10 
Ratio of C/S s pc. eds 
1 
 
80.4 
ergO. 2 
c) 
0.1 1 10 
Ratio of C/S speeds 
01 1 
Ralio f C4S speeds 
About 44% of the owl monkey neuron RFSs recorded by Allman et al. [3] 
showed antagonistic properties. Approximately 33% of these demonstrated 
V(or U)oshaped (Fig. 2b), and 66% - quasi-linear velocity tuning curves 
(Fig. 2c, d). One half of Macaca fuscata neurons with antagonistic RFS 
found by Tanaka et al [9] have had V(U)-shaped velocity tuning curves, 
and 50% monotonically increasing or decreasing velocity tuning curves. 
The RFS were tested for symmetry [9] and no asymmetrical surrounds 
were found in primate MT. 
3.2 CONSTRUCTING IDEALIZED MT FILTERS 
The surround (S) and center (C) responses seem to be largely independent 
(except for the requirement that the velocity in the center must be nonzero) 
and seem to combine in an additive fashion [5]. This property allows us to 
combine C and S components in our model independently. The resulting 
filters can be reduced to three types, described below. 
3.2.1 Discrete Filters 
The essential properties of the three types of RFSs in area MT can be 
captured by the following difference equations. We choose the slopes of 
velocity tuning curves in the center to be equal to the ones in the surround; 
this is essential for obtaining the desired properties for 12 but not 10. The 0- 
order (or low-pass) and the 2nd order (or band-pass) filters are defined by: 
The Role of MT Neuron Receptive Field Surrounds in Computing Object Shape 973 
l o=goyy(uc+us(i,j))+ConstO ,; 12=g2yy(uc-us(i,j))+Const2, (2) 
i j i j 
where g is gain, wi j =1, ije [-r,r] (r = radius of integration). Speed scalars 
u(ij) at points [ij] replace the velocity vectors 17 due to eq. (1). Constants 
correspond to spontaneous activity levels. 
In order to achieve the V(U) -shaped tuning for the surround in Fig.2b, a 
nonlinearity has to be introduced: 
li =gi(Uc-Us(i,j)) 2 +Constl. (3) 
i j 
The responses of 11 and 12 filters to standard mapping stimuli used in [3,9] 
are plotted together with their biological correlates in Fig. 3. 
3.2.2 Continuous analogues of MT filters 
We now develop continuous, more biologicaly plausible, versions of our 
three MT filters. We assume that synaptic weights for both center and 
surround regions fall off with distance from the RF center as a Gaussian 
function G(x,y,(), and ( is different for center and surround: (c  Cs- Then, 
by convolving with Gaussians equation (2) can be rewritten: 
L o (i,j)= u(i,j)*G(o c )+u(i,j)*G(o s ), 
L (i,j) = +[u(i,j)*G({J c )-u(i,j)*G({J s )]. 
The continuous nonlinear L1 filter can be defined if equivalence to l (eq. 3) 
is observed only up to the second order term of power series for u(ij): 
Li (i,j)=u 2 (i,j)*G({ c )+u 2 (i,j)*G(c s )-C.[u(i,j)*G({ c )].[u(i,j)*G(c s )]; 
u2(ij) corresponds to full-wave rectification and seems to be common in 
area V1 complex neurons; C = 2/Erf2(n/2 /2) is a constant, and Erf0 is an 
error function. 
3.3 THE ROLE OF MT NEURONS IN SFM PERCEPTION. 
Expanding z(x,y) function in (1) into power series around an arbitrary 
point and truncating above the second order term yields: 
u(x,y)=w(ax2+by2+cxy+dx+ey+f)/zo, where a,b,c,d,e,f are expansion 
coefficients. We assume that w is known (from proprioceptive input) and 
=1. Then z 0 remans an unresolved scaling factor and we omit it for 
simplicity. 
974 Buracas and Albright 
DATA MODEL 
1 
O. L 2 
1/4 1/2 I 2 4 1/4 1/2 I 2 4 
Surround/Center speed ratio 
L1 response in slant space 
Fig. 3: The comparison between data 
[9] and model velocity tuning curves 
for RF surrounds. The standard 
mapping stimuli (optimaly moving bar 
in the center of RF, an annulus of 
random dots with varying speed) were 
applied to L 1 and L 2 filters. Thee 
output of the filters was passed 
through a sigmoid transfer function to 
accout for a logarithmic compresion in 
the data. 
Fig. 4: Below, left: the response profile 
of the L1 filter in orientation space (x 
and y axes represent the components of 
normal vector ). Right: the response 
profile of the L 2 filter in curvature 
space. x and y axes represent the two 
normal principal curvatures. 
L2 response in curvature space 
1C 
-10 
-5 
-10 
-15 -15 
-15 -10 -5 0 5 10 15 -15 -10 -5 0 5 10 15 
Applying L 0 on u(x,y), high spatial frequency information is filtered out, 
but otherwise u(x,y) does not change, i.e. L0*u covaries with lower 
frequencies of u(x,y). L 2 applied on u(x,y) yields: 
2 ((c 2 -(s ) v2u, (4) 
L2 *u=(2a+2b)C2(lJc 2 -{Js )=C2 2 
that is, L 2 shows properties of the second order space-differential operator - 
Laplacian; C2((c 2 - (s 2) is a constant depending only on the widths of the 
center and surround Gaussians. Note that L2*u -- c I + c 2 , (cL2 are 
principal normal curvatures) at singular points of surface z(x,y). 
The Role of MT Neuron Receptive Field Surrounds in Computing Object Shape 975 
When applied on planar stimuli Up(X,y) = d x + e y, L 1 has properties of a 
squared first order differential operator: 
2 2 2 I x +(y) lUp ' (5) 
L 1 *Up =(d 2 +e 2 )C 1 (Oc 2 -o s )- C 1 (o c -o s ) ( )2 2 
where C2(Oc 2 - Os2) is a function of O c and o s only. Thus the output of L1 is 
monotonically related to the norm of gradient vector. It is straightforward 
to calculate the generic second order surface based on outputs of three Lo, 
four L and one L 2 filters. 
Plotting the responses of L and L 2 filters in orientation and curvature 
space can help to estimate the role they play in solving the SFM problem 
(Fig.4). The iso-response lines in the plot reflect the ambiguity of MT filter 
responses. However, these responses covary with useful geometric 
properties of surfaces -- norm of gradient (L l) and mean curvature (L2). 
3.4 EXTRACTING VECTOR QUANTITIES 
Equations (4) and (5) show, that only averaged scalar quantities can be 
extracted by our MT operators. The second order directional derivatives 
for estimating vectorial quantities can be computed using an oriented RFs 
with the following profile: O2=G(x,o s) [G(y,o s) - G(y,oc)]. O1 then can be 
defined by the center - surround relationship of L1 filter. The outputs of 
MT filters L 1 and L 2 might be indispensible in normalizing responses of 
oriented filters. The normal surface curvature can be readily extracted 
using combinations of MT and hypothetical O filters. The oriented spatial 
differential operators have not been found in primate area MT so far. 
However, preliminary data from our lab indicate that elongated RFs may 
be present in areas FST or MST [6]. 
3.5 L2: LAPLACIAN VS. NAKAYAMA'S CONVEXITY OPERATOR 
The physiologically tested ratio of standard deviations for center and sur- 
round Gaussians Os/O c > 7. Thus, besides performing the second order 
differentiation in the low frequency domain, L9. can detect discontinuities 
in optic flow. 
4. CONCLUSIONS 
We propose that the RF surrounds in MT may enable the neurons to 
function as differential operators. The described operators can be thought 
of as providing a continuous interpolation of cortically represented 
surfaces. 
Our model predicts that elongated RFs with flanking surrounds will be 
found (possibly in areas FST or MST [6]). These RFs would allow extraction 
976 Buracas and Albright 
of the directional derivatives necessary to estimate the principal curvatures 
and the normal vector of surfaces. 
From velocity fields, area MT extracts information relevant to both the 
"where" stream (motion trajectory, spatial orientation and relative distance 
of surfaces) and the "what" stream (curvature of surfaces). 
Acknowledgements 
Many thanks to George Carman, Lisa Croner, and Kechen Zhang for 
stimulating discussions and Jurate Bausyte for helpful comments on the 
poster. This project was sponsored by a grant from the National Eye 
Institute to TDA and by a scholarship from the Lithuanian Foundation to 
GTB. The presentation was supported by a travel grant from the NIPS 
foundation. 
References 
[1] Albright, T.D. (1984) Direction and orientation selectivity of neurons in 
visual area MT of the macaque. J. Neurophysiol., 52: 1106-1130. 
[2] Albright, T.D., R. Desimone. (1987) Local precision of visuotopic 
organization in the middle temporal area (MT) of the macaque. Exp. Brain 
Res., 65, 582-592. 
[3] Allman, J., Miezin, F., McGuinnes. (1985) Stimulus specific responses 
from beyond the classical receptive field. Ann. Rev. Neurosci., 8, 407-430. 
[4] Born R.T. & Tootell R.B.H. (1992) Segregation of global and local motion 
processing in primate middle temporal visual area. Nature, 357, 497-499. 
[5] Born R.T. & Tootell R.B.H. (1993) Center - surround interactions in 
direction - selective neurons of primate visual area MT. Neurosci. Abstr., 
19, 315.5. 
[6] Carman G.J., unpublished results. 
[7] Hussain M., Treue S. & Andersen R.A. (1989) Surface interpolation in 
three-dimensional Structure-from-Motion perception. Neural Computation, 
1, 324-333. 
[8] Siegel, R.M. and R.A. Andersen. (1987) Motion perceptual deficits 
following ibotenic acid lesions of the middle temporal area in the behaving 
rhesus monkey. Soc. Neurosci.Abstr., 12, 1183. 
[9]Tanaka, K., Hikosaka, K., Saito, H.-A., Yukie, M., Fukada, Y., Iwai, E. 
(1986) Analysis of local and wide-field movements in the superior temporal 
visual areas of the macaque monkey. J. Neurosci., 6, 134-144. 
